Are you familiar with memoization? Maybe you’ve heard of it but are uncertain how to use it in your code. In this post you’ll get an introductory look at memoization. You’ll learn what it is, how you can use it to speed up your code, and where memoization can bite you in the butt!
First off, a simple definition:
Memoization is the process of storing a computed value to avoid duplicated work by future calls.
Broken down to basics, memoization always fits the following pattern:
- Perform some work
- Store the work result
- Use stored results in future calls
In Ruby the most common pattern for memoizing a call is using the conditional assignment operator, `||=`. If you’re not familiar with this operator I’d suggest reading Peter Cooper’s excellent explanation of it.
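As a rough sketch of the three steps above, here is a plain Ruby class that memoizes a calculation with `||=`. The `Report` class and its `total` method are made up for illustration, and the `computations` counter exists only to show how often the work actually runs.

```ruby
# Hypothetical example class; not part of any library.
class Report
  attr_reader :computations

  def initialize(numbers)
    @numbers = numbers
    @computations = 0
  end

  # `@total ||= expr` behaves like `@total || (@total = expr)`:
  # the right-hand side only runs while @total is nil or false.
  def total
    @total ||= begin
      @computations += 1 # step 1: perform some work
      @numbers.sum       # step 2: the result is stored in @total
    end
  end
end

report = Report.new([1, 2, 3])
report.total             # performs the work and stores the result
report.total             # step 3: returns the stored result
puts report.computations # => 1
```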
Let’s look at a common piece of code that occurs in many Rails applications and apply memoization to it.
If you’ve ever worked with a user login system, you’re likely familiar with the pattern of loading the `current_user`.
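A minimal sketch of that pattern, written so it runs outside Rails. Everything around the `current_user` method is an assumption: the `User` class is a stand-in that only counts lookups (in a real app it would be an ActiveRecord model), `PlainController` is a hypothetical name, and reading the id from `session[:user_id]` is one common convention rather than something this post prescribes.

```ruby
# Stand-in for an ActiveRecord model. It only counts lookups so the
# example runs without Rails or a database.
class User
  class << self
    attr_accessor :find_calls
  end
  self.find_calls = 0

  attr_reader :id

  def initialize(id)
    @id = id
  end

  def self.find(id)
    self.find_calls += 1
    new(id)
  end
end

# Hypothetical controller; `session` is a plain Hash here.
class PlainController
  attr_reader :session

  def initialize(session)
    @session = session
  end

  # No memoization: every call performs a lookup and builds a new User.
  def current_user
    User.find(session[:user_id])
  end
end

controller = PlainController.new({ user_id: 42 })
3.times { controller.current_user }
puts User.find_calls # => 3
```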
Within any one request in a Rails app you’ll usually see multiple calls to `current_user`, which means `User.find` is run multiple times. The console output from a request without memoization in place shows 1 call to `current_user` that performs the initial query, then 5 more calls (represented by `CACHE`).

Those `CACHE` calls mean Rails is returning the cached result of the SQL query, but the reported time doesn’t include the cost of building the `User` object. Because Rails hides the cost of object creation, these queries cost more than the 0.0ms and 0.1ms reported!
As a rule of thumb, if you see `CACHE (X.Xms)` you should investigate for inefficiencies in your code base!
In our case, we know the problem is caused by multiple calls to `current_user`. Let’s fix this code by introducing memoization into the `current_user` method, storing the result of `User.find` in an instance variable using conditional assignment.
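Here is a hedged sketch of what the memoized method can look like, again runnable without Rails. The `User` stand-in, the `MemoizedController` name, and the `session[:user_id]` lookup are assumptions for illustration; the part that matters is the `@current_user ||=` line.

```ruby
# Stand-in for an ActiveRecord model. It only counts lookups so the
# example runs without Rails or a database.
class User
  class << self
    attr_accessor :find_calls
  end
  self.find_calls = 0

  attr_reader :id

  def initialize(id)
    @id = id
  end

  def self.find(id)
    self.find_calls += 1
    new(id)
  end
end

# Hypothetical controller; `session` is a plain Hash here.
class MemoizedController
  attr_reader :session

  def initialize(session)
    @session = session
  end

  # The first call runs User.find and stores the result in @current_user.
  # Later calls on the same object return the stored User.
  def current_user
    @current_user ||= User.find(session[:user_id])
  end
end

controller = MemoizedController.new({ user_id: 42 })
3.times { controller.current_user }
puts User.find_calls # => 1
```

The stored value lives on the object that owns the instance variable, so a fresh object starts with nothing memoized.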
It’s important to notice that the result of `find` is assigned to an instance variable instead of a local variable. If you were to use a local variable (a variable without the `@` symbol), the user wouldn’t be stored and the `find` query would occur on every call to `current_user`. Nothing would have improved.
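To see the difference, here is a small sketch of the local-variable version. As before, the `User` stand-in, the `LocalVariableController` name, and `session[:user_id]` are assumptions made so the example runs on its own.

```ruby
# Stand-in for an ActiveRecord model; it only counts lookups.
class User
  class << self
    attr_accessor :find_calls
  end
  self.find_calls = 0

  def self.find(_id)
    self.find_calls += 1
    new
  end
end

# Hypothetical controller; `session` is a plain Hash here.
class LocalVariableController
  attr_reader :session

  def initialize(session)
    @session = session
  end

  # `user` is a local variable, so it starts out nil on every call and
  # is thrown away when the method returns. Nothing is remembered.
  def current_user
    user ||= User.find(session[:user_id])
    user
  end
end

controller = LocalVariableController.new({ user_id: 42 })
3.times { controller.current_user }
puts User.find_calls # => 3
```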
Re-running the request with a memoized `current_user`, the `CACHE` lines are gone from the console output, which means there are no more calls to rebuild the `User` object each time `current_user` is called! And if you look closely at the last line of each console output, the total time drops by 50ms after introducing memoization!
Like all programming techniques and tools, memoization has its place. In our case we’re trading space (storage of the `User` object) for faster query time. The trick is knowing when to use it and when not to use it.
When should you memoize?
- When you’ve got duplicated database calls
- When you’ve got expensive calculations
- When you’ve got repeated calculations that don’t change
When shouldn’t you memoize?
Memoization can introduce some very subtle bugs that are hard to track down. It shouldn’t be used with methods that take parameters:
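A minimal sketch of how this goes wrong. The `NameFormatter` class and the second pair of names are invented for illustration; the `full_name` method name and the 'Billy Bob' value match the discussion that follows.

```ruby
# Hypothetical class for illustration.
class NameFormatter
  # Bug: once @full_name is set, the arguments are ignored.
  def full_name(first_name, last_name)
    @full_name ||= "#{first_name} #{last_name}"
  end
end

formatter = NameFormatter.new
puts formatter.full_name('Billy', 'Bob') # => Billy Bob
puts formatter.full_name('Sally', 'Sue') # => Billy Bob (stale value)
```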
Or with methods that use instance variables:
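Again as a sketch only: the `Person` class, its attribute names, and the replacement names are assumptions, while `full_name` and 'Billy Bob' follow the discussion below. The memoized value is built from instance variables that change later.

```ruby
# Hypothetical class for illustration.
class Person
  attr_accessor :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end

  # Bug: the stored string is never rebuilt when the names change.
  def full_name
    @full_name ||= "#{@first_name} #{@last_name}"
  end
end

person = Person.new('Billy', 'Bob')
puts person.full_name # => Billy Bob

person.first_name = 'Sally'
person.last_name = 'Sue'
puts person.full_name # => Billy Bob (stale value)
```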
In both cases we see `full_name` is memoized on the first call for 'Billy Bob', and as a result the second call produces 'Billy Bob' even though different values ought to be applied. Don’t let these contrived examples fool you: hitting problems like this in production code is a total pain!
Be sure to check out the follow-up post on advanced memoization that shows you how to get around the pitfalls noted above!