It is a special form of caching that applies only to functions and methods. Depending upon the programming language, it may not involve the various eviction policies, invalidation and disk storage that a typical caching solution offers.

Common use cases for memoization are dynamic programming problems like Fibonacci sequence and factorial computation. It is commonly used to cache frequent computations that can add up to significant processing time. It is best to avoid memoization where the results of the function call may vary e.g. functions that make database calls may return different results and are not suited for memoization.

Let's take a simple example in groovy that defines a function for factorial. Factorial of a positive whole number can be defined as

`factorial(n) = n * factorial(n-1)`

. So, it can be defined as a recursive function like so.
//Example without memoization factorial = { n -> if(n == 0) { return 1 } else { println "-- factorial ("+n+") Invoked --" return n* factorial(n-1); } } println "Factorial of 1 = " + factorial(1) + "\n" println "Factorial of 2 = " + factorial(2) + "\n" println "Factorial of 3 = " + factorial(3) + "\n" println "Factorial of 4 = " + factorial(4) + "\n" println "Factorial of 5 = " + factorial(5) + "\n"

Results:

-- factorial (1) Invoked -- Factorial of 1 = 1 -- factorial (2) Invoked -- -- factorial (1) Invoked -- Factorial of 2 = 2 -- factorial (3) Invoked -- -- factorial (2) Invoked -- -- factorial (1) Invoked -- Factorial of 3 = 6 -- factorial (4) Invoked -- -- factorial (3) Invoked -- -- factorial (2) Invoked -- -- factorial (1) Invoked -- Factorial of 4 = 24 -- factorial (5) Invoked -- -- factorial (4) Invoked -- -- factorial (3) Invoked -- -- factorial (2) Invoked -- -- factorial (1) Invoked -- Factorial of 5 = 120

The program above prints a line every time the function is called. As you see in the results, every call to factorial recurses until it hits factorial(0). You can see a lot of redundant calls happening for computing factorials for numbers that have already been computed before. This algorithm could be easily improved by caching the value of each invocation and using them in the subsequent invocations. And memoization would help with that.

Now let's see the same example with the memoized closure function. The line where the memoization is applied is highlighted.

//Example with memoization factorial = { n -> if(n == 0) { return 1 } else { println "-- factorial ("+n+") Invoked --" return n* factorial(n-1); } }.memoize() println "Factorial of 1 = " + factorial(1) + "\n" println "Factorial of 2 = " + factorial(2) + "\n" println "Factorial of 3 = " + factorial(3) + "\n" println "Factorial of 4 = " + factorial(4) + "\n" println "Factorial of 5 = " + factorial(5) + "\n"Results:

-- factorial (1) Invoked -- Factorial of 1 = 1 -- factorial (2) Invoked -- Factorial of 2 = 2 -- factorial (3) Invoked -- Factorial of 3 = 6 -- factorial (4) Invoked -- Factorial of 4 = 24 -- factorial (5) Invoked -- Factorial of 5 = 120

With the memoization, you can see that the recursion has reduced to just once. The calls are cached and the results are used in the subsequent calls.

### Memoization support in Groovy

Groovy like many other functional languages supports Memoization as a language feature. It can memoize both closures and methods. What we saw in the example above was closure memoization.

#### Closures

Groovy supports the following methods on Closures for memoization.

`closure.memoize()`

- This caches all the calls to the closure. There are no limits. The cache gets cleared only upon garbage collection.
`closure.memoizeAtMost(n)`

- This caches a max of `n`

values. It uses an LRU (Last Recently Used) strategy for eviction. Garbage collection applies here as well and the cache may get cleared.
`closure.memoizeAtLeast(n)`

- This caches all calls without limit, just like `memoize`

. The difference is that when the garbage collection happens, it protects `n`

recent calls from being evicted.
`closure.memoizeBetween(n, m)`

- This protects `n`

items from garbage collection, but sets a max limit of `m`

values on the cache.
Let's check out an example with memoizeAtMost.

//Example with memoizeAtMost factorial = { n -> if(n == 0) { return 1 } else { println "-- factorial ("+n+") Invoked --" return n* factorial(n-1); } }.memoizeAtMost(1) println "Factorial of 1 = " + factorial(1) + "\n" println "Factorial of 2 = " + factorial(2) + "\n" println "Factorial of 1 = " + factorial(1) + "\n" println "Factorial of 2 = " + factorial(2) + "\n"Results:

-- factorial (1) Invoked -- Factorial of 1 = 1 -- factorial (2) Invoked -- Factorial of 2 = 2 -- factorial (1) Invoked -- Factorial of 1 = 1 -- factorial (2) Invoked -- Factorial of 2 = 2

As you can see from the result, the cache size for factorial is just 1. So, when we call

`factorial(1)`

followed by `factorial(2)`

, it evicts the result for 1 and replaces it with 2. If you set the memoizeAtMost to 2, the above example would result as follows
-- factorial (1) Invoked -- Factorial of 1 = 1 -- factorial (2) Invoked -- Factorial of 2 = 2 Factorial of 1 = 1 Factorial of 2 = 2

#### Method memoization

Memoization can be applied to class methods by annotating them with

`@Memoized`

. By default, the cache size has no limit, but if you wish to set a limit, then you can set the `maxCacheSize`

attribute on the annotation and it would behave just like `memoizeAtMost`

.
Let's see an example.

//Example with @Memoized import groovy.transform.Memoized class MemoizationTest { @Memoized(maxCacheSize = 2) def factorial(n) { if(n == 0) { return 1L } else { println "-- factorial ("+n+") Invoked --" return 1L*n* factorial(n-1); } } } def test = new MemoizationTest() println test.factorial(1) println test.factorial(2) println test.factorial(3)

Results:

-- factorial (1) Invoked -- 1 -- factorial (2) Invoked -- 2 -- factorial (3) Invoked -- 6

### Conclusion

As we saw in this post with examples, Memoization is a powerful technique that can be used to cache results and improve the overall performance of a program. Use it on functions where the result is expected to be the same for the given input.

## No comments:

## Post a Comment