Groovy - What is Memoization?

Memoization (a.k.a tabling) is a technique in programming wherein the result of a function call is stored and when the function is subsequently called with the same parameters, the result is returned from the store instead of executing the function again.

It is a special form of caching that applies only to functions and methods. Depending upon the programming language, it may not involve the various eviction policies, invalidation and disk storage that a typical caching solution offers.

Common use cases for memoization are dynamic programming problems like Fibonacci sequence and factorial computation. It is commonly used to cache frequent computations that can add up to significant processing time. It is best to avoid memoization where the results of the function call may vary e.g. functions that make database calls may return different results and are not suited for memoization.

Let's take a simple example in groovy that defines a function for factorial. Factorial of a positive whole number can be defined as factorial(n) = n * factorial(n-1) . So, it can be defined as a recursive function like so.

//Example without memoization

factorial = { n ->
    if(n == 0) {
        return 1
    }
    else {
        println "-- factorial ("+n+") Invoked --"
        return n* factorial(n-1);
    }
}

println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"
println "Factorial of 3 = " + factorial(3) + "\n"
println "Factorial of 4 = " + factorial(4) + "\n"
println "Factorial of 5 = " + factorial(5) + "\n"


Results:
-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 2 = 2

-- factorial (3) Invoked --
-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 3 = 6

-- factorial (4) Invoked --
-- factorial (3) Invoked --
-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 4 = 24

-- factorial (5) Invoked --
-- factorial (4) Invoked --
-- factorial (3) Invoked --
-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 5 = 120


The program above prints a line every time the function is called. As you see in the results, every call to factorial recurses until it hits factorial(0). You can see a lot of redundant calls happening for computing factorials for numbers that have already been computed before. This algorithm could be easily improved by caching the value of each invocation and using them in the subsequent invocations. And memoization would help with that.

Now let's see the same example with the memoized closure function. The line where the memoization is applied is highlighted.

//Example with memoization

factorial = { n ->
    if(n == 0) {
        return 1
    }
    else {
        println "-- factorial ("+n+") Invoked --"
        return n* factorial(n-1);
    }
}.memoize()

println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"
println "Factorial of 3 = " + factorial(3) + "\n"
println "Factorial of 4 = " + factorial(4) + "\n"
println "Factorial of 5 = " + factorial(5) + "\n"


Results:
-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2

-- factorial (3) Invoked --
Factorial of 3 = 6

-- factorial (4) Invoked --
Factorial of 4 = 24

-- factorial (5) Invoked --
Factorial of 5 = 120


With the memoization, you can see that the recursion has reduced to just once. The calls are cached and the results are used in the subsequent calls.

Memoization support in Groovy


Groovy like many other functional languages supports Memoization as a language feature. It can memoize both closures and methods. What we saw in the example above was closure memoization.

Closures


Groovy supports the following methods on Closures for memoization.

closure.memoize() - This caches all the calls to the closure. There are no limits. The cache gets cleared only upon garbage collection.

closure.memoizeAtMost(n) - This caches a max of n values. It uses an LRU (Last Recently Used) strategy for eviction. Garbage collection applies here as well and the cache may get cleared.

closure.memoizeAtLeast(n) - This caches all calls without limit, just like memoize. The difference is that when the garbage collection happens, it protects n recent calls from being evicted.

closure.memoizeBetween(n, m) - This protects n items from garbage collection, but sets a max limit of m values on the cache.

Let's check out an example with memoizeAtMost.
//Example with memoizeAtMost

factorial = { n ->
    if(n == 0) {
        return 1
    }
    else {
        println "-- factorial ("+n+") Invoked --"
        return n* factorial(n-1);
    }
}.memoizeAtMost(1)

println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"
println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"

Results:
-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2

-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2


As you can see from the result, the cache size for factorial is just 1. So, when we call factorial(1) followed by factorial(2), it evicts the result for 1 and replaces it with 2. If you set the memoizeAtMost to 2, the above example would result as follows

-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2

Factorial of 1 = 1

Factorial of 2 = 2


Method memoization


Memoization can be applied to class methods by annotating them with @Memoized. By default, the cache size has no limit, but if you wish to set a limit, then you can set the maxCacheSize attribute on the annotation and it would behave just like memoizeAtMost.
Let's see an example.
//Example with @Memoized

import groovy.transform.Memoized

class MemoizationTest {

    @Memoized(maxCacheSize = 2)
    def factorial(n) {
        if(n == 0) {
            return 1L
        }
        else {
            println "-- factorial ("+n+") Invoked --"
            return 1L*n* factorial(n-1);
        }
    }

}

def test = new MemoizationTest()
println test.factorial(1)
println test.factorial(2)
println test.factorial(3)


Results:
-- factorial (1) Invoked --
1
-- factorial (2) Invoked --
2
-- factorial (3) Invoked --
6



Conclusion


As we saw in this post with examples, Memoization is a powerful technique that can be used to cache results and improve the overall performance of a program. Use it on functions where the result is expected to be the same for the given input.


No comments:

Post a Comment