Groovy - What is Memoization?

Memoization (a.k.a tabling) is a technique in programming where the result of a function call is stored and when the function is subsequently called with the same parameters, the result is returned from the store instead of executing the function again.

It is a special form of caching that applies only to functions and methods. Some programming language may support eviction policies, invalidation, and disk storage that a typical caching solution offers.

Common use cases for memoization are dynamic programming problems like Fibonacci sequence and factorial computation. It is commonly used to cache frequent computations that can add up to significant processing time. It is best to avoid memoization where the results of the function call may vary e.g. functions that make database calls may return different results and are not suited for memoization.

Let's take a simple example in groovy that defines a function for factorial. Factorial of a positive whole number can be defined as factorial(n) = n * factorial(n-1) . So, it can be defined as a recursive function like so.

//Example without memoization

factorial = { n ->
    if(n == 0) {
        return 1
    }
    else {
        println "-- factorial ("+n+") Invoked --"
        return n* factorial(n-1);
    }
}

println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"
println "Factorial of 3 = " + factorial(3) + "\n"
println "Factorial of 4 = " + factorial(4) + "\n"
println "Factorial of 5 = " + factorial(5) + "\n"


Results:
-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 2 = 2

-- factorial (3) Invoked --
-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 3 = 6

-- factorial (4) Invoked --
-- factorial (3) Invoked --
-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 4 = 24

-- factorial (5) Invoked --
-- factorial (4) Invoked --
-- factorial (3) Invoked --
-- factorial (2) Invoked --
-- factorial (1) Invoked --
Factorial of 5 = 120


The program above prints a line every time the function is called. As you see in the results, every call to factorial recurses until it hits factorial(0). You can see a bunch of redundant calls happening for computing factorials for numbers that have already been computed before. This algorithm could be easily improved by caching the value of each invocation and using them in the subsequent invocations. And memoization would help with that.

Now let's see the same example with the memoized closure function. The line where the memoization is applied is highlighted.

//Example with memoization

factorial = { n ->
    if(n == 0) {
        return 1
    }
    else {
        println "-- factorial ("+n+") Invoked --"
        return n* factorial(n-1);
    }
}.memoize()

println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"
println "Factorial of 3 = " + factorial(3) + "\n"
println "Factorial of 4 = " + factorial(4) + "\n"
println "Factorial of 5 = " + factorial(5) + "\n"


Results:
-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2

-- factorial (3) Invoked --
Factorial of 3 = 6

-- factorial (4) Invoked --
Factorial of 4 = 24

-- factorial (5) Invoked --
Factorial of 5 = 120


With memoization, you can see that the recursion has reduced to just once. The calls are cached and the results are used in the subsequent calls.

Memoization support in Groovy


Groovy, like many other functional languages, supports Memoization as a language feature. It can memoize both closures and methods. What we saw in the example above was closure memoization.

Closures


Groovy supports the following methods on Closures for memoization.

closure.memoize() - This caches all the calls to the closure. There are no limits. The cache gets cleared only upon garbage collection.

closure.memoizeAtMost(n) - This caches a max of n values. It uses an LRU (Last Recently Used) strategy for eviction. Garbage collection applies here as well and the cache may get cleared.

closure.memoizeAtLeast(n) - This caches all calls without limit, just like memoize. The difference is that when the garbage collection happens, it protects n recent calls from being evicted.

closure.memoizeBetween(n, m) - This protects n items from garbage collection, but sets a max limit of m values on the cache.

Let's check out an example with memoizeAtMost.
//Example with memoizeAtMost

factorial = { n ->
    if(n == 0) {
        return 1
    }
    else {
        println "-- factorial ("+n+") Invoked --"
        return n* factorial(n-1);
    }
}.memoizeAtMost(1)

println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"
println "Factorial of 1 = " + factorial(1) + "\n"
println "Factorial of 2 = " + factorial(2) + "\n"

Results:
-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2

-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2


As you can see from the result, the cache size for factorial is just 1. So, when we call factorial(1) followed by factorial(2), it evicts the result for 1 and replaces it with 2. If you set the memoizeAtMost to 2, the above example would print like so

-- factorial (1) Invoked --
Factorial of 1 = 1

-- factorial (2) Invoked --
Factorial of 2 = 2

Factorial of 1 = 1

Factorial of 2 = 2


Method memoization


Memoization can be applied to class methods by annotating them with @Memoized. By default, the cache size has no limit, but if you wish to set a limit, then you can set the maxCacheSize attribute on the annotation and it would behave just like memoizeAtMost.
Let's see an example.
//Example with @Memoized

import groovy.transform.Memoized

class MemoizationTest {

    @Memoized(maxCacheSize = 2)
    def factorial(n) {
        if(n == 0) {
            return 1L
        }
        else {
            println "-- factorial ("+n+") Invoked --"
            return 1L*n* factorial(n-1);
        }
    }

}

def test = new MemoizationTest()
println test.factorial(1)
println test.factorial(2)
println test.factorial(3)


Results:
-- factorial (1) Invoked --
1
-- factorial (2) Invoked --
2
-- factorial (3) Invoked --
6



Best Practices


Memoization is not a full-fledged caching solution. A cache can be invalidated based on certain triggers and it supports other advanced features. Keep in mind the following
  • Ensure that the function being memoized, will always return the same result for the same input parameters. It should not be used for dynamic data e.g. data that is being fetched from the database and is bound to change.
  • Keep a check on the max values for Memoization. Keeping it unlimited may shoot up the memory usage and make your application latent instead.


Conclusion


As we saw in this post through examples, Memoization is a powerful technique that can be used to cache results and improve the overall performance of a program. Use it wisely, keeping in mind the limitations.


1 comment: