Groovy recently added a new feature to its closures – memoization – a simple yet very useful feature that can drastically improve the performance when applicable.
It was a shame that I didn’t know the meaning of this term that was coined way back in 1968 by Donald Mitchie (ref: http://en.wikipedia.org/wiki/Memoization) [But now that I have memoized it, next time I am asked what it is, I will reply faster!
]
So, all this new feature means is that a groovy closure can now remember the result of its invocation for a specific set of inputs – by internally caching the result. So, the first invocation for a given set of inputs does the actual work, then the result gets cached behind the scenes and any further invocations with the same inputs simply return the cached result, making the invocation as fast as it can be. Of-course, this feature should be used only when closure processing is such that a set of inputs will ALWAYS result in the same output – it MUST be side-effect-free.
Here is an example to illustrate the use. Let’s first see it without memoization:
cl = {a, b ->
sleep(3000) // simulate some time consuming processing
a + b
}
def callClosure(a, b) {
def start = System.currentTimeMillis()
cl(a, b)
println "Inputs(a = $a, b = $b) - took ${System.currentTimeMillis() - start} msecs."
}
callClosure(1, 2)
callClosure(1, 2)
callClosure(2, 3)
callClosure(2, 3)
callClosure(3, 4)
callClosure(3, 4)
Here is the output showing that everytime the time cost of closure invocation in incurred:
Inputs(a = 1, b = 2) – took 3041 msecs.
Inputs(a = 1, b = 2) – took 3000 msecs.
Inputs(a = 2, b = 3) – took 3000 msecs.
Inputs(a = 2, b = 3) – took 3000 msecs.
Inputs(a = 3, b = 4) – took 3000 msecs.
Inputs(a = 3, b = 4) – took 3000 msecs.
Here is the memoized version:
cl = {a, b ->
sleep(3000) // simulate some time consuming processing
a + b
}
mem = cl.memoize()
def callClosure(a, b) {
def start = System.currentTimeMillis()
mem(a, b)
println "Inputs(a = $a, b = $b) - took ${System.currentTimeMillis() - start} msecs."
}
callClosure(1, 2)
callClosure(1, 2)
callClosure(2, 3)
callClosure(2, 3)
callClosure(3, 4)
callClosure(3, 4)
callClosure(1, 2)
callClosure(2, 3)
callClosure(3, 4)
Here is the output showing how only the first invocation incurs the processing cost and after that the calls with the same inputs are almost free:
Inputs(a = 1, b = 2) – took 3054 msecs.
Inputs(a = 1, b = 2) – took 0 msecs.
Inputs(a = 2, b = 3) – took 3001 msecs.
Inputs(a = 2, b = 3) – took 0 msecs.
Inputs(a = 3, b = 4) – took 3000 msecs.
Inputs(a = 3, b = 4) – took 0 msecs.
Inputs(a = 1, b = 2) – took 0 msecs.
Inputs(a = 2, b = 3) – took 0 msecs.
Inputs(a = 3, b = 4) – took 0 msecs.
The trade-off involved is that you pay for time saving in terms of extra memory consumed by the internally cached inputs/outputs mappings. If you want to have more control over how many results you want to cache, other variants to memoize() are – memoizeAtMost(n), memoizeAtLeast(n), memoizeBetween(m, n)
To quickly see how memoizeAtMost(n) works, in the previous example, let’s replace mem = cl.memoize() by mem = cl.memoizeAtMost(2) – and the output changes as below (now only 2 results are cached, with the LRU ones evicted):
Inputs(a = 1, b = 2) – took 3051 msecs.
Inputs(a = 1, b = 2) – took 0 msecs. /* 1st result – cached! */
Inputs(a = 2, b = 3) – took 3000 msecs.
Inputs(a = 2, b = 3) – took 0 msecs. /* 2nd result – cached !! */
Inputs(a = 3, b = 4) – took 3001 msecs.
/* 3rd result – cached – but now (a = 1, b = 2) result evicted – on LRU basis */
Inputs(a = 3, b = 4) – took 0 msecs.
/* (a=1, b=2) got evicted, processing happens again and result cached, now evicting (a=2, b=3) result */
Inputs(a = 1, b = 2) – took 3000 msecs.
/* (a=2, b=3) got evicted, processing happens again and result cached, now evicting (a=3, b=4) result */
Inputs(a = 2, b = 3) – took 3000 msecs.
/* (a=3, b=4) got evicted, processing happens again */
Inputs(a = 3, b = 4) – took 3000 msecs.
Good feature, isn’t it?
Note: The new feature is only available from groovy version 1.8-beta-3 onwards.
[...] This post was mentioned on Twitter by Guillaume Laforge, marek krajewski, nobeans, トドル, Roshan and others. Roshan said: Groovy (New feature): Closures can now memo(r)ize their results – http://goo.gl/vnUz [...]
Pingback by Tweets that mention Groovy (New feature): Closures can now memo(r)ize their results « Bits that interest me -- Topsy.com — October 18, 2010 @ 1:24 pm
Good feature! Is there any place where Groovy 1.8 API is published? Or it’s available only in trunk? I wonder how a cache key is created from closure arguments ..
Comment by Evgeny Goldin — October 18, 2010 @ 2:03 pm
Groovy doesn’t really provide version specific documentation, as far as I know. The only API it provides reflects the latest official/stable version. So currently http://groovy.codehaus.org/gapi/ points to 1.7.5
However, 1.8-beta-3 snapshot-based documentation can be found here: http://bamboo.ci.codehaus.org/download/GROOVY-GROOVYRELEASETRUNK/artifacts/build-222/documentation/gapi/index.html (package org.codehaus.groovy.runtime.memoize + changes to groovy.lang.Closure may be of interest for this feature)
Comment by roshandawrani — October 18, 2010 @ 2:14 pm
Generating key from closure args seems as simple as:
private static Object generateKey(final Object[] args) { if (args == null) return Collections.emptyList(); return asList(args); /* java.util.Arrays.asList */ }For more, check out: http://svn.codehaus.org/groovy/trunk/groovy/groovy-core/src/main/org/codehaus/groovy/runtime/memoize/Memoize.java
Comment by roshandawrani — October 18, 2010 @ 2:19 pm
Aha, so the key is just a list. I thought it would be a String, composed of arguments. Never used lists as Map keys (but I see now AbstractList provides both hashCode() and equals() so that’s Ok), will do now if it fits the problem.
Thank you!
Comment by Evgeny Goldin — October 19, 2010 @ 10:40 pm
excellent feature!
Comment by steven — October 19, 2010 @ 10:11 am
It really is! If you want to check it out right away, 1.8-beta-3′s snapshot builds(pre-release) are available at: http://snapshots.repository.codehaus.org/org/codehaus/groovy/groovy-all/1.8.0-beta-3-SNAPSHOT/
Comment by roshandawrani — October 19, 2010 @ 10:15 am
Looks like it behaves a bit odd for recursive call, or I am missing something..try this:
def startTime = System.currentTimeMillis()
def fact = {it -> sleep(100); it ? it * call(it – 1g) : 1g}
fact = fact.memoize()
println fact(70)
println fact(71)
println ((System.currentTimeMillis() – startTime) / 1000.0)
Or it’s posted here: http://groovyconsole.appspot.com/script/439003
Why doesn’t it memoize subsequent calls in this case? Because of call() in closure body isn’t routed via memoize() decorator?
Comment by Mikhail Antonov — March 14, 2011 @ 12:34 am
I think the correct way of using memoize with a recursive call is not to go through its internal “call” function, but as below (http://groovyconsole.appspot.com/script/442001):
——————————
def startTime = System.currentTimeMillis()
def mem
def fact = {it -> sleep(100); it ? it * mem(it – 1g) : 1g}
mem = fact.memoize()
println mem(60)
println ((System.currentTimeMillis() – startTime) / 1000.0)
startTime = System.currentTimeMillis()
println mem(61)
println ((System.currentTimeMillis() – startTime) / 1000.0)
——————————
Now I get output like following:
——————————
8320987112741390144276341183223364380754172606361245952449277696409600000000000000
6.175 (time taken for fact(60))
507580213877224798800856812176625227226004528988036003099405939480985600000000000000
0.203 (time taken for fact(61) due to memoize savings)
——————————
Comment by roshandawrani — March 14, 2011 @ 10:31 am
It’s strange that trying to call Closure.memoize() like mem(somestring) results in an exception for me, while using the call() method works (groovy 1.8.0 stable):
Caught: groovy.lang.MissingMethodException: No signature of method: org.codehaus.groovy.runtime.memoize.Memoize$MemoizeFunction.doCall() is applicable for argument types: (java.lang.String) values: [ghh]
Possible solutions: call(), call([Ljava.lang.Object;), call(java.lang.Object), call([Ljava.lang.Object;), equals(java.lang.Object), isCase(java.lang.Object)
at Tghh.main(Tghh.groovy:89)
Comment by avalon — June 20, 2011 @ 9:54 am
Please bring it up on Groovy User mailing list with the exact code you tried.
Comment by roshandawrani — June 20, 2011 @ 11:29 am