Sunday, July 12, 2009

Tinkering with jump tables

I've often wondered whether writing a case or cond-like macro that uses jump tables is worth the effort. To get an idea as to how much I should pursue this, I decided to take Tim Bradshaw's ncase macro and pit it against case.

Tim admits that ncase is meant to show what is possible with macros and that it may be the case it may not make any performance improvement at all. Still, I think it would be interesting to compare it and see the results, simply because I really don't know what will happen.

In addition to Tim's jump table, I wrote a simple jump table implementation called switch that uses a hash table to look up the value and associates the body with a thunk; the hash table is computed with an load-time-value to minimize table creation overhead. switch has the disadvantage of performing a hash table lookup and calling a function with the advantage that the values to match can be anything you would compare with equal.

To compare the running times, I generated functions that contain 10, 100 and 1000 branches respectively (plus the default branch). This was done for each of case, ncase and switch. Each branch would compare a single integer and return that integer. For example, the body of case-100 would look something like the following.

(case v (1 1) (2 2) ... (100 100) (t nil))

It's easy to see why I generated these function bodies with macros...

I ran each function one million times, making it hit the default case each time. The time required to run the function one million times on different Lisp systems are given below in milliseconds. These were run on a 2.16 GHZ 17" Macbook Pro running Mac OS X 10.5.7.

SBCL 1.0.29.54.rc2 case ncase switch
10 469 469 633
100 631 471 612
1000 2455 469 630
CLISP 2.47 case ncase switch
10 348 558 579
100 391 561 645
1000 375 560 571
ACL Free edition 8.1 case ncase switch
10 40 40 280
100 40 50 270
1000 4290 50 350

I was quite surprised at the results. I ran these a few times to make sure I wasn't missing something. In particular, I'm drawn to the results from CLISP. It seems lookups with case is constant. The reason this is so compelling is that SBCL and ACL are quite slow in these cases. ACL seems to be very fast otherwise.

I'll have to take a closer look at the disassembled code to see what is going on (or perhaps you can mention something in the comments if you know). Regardless, it's something that deserves some more attention.

Some notes:

  • SBCL took approximately 4 minutes to compile the ncase with 1000 branches. It also used up an awful lot of memory in the process. It took more than 10 seconds to compile the switch with 1000 cases.
  • The code I used to get these numbers can be found on paste.lisp.org.

8 comments:

Patrick Stein said...

I was sure you had left off a digit in the Allegro numbers or that you knew some magic way to load things into CLISP. Alas, the numbers when I run them are roughly comparable. Here are my numbers on my machine.

michaelw said...

I wrote ECASE/TREE some time ago. Care to try it out with your setup?

Duane Rettig said...

Allegro CL doesn't bother to make a jump-table out of numeric case situations whose "spread" is greater than 256. The spread is defined in this context as the numeric distance between the lowest and highest case key values. This maximum-spread value is configurable per architectures on those which convert case forms to jump-table style, but it turns out that one byte's worth of spread covers 99.99% of all interesting situations, due to the size of the type code internally (exactly one byte). It is, of course, possible that there might be useful applications for spreads of greater than 256, but other than an artificial test case like this one, I've never seen it in practice. People who believe they truly do have such a use need to also consider the extreme space usage of such a large table, especially if the table is sparse (i.e. perhaps only one or two case keys with a very large distance between them - in that situation turning the case form into a cond clause is of course the most efficient). Obviously, the poor timing in this situation is due to the fact that the case form turns into a cond form, and similar blowouts can easily be demonstrated using non-numeric case keys.

Juho Snellman said...

Right now your test framework is causing a huge overhead in the sbcl results, since you're doing a funcall on symbols. Sbcl doesn't store the function as a slot of the symbol, so a symbol-function (implicit or explicit) is a fairly expensive operation.

If you lift the symbol-function out, e.g. by passing time-test the result of (symbol-function func) instead of func, you should see somewhat more reliable results. For example, the runtime for CASE-10 drops from 240ms to 12ms for me, and the linear complexity of case is a lot easier to observe.

Duane Rettig said...

Juho's comment about explicitly using symbol-function is apropos for all lisps, though it affects Allegro CL less because Allegro CL does store symbol-function in the symbol. Also along the same line, other than the test framework, you are mostly performing funcall on functions and not names, and at least in Allegro CL you will benefit by changing all instances of (funcall foo) to (funcall (the function foo)). This assumes, of course, that declarations are being trusted, which in Allegro CL only happens when speed is greater than safety, so e.g in your switch macro you would have to also at minimum wrap the funcall expression in a locally with an optimize declaration.

Geoff Wozniak said...

Thanks for the responses, all.

@Duane

I agree, the case with 1000 branches is extreme. 256 sounds like a reasonable number. I just did 1000 to see what would happen.

@Juho

Good point. I will try again using function objects instead of symbols.

Anton Vodonosov said...

Juho Snellman said:
> Sbcl doesn't store the function as a slot of the
> symbol, so a symbol-function implicit or explicit) is
> a fairly expensive operation.

Interesting to know why SBCl does not store function in symbols.

petrenkov said...

Pretty cool place you've got here. Thanks the author for it. I like such themes and everything connected to them. I definitely want to read a bit more on that blog soon.

Truly yours
Darek Wish