Thursday, July 16, 2009

SMP in Allegro CL 9.0

Franz's latest tech update states that SMP is coming to Allegro CL 9.0 and recommends people work on migrating their code, since it may break things.

With this news, I must say I'm looking forward to trying out ACL 9.0.

Jump tables follow-up

Based on comments from my previous post on jump tables, I decided to follow-up with a slightly more comprehensive analysis.

Below is the running time, in milliseconds, to execute a given case-style macro one million times with n branches; each execution resulted in the code taking the default case. These running times include the changes suggested by Juho Snellman (seconded by Duane Rettig) to make the times somewhat more accurate, that is, use funcall on a function object instead of a symbol.

At the request of Michael Weber, I included his ecase macro, except that has been renamed to case/tree and changed slightly to accept a default case.

A '—' means that I could not get the given use case to compile.

Jump table running times (ms)
Clozure Common Lisp 1.3-RC1 case ncase switch case/tree
n 10 45 52 361 42
100 278 51 443 68
1000 3447 51 725 88
SCBL 1.0.29.54.rc2
n 10 54 60 222 63
100 218 59 203 88
1000 2029 60 222 112
ACL (free edition) 8.1
n 10 40 50 280 50
100 40 50 270 50
1000 4160 40 340 70
Lispworks Personal 5.1.1
n 10 43 44 288 48
100 258 43 237 54
1000 3064 277 69
CLISP 2.47
n 10 363 550 615 804
100 390 549 693 1332
1000 368 549 612 1898

I should point out that you shouldn't put much stock into the n = 1000 case since any case macro that uses 1000 or more cases is almost certainly pathological. Duane made this point in the comments of my previous post and I think it's important to consider. It might be tempting to say that CLISP has a "better" case macro since it has the same performance on 10 elements as it does for 1000; however, this would miss the point that most other Lisp systems are about 6-9 times faster in the vast majority of cases. Even n = 100 is probably so rare that you'd never see it in daily work unless you went digging.

Below is a bar graph of the performance of ncase, switch and case/tree relative to case. Each bar represents the relative difference for the performance of a given macro on n branches compared to case; I omitted the case of n = 1000 since I can't see it being practical. A value of 0 indicates it is equivalent to case. For example, in Lispworks, the ncase macro is 5 times faster with 100 branches than case (represented by the dark blue colour bar). Conversely, in Clozure CL, the switch macro is about 7 times slower with 10 branches than the case macro (the red colour bar).

jump-tables.png

What is clear from the graphs is that switch doesn't provide any benefits over case (when the cases are all single integer values, anyway). Also, using ncase or case/tree is of some benefit when the number of cases is large, except when using Allegro CL or CLISP. Probably the most interesting aspect is that ncase and case/tree tend to be slightly worse for small numbers of branches.

There are still some interesting questions that could be explored here (Can a compelling use case for switch be made? How often do large branch case expressions appear in actual code?), but it seems to me that jump table macros are, for the most part, of academic interest only. You can probably feel pretty confident that using case is sufficient.

Sunday, July 12, 2009

Tinkering with jump tables

I've often wondered whether writing a case or cond-like macro that uses jump tables is worth the effort. To get an idea as to how much I should pursue this, I decided to take Tim Bradshaw's ncase macro and pit it against case.

Tim admits that ncase is meant to show what is possible with macros and that it may be the case it may not make any performance improvement at all. Still, I think it would be interesting to compare it and see the results, simply because I really don't know what will happen.

In addition to Tim's jump table, I wrote a simple jump table implementation called switch that uses a hash table to look up the value and associates the body with a thunk; the hash table is computed with an load-time-value to minimize table creation overhead. switch has the disadvantage of performing a hash table lookup and calling a function with the advantage that the values to match can be anything you would compare with equal.

To compare the running times, I generated functions that contain 10, 100 and 1000 branches respectively (plus the default branch). This was done for each of case, ncase and switch. Each branch would compare a single integer and return that integer. For example, the body of case-100 would look something like the following.

(case v (1 1) (2 2) ... (100 100) (t nil))

It's easy to see why I generated these function bodies with macros...

I ran each function one million times, making it hit the default case each time. The time required to run the function one million times on different Lisp systems are given below in milliseconds. These were run on a 2.16 GHZ 17" Macbook Pro running Mac OS X 10.5.7.

SBCL 1.0.29.54.rc2 case ncase switch
10 469 469 633
100 631 471 612
1000 2455 469 630
CLISP 2.47 case ncase switch
10 348 558 579
100 391 561 645
1000 375 560 571
ACL Free edition 8.1 case ncase switch
10 40 40 280
100 40 50 270
1000 4290 50 350

I was quite surprised at the results. I ran these a few times to make sure I wasn't missing something. In particular, I'm drawn to the results from CLISP. It seems lookups with case is constant. The reason this is so compelling is that SBCL and ACL are quite slow in these cases. ACL seems to be very fast otherwise.

I'll have to take a closer look at the disassembled code to see what is going on (or perhaps you can mention something in the comments if you know). Regardless, it's something that deserves some more attention.

Some notes:

  • SBCL took approximately 4 minutes to compile the ncase with 1000 branches. It also used up an awful lot of memory in the process. It took more than 10 seconds to compile the switch with 1000 cases.
  • The code I used to get these numbers can be found on paste.lisp.org.

Franz's facelift

I guess it's been a while since I visited Franz's website. It appears to have gotten a serious makeover. Last I checked, it looked very 1990s-ish. It looks much better now.

I'd rank it about as effective as SBCL's simplicity and considerably better than the cluttered CLISP bucket 'o links.

Wednesday, March 25, 2009

Thanks to the ILC 2009 Twitterers

Just wanted to say thanks to all those who posted updates about ILC 2009 on Twitter. It made me feel bad for having to miss the conference, but at least I got to keep up on what was going on.

Now, anyone up for writing a blog entry providing a little more detail?

Monday, March 23, 2009

Following ILC 2009 on Twitter

Like Zach said, you can follow the updates tagged with #ilc2009. With what danbernier is posting, I'm wondering if it's the International Clojure Conference... I'm curious to see what today brings now that the tutorials are over.

CADR Emulator on Leopard

I managed to get the CADR Emulator up and running on OS X 10.5; the current distribution only supports 10.4. The changes were minimal, but I'm still testing it out to make sure nothing is seriously wrong.

One thing is problematic: it doesn't like the year 2009.
Screenshot of CADR emulator thinking the year is 19109
I tried entering 1990 once, but it resulted in my machine locking up. I'm pretty sure it was the emulator that locked up my machine, but I don't know if the date had anything to do with it. (Since I'm not at ILC 2009 — sigh — I figured I should at least do something Lisp-y.)