Thursday, October 25, 2007

Specializing object representations

Using profiling information to adjust program code has been an obsession of mine lately. Ever since I started programming, I’ve been interested in program changes over time based on the program’s previous performance. Over the last few months, I’ve been working on a way to feed profiling information back to a Lisp compiler to specialize code.

Specifically, I’ve been experimenting with specializing object representations. Suppose you know that an object is of some generic type, like a sequence, but when used in a program, it is always a string. Furthermore, this information is known to some kind of profiler, which could be an external program or even part of the program itself. Lastly, the information collected by the profiler is available to the compiler. Assuming the information is reliable, the compiler could specialize the sequence object to be a string. To speed up execution, the compiler could also change operations done on the object to be specific to strings (such as changing ELT to CHAR).

Note that I am describing source level changes to the code. Also, I am assuming that the program is profiled in a comprehensive manner.

I'll quickly describe the way I've done this without delving into too many details.

Profiling

It is important to be clear about the scope of objects being specialized in this context. I decided to break down objects according to some general type. I chose to work with sets since they are non-native data types (save for treating lists as sets) and admit many possible implementations. There was a single interface used for all implementations.

The next step was to figure out the scope of those objects within the program itself. I thought that considering the entire program at once to be too broad, so I settled on only considering the places in the code where set objects are created. For simplicity, I added a function to the interface that makes set objects. I was only concerned with static calls for the time being.

The only hurdle left was collecting information that could be used by the compiler. How do you identify places in the source code to the compiler? I came up with a hack that tends to work, but that I wouldn’t count on.

I define compiler macros for the interface functions I am interested in tracking. When profiling code is to be generated, a counter is incremented in the environment of the compiler macros. The value of this counter is the identifier for the place in the source code, known as the lexical place identifier. It is reset each time the compiler is run. This relies on the compiler walking the source code in the same way each time and that no changes are made to the source between compilations.

Finally, to track information about objects, I annotate the objects themselves. Each object is tagged with the lexical place identifier of where it was created. Using the object annotations, information about the use of the object is collected into the environment of the compiler macros.

As an example, suppose the function MAKE-SET creates set objects. I define a compiler macro for it that, when profiling is enabled, takes immediate calls written as

(make-set)

and turns them into something like the following.

(locally (declare (notinline make-set))
  (let ((obj (make-set)))
    (initialize-profiling obj 1)
    obj))

The 1 here is really the value of the lexical place identifier. It would be different for each call site. The declaration is used to prevent any further invocation of the compiler macro.

Specialization

Once profiling information has been collected, the compiler is run again with specialization turned on. In order to know how to specialize the call site, the lexical place identifier is used. During profiling, each object writes information to a database keyed on the lexical place identifier. Thus, when the compiler macro goes to specialize a call, it looks up the information associated with the lexical place identifier, analyzes it, and generates code accordingly.

Using the example above, suppose analysis showed that set objects created at lexical place 1 were usually very small and that lots of them were created. This suggests using sets that are implemented as lists. The compiler macro for MAKE-SET would then produce something to create sets represented as lists at lexical place 1, such as

(make-instance 'list-set)

possibly with extra initialization arguments, such as a specific test function.

This only deals with the lexical places that create set objects directly. It is also possible to use this approach to specialize direct uses of the objects. (I’ve done this with a very general map implementation, where direct access is used when the field is known).

Caveats and stuff to do

The whole thing relies on comprehensive inputs for profiling. Obviously, there is a risk here. It also needs source code to be processed in exactly the same way for each compilation and that the source code doesn’t change. Overall, the system is a bit brittle, but it mostly works.

The hack to identify lexical places to the compiler is just that: a hack. The version I have works with SBCL and ACL, but there is an issue with LispWorks. I wanted to avoid having to write my own source analyzer and use the compiler macro mechanism instead. This allowed me to be somewhat implementation agnostic. I don’t know if it scales or not and wouldn’t be surprised if it didn’t.

ACL has an identifier in their environment objects that could be used instead of the counter. I haven’t tried it out yet, but it seems promising. Some sort of file identifier would have to be issued as well.

Currently, I only look for static calls to the interface functions. The notion of lexical places could work in a dynamic environment too, but you would have to use something other than just compiler macros.

Why?

I started out trying to implement something to model very abstract concepts (sets, maps, graphs and grammars) in such a way that the user doesn’t have to be concerned with implementation details, even more than usual. Really, I was interested in very simple interfaces.

The problem is that a single implementation is rarely appropriate for these structures. However, when an object can take on multiple representations, the overhead can be a problem and the logic is difficult to capture. So I thought it might be interesting to try and come up with ways that objects could be profiled such that either the object can change its representation or the compiler could pick one that is suitable for how it is used.

In the end, I’m aiming to build something where representation choice can be made by the programming environment for simple to moderately complex situations without any necessary user intervention and still have decent performance. Kind of like automatic memory management, but for data structures.

Finally…

The code isn’t ready for public release yet, but if anyone is interested, feel free to get in touch with me. If I can’t write it as a library, I’m thinking about integrating it into SBCL as an extension.

4 comments:

david.hilton.p said...

This makes me think of Synthesis (the OS that uses dynamic code generation as an optimization technique).

Geoff Wozniak said...

David: Thanks for the reference.

Mark Luffel said...

The Glasgow Haskell Compiler has a pragma that lets you generate specialized functions for different concrete data types, and encompasses your "changing ELT to CHAR" optimization. Using profiling to inform data type selection is awfully cool though!

http://www.haskell.org/ghc/docs/latest/html/users_guide/pragmas.html#specialize-pragma

Geoff Wozniak said...

Mark: I didn't know about that. Thanks!