Andreas Gal

June 11, 2007

Parallel Compilation of Trace Trees

Filed under: Trace Compilation — Andreas @ 6:38 pm

We added a clone interface to trace trees that makes it possible to create deep copies of a tree at reasonable cost. All compilation runs now execute on a copy of the recorded tree. This has a couple of advantages.

On the one hand, the transient compiler state can be stored directly in the IR nodes. Previously we used a separate per-compiler-thread data structure that was allocated for each node, and accessing the compiler state for an IR node required a rather expensive indirection. With the tree copied, the IR information and the compiler state for each node are stored in the same place, which reduces access cost and improves cache locality.

Another advantage is that the compiler state in the IR does not have to be cleared when the tree is recompiled. Previously, some state (such as register allocation) had to be reset between compiler runs. Now the recorded trace always stays in a pristine state, and the compiler only modifies copies of it.
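To make the idea concrete, here is a minimal Java sketch of compiling on a copy. It is not our actual code: the names `IrNode`, `deepCopy`, `assignedRegister` and `TraceCompiler` are made up for illustration, and the "register allocation" is just a pre-order numbering that stands in for real compiler state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical IR node; names and fields are illustrative assumptions.
class IrNode {
    final String op;                                  // recorded IR information
    final List<IrNode> children = new ArrayList<>();
    int assignedRegister = -1;                        // transient compiler state, -1 = unassigned

    IrNode(String op) {
        this.op = op;
    }

    // Deep copy of the subtree rooted at this node. The recorded tree is
    // never modified, so every copy starts with unassigned compiler state.
    IrNode deepCopy() {
        IrNode copy = new IrNode(op);
        for (IrNode child : children) {
            copy.children.add(child.deepCopy());
        }
        return copy;
    }
}

class TraceCompiler {
    // Compiles a private copy of the recorded tree and returns the number
    // of registers handed out. The recorded tree itself is left untouched.
    static int compile(IrNode recordedRoot) {
        IrNode work = recordedRoot.deepCopy();
        return allocate(work, 0);
    }

    // Toy stand-in for register allocation: numbers the nodes in pre-order
    // and stores the result directly in each node of the working copy.
    private static int allocate(IrNode node, int next) {
        node.assignedRegister = next++;
        for (IrNode child : node.children) {
            next = allocate(child, next);
        }
        return next;
    }
}
```

Because `compile` only ever writes to its own copy, it can be called repeatedly on the same recorded tree without any reset step in between, and the state lives in the node itself rather than in a side table.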

One problem we ran into was guard nodes. When compiling code from a copy of the tree, it's important to store the address of the guard node in the original recorded tree in the generated code. Otherwise, when the VM resumes from compiled code, it will access a cloned guard node and thus try to extend some temporary copy of the tree instead of extending the actual recorded tree. We implemented this with a self-pointer in the guard instruction that is initialized to this in the constructor. Subsequent clone() invocations simply copy this reference, keeping it pointing to the original guard in the recorded tree.
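The self-pointer trick can be sketched in a few lines of Java. Again, this is only an illustration under assumptions: `GuardNode`, `exitId` and `original` are hypothetical names, and the sketch relies on the fact that Java's `Object.clone()` copies fields without running the constructor.

```java
// Hypothetical guard node; class and field names are illustrative assumptions.
class GuardNode implements Cloneable {
    final int exitId;          // illustrative payload identifying the side exit
    final GuardNode original;  // self-pointer to the guard in the recorded tree

    GuardNode(int exitId) {
        this.exitId = exitId;
        this.original = this;  // a freshly recorded guard is its own original
    }

    @Override
    public GuardNode clone() {
        try {
            // Object.clone() copies the fields as they are and does not run
            // the constructor, so `original` still refers to the recorded guard.
            return (GuardNode) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}
```

A code generator working on a copy would then emit the address of `guard.original` rather than of `guard` itself, so a side exit always leads back to the recorded tree, even for a copy of a copy.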
