The idea behind Containment Analysis is to use a trace tree to find objects that are created locally inside the trace tree and also die within the same scope. Such objects can be stack allocated and, even more importantly, do not have to be locked. So what's so different about Containment Analysis? Well, first of all, Escape Analysis is pretty cumbersome and expensive because it's an interprocedural analysis. To determine whether objects escape, you have to look not only at the code of a method, but also at all methods that call it and any method it in turn invokes. The dataflow analysis required for that is quadratic, and not necessarily cheap to perform. Trace trees represent a hot loop and inline all paths through that hot code into one data structure. This eliminates the need to do interprocedural analysis. All the code we have to consider is right there in the trace tree.
A specific problem for trace trees, however, is the fact that they are only a partial program representation. Untaken side exits are like a black hole, and we have little information on what happens when they are taken. All we know is which values from inside the loop we have to write back to which stack or local variable location so that the interpreter can resume interpretation at that point. So how can we do any meaningful escape analysis in this case?
The answer is Containment Analysis. Since trace trees represent all performance-critical paths, that's exactly what we should focus on: find objects that get created along these paths and are known to die during the iteration (by the time we return to the loop header, for example). For these objects we then hoist the allocation out of the loop and allocate only one object in the loop header, which is reused over and over.
The most significant difference in comparison to standard Escape Analysis is that it's OK to have escape points in side exits. If a side exit is taken, the code “will take the escaping object with it”. Here it comes in handy that we allocated the captured object on the heap, and not on the stack. Without having to copy the object around, we already have a valid reference that is allowed to escape if a side exit is taken.
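To make the rule concrete, here is a minimal sketch of such an analysis over a flattened trace. Everything in it is an assumption made for illustration: the `Instr` record, the opcode names (`new`, `move`, `store_heap`, `side_exit`, `loop_back`) and the function `contained_allocations` are hypothetical and not taken from any real trace compiler. The sketch is also deliberately conservative: it treats every heap store as an escape. The one point it is meant to show is that a value written back at a side exit does not count as an escape, while a value stored to the heap or carried across the back-edge does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One instruction of a hypothetical, flattened trace."""
    op: str            # "new", "move", "store_heap", "side_exit" or "loop_back"
    dst: str = ""      # name defined by "new" and "move"
    args: tuple = ()   # names read by the instruction


def contained_allocations(trace):
    """Return the allocations that never escape along a hot path."""
    allocated = set()
    alias_of = {}      # variable name -> allocation it currently refers to
    escaped = set()

    for ins in trace:
        if ins.op == "new":
            allocated.add(ins.dst)
            alias_of[ins.dst] = ins.dst
        elif ins.op == "move":
            src = ins.args[0]
            if src in alias_of:
                alias_of[ins.dst] = alias_of[src]
            else:
                # dst no longer refers to a tracked allocation
                alias_of.pop(ins.dst, None)
        elif ins.op in ("store_heap", "loop_back"):
            # Stored to the heap or still live at the back-edge:
            # the object does not die within the iteration.
            for name in ins.args:
                if name in alias_of:
                    escaped.add(alias_of[name])
        elif ins.op == "side_exit":
            # Values written back for the interpreter are tolerated:
            # the exit simply takes the object with it.
            pass

    return allocated - escaped


example_trace = [
    Instr("new", "tmp"),
    Instr("new", "node"),
    Instr("move", "t2", ("tmp",)),
    Instr("side_exit", args=("t2",)),    # tmp leaves via a side exit: fine
    Instr("store_heap", args=("node",)),  # node escapes on the hot path
    Instr("loop_back", args=("i",)),
]

print(sorted(contained_allocations(example_trace)))  # prints ['tmp']
```

In this example `tmp` is a candidate for hoisting even though it is visible at a side exit, while `node` is not, because it is stored to the heap on the hot path itself.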
So why does this work and why will it be fast? Well, it's a trace tree! A well-formed trace tree should capture all the relevant hot paths through the loop, and most of the time execution will run along the tails of the tree and back to the loop header. For these cases we will reuse the pre-allocated object. Side exits by definition happen rarely, and thus it's OK if we let the object escape in this case. We are returning to the interpreter anyway. Next time execution enters the trace tree, a fresh object will be allocated in the loop header and reused again until we side exit. Neat, isn’t it?
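The runtime behaviour described above can be simulated in a few lines. This is only a sketch under stated assumptions: `Point`, `run_trace` and `run_loop` are made-up names, a negative input value stands in for whatever rare condition triggers a side exit, and the reused object is re-initialised at the top of every iteration so that it behaves like a freshly constructed one.

```python
class Point:
    allocations = 0  # counts how many Point objects were ever created

    def __init__(self):
        Point.allocations += 1
        self.x = 0
        self.y = 0


def run_trace(values, escaped):
    """Run the 'compiled' loop until a side exit; return how many values it consumed."""
    scratch = Point()                      # hoisted allocation in the loop header
    for i, v in enumerate(values):
        scratch.x, scratch.y = v, v * v    # re-initialise the reused object
        if v < 0:                          # rare condition: side exit taken
            escaped.append(scratch)        # the exit takes the object with it
            return i + 1                   # the interpreter resumes after this value
        # Hot path: scratch is dead here, so the next iteration may reuse it.
    return len(values)


def run_loop(values):
    """Enter the trace repeatedly, as the interpreter would after each side exit."""
    escaped = []
    pos = 0
    while pos < len(values):
        pos += run_trace(values[pos:], escaped)  # fresh object on every entry
    return escaped


escaped = run_loop([1, 2, -3, 4, 5, -6, 7])
print(Point.allocations, len(escaped))  # prints 3 2
```

Seven iterations cost three allocations instead of seven: one per entry into the trace. The two objects that escaped through side exits are distinct and keep their values, because each re-entry allocates a fresh object in the loop header.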