Andreas Gal

August 22, 2008

Tracing the Web

Filed under: Trace Compilation — Andreas @ 12:57 pm

We have landed!

For the last two months I have been working with Mozilla on a just-in-time compiler for the JavaScript engine in Firefox, and a few hours ago this project (codenamed TraceMonkey) has landed in the main Firefox development tree.

TraceMonkey is a trace-based JIT compiler and it pushes the envelope on JavaScript performance. On average, we speed up Apple’s popular SunSpider benchmarks by factor 4.6 over the last release of Firefox. The overall runtime of SunSpider improved by about 1.83x (parts of SunSpider excercise things like the regular expression engine which is outside of the scope of JIT compilation, hence the lesser overall speedup). For the SunSpider ubench suite, which focuses on core JavaScript language features, we achieve a speedup of 22x. Whichever metric you chose to apply, Firefox now has the fastest JavaScript engine in the world.

TraceMonkey Performance relative to Firefox 3.0

TraceMonkey Performance relative to Firefox 3.0

Mike Schroepfer put together a demo showing the real-world performance impact of TraceMonkey. You should also check out Brendan Eich’s and Mike Shaver’s blog post about TraceMonkey, as well as David Anderson’s updates on our 64-bit x86 port (yes, we do 64-bit!).

Dynamic Compilation with Traces

Traditional just-in-time compilers (like Sun’s Hotspot VM) are in their design and structure very similar to static compilers (like GCC). They observe which methods get executed frequently, and translate the method into native machine code once a certaint threshold has been reached. While such methods often contain performance-critical parts (such as loops), they often also contain slow paths and non-loopy code, which barely if at all contributes to the runtime of the method. A whole-method compiler, however, has to always analyze and translate the entire method, even if parts of it are not particularly “compilation-worthy”.

Trace-based compilation takes a very different approach. We monitor the interpretation of bytecode instruction by the virtual machine and scan for frequently taken backwards branches, which are an indicator for loops in the underlying program. Once we identify such a loop start point, we follow the interpreter as it executes the program and record the sequence of bytecode instructions that get executed along the way. Since we start at a loop header, the interpreter will eventually return to this entry point once it completed an iteration through the loop. The resulting linear sequence of instructions is what we call a trace.

Traces represent a single iteration through a loop, and can span multiple methods and program modules. If a function is invoked from inside a loop, we follow the function call and inline the instructions executed inside the called method. Function calls themselves are never actually recorded. We merely verify at runtime that the same conditions that caused that function to be activated still hold.

Trace Trees and Nested Trace Trees

TraceMonkey uses a particular flavor of trace-based compilation which I described in my dissertation: Trace Trees. Loops often consist of more than a single performance-relevant path, and Trace Trees allow to capture all of these and organize them in a tree-shaped data structure which can be compiled trace-by-trace yet produces a globally optimized result for the entire loop. To deal with nested loop, these trees can also be nested inside of each other, with outer loop trees calling the inner loop tree.

Control-Flow Graph representation of a loop with a nested condition.

Trace Tree for the code shown int he Control-Flow Graph. Traces are recorded starting at the loop header (A) and connect back to A after completing an iteration.

A particular advantage of Trace Trees is the fact that they always represent a loop and thus enter function frame and leave function frame operations are always balanced as long we stay on trace. Thus, we can actually completely optimize away the overhead of function calls. As long we stay on trace (which in case of a loop we usually do for many iterations), we don’t construct and destroy function frames. Instead, we simply execute the inlined trace we recorded. Function frames for inlined calls are only constructed should we detect that we have to leave the trace (for example because we reached the end of the loop).

Type Specialization

Trace Trees by their very nature are the result of a control-flow speculation. We speculate that loops tend to execute the same sequence of instructions over and over, which is usually true for many applications. In TraceMonkey we go a step further and also speculate on types.

JavaScript in contrast to Java or C/C++ is a dynamically typed language. Variables are declared by name only, and their type will be determined automatically once a value is assigned to them. Assigning values with different types to a variable changes the type of the variable on the fly to match the new value’s type. Executing such dynamically typed code has been traditionally fairly expensive. Type specialization eliminates much of this overhead.

Mike Shaver ran some benchmarks, comparing the performance of simple loops written in JavaScript and C. Our JIT generates code that is roughly equivalent to the performance of unoptimized C code (gcc -O0). We achieve this through aggressive type speculation. Whenever we see a program assign only integers to a variable, for example, we specialize the generated machine code to hold that variable in an integer machine register. Guards in the traces ensure that the type doesn’t unexpectedly change, in which case we leave the trace and let the interpreter handle this (unexpected and often infrequent) case.

Type specialization removes much of the principal overhead associated with dynamically typed languages, and as we further improve our JIT we expect to get fairly close to the performance of statically typed languages such as Java or C.

Traces Everywhere

Our work on TraceMonkey was done in close collaboration with Adobe’s Tamarin Tracing project. In fact, TraceMonkey and Tamarin Tracing share the same core tracing backend (nanojit), which was contributed by Adobe. Adobe has been criticized in the last few month for the slow performance of Tamarin Tracing on untyped JavaScript code. However, Tamarin Tracing is first and foremost a JIT compiler for ActionScript, a typed JavaScript dialect. While Tamarin Tracing does run untyped code, its not particularly optimized (yet) for this task.

TraceMonkey shows the full potential of Adobe’s nanojit backend when combined with a VM that was specifically designed and optimized for untyped JavaScript code (SpiderMonkey), and we expect much of our work to make its way into nanojit and Tamarin Tracing.

People

TraceMonkey was a tremendous group effort of a large group of extremely talented people. Much of the recent advances in the area of nested trees, aggressive type speculation and runtime type inference are based on work done by graduate students at our research group at UC Irvine (Michael Bebenita, Mason Chang, Marcelo Contra, Gregor Wagner and others). Our research was generously funded by a grant from the National Science Foundation (Principal Investigator Professor Michael Franz, Program Director Dr. Helen Gill) as well as grants and donations from Microsoft, Sun Microsystems, Intel, and last but not least Mozilla.

For me, it has been an amazing opportunity to spend the last two month here at Mozilla, turning our research ideas into actual product code. Its hard to describe what it feels like to work alongside people like Brendan Eich, the inventor of JavaScript, or Mike Shaver, Mozilla’s new VP Engineering and life-long JavaScript VM veteran. And even interns around here are rocket scientists. David Anderson, one of Mozilla’s interns, wrote a complete 64-bit backend for us over the summer, making TraceMonkey the first JavaScript JIT capable of targeting x86-64.

TraceMonkey was developed in close collaboration with Edwin Smith and his Tamarin Tracing team at Adobe, and the web at large owes Adobe a great deal of gratitude for open-sourcing the Tamarin and Tamarin Tracing VMs, allowing Mozilla to build TraceMonkey on top of Tamarin Tracing’s nanojit backend. nanojit is a small and highly efficient trace-based just-in-time compiler backend that is language agnostic and highly portable, and I think it has a bright future. It has just landed in Firefox, and hopefully we will see it pop up in a future release of Adobe’s Flash Player soon.

The Road Ahead

Landing in the central Firefox repository was a big step for us, but there is also definitively a lot of work ahead of us. We are now at the point where we trace a lot of code in benchmarks and on the web, but there is a lot more coverage we will add over time.

Also, we are far away from having exhausted all the potential of trace compilation and we plan to add many features and optimizations over the next few month. Our current speedups are just the beginning of whats possible:

  • Improve register allocation and code generation in nanojit.
  • Runtime analysis of builtins (machine code) to reduce spill overhead of builtin calls (Gregor Wagner from UCI did some work on this recently.)
  • Bring performance of the ARM backend up to par with x86 and x86-64 backends and add a PowerPC backend (joint work with Adobe).
  • Add tree-recompilation and parallel compilation (based on our prior work on Parallel Dynamic Compilation, Mohammad Haghighat from Intel has been looking into this for nanojit).
  • Add more advanced trace optimization techniques like Tree Folding, Load Propagation and Escape Analysis.

Our goal is to eventually close the performance gap between JavaScript and traditional desktop languages, and we believe that for many applications this will be possible.

In parallel to our work with Mozilla on JavaScript performance, we also have a number of exciting tracing-related projects going on at UC Irvine. Mason Chang, one of our graduate students, is currently working with Adobe on the Tamarin Tracing VM, adding context threading and trace visualisation. Michael Bebenita from UCI is currently interning with Sun Microssystems and has been making great progress integrating our Java trace compiler into Maxine, and we plan on switching to Maxine as our main research platform for Java compilation. Alexander Yermolovich (also UC Irvine) is working with Adobe this summer on an exciting project involving fast execution of rich dynamic content that Adobe will hopefully announce to the public soon.

If you are interested in their work, check out their blogs (linked from my website). For further reading material on traces and trace compilation you can also take a log at my earlier blog posts on this topic.

Update: Mason Chang did some benchmarks comparing TraceMonkey to Apple’s WebKit/SquirrelFish VM. Looks like we are on average 2.5x faster than SquirrelFish (about 15% faster on total runtime).

I am looking for a tenure-track faculty position for Fall 2009 to continue my research on virtual machines, dynamic compilation and type-safe languages.

42 Comments »

  1. Awesome. Plain awesome.

    Comment by Nuss — August 22, 2008 @ 2:33 pm

  2. [...] into Spidermonkey. You can hear more about the development from those that were involved: Andreas Gal, Mike Shaver, and Brendan [...]

    Pingback by John Resig - TraceMonkey — August 22, 2008 @ 2:40 pm

  3. [...] just let the cat out of the bag on their new TraceMonkey project. Brendan Eich, Mozilla’s CTO, describes it [...]

    Pingback by Ajaxian » JavaScript JIT: The Dream Gets Closer (in Firefox) — August 22, 2008 @ 4:19 pm

  4. Great work! So where is the Windows x86-64 version to test??

    Comment by Steven Roussey — August 22, 2008 @ 4:52 pm

  5. As far as I know Mozilla doesn’t ship a win64 version of Firefox (but its supposed to be possible to compile it by hand I was just told). We will try to get some more test coverage for linux64 over the next few days. The JIT is fairly agnostic to linux/win, so once the JIT is stable the rest is up to the browser team as far as a win64 version is concerned.

    Comment by Andreas — August 22, 2008 @ 4:58 pm

  6. Thanks Andreas — I have Vista x64 on all our Windows machines, which we had to do ourselves. But I just got a new TouchSmart for home, and it comes with Vista x64 preinstalled (no other option), so I guess the transition is really starting to happen for 64-bit in the consumer space.

    Sadly I can’t get Minefield/Fx 3.1 to work on my page I wanted to test, even with JIT turned off. I guess I will have to wait…

    Comment by Steven Roussey — August 22, 2008 @ 5:33 pm

  7. [...] Fast Is TraceMonkey In Real World? Everybody is writing about TraceMonkey today, the just-landed-on-trunk version of the SpiderMonkey JavaScript [...]

    Pingback by Home of KaiRo: How Fast Is TraceMonkey In Real World? — August 22, 2008 @ 6:45 pm

  8. [...] just let the cat out of the bag on their new TraceMonkey project. Brendan Eich, Mozilla’s CTO, describes it [...]

    Pingback by JavaScript JIT: The Dream Gets Closer (in Firefox) | about ICT — August 22, 2008 @ 7:36 pm

  9. [...] more about this from Andreas Gal, Brendan Eich, Mike Shaver and Mike [...]

    Pingback by To shell and back » Blog Archive » Firefox 3.1 to deliver massive Javascript performance improvements — August 22, 2008 @ 8:49 pm

  10. [...] lot has been said already, but I am really excited about much more than the “look at how we run Sun [...]

    Pingback by TraceMonkey: DOM, Canvas, Opensource and more on Dion Almaer's Blog — August 22, 2008 @ 10:12 pm

  11. Fantastic stuff. Very much appreciate your efforts in radically expanding the boundaries of possibility for both Mozilla and Javascript.

    Comment by voracity — August 23, 2008 @ 1:19 am

  12. Congrats! I took ics142b with you in spring ’08 and thought it really cool to read about your accomplishments with this project

    Comment by Martino Pham — August 23, 2008 @ 2:43 am

  13. I’d like to know what, if any, are the disadvantages of this new approach? More memory? A slowdown in certain circumstances? Or is it just win-win :-)

    Comment by Dan — August 23, 2008 @ 4:38 am

  14. Without having gone over the details of either, this reminds me of the work presented by Craig Chambers in his 1992 thesis “The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages” and in particular chapter 10 on an optimisation technique called “Splitting”.

    It is nice to see dynamic languages making it to the fore front and leveraging their runtime nature as part the means for optimisations.

    Comment by Stewart Gebbie — August 23, 2008 @ 7:02 am

  15. [...] Ten do JavaScriptového enginu SpiderMonkey přidává JIT (just-in-time) code compilation (lepší vysvětlení a ještě lepší vysvětlení), se kterou je JavaScript v Gecku oproti předchozím verzím 1.8x [...]

    Pingback by JavaScript ve Firefoxu 3.1 trhne všechny rekordy - Martin Hassman: blog nejen o prohlížečích — August 23, 2008 @ 8:36 am

  16. Andreas:
    “but its supposed to be possible to compile it by hand I was just told”
    Yes, in fact it is possible and you can get precompiled x86-64 builds for Windows at http://www.mozilla-x86-64.com and http://www.vector64.com already a long time.

    Comment by Philipp K. — August 24, 2008 @ 11:34 am

  17. [...] in der Javascript-Roadmap. Die Entwickler Andreas Gal und Michael Franz stellen in ihrem Blog dar, wie der Tracing-Mechanismus in TraceMonkey arbeitet. Der Trick bei TraceMonkey besteht unter [...]

    Pingback by Javascript-Beschleuniger fr Firefox - Leecher.To Rapidshare Forum — August 24, 2008 @ 12:21 pm

  18. [...] in TraceMonkey arbeitet, erklren die beiden Entwickler Andreas Gal und Michael Franz in ihrem Blog. Beim ersten Durchlauf werden die Aufrufe kompiliert gespeichert und knnen so bei jedem weiteren [...]

    Pingback by Der kommende Firefox beschleunigt Web-Anwendungen enorm | silicon.de — August 25, 2008 @ 3:05 am

  19. We recently had lots of argument within our team, on whether the Javascript & AJAX approach to a rich GUI is the right one, or should we turn to something new, such as XAML. Our users experienced that the javascript based GUI is too slow, especially on slightly older machines. We definitely have to try this!

    Comment by Birotom — August 25, 2008 @ 7:44 am

  20. [...] der Tracing-Mechanismus arbeitet, erklren seine Kollegen Andreas Gal und Michael Franz in ihrem Blog: Beim ersten Durchlauf werden die Aufrufe kompiliert gespeichert, so dass sie sich bei jedem [...]

    Pingback by Firefox 3.1 soll JavaScript deutlich beschleunigen - Software | ZDNet.de News — August 25, 2008 @ 8:05 am

  21. [...] der Tracing-Mechanismus arbeitet, erklren seine Kollegen Andreas Gal und Michael Franz in ihrem Blog: Beim ersten Durchlauf werden die Aufrufe kompiliert gespeichert, so dass sie sich bei jedem [...]

    Pingback by Firefox 3.1 soll JavaScript deutlich beschleunigen - WinBoard - Die Windows Community — August 25, 2008 @ 8:59 am

  22. [...] gory details can be found on Andreas Gal’s blog post about [...]

    Pingback by Outside the Box() » JavaScript at the Speed of Sound — August 25, 2008 @ 10:04 am

  23. [...] also cool how the Mozilla/Firefox effort was fueled in part by the donation of Tamarin and Tamarin Tracing to Mozilla. I hope this collaboration can continue to be productive. This entry was written by [...]

    Pingback by Mozilla TraceMonkey speeds javascript « Shebanation — August 25, 2008 @ 10:20 am

  24. [...] için şimdiye kadar kullanılan JavaScript motoru MonkeySpider’ı, Just-in-Time derleyici TraceMonkey ile değiştiriyor. İlk bütünleştirme çalışmaları güncel test sürümlerinde [...]

    Pingback by TraceMonkey, Firefox 3.1′i hızlandırıyor | Teknoloji Ekseni - Teknoloji haberleri — August 26, 2008 @ 5:53 am

  25. [...] bunun iin imdiye kadar kullanlan JavaScript motoru MonkeySpider’, Just-in-Time derleyici TraceMonkey ile deitiriyor. lk btnletirme almalar gncel test srmlerinde balad. u ana kadar [...]

    Pingback by TraceMonkey, Firefox 3.1'i hzlandryor - KeyfiAlem.Net-Alemin En Keyiflisi — August 26, 2008 @ 8:35 pm

  26. 2.5x to squirrelfish. well done.

    Comment by hussein kanji — August 27, 2008 @ 4:24 am

  27. [...] için şimdiye kadar kullanılan JavaScript motoru MonkeySpider’ı, Just-in-Time derleyici TraceMonkey ile değiştiriyor. İlk bütünleştirme çalışmaları güncel test sürümlerinde [...]

    Pingback by TraceMonkey Firefox’u hızlandırıyor | Bilgisayar-Destek — August 27, 2008 @ 2:55 pm

  28. [...] course, 8% looks pretty puny next to the huge gains of TraceMonkey (congrats to those guys, by the way). But I’m assured that interpreter speedups still count, so I’m chugging [...]

    Pingback by Inline threading, TraceMonkey, etc. :: David Mandelin’s blog — August 27, 2008 @ 7:57 pm

  29. [...] Entwickler Andreas Gal und Michael Franz erläutern die Funktion des Tracing-Mechanismus in ihrem Blog. Nicht nur das Surfen dürfte durch TraceMonkey schneller werden, da Firefox auch intern viele [...]

    Pingback by Firefox 3.1 erhält Javascript-Beschleuniger | OSNote — August 28, 2008 @ 11:36 pm

  30. [...] preffed off. (Tracemonkey promises much faster JavaScript execution speed. Blog posts: brendan, andreas, shaver, schrep, jresig. Use the ‘tracemonkey’ hg branch for serious [...]

    Pingback by The Burning Edge » Blog Archive » 2008-08-29 Trunk builds — August 29, 2008 @ 6:18 pm

  31. [...] descrizione più dettagliata dei Trace Trees si può trovare in uno articolo e in un documento che è possibile [...]

    Pingback by Le performance di JavaScript viste da Mozilla e dal team di IE 8 « dezone — August 30, 2008 @ 2:35 am

  32. [...] Tracing the Web [Andreas Gal] – A very informative post about TraceMonkey, a JIT compiler for Javascript in Firefox (should be in version 3.1). I agree with Steve Yegge here that Javascript may be the next popular language, stay tuned folks. [...]

    Pingback by This month in bookmarks: August 2008 — September 2, 2008 @ 8:15 pm

  33. [...] The Mozilla Foundation and Brendan Eich respond to Google Chrome’s V8 hype with benchmarks of their own: fast results of TraceMonkey, slated for Firefox 3.1. On the todo list is trace recursion with trace trees. For a more details explanation and dissertation go here. [...]

    Pingback by VelociPeek - Eric’s weblog on tech » Mozilla Responds to V8: TraceMonkey With Trace Trees — September 4, 2008 @ 8:35 am

  34. I wonder if these techniques have anything in common with those of psyco for python (or the current pypy project).
    Are you familiar with them?

    Comment by Luis — September 9, 2008 @ 8:42 am

  35. [...] at the benchmarks on Andreas Gal’s blog, it’s good to see that in Andreas’ own words: “No matter which metric you choose to [...]

    Pingback by Most Fastester Browser in the World (Evar) | Machaxor.net — September 26, 2008 @ 5:12 am

  36. [...] algunas pruebas [1][2] puede llegar a ser hasta un 10% más rápido que el motor V8 de Google Chrome (que fue una de [...]

    Pingback by Minefield, Firefox supera en velocidad a Google Chrome | Mareos de un Geek — October 1, 2008 @ 2:51 pm

  37. [...] Tracing the Web [Andreas Gal]. [...]

    Pingback by Firefox Minefield: más rápido que Chrome « Conocimiento Libre (o lo que está detrás del Software Libre) — October 5, 2008 @ 8:08 pm

  38. Hey Andreas, das ist ja wirklich cool! Schöne Grüße, Olaf.

    Comment by Olaf Spinczyk — October 15, 2008 @ 1:41 am

  39. [...] 3.1 enthält die neue JavaScript-Engine TraceMonkey, die Webseiten mit JavaScript noch schneller darstellen soll als der Firefox 3.0. Verwandte [...]

    Pingback by Firefox 3.1 Beta1 steht bereit - NETZ-ONLINE — October 15, 2008 @ 3:52 pm

  40. The work you guys have been doing is simply awesome. Thanks to each and everyone in the team for all their efforts.

    Comment by Kasi Viswanath — October 16, 2008 @ 7:25 am

  41. [...] with Firefox versions 1.5 through 3.0. I’m currently using the beta of Firefox 3.1 for the faster TraceMonkey JavaScript engine and Foxmarks isn’t yet compatible. I had to disable browser extension compatbility for the [...]

    Pingback by WebWorkerDaily » Archive Foxmarks Adds Cross-Platform Password Sync « — October 17, 2008 @ 9:04 am


RSS feed for comments on this post. TrackBack URI

Leave a comment

Blog at WordPress.com.