GraalVM Under The Covers
Written by Nikos Vaggalis   
Monday, 10 January 2022

At a very high level, GraalVM is a runtime that can compile bytecode into native self-contained executables as well as run programs in languages other than Java. This detailed look at it attempts to put a highly technical and difficult subject into perspective.

The starting point for this article was "Exploring Aspects of Polyglot High-Performance Virtual Machine GraalVM", a paper by M. Šipek, B. Mihaljević and A. Radovan of Rochester Institute of Technology Croatia, Zagreb, Croatia. The paper presents GraalVM's architecture and features, and examines how it resolves common interoperability and performance problems that arise when supporting multiple programming languages. I've extended it in order to provide a high-level overview in simple terms.

Virtual machines were invented in order to run programs independently of the platform and the underlying hardware. Examples are the .NET CLR; MoarVM, the modern virtual machine built for the Rakudo compiler that implements the Raku programming language; and of course the JVM. Initially the JVM was built to make Java portable across platforms by running the bytecode produced by the Java compiler. Soon enough, other languages that could emit bytecode for the JVM came along, such as Scala, Kotlin, Groovy and Clojure, extending the JVM's reach beyond Java.

That still wasn't enough, since nowadays a computing problem can sometimes be best solved by combining features or libraries from several languages. GraalVM is the attempt to bring languages that were never designed to work with the JVM together under one roof. We are talking about dynamic languages like JavaScript, Python and Ruby, as well as static ones like C/C++, Rust, Swift and Fortran.

So how does this magic take place?

The Architecture/Stack

The most important components of the GraalVM project are the Just-In-Time (JIT) compiler named Graal, and a framework used for implementing programming languages named Truffle.

At the top of the stack we find the Java HotSpot Virtual Machine, the JVM implementation that has been the default since Java 1.3, used to run code for languages that target the JVM. It also connects to the lowest layer of the GraalVM architecture, so in a way it extends itself.


One level down we find the JVM Compiler Interface (JVMCI), through which you can implement a custom optimizing JIT compiler in Java that is easier to maintain and improve than existing compilers written in C or C++. The importance of this is that a compiler written against that interface can be used by the JVM as a dynamic compiler. Graal is such a dynamic compiler. GraalVM uses Graal as its JIT compiler instead of the HotSpot compiler, something made possible by JVMCI. The best part is that Graal runs on an unmodified JVM.

A further level down we find Graal itself, the compiler used for both dynamic (JIT) compilation and static compilation, when it acts as an ahead-of-time (AOT) compiler. This AOT aspect is what gives it powerful properties such as creating native images and embedding itself into both managed and native applications.

While the Java HotSpot VM has two JIT compilers, the client compiler and the server compiler, GraalVM adds a third: the Graal compiler, which uses a new intermediate representation (IR). Graal can nevertheless reuse HotSpot's components, such as the interpreter, garbage collection, the class loading mechanism and exception handling.

Graal is split into two ends. The front end turns the bytecode into the platform-independent Graal intermediate representation (Graal IR), which takes the shape of a directed graph. The back end is responsible for translating the high-level Graal IR into a low-level IR (LIR). Graal performs several optimizations on the IR and subsequently compiles it to native machine code.
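To make the idea of a graph-shaped IR and an optimization pass over it a little more concrete, here is a minimal sketch in plain Java. It is a toy, not Graal's real IR: the names `ToyIr`, `IrNode`, `ConstNode`, `ParamNode` and `AddNode` are made up for illustration, and the only optimization shown is constant folding on a small expression graph (here a tree, the simplest kind of directed graph).

```java
// Toy illustration only: hypothetical classes, not Graal's real IR.
final class ToyIr {
    abstract static class IrNode {
        // Returns an equivalent, possibly simpler node (constant folding).
        abstract IrNode fold();
        abstract String show();
    }

    static final class ConstNode extends IrNode {
        final int value;
        ConstNode(int value) { this.value = value; }
        IrNode fold() { return this; }
        String show() { return Integer.toString(value); }
    }

    // A value that is only known at run time, such as a method parameter.
    static final class ParamNode extends IrNode {
        final String name;
        ParamNode(String name) { this.name = name; }
        IrNode fold() { return this; }
        String show() { return name; }
    }

    static final class AddNode extends IrNode {
        final IrNode left;
        final IrNode right;
        AddNode(IrNode left, IrNode right) { this.left = left; this.right = right; }

        IrNode fold() {
            IrNode l = left.fold();
            IrNode r = right.fold();
            // Both inputs are known at compile time: replace the addition by its result.
            if (l instanceof ConstNode && r instanceof ConstNode) {
                return new ConstNode(((ConstNode) l).value + ((ConstNode) r).value);
            }
            return new AddNode(l, r);
        }

        String show() { return "(" + left.show() + " + " + right.show() + ")"; }
    }

    // Builds the graph for x + (2 + 3) and returns it after folding.
    static IrNode example() {
        IrNode graph = new AddNode(new ParamNode("x"),
                new AddNode(new ConstNode(2), new ConstNode(3)));
        return graph.fold();
    }

    public static void main(String[] args) {
        System.out.println(example().show()); // prints (x + 5)
    }
}
```

A real compiler IR also models control flow and shares nodes between users, which this sketch leaves out.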

The Truffle Language Implementation Framework

Truffle is a library for building programming language implementations expressed as self-optimizing Abstract Syntax Tree (AST) interpreters. In other words, it makes it easy to develop an interpreter for your own programming language: the implementer only has to write a Java-based AST interpreter for the language, which Graal subsequently JIT-compiles into machine code. It's important to note that Graal can compile any language that is implemented as a Truffle interpreter.
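As a rough idea of what "a Java-based AST interpreter" means, the following sketch shows a tiny tree-walking interpreter in plain Java. It deliberately does not use the Truffle API; the names `ToyAstInterpreter`, `Node`, `Literal`, `ReadArg`, `Add` and `Mul` are hypothetical and only mirror the general shape of such an interpreter, where every node knows how to execute itself.

```java
// Toy sketch: plain Java, no Truffle dependency. All names are hypothetical.
final class ToyAstInterpreter {
    // Every AST node knows how to execute itself against the call arguments.
    interface Node {
        int execute(int[] args);
    }

    static final class Literal implements Node {
        private final int value;
        Literal(int value) { this.value = value; }
        public int execute(int[] args) { return value; }
    }

    static final class ReadArg implements Node {
        private final int index;
        ReadArg(int index) { this.index = index; }
        public int execute(int[] args) { return args[index]; }
    }

    static final class Add implements Node {
        private final Node left;
        private final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public int execute(int[] args) { return left.execute(args) + right.execute(args); }
    }

    static final class Mul implements Node {
        private final Node left;
        private final Node right;
        Mul(Node left, Node right) { this.left = left; this.right = right; }
        public int execute(int[] args) { return left.execute(args) * right.execute(args); }
    }

    // AST for the program "arg0 * 2 + 1", as a parser might produce it.
    static Node program() {
        return new Add(new Mul(new ReadArg(0), new Literal(2)), new Literal(1));
    }

    static int run(int... args) {
        return program().execute(args);
    }

    public static void main(String[] args) {
        System.out.println(run(20)); // prints 41
    }
}
```

In a real Truffle language the nodes would extend Truffle's own base classes, and it is those `execute` methods that the compiler later turns into machine code.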

The language's interpreter takes the program as input and works on the abstract syntax tree (AST) generated from it. During interpretation it modifies and optimizes the tree: it incorporates type feedback, replaces a node in its parent with a different node at run time, and rewrites itself based on language-specific profiling information it collects; hence the "self-optimizing" part. When the AST has stabilized (no further specialization or optimization occurs), Graal compiles the interpreter using partial evaluation, forming one combined compilation unit for the entire tree and producing Graal IR (not Java bytecode). Then, as stated above, this is translated to LIR for the target processor architecture and finally to machine code.
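The node rewriting described above can be sketched in plain Java as follows. This is an illustration under simplifying assumptions, not Truffle code: the class names (`SelfOptimizingAst`, `UninitializedAdd`, `IntAdd`, `GenericAdd`) and the `replaceWith` helper are hypothetical. An addition node starts out uninitialized, specializes itself to integers after seeing integer operands, and falls back to a generic version when that speculation turns out to be wrong.

```java
// Toy sketch of node rewriting; hypothetical names, not the Truffle API.
final class SelfOptimizingAst {
    abstract static class Node {
        Node parent; // set when the node is adopted as a child
        abstract Object execute(Object[] args);
        // Overridden by nodes that have children.
        void replaceChild(Node oldChild, Node newChild) {
            throw new IllegalStateException("node has no children");
        }
        // Swaps this node for another one in the parent's child slot.
        <T extends Node> T replaceWith(T newNode) {
            newNode.parent = parent;
            parent.replaceChild(this, newNode);
            return newNode;
        }
    }

    static final class Arg extends Node {
        final int index;
        Arg(int index) { this.index = index; }
        Object execute(Object[] args) { return args[index]; }
    }

    static final class Root extends Node {
        Node body;
        Root(Node body) { this.body = body; body.parent = this; }
        Object execute(Object[] args) { return body.execute(args); }
        void replaceChild(Node oldChild, Node newChild) {
            if (body == oldChild) body = newChild;
        }
    }

    abstract static class AddNode extends Node {
        Node left;
        Node right;
        AddNode(Node left, Node right) {
            this.left = left;
            this.right = right;
            left.parent = this;
            right.parent = this;
        }
        void replaceChild(Node oldChild, Node newChild) {
            if (left == oldChild) left = newChild;
            if (right == oldChild) right = newChild;
        }
        Object execute(Object[] args) {
            return add(left.execute(args), right.execute(args));
        }
        abstract Object add(Object l, Object r);
    }

    static final class UninitializedAdd extends AddNode {
        UninitializedAdd(Node left, Node right) { super(left, right); }
        Object add(Object l, Object r) {
            // Type feedback: choose a specialization from the first observed operands.
            AddNode specialized = (l instanceof Integer && r instanceof Integer)
                    ? new IntAdd(left, right)
                    : new GenericAdd(left, right);
            return replaceWith(specialized).add(l, r);
        }
    }

    static final class IntAdd extends AddNode {
        IntAdd(Node left, Node right) { super(left, right); }
        Object add(Object l, Object r) {
            if (l instanceof Integer && r instanceof Integer) {
                return (Integer) l + (Integer) r; // fast path for the speculated types
            }
            // Speculation failed: rewrite to the generic node and retry there.
            return replaceWith(new GenericAdd(left, right)).add(l, r);
        }
    }

    static final class GenericAdd extends AddNode {
        GenericAdd(Node left, Node right) { super(left, right); }
        Object add(Object l, Object r) {
            if (l instanceof Integer && r instanceof Integer) {
                return (Integer) l + (Integer) r;
            }
            return ((Number) l).doubleValue() + ((Number) r).doubleValue();
        }
    }

    // Records which node sits under the root before and after each call.
    static String demo() {
        Root root = new Root(new UninitializedAdd(new Arg(0), new Arg(1)));
        StringBuilder log = new StringBuilder(root.body.getClass().getSimpleName());
        root.execute(new Object[] {1, 2});
        log.append(" -> ").append(root.body.getClass().getSimpleName());
        root.execute(new Object[] {1.5, 2});
        log.append(" -> ").append(root.body.getClass().getSimpleName());
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints UninitializedAdd -> IntAdd -> GenericAdd
    }
}
```

Once the tree stops changing like this, it is stable in the sense used above, and that is the point at which compiling it becomes worthwhile.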


The extra advantage is that Truffle uses the same interoperability protocol for all interpreters and virtualizes language implementations, so from Truffle's perspective there is no significant difference between languages. Therefore all runtime-based tools, such as debuggers, profilers and dynamic analyzers, can work across languages, while the performance overhead that interoperability between programming languages normally incurs is reduced.

Also, while not described at all in the paper, probably because it is built on top of Truffle, the third component of GraalVM is Sulong, the high-performance LLVM bitcode runtime written in Java, with which you can execute programs in languages that can be compiled to LLVM bitcode. This includes static languages like C/C++, Fortran and others.

Ultimately GraalVM allows a language implementer to bring a language to the JVM in three ways:

  • The classic way of emitting JVM bytecode (JRuby)
  • Write a Truffle interpreter (TruffleRuby)
  • Emit LLVM bitcode and run it with Sulong (Rubinius)

