JavaScript_870x220

In this article, you’ll learn how JavaScript engines have evolved from a mere interpreter to a performant and efficient engine that produces highly optimized machine code. You’ll also learn about the underlying components in the V8 JavaScript engine, including how the interpreter produces bytecode with abstract syntax tree as the input and how the compiler uses this bytecode to generate optimized machine code. This article will also help in understanding some of the performance optimization techniques for objects and arrays.

This article is a part of the series on The Journey of JavaScript - from Downloading Scripts to Execution

Highlights from Part I of the Series

  1. We learned about the different ways of downloading scripts based on the use case. Scripts are downloaded synchronously and are blocking in nature. When the main thread comes across a script tag, it blocks the parsing of HTML DOM until the entire script is downloaded, parsed and executed. However, scripts can also be downloaded asynchronously, without blocking the main thread, with the use of async keyword in the script tag. If any of these scripts are not required on load, we can defer their execution until the DOM is ready by using defer keyword.
  2. JavaScript engines are built of parser, interpreter and compiler. The JavaScript source code is broken into tokens, which are fed to the parser. The parser generates an abstract syntax tree (AST) and Scopes based on these tokens.
  3. The JavaScript engines do not parse all of the source code on load. We saw the heuristics employed by the V8 engine for parsing JavaScript code.

You can read more on the above points here.

Overview of JavaScript Engines

The semantics of JavaScript is defined by the ECMAScript specifications, and these specifications are written by the TC39 committee. The JavaScript engines are required to follow these specifications while implementing different functionalities in JavaScript. Most of the major browsers have their own implementation of these engines, but they have the same end goal of meeting the semantics laid down by the TC39 committee. So, most of the performance-related techniques are applicable in almost all browsers.

Let’s look at the list of JavaScript engines in some of the major browsers.

Browser JavaScript Engine
Microsoft Edge Chakra
Firefox SpiderMonkey
Safari JavaScriptCore
Google Chrome V8

In this article, we will dig deeper into the internals of the V8 JavaScript engine.

High-Level Architecture of V8

alt High-level Architecture of V8

As we can see from the above image, the JavaScript source code is fed to the parser. The parser generates an AST. Ignition generates bytecode, and TurboFan produces optimized machine code. Don’t worry about the red and green arrows as of now. They will make sense once we get to the working of Ignition and TurboFan.

How Are JavaScript Engines Different from the Engines of Other Programming Languages?

The high-level languages like C++ and Java take two steps to convert source code to machine code. The compiler first converts the source code to an intermediate code and then the interpreter takes this intermediate code and converts it to machine code. For example, a Java file is compiled as javac filename.java. This command generates a bytecode and stores it in filename.class file. This bytecode can run on any machine that has Java Virtual Machine/Java Interpreter. It can be executed using the command java filename.class. The initial compilation step does the heavy work, and, hence, the server can execute the Java bytecode at a faster speed.

Unfortunately, the above strategy is not implemented in JavaScript engines. The JavaScript engines do not follow a two-step procedure for executing the source code. We never do compile filename.js. We directly run the JavaScript file in the browsers or on run-time with Node.js. The engines do not compile the entire source code all at once. Instead, JavaScript engines interpret the source code line by line and execute that line simultaneously.

Don’t you think this would make the execution of JavaScript a lot slower as compared to other high-level languages? JavaScript runs source code, whereas other high-level languages that we saw earlier run the optimized bytecode, which was generated in the previous step of compilation.

Let’s Look at Some Statistics

Here are the performance results with hardware usage benchmarks test. It is taken from this IBM article.

alt Comparison of nodejs with Java

Node.js uses the V8 JavaScript engine as the run-time. In the above image, we can see that Node.js performs better than Java with respect to the CPU usage, while its performance is on par with Java with respect to the usage in memory.

Here is the snapshot of most popular languages from the Stack Overflow survey of 2018.

alt Stack Overflow Survey 2018

As per the above image, JavaScript is the most popular language among professional developers.

Even though JavaScript engines skip the compilation step, its performance and popularity are better than other languages. Let’s see how JavaScript engines make this possible.

Evolution of JavaScript Engines

The first JavaScript engine was a mere interpreter. An interpreter is a software program that executes the source code line by line.

Let’s consider a JavaScript snippet below to understand how former engines used to operate.

function arrSum (arr) {
    var sum = 0
    for (var i = 0; i < arr.length; i++) {
        sum += arr[i]
    }
}

It has a simple arrSum function that adds the elements of an array arr. Inside for loop, there is only one statement that does the work of adding array elements to the variable sum.

Consider this from the point of the interpreter —

It is possible in JavaScript that an array can have numbers as well as strings at the same time. arr might have a mix of different data-types. On every iteration of the for loop, the interpreter checks the type of element in the array and accordingly performs the add/concatenation operation. + behaves as an addition operator for numbers and as concatenation operator for strings.

This type checking and computation on every iteration makes it slow. In earlier days, JavaScript was not considered as a language of choice because it was very slow compared to other high-level languages.

However, as we have seen from the statistics above, JavaScript performs better now and is a much-loved language among professional developers.

Chrome developed the first modern JavaScript engine, V8, in 2008. Its performance was much better than any prior engine. Chrome used just-in-time compilation in the V8 engine to boost its performance. Most of the browsers/JavaScript run-time systems now use the same technique to power up the execution speed of JavaScript code.

What is Just-in-Time Compilation (JIT)?

If the above code of adding elements of an array was compiled first, it would have taken some time to start up, but its execution would have been faster as compared to the earlier technique. JIT takes in the good parts of both the compiler and the interpreter. It interprets the source code line by line, produces bytecode for that line, and gives it to the compiler, which uses the profiling information to produce speculative optimizations. It compiles code during execution at run time.

Browsers started shipping in JavaScript engines with JIT and a profiler. A profiler, or a monitor, watches the code that runs and makes a note of the number of times a particular code snippet runs. If the same lines of code are run for more than some threshold value, that code is called as hot. The profiler then sends this hot code to the optimizing compiler.

The compilation details of the hot code are saved. If the profiler encounters this hot code again, it would convert it into its existing optimized version. This helps in improving the performance of the execution of JavaScript code. We’ll get to the assumptions that the compiler makes to generate optimized code later in the article.

How JIT Works in V8

alt Interpreter and Compiler in V8 JavaScript Engine

Ignition (interpreter) takes in AST and goes through its nodes one by one and accordingly produces the bytecodes. The profiler keeps an eye on the frequency at which a particular code snippet runs. If the frequency crosses some threshold value, it sends the hot code along with some profiling information to the TurboFan (compiler). TurboFan makes some assumptions to optimize this code further, and, if these assumptions hold true, it generates an optimized version for the hot code. The green arrow that signifies optimization success!

If the assumptions are not correct, it falls back to the bytecode that was generated by the Ignition. This is called a deoptimization or optimization bailout. The red arrow signifies de-optimization!

Let’s break JIT into parts and understand the working of each of the components in detail.

How Ignition Generates Bytecode

Bytecodes are considered small building blocks that can be composed together to build a JavaScript functionality. They abstract the low-level details of machine code. V8 has hundreds of bytecodes for different functionalities. For example, it uses the Add bytecode for addition operator and CreateObjectLiteral bytecode for creating an object.

Ignition uses a register machine to hold the local state of the registers. It has a special register called an accumulator that store the previously computed value.

Let’s consider a simple JavaScript snippet.

function add (x, y) {
    return x + y;
}
add (1, 2);

Here’s the bytecode generated for the above code:

alt ByteCode of the add function generated by Ignition

Focus only on the right-hand part. The registers a0 and a1 hold the value of the formal parameters. Add is a bytecode that is used for adding the values in registers a0 and a1.

Before jumping onto the working of TurboFan, we have to understand two important concepts in V8:

  1. Implementation of the object model
  2. Implementation of arrays

Implementation of the Object Model

The ECMAScript specification defines objects like dictionaries with string keys that map to values. In this section, we’ll learn how JavaScript engines store objects and how they implement property access on these objects.

let pokemonObj = {
    id: 12,
    name: 'Butterfree',
    height: 11,
    weight: 22
}

Below is the AST for the object pokemonObj.

{
  "type": "Program",
  "start": 0,
  "end": 87,
  "body": [
    {
      "type": "VariableDeclaration",
      "start": 0,
      "end": 87,
      "declarations": [
        {
          "type": "VariableDeclarator",
          "start": 4,
          "end": 87,
          "id": {
            "type": "Identifier",
            "start": 4,
            "end": 14,
            "name": "pokemonObj"
          },
          "init": {
            "type": "ObjectExpression",
            "start": 17,
            "end": 87,
            "properties": [
              {
                "type": "Property",
                "start": 23,
                "end": 29,
                "method": false,
                "shorthand": false,
                "computed": false,
                "key": {
                  "type": "Identifier",
                  "start": 23,
                  "end": 25,
                  "name": "id"
                },
                "value": {
                  "type": "Literal",
                  "start": 27,
                  "end": 29,
                  "value": 12,
                  "raw": "12"
                },
                "kind": "init"
              },
              {
                "type": "Property",
                "start": 35,
                "end": 53,
                "method": false,
                "shorthand": false,
                "computed": false,
                "key": {
                  "type": "Identifier",
                  "start": 35,
                  "end": 39,
                  "name": "name"
                },
                "value": {
                  "type": "Literal",
                  "start": 41,
                  "end": 53,
                  "value": "Butterfree",
                  "raw": "'Butterfree'"
                },
                "kind": "init"
              },
              {
                "type": "Property",
                "start": 59,
                "end": 69,
                "method": false,
                "shorthand": false,
                "computed": false,
                "key": {
                  "type": "Identifier",
                  "start": 59,
                  "end": 65,
                  "name": "height"
                },
                "value": {
                  "type": "Literal",
                  "start": 67,
                  "end": 69,
                  "value": 11,
                  "raw": "11"
                },
                "kind": "init"
              },
              {
                "type": "Property",
                "start": 75,
                "end": 85,
                "method": false,
                "shorthand": false,
                "computed": false,
                "key": {
                  "type": "Identifier",
                  "start": 75,
                  "end": 81,
                  "name": "weight"
                },
                "value": {
                  "type": "Literal",
                  "start": 83,
                  "end": 85,
                  "value": 22,
                  "raw": "22"
                },
                "kind": "init"
              }
            ]
          }
        }
      ],
      "kind": "let"
    }
  ],
  "sourceType": "module"
}

Notice the use of properties array that holds the key-value pairs of the object. As per the ECMAScript specification, the keys of an object are to be mapped to their respective property attributes. The property attributes tell more about the configuration of that key in the object.

Here is the list of the property attributes:

Attribute Description Default Value
value The value associated with the property. Can be any valid JavaScript value (number, object, function, etc.). undefined
enumerable true if and only if this property shows up during enumeration of the properties on the corresponding object. false
writable true if and only if the value associated with the property may be changed with an assignment operator. false
configurable true if and only if the type of this property descriptor may be changed and if the property may be deleted from the corresponding object. false

Let’s say a program has a hundred instances of the pokemonObj object, which is likely the case in common scenarios. That would mean the engine is supposed to create a hundred objects with all of the four keys of pokemonObjid, name, height and weight — along with their property attributes. That’s such a waste of memory to store repeated instances of the metadata of an object. The JavaScript engines tackle this problem by storing a shared copy of the metadata of an object. So, for hundreds of pokemonObj instances, it creates just one object to store the common metadata.

alt Multiple Objects sharing same metadata

We can optimize this more. Each of these objects holds the same four keys — id, name, height and weight. They all have the same shape of pokemonObj. We can create one shape with these four keys and let all the instances point to that shape. This will become clear from the diagram below.

alt JavaScript Objects share same shape

As it can be seen from the image above, all of the different Pokemon refer to the same shape (id, name, height, weight), and each of the keys in this shape structure refers to its respective property attribute’s structure. That’s a significant saving on the memory side!

There is an additional property offset that is included with the list of property attributes. Please note id has an offset of 0; name has an offset of 1; and so on. If we follow the pattern, it can be inferred that the keys in pokemonObj are assigned a sequential index as per their position in the properties array defined on the AST.

Let’s understand the need for the offset property.

For accessing any property on the pokemonObj, we use the . (dot) operator as pokemonObj.property. Let’s say we want the values of name, height and abilities of a Pokemon.

let name = pokemonObj.name // Butterfree
let height = pokemonObj.height // 11
let abilities = pokemonObj.abilities // undefined

When the JavaScript engine encounters something like pokemonObj.name, it searches the name property defined on the shape of the pokemonObj. If it is found, it returns the value. Since abilities is not defined on the shape of pokemonObj, the engine searches for the abilities property on the prototype object of pokemonObj. It keeps traversing down the chain until it either finds abilities key or reaches to the end of the chain. In our case, it returned undefined, which means abilities was not defined on any of the ancestors of pokemonObj. Property accesses are hard in JavaScript. It is an expensive operation for an engine to traverse through the entire prototype chain for a single property.

Property access for keys name and height is very simple because of the offset property. The JavaScript engine simply returns the value from the object using the offset as defined on a particular property. For name, it would return the first value from pokemonObj, and for height it will return its second value.

Let’s say, we’ve got one more Pokemon called weedle joining in our gang. This one has some unique abilities and we really want to store all of its abilities in our pokemonObj object.

let weedle = {
    id: 13,
    name: 'weedle',
    height: 3,
    weight: 32,
    abilities: 'run-away'
}

We already have defined the shape for pokemonObj. JavaScript allows adding/deleting properties from an object at run-time. How do we handle the addition of abilities property to our pokemonObj shape? Should we make a new shape altogether for this object, or should we extend the previous shape? You might have guessed it right. We’ll extend our original pokemonObj shape to make room for an additional key called abilities.

alt JavaScript Objects extend Shapes

Extending shapes makes it easy to store objects that differ in some properties from their counterparts.

This can also be viewed as a hierarchical chain that starts from an empty object and eventually builds up the entire object.

let obj = {}
obj.a = 'a'
obj.b = 'b'
obj.c = 'c'

It starts with an empty shape. obj.a = 'a' statement extends the empty shape and adds one property a to it. This goes on until we reach c. Please note, this is not an efficient way of creating objects. In order to create the above object, follow the below method of object declaration.

let obj = {
    a: 'a',
    b: 'b',
    c: 'c'
}

Takeaways from the Object Model Implementation

  1. Avoid adding/deleting properties from an object. Strive to maintain a constant structure for objects.
  2. Avoid reordering the keys of an object. The JavaScript engine takes into account the ordering of keys of an object using the offset property.
  3. Property accesses are expensive in JavaScript. The engine goes down to the root to find a particular property. Instead of directly accessing a property as pokemonObj.abilities, use hasOwnProperty method as below.
if (pokemonObj.hasOwnProperty('abilities')) {
    // ... YOUR CODE
}

hasOwnProperty method searches only on the object and not on its prototype chain.

Implementation of Arrays

Similar to objects, arrays are also considered dictionaries. But in the case of arrays, the keys are numeric, called as array indices. The JavaScript engines perform special optimizations for properties whose names are purely numeric. Objects have properties that map to values, while arrays have indices that map to elements.

In JavaScript, we don’t define the type of elements that an array will contain beforehand. While running JavaScript code, the engine determines the type of elements of an array and accordingly performs optimizations based on its type.

let arrIntegers = [1, 2, 3, 4]
let arrDoubles = [1, 2.2, 3.3]
let arrMix = [1, 2.2, 'Hello World!']

The arrIntegers holds elements of type integers; arrDoubles contains integers as well as doubles; while arrMix contains integers, doubles, as well as strings.

The engines identify these three types of arrays a:
arrIntegers - SMI_ELEMENTS
arrDoubles - DOUBLE_ELEMENTS
arrMix - ELMENTS

SMI_ELEMENTS is a more specific type that contains only small integers. DOUBLE_ELEMENTS contains floating-point numbers and integer, while ELEMENTS is a general form that holds everything.

There are two more variants to these types defined above.

If an array is dense, it does not contain any undefined values — it is defined as a PACKED variant. However, if an array contains few spaces or holes or undefined values, it is defined as a HOLEY array.

let packedArr = [1, 2, 3, 4]
let holeyArr = new Array(4)
holeyArr[0] = 1

packedArr has definite values on all its indices. In the second statement, the Array constructor creates an empty array of length 4. It has four undefined values. When we do holeyArr[0] = 1, it still contains three undefined values. Even if we populate all of the indices of holeyArr with definite values, it will still be called as HOLEY. Arrays cannot transition from a more generic type to its specific variant at any point in time. PACKED is a more specific variant as compared to HOLEY.

The JavaScript engine will categorize packedArr as PACKED_SMI_ELEMENTS because it contains all integers and it does not contain any undefined values. V8 can perform better optimizations for elements that are of more specific kinds. Operations on PACKED arrays are more efficient than those on HOLEY arrays.

V8 currently distinguishes 21 different elements kinds, each of which comes with its own set of possible optimizations.

alt Elements Kinds

It is only possible to transition down the lattice. An array cannot go back to its more specific version at any point.

Avoid Creating Holes in an Array

If you initialize an array using an array constructor as below —

let arr = new Array(3)

— it creates holes in the array. The engine identifies this array as of type HOLEY_ELEMENTS. Property accesses on HOLEY_ELEMENTS is expensive. If you want the value at arr[0], the engine first searches for the 0 property on array. Array arr does contain any of the properties yet. Since the engine cannot find the key 0 on arr, it descends to its prototype chain until it finds key 0 or has reached the end of the chain. This makes it similar to objects, and so even after using arrays, we’re not able to optimize it.

Instead, create arrays using an empty literal and then keep pushing values to it.

let arr = []
arr.push(1)
arr.push(2)

With this approach, the number of elements present in the array is the same as that of its length.

For the same reason, access only those properties on the array that are defined within its length. If you access a property that is not defined on an array, it will enter into expensive lookups on the prototype chain.

V8 performs several optimizations for array built-in methods like map, reduce and filter.

Speculative Optimizations by the Compiler

If a certain code snippet has been executed some x number of times, the profiler passes it to the optimizing compiler. TurboFan uses the profiling information and makes some assumptions to produce optimized machine code.

Some of its assumptions are:

  1. The type of a variable is not changed.
  2. The structure of the object has not changed, and, hence, it can directly use the offset property defined on the object.

Inline Caches

Inline caches are the key ingredients in making JavaScript run fast. JavaScript engines use inline caches to memorize information on where to find a particular property of an object. The engines store the offset values and the bytecodes that are repeated too often in inline caches.

Deoptimization or Optimization Bailout

If the assumptions made by the compiler are not true, work done by the compiler is ignored and the engine falls back to the bytecode generated by the Ignition. This is called as deoptimization or optimization bailout. This happens because of one of the following reasons:

  1. The type of element has been changed at run time
let a = 2
a = 'Hello World!'
  1. The structure of an object has been changed.
  2. Element kinds of an array has been changed from PACKED to HOLEY version. Please note, the engines use a lot of optimization techniques for PACKED arrays. Avoid holes or undefined in your arrays.

Let’s Recap All That We Have Learned in this Tutorial

  1. JavaScript engines evolved from a mere interpreter to modern engines that use just-in-time-compilation. Just-in-time compilation combines the good parts of both the compiler and an interpreter. It compiles JavaScript code at run-time.
  2. V8 uses Ignition as the interpreter and TurboFan as the optimizing compiler. Ignition produces bytecode using AST as the input. The profiler watches code as it runs. It sends off the hot code to TurboFan. TurboFan uses profiling information to generate optimized machine code.
  3. We learned that interpreter has a stack of registers and it uses these registers to store parameters/variables. V8 has defined bytecodes for various operations like adding and creating objects or functions.
  4. We learned how JavaScript engines handle different Elements Kinds.
  5. Objects are stored in the form of shapes, which saves a lot of memory and also makes it easy to fetch any of the properties on an object.
  6. We saw some of the assumptions that a compiler makes to produce optimized code. If any of the assumptions fail, it deoptimizes its code and falls back to the bytecode that was generated by Ignition.

References

  1. https://www.youtube.com/watch?v=p-iiEDtpy6I
  2. https://www.youtube.com/watch?v=5nmpokoRaZI
  3. https://mathiasbynens.be/notes/shapes-ics
  4. https://v8.dev/blog/elements-kinds

This post has been brought to you by Kendo UI

Want to learn more about creating great web apps? It all starts out with Kendo UI - the complete UI component library that allows you to quickly build high-quality, responsive apps. It includes everything you need, from grids and charts to dropdowns and gauges.

KendoJSft


Ankita Masand
About the Author

Ankita Masand

A Software engineer trying to make sense of every line of code she writes, Ankita is a JavaScript enthusiast and adores its weird parts. She’s also an obsessed developer, highly focused on the performance of web applications. When she’s not writing JavaScript, you’ll find her engaged in understanding the intricacies of deep neural networks.

Related Posts

Comments

Comments are disabled in preview mode.