Worker threads super charged the abilities of JavaScript(Node), as it was formerly not ideal for CPU-intensive operations. This post explores how worker threads work in Node.
From conception, JavaScript has been a single-threaded language. Single-threaded in the sense that only one set of commands can be executed at any given time in the same process. By extension, Node.js isn’t a good choice for implementing highly CPU-intensive applications. To solve this problem, an experimental concept of worker threads was introduced in Node v10.5.0 and was stabilized in v12 LTS.
The worker_threads
module enables the use of threads that execute JavaScript in parallel. It can be accessed with any of the following syntax:
const worker = require('worker_threads');
// OR
import worker_threads;
Worker threads work in a different way than traditional multi-threading in other high-level languages. A worker’s responsibility is to execute a piece of code provided by the parent worker. It runs in isolation from other workers, but has the ability to pass information between it and the parent worker.
Each worker is connected to its parent worker via a message channel. The child worker can write to the message channel using parentPort.postMessage
function, and the parent worker can write to the message channel by calling worker.postMessage()
function on the worker instance.
Now, one might ask how a node worker runs independently, as JavaScript doesn’t support concurrency. The answer: v8 Isolate. A v8 Isolate is an independent instance of Chrome v8 run-time, which has its own JavaScript heap and a micro-task queue. This allows each Node.js worker to run its JavaScript code in complete isolation from other workers. The downside of this is that the workers cannot directly access each other’s heaps directly. Due to this, each worker will have its own copy of libuv event loop, which is independent of other workers’ and the parent worker’s event loops.
As earlier pointed out, the worker_thread API was introduced in v10.5.0 and stabilized in v12. If you’re using any version prior to 11.7.0, however, you need to enable it by using the --experimental-worker
flag when invoking Node.js.
In our example, we are going to implement the main file, where we are going to create a worker thread and give it some data. The API is event-driven, but it’ll be wrapped into a Promise that resolves in the first message received from the worker:
// index.js
// run with node --experimental-worker index.js on Node.js 10.x
const { Worker } = require('worker_threads')
const runService = (workerData) => {
return new Promise((resolve, reject) => {
const worker = new Worker('./myWorker.js', { workerData });
worker.on('message', resolve);
worker.on('error', reject);
worker.on('exit', (code) => {
if (code !== 0)
reject(new Error(`Worker stopped with exit code ${code}`));
})
})
}
const run = async () => {
const result = await runService('world')
console.log(result);
}
run().catch(err => console.error(err))
As you can see, this is as easy as passing the filename as an argument and the data we want the worker to process. Take note that this data is cloned and is not in any shared memory. Then, we wait for the Worker thread to send us a message by listening to the message
event. Next, we need to implement the service.
const { workerData, parentPort } = require('worker_threads')
// You can do any heavy stuff here, in a synchronous way
// without blocking the "main thread"
parentPort.postMessage({ greetings: workerData })
Here, we need two things. First the workerData
that the main app sent to us, and secondly, a way to return information to the main app. This is done with the parentPort
that has a postMessage
method where we will pass the result of our processing.
It’s as straightforward as that! This is a somewhat simple example, but we can build more complex things—for example, we could send multiple messages from the worker thread indicating the execution status if we need to provide feedback, or we can send partial results. Imagine that you are processing thousands of images and maybe you want to send a message per image processed, but you don’t want to wait until all of them are processed.
Understanding at least the very basics of how they work indeed helps us to get the best performance using worker threads. When writing more complex applications than our example, we need to remember the following two major concerns with worker threads.
To overcome the first concern, we need to implement "Worker Thread Pooling."
A pool of Node.js worker threads is a group of running worker threads that are available to be used for incoming tasks. When a new task comes in, it can be passed to an available worker via the parent-child message channel. Once the worker completes the task, it can pass the results back to the parent worker via the same message channel.
If properly implemented, thread pooling can significantly improve the performance as it reduces the additional overhead of creating new threads. It also worth mentioning that creating a large number of threads is also not efficient, as the number of parallel threads that can be run effectively is always limited by the hardware.
The introduction of worker threads has super charged the abilities of JavaScript(Node), as they has effectively taken care of its biggest shortcoming of not being ideal for CPU-intensive operations. For additional information, check out the worker_threads
documentation.
Writer of all things technical and inspirational, developer and community advocate. In a love-love relationship with JavaScript, a web accessibility nerd.