
Clusters in Node.js

Working with clusters in Node.js with `throng`.

Jan 18, 2021
#JavaScript #Node.js #clusters #workers #throng

Motivation

We want to improve our Node.js app with clustering. More specifically, we want to:

  • utilize worker processes so that we can take advantage of all of our server's cores
  • never let our app hard crash, when possible
  • deal with shutdowns gracefully, to avoid issues like leaving a trail of open database connections

Constraints

We want the simplest solution possible.

Solution

Once again, we're going to utilize an excellent package: throng (npm, GitHub).

throng abstracts away the details of clustering and exposes an intuitive, declarative interface. At a high level, throng forks some number of worker processes (determined by you) and creates new ones if they go down.

First we install the package:

npm i -S throng

Throng's API is very simple:

const os = require('os')
const throng = require('throng')

throng({
  // Fn to call in master process (can be async)
  master: () => {},
  // Fn to call in cluster workers (can be async)
  worker: yourWorkerFunc,
  // Number of workers
  count: os.cpus().length,
  // Min time to keep cluster alive (ms)
  lifetime: Infinity,
  // Grace period between signal and hard shutdown (ms)
  grace: 5000,
  // Signals that trigger a shutdown (proxied to workers)
  signals: ['SIGTERM', 'SIGINT'],
})

Those are the default options for throng v5.

Let's go through each of them for clarity.

The master function is called only once. We might utilize it for something we want to happen only once per run, though likely we'll want to have most of our app running in the workers.
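For example, the master function could be a natural home for one-time setup like running database migrations. Here's a minimal sketch, where runMigrations and its module path are hypothetical:

const throng = require('throng')
const { runMigrations } = require('./db/migrations') // hypothetical one-time setup

throng({
  master: async () => {
    // runs once for the whole cluster, not once per worker
    await runMigrations()
    console.log('Migrations complete.')
  },
  worker: () => {
    // per-worker startup (e.g. starting the HTTP server) goes here
  },
})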

The worker function is called once per worker. The number of workers is defined by you with the count option. throng defaults to setting the number of workers based on the number of available CPUs, using Node's built-in os module to get information about the machine's logical cores.
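You can check that number yourself with a couple of lines:

const os = require('os')

// logical cores visible to this process
// (newer Node versions also expose os.availableParallelism())
console.log(os.cpus().length)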

throng ensures that the cluster has count workers for the specified lifetime; if the worker count dips below the desired count within the lifetime, throng will spin up more workers until count is reached. Use lifetime to set (in ms) how long throng should keep the cluster alive. You'll probably want to stick with the default (Infinity).
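If you were instead using throng for a finite task, a shorter lifetime could make sense. A rough sketch, with a hypothetical processQueue worker:

const throng = require('throng')
const { processQueue } = require('./jobs/queue') // hypothetical batch worker

throng({
  worker: processQueue,
  count: 4,
  // respawn crashed workers for the first 10 minutes, then let the cluster wind down
  lifetime: 10 * 60 * 1000,
})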

The grace period, also set in ms, gives us a window to handle cleanup before a hard shutdown. One useful example of cleanup is closing any DB connections that the worker has open.
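In other words, once a shutdown signal arrives, each worker has grace milliseconds to finish its cleanup and call disconnect() before it's forcibly stopped. A minimal sketch:

const throng = require('throng')

throng({
  grace: 5000,
  worker: (workerId, disconnect) => {
    process.once('SIGTERM', () => {
      // at most ~5000 ms remain before a hard shutdown:
      // close DB connections, flush logs, etc., then signal we're done
      disconnect()
    })
  },
})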

The signals option allows us to specify which process signals we want to trigger a shutdown. Generally the defaults will be fine here.
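If you did need to add one, say SIGUSR2, it might look like this (just a sketch; the defaults cover most deployments):

const throng = require('throng')

throng({
  worker: yourWorkerFunc,
  // also treat SIGUSR2 as a shutdown signal (proxied to workers)
  signals: ['SIGTERM', 'SIGINT', 'SIGUSR2'],
})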

Let's say we have a basic Express app that connects to a database, and let's see how we might utilize throng. Suppose our Express setup looks something like this:

// clients/express.js
const express = require('express');

const logger = console; // stand-in for whatever logger you use

function normalizePort(val) {
  const port = parseInt(val, 10);
  if (Number.isNaN(port)) return val; // named pipe
  if (port >= 0) return port; // port number
  return false;
}

function startExpressApp() {
  const app = express();
  // ...
  // app routes, etc
  // ...
  const port = normalizePort(process.env.PORT || 8080);
  const server = app.listen(port, () =>
    logger.info(`server listening on port ${server.address().port}`)
  );
}

module.exports = { startExpressApp };

We know that JavaScript is single-threaded, but we want to utilize all of our server's CPU cores to serve more traffic. We can use throng to run our Express app once per worker!

const throng = require('throng');
const { startExpressApp } = require('./clients/express');

throng({
  master: () => {
    console.log('Started master.');
  },
  worker: async (workerId, disconnect) => {
    console.log(`Started worker ${workerId}`);
    startExpressApp();

    const shutdown = () => {
      console.log(`Worker ${workerId} cleanup.`);
      disconnect();
    };
    process.once('SIGTERM', shutdown);
    process.once('SIGINT', shutdown);
  },
});

Since throng passes the worker's ID to each worker function, we can pass that value through to our Express app if we want. For example:

startExpressApp({ workerId });

and in our Express startup function:

function startExpressApp({ workerId } = {}) {
  const app = express();
  // ...
  // app routes, etc
  // ...
  const serverName = workerId ? `worker ${workerId}` : 'server';
  const port = normalizePort(process.env.PORT || 8080);
  const server = app.listen(port, () =>
    logger.info(`${serverName} listening on port ${server.address().port}`)
  );
}

Now let's handle the cleanup we should do when any given worker exits. In our example, our workers connect to a database, perhaps using connection pooling. To avoid eating up all of our available DB connections, we want to ensure that we close them when a worker can no longer use them.

const throng = require('throng');
const { startExpressApp } = require('./clients/express');
const dbPool = require('./clients/db');

throng({
  master: () => {
    console.log('Started master.');
  },
  worker: async (workerId, disconnect) => {
    console.log(`Started worker ${workerId}`);
    startExpressApp({ workerId });

    const shutdown = () => {
      console.log(`Worker ${workerId} cleanup.`);
      dbPool.endConnections();
      disconnect();
    };
    process.once('SIGTERM', shutdown);
    process.once('SIGINT', shutdown);
  },
});

That's it! Now, before exiting, each worker will close any connection pool it may have opened while running.
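The ./clients/db module isn't shown above; as a rough sketch, assuming node-postgres (pg), it might look something like this, where endConnections simply drains the pool:

// clients/db.js — a minimal sketch, assuming node-postgres (pg)
const { Pool } = require('pg');

// reads connection settings from the standard PG* environment variables
const pool = new Pool();

function endConnections() {
  // pool.end() waits for checked-out clients to be returned, then closes all connections
  return pool.end();
}

module.exports = {
  query: (text, params) => pool.query(text, params),
  endConnections,
};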