
Visualizing Node.js Scaling - Part 1

Note: This is Part One of our series. Part Two is here: Visualizing Node.js Scaling - Part 2


Intro: The “Oh No, My Server Died” Problem

Alright, so you’ve built this amazing Node.js app.

It works fine on your laptop, and you deploy it to a server.

Everything is golden—until users actually show up.

Then suddenly: BOOM! The server dies a slow, painful death, drowning in requests like a small boat in a hurricane. 🌊

What happened?

Simple.

You hit the single-threaded nature of Node.js: one process can only really make use of one CPU core.

And if too many requests come in?

That single process gets overwhelmed faster than a Wi-Fi router at a hacker convention.

So… How Do We Fix This?

We scale.

More specifically, we scale Node.js properly using clustering and reverse proxies.

This article is part one of a series on scaling Node.js.


The Basics of Node.js Scaling

1️⃣ Clustering: The “More Workers = Less Pain” Trick

Node.js runs your JavaScript on a single thread by default.

This is great for simplicity but terrible for scaling on multi-core machines.

Solution? The Cluster module. 🏗️

Instead of running one sad little process, we spawn multiple worker processes, each using a different CPU core.

Example: A Basic Node.js Cluster

const cluster = require('cluster');
const http = require('http');
const os = require('os');

// cluster.isPrimary is the modern name; on Node < 16 use cluster.isMaster
if (cluster.isPrimary) {
  const numCPUs = os.cpus().length;
  console.log(`Primary process ${process.pid} is running`);

  // Spawn one worker per CPU core
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // If a worker dies, replace it so the pool stays full
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died. Restarting...`);
    cluster.fork();
  });

} else {
  // Each worker runs its own HTTP server; they all share port 3000
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Handled by process ${process.pid}\n`);
  }).listen(3000);

  console.log(`Worker process ${process.pid} started`);
}

What’s Happening?

  • The primary process (called the “master” in older Node versions) spawns a worker for each CPU core.
  • If a worker crashes, the primary revives it like a zombie. 🧟
  • Incoming connections are spread across the workers automatically by the cluster module (round-robin on most platforms).

Now, instead of one Node.js process struggling to do everything, we have multiple processes handling requests. Big win! 🏆
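
Want to see the balancing in action? Here’s a minimal test sketch of my own (not part of the cluster example itself), assuming the cluster above is running locally on port 3000:

const http = require('http');

// Fire a handful of requests and print which worker answered each one.
// `agent: false` forces a fresh TCP connection per request, which makes
// the distribution across workers easier to see.
function ping(remaining) {
  if (remaining === 0) return;
  http.get('http://localhost:3000', { agent: false }, (res) => {
    let body = '';
    res.on('data', (chunk) => (body += chunk));
    res.on('end', () => {
      console.log(body.trim());
      ping(remaining - 1);
    });
  });
}

ping(8);

Run it in a second terminal and you should see several different worker PIDs show up in the output.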


2️⃣ Reverse Proxy: The “Traffic Cop”

Even with clustering, one machine can only handle so much traffic.

If we want true scalability, we need multiple machines.

And when we have multiple machines, we need something to direct traffic to the right one.

Enter: The Reverse Proxy. 🏢

Think of it like a nightclub bouncer.

People show up, and the bouncer (reverse proxy) decides which door (server) they should go through.

Example: Setting Up Nginx as a Reverse Proxy

1️⃣ Install Nginx:

sudo apt install nginx

2️⃣ Edit the Nginx config (/etc/nginx/sites-available/default):

server {
    listen 80;

    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

3️⃣ Restart Nginx:

sudo systemctl restart nginx

Why This Works

  • Nginx sits in front of our Node.js processes and distributes traffic.
  • If we scale across multiple servers, Nginx can load balance between them (see the sketch after this list).
  • We get better performance and failover protection.
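
Here’s a rough idea of what that multi-server load balancing looks like. This is just a sketch: the backend addresses below are made-up placeholders, so swap in your real Node.js servers:

upstream node_backend {
    # Placeholder backends: the local cluster plus two other machines
    server 127.0.0.1:3000;
    server 10.0.0.2:3000;
    server 10.0.0.3:3000;
}

server {
    listen 80;

    location / {
        proxy_pass http://node_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

By default Nginx round-robins across the servers in the upstream block, and if one backend stops responding, traffic keeps flowing to the others.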

Why This is Just the Beginning

This was Part 1 of visualizing Node.js scaling. 🎉

We’ve covered:
✅ Clustering: Using multiple Node.js processes for better CPU usage.
✅ Reverse proxies: Directing traffic efficiently with Nginx.

But there’s more to scaling than just this.

What about:

  • Horizontal scaling with Kubernetes?
  • Auto-scaling on the cloud?
  • More advanced load balancing tricks?

That’s all coming in Part 2.


Key Ideas

  • Clustering: Uses multiple worker processes to handle requests efficiently.
  • Reverse Proxy: Distributes incoming requests across multiple servers.
  • Nginx Load Balancing: Directs traffic to different machines to prevent overloading one server.
  • Process Management: The cluster module automatically restarts crashed workers.
  • Scaling Importance: Needed to prevent server overload and crashes.

Part Two is here: Visualizing Node.js Scaling - Part 2