Bugbie

Node.js Best Practices for Production Applications in 2026

March 1, 2026 · 9 min read

Node.js powers millions of production applications worldwide, from small APIs to systems handling millions of concurrent users. Despite its popularity, many teams deploy Node.js applications without the practices needed for true production resilience. This guide covers what separates a toy Node.js project from a hardened, production-ready service.

1. Use a Process Manager

Node.js runs your JavaScript on a single thread, so one unhandled exception can crash the entire process. In production, you must run Node.js under a process manager like PM2 or use Kubernetes with appropriate restart policies. PM2 automatically restarts your app on crashes, manages log rotation, and can restart workers if memory usage exceeds a threshold. It also supports cluster mode, which we'll cover next.

Never run node server.js directly in production. Your process will crash, users will hit 503 errors, and the app will stay down until someone manually restarts it.
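
As a sketch, a minimal PM2 ecosystem file might look like this (the app name, script path, and memory threshold are illustrative values, not requirements):

```javascript
// ecosystem.config.js -- a minimal PM2 configuration sketch.
module.exports = {
  apps: [{
    name: 'api',                  // illustrative app name
    script: './server.js',
    instances: 'max',             // one worker per CPU core (cluster mode)
    exec_mode: 'cluster',
    max_memory_restart: '512M',   // restart a worker that exceeds 512 MB
    env_production: { NODE_ENV: 'production' },
  }],
};
```

Start it with pm2 start ecosystem.config.js --env production, and the restart, memory, and clustering behaviour described above comes for free.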

2. Cluster Mode to Use All CPU Cores

Node.js runs on a single CPU core by default. On a server with 8 cores, you're leaving 87.5% of your compute unused. The cluster module lets you fork worker processes, one per CPU core, that all share the same server port. Incoming connections are distributed across workers by the primary process, round-robin by default on every platform except Windows (where the OS hands them out).

PM2 makes this trivial: pm2 start server.js -i max spawns one worker per available core. For CPU-bound workloads, this provides near-linear scaling. For I/O-bound workloads (most web APIs), the improvement is still significant as it prevents one slow request from blocking others.

3. Structured Logging with Winston or Pino

Using console.log() in production is a mistake. It produces unstructured text that's impossible to query, doesn't include severity levels, and doesn't integrate with log aggregation tools like Datadog, Splunk, or CloudWatch.

Use a structured logger like Pino (extremely fast; roughly five times faster than Winston in its own benchmarks) or Winston. Output JSON logs to stdout and let your infrastructure collect and forward them. Include fields like requestId, userId, durationMs, and statusCode on every request log. This transforms logs from unreadable walls of text into queryable, filterable data.
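
As a sketch of what structured output looks like, here is a hand-rolled one-JSON-object-per-line logger; Pino's logger.info({ requestId, ... }, 'message') emits output in roughly this shape, and the field names follow the text above:

```javascript
// Minimal structured-logging sketch: one JSON object per line on stdout.
// A real app would use Pino; this just illustrates the output shape.
function logRequest(fields, message) {
  const entry = {
    level: 'info',
    time: new Date().toISOString(),
    msg: message,
    ...fields,
  };
  process.stdout.write(JSON.stringify(entry) + '\n');
  return entry; // returned so callers (and tests) can inspect it
}

logRequest(
  { requestId: 'req-123', userId: 'u-42', durationMs: 18, statusCode: 200 },
  'request completed'
);
```

Each line is now a self-describing record that Datadog, Splunk, or CloudWatch can index and filter on any field.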

4. Centralized Error Handling

Every Express or Fastify application needs a centralized error handling middleware that catches errors from all routes. This middleware should log the error with stack trace, return an appropriate HTTP status code (not always 500), and never expose internal error messages or stack traces to clients in production.
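
A sketch of such middleware for Express (the four-argument signature is how Express recognises error-handling middleware; err.statusCode is an assumed convention set by upstream code):

```javascript
// Centralized error handler: register with app.use(errorHandler) AFTER all routes.
function errorHandler(err, req, res, next) {
  const status = err.statusCode || 500;

  // Log full details server-side, stack trace included.
  console.error(JSON.stringify({
    level: 'error',
    msg: err.message,
    stack: err.stack,
    status,
  }));

  // The client sees a safe body: real messages for 4xx, a generic one for 5xx.
  res.status(status).json({
    error: status < 500 ? err.message : 'Internal Server Error',
  });
}

module.exports = errorHandler;
```

Route handlers then simply call next(err) (or throw, in Express 5 and Fastify) and let this one place decide what the client sees.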

Also listen for process.on('uncaughtException') and process.on('unhandledRejection'). Log the error, perform any necessary cleanup, and then gracefully exit — the process manager will restart it. Never swallow these errors silently.
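
A sketch of those process-level handlers (note that since Node 15 an unhandled rejection crashes the process by default anyway; an explicit handler lets you log it first):

```javascript
process.on('uncaughtException', (err) => {
  // The process is in an unknown state: log, then exit and let the
  // process manager restart a fresh instance.
  console.error(JSON.stringify({ level: 'fatal', msg: err.message, stack: err.stack }));
  process.exit(1);
});

process.on('unhandledRejection', (reason) => {
  // Treat an unhandled promise rejection the same way as an uncaught exception.
  console.error(JSON.stringify({ level: 'fatal', msg: String(reason) }));
  process.exit(1);
});
```

The key point is the exit: after an uncaught exception the process may hold corrupt state, so restarting is safer than limping on.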

5. Graceful Shutdown

When deploying a new version, Kubernetes or your process manager sends a SIGTERM signal to the running process before killing it. Without graceful shutdown handling, in-flight requests are cut off mid-response, database connections are torn down abruptly, and users see errors.

Listen for SIGTERM, stop accepting new connections, wait for in-flight requests to complete (with a reasonable timeout), close database pools, and then exit cleanly. These twenty or so essential lines of code prevent hundreds of errors during every deployment.

6. Health Check Endpoints

Kubernetes, load balancers, and uptime monitors all need a health check endpoint, typically GET /health. Return a 200 with a JSON body showing the application status, uptime, and dependency health (database connection, Redis connection). A GET /ready endpoint (readiness probe) should return 503 during startup until the app is fully initialised, preventing traffic from being sent to an instance that is not yet ready.

Conclusion

Production Node.js is a discipline. The patterns here — process management, clustering, structured logging, graceful shutdown, and health checks — are the baseline for any serious deployment. Implement them from the start of a project, not as an afterthought before launch. They will save you hours of debugging, prevent unnecessary downtime, and make your application significantly more observable and resilient.