Nginx Reverse Proxy: Performance Tuning and Advanced Configuration

Introduction

Nginx powers over 30% of the world's websites. It's fast out of the box, but the default configuration is conservative--designed to work everywhere, optimized for nowhere.

If you're running Nginx as a reverse proxy in front of Node.js, Python, Go, or PHP backends, tuning a handful of settings can noticeably raise throughput and cut latency. How much you gain depends on your workload, so measure before and after.

This guide covers production-tested Nginx optimizations: worker configuration, buffer tuning, keepalive connections, caching strategies, Gzip compression, and rate limiting. Each setting comes with a note on what it changes and why.

Worker Process Tuning

Nginx uses an event-driven architecture with worker processes. Two settings matter most:

worker_processes

Set this to auto on modern Nginx (1.3.8+). It matches your CPU core count:

worker_processes auto;

For containers, you may need to set it explicitly, since auto counts the host's CPU cores and can miss a cgroup CPU limit:

worker_processes 2;  # Match your container's CPU limit
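
If you'd rather derive that number than hard-code it, the container's CPU quota can be read from the cgroup filesystem. The sketch below assumes cgroup v2, where cpu.max holds "<quota> <period>" in microseconds (or "max <period>" when unlimited); the function name and the path handling are illustrative, not part of Nginx:

```python
import math
import os


def workers_from_cpu_max(cpu_max_text: str, host_cpus: int) -> int:
    """Derive a worker count from the contents of a cgroup v2 cpu.max file."""
    quota, period = cpu_max_text.split()
    if quota == "max":
        return host_cpus  # No CPU limit set: fall back to the host core count
    # Round up so a 1.5-CPU limit still gets 2 workers; never exceed the host.
    return max(1, min(host_cpus, math.ceil(int(quota) / int(period))))


if __name__ == "__main__":
    path = "/sys/fs/cgroup/cpu.max"  # Assumed cgroup v2 location
    host = os.cpu_count() or 1
    if os.path.exists(path):
        with open(path) as f:
            print(f"worker_processes {workers_from_cpu_max(f.read(), host)};")
    else:
        print(f"worker_processes {host};")
```

Run it in the container entrypoint and template the printed line into nginx.conf before starting Nginx.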

worker_connections

Each worker can handle this many simultaneous connections. The default (512) is far too low for a reverse proxy:

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

Setting	Default	Recommended	Impact
worker_connections	512	4096+	Max connections = workers x connections (client and upstream combined)
use epoll	auto-selected	epoll (optional)	Nginx already picks epoll on Linux; setting it only makes the choice explicit
multi_accept	off	on	Accepts all new connections at once

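With 4 workers and 4096 connections each, Nginx can hold 16,384 concurrent connections in total. That count includes connections to backends, so as a reverse proxy each in-flight request uses two (client and upstream), which leaves room for roughly 8,192 proxied clients. The worker_rlimit_nofile directive in the main context sets the open-file limit per worker; it must be at least worker_connections, and a higher value leaves headroom for cache and log files: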

worker_rlimit_nofile 16384;
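
The arithmetic is easy to get wrong when you change one value and forget the others. Here is a minimal sketch of the calculation, assuming every proxied request holds one client connection and one upstream connection (the function name is made up for illustration):

```python
def proxy_capacity(workers: int, worker_connections: int, rlimit_nofile: int) -> dict:
    """Estimate connection capacity for Nginx acting as a reverse proxy."""
    total = workers * worker_connections
    return {
        "total_connections": total,
        # Each proxied request uses a client and an upstream connection
        "proxied_clients": total // 2,
        # worker_rlimit_nofile applies per worker, so compare it to one worker's share
        "rlimit_ok": rlimit_nofile >= worker_connections,
    }


print(proxy_capacity(workers=4, worker_connections=4096, rlimit_nofile=16384))
```

With the values from this section, that works out to 16,384 connections in total and about 8,192 proxied clients.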

Buffer Size Tuning

Nginx buffers request and response bodies. Defaults are tiny for modern workloads:

http {
    # Client request body buffer
    client_body_buffer_size 16k;
    client_max_body_size 50m;   # Max upload size
    client_body_timeout 30s;

    # Client header buffer
    client_header_buffer_size 2k;
    large_client_header_buffers 4 16k;

    # Proxy buffers (critical for reverse proxy!)
    proxy_buffer_size 16k;
    proxy_buffers 8 32k;
    proxy_busy_buffers_size 64k;

    # Response buffering
    proxy_buffering on;
    proxy_read_timeout 60s;
    proxy_send_timeout 60s;
}

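Why this matters: proxy_buffer_size holds the first part of the upstream response (the headers). If it is too small, Nginx fails the request with a 502 and logs "upstream sent too big header". The proxy_buffers setting (8 x 32k = 256KB per connection) holds the body; whatever doesn't fit is written to a temporary file on disk instead of memory, which is a massive latency hit. 256KB should comfortably fit most API responses.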
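
Before changing these values, it helps to check the numbers by hand. The sketch below totals the proxy_buffers memory per connection and applies the two limits that, to my understanding, nginx -t enforces on proxy_busy_buffers_size: at least as large as the bigger of proxy_buffer_size and one proxy_buffers buffer, and no larger than all proxy_buffers minus one. The helper names are illustrative:

```python
def parse_size(value: str) -> int:
    """Convert an Nginx size such as "16k" or "1m" to bytes."""
    units = {"k": 1024, "m": 1024 ** 2}
    value = value.strip().lower()
    if value[-1] in units:
        return int(value[:-1]) * units[value[-1]]
    return int(value)


def check_proxy_buffers(buffer_size: str, buffers_num: int,
                        buffers_size: str, busy_size: str):
    """Return (bytes of proxy_buffers per connection, list of problems)."""
    first = parse_size(buffer_size)
    each = parse_size(buffers_size)
    busy = parse_size(busy_size)
    problems = []
    if busy < max(first, each):
        problems.append("proxy_busy_buffers_size is smaller than the largest single buffer")
    if busy > (buffers_num - 1) * each:
        problems.append("proxy_busy_buffers_size exceeds all proxy_buffers minus one")
    return buffers_num * each, problems


total, problems = check_proxy_buffers("16k", 8, "32k", "64k")
print(f"{total // 1024}KB of proxy_buffers per connection, problems: {problems}")
# Worst case if 1,000 connections fill their buffers at the same time
print(f"{total * 1000 // 1024 ** 2}MB for 1,000 busy connections")
```

The second figure is the one to watch: buffer memory is allocated per connection, so generous buffers multiplied by thousands of connections add up quickly.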

Proxy Buffering: When to Disable

For streaming or Server-Sent Events (SSE), disable buffering:

location /stream {
    proxy_buffering off;
    proxy_cache off;
    proxy_read_timeout 86400s;  # Long-lived connection
    proxy_pass http://backend;
}

Keepalive Connections

Upstream Keepalive

Nginx opens a new connection to backends for each request by default. This wastes TCP handshakes:

upstream backend {
    server 10.0.1.10:3000;
    server 10.0.1.11:3000;
    keepalive 64;              # Cache up to 64 idle connections per worker
    keepalive_timeout 60s;     # In upstream context: Nginx 1.15.3+
    keepalive_requests 1000;   # Requests per connection before recycling
}

server {
    location / {
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # Required for keepalive!
        proxy_pass http://backend;
    }
}

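Impact: with a keepalive pool, most requests reuse an existing backend connection instead of paying for a new TCP handshake. Note that the pool of 64 is per worker process, not global. For services handling 1000+ req/s, this alone can cut p99 latency by tens of milliseconds, depending on the round-trip time to the backend.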
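
How large should the pool be? One rough way to pick a starting value is Little's law: requests in flight equal the request rate times the average time each request spends at the backend. The sketch below is a heuristic of my own, not an Nginx rule; the headroom factor and the function name are assumptions:

```python
import math


def keepalive_pool_size(requests_per_sec: int, avg_upstream_latency_ms: int,
                        workers: int, headroom: float = 1.5) -> int:
    """Suggest a per-worker `keepalive` value from traffic figures."""
    # Little's law: concurrent upstream requests = arrival rate x time in system
    in_flight = requests_per_sec * avg_upstream_latency_ms / 1000
    per_worker = in_flight / workers  # The keepalive pool is per worker process
    return max(1, math.ceil(per_worker * headroom))


# 1,000 req/s, 50ms average backend latency, 4 workers
print(keepalive_pool_size(1000, 50, workers=4))
```

Treat the result as a lower bound to start from and adjust it based on how many new upstream connections you still see under load.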

Client Keepalive

Improve client-side connection reuse:

http {
    keepalive_timeout 65;
    keepalive_requests 100;
}

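The default keepalive_timeout is 75 seconds; 65 is a common, slightly shorter choice. Since Nginx 1.19.10, keepalive_requests defaults to 1000 (it was 100 before), so on current versions the value above lowers the limit. Raise it for API clients that reuse connections heavily.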

Proxy Caching

Nginx can cache backend responses, dramatically reducing backend load:

http {
    # Cache path: two-level directory structure, 100MB key zone, 1GB on disk, inactive=60m
    proxy_cache_path /var/cache/nginx levels=1:2
                     keys_zone=my_cache:100m
                     max_size=1g
                     inactive=60m
                     use_temp_path=off;

    proxy_cache_key "$scheme$request_method$host$request_uri";
}

server {
    location /api/ {
        proxy_cache my_cache;
        proxy_cache_valid 200 10m;     # Cache 200 for 10min
        proxy_cache_valid 404 1m;      # Cache 404 for 1min
        proxy_cache_bypass $http_cache_control;  # Bypass cache when the client sends any Cache-Control header
        proxy_cache_use_stale error timeout updating;
        add_header X-Cache-Status $upstream_cache_status;

        proxy_pass http://backend;
    }
}

Key decisions:

Setting	Recommendation	Why
keys_zone size	1MB per 8000 keys	Tracks cache metadata in memory
max_size	1-10GB	Limit disk usage
inactive	60m	Evict untouched items
use_temp_path off	Always	Writes directly to cache, avoids copy
proxy_cache_use_stale	updating	Serve stale while refreshing
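
The keys_zone rule of thumb turns into simple arithmetic. A minimal sketch, using the approximate 8,000 keys per megabyte from the table above (the function names are illustrative):

```python
import math

KEYS_PER_MB = 8000  # Approximate figure from the table above


def zone_capacity(zone_mb: int) -> int:
    """Roughly how many cache keys fit in a keys_zone of this size."""
    return zone_mb * KEYS_PER_MB


def keys_zone_mb(expected_keys: int) -> int:
    """Smallest whole-megabyte zone that holds the expected number of keys."""
    return max(1, math.ceil(expected_keys / KEYS_PER_MB))


print(zone_capacity(100))    # Keys that fit in keys_zone=my_cache:100m
print(keys_zone_mb(50_000))  # Zone size in MB for about 50,000 distinct URLs
```

By this estimate the 100m zone in the example tracks around 800,000 keys, which is far more than a 1GB max_size is likely to hold unless responses are very small.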

Verify caching works:

curl -sI https://yoursite.com/api/users | grep -i x-cache-status
# X-Cache-Status: MISS (first request)
# X-Cache-Status: HIT  (subsequent requests)
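
A single request only tells you the cache is switched on. To judge whether it is effective, collect the X-Cache-Status values from many responses and look at the hit ratio. The helper below is a small sketch that takes the values as a list of strings; how you collect them (a loop around curl, a log field) is up to you:

```python
from collections import Counter


def cache_hit_ratio(statuses):
    """Return (hit ratio, per-status counts) for a list of X-Cache-Status values."""
    counts = Counter(s.strip().upper() for s in statuses if s.strip())
    total = sum(counts.values())
    ratio = counts["HIT"] / total if total else 0.0
    return ratio, counts


ratio, counts = cache_hit_ratio(["MISS", "HIT", "HIT", "EXPIRED"])
print(f"hit ratio: {ratio:.0%}, breakdown: {dict(counts)}")
```

A large share of BYPASS or EXPIRED values usually points at the bypass rule or at proxy_cache_valid being too short for the endpoint.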

Gzip Compression

Gzip reduces response sizes by 60-80%. Enable it for text-based responses:

http {
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_min_length 256;
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/rss+xml
        image/svg+xml;
    gzip_disable "msie6";
}

Setting	Value	Why
gzip_comp_level	5-6	Sweet spot: most of the size reduction at low CPU cost
gzip_min_length	256-1000	Don't compress tiny responses (overhead > savings)
gzip_proxied	any	Compress proxied responses too
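
The gzip_min_length advice is easy to verify yourself. The sketch below uses Python's gzip module rather than Nginx, so the exact byte counts will differ from what Nginx produces, but the effect is the same: the fixed gzip header and trailer make a tiny payload larger, while a repetitive JSON body shrinks a lot. The sample payloads are made up for illustration:

```python
import gzip
import json


def gzip_sizes(payload: bytes, level: int = 6):
    """Return (raw size, gzip-compressed size) in bytes."""
    return len(payload), len(gzip.compress(payload, compresslevel=level))


tiny = b'{"ok":true}'
large = json.dumps(
    [{"id": i, "name": f"user{i}", "active": True} for i in range(500)]
).encode()

for label, payload in (("tiny", tiny), ("large", large)):
    raw, compressed = gzip_sizes(payload)
    print(f"{label}: {raw} bytes raw, {compressed} bytes gzipped")
```

Pass level=1 or level=9 to gzip_sizes to see how little the higher levels add for typical JSON.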

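Never gzip already-compressed formats (images, videos, pre-compressed files). The gzip_types list above already leaves them out; an explicit location block makes the intent obvious and adds long-lived browser caching: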

location ~* \.(jpg|jpeg|png|gif|ico|webp|mp4|webm|zip|gz|bz2)$ {
    gzip off;
    expires 30d;
    add_header Cache-Control "public, immutable";
}

Security & Rate Limiting

Rate Limiting

Protect backends from abuse and brute-force attacks:

http {
    # Define a shared memory zone for rate limiting
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=login_limit:10m rate=5r/m;

    # Connection limiting
    limit_conn_zone $binary_remote_addr zone=conn_limit:10m;
}

server {
    location /api/ {
        limit_req zone=api_limit burst=20 nodelay;
        limit_conn conn_limit 10;
        proxy_pass http://backend;
    }

    location /login {
        limit_req zone=login_limit burst=3 nodelay;
        proxy_pass http://backend;
    }
}

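Setting	Value	Meaning
rate=10r/s	10 req/sec	Sustained rate per client IP (one request every 100ms)
burst=20	+20 requests	Allow short spikes above the rate
nodelay	Immediate	Serve burst requests at once instead of pacing them; anything beyond the burst is rejected (503 by default)
limit_conn 10	10 connections	Concurrent connection cap per client IP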
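
How rate and burst interact is easier to see in a simulation. The sketch below is a simplified model of limit_req with nodelay for a single client, based on my reading of how Nginx tracks "excess" requests in thousandths of a request; it is not Nginx's actual code, and the class name is made up:

```python
class LimitReqSim:
    """Simplified model of `limit_req ... nodelay` for one client key."""

    def __init__(self, rate_per_sec: int, burst: int):
        self.rate = rate_per_sec
        self.burst_milli = burst * 1000  # Excess is tracked in 1/1000 of a request
        self.excess = 0
        self.last_ms = None

    def allow(self, now_ms: int) -> bool:
        if self.last_ms is None:
            # First request from this client is always accepted
            self.last_ms = now_ms
            return True
        elapsed = max(0, now_ms - self.last_ms)
        # The bucket drains at `rate` requests per second, then this request is added
        excess = max(0, self.excess - self.rate * elapsed + 1000)
        if excess > self.burst_milli:
            return False  # Rejected; state is left unchanged
        self.excess = excess
        self.last_ms = now_ms
        return True


sim = LimitReqSim(rate_per_sec=10, burst=20)
at_once = sum(sim.allow(0) for _ in range(30))
one_second_later = sum(sim.allow(1000) for _ in range(30))
print(at_once, one_second_later)
```

In this model a client that fires 30 requests at once gets 21 through (one plus the burst of 20), and a second later only 10 more, because the bucket has drained by exactly one second's worth of rate.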

Timeouts

Aggressive timeouts protect against slow clients and hung backends:

server {
    # Client timeouts
    client_body_timeout 12s;
    client_header_timeout 12s;
    send_timeout 10s;

    # Proxy timeouts
    proxy_connect_timeout 5s;    # Fail fast if backend is down
    proxy_read_timeout 30s;      # Max gap between reads from the backend
    proxy_send_timeout 30s;      # Max gap between writes to the backend
}

Putting It All Together

Here's a baseline Nginx reverse proxy config combining all optimizations. Treat it as a starting point and adjust the values to your own traffic:

user www-data;
worker_processes auto;
worker_rlimit_nofile 16384;
pid /run/nginx.pid;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

http {
    # Basic
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    keepalive_requests 100;
    server_tokens off;

    # Buffers
    client_body_buffer_size 16k;
    client_max_body_size 50m;
    client_header_buffer_size 2k;
    large_client_header_buffers 4 16k;

    # Gzip
    gzip on;
    gzip_comp_level 6;
    gzip_min_length 256;
    gzip_types application/json text/plain text/css text/javascript;
    gzip_proxied any;

    # Cache
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:100m max_size=2g inactive=60m use_temp_path=off;

    # Rate limiting
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

    # Upstream
    upstream app_backend {
        server 10.0.1.10:3000 max_fails=3 fail_timeout=30s;
        server 10.0.1.11:3000 max_fails=3 fail_timeout=30s;
        keepalive 64;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            limit_req zone=api_limit burst=20 nodelay;
            proxy_pass http://app_backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Proxy buffers
            proxy_buffer_size 16k;
            proxy_buffers 8 32k;
            proxy_busy_buffers_size 64k;

            # Timeouts
            proxy_connect_timeout 5s;
            proxy_read_timeout 30s;

            # Caching
            proxy_cache api_cache;
            proxy_cache_valid 200 10m;
            proxy_cache_use_stale error timeout updating;
            add_header X-Cache-Status $upstream_cache_status;
        }
    }
}

Conclusion

Tuning Nginx isn't about memorizing every directive--it's about understanding the three bottlenecks: connections (workers and keepalive), memory (buffers and caching), and bandwidth (compression and rate limiting).

Start with worker tuning and keepalive connections--these give the biggest immediate wins. Then add caching for read-heavy endpoints and Gzip for text responses. Finally, layer in rate limiting once you understand your traffic patterns.

Test every change with nginx -t before reloading. And benchmark: ab -n 10000 -c 100 https://yoursite.com/ before and after (add -k so ab reuses connections). You'll see the difference in the numbers.