From 499 to 200: Understanding Puma Worker Timeouts, Nginx 499s, and How to Keep Your Rails API Fast

A deep dive into how Puma worker timeouts and Nginx 499 errors reveal performance bottlenecks in Rails applications — and how to fix them with caching, async jobs, and smarter timeout tuning.

When you start seeing a spike in HTTP 499s in your Nginx logs, it feels mysterious:

“The client closed the connection before the server responded.”

But a 499 isn’t random. It’s a warning signal that your system is slow or unbalanced.
Let’s unpack the why — and the how to fix it.


⚙️ Understanding the Stack

A typical Rails + Puma + Nginx stack looks like this:

Browser → CDN → Nginx → Puma → Rails → DB / Redis / External API

If any layer takes longer than the timeout of the layer above, the client gives up.
That’s when Nginx logs a 499.


💣 Anatomy of a Slow Request

```ruby
# app/controllers/demo_controller.rb
class DemoController < ApplicationController
  def cpu_heavy
    render json: { result: (1..50_000_000).sum }
  end

  def io_heavy
    render json: { result: Net::HTTP.get(URI('https://httpbin.org/delay/3')) }
  end
end
```
  • CPU-bound: holds the Ruby GVL, stalling every other thread in that worker.
  • I/O-bound: ties a thread up waiting on slow dependencies.
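The difference is easy to demonstrate outside Rails. In this plain-Ruby sketch (workload sizes are illustrative), two CPU-bound threads gain almost nothing because the GVL lets only one run Ruby code at a time, while two I/O-bound threads overlap their waits:

```ruby
require "benchmark"

def cpu_work
  (1..5_000_000).sum
end

def io_work
  sleep 0.5 # stands in for a slow HTTP or DB call
end

# CPU-bound: the GVL serializes Ruby execution, so two threads take
# roughly as long as running the work back to back.
cpu = Benchmark.realtime { Array.new(2) { Thread.new { cpu_work } }.each(&:join) }

# I/O-bound: the GVL is released while waiting, so the two 0.5s waits
# overlap and the total stays near 0.5s, not 1.0s.
io = Benchmark.realtime { Array.new(2) { Thread.new { io_work } }.each(&:join) }

puts format("cpu: %.2fs  io: %.2fs", cpu, io)
```

This is why thread counts help I/O-heavy endpoints but do little for CPU-heavy ones — a point the pool-splitting section below builds on.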

🩹 Quick Mitigations

✅ 1. Align Timeouts (Stop Early Cutoffs)

Puma (config/puma.rb):

```ruby
workers ENV.fetch("WEB_CONCURRENCY") { 4 }
threads_count = ENV.fetch("RAILS_MAX_THREADS") { 8 }
threads threads_count, threads_count
worker_timeout 60
force_shutdown_after 10
```

Nginx:

```nginx
proxy_connect_timeout 5s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
send_timeout 60s;
```

Keep client/CDN timeouts slightly shorter.
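The same alignment applies to outbound calls: give every HTTP client an explicit timeout shorter than your Rack/Nginx cutoffs, so a slow dependency fails fast instead of pinning a Puma thread until Nginx gives up and logs a 499. A sketch with plain Net::HTTP (the 5s/15s values are illustrative, not a recommendation for your stack):

```ruby
require "net/http"

uri = URI("https://httpbin.org/delay/3")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = uri.scheme == "https"
http.open_timeout = 5   # seconds to establish the TCP connection
http.read_timeout = 15  # seconds to wait on each read

begin
  response = http.get(uri.request_uri)
rescue Net::OpenTimeout, Net::ReadTimeout, SocketError, SystemCallError
  # Fail fast: return a fallback or 504 here instead of holding the
  # thread for the full upstream timeout.
  response = nil
end
```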


✅ 2. Move Long Work to Background Jobs

```ruby
# app/jobs/report_job.rb
class ReportJob < ApplicationJob
  queue_as :default

  def perform(user_id)
    user = User.find(user_id)
    ReportGenerator.generate_for(user)
  end
end
```

```ruby
# app/controllers/reports_controller.rb
class ReportsController < ApplicationController
  def create
    ReportJob.perform_later(current_user.id)
    render json: { status: "queued" }, status: :accepted
  end
end
```

Return 202 Accepted and let Sidekiq or Resque do the work.


✅ 3. Cache Expensive Reads

```ruby
def index
  @articles = Rails.cache.fetch("articles:index", expires_in: 10.minutes) do
    Article.includes(:author).order(created_at: :desc).limit(100).to_a
  end
  render json: @articles
end
```

Add ETag and Cache-Control headers for even better CDN caching.
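Conditional GETs are simpler than they look: hash the response body into an ETag, and when the client echoes it back via `If-None-Match`, skip the payload entirely. Rails does this for you with `fresh_when`/`stale?`; this framework-free sketch just shows the mechanics:

```ruby
require "digest"

# Build a Rack-style response with an ETag; return 304 with no body
# when the client's If-None-Match matches the current representation.
def conditional_get(body, if_none_match)
  etag = %("#{Digest::SHA256.hexdigest(body)}")
  if if_none_match == etag
    [304, { "ETag" => etag }, []]
  else
    [200, { "ETag" => etag, "Cache-Control" => "public, max-age=600" }, [body]]
  end
end

status, headers, = conditional_get("payload", nil)      # first request: 200
status2, = conditional_get("payload", headers["ETag"])  # revalidation: 304
```

A 304 costs almost nothing to serve, and a CDN honoring `Cache-Control` may answer without hitting your origin at all.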


🚀 Long-Term Improvements

🧠 1. Separate CPU vs IO Pools

Procfile

```
web_io: bundle exec puma -C config/puma.io.rb
web_cpu: bundle exec puma -C config/puma.cpu.rb
```

Nginx routing

```nginx
location /reports/ { proxy_pass http://app_cpu; proxy_read_timeout 180s; }
location /api/     { proxy_pass http://app_io;  proxy_read_timeout 30s;  }
```

CPU tasks won’t block I/O threads anymore.
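The two Puma configs referenced in the Procfile aren't shown here; a hypothetical split might look like this (file names, ports, and counts are illustrative assumptions, not values from this post):

```ruby
# config/puma.io.rb — I/O pool: fewer workers, many threads,
# since threads mostly sit waiting on the DB or external APIs.
workers 2
threads 16, 16
port 3001
```

```ruby
# config/puma.cpu.rb — CPU pool: more workers, few threads,
# so CPU-bound requests don't fight over the GVL within a worker.
workers 4
threads 2, 2
port 3002
```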


🧠 2. Tune Connection Pools

Ensure DB pool ≥ Puma max threads:

```yaml
# config/database.yml
production:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 8 } %>
```
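Why pool ≥ threads matters: when more threads want a connection than the pool holds, the extras queue up for a checkout (and ActiveRecord eventually raises `ConnectionTimeoutError`). This toy pool — not ActiveRecord's real implementation — makes the waiting visible:

```ruby
# Toy checkout/checkin pool; real pools add checkout timeouts and reaping.
class TinyPool
  def initialize(size)
    @conns = Queue.new
    size.times { |i| @conns << "conn-#{i}" }
  end

  def with_connection
    conn = @conns.pop # blocks when the pool is exhausted
    yield conn
  ensure
    @conns << conn if conn
  end
end

pool = TinyPool.new(2)
t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
# 4 threads, 2 connections: two threads must wait for a checkout, so
# 0.1s of "query time" takes two rounds (~0.2s wall clock, not ~0.1s).
Array.new(4) { Thread.new { pool.with_connection { sleep 0.1 } } }.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
```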

🧠 3. Prewarm Caches

Warm caches during deploy:

```shell
curl -s "https://example.com/api/popular?page=1" > /dev/null
```

🧠 4. Observability

```ruby
Rack::Timeout.service_timeout = 25
```

Monitor:

  • p95/p99 latency
  • 499 rate
  • Puma thread saturation
  • DB pool waits
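For reference, p95/p99 are just order statistics over raw request durations. This nearest-rank sketch is a simplification of what your APM computes (sample durations are made up):

```ruby
# Nearest-rank percentile over request durations (in seconds).
def percentile(samples, p)
  return nil if samples.empty?
  sorted = samples.sort
  rank = (p / 100.0 * sorted.size).ceil - 1
  sorted[rank.clamp(0, sorted.size - 1)]
end

durations = [0.08, 0.10, 0.12, 0.15, 0.20, 0.35, 0.50, 1.20, 2.50, 9.00]
p50 = percentile(durations, 50) # median
p95 = percentile(durations, 95)
# A comfortable median can hide a tail slow enough to trigger 499s,
# which is why alerting on averages alone is misleading.
```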

🕹 Timeout Hierarchy (Balanced Setup)

| Layer | Example Timeout | Purpose |
| --- | --- | --- |
| Client/CDN | 30s | User patience |
| Nginx `proxy_read_timeout` | 60s | Buffer for app |
| Rack::Timeout | 25s | Safety cutoff |
| DB/HTTP client | 10–20s | Fail fast |
| Puma `worker_timeout` | 60s | Detect stuck workers |

⚖️ Quick vs Long-Term Fixes

| Type | Actions | Goal |
| --- | --- | --- |
| Quick | Align timeouts, cache, async jobs | Stop 499s |
| Mid-Term | Split pools, prewarm, tune DB | Boost reliability |
| Long-Term | Add tracing, circuit breakers, load shedding | Prevent regression |

🧩 Final Checklist

✅ Track request IDs
✅ Fix slow DB queries
✅ Add timeouts for external calls
✅ Use caching smartly
✅ Offload heavy jobs
✅ Align timeout hierarchy
✅ Alert on 499s + latency


🏁 Conclusion

HTTP 499s aren’t bugs — they’re signals. They show your system is slower than your user’s patience.

By caching smartly, tuning timeouts, and separating workloads, you’ll transform your Rails API from fragile to formidable.

This post is licensed under CC BY 4.0 by the author.