From 499 to 200: Understanding Puma Worker Timeouts, Nginx 499s, and How to Keep Your Rails API Fast
A deep dive into how Puma worker timeouts and Nginx 499 errors reveal performance bottlenecks in Rails applications — and how to fix them with caching, async jobs, and smarter timeout tuning.
When you start seeing a spike in HTTP 499s in your Nginx logs, it feels mysterious:
“The client closed the connection before the server responded.”
But a 499 isn’t random. It’s a warning signal that your system is slow or unbalanced.
Let’s unpack the why — and the how to fix it.
⚙️ Understanding the Stack
A typical Rails + Puma + Nginx stack looks like this:
```
Browser → CDN → Nginx → Puma → Rails → DB / Redis / External API
```
If any layer takes longer than the timeout of the layer above, the client gives up.
That’s when Nginx logs a 499.
💣 Anatomy of a Slow Request
```ruby
# app/controllers/demo_controller.rb
class DemoController < ApplicationController
  # CPU-bound: holds the GVL for the whole computation
  def cpu_heavy
    render json: { result: (1..50_000_000).sum }
  end

  # I/O-bound: ties a thread up waiting on a slow upstream
  def io_heavy
    render json: { result: Net::HTTP.get(URI("https://httpbin.org/delay/3")) }
  end
end
```
- CPU-bound: blocks the Ruby GVL and stalls other threads.
- I/O-bound: wastes time waiting for slow dependencies.
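The GVL effect is easy to reproduce outside Rails. A minimal plain-Ruby sketch (numbers scaled down for speed): CPU-bound threads serialize behind the GVL, while sleeping threads, which release it, overlap.

```ruby
require "benchmark"

# CPU-bound work holds the GVL, so two threads run roughly back to back.
def cpu_work
  (1..5_000_000).sum
end

# I/O-style work (sleep stands in for a slow HTTP/DB call) releases the
# GVL, so two threads wait concurrently.
def io_work
  sleep 0.5
end

cpu_time = Benchmark.realtime do
  2.times.map { Thread.new { cpu_work } }.each(&:join)
end

io_time = Benchmark.realtime do
  2.times.map { Thread.new { io_work } }.each(&:join)
end

puts format("CPU-bound threads: %.2fs (roughly serial)", cpu_time)
puts format("IO-bound threads:  %.2fs (roughly parallel, ~0.5s)", io_time)
```

The I/O pair finishes in about half a second total, not a full second, which is exactly why a thread-per-request server like Puma tolerates I/O waits far better than CPU spikes.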
🩹 Quick Mitigations
✅ 1. Align Timeouts (Stop Early Cutoffs)
Puma (`config/puma.rb`):

```ruby
workers ENV.fetch("WEB_CONCURRENCY") { 4 }
threads_count = ENV.fetch("RAILS_MAX_THREADS") { 8 }
threads threads_count, threads_count
worker_timeout 60        # restart workers that stop responding to the master
force_shutdown_after 10  # grace period before forcing threads to stop
```
Nginx:

```nginx
proxy_connect_timeout 5s;
proxy_send_timeout    60s;
proxy_read_timeout    60s;
send_timeout          60s;
```
Keep client/CDN timeouts slightly shorter.
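The same principle applies to the app’s own outbound calls: client timeouts should trip well before Nginx’s, so a slow upstream surfaces as an app error you can handle rather than a 499. A sketch using `Net::HTTP` against the demo endpoint from earlier (the values are illustrative):

```ruby
require "net/http"
require "uri"

uri = URI("https://httpbin.org/delay/3")

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.open_timeout = 5   # seconds allowed to establish the TCP connection
http.read_timeout = 10  # seconds allowed waiting for the response

# http.get(uri.path) would now raise Net::OpenTimeout / Net::ReadTimeout
# instead of hanging until Nginx gives up at 60s.
```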
✅ 2. Move Long Work to Background Jobs
```ruby
# app/jobs/report_job.rb
class ReportJob < ApplicationJob
  queue_as :default

  def perform(user_id)
    user = User.find(user_id)
    ReportGenerator.generate_for(user)
  end
end

# In the controller:
def create
  ReportJob.perform_later(current_user.id)
  render json: { status: "queued" }, status: :accepted
end
```
Return `202 Accepted` and let Sidekiq or Resque do the work.
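To see why this keeps the request fast, here is a toy in-process stand-in for the queue — a `Thread` plus Ruby’s `Queue` playing the part of Sidekiq; the names are made up for illustration:

```ruby
require "json"

QUEUE = Queue.new
RESULTS = {}

# Background worker thread (a Sidekiq process in real life)
worker = Thread.new do
  while (user_id = QUEUE.pop)
    RESULTS[user_id] = "report for user #{user_id}" # ReportGenerator stand-in
  end
end

# The "controller": enqueue and respond immediately — the slow work
# happens after the response is already on the wire.
def create_report(user_id)
  QUEUE << user_id                        # perform_later equivalent
  { status: "queued", code: 202 }.to_json
end

response = create_report(42)
QUEUE.close
worker.join
puts response
puts RESULTS[42]
```

The request path only pays for the enqueue, which is microseconds, regardless of how long the report takes.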
✅ 3. Cache Expensive Reads
```ruby
def index
  @articles = Rails.cache.fetch("articles:index", expires_in: 10.minutes) do
    Article.includes(:author).order(created_at: :desc).limit(100).to_a
  end
  render json: @articles
end
```
Add `ETag` and `Cache-Control` headers for even better CDN caching.
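In Rails, `fresh_when` / `stale?` handle conditional GETs for you; under the hood the exchange looks roughly like this sketch (a hand-rolled weak-ETag check, not Rails’ actual implementation):

```ruby
require "digest"

# The server hashes the response body into a weak ETag; if the client
# echoes it back in If-None-Match, reply 304 with no body at all.
def etag_for(body)
  %(W/"#{Digest::MD5.hexdigest(body)}")
end

def respond(body, if_none_match: nil)
  tag = etag_for(body)
  if if_none_match == tag
    { status: 304, body: nil, etag: tag }  # client's cached copy is still valid
  else
    { status: 200, body: body, etag: tag }
  end
end

first  = respond("article list")
second = respond("article list", if_none_match: first[:etag])
```

The second response carries no body, so the CDN and the client both skip the transfer entirely.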
🚀 Long-Term Improvements
🧠 1. Separate CPU vs IO Pools
`Procfile`:

```procfile
web_io: bundle exec puma -C config/puma.io.rb
web_cpu: bundle exec puma -C config/puma.cpu.rb
```
Nginx routing:

```nginx
# app_cpu / app_io are upstream blocks pointing at the two Puma processes
location /reports/ { proxy_pass http://app_cpu; proxy_read_timeout 180; }
location /api/     { proxy_pass http://app_io;  proxy_read_timeout 30;  }
```
CPU tasks won’t block I/O threads anymore.
🧠 2. Tune Connection Pools
Ensure DB pool ≥ Puma max threads:
```yaml
# config/database.yml
production:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 8 } %>
```
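Note that the pool is per process: every Puma worker opens its own pool, so the database must accept workers × threads connections in total. Rough arithmetic with the values from the Puma config above:

```ruby
workers = 4            # WEB_CONCURRENCY from the puma.rb above
threads_per_worker = 8 # RAILS_MAX_THREADS

pool_size = threads_per_worker      # minimum safe per-process pool
total_db_connections = workers * pool_size

puts "pool per worker: #{pool_size}"
puts "database max_connections needed: >= #{total_db_connections}"
# …plus headroom for migrations, consoles, and Sidekiq processes.
```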
🧠 3. Prewarm Caches
Warm caches during deploy:
```bash
curl -s "https://example.com/api/popular?page=1" > /dev/null
```
🧠 4. Observability
```ruby
# config/initializers/rack_timeout.rb
Rack::Timeout.service_timeout = 25
```
Monitor:
- p95/p99 latency
- 499 rate
- Puma thread saturation
- DB pool waits
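Rack::Timeout kills requests that blow the budget; it’s also worth logging near-misses before they become 499s. A minimal Rack middleware sketch (the class name and threshold are made up for illustration):

```ruby
# Logs any request slower than a threshold, using a monotonic clock so
# system clock jumps can't skew the measurement.
class SlowRequestLogger
  def initialize(app, threshold: 20.0, logger: $stdout)
    @app = app
    @threshold = threshold
    @logger = logger
  end

  def call(env)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = @app.call(env)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    @logger.puts "SLOW #{env['PATH_INFO']} took #{elapsed.round(2)}s" if elapsed > @threshold
    response
  end
end
```

In a Rails app this would be registered with `config.middleware.use SlowRequestLogger, threshold: 20.0`; set the threshold a bit below your `Rack::Timeout` budget so you see requests drifting toward the cutoff.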
🕹 Timeout Hierarchy (Balanced Setup)
| Layer | Example Timeout | Purpose |
|---|---|---|
| Client/CDN | 30s | User patience |
| Nginx `proxy_read_timeout` | 60s | Buffer for app |
| `Rack::Timeout` | 25s | Safety cutoff |
| DB/HTTP client | 10–20s | Fail fast |
| Puma `worker_timeout` | 60s | Detect stuck workers |
⚖️ Quick vs Long-Term Fixes
| Type | Actions | Goal |
|---|---|---|
| Quick | Align timeouts, cache, async jobs | Stop 499s |
| Mid-Term | Split pools, prewarm, tune DB | Boost reliability |
| Long-Term | Add tracing, circuit breakers, load shedding | Prevent regressions |
🧩 Final Checklist
- ✅ Track request IDs
- ✅ Fix slow DB queries
- ✅ Add timeouts for external calls
- ✅ Use caching smartly
- ✅ Offload heavy jobs
- ✅ Align the timeout hierarchy
- ✅ Alert on 499s + latency
🏁 Conclusion
HTTP 499s aren’t bugs — they’re signals. They show your system is slower than your user’s patience.
By caching smartly, tuning timeouts, and separating workloads, you’ll transform your Rails API from fragile to formidable.