Gunicorn Configuration

Current Procfile

web: gunicorn Lenzeye:app --workers=1 --threads=6 --timeout=120 --graceful-timeout=30 --keep-alive=5 --worker-class=gthread --worker-tmp-dir=/dev/shm --log-level=info --limit-request-line=4096 --limit-request-field_size=8190 --max-requests=1000 --max-requests-jitter=50

Parameter Breakdown

Parameter	Value	Reason
`--workers=1`	1	Single process to prevent RAM doubling
`--threads=6`	6	Handle up to 6 concurrent requests in one process
`--timeout=120`	120s	Large encrypted uploads can take >60s on slow connections
`--graceful-timeout=30`	30s	Give active requests time to finish on worker restart
`--keep-alive=5`	5s	HTTP keep-alive for repeated requests (upload parts)
`--worker-class=gthread`	gthread	Thread-based; suitable for I/O-bound Flask workloads
`--worker-tmp-dir=/dev/shm`	RAM-backed	Worker heartbeat files on RAM disk — avoids disk I/O
`--log-level=info`	info	Error-log verbosity: startup, worker lifecycle, and errors (per-request access logs also need `--access-logfile`)
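`--limit-request-line=4096`	4096	Max size of the HTTP request line in bytes, which caps URL length
`--limit-request-field_size=8190`	8190	Max size of a single HTTP request header field in bytes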
`--max-requests=1000`	1000	Restart worker after 1000 requests to prevent memory drift
`--max-requests-jitter=50`	50	Random jitter (0–50 requests) added to the restart threshold; staggers restarts if more workers are ever added
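
The same flags can also live in a Gunicorn config file, which is plain Python. The sketch below is a minimal illustration, assuming a file named `gunicorn.conf.py` next to the Procfile. That file is hypothetical, since the project currently passes everything on the command line. The setting names are Gunicorn's config-file equivalents of the flags in the table.

```python
# gunicorn.conf.py (hypothetical file; the project currently uses CLI flags)
# Would be started with: gunicorn -c gunicorn.conf.py Lenzeye:app

workers = 1                       # single process, avoids doubling RAM
threads = 6                       # up to 6 concurrent requests in that process
worker_class = "gthread"          # thread-based worker
timeout = 120                     # seconds
graceful_timeout = 30             # seconds for in-flight requests on restart
keepalive = 5                     # seconds to hold idle keep-alive connections
worker_tmp_dir = "/dev/shm"       # heartbeat files on a RAM-backed filesystem
loglevel = "info"
limit_request_line = 4096         # bytes
limit_request_field_size = 8190   # bytes
max_requests = 1000               # recycle the worker after this many requests
max_requests_jitter = 50          # random 0..50 added to max_requests
```

Keeping the values in a file makes them easier to review in diffs, but it changes how the service starts, so it would need the same validation as any other config change.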

Why 1 Worker?

The Render Starter plan provides 512 MB RAM. Under production load:

State	RAM
Baseline (1 worker, idle)	~237 MB
Normal upload load	243–377 MB
Peak (4 concurrent encrypted uploads)	398 MB
Headroom	~114 MB

With 2 workers, each worker loads the full Flask application independently (models, blueprints, lazy imports). Baseline would be ~474 MB (2 × 237 MB), leaving only ~38 MB of headroom. A single encrypted upload would likely trigger an out-of-memory (OOM) kill.

Warning: Do NOT increase workers without re-validating RAM

This configuration was validated over 6h 56min with 2,630+ files and 0 crashes. Increasing --workers on the Starter plan is expected to cause out-of-memory (OOM) kills.
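
The RAM budget above can be checked with a few lines of arithmetic. This is a sketch using only the figures from the table. The per-upload estimate at the end assumes memory grows roughly linearly with concurrent uploads, which the measurements do not establish.

```python
PLAN_RAM_MB = 512   # Render Starter plan
BASELINE_MB = 237   # 1 worker, idle
PEAK_MB = 398       # 1 worker, 4 concurrent encrypted uploads


def headroom_mb(used_mb: int, plan_mb: int = PLAN_RAM_MB) -> int:
    """RAM left on the plan once `used_mb` is in use."""
    return plan_mb - used_mb


one_worker_headroom = headroom_mb(PEAK_MB)
two_worker_baseline = 2 * BASELINE_MB          # each worker loads the full app
two_worker_headroom = headroom_mb(two_worker_baseline)

# Assumption (not measured): upload memory scales linearly with concurrency.
per_upload_mb = (PEAK_MB - BASELINE_MB) / 4

print(one_worker_headroom)                    # 114
print(two_worker_baseline)                    # 474
print(two_worker_headroom)                    # 38
print(per_upload_mb > two_worker_headroom)    # True
```

Under that linear assumption, one upload costs about 40 MB, which already exceeds the ~38 MB that two idle workers would leave free.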

Why gthread?

gthread (threaded Gunicorn worker) allows one process to handle multiple concurrent requests using Python threads:

6 threads = 6 concurrent requests
For I/O-bound work (S3 uploads, DB queries), threads are effective: a thread waiting on I/O releases the GIL, so the other threads keep running
The GIL is therefore not a bottleneck for I/O-bound workloads
The alternative, gevent, would require monkey-patching and has known compatibility issues with the cryptography library
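
The effect of threads on I/O-bound work can be seen in a small standalone experiment. This is an illustrative sketch, not Gunicorn's internals: `fake_io_request` is a hypothetical stand-in that sleeps instead of calling S3 or a database, and a `ThreadPoolExecutor` with 6 threads plays the role of the worker's thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_request(delay_s: float) -> float:
    """Stand-in for an S3 upload or DB query; sleep releases the GIL like blocking I/O."""
    time.sleep(delay_s)
    return delay_s


def run_concurrently(n_requests: int, delay_s: float, n_threads: int = 6) -> float:
    """Run n_requests simulated requests on n_threads threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(fake_io_request, [delay_s] * n_requests))
    return time.perf_counter() - start


elapsed = run_concurrently(n_requests=6, delay_s=0.2)
print(f"6 simulated requests took {elapsed:.2f}s (one after another: about 1.2s)")
```

Because all six threads wait at the same time, the total is close to one request's latency rather than six. CPU-bound work would not overlap this way.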

Timeout Considerations

--timeout=120: a 50 GB file in 10 MB chunks is 5,000 parts, and each part is its own short-lived request. The timeout applies per request, not per upload session, so 120s per request is sufficient.
If the timeout is too short (e.g., 30s), slow connections uploading a 10 MB chunk will time out and fail the upload.
--graceful-timeout=30 allows in-flight upload-part requests to complete before Gunicorn kills the worker on restart.
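
To make the timeout reasoning concrete, the sketch below computes the part count and the slowest connection that can still deliver one 10 MB chunk inside a given timeout. It uses decimal units (1 MB = 1,000,000 bytes) and, like the text above, treats the timeout as a wall-clock budget for the whole part request, ignoring server-side processing time. The helper names are illustrative.

```python
MB = 1000 ** 2
GB = 1000 ** 3


def part_count(file_bytes: int, chunk_bytes: int) -> int:
    """Number of parts needed, rounding the last partial chunk up."""
    return -(-file_bytes // chunk_bytes)  # integer ceiling division


def min_throughput_kb_s(chunk_bytes: int, timeout_s: int) -> float:
    """Slowest sustained upload rate (kB/s) that finishes one chunk in time."""
    return chunk_bytes / timeout_s / 1000


print(part_count(50 * GB, 10 * MB))                   # 5000
print(round(min_throughput_kb_s(10 * MB, 120), 1))    # 83.3
print(round(min_throughput_kb_s(10 * MB, 30), 1))     # 333.3
```

With a 30s timeout a client needs roughly four times the sustained bandwidth to get a chunk through, which is why the shorter value fails on slow connections.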

Lazy Loading Impact

Heavy libraries are lazy-loaded in Lenzeye.py to reduce baseline RAM:

```python
cv2 = None          # Loaded on first use
numpy = None        # Loaded on first use
vision = None       # google.cloud.vision, loaded on first use
pytesseract = None  # Loaded on first use
razorpay = None     # Loaded on first use
```

This reduced baseline from ~319 MB to ~237 MB — an 82 MB saving that enabled sustained operation under load.

TL;DR

1 worker, 6 threads, 120s timeout, gthread class. This is the validated config for Render Starter (512 MB). The single-worker constraint comes from two things: RAM limits, and the in-process _hmac_registry (HMAC accumulation), whose state lives in one process's memory and would not be visible to a second worker. Do not change this without re-validating with a full production load test.