Memory Management
Why Memory Matters
The Render Starter plan provides 512 MB of RAM. The Flask app plus the SQLAlchemy, boto3, and cryptography libraries use ~237 MB at baseline, leaving only ~275 MB for runtime operations. Every design decision in the upload path is constrained by this budget.
Memory Budget
| State | RAM |
|---|---|
| Idle (1 worker, after startup) | ~237 MB |
| Single encrypted upload (1 part in flight) | ~257 MB (+20 MB) |
| 4 concurrent encrypted uploads (peak) | ~317–398 MB |
| Absolute ceiling (before OOM kill) | 512 MB |
| Safe headroom at peak | ~114 MB |
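The figures in the table follow from simple arithmetic, sketched below. The constant and function names are illustrative and do not come from the codebase; the numbers are the ones in the table, and the fixed per-upload cost is a simplifying assumption.

```python
# Figures taken from the table above; all names here are illustrative.
RAM_LIMIT_MB = 512        # Render Starter plan ceiling
BASELINE_MB = 237         # idle, 1 worker, after startup
PER_UPLOAD_MB = 20        # observed increase with one part in flight
OBSERVED_PEAK_MB = 398    # highest value in the 4-upload peak range


def expected_ram_mb(concurrent_uploads: int) -> int:
    """Estimate RAM use, assuming each in-flight part adds a fixed cost."""
    return BASELINE_MB + concurrent_uploads * PER_UPLOAD_MB


def headroom_mb(used_mb: int) -> int:
    """RAM left before the plan ceiling."""
    return RAM_LIMIT_MB - used_mb


print(expected_ram_mb(1))             # 257
print(expected_ram_mb(4))             # 317, the lower bound of the peak range
print(headroom_mb(OBSERVED_PEAK_MB))  # 114
```

The observed peak (398 MB) is higher than the linear estimate (317 MB), so the headroom row is derived from the measured value rather than the estimate.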
BoundedSemaphore(4)
```python
import io
import threading

# One process-wide semaphore: at most 4 encrypt+upload operations run at once.
_encrypt_semaphore = threading.BoundedSemaphore(4)

# In upload-part handler:
with _encrypt_semaphore:
    encrypted_chunk = encrypt_multipart_chunk(plaintext_chunk, key, iv, byte_offset)
    resp = s3.upload_part(..., Body=io.BytesIO(encrypted_chunk), ...)
```
- Limits concurrent encrypt+S3-upload operations across all users (not per user)
- At 10 MB per part: 4 × 10 MB = 40 MB encryption buffer maximum
- Additional requests wait (queue) rather than being rejected
- Shared between `guest_upload_routes.py` and `upload_wasabi_home.py`
| Semaphore value | Max concurrent encryptions | Peak RAM from encryption |
|---|---|---|
| 4 (current) | 4 | ~40 MB |
| 8 | 8 | ~80 MB (risky on 512 MB) |
| 1 | 1 | ~10 MB (too slow) |
10 MB Chunk Strategy
Each upload part is 10 MB. Why?
- S3 minimum part size: 5 MB (the final part may be smaller)
- 10 MB balances: enough to amortize HTTP overhead, small enough to bound RAM per part
- 4 concurrent parts × 10 MB = 40 MB maximum encryption buffer
- Parts above 10 MB increase risk of RAM spikes; below 5 MB violates S3 minimum
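As a rough illustration of the chunking described above, the sketch below splits a stream into fixed-size parts while holding only one part in memory at a time. The helper names are hypothetical, and treating 10 MB as 10 × 1024 × 1024 bytes is an assumption.

```python
import io
from typing import BinaryIO, Iterator, Tuple

PART_SIZE = 10 * 1024 * 1024  # assumed byte value of the 10 MB part size


def part_count(file_size: int, part_size: int = PART_SIZE) -> int:
    """Number of parts needed for file_size bytes (ceiling division)."""
    return -(-file_size // part_size)


def iter_parts(stream: BinaryIO, part_size: int = PART_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (part_number, chunk) pairs, reading one chunk at a time."""
    part_number = 1  # S3 part numbers start at 1
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        yield part_number, chunk
        part_number += 1


print(part_count(25 * 1024 * 1024))  # 3

# Tiny part size so the demo does not allocate real 10 MB buffers.
sizes = [len(chunk) for _, chunk in iter_parts(io.BytesIO(b"x" * 25), part_size=10)]
print(sizes)  # [10, 10, 5]
```

Only the last part is shorter than the part size, which is consistent with the S3 rule that every part except the final one must meet the minimum.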
Lazy Loading
Heavy libraries are not imported at startup in Lenzeye.py:
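```python
# Placeholders: the real modules are imported on first use, not at startup.
cv2 = None
numpy = None
vision = None  # google.cloud.vision
pytesseract = None
razorpay = None
waitress = None


def get_cv2():
    """Import OpenCV on first call and cache it in the module-level name."""
    global cv2
    if cv2 is None:
        import cv2 as _cv2
        cv2 = _cv2
    return cv2
```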
| Library | Size | Loaded when |
|---|---|---|
| `cv2` (OpenCV) | ~30–50 MB | First image processing request |
| `numpy` | ~20 MB | First numerical operation |
| `google.cloud.vision` | ~10 MB | First Vision API call |
| `pytesseract` | ~5 MB | First OCR call |
| `razorpay` | ~5 MB | First payment request |
Total savings: ~82 MB baseline RAM reduction (from ~319 MB to ~237 MB).
RAM Monitoring Endpoint
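```python
import os

from flask import jsonify

# `app` is the Flask application object defined elsewhere.
@app.route('/upload/ram-status')
def ram_status():
    import psutil  # imported on first request rather than at startup
    proc = psutil.Process(os.getpid())
    ram_mb = proc.memory_info().rss / 1024 / 1024  # resident set size, bytes to MB
    return jsonify({
        "ram_used_mb": round(ram_mb, 1),
        "ram_limit_mb": 512,
        "ram_free_mb": round(512 - ram_mb, 1)
    })
```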
- Poll this endpoint during load testing to monitor real-time RAM
- Used to validate the 6h 56min upload session
- No authentication required (read-only system metric)
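A load-test script might poll the endpoint along the following lines. This is a sketch under assumptions: `poll_ram` and `http_fetch` are hypothetical helpers, the base URL is whatever host the app is deployed on, and the sample readings in the demo are made up for illustration.

```python
import json
import time
import urllib.request
from typing import Callable, Dict


def http_fetch(base_url: str) -> Dict[str, float]:
    """Fetch one reading from /upload/ram-status (base_url is deployment-specific)."""
    with urllib.request.urlopen(base_url + "/upload/ram-status", timeout=5) as resp:
        return json.load(resp)


def poll_ram(fetch: Callable[[], Dict[str, float]], samples: int,
             interval_s: float = 1.0) -> Dict[str, float]:
    """Call fetch() `samples` times and report peak usage and minimum free RAM."""
    peak_used = float("-inf")
    min_free = float("inf")
    for i in range(samples):
        payload = fetch()
        peak_used = max(peak_used, payload["ram_used_mb"])
        min_free = min(min_free, payload["ram_free_mb"])
        if i < samples - 1:
            time.sleep(interval_s)
    return {"peak_used_mb": peak_used, "min_free_mb": min_free}


# Demo with made-up readings instead of a live server.
fake_readings = iter([
    {"ram_used_mb": 240.0, "ram_limit_mb": 512, "ram_free_mb": 272.0},
    {"ram_used_mb": 398.0, "ram_limit_mb": 512, "ram_free_mb": 114.0},
    {"ram_used_mb": 260.5, "ram_limit_mb": 512, "ram_free_mb": 251.5},
])
summary = poll_ram(lambda: next(fake_readings), samples=3, interval_s=0)
print(summary)  # {'peak_used_mb': 398.0, 'min_free_mb': 114.0}
```

Against a real deployment, `fetch` would be something like `lambda: http_fetch(base_url)`, with `interval_s` chosen to match the length of the test.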
What NOT to Do
| Action | Risk |
|---|---|
| Increase `--workers` to 2+ | Each worker duplicates the baseline (~474 MB total); OOM |
| Load large files into `request.get_data()` | Full file in RAM; OOM on large uploads |
| Re-download from S3 to compute HMAC | File-size RAM spike; OOM |
| Import cv2/numpy at module level | +82 MB baseline; no headroom for uploads |
| Remove `_encrypt_semaphore` | Unbounded concurrent encryption; OOM under load |
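One way to avoid the file-size spike from recomputing an HMAC over a whole object is to feed the MAC one chunk at a time as parts pass through. The source does not say how the HMAC is actually computed, so the sketch below is an assumption throughout: the function name, the use of SHA-256, and the key are all illustrative.

```python
import hashlib
import hmac
from typing import Iterable


def hmac_over_chunks(key: bytes, chunks: Iterable[bytes]) -> str:
    """Compute an HMAC incrementally so only one chunk is in memory at a time."""
    mac = hmac.new(key, digestmod=hashlib.sha256)  # SHA-256 is an assumption
    for chunk in chunks:
        mac.update(chunk)
    return mac.hexdigest()


key = b"demo-key"  # illustrative key only
chunks = [b"part-one", b"part-two", b"part-three"]

incremental = hmac_over_chunks(key, chunks)
one_shot = hmac.new(key, b"".join(chunks), hashlib.sha256).hexdigest()
print(incremental == one_shot)  # True
```

The incremental result matches the one-shot digest over the concatenated data, so peak memory is bounded by the chunk size rather than the file size.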
TL;DR
512 MB plan, ~237 MB baseline, ~275 MB headroom. BoundedSemaphore(4) caps concurrent encryption at 40 MB peak. 10 MB chunks bound per-part RAM. Lazy loading saves 82 MB. /upload/ram-status for live monitoring. Validated: 2,630+ files, 6h 56min, peak 398 MB.