Monitoring & Observability
RAM Status Endpoint
GET /upload/ram-status
Returns current server memory usage:
```json
{
  "ram_used_mb": 312.4,
  "ram_limit_mb": 512,
  "ram_free_mb": 199.6
}
```
- No authentication required
- Safe to poll frequently (lightweight psutil call)
- During load testing: poll every 30 seconds and log results
- Alert threshold: if `ram_used_mb` exceeds 450 MB, investigate immediately
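
The polling guidance above could be scripted roughly as follows. This is a minimal standard-library sketch, not the project's own tooling: the function names (`fetch_ram_status`, `is_over_threshold`, `poll`), the logger name, and the log format are assumptions for illustration. The JSON field names, the 30-second interval, and the 450 MB threshold come from this page.

```python
import json
import logging
import time
import urllib.request

ALERT_THRESHOLD_MB = 450     # alert threshold stated on this page
POLL_INTERVAL_SECONDS = 30   # load-testing poll interval stated on this page

log = logging.getLogger("ram_poller")  # hypothetical logger name


def fetch_ram_status(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/upload/ram-status and decode the JSON body."""
    url = f"{base_url}/upload/ram-status"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def is_over_threshold(status: dict, threshold_mb: float = ALERT_THRESHOLD_MB) -> bool:
    """Return True when ram_used_mb exceeds the alert threshold."""
    return status["ram_used_mb"] > threshold_mb


def poll(base_url: str, iterations: int, interval: float = POLL_INTERVAL_SECONDS) -> None:
    """Log RAM usage `iterations` times, warning whenever the threshold is exceeded."""
    for i in range(iterations):
        status = fetch_ram_status(base_url)
        level = logging.WARNING if is_over_threshold(status) else logging.INFO
        log.log(level, "ram_used_mb=%.1f ram_free_mb=%.1f",
                status["ram_used_mb"], status["ram_free_mb"])
        if i < iterations - 1:
            time.sleep(interval)
```

During a load test, something like `poll(base_url, iterations=20)` would record about ten minutes of samples, where `base_url` is the address of the deployed service. Call `logging.basicConfig(level=logging.INFO)` first so the INFO-level samples are actually emitted.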
Render Dashboard
Access at: dashboard.render.com
| Section | What to Check |
|---|---|
| Metrics | CPU usage, RAM usage over time graphs |
| Logs | Live Gunicorn logs (errors, warnings, request logs) |
| Events | Deploy history, health check failures, restarts |
| Shell | SSH into running instance for debugging |
Key Metrics to Watch
- RAM: Should stay below 450 MB under normal load
- CPU: Should be low for most operations (I/O-bound app)
- Health checks: Should be green — yellow/red means requests are failing
Gunicorn Logs
Gunicorn logs are visible in Render's Logs section. Key patterns:
| Log Pattern | Meaning |
|---|---|
| `[INFO] Starting gunicorn` | Worker started successfully |
| `[WARNING] Worker with pid XXXX was terminated due to signal 9` | OOM kill; check RAM |
| `[CRITICAL] WORKER TIMEOUT (pid:XXXX)` | Request took >120s; check for stuck S3 calls |
| `500 Internal Server Error` | Unhandled exception; check stack trace in logs |
| `Worker restarting after X requests` | Normal `--max-requests=1000` restart |
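
When scanning exported logs, the log patterns listed above can be matched mechanically. The sketch below is illustrative only: `LOG_PATTERNS` and `classify` are hypothetical names, and the regular expressions simply turn the `XXXX` and `X` placeholders into digit wildcards. The exact wording of Gunicorn messages can differ between versions, so check the patterns against real log lines before relying on them.

```python
import re
from typing import Optional

# (pattern, meaning) pairs transcribed from the log patterns on this page.
LOG_PATTERNS = [
    (re.compile(r"\[INFO\] Starting gunicorn"),
     "Worker started successfully"),
    (re.compile(r"\[WARNING\] Worker with pid \d+ was terminated due to signal 9"),
     "OOM kill; check RAM"),
    (re.compile(r"\[CRITICAL\] WORKER TIMEOUT \(pid:\d+\)"),
     "Request took >120s; check for stuck S3 calls"),
    (re.compile(r"500 Internal Server Error"),
     "Unhandled exception; check stack trace in logs"),
    (re.compile(r"Worker restarting after \d+ requests"),
     "Normal --max-requests=1000 restart"),
]


def classify(line: str) -> Optional[str]:
    """Return the meaning of the first matching pattern, or None if nothing matches."""
    for pattern, meaning in LOG_PATTERNS:
        if pattern.search(line):
            return meaning
    return None
```

Using `search` rather than `match` lets the patterns hit lines that carry a timestamp or process-id prefix.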
Daily Analytics
File: daily_analytics.py
- Logs daily upload/download counts, active users, storage stats
- Writes to a DB table or log file (configurable)
- Run as a scheduled job (daily at midnight) via Celery beat or cron
Database Monitoring
File: DB_Logger.py
- Logs slow DB queries (>100ms) for debugging
- Logs failed DB operations with full traceback
- Output goes to Gunicorn logs (visible in Render)
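
The behaviour listed above (a 100 ms slow-query threshold and full tracebacks on failure) could be implemented along these lines. This is a sketch under assumptions, not the contents of `DB_Logger.py`: the `timed_db_operation` context manager and the `db_logger` logger name are hypothetical, and it relies on the standard `logging` output ending up in the Gunicorn log stream.

```python
import logging
import time
from contextlib import contextmanager

SLOW_QUERY_THRESHOLD_MS = 100  # slow-query threshold stated on this page

log = logging.getLogger("db_logger")  # hypothetical logger name


@contextmanager
def timed_db_operation(description: str, threshold_ms: float = SLOW_QUERY_THRESHOLD_MS):
    """Time the wrapped block; log slow operations and failures."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        # log.exception records the message at ERROR level with the full traceback.
        log.exception("DB operation failed: %s", description)
        raise
    else:
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > threshold_ms:
            log.warning("Slow DB query (%.0f ms): %s", elapsed_ms, description)
```

A caller would wrap each query, for example `with timed_db_operation("load user uploads"): ...`, so that fast successful queries produce no log output at all.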
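To check DB connection health, open a Flask shell and run a trivial query:
```bash
flask shell
# Then, at the Python prompt (Engine.execute() was removed in SQLAlchemy 2.0, so use a connection):
>>> from sqlalchemy import text
>>> from lenzeye_database import db
>>> with db.engine.connect() as conn:
...     conn.execute(text('SELECT 1')).scalar()
```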
Error Tracking
Currently: Render logs only. No external error tracking (Sentry, etc.) is configured.
To add Sentry (recommended for production):
```python
import os

import sentry_sdk

# Read the DSN from the environment; sample 10% of transactions for tracing.
sentry_sdk.init(dsn=os.getenv('SENTRY_DSN'), traces_sample_rate=0.1)
```
TL;DR
Primary monitoring: Render dashboard (RAM graph, logs, events). /upload/ram-status for live RAM polling. Gunicorn logs for errors and timeouts. No external error tracking currently configured.
Alert thresholds: RAM >450 MB → investigate. WORKER TIMEOUT → check S3 response times. OOM kill (signal 9) → reduce concurrent ops or upgrade plan.