Monitoring & Observability

RAM Status Endpoint

GET /upload/ram-status

Returns current server memory usage:

```json
{
  "ram_used_mb": 312.4,
  "ram_limit_mb": 512,
  "ram_free_mb": 199.6
}
```

No authentication required
Safe to poll frequently (lightweight psutil call)
During load testing: poll every 30 seconds and log results
Alert threshold: if ram_used_mb exceeds 450 MB, investigate immediately
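
The polling guidance above can be sketched as a small standard-library script. This is a minimal sketch, not part of the codebase: the function names and the `base_url` argument are hypothetical, and it assumes the endpoint returns the three JSON fields shown above.

```python
import json
import logging
import time
import urllib.request

logger = logging.getLogger("ram_monitor")

ALERT_THRESHOLD_MB = 450  # alert threshold stated in this section
POLL_INTERVAL_S = 30      # load-testing poll interval stated in this section


def fetch_ram_status(base_url, timeout=5.0):
    """GET /upload/ram-status and return the decoded JSON payload."""
    url = f"{base_url}/upload/ram-status"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def exceeds_threshold(status, threshold_mb=ALERT_THRESHOLD_MB):
    """Return True when ram_used_mb is strictly above the alert threshold."""
    return status["ram_used_mb"] > threshold_mb


def poll(base_url, interval_s=POLL_INTERVAL_S, max_polls=None):
    """Log RAM usage every interval_s seconds and warn above the threshold."""
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            status = fetch_ram_status(base_url)
        except (OSError, ValueError) as exc:  # network errors, invalid JSON
            logger.error("ram-status request failed: %s", exc)
        else:
            logger.info(
                "used=%.1f MB free=%.1f MB limit=%s MB",
                status["ram_used_mb"],
                status["ram_free_mb"],
                status["ram_limit_mb"],
            )
            if exceeds_threshold(status):
                logger.warning(
                    "RAM above %s MB, investigate", ALERT_THRESHOLD_MB
                )
        polls += 1
        time.sleep(interval_s)
```

To use it during a load test, call `logging.basicConfig(level=logging.INFO)` and then `poll(...)` with the service's base URL; `max_polls` bounds the loop for short runs.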

Render Dashboard

Access at: dashboard.render.com

| Section | What to Check |
|---|---|
| Metrics | CPU usage, RAM usage over time graphs |
| Logs | Live Gunicorn logs (errors, warnings, request logs) |
| Events | Deploy history, health check failures, restarts |
| Shell | SSH into running instance for debugging |

Key Metrics to Watch

RAM: Should stay below 450 MB under normal load
CPU: Should be low for most operations (I/O-bound app)
Health checks: Should be green; yellow or red means requests are failing

Gunicorn Logs

Gunicorn logs are visible in Render's Logs section. Key patterns:

| Log Pattern | Meaning |
|---|---|
| `[INFO] Starting gunicorn` | Gunicorn master process started successfully |
| `[WARNING] Worker with pid XXXX was terminated due to signal 9` | Worker received SIGKILL, most likely an OOM kill; check RAM |
| `[CRITICAL] WORKER TIMEOUT (pid:XXXX)` | Request took >120s (worker timeout); check for stuck S3 calls |
| `500 Internal Server Error` | Unhandled exception; check stack trace in logs |
| `Worker restarting after X requests` | Normal `--max-requests=1000` restart |
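
When reviewing exported logs offline, the patterns in the table above can be matched with a few regular expressions. The following is a rough sketch only: the category labels and function names are made up for illustration, and the regexes assume log lines contain the pattern text exactly as listed in the table.

```python
import re
from collections import Counter

# (compiled pattern, label) pairs; pattern text comes from the table above,
# the labels are illustrative names chosen for this sketch.
LOG_PATTERNS = [
    (re.compile(r"\[INFO\] Starting gunicorn"), "startup"),
    (re.compile(r"terminated due to signal 9"), "oom_kill"),
    (re.compile(r"WORKER TIMEOUT \(pid:\d+\)"), "worker_timeout"),
    (re.compile(r"500 Internal Server Error"), "server_error"),
    (re.compile(r"Worker restarting after \d+ requests"), "max_requests_restart"),
]


def classify(line):
    """Return the label of the first matching pattern, or None."""
    for pattern, label in LOG_PATTERNS:
        if pattern.search(line):
            return label
    return None


def count_events(lines):
    """Count how often each known pattern appears in an iterable of lines."""
    return Counter(
        label for label in (classify(line) for line in lines) if label
    )
```

Feeding `count_events` the lines of a downloaded log file gives a quick tally of timeouts and suspected OOM kills to compare against the alert thresholds.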

Daily Analytics

File: daily_analytics.py

Logs daily upload/download counts, active users, storage stats
Writes to a DB table or log file (configurable)
Run as a scheduled job (daily at midnight) via Celery beat or cron
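
To make the kind of aggregation described above concrete, here is a hedged sketch of a daily summary. It does not reflect the actual contents of `daily_analytics.py`: the event dictionaries and their keys (`day`, `kind`, `user_id`, `bytes`) are hypothetical, and `bytes_uploaded` merely stands in for the storage stats.

```python
from collections import Counter
from datetime import date


def summarize_day(events, day):
    """Summarize one day's activity.

    events: iterable of dicts with hypothetical keys
            'day' (datetime.date), 'kind' ('upload' or 'download'),
            'user_id', and 'bytes'.
    """
    todays = [e for e in events if e["day"] == day]
    kinds = Counter(e["kind"] for e in todays)
    return {
        "date": day.isoformat(),
        "uploads": kinds["upload"],
        "downloads": kinds["download"],
        "active_users": len({e["user_id"] for e in todays}),
        "bytes_uploaded": sum(
            e["bytes"] for e in todays if e["kind"] == "upload"
        ),
    }
```

A scheduled job (Celery beat or cron, as noted above) could call a function like this once per day and write the resulting dictionary to the configured table or log file.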

Database Monitoring

File: DB_Logger.py

Logs slow DB queries (>100ms) for debugging
Logs failed DB operations with full traceback
Output goes to Gunicorn logs (visible in Render)
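
The behaviour listed above (a warning for queries slower than 100 ms, a full traceback on failure) can be illustrated with a small context manager. This is a sketch under assumptions, not the code in `DB_Logger.py`: the `timed_query` name and the logger name are hypothetical, and it relies only on the standard library.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("db_logger")

SLOW_QUERY_MS = 100  # slow-query threshold stated in this section


@contextmanager
def timed_query(label, threshold_ms=SLOW_QUERY_MS):
    """Time the wrapped DB call; log slow calls and failures."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        # logger.exception logs at ERROR level and includes the traceback
        logger.exception("DB operation failed: %s", label)
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > threshold_ms:
            logger.warning("Slow query (%.0f ms): %s", elapsed_ms, label)
```

Wrapping a call as `with timed_query("load user uploads"): ...` sends the warning or traceback through the standard `logging` module, which under Gunicorn typically ends up in the same log stream that Render displays.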

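To check DB connection health, open a Flask shell:

```bash
flask shell
```

Then, at the Python prompt:

```python
from sqlalchemy import text
from lenzeye_database import db

# db.engine.execute('SELECT 1') only works on SQLAlchemy 1.x;
# it was removed in 2.0. A session-level execute works on 1.4 and 2.x.
db.session.execute(text('SELECT 1'))
```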

Error Tracking

Currently: Render logs only. No external error tracking (Sentry, etc.) is configured.

To add Sentry (recommended for production):

```python
import os

import sentry_sdk

sentry_sdk.init(
    dsn=os.getenv('SENTRY_DSN'),  # DSN is read from the environment
    traces_sample_rate=0.1,       # trace 10% of transactions
)
```

TL;DR

Primary monitoring: Render dashboard (RAM graph, logs, events). /upload/ram-status for live RAM polling. Gunicorn logs for errors and timeouts. No external error tracking currently configured.

Alert thresholds: RAM >450 MB → investigate. WORKER TIMEOUT → check S3 response times. OOM kill (signal 9) → reduce concurrent ops or upgrade plan.