Logging & Observability

someli-api currently logs almost exclusively through console.log, with no structured logging and minimal observability tooling. This document describes what is actually in place, what is available but unused, and what to assume when debugging production.

TL;DR

Concern	State
Structured logging	❌ None active. Winston is in `package.json` but is required in only one file (`helper/functionsForAi/cloudRag.js`).
Log levels	❌ Mixed `console.log` / `console.error` only.
Request logging	❌ No `morgan` or equivalent.
Healthcheck endpoints	✅ `/health` and `/db-health` exist on the main app.
Metrics endpoint	❌ None.
PM2 log paths and rotation	❌ No `error_file` / `out_file` configured per process, and `ecosystem.config.js` sets up no rotation.
Centralized log aggregation	❌ Not configured at the application layer.

When debugging production, your primary tool is pm2 logs <process-name>.

Logging in the Codebase

What's used

console.log('something happened', detail);
console.error('failure:', err);

These two calls account for essentially all logging in the codebase. They're scattered through:

helper/helper.js (lines 93, 222, 370, 607, ...)
methods.js
All route files
All job_*.js files
actions/actions.js — console.error in every CRUD method's error path

What's installed but unused

package.json lists:

"winston": "^3.17.0"

A grep across the codebase confirms exactly one require('winston') statement — in helper/functionsForAi/cloudRag.js. The package is effectively dead weight everywhere else. If you're starting a new module, do not assume Winston is configured — you'd have to wire it up yourself. (Earlier versions of this doc claimed zero callers; that was off by one.)

Logging conventions

By accident rather than design:

console.log for info / debugging
console.error for caught errors (especially in actions/actions.js)
No timestamps added by the application. PM2 can prefix each line with one, but only when an app is started with --time or sets log_date_format, so verify on your host.
No request correlation ID, so there is no way to trace a single request across log lines
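To make these gaps concrete, here is a minimal sketch of what a leveled, timestamped JSON log line could look like using only Node built-ins. It is not what the codebase does today: `createLogger`, the level names, and the field names are all hypothetical, and a configured Winston logger would give you the same result.

```javascript
// Minimal leveled JSON logger (hypothetical helper, not part of someli-api).
const LEVELS = { error: 0, warn: 1, info: 2, debug: 3 };

// Errors and warnings go to stderr, everything else to stdout, so that
// `pm2 logs --err` still works as a filter.
function defaultWrite(lvl, line) {
  const stream = lvl === 'error' || lvl === 'warn' ? process.stderr : process.stdout;
  stream.write(line + '\n');
}

function createLogger({ level = 'info', write = defaultWrite } = {}) {
  const threshold = LEVELS[level];
  const log = (lvl, message, fields = {}) => {
    if (LEVELS[lvl] > threshold) return; // drop lines below the configured level
    const line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level: lvl,
      message,
      ...fields, // e.g. accountId, memberId, jobType, requestId
    });
    write(lvl, line);
  };
  return {
    error: (message, fields) => log('error', message, fields),
    warn: (message, fields) => log('warn', message, fields),
    info: (message, fields) => log('info', message, fields),
    debug: (message, fields) => log('debug', message, fields),
  };
}

// Usage: one JSON object per line, with the identifiers as real fields.
const logger = createLogger({ level: 'info' });
logger.info('post published', { accountId: 1234, jobType: 'auto_post' });
```

Because each line is a single JSON object, identifiers such as accountId become fields that can be filtered on, rather than substrings to grep for.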

PM2: The De-facto Log Layer

ecosystem.config.js registers ~40 PM2 apps. None of them set explicit log paths:

// What you'll find:
{ name: 'job_auto_post', script: './job_auto_post.js', autorestart: true }

// What you will NOT find:
{ name: 'job_auto_post', error_file: '/var/log/.../err.log', out_file: '/var/log/...' }

PM2 falls back to its default location, typically ~/.pm2/logs/<app-name>-out.log and <app-name>-error.log. In production, find these on the host running PM2.

Useful pm2 commands

Command	What it does
`pm2 logs`	Tail all apps' logs
`pm2 logs <name>`	Tail one app
`pm2 logs <name> --lines 1000`	Print the last 1000 lines, then keep tailing (add `--nostream` to print and exit)
`pm2 logs <name> --err`	Only stderr
`pm2 flush`	Clear all logs
`pm2 reloadLogs`	Re-open log files (after rotation)

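Log rotation

ecosystem.config.js does not set up log rotation. pm2-logrotate is a PM2 module that is installed and configured per host with pm2 install and pm2 set, not declared in the ecosystem file. Production hosts likely have it installed already, but verify on your specific environment: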

pm2 install pm2-logrotate     # if not yet installed
pm2 set pm2-logrotate:max_size 100M
pm2 set pm2-logrotate:retain 30

Healthcheck Endpoints

Two healthchecks exist on the main app, mounted at the root in server.js:

`GET /health`

{
  "status": "Server is running",
  "port": 5002,
  "environment": "production",
  "timestamp": "2026-05-08T12:34:56.789Z"
}

Liveness check. Returns immediately without touching the database. Note the status value is the literal string "Server is running" — earlier versions of this doc claimed "ok", which is wrong. If a load balancer is configured to compare against "ok" exactly, it will fail.
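A probe that wants to inspect the body, rather than only the HTTP status code, could compare against the literal string as in the sketch below. The helper names `isLive` and `checkLiveness` are hypothetical, and the code assumes Node 18+ for the global `fetch`.

```javascript
// Hypothetical liveness probe helpers (not part of someli-api).
const EXPECTED_STATUS = 'Server is running'; // the literal string /health returns

// Pure check on a parsed /health response body.
function isLive(body) {
  return Boolean(body) && body.status === EXPECTED_STATUS;
}

// Fetches /health and resolves to true only when the body matches.
async function checkLiveness(baseUrl, timeoutMs = 2000) {
  try {
    const res = await fetch(new URL('/health', baseUrl), {
      signal: AbortSignal.timeout(timeoutMs),
    });
    return res.ok && isLive(await res.json());
  } catch {
    return false; // network error, timeout, or a non-JSON body
  }
}
```

Calling `checkLiveness('http://localhost:5002')` resolves to a boolean and never rejects, which keeps the calling probe script simple.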

`GET /db-health`

Tests the MySQL connection by running a simple query. Returns:

{ "status": "ok", "db": "connected" }

or an error envelope if the query fails.

Use this for readiness probes — it confirms both the process is alive and the database is reachable. Slower than /health; don't use it for high-frequency liveness polling.

What's missing

/metrics (Prometheus / OpenMetrics)
/ping (zero-cost liveness; /health is fine but heavier than needed)
Per-job health endpoints (background workers don't expose HTTP, so their health is "is the PM2 process up?")
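As a rough idea of how small a first /metrics endpoint could be, the sketch below renders two process-level gauges in the Prometheus text exposition format without any extra dependency. The metric names and the `metricsHandler` function are assumptions for illustration; a real setup would more likely use a Prometheus client library.

```javascript
// Renders process-level gauges in the Prometheus text exposition format.
// Hypothetical sketch: nothing like this exists in someli-api today.
function renderMetrics({ uptimeSeconds, rssBytes }) {
  return [
    '# HELP process_uptime_seconds Time since the Node.js process started.',
    '# TYPE process_uptime_seconds gauge',
    `process_uptime_seconds ${uptimeSeconds}`,
    '# HELP process_resident_memory_bytes Resident set size in bytes.',
    '# TYPE process_resident_memory_bytes gauge',
    `process_resident_memory_bytes ${rssBytes}`,
    '', // the exposition format expects a trailing newline
  ].join('\n');
}

// Express-style handler; could be mounted with app.get('/metrics', metricsHandler).
function metricsHandler(req, res) {
  res.set('Content-Type', 'text/plain; version=0.0.4');
  res.send(
    renderMetrics({
      uptimeSeconds: process.uptime(),
      rssBytes: process.memoryUsage().rss,
    })
  );
}
```

This only covers the HTTP app. Background workers would still need another mechanism, since they expose no HTTP endpoint.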

What's NOT Logged

These important events do not produce dedicated log lines:

Event	Visibility
Inbound HTTP requests	❌ No request logger
Slow queries	❌ Not captured
AI model calls (Gemini, Bedrock, Vertex)	⚠️ Sometimes logged, sometimes not
S3 uploads	⚠️ Inconsistent
Outbound HTTP to social platforms	⚠️ Per-job
Failed background-job ticks	⚠️ Logged via `console.error` if the catch block triggers
5xx responses	❌ No global error handler — see Error Handling

When tracking down an incident, expect to grep across multiple log files and reconstruct the timeline manually.

Debugging Recipes

Find recent errors across all jobs

pm2 logs --err --lines 500 --nostream | grep -iE 'error|fail|exception'     # --nostream prints and exits instead of tailing

Trace a single account

The convention (mostly) is to log identifiers like accountId / memberId when something happens. Grep for them:

pm2 logs --lines 5000 --nostream | grep 'accountId: 1234'

Verify a job is running

pm2 status
pm2 describe <job-name>     # shows last restart, uptime, RSS

Check if the DB is reachable

curl http://localhost:5002/db-health

Recommendations (When You're Ready)

This list is not implemented — it's what would close the gaps:

Wire up Winston, which is already a dependency. Replace console.log with leveled calls (logger.info, logger.warn, logger.error).
Add a request logger (morgan or express-winston). At minimum log method, path, status, latency, and a correlation ID.
Install and configure pm2-logrotate as part of host setup so logs don't fill the disk.
Add a /metrics endpoint if you want to plug into Prometheus.
Centralize logs (Loki / CloudWatch / Datadog) so you don't have to SSH into the box.
Add a global Express error handler (see Error Handling → No Global Error Handler) so unhandled exceptions get logged consistently.
Standardize structured fields — at minimum accountId, memberId, jobType, requestId — so logs are queryable.
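The request-logger and correlation-ID items can be sketched as a single Express-compatible middleware using only Node built-ins. This is an illustration under assumptions, not existing code: the `requestLogger` name, the `x-request-id` header, and the field names are all hypothetical, and morgan or express-winston would cover the same ground.

```javascript
const { randomUUID } = require('node:crypto');

// Express-compatible middleware sketch: one JSON line per finished request.
// `log` is injectable so the output can be redirected or tested.
function requestLogger(log = (line) => console.log(line)) {
  return (req, res, next) => {
    const start = process.hrtime.bigint();
    // Reuse an upstream ID if a proxy already set one, otherwise mint a new one.
    const requestId = req.headers['x-request-id'] || randomUUID();
    req.requestId = requestId; // lets route handlers include it in their own logs
    res.setHeader('x-request-id', requestId);

    // 'finish' fires once the response has been handed off to the OS.
    res.on('finish', () => {
      const latencyMs = Number(process.hrtime.bigint() - start) / 1e6;
      log(
        JSON.stringify({
          timestamp: new Date().toISOString(),
          requestId,
          method: req.method,
          path: req.originalUrl || req.url,
          status: res.statusCode,
          latencyMs: Math.round(latencyMs * 100) / 100,
        })
      );
    });
    next();
  };
}
```

It would be registered with `app.use(requestLogger())` before the routes, so that every handler sees `req.requestId` and can attach it to its own log lines.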

Error Handling & Response Patterns — what errors look like in responses (separate from how they're logged)
Deployment & DevOps — PM2 setup, infrastructure
Architecture Overview — middleware stack and where logging would attach