Logging & Observability

someli-api currently logs with console.log everywhere, has no structured logging, and has minimal observability tooling. This document describes what's actually in place, what's available but unused, and what to assume when debugging production.


TL;DR

| Concern | State |
|---|---|
| Structured logging | ❌ None active. Winston is in package.json, but only one file (helper/functionsForAi/cloudRag.js) requires it. |
| Log levels | ❌ Mixed console.log / console.error only. |
| Request logging | ❌ No morgan or equivalent. |
| Healthcheck endpoints | ✅ /health and /db-health exist on the main app. |
| Metrics endpoint | ❌ None. |
| PM2 log rotation | ❌ No error_file / out_file configured per process. |
| Centralized log aggregation | ❌ Not configured at the application layer. |

When debugging production, your primary tool is pm2 logs <process-name>.


Logging in the Codebase

What's used

console.log('something happened', detail);
console.error('failure:', err);

These two calls account for essentially all logging in the codebase. They're scattered through:

  • helper/helper.js (lines 93, 222, 370, 607, ...)
  • methods.js
  • All route files
  • All job_*.js files
  • actions/actions.js (console.error in every CRUD method's error path)

What's installed but unused

package.json lists:

"winston": "^3.17.0"

A grep across the codebase confirms exactly one require('winston') statement — in helper/functionsForAi/cloudRag.js. The package is effectively dead weight everywhere else. If you're starting a new module, do not assume Winston is configured — you'd have to wire it up yourself. (Earlier versions of this doc claimed zero callers; that was off by one.)

Logging conventions

By accident rather than design:

  • console.log for info / debugging
  • console.error for caught errors (especially in actions/actions.js)
  • No timestamps added by the application; pm2 adds them at the line level
  • No request correlation ID, so there's no way to trace a single request across log lines
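
To make the gap concrete, here is a minimal sketch of what leveled, structured lines with a correlation ID could look like using only the Node.js standard library. Nothing here exists in the codebase: formatLine, log, withRequestId, and the field names are hypothetical, and a real implementation would more likely sit on top of Winston.

```javascript
// Hypothetical sketch: JSON log lines with a level and an optional request ID.
const { AsyncLocalStorage } = require('node:async_hooks');
const { randomUUID } = require('node:crypto');

// Carries per-request context across async calls without passing it by hand.
const requestContext = new AsyncLocalStorage();

// Build one JSON line; requestId is included only when a context is active.
function formatLine(level, message, fields = {}) {
  const store = requestContext.getStore();
  return JSON.stringify({
    ts: new Date().toISOString(),
    level,
    message,
    ...(store ? { requestId: store.requestId } : {}),
    ...fields,
  });
}

// Same stdout/stderr split as today, so pm2 keeps routing to -out / -error files.
const log = {
  info: (message, fields) => console.log(formatLine('info', message, fields)),
  error: (message, fields) => console.error(formatLine('error', message, fields)),
};

// Run fn with a request ID attached to every log line emitted inside it.
function withRequestId(fn, requestId = randomUUID()) {
  return requestContext.run({ requestId }, fn);
}

// Usage: every line inside the callback shares one requestId.
withRequestId(() => {
  log.info('post scheduled', { accountId: 1234 });
});
```

Because each line is a single JSON object, identifiers such as accountId can be matched exactly instead of relying on free-form message text.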


PM2 — The De-facto Log Layer

ecosystem.config.js registers ~40 PM2 apps. None of them set explicit log paths:

// What you'll find:
{ name: 'job_auto_post', script: './job_auto_post.js', autorestart: true }

// What you will NOT find:
{ name: 'job_auto_post', error_file: '/var/log/.../err.log', out_file: '/var/log/...' }

PM2 falls back to its default location, typically ~/.pm2/logs/<app-name>-out.log and <app-name>-error.log. In production, find these on the host running PM2.

Useful pm2 commands

| Command | What it does |
|---|---|
| pm2 logs | Tail all apps' logs |
| pm2 logs <name> | Tail one app |
| pm2 logs <name> --lines 1000 | Show recent history |
| pm2 logs <name> --err | Only stderr |
| pm2 flush | Clear all logs |
| pm2 reloadLogs | Re-open log files (after rotation) |

Log rotation

ecosystem.config.js does not configure pm2-logrotate (it is a PM2 module installed per host, not an app entry). Production hosts likely have it installed already, but verify on your specific environment and install it if it is missing:

pm2 install pm2-logrotate     # if not yet installed
pm2 set pm2-logrotate:max_size 100M
pm2 set pm2-logrotate:retain 30

Healthcheck Endpoints

Two healthchecks exist on the main app, mounted at the root in server.js:

GET /health

{
  "status": "Server is running",
  "port": 5002,
  "environment": "production",
  "timestamp": "2026-05-08T12:34:56.789Z"
}

Liveness check. Returns immediately without touching the database. Note the status value is the literal string "Server is running" — earlier versions of this doc claimed "ok", which is wrong. If a load balancer is configured to compare against "ok" exactly, it will fail.

GET /db-health

Tests the MySQL connection by running a simple query. Returns:

{ "status": "ok", "db": "connected" }

or an error envelope if the query fails.

Use this for readiness probes — it confirms both the process is alive and the database is reachable. Slower than /health; don't use it for high-frequency liveness polling.
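
For orientation, the two handlers described above could look roughly like this in Express style. This is a sketch, not the code in server.js: the handler names, the injected pingDb function, the 503 status, and the shape of the error body are all assumptions.

```javascript
// Hypothetical liveness handler: no I/O, returns immediately.
function healthHandler(req, res) {
  res.status(200).json({
    status: 'Server is running', // literal string, not "ok"
    port: Number(process.env.PORT) || 5002,
    environment: process.env.NODE_ENV || 'development',
    timestamp: new Date().toISOString(),
  });
}

// Hypothetical readiness handler factory.
// pingDb is an assumed async function that runs a trivial query (e.g. SELECT 1).
function makeDbHealthHandler(pingDb) {
  return async function dbHealthHandler(req, res) {
    try {
      await pingDb();
      res.status(200).json({ status: 'ok', db: 'connected' });
    } catch (err) {
      console.error('db-health failed:', err);
      // Assumed error envelope; the real one may differ.
      res.status(503).json({ status: 'error', db: 'unreachable' });
    }
  };
}
```

Returning a non-2xx status on database failure lets a probe rely on the status code alone instead of parsing the body.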

What's missing

  • /metrics (Prometheus / OpenMetrics)
  • /ping (zero-cost liveness; /health is fine but heavier than needed)
  • Per-job health endpoints (background workers don't expose HTTP, so their health is "is the PM2 process up?")
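
As an illustration of what a /metrics endpoint would have to return, the sketch below renders a counter in the Prometheus text exposition format using only the standard library. The Counter class and the metric name are hypothetical; in practice a maintained client library such as prom-client would be the more likely choice.

```javascript
// Escape a label value per the Prometheus text format (backslash, quote, newline).
function escapeLabelValue(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
}

// Hypothetical in-memory counter keyed by its label set.
class Counter {
  constructor(name, help) {
    this.name = name;
    this.help = help;
    this.values = new Map(); // serialized labels -> running total
  }

  inc(labels = {}, by = 1) {
    // Sort label names so the same label set always maps to the same key.
    const key = Object.keys(labels)
      .sort()
      .map((k) => `${k}="${escapeLabelValue(labels[k])}"`)
      .join(',');
    this.values.set(key, (this.values.get(key) || 0) + by);
  }

  render() {
    const lines = [`# HELP ${this.name} ${this.help}`, `# TYPE ${this.name} counter`];
    for (const [key, value] of this.values) {
      lines.push(key ? `${this.name}{${key}} ${value}` : `${this.name} ${value}`);
    }
    return lines.join('\n') + '\n';
  }
}

// Usage: count two requests, then print what /metrics would serve.
const httpRequests = new Counter('http_requests_total', 'Total HTTP requests');
httpRequests.inc({ method: 'GET', status: 200 });
httpRequests.inc({ method: 'GET', status: 200 });
console.log(httpRequests.render());
```

A handler for /metrics would send this string with the content type text/plain; version=0.0.4.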

What's NOT Logged

These important events do not produce dedicated log lines:

| Event | Visibility |
|---|---|
| Inbound HTTP requests | ❌ No request logger |
| Slow queries | ❌ Not captured |
| AI model calls (Gemini, Bedrock, Vertex) | ⚠️ Sometimes logged, sometimes not |
| S3 uploads | ⚠️ Inconsistent |
| Outbound HTTP to social platforms | ⚠️ Per-job |
| Failed background-job ticks | ⚠️ Logged via console.error if the catch block triggers |
| 5xx responses | ❌ No global error handler (see Error Handling) |

When tracking down an incident, expect to grep across multiple log files and reconstruct the timeline manually.


Debugging Recipes

Find recent errors across all jobs

pm2 logs --err --lines 500 --nostream | grep -iE 'error|fail|exception'

Trace a single account

The convention (mostly) is to log identifiers like accountId / memberId when something happens. Grep for them:

pm2 logs --lines 5000 --nostream | grep 'accountId: 1234'

Verify a job is running

pm2 status
pm2 describe <job-name>     # shows last restart, uptime, RSS

Check if the DB is reachable

curl http://localhost:5002/db-health

Recommendations (When You're Ready)

This list is not implemented — it's what would close the gaps:

  1. Wire up Winston, which is already a dependency. Replace console.log with leveled calls (logger.info, logger.warn, logger.error).
  2. Add a request logger (morgan or Winston-Express). At minimum log method, path, status, latency, and a correlation ID.
  3. Add pm2-logrotate to the host setup that accompanies ecosystem.config.js (it is installed with pm2 install) so logs don't fill the disk.
  4. Add a /metrics endpoint if you want to plug into Prometheus.
  5. Centralize logs (Loki / CloudWatch / Datadog) so you don't have to SSH into the box.
  6. Add a global Express error handler (see Error Handling → No Global Error Handler) so unhandled exceptions get logged consistently.
  7. Standardize structured fields — at minimum accountId, memberId, jobType, requestId — so logs are queryable.
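
Items 2 and 6 can be sketched together as Express-style middleware using only the standard library. None of this exists in the codebase: requestLogger, errorHandler, the x-request-id header, and the response body shape are assumptions, and morgan or a Winston transport would be reasonable replacements for the hand-rolled JSON.

```javascript
const { randomUUID } = require('node:crypto');

// Hypothetical request logger: one JSON line per finished response.
// `write` is injectable so the sink can be swapped (console, Winston, a test buffer).
function requestLogger(write = (line) => console.log(line)) {
  return function logRequest(req, res, next) {
    const start = process.hrtime.bigint();
    // Reuse an upstream ID if a proxy sent one (assumed header), else generate one.
    req.requestId = (req.headers && req.headers['x-request-id']) || randomUUID();
    res.on('finish', () => {
      const latencyMs = Number(process.hrtime.bigint() - start) / 1e6;
      write(JSON.stringify({
        level: 'info',
        message: 'request',
        requestId: req.requestId,
        method: req.method,
        path: req.originalUrl || req.url,
        status: res.statusCode,
        latencyMs: Math.round(latencyMs),
      }));
    });
    next();
  };
}

// Hypothetical global error handler. Express identifies error-handling
// middleware by its four-argument signature, so `next` must stay in the list.
function errorHandler(err, req, res, next) {
  console.error(JSON.stringify({
    level: 'error',
    message: err.message,
    requestId: req.requestId,
    stack: err.stack,
  }));
  if (res.headersSent) return next(err);
  res.status(500).json({ error: 'Internal Server Error', requestId: req.requestId });
}
```

With an Express app, requestLogger() would be registered via app.use before the routes and errorHandler via app.use after them, so every request gets an ID and every unhandled error is logged with it.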