Enterprise Readiness Assessment
1. Executive summary
Verdict: A specialised content-factory backend with substantial functionality (269 endpoints + 57 jobs + 6 AI bots) and substantial accumulated risk. The codebase reflects an opportunistic, copy-paste evolution: every industry vertical has its own near-clone job; helpers are thin; auth is convention-only; secrets are hardcoded; observability is console.log. The Dockerfile is the most polished single artefact in the repo. Reducing the 30+ industry-clone jobs and adding a shared auth middleware are the two highest-leverage refactors.
Top 3 strengths:
1. Multi-stage Dockerfile is well-crafted (slims node_modules, includes Puppeteer's runtime libs)
2. Functional scope is clear and bounded (content factory) — easy to reason about what the service does
3. Dependencies are mostly modern; newer versions than Someli-admin-api in many cases
Top 5 risks:
1. Hardcoded credentials in source (Slack token, Unsplash key, probably Polotno license, probably Google service account)
2. No shared auth middleware — every endpoint is "protected" only by hope
3. 30+ industry-clone jobs that each duplicate the same boilerplate; bug fixes need 30+ touches
4. forEach(async) everywhere in jobs and bots → unhandled rejections, double-runs, race conditions
5. .env baked into Docker image → anyone with image access reads secrets
Top 5 recommendations:
1. This week: rotate Slack + Unsplash + Polotno + Google credentials; move all to env
2. Phase 0: add middlewares/auth.js + apply via router.use
3. Phase 0: remove .env from Dockerfile; pass via runtime mount
4. Phase 0: replace forEach(async) with for...of await across all 60+ workers
5. Phase 1: collapse the 30+ industry-clone jobs into one parameterised job
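
Recommendation 4 rests on a specific behaviour of `Array.prototype.forEach`: it ignores the promise returned by an async callback, so the loop returns before any work has finished and rejections never reach the surrounding `try/catch`. A minimal sketch of the difference follows; `processItem` is a hypothetical stand-in for a job's per-row work, not a function from the repo.

```javascript
// Hypothetical stand-in for a job's per-row work (e.g. render + upload).
const processItem = async (item, log) => {
  await new Promise((resolve) => setTimeout(resolve, 5));
  log.push(`done:${item}`);
};

// Anti-pattern: forEach discards the promises returned by the callback,
// so this function resolves before any item has been processed.
async function runWithForEach(items, log) {
  items.forEach(async (item) => {
    await processItem(item, log);
  });
  log.push('finished');
}

// Fix: for...of awaits each item in order, and errors propagate to the caller.
async function runSequentially(items, log) {
  for (const item of items) {
    await processItem(item, log);
  }
  log.push('finished');
}
```

In the first version `'finished'` is logged before any `done:` entry; in the second it is always logged last. Where items are independent, `await Promise.all(items.map(...))` is an alternative, at the cost of running everything concurrently.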
2. Methodology & scope
Audited via find, cat, diff, grep, wc on the cloned someli-gh/designer-api/ directory. Compared against someli-api/ and Someli-admin-api/ for lineage. No tests run (none exist).
3. Maturity assessment (CMMI 1–5)
| Pillar | Current | 12-mo target | Why |
|---|---|---|---|
| Architecture & Modularity | 1 | 3 | One 13608-line routes file; 30+ industry clones; no shared helpers; thin abstractions |
| API Contract & Versioning | 1 | 2 | No OpenAPI; inconsistent path naming; ad-hoc auth |
| Data Architecture | 2 | 3 | Shared MySQL with siblings; mostly parameterised queries; some string interpolation risk |
| Background Processing | 2 | 4 | 57 jobs + 6 bots run; no central scheduling, no idempotency, no retry, no observability |
| Security & Compliance | 1 | 3 | Multiple hardcoded credentials; no shared auth; .env in Docker image; CORS wildcard |
| Observability | 1 | 3 | console.log only; one Slack notifier as the entire alerting strategy |
| Reliability & Resilience | 1 | 3 | sync-mysql blocking; forEach(async) bugs; no Polotno timeout; no retry |
| Scalability | 1 | 3 | Each job = its own Chrome instance; 60+ MySQL connections; no pooling |
| Testing & Quality Gates | 1 | 2 | Zero tests; no CI; no lint |
| CI/CD & Deployment | 2 | 3 | Good Dockerfile, but no CI workflow committed; manual push.sh |
| Infrastructure as Code | 2 | 3 | Dockerfile + nginx.conf in repo (good); no PM2 ecosystem, no Terraform/Pulumi |
| Cost Visibility & FinOps | 1 | 2 | No OpenAI / Polotno / S3 usage tracking |
| Documentation & Knowledge Management | 2 | 3 | This audit covers the gap; no in-repo runbook |
| Team Practices & Governance | 1 | 2 | No PR template, no code-owners, no branch model documented in-repo |
4. Findings (consolidated)
| ID | Severity | Description | Source | Phase |
|---|---|---|---|---|
| F-1 | HIGH | Hardcoded Slack bot token | security.md F-1 | 0a |
| F-2 | HIGH | Hardcoded Unsplash access key | security.md F-2 | 0a |
| F-6 | HIGH (if true) | `conf/credentials.json` likely committed | security.md F-6 | 0a verify |
| F-9 | MEDIUM | Polotno license possibly hardcoded | security.md F-9 | 0a verify |
| F-D1 | HIGH | `.env` baked into Docker image | build-and-deploy.md | 0 |
| F-3 | HIGH | No shared auth middleware | security.md F-3, authentication.md | 0 |
| F-4 | MEDIUM | CORS wildcard + malformed Allow-Credentials | security.md F-4 | 0 |
| F-7 | MEDIUM | SQL string-interpolation risk | security.md F-7 | 0 |
| F-8 | MEDIUM | `forEach(async)` bugs across all jobs/bots | security.md F-8, error-handling.md E-1 | 0 |
| F-10 | MEDIUM | No rate limiting on `/webauthenticate` | security.md F-10 | 0 |
| J-1 | n/a | 30+ industry-clone jobs | jobs-inventory.md | 1 |
| J-2 | n/a | No PM2 ecosystem.config.js | jobs-inventory.md | 0 |
| J-3 | n/a | No distributed lock for jobs | jobs-inventory.md | 1 |
| B-1 | n/a | Bots both library and cron | bots-inventory.md | 1 |
| B-3 | n/a | OpenAI rate-limit dogpile risk | bots-inventory.md | 0a |
| M-1 | n/a | No Polotno instance pooling (60+ Chromes) | media-processing.md | 1 |
| M-3 | n/a | No file-upload size limits | media-processing.md | 0 |
| M-5 | n/a | No Polotno render timeout / retry | media-processing.md | 0 |
| N-1..6 | n/a | Slack notifier issues | notifications.md | 0a/0 |
| D-2 | n/a | No CI workflow in repo | build-and-deploy.md | 0 |
| E-1..7 | n/a | Error handling gaps | error-handling.md | 0/1 |
5. Strategic decisions
- `designer-api` is the platform's content factory. It's not a candidate for retirement: the bots and jobs are the heart of how Someli's template library stays fresh.
- The 30+ industry-clone jobs are the largest single refactor opportunity. Each `job_<industry>.js` follows the same SELECT/iterate/generate pattern, parametrised by industry id. Replace with one `job_industry_generator.js` driven by a `tJobConfig(industry_id, cron_schedule, active)` table. This deletes ~6000 lines of duplicated code in one PR.
- Auth needs a baseline. Adding `middlewares/auth.js` is a 1-day task that closes the largest security gap. Until then, assume every endpoint is publicly callable in your threat model.
- The split of work between `designer-api` and `someli-api` is right. Content factory in one repo, runtime social pipeline in another. Don't fold them back together.
- Cost visibility matters here more than in other backends. This repo is the platform's biggest OpenAI + Polotno spender. Adding tracking would surface optimisation opportunities (for example, bots that run hourly but rarely have new work).
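
The parameterised job described in the second decision could be shaped roughly as follows. This is a sketch under assumptions: the column names come from the `tJobConfig` definition in the text, while `runIndustryJobs`, the injected `generate` routine, and the example rows are hypothetical. Scheduling from `cron_schedule` is deliberately left out.

```javascript
// Rows as they might come back from the proposed tJobConfig table.
// Column names follow the text; the values are made up for illustration.
const exampleConfigs = [
  { industry_id: 12, cron_schedule: '0 * * * *', active: 1 },
  { industry_id: 27, cron_schedule: '30 2 * * *', active: 0 },
];

// One generator replaces the per-industry clones. `generate` is the shared
// SELECT/iterate/generate routine, injected so it can be tested in isolation.
async function runIndustryJobs(configs, generate) {
  const results = [];
  for (const cfg of configs) {
    if (!cfg.active) continue; // disabled rows are skipped, not deleted
    try {
      const count = await generate(cfg.industry_id);
      results.push({ industry_id: cfg.industry_id, ok: true, count });
    } catch (err) {
      // One failing industry should not stop the rest of the run.
      results.push({ industry_id: cfg.industry_id, ok: false, error: err.message });
    }
  }
  return results;
}
```

With this shape, adding or pausing an industry becomes a row change in `tJobConfig` rather than a new file, and a bug fix in the shared routine applies to every industry at once.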
6. Roadmap
Phase 0a — This week
- Rotate Slack token (F-1)
- Rotate Unsplash key (F-2)
- Verify and rotate `conf/credentials.json` if committed (F-6)
- Verify and rotate Polotno license if hardcoded (F-9)
- Stagger bot cron schedules (B-3)
- Add `/health` endpoint (D-4)
Phase 0 — Stabilise (months 0-3)
- Remove `.env COPY` from Dockerfile; pass via runtime mount (F-D1)
- Add `middlewares/auth.js`; apply per-router (F-3)
- Fix CORS configuration (F-4)
- Add `express-rate-limit` to `/webauthenticate` (F-10)
- Replace `forEach(async)` with `for...of await` across all 60+ workers (F-8, E-1)
- Audit and fix SQL-injection risks (F-7)
- Add Polotno render timeout (M-5)
- Tighten file-upload size limits (M-3)
- Commit `ecosystem.config.js` (J-2)
- Commit CI workflow (`.github/workflows/dev-designer-api-deploy.yml`) (D-2)
- Add structured logging + Sentry + request-id
- Add `tJobRun` audit table (observability.md)
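
For F-3, a baseline `middlewares/auth.js` might look like the sketch below. It assumes a bearer token in the `Authorization` header and a `verifyToken` function that resolves to a user object or `null`; both are assumptions, since the repo's actual token scheme is not described here. The middleware is written as a plain `(req, res, next)` function so it has no dependency beyond Express's calling convention.

```javascript
// middlewares/auth.js (sketch). `verifyToken` is a hypothetical function that
// resolves to a user object for a valid token and to null otherwise.
function createAuthMiddleware(verifyToken) {
  return async function auth(req, res, next) {
    const header = req.headers.authorization || '';
    const [scheme, token] = header.split(' ');
    if (scheme !== 'Bearer' || !token) {
      return res.status(401).json({ error: 'missing bearer token' });
    }
    try {
      const user = await verifyToken(token);
      if (!user) {
        return res.status(401).json({ error: 'invalid token' });
      }
      req.user = user; // downstream handlers can rely on req.user
      return next();
    } catch (err) {
      return next(err); // hand unexpected failures to the Express error handler
    }
  };
}
```

Applied once per router with `router.use(createAuthMiddleware(verifyToken))` ahead of the route definitions, new endpoints are protected by default and public ones have to be mounted explicitly before the middleware.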
Phase 1 — Foundation (months 3-9)
- Collapse 30+ industry-clone jobs into one parameterised job (J-1)
- Split bots into `*_lib.js` (no cron) + `*_bot.js` (cron-only); add `IS_WORKER` env flag (B-1)
- Add distributed job lock (J-3)
- Pool Polotno instances (M-1)
- Add OpenAI / Polotno / S3 cost tracking (FinOps pillar)
- Migrate `sync-mysql` to `mysql2/promise` in jobs and bots (Scalability pillar)
- Centralise prompts (B-6) and S3/Polotno init (M-6/M-7)
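
One low-cost option for the distributed job lock (J-3), given that the service already shares a MySQL instance, is MySQL's named locks (`GET_LOCK` / `RELEASE_LOCK`). The sketch below assumes `conn` is a single `mysql2/promise` connection, whose `query` resolves to `[rows, fields]`; `withJobLock` and the lock names are hypothetical. Other mechanisms (for example a lock row in a table, or Redis) would work equally well.

```javascript
// Runs `job` only if this process wins a MySQL named lock.
// `conn` is assumed to be one mysql2/promise connection: GET_LOCK is
// connection-scoped, so acquire and release must use the same connection.
async function withJobLock(conn, lockName, job) {
  // Timeout 0 means "do not wait": either we get the lock now or we skip.
  const [rows] = await conn.query('SELECT GET_LOCK(?, 0) AS acquired', [lockName]);
  if (!rows[0] || Number(rows[0].acquired) !== 1) {
    return { ran: false }; // another worker holds the lock; skip this run
  }
  try {
    return { ran: true, result: await job() };
  } finally {
    // Always release, even when the job throws.
    await conn.query('SELECT RELEASE_LOCK(?)', [lockName]);
  }
}
```

Because the lock belongs to the connection, it must not be taken through a pool's shorthand `query`, which may use a different connection for the release.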
Phase 2 — Modular refactor (months 9-18)
- Split `routes/routes.js` by domain
- Migrate `aws-sdk` v2 → v3
- Adopt shared `tokenGenerator.js` for auth (consistent with `someli-api` / `Someli-admin-api`)
- Add tests (job idempotency + content-pipeline integration tests)
Phase 3 — Selective extraction (months 18+)
- Possible split: content-validation jobs into a separate service; AI-generation bots into another; routes-only API as a third
7. Risk register
| ID | Risk | Likelihood | Impact | Phase |
|---|---|---|---|---|
| R-1 | Hardcoded Slack token used by attacker for phishing | Medium | Medium | 0a |
| R-2 | Polotno license abuse (if hardcoded and leaks) | Medium | Medium | 0a |
| R-3 | New endpoint added without auth → public exposure of admin data | High | High | 0 |
| R-4 | `forEach(async)` race condition publishes wrong content | Medium | High | 0 |
| R-5 | 60+ Chrome processes OOM the deploy box during a content burst | Medium | Medium | 1 |
| R-6 | OpenAI rate limit causes hourly job dogpile failures | Medium | Low | 0a |
| R-7 | SQL injection via unparameterised filter on some endpoint | Low | High | 0 |
| R-8 | One industry-clone bug fix missed in N of 30+ files | High | Medium | 1 |
| R-9 | Polotno render hangs forever on bad template; queue backs up | Medium | Medium | 0 |
| R-10 | `.env` baked in Docker image leaks all platform secrets | Low (but high-impact) | High | 0 |
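
R-9 (a render that never returns) can be bounded with a generic timeout wrapper around whatever promise the render call returns. The sketch below uses `Promise.race`; `withTimeout` is a hypothetical helper, and no Polotno API is assumed.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
// Note: this only stops the caller from waiting. It does not cancel the
// underlying work, so a hung render process still has to be cleaned up.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A job would wrap its render call, for example `await withTimeout(renderPromise, 60000, 'render')`, and treat the rejection as a failed item so that the queue keeps moving.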
8. Standards compliance
- OWASP A02, A04, A05, A07: see security.md
- 12-factor (Config): `.env` baked into image violates "store config in the environment"
- 12-factor (Logs): stdout via `console.log` is OK; lack of structure is the gap
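
The logging gap can be closed without leaving stdout: emitting one JSON object per line keeps the 12-factor model while making entries parseable. A minimal sketch follows; `formatLogLine`, `logInfo`, and the field names are hypothetical, and a library such as pino would be a reasonable replacement in practice.

```javascript
// Builds one JSON log line. Extra fields (job name, request id, ...) are
// merged in so they can be filtered on by a log aggregator.
function formatLogLine(level, message, fields = {}) {
  return JSON.stringify({
    ts: new Date().toISOString(),
    level,
    message,
    ...fields,
  });
}

// Still writes to stdout, as console.log does, but in a structured form.
function logInfo(message, fields) {
  process.stdout.write(formatLogLine('info', message, fields) + '\n');
}
```

For example, `logInfo('job finished', { job: 'industry_generator', rows: 42 })` writes a single line that can be searched by `job` or `level`.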
9. Open questions
See verify-markers.md.