Engineering Practices¶

This is the operational catalog of practices that the Engineering Charter commits to. Each practice has:

What — the rule
Why — the rationale (so engineers can apply judgement to edge cases)
Enforcement — how the rule is enforced (automated, process, cultural)
Escape valve — when and how the rule can be exceptioned

For the per-PR checklist that translates these practices into a single forcing function, see Definition of Done.

Adoption Waves¶

Practices are not adopted all at once — that approach has consistently failed in industry. They are adopted in waves, each closing before the next opens.

Wave	Timing	Focus	Closes when
Wave 1	Week 1	Pre-commit hooks, PR template, branch protection	Hooks installed; one week of zero direct-pushes to main
Wave 2	Weeks 2–4	CI gates: lint, typecheck (TS files), no-secrets, format	One week of zero CI bypasses on these gates
Wave 3	Months 1–2	Structured logging, validation on new endpoints, migrations framework, module-boundary linter (advisory)	One sprint of new code 100% compliant
Wave 4	Months 2–4	Patch coverage gate, OpenAPI required for new endpoints, module-boundary linter blocking, dep justification, ADR process	New code passes all gates routinely
Wave 5	Months 4+	SLO budgets, postmortem process, maturity rating refresh, dep cadence, license review	Steady-state operation

Phase 2 of the enterprise-readiness roadmap (modular monolith refactor) does not start until Wave 3 is in force. Otherwise the refactor stream chases entropy that the feature stream is producing in real time.

The current wave is tracked in verify-markers.md.

A. Code-Level Standards¶

A.1 New files use TypeScript¶

Rule: every new source file in src/ is .ts (or .tsx for components if/when frontend lives here). Existing .js files are grandfathered.

Why: the type system catches the most frequent class of bugs in this codebase (typos, missing fields, wrong shapes) at compile time. TypeScript also makes the codebase legible to new engineers — types serve as inline documentation.

Enforcement: ESLint rule disallowing new .js files in src/. CI fails if a .js file is added under src/.

Escape valve: none. New .js files are not allowed.

A.2 `strict: true` in tsconfig; no implicit `any`¶

Rule: TypeScript's strict mode is enabled. any requires an explicit annotation and a // @ts-expect-error: <reason> if used to silence an error.

Why: the value of TypeScript drops sharply with any. Strict mode forces the discipline.

Enforcement: tsconfig.json strict flag. ESLint @typescript-eslint/no-explicit-any set to error.

Escape valve: explicit any is permitted with a one-line comment explaining why; the comment is reviewed.

A.3 ESLint with team ruleset, no warnings¶

Rule: all source files pass ESLint. Warnings count as errors in CI.

Why: "warnings we'll fix later" become "warnings nobody fixes." Either a rule matters and we fail on it, or it doesn't and we drop it.

Enforcement: eslint --max-warnings 0 in CI.

Escape valve: disable a specific rule with an inline comment + reason: // eslint-disable-next-line <rule> -- <reason>.

A.4 Prettier-formatted; no formatting debates¶

Rule: all files pass prettier --check.

Why: time spent debating formatting is time not spent building. Pick one config; never look back.

Enforcement: pre-commit hook runs prettier --write. CI runs prettier --check.

Escape valve: none. Formatting is not a debate.

A.5 No `console.log` in committed code¶

Rule: use the structured logger from platform/logger.ts. console.log, console.error, console.warn are forbidden in source.

Why: unstructured console.log produces unparseable output. With ~108 workers logging, this is the difference between "grep can find your incident" and "your incident is lost."

Enforcement: ESLint no-console rule set to error.

Escape valve: during local development, comment out the rule for a single line. The rule must be re-enabled before merging.

A.6 File and function size limits¶

Rule: - Files SHOULD be under 400 lines (advisory) and MUST be under 1,000 lines (hard). - Functions SHOULD be under 50 lines and SHOULD have at most 5 parameters.

Why: the 21,000-line routes.js is the result of never having this rule. Files past these thresholds cannot be reviewed effectively or held in working memory.

Enforcement: ESLint max-lines and max-lines-per-function rules. The hard limit fails CI; the advisory limit produces a warning that reviewers can choose to enforce.

Escape valve: the 1,000-line hard limit may be exceptioned for generated files (OpenAPI types, etc.) with a // @generated marker.

A.7 No `// @ts-ignore` without rationale¶

Rule: use // @ts-expect-error: <reason> if you must suppress a type error. The plain // @ts-ignore is forbidden.

Why: @ts-expect-error complains if the underlying error is fixed later, which prevents stale suppressions from accumulating. @ts-ignore does not.

Enforcement: ESLint @typescript-eslint/ban-ts-comment.

Escape valve: none.

A.8 Module-boundary import discipline¶

Rule: code in src/modules/A/ may only import from src/modules/B/index.ts, never from src/modules/B/internal/*.

Why: the public/internal distinction is the entire value of module boundaries. Once one engineer reaches into another module's internals, the boundary is gone.

Enforcement: eslint-plugin-boundaries (Wave 3 advisory; Wave 4 blocking).

Escape valve: if a function genuinely belongs in another module's public API, move it — don't bypass the boundary.

A.9 No new direct DB connections¶

Rule: all DB access goes through platform/db.ts. New mysql.createConnection, mysql2.createPool, or new mysql({...}) calls are forbidden in any new file.

Why: the 892 separate connection-construction sites are the operational nightmare flagged in the readiness audit. Centralizing fixes connection pooling, secrets handling, observability, and migration paths in one stroke.

Enforcement: ESLint no-restricted-imports blocking direct imports of mysql, mysql2/promise, and sync-mysql. The platform/db module is the sole sanctioned consumer.

Escape valve: changes to platform/db.ts itself are obviously exempt.

A.10 No new `sync-mysql` callers¶

Rule: new files must not import sync-mysql. Existing 127 callers are tagged for migration during Phase 1.

Why: synchronous MySQL blocks the event loop. It's an architectural antipattern. Every new caller compounds the migration cost.

Enforcement: ESLint no-restricted-imports for sync-mysql.

Escape valve: none. New sync-mysql usage is rejected.

B. Pull-Request Requirements¶

B.1 At least one review¶

Rule: PRs require ≥1 review approval before merging. Two reviews for security, billing, and authentication paths.

Why: code review is the single highest-leverage quality practice on a small team. It also distributes knowledge — every PR reviewed is one engineer who has seen the code.

Enforcement: GitHub branch protection on dev_api, uat_api, main. CODEOWNERS file routes sensitive-path reviewers automatically.

Escape valve: hotfix-emergency PRs (production down) may be merged with one review post-hoc, with an incident report following.

B.2 No self-approval¶

Rule: the PR author cannot approve their own PR.

Why: self-approval defeats the purpose of review. It happens with surprising frequency on under-pressure teams.

Enforcement: GitHub branch protection setting.

B.3 PRs target the right base¶

Rule: - Feature PRs target dev_api - uat_api ← dev_api (promotion only — no direct feature work) - main ← uat_api (promotion only)

Why: the branch strategy fails if engineers PR feature work directly into main.

Enforcement: branch protection rules.

B.4 PR description follows the template¶

Rule: the PR template's checkboxes (the Definition of Done) are filled out, not skipped.

Why: the checklist is a forcing function. Skipping it is the silent non-compliance failure mode.

Enforcement: repository PR template. Reviewers reject PRs that ignore the checklist.

B.5 Files > 500 lines added in a single PR triggers extra review¶

Rule: PRs that add a file > 500 lines or modify a file by > 500 lines require a second reviewer.

Why: large PRs are reviewed less effectively. A second pair of eyes is a cheap mitigation.

Enforcement: convention; enforced by reviewers and CODEOWNERS.

Escape valve: generated files; fixture files; documentation. Mark with // @generated or place in a designated path.

B.6 PR title convention¶

Rule: PR titles follow conventional-commits style: feat:, fix:, refactor:, docs:, chore:, test:, perf:, style:, build:, ci:.

Why: lets us auto-generate changelogs and release notes, and signals intent.

Enforcement: CI title check (e.g., commitlint or a GitHub Action).

C. Test Requirements¶

C.1 New endpoints have integration tests¶

Rule: any new HTTP endpoint ships with at least one integration test (request → response, hits the DB).

Why: a unit test of the handler is a fiction; the real behavior emerges from middleware + handler + DB. Integration tests catch the breakage that unit tests don't.

Enforcement: patch-coverage gate on changed-line coverage. Reviewers verify test exists.

Escape valve: internal-only / cron-only changes that don't expose an endpoint are exempt.

C.2 New helper functions have unit tests¶

Rule: new functions in helper/, platform/, or modules/*/internal/ have at least one unit test that exercises a non-trivial path.

Why: these are the building blocks. A bug here propagates through every consumer.

Enforcement: patch-coverage gate.

C.3 Bug fixes include a regression test¶

Rule: a bug fix PR includes a test that fails before the fix and passes after.

Why: without the regression test, the fix is one rebase away from being undone.

Enforcement: reviewer checklist. Cannot be fully automated, but reviewers must explicitly confirm.

Escape valve: truly trivial fixes (e.g., typo in a string literal) where a test would be artificial. Reviewer judgement.

C.4 Coverage targets on critical paths¶

Rule: auth, billing, and publishing modules require ≥80% line coverage on changed lines (patch coverage, not whole repo).

Why: these are the code paths where untested code is a customer-impact event waiting to happen. They are also small enough to make the goal tractable.

Enforcement: Codecov or Coveralls patch-coverage gate, scoped to these modules.

Escape valve: none on these paths.

C.5 Flaky tests are quarantined fast¶

Rule: a test identified as flaky (passing then failing without code changes) is moved to a __flaky__ quarantine directory within 24 hours of identification, with a tracking issue.

Why: a single accepted-flaky test trains the team to ignore CI. That habit destroys CI's signal value within weeks.

Enforcement: flaky-test detection in CI (e.g., GitHub Actions plugin); the on-call engineer triages.

Escape valve: the test stays quarantined until fixed. There is no "merge despite the flake."

C.6 No tests skipped without a referenced issue¶

Rule: it.skip(), xit(), etc. are forbidden without a comment containing an issue number.

Why: skipped tests rot. Without a tracking issue they are invisible.

Enforcement: ESLint custom rule scanning test files.

D. Architectural Boundaries¶

D.1 New code lives in `src/modules/`¶

Rule: new HTTP handlers, services, and helpers go in src/modules/<area>/. Adding to routes/routes.js or routes/auth.js is forbidden.

Why: these two files are the architectural failure named in the readiness doc. Every new line added to them is a step away from the modular monolith.

Enforcement: lint rule + reviewer enforcement + grep gate in CI (lines added to those files trigger explicit reviewer attention).

Escape valve: trivial bug fixes inside an existing handler in those files (no logic change). Anything more substantive must be extracted first.

D.2 The boy-scout rule¶

Rule: when modifying a handler in a legacy file, extract it into the appropriate module first. Add at least one test for it as you extract.

Why: this is what makes the feature stream contribute to the cleanup rather than fight it. Over time, every legacy line that gets touched gets upgraded; lines nobody touches were stable anyway.

Enforcement: reviewer enforcement; PR template asks "if touching a legacy file, what was upgraded?"

Escape valve: if extraction would make the PR much larger than its purpose warrants, document the reason and create a follow-up issue. The follow-up issue is then prioritized in the next sprint.

D.3 One module owns one table (logically)¶

Rule: every database table has exactly one module that owns it (writes to it, defines its schema). Other modules read from it via the owning module's public API.

Why: without ownership, tables drift. A table that 8 modules write to ends up with 8 different invariants, and none of them hold.

Enforcement: ownership documented in data-model.md §6 catalog. Reviewer enforcement.

Escape valve: read-only joins across module boundaries are pragmatic and allowed during the migration period (single DB). After Phase 3 (if any module is extracted), this becomes a hard rule.

D.4 No new event buses, message queues, or anti-corruption layers between modules¶

Rule: cross-module communication is direct synchronous function calls (await content.generateDraft(...)). Adding a queue, event bus, or "anti-corruption layer" between two modules in the same process is forbidden.

Why: these patterns add complexity that doesn't pay off in a single-process application. They are appropriate when modules genuinely become services. See enterprise-readiness §5.2.

Enforcement: reviewer enforcement; ADR required for any deviation.

Escape valve: the job queue (BullMQ / Redis) is a special case — it's between the HTTP API and the worker fleet, not between modules of the API. Cross-process events are fine.

E. Security Defaults¶

E.1 No secrets in source¶

Rule: API keys, tokens, passwords, and credentials are loaded from environment variables (or a secrets manager). They never appear as string literals in source files.

Why: every committed secret has been compromised. Rotating it is the hard part; preventing it is the easy part.

Enforcement: - Pre-commit gitleaks hook - CI gitleaks job - Periodic full-history scan - Reviewers reject PRs containing patterns matching credential shapes

Escape valve: none.

E.2 Input validation at the boundary¶

Rule: every HTTP endpoint that accepts request bodies, query params, or path params runs a schema validator (Zod / Yup / express-validator) at handler entry.

Why: input validation is the cheapest first line of defense against injection, malformed-input crashes, and ill-defined contracts.

Enforcement: lint rule that checks for a schema-validation call in handler files. (Wave 4.) Reviewer enforcement until then.

Escape valve: internal-only endpoints (called only by other workers, never by external clients) may skip validation, but must declare so explicitly.

E.3 Auth-required endpoints declare their required role¶

Rule: every authenticated endpoint declares the required role(s) at the route level: router.get('/admin/x', requireRole(['admin']), handler).

Why: scattered auth checks inside handlers are the failure mode that produces missing-auth bugs. Declaring it at the route level makes the requirement greppable and reviewable.

Enforcement: convention; reviewer enforcement.

E.4 External API calls have timeouts and retries¶

Rule: every call to an external service (AI providers, social platforms, payment gateways, etc.) has: - A timeout (default 30 seconds; explicit per-call where different) - A retry policy with exponential backoff - An idempotency key for write operations

Why: without these, a single slow third party stalls the whole worker. Without idempotency, retries cause data corruption.

Enforcement: wrapper utility (platform/http.ts or similar) that enforces these defaults. Direct axios / fetch calls bypass the defaults and are rejected by reviewers.

Escape valve: documented rationale + ADR for cases where the wrapper doesn't fit.

E.5 Cost-bearing endpoints declare a rate limit¶

Rule: any endpoint that triggers AI generation, media rendering, or social-platform posting declares a rate limit (per-user and per-account).

Why: abuse / runaway-cost protection. Without per-tenant limits, one bad actor or one buggy client can produce a six-figure bill overnight.

Enforcement: lint rule (Wave 4) requiring rateLimit({...}) middleware on these endpoints. Reviewer enforcement until then.

E.6 Critical-severity dependency advisories block merge¶

Rule: if yarn audit (or Snyk / Dependabot) reports a Critical or High severity finding, no merge until it's resolved or explicitly time-boxed via an exception.

Why: these are public, indexed by vulnerability scanners, and exploited within days.

Enforcement: required CI status check.

Escape valve: explicit exception by an engineering lead, with a follow-up tracking issue and a < 30 day resolution window.

E.7 Auth and billing files cannot have suppressed type errors¶

Rule: // @ts-expect-error is forbidden in src/modules/identity/, src/modules/billing/, and any file that touches authentication or payment flows.

Why: these are the paths where a silent type bug = customer breach or financial incident. The bar is higher.

Enforcement: CI grep gate scoped to these paths.

F. Observability Defaults¶

F.1 Every request has a correlation ID¶

Rule: middleware attaches a correlation ID (UUID) to every request. The ID propagates through logs, downstream calls, and traces.

Why: without correlation IDs, debugging a single user's request across 100+ workers is impossible.

Enforcement: middleware in app.use chain at startup. Lint check that the middleware is registered.

F.2 Every external API call is wrapped in a span¶

Rule: the platform/http.ts wrapper from §E.4 also creates an OpenTelemetry span for the call.

Why: distributed tracing is the difference between "the AI provider was slow" being a guess and a fact.

Enforcement: wrapper enforces this; direct calls don't get traces.

F.3 Every job emits start / success / failure events¶

Rule: every background job's main function emits structured job.start, job.success, job.failure events with jobType, jobId, accountId (if applicable), and duration.

Why: queue depth, throughput, and error rate dashboards depend on these.

Enforcement: wrapper utility (platform/job.ts or similar) handles this automatically.

F.4 Logging follows the structured shape¶

Rule: logs are JSON, with fields: - level (debug / info / warn / error) - msg (human-readable) - correlationId - accountId (if known) - arbitrary structured context

Why: centralized log aggregation (Loki / ELK / CloudWatch / Datadog) requires structure to be queryable.

Enforcement: the logger from platform/logger.ts produces this shape. Direct console.log is forbidden (§A.5).

F.5 New endpoints add a panel to the dashboard¶

Rule: new endpoints are added to the appropriate observability dashboard before they ship.

Why: the failure mode is "we ship a feature, it breaks, we have no dashboard, we don't know it's broken." Adding the panel during PR ensures visibility from day 1.

Enforcement: convention; reviewer enforcement; on-call review at end of sprint.

Escape valve: trivial endpoints (health-check variants, etc.) may skip.

G. Database Hygiene¶

G.1 Every schema change is a migration¶

Rule: schema changes (new tables, new columns, index changes, etc.) ship as a versioned migration file in migrations/.

Why: ad-hoc ALTER TABLE produces environment drift. Migrations make schema state reproducible.

Enforcement: CI gate: "schema diff between fresh migrate-up and the canonical baseline = empty." If the diff is non-empty, the migration is incomplete.

Escape valve: none.

G.2 No manual `ALTER TABLE` in production¶

Rule: running schema changes outside the migration framework, in any environment, is forbidden.

Why: the moment one engineer does this, the migration framework's invariant ("migrations describe the schema") is broken. Recovery is painful.

Enforcement: operational policy. DBA-equivalent gating on production access.

G.3 New columns are NOT NULL DEFAULT or have a backfill plan¶

Rule: adding a NOT NULL column without a default requires a backfill migration plan in the same PR.

Why: a NOT NULL column with no default fails on existing rows during deploy.

Enforcement: reviewer enforcement.

G.4 New tables go through architectural review¶

Rule: any new table requires a 1-page ADR in the documentation tree (e.g. someli-doc/audit/someli-api/adr/####-table-name.md) covering: owning module, primary key, foreign-key relationships (if any), retention policy, anticipated row count, indexing strategy.

Why: the schema is the single hardest thing to change later. Decisions made at table-creation time persist for years.

Enforcement: required for migrations that include CREATE TABLE. Reviewer enforcement (Wave 4).

G.5 New tables include foreign-key constraints¶

Rule: new tables declare foreign-key relationships at the schema level. The grandfathered legacy tables (which have zero FKs) are not retroactively required to add them.

Why: schema-level integrity prevents the orphan-row class of bugs.

Enforcement: reviewer enforcement.

Escape valve: for tables that will eventually be extracted to a different service (so cross-DB FKs would prevent extraction), document the rationale and use an indexed column instead.

H. Dependency Hygiene¶

H.1 New dependencies require justification¶

Rule: adding a package to package.json requires a paragraph in the PR description: what does it do, what alternatives were considered, why this one.

Why: the npm ecosystem rewards "just install a package." The 91-package tree (with 14 fully unused) is the result of treating this decision casually.

Enforcement: PR template field; reviewer enforcement.

H.2 New packages are scanned for license, size, and maintenance¶

Rule: before adding a package, verify: - License is permissive (MIT / Apache-2.0 / BSD / ISC). GPL / AGPL / proprietary are not allowed without legal review. - Bundled tree size is reasonable (under ~5 MB unless justified). - Package has been updated in the last 12 months. - No critical CVEs in current version.

Why: cumulative effect on bundle size, security exposure, and license compliance.

Enforcement: CI advisory checks for license + size; reviewer enforcement for maintenance status.

H.3 Deprecated packages cannot be added¶

Rule: packages on the deprecated list (currently: request, moment, node-fetch, body-parser standalone, raw crypto-js for new flows, sync-mysql, sync-request) cannot be added to new code.

Why: these are tagged for migration in dependencies-inventory.md. Adding new callers compounds the migration cost.

Enforcement: ESLint no-restricted-imports rule enumerating the banned list.

Escape valve: none for new code. Existing callers are migration targets.

H.4 Bundle-size growth requires acknowledgement¶

Rule: PRs that grow node_modules by more than 10% require explicit acknowledgement from a reviewer.

Why: unbounded growth is how a 1.5 GB node_modules becomes a 3 GB one.

Enforcement: CI advisory check measuring delta.

I. Documentation Requirements¶

I.1 New endpoints update OpenAPI spec¶

Rule: new HTTP endpoints update the OpenAPI spec in the same PR.

Why: spec is the contract. Spec lagging behind code = clients confused, SDK generators broken, test gaps.

Enforcement: CI: schema-validate the spec; require diff in openapi.yaml (or wherever the spec lives) for PRs that touch handlers (Wave 4). Reviewer enforcement until then.

I.2 Architectural changes update the relevant doc¶

Rule: changes that materially alter architecture (new module, new external integration, new infrastructure component) update the corresponding document in the audit tree (someli-doc/audit/someli-api/*.md) alongside the code change.

Why: docs that lag code are wrong docs, which is worse than no docs.

Enforcement: reviewer enforcement.

I.3 New env vars are added to `.env.example` and `configuration.md`¶

Rule: every new process.env.<KEY> reference in source has a corresponding row in .env.example (with a placeholder value) and configuration.md (with the description).

Why: new joiners cannot bootstrap an environment if env vars are tribal knowledge.

Enforcement: CI grep check: env vars in conf.js must appear in both .env.example and configuration.md.

I.4 New jobs are added to jobs-inventory¶

Rule: new background jobs are added to jobs-inventory.md in the same PR.

Why: the operational picture of "what runs in this fleet" must stay accurate.

Enforcement: reviewer enforcement.

I.5 New external integrations are added to Integration-inventory¶

Rule: any new third-party integration is added to Integration-inventory.md with the standard fields (Configuration / Key Files / Operations / Notes) in the same PR.

Why: the integrations doc is the canonical view of "what does this platform talk to."

Enforcement: reviewer enforcement.

J. Operational Practices¶

J.1 ADRs for architectural decisions¶

Rule: non-trivial architectural decisions are captured as ADRs in the documentation tree (e.g. someli-doc/audit/someli-api/adr/####-title.md) using the standard format: Context / Decision / Consequences.

Why: the why of a decision is the most valuable part. Without ADRs, every new engineer re-litigates decisions made years ago.

Enforcement: convention. Reviewers ask "where's the ADR?" for decisions that warrant one.

J.2 Postmortems for Sev-1 / Sev-2 incidents¶

Rule: every Sev-1 (customer-impacting outage) and Sev-2 (degraded service) incident gets a postmortem within 5 business days.

Why: without postmortems, the same incidents recur. Memory of the lesson is shorter than the gap between recurrences.

Enforcement: on-call rotation responsibility. Tracked in a postmortem index.

Format: 1. Timeline (UTC) 2. Customer impact (severity, duration, scope) 3. Root cause(s) 4. Contributing factors 5. What worked 6. What didn't 7. Action items, with owners and due dates — including: which engineering practice would have prevented this?

J.3 Action items from postmortems are tracked to closure¶

Rule: every postmortem action item has an owner, a due date, and a tracking issue. Aging action items are escalated.

Why: action items that aren't tracked aren't done.

Enforcement: weekly review of open postmortem actions.

J.4 Tech-debt rotation¶

Rule: every engineer spends ~10–20% of their time on cleanup work, on a rotating basis. There is no permanent "feature team" vs "refactor team" division.

Why: a permanent refactor team is treated as second-class; a permanent feature team accumulates debt their cleanup peers can't keep up with. Rotation distributes ownership.

Enforcement: sprint planning; reviewed quarterly.

J.5 Recognition follows cleanup work, not just features¶

Rule: performance reviews and promotion criteria explicitly include cleanup, refactoring, and operational improvements alongside feature delivery.

Why: if promotion only follows visible new features, no engineer with promotion ambitions will pay down debt.

Enforcement: management practice; reviewed annually.

How to use this document¶

For an engineer:

Day-to-day: the Definition of Done is the operational summary. Read this doc when you have a question about why a rule exists.
When in doubt: the rule applies. Exceptions are explicit, time-boxed, and documented in the PR.
When you disagree: raise it. This document is reviewed and updated; rules that turn out to be wrong are changed (not silently violated).

For a reviewer:

Reject PRs that silently violate rules. Silent non-compliance is the failure mode.
Ask for the rationale on exceptions. A real exception has a real reason.
Cite the rule, not your opinion. "Practice §E.1 prohibits this" is a clearer feedback than "I don't like this."

For a tech lead:

Update the wave schedule as practices migrate from advisory to blocking.
Track adoption in verify-markers.md.
Sponsor the new wave before opening it — secure buy-in, fix the obvious blockers, then turn the gate on.

For engineering leadership:

Sign the Engineering Charter. Without leadership commitment, the practices below are advisory.
Defend the team's time to do this work. The two-streams problem is solved by not pulling the team off cleanup when feature pressure rises.

Engineering Charter — the principles this catalog enforces
Definition of Done — the per-PR checklist
PR Template Reference — the GitHub PR template content
Enterprise Readiness — the broader modernization roadmap
Verify Markers — tracks which wave is currently in force
Dependencies Inventory — the deprecated-packages list referenced in §H.3
Data Model — table-ownership map referenced in §D.3

Engineering Practices¶

Adoption Waves¶

A. Code-Level Standards¶

A.1 New files use TypeScript¶

A.2 strict: true in tsconfig; no implicit any¶

A.3 ESLint with team ruleset, no warnings¶

A.4 Prettier-formatted; no formatting debates¶

A.5 No console.log in committed code¶

A.6 File and function size limits¶

A.7 No // @ts-ignore without rationale¶

A.8 Module-boundary import discipline¶

A.9 No new direct DB connections¶

A.10 No new sync-mysql callers¶

B. Pull-Request Requirements¶

B.1 At least one review¶

B.2 No self-approval¶

B.3 PRs target the right base¶

B.4 PR description follows the template¶

B.5 Files > 500 lines added in a single PR triggers extra review¶

B.6 PR title convention¶

C. Test Requirements¶

C.1 New endpoints have integration tests¶

C.2 New helper functions have unit tests¶

C.3 Bug fixes include a regression test¶

C.4 Coverage targets on critical paths¶

C.5 Flaky tests are quarantined fast¶

C.6 No tests skipped without a referenced issue¶

D. Architectural Boundaries¶

D.1 New code lives in src/modules/¶

D.2 The boy-scout rule¶

D.3 One module owns one table (logically)¶

D.4 No new event buses, message queues, or anti-corruption layers between modules¶

E. Security Defaults¶

E.1 No secrets in source¶

E.2 Input validation at the boundary¶

E.3 Auth-required endpoints declare their required role¶

E.4 External API calls have timeouts and retries¶

E.5 Cost-bearing endpoints declare a rate limit¶

E.6 Critical-severity dependency advisories block merge¶

E.7 Auth and billing files cannot have suppressed type errors¶

F. Observability Defaults¶

F.1 Every request has a correlation ID¶

F.2 Every external API call is wrapped in a span¶

F.3 Every job emits start / success / failure events¶

F.4 Logging follows the structured shape¶

F.5 New endpoints add a panel to the dashboard¶

G. Database Hygiene¶

G.1 Every schema change is a migration¶

G.2 No manual ALTER TABLE in production¶

G.3 New columns are NOT NULL DEFAULT or have a backfill plan¶

G.4 New tables go through architectural review¶

G.5 New tables include foreign-key constraints¶

H. Dependency Hygiene¶

H.1 New dependencies require justification¶

H.2 New packages are scanned for license, size, and maintenance¶

H.3 Deprecated packages cannot be added¶

H.4 Bundle-size growth requires acknowledgement¶

I. Documentation Requirements¶

I.1 New endpoints update OpenAPI spec¶

I.2 Architectural changes update the relevant doc¶

I.3 New env vars are added to .env.example and configuration.md¶

I.4 New jobs are added to jobs-inventory¶

I.5 New external integrations are added to Integration-inventory¶

J. Operational Practices¶

J.1 ADRs for architectural decisions¶

J.2 Postmortems for Sev-1 / Sev-2 incidents¶

J.3 Action items from postmortems are tracked to closure¶

J.4 Tech-debt rotation¶

J.5 Recognition follows cleanup work, not just features¶

How to use this document¶

Related¶

A.2 `strict: true` in tsconfig; no implicit `any`¶

A.5 No `console.log` in committed code¶

A.7 No `// @ts-ignore` without rationale¶

A.10 No new `sync-mysql` callers¶

D.1 New code lives in `src/modules/`¶

G.2 No manual `ALTER TABLE` in production¶

I.3 New env vars are added to `.env.example` and `configuration.md`¶