Content & Template Pipeline

The defining role of designer-api in the platform is populating and curating the shared content library that someli-api then uses to generate user-facing posts. This document sketches the data flow.

Data model — the relevant tables

(Inferred from the SQL queries observed in jobs and routes; verify against the audit-tree schema snapshot at ../someli-api/someli-schema.sql — not against any in-repo file, since the schema is not committed to any source repo.)

| Table | Purpose |
| --- | --- |
| tIndustries | Top-level industry verticals (Auto, Insurance, Real Estate, HR, …). Each industry has an integer id. |
| tCategories | Sub-categories under industries (e.g., for Auto: Ceramic Coating, Window Tinting). The filter parent_id > 0 selects the rows that are not industries themselves. |
| tLibrary | The content library: body text plus metadata, per category and industry |
| tDefaultCategories | Templates' default categories (with a modelId foreign key to the AI model) |
| tTopics | Topics for AI content generation; the status enum drives queue progress (0=new, 1=in-progress, …) |
| tTempalte_Status (sic) | Status of generated templates per library entry per day |
| tAutoDesignPost | The post-design queue: what to generate / approve |
| tMedia, tMediaTemplates | Media metadata + JSON descriptors |

The factory pattern

For each industry vertical, designer-api runs one or more job_<industry>.js files:

tIndustries (id = <industry>)
SELECT tCategories WHERE industry_id = <industry>
For each category:
   SELECT tLibrary entries (not yet templated today)
   Generate template for each library entry
   Insert into tTempalte_Status (records the template was made today)
   Insert into tAutoDesignPost (queues for designer approval)
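
The loop above could be sketched roughly as follows. This is a minimal sketch, not the actual job code: the `db` object and its method names (`categoriesFor`, `untemplatedToday`, `recordTemplate`, `queueForApproval`), the `generateTemplate` callback, and the in-memory stand-in for the tables are all hypothetical. A real job would issue the SQL through its own database client.

```javascript
// Hypothetical sketch of one factory run for a single industry.
// `db` and `generateTemplate` are injected so the control flow can be read
// without a database; none of these names come from the repo.
function runIndustryJob(industryId, db, generateTemplate, today) {
  let generated = 0;
  for (const category of db.categoriesFor(industryId)) {
    for (const entry of db.untemplatedToday(category.id, today)) {
      const template = generateTemplate(entry, category);
      db.recordTemplate(entry.id, today);      // -> tTempalte_Status
      db.queueForApproval(entry.id, template); // -> tAutoDesignPost
      generated += 1;
    }
  }
  return generated;
}

// In-memory stand-in for the tables, for illustration only.
function makeFakeDb() {
  const state = {
    categories: [{ id: 10, industry_id: 1 }, { id: 11, industry_id: 2 }],
    library: [{ id: 100, category_id: 10 }, { id: 101, category_id: 10 }],
    templateStatus: [], // rows of { libraryId, day }
    autoDesignPost: [], // rows of { libraryId, template }
  };
  return {
    state,
    categoriesFor: (industryId) =>
      state.categories.filter((c) => c.industry_id === industryId),
    untemplatedToday: (categoryId, day) =>
      state.library.filter(
        (l) =>
          l.category_id === categoryId &&
          !state.templateStatus.some((s) => s.libraryId === l.id && s.day === day)
      ),
    recordTemplate: (libraryId, day) => state.templateStatus.push({ libraryId, day }),
    queueForApproval: (libraryId, template) =>
      state.autoDesignPost.push({ libraryId, template }),
  };
}
```

In this sketch a second run on the same day returns 0, because the status rows written by the first run exclude every entry from the "not yet templated today" query.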

The cron tick is typically every 5 minutes (*/5 * * * *), with the isOnProcess guard preventing overlapping runs.
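
One way such a guard can work is sketched below. Only the flag name `isOnProcess` comes from the jobs; the `guarded` wrapper and its return values are assumptions made for illustration, and the actual jobs may set and clear the flag inline instead.

```javascript
// Hypothetical sketch of an overlap guard around a cron-driven job.
function guarded(job) {
  let isOnProcess = false;
  return async function tick() {
    if (isOnProcess) return 'skipped'; // previous run is still in flight
    isOnProcess = true;
    try {
      await job();
      return 'ran';
    } finally {
      isOnProcess = false; // always release, even if the job throws
    }
  };
}
```

The `finally` block matters here: if the flag were only cleared on success, a single failed run would leave the job permanently skipped.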

Approval workflow

After generation:

  1. job_post_approved.js runs every minute and pulls rows from tAutoDesignPost WHERE approved = 1 AND status = 'Added' AND attachedLibrary = 0
  2. For each approved post, it derives variantCode (a color descriptor), renders the image with polotno-node, uploads it to S3, and sets attachedLibrary = 1
  3. job_preprodpost_approved.js does the equivalent for the pre-production stage (probably before customer-facing visibility)
  4. job_specific_approved_post.js handles per-account-targeted approvals
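
The selection predicate in step 1 is what keeps the job from reprocessing a post. A minimal sketch of that behaviour, using plain objects in place of tAutoDesignPost rows: the column names are taken from the query above, while `pendingApproved`, `processApproved` and the `publish` callback are hypothetical.

```javascript
// Hypothetical sketch: which rows one tick would pick up, and how marking
// them done keeps the next tick from picking them up again.
function pendingApproved(rows) {
  return rows.filter(
    (r) => r.approved === 1 && r.status === 'Added' && r.attachedLibrary === 0
  );
}

function processApproved(rows, publish) {
  const picked = pendingApproved(rows);
  for (const row of picked) {
    publish(row);            // render + upload would happen here
    row.attachedLibrary = 1; // set only after publish succeeds
  }
  return picked.length;
}
```

Because the flag is set after `publish` returns, a row whose publish step throws stays pending and is retried on the next tick.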

Validation jobs

job_check_color_with_json.js, job_check_temp_color_with_json.js, job_check_media_json.js, job_check_mediaTemplates_json.js, job_check_temp_valid.js each run every few minutes and:

  1. SELECT tMedia / tMediaTemplates WHERE json IS NOT NULL AND checkedQueue = 0
  2. Parse the JSON design descriptor
  3. Compare against the rendered media for color / dimension / template-validity consistency
  4. Set checkedQueue = 1 (and flag bad rows for human review)
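
The four steps can be sketched for a single row as below. This is illustrative only: `json` and `checkedQueue` are the column names from step 1, but `renderedWidth`, `renderedHeight`, `needsReview` and `reason` are assumed fields, and the sketch only checks dimensions, assuming the descriptor carries top-level width and height values.

```javascript
// Hypothetical sketch of one validation pass over a media row.
function checkRow(row) {
  // Every outcome marks the row as checked; a non-null reason flags it for review.
  const flag = (reason) => ({
    ...row,
    checkedQueue: 1,
    needsReview: reason !== null,
    reason,
  });

  let design;
  try {
    design = JSON.parse(row.json);
  } catch (err) {
    return flag('invalid JSON');
  }
  if (design === null || typeof design !== 'object') {
    return flag('not a design object');
  }
  const sameSize =
    design.width === row.renderedWidth && design.height === row.renderedHeight;
  return flag(sameSize ? null : 'dimension mismatch');
}
```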

These act as the CI for content — automated regression detection on the template library.

Bots vs jobs (the AI-generation side)

The *bot.js files generate content from scratch using OpenAI:

| Bot | What it generates |
| --- | --- |
| content_generation_bot.js | Topic-driven body copy per category |
| FAQsbot.js | FAQ posts |
| quizbot.js | Quiz posts |
| quotesbot.js | Quote posts |
| trendsbot.js | Trending-news-driven posts |
| post_image_generation1_bot.js | Image generation (probably DALL-E via OpenAI; Stable Diffusion is not an OpenAI model and would need a different provider) |

These feed tLibrary / tTopics with new entries; the industry-specific job_*.js files then template them.

Polotno integration

When a job decides to "render an image from a template":

  1. The template is a Polotno JSON design (stored as tMedia.json or tMediaTemplates.json)
  2. polotno-node's createInstance().pageToImageBase64(...) is called server-side
  3. The resulting PNG is uploaded to S3 (s3.upload(...))
  4. The S3 URL is stored against the row
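
The four steps can be sketched with the renderer and uploader passed in as functions, so the flow is visible without either library. Everything named here is an assumption for illustration: `renderAndStore`, the `render` / `upload` / `saveUrl` callbacks, and the `media/<id>.png` object key. In a real job, `render` would wrap the polotno-node call and `upload` would wrap `s3.upload(...)`.

```javascript
// Hypothetical sketch of the render-and-store flow; no library is imported.
async function renderAndStore(row, { render, upload, saveUrl }) {
  const design = JSON.parse(row.json);           // 1. Polotno JSON design
  const base64Png = await render(design);        // 2. server-side render to base64
  const body = Buffer.from(base64Png, 'base64'); //    decode to raw PNG bytes
  const key = `media/${row.id}.png`;             //    assumed key layout
  const url = await upload(key, body, 'image/png'); // 3. upload, returns the URL
  saveUrl(row.id, url);                          // 4. store the URL against the row
  return url;
}
```

Keeping the dependencies injectable also makes it possible to exercise the job logic with stubs, without launching a browser or touching S3.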

polotno-node spawns a headless Chrome (via puppeteer) and uses Polotno's React renderer. Each render takes roughly 1-5 seconds. Cost note: with 57 jobs running every 5 minutes and each rendering a few designs, the render time adds up. Expect polotno-node usage to be a non-trivial cost line.
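
To put a rough number on that cost note, the arithmetic can be worked through as below. The 57 jobs, the 5-minute tick and the 1-5 second render time come from the text; 3 designs per run is an assumed stand-in for "a few", and the result is an upper bound because it assumes every tick actually renders.

```javascript
// Back-of-envelope render load per day (upper bound, see assumptions above).
const jobs = 57;
const ticksPerDay = (24 * 60) / 5; // 288 ticks at one every 5 minutes
const designsPerRun = 3;           // assumption for "a few"
const rendersPerDay = jobs * ticksPerDay * designsPerRun;

function renderHoursPerDay(secondsPerRender) {
  return (rendersPerDay * secondsPerRender) / 3600;
}

console.log(rendersPerDay);                   // 49248
console.log(renderHoursPerDay(1).toFixed(1)); // 13.7
console.log(renderHoursPerDay(5).toFixed(1)); // 68.4
```

Under these assumptions the pipeline would spend between about 14 and 68 hours of headless-Chrome time per day, which is more than one sequential worker can provide at the slow end.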

Notifications

teamsnotification.js runs every 2 hours during the working day and sends a Slack message listing any gaps in content. The content team uses this to prioritise manual reviews via Someli-Designer.

Coordination with someli-api

designer-api writes to the same DB tables that someli-api reads. There is no event or signal between the two services. The customer-facing app (someli-platform / someli-api) just queries the latest library state on demand.

Edge case: a customer requests a post in industry X at the same moment designer-api is mutating tLibrary for X. Assuming the tables use InnoDB, MySQL's row-level locking makes this safe at the row level: a reader does not see a partially written row, and a conflicting write waits for the lock to be released. Liveness might suffer if designer-api's sync-mysql connection holds a row lock for too long.

Recommendations

| ID | Recommendation | Effort |
| --- | --- | --- |
| C-1 | Replace the 30+ industry-clone jobs with one parameterised job driven by a tJobConfig table | Medium (1-2 weeks) |
| C-2 | Add a tContentGenerationRun audit table tracking each run (start time, rows processed, errors), replacing console.log archaeology | Small (1 day) |
| C-3 | Surface the content-gap data on a designer FE page (instead of relying on Slack notifications) | Medium |
| C-4 | Move Polotno rendering to a separate pool of worker processes (current model: each job spins up its own Polotno instance) | Medium |
| C-5 | Add a "dry-run" mode for content generation to test prompt / template changes safely | Medium |
| C-6 | Centralise prompt templates in helper/prompts/<topic>.json | Medium |
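
As an illustration of C-2, a job could be wrapped so that every run leaves one audit record whether it succeeds or fails. This is a sketch under assumptions: `withRunAudit`, the record fields, and the `saveRun` callback (standing in for an INSERT into the proposed tContentGenerationRun table) are all hypothetical, and the wrapped job is assumed to be synchronous and to return the number of rows it handled.

```javascript
// Hypothetical sketch for C-2: one audit record per job run.
function withRunAudit(jobName, job, saveRun, now = () => new Date()) {
  return function auditedJob(...args) {
    const run = {
      jobName,
      startedAt: now(),
      finishedAt: null,
      rowsProcessed: 0,
      error: null,
    };
    try {
      run.rowsProcessed = job(...args); // job returns its processed-row count
    } catch (err) {
      run.error = String(err && err.message ? err.message : err);
    } finally {
      run.finishedAt = now();
      saveRun(run); // recorded on success and on failure
    }
    return run;
  };
}
```

Swallowing the error after recording it is a design choice in this sketch so that one failing run does not take down the scheduler process; a job that should fail loudly could rethrow after `saveRun`.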