Media Processing

Alongside someli-api, designer-api is one of the two heaviest media-processing repos in the platform. Its primary job is to generate, validate, and store template-based images.

Tooling

| Tool | SDK | Use |
|---|---|---|
| Polotno (server-side) | polotno-node ^2.10.4 | Render PNG/JPG from a Polotno JSON design |
| Headless Chrome | puppeteer ^23.4.0 | Underlies polotno-node; renders the React Polotno editor in a hidden browser to produce images |
| Sharp | sharp ^0.33.5 | Resize, crop, format-convert (JPEG/PNG/WebP), quality control |
| FFmpeg | fluent-ffmpeg ^2.1.3 + system binary | Video processing (if used) |
| PDF parsing | pdf-parse ^1.1.1 | Extract text from uploaded PDFs (in pdfToExtractData.js) |
| Image dimensions | (not via npm image-size; see verify) | (verify whether dimensions are read from Sharp output) |

S3

Two buckets in two regions (matching the platform pattern):

  • conf.S3_Bucket_Name in conf.S3_Region — primary
  • conf.S3_Bucket_Name2 in conf.S3_Region2 — secondary

Two AWS.S3 clients are initialised in routes/routes.js:

const s3 = new AWS.S3({ accessKeyId, region: conf.S3_Region, secretAccessKey });
const s2 = new AWS.S3({ accessKeyId, region: conf.S3_Region2, secretAccessKey });

Paths under each bucket are configured via conf.S3_Path / conf.S3_Path2. Public URLs use conf.S3_Bucket_Url2.

Pipelines

1. Template generation → S3

A job picks up a row from tAutoDesignPost → uses polotno-node to render the JSON design → optionally post-processes with Sharp → uploads to S3 → updates the DB row with the URL.
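The flow above can be sketched as a single job tick. This is a minimal sketch under stated assumptions, not the repo's actual code: every function on `deps` (`fetchNextRow`, `renderDesign`, `postProcess`, `upload`, `saveUrl`) is a hypothetical stand-in for the DB query, the polotno-node render, the Sharp step, the S3 upload, and the DB update, and the `id` / `json` column names are assumed.

```javascript
// Hypothetical sketch of one tick of the template-generation job.
// Every function on `deps` is an assumed stand-in, injected so the flow
// can be exercised without a database, Chrome, or S3.
async function runRenderTick(deps) {
  // 1. Pick up one pending row from tAutoDesignPost (query is assumed).
  const row = await deps.fetchNextRow();
  if (!row) return null; // nothing to do this tick

  // 2. Render the Polotno JSON design to an image buffer.
  let image = await deps.renderDesign(JSON.parse(row.json));

  // 3. Optional post-processing (e.g. a Sharp resize); skipped if not supplied.
  if (deps.postProcess) image = await deps.postProcess(image);

  // 4. Upload to S3, then record the resulting URL on the row.
  const url = await deps.upload(image, `${row.id}.png`);
  await deps.saveUrl(row.id, url);
  return url;
}
```

Injecting the dependencies keeps the ordering of the steps visible and lets the tick be tested with fakes; it also lines up with the later recommendations to centralise the S3 and Polotno setup.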

2. Template validation

job_check_color_with_json.js (every 5 min):

  1. SELECT tMedia WHERE json IS NOT NULL AND checkedQueue = 0 LIMIT 50
  2. Download the image from S3
  3. Use Sharp / canvas-like logic to extract dominant color
  4. Compare to the JSON's declared color
  5. Flag mismatches
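The comparison step could look roughly like the sketch below. The job's actual comparison rule is not documented here, so this assumes the declared color is a `#rrggbb` hex string and that a mismatch means the Euclidean RGB distance exceeds a threshold. The function names and the threshold of 40 are illustrative.

```javascript
// Parse "#rrggbb" (or "rrggbb") into [r, g, b]; returns null for other input.
function hexToRgb(hex) {
  const m = /^#?([0-9a-f]{6})$/i.exec(String(hex).trim());
  if (!m) return null;
  const n = parseInt(m[1], 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

// Euclidean distance in RGB space (0 for identical colors).
function rgbDistance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);
}

// Assumed rule: flag when the distance exceeds `threshold`.
// The default of 40 is illustrative, not taken from the job.
function isColorMismatch(declaredHex, dominantRgb, threshold = 40) {
  const declared = hexToRgb(declaredHex);
  if (!declared) return true; // unparseable declared color is treated as a mismatch
  return rgbDistance(declared, dominantRgb) > threshold;
}
```

Sharp's `stats()` reports a `dominant` color with `r`, `g`, `b` fields, which could supply `dominantRgb`; whether the job uses that or its own sampling logic still needs to be verified.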

This is the content QA layer.

3. PDF extraction

pdfToExtractData.js uses pdf-parse to extract text from uploaded PDFs (probably for ingesting customer brand docs).

4. Image upload utility

s3_Image_upload.js is a small helper for uploading binary blobs to S3. Used by routes / bots.

5. Polotno upload utility

polotno_image_uploader.js — a wrapper around Polotno's image-upload API. Possibly delegates to S3 too.

Performance characteristics

  • Polotno render: 1-5 seconds per page (depends on complexity)
  • Sharp resize: < 100 ms for typical 1080×1080 images
  • S3 upload: 100-500 ms (depending on network)
  • A single approved post render: ~3-8 seconds end-to-end

With 60+ workers running, aggregate render throughput is high, but so is memory use: each worker spawns its own headless Chrome via puppeteer. A typical Chrome instance uses 100-200 MB, so 60 workers × 150 MB ≈ 9 GB for headless Chrome alone.

Recommendation: pool Polotno instances. Use polotno-node's instance reuse: call createInstance() once, then make multiple render calls such as jsonToImageBase64() on the same instance.
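One way to approach the pooling recommendation is a lazily created, shared instance per worker process. The sketch below is an assumption about how this could be structured, not existing code: `createSharedRenderer` and `factory` are hypothetical names, and `factory` would presumably wrap polotno-node's `createInstance({ key })`.

```javascript
// Lazily create one shared renderer and reuse it across renders.
// `factory` is injected; in this repo it would presumably wrap
// polotno-node's createInstance({ key }) (an assumption, not verified here).
function createSharedRenderer(factory) {
  let instancePromise = null;

  function getInstance() {
    // Cache the promise, not the instance, so concurrent callers
    // do not each launch their own Chrome.
    if (!instancePromise) {
      const p = Promise.resolve().then(factory);
      instancePromise = p;
      // If the launch fails, clear the cache so a later call can retry.
      p.catch(() => {
        if (instancePromise === p) instancePromise = null;
      });
    }
    return instancePromise;
  }

  async function close() {
    if (!instancePromise) return;
    const pending = instancePromise;
    instancePromise = null;
    const instance = await pending.catch(() => null);
    if (instance) await instance.close();
  }

  return { getInstance, close };
}
```

A worker would call `getInstance()` before each render and `close()` on shutdown. Recreating the instance after a Chrome crash is left out of this sketch and would need handling alongside the retry logic in F-M5.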

Findings

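F-M1: No upload-size limits

server.js accepts JSON bodies of up to 50 MB and uses express-fileupload with no limits configured. A user uploading a 500 MB file could exhaust memory and crash the process. Tighten via the following (abortOnLimit makes express-fileupload reject oversized uploads with HTTP 413 rather than truncating them):

app.use(fileUpload({ limits: { fileSize: 50 * 1024 * 1024 }, abortOnLimit: true }));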

F-M2: No virus/malware scanning

Uploaded PDFs / images are not scanned. For an internal tool used by trusted staff, this is acceptable. If customer-uploaded files ever flow through here, add ClamAV or equivalent.

F-M3: Polotno license key location

The license key is initialised somewhere in routes/routes.js (search for setLicenseKey or licenseKey:). Verify whether it is hardcoded or env-driven. If it is hardcoded, move it to process.env.POLOTNO_LICENSE.

F-M4: Missing LIMIT in some validation jobs

Some validation jobs lack a LIMIT clause and process all unchecked rows in one tick. As tMedia grows, each tick takes longer, eventually exceeding the cron interval and then the sync-mysql connection's transaction window. Spot-check each validation job for LIMIT N.
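A bounded tick could be structured as in the sketch below. It is illustrative only: `processBounded`, `fetchBatch`, and `handleRow` are hypothetical names, and the batch size of 50 and per-tick cap of 200 are assumed values, not taken from any existing job.

```javascript
// Process at most `maxRows` per tick, `batchSize` rows per query.
// `fetchBatch(limit)` is a hypothetical stand-in for a query such as
//   SELECT ... FROM tMedia WHERE checkedQueue = 0 LIMIT ?
// and is expected to return only rows that have not been handled yet.
async function processBounded({ fetchBatch, handleRow, batchSize = 50, maxRows = 200 }) {
  let processed = 0;
  while (processed < maxRows) {
    const limit = Math.min(batchSize, maxRows - processed);
    const rows = await fetchBatch(limit);
    if (rows.length === 0) break; // queue drained
    for (const row of rows) {
      await handleRow(row); // assumed to mark the row as checked
    }
    processed += rows.length;
    if (rows.length < limit) break; // last partial batch
  }
  return processed;
}
```

The per-tick cap keeps the run time of a tick roughly constant as the table grows; rows beyond the cap are picked up by the next tick.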

F-M5: No retry for Polotno render failures

If Polotno's headless Chrome crashes mid-render, the row stays in status = 'Added' forever. Fix: a tries counter + dead-letter behaviour after N attempts.
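The tries counter and dead-letter behaviour could be sketched as follows. Only `status = 'Added'` comes from the existing schema; the `tries` column, the `'Failed'` status, the limit of 3, and the function names are assumptions made for illustration.

```javascript
// Decide the next state of a row after a failed render attempt.
// `tries`, the 'Failed' status, and the default limit of 3 are assumptions;
// only status = 'Added' comes from the existing schema.
function afterRenderFailure(row, maxTries = 3) {
  const tries = (row.tries || 0) + 1;
  return {
    ...row,
    tries,
    // Stay 'Added' so the next tick retries, until the budget is spent.
    status: tries >= maxTries ? 'Failed' : 'Added',
  };
}

// Wrap a render so a crash updates the row instead of leaving it stuck.
// `render` and `saveRow` are hypothetical injected functions.
async function renderWithRetryTracking(row, render, saveRow, maxTries = 3) {
  try {
    return await render(row);
  } catch (err) {
    await saveRow(afterRenderFailure(row, maxTries));
    return null;
  }
}
```

Rows that reach the dead-letter status drop out of the job's pick-up query and can be inspected or re-queued manually.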

Recommendations

| ID | Recommendation | Effort |
|---|---|---|
| M-1 | Pool Polotno instances (reuse Chrome processes) | Medium |
| M-2 | Tighten file-upload size limits | Trivial |
| M-3 | Move Polotno license key to env | Small |
| M-4 | Add LIMIT to all validation queries | Small |
| M-5 | Add retry + dead-letter for render failures | Medium |
| M-6 | Centralise S3 client initialisation (don't duplicate in routes/jobs/bots) | Medium |
| M-7 | Centralise Polotno render utility (don't duplicate setup code across jobs) | Medium |