Backend Conventions (Functions + Jobs)

This document defines how we write backend code in apps/functions so it is consistent, scalable, and “big-tech boring”: predictable contracts, safe retries, and clear module boundaries.

It is written to be actionable by engineers doing refactors.

Guardrails (non‑negotiables)

  • Public API is a contract: /v1/... endpoints are stable. Breaking changes require versioning + deprecation policy.
  • Everything is retryable: any request or job may run more than once. Handlers MUST be idempotent.
  • Async-first architecture: user-facing endpoints should return quickly; long work is done by background jobs/workers.
  • Validate at boundaries: validate and normalize inputs at HTTP/job boundaries using schemas.
  • Observability by default: every request/job MUST emit correlated logs/traces and write durable status state.
  • Vendor coupling is allowed, but contained: GCP SDK usage should live in small adapters, not scattered through domain logic.

Paved path: Architecture we build toward

Public HTTP API

  • Express API (e.g. apps/functions/api.js) exposes /v1/... endpoints.
  • Endpoints should be thin: auth → validation → create resource → enqueue work → return ID.
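
The thin-endpoint flow above can be sketched as a handler with injected collaborators. This is a minimal sketch, not the repo's implementation: validateBody, createSubmission, and enqueuePublish are hypothetical names, auth middleware is assumed to have already set req.auth, and the 202 status is an assumption (the REST conventions doc is authoritative).

```javascript
// Hypothetical thin handler: validation -> create resource -> enqueue work -> return ID.
// `deps` carries illustrative collaborators; none of these names come from the codebase.
function makeCreateSubmissionHandler(deps) {
  return function createSubmission(req) {
    // Auth middleware is assumed to have run already and set req.auth.userId.
    if (!req.auth || !req.auth.userId) {
      return { status: 401, body: { error: { code: 'unauthenticated' } } };
    }
    const parsed = deps.validateBody(req.body);
    if (!parsed.ok) {
      return { status: 400, body: { error: { code: 'invalid_payload', details: parsed.errors } } };
    }
    // Persist first so the resource can be polled as soon as the ID is returned.
    const id = deps.createSubmission(req.auth.userId, parsed.value);
    deps.enqueuePublish({ version: 1, submissionId: id, requestId: req.requestId });
    // Assumed status: 202 signals that work was accepted, not completed.
    return { status: 202, body: { id } };
  };
}
```

The handler holds no business rules and makes no SDK calls, so it can be unit tested with plain stubs.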

Async engine

  • Cloud Tasks queues drive orchestration and platform publishing workers.
  • Workers update durable state (Firestore) so clients can poll status and/or receive webhooks.

Layering (code boundaries)

  • http/: request/response shaping only (routers, handlers, status codes).
  • domain/ (target state): pure business logic (publishing rules, target resolution policy, mapping of states), no SDK calls.
  • adapters/gcp/ (target state): Firestore/Tasks/Storage wrappers; this is where vendor coupling lives.
  • integrations/: platform-specific API clients + translation (TikTok/X/YouTube/etc).
  • services/: application workflows (glue): calls domain + adapters + integrations.
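
As an illustration of the domain/services split, the sketch below keeps the state mapping pure and leaves I/O to an injected store. The status names and the store methods (getTargetStatuses, setSubmissionStatus) are assumptions made for the example, not existing code.

```javascript
// domain/ (hypothetical): pure state mapping, no SDK calls, trivially unit-testable.
function overallStatus(targetStatuses) {
  if (targetStatuses.length === 0) return 'pending';
  if (targetStatuses.every((s) => s === 'succeeded')) return 'succeeded';
  if (targetStatuses.some((s) => s === 'pending' || s === 'running')) return 'in_progress';
  // Every target has finished and at least one of them failed.
  return targetStatuses.some((s) => s === 'succeeded') ? 'partially_failed' : 'failed';
}

// services/ (hypothetical): glue that reads through an adapter, applies the
// domain rule, and writes the result back. `store` stands in for adapters/gcp/.
function refreshSubmissionStatus(store, submissionId) {
  const targets = store.getTargetStatuses(submissionId);
  const status = overallStatus(targets);
  store.setSubmissionStatus(submissionId, status);
  return status;
}
```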

Note: today there is a large shared/ folder. That’s OK. Refactors should move code toward the boundaries above.

Repository structure: what belongs where (today)

  • apps/functions/api.js
    • Express app composition: middleware order, route mounting, 404, error handler.
  • apps/functions/http/routes/*
    • Route definitions + middleware chain.
    • Should call into services; avoid embedding business logic.
  • apps/functions/http/handlers/*
    • Handler logic that is still HTTP-shaped (req/res). Keep small; delegate to services.
  • apps/functions/middleware/*
    • Cross-cutting HTTP concerns: auth, validation, rate limiting, error handling, request IDs.
  • apps/functions/orchestrators/*
    • Job/workflow orchestration. Should be “workflow glue,” not platform implementation details.
  • apps/functions/services/*
    • Application-level workflows (media acquisition, AI transcription, template rendering, etc.).
  • apps/functions/integrations/*
    • Platform-specific code: OAuth, token refresh, publish calls, polling, metrics.
  • apps/functions/shared/*
    • Shared primitives and utilities. Refactor goal is to split into domain/ + adapters/ over time.

HTTP API standards (integration-friendly)

Request lifecycle (middleware order)

Use a consistent chain for all routes:

  • Request ID / trace context
  • Auth
  • Subscription / authorization
  • Rate limiting
  • Schema validation
  • Handler
  • Error handler (last)
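
The ordering above can be modeled without a framework. The sketch below is not Express code: the step names are illustrative labels and runChain is a hypothetical helper. It only shows the two properties that matter, namely that an early step can end the chain and that the error handler sees every thrown error.

```javascript
// Documented order. The step names are illustrative labels, not real module names.
const CHAIN_ORDER = ['requestId', 'auth', 'authorization', 'rateLimit', 'validation', 'handler'];

// Runs the steps in order. A step ends the chain by returning a response
// object; throwing hands control to the error handler, which always runs last.
function runChain(steps, req, errorHandler) {
  try {
    for (const name of CHAIN_ORDER) {
      const response = steps[name](req);
      if (response) return response;
    }
    // No step produced a response, which is a wiring bug.
    return { status: 500, body: { error: { code: 'no_response' } } };
  } catch (err) {
    return errorHandler(err, req);
  }
}
```

Running the request ID step first means even a 401 or 429 response can carry a correlation ID.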

Authentication standards

  • Public API: API key auth for server-to-server integrations (Zapier/Make/n8n) is allowed.
  • Dashboard/API key management: Firebase ID token auth is allowed.

All auth middleware MUST set a single consistent auth context on the request (example: req.auth.userId).

Errors

Follow the project’s REST conventions doc (engineering/api-conventions-rest.md): stable error shape, consistent status codes.

Rules of thumb:

  • 400: validation (client can fix payload)
  • 401: missing or invalid credentials; 403: authenticated but not permitted
  • 404: missing resource
  • 409: conflict/idempotency collision
  • 429: rate limit exceeded
  • 5xx: internal errors or upstream outage
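
One way to keep these mappings consistent is a single translation function used by the error handler. The sketch below is an assumption-laden example: the error codes, the AppError class, and the response shape are hypothetical, and engineering/api-conventions-rest.md remains the authority on the real error shape.

```javascript
// Hypothetical internal error codes mapped to the status codes listed above.
const STATUS_BY_CODE = {
  invalid_payload: 400,
  unauthenticated: 401,
  forbidden: 403,
  not_found: 404,
  idempotency_conflict: 409,
  rate_limited: 429,
};

// Errors raised on purpose by our own code carry a stable `code`.
class AppError extends Error {
  constructor(code, message) {
    super(message);
    this.code = code;
  }
}

// Translates any thrown value into a status plus an assumed stable error body.
// Unknown errors become a generic 500 so internal details never leak to clients.
function toHttpError(err, requestId) {
  const known = err instanceof AppError
    && Object.prototype.hasOwnProperty.call(STATUS_BY_CODE, err.code);
  return {
    status: known ? STATUS_BY_CODE[err.code] : 500,
    body: {
      error: {
        code: known ? err.code : 'internal',
        message: known ? err.message : 'Internal error',
        requestId,
      },
    },
  };
}
```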

Idempotency (required for side effects)

Endpoints that create resources or trigger side effects MUST support an Idempotency-Key header.

  • Same key + same endpoint + same auth context MUST return the same response.
  • If the same key is reused with a meaningfully different payload, return 409.
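
The two rules above can be sketched as a wrapper around any side-effecting handler. Everything here is illustrative: makeIdempotentHandler is a hypothetical name, the in-memory Map stands in for a durable store (in production this would need a transactional write, for example in Firestore, to be safe across instances), and hashing JSON.stringify output is a simplification of "meaningfully different" because it is sensitive to key order.

```javascript
const crypto = require('node:crypto');

// Fingerprint of the payload. Simplification: key order affects the hash,
// so a real implementation may want to canonicalize the body first.
function payloadHash(body) {
  return crypto.createHash('sha256').update(JSON.stringify(body)).digest('hex');
}

// Wraps `handler` so that the same key + endpoint + auth context replays the
// stored response, and the same key with a different payload returns 409.
function makeIdempotentHandler(store, handler) {
  return function handle(req) {
    const key = req.headers['idempotency-key'];
    if (!key) return handler(req); // no key supplied: nothing to deduplicate on
    const scope = `${req.auth.userId}:${req.method} ${req.path}:${key}`;
    const hash = payloadHash(req.body);
    const existing = store.get(scope);
    if (existing) {
      if (existing.hash !== hash) {
        return { status: 409, body: { error: { code: 'idempotency_conflict' } } };
      }
      return existing.response; // replay without repeating the side effect
    }
    const response = handler(req);
    store.set(scope, { hash, response });
    return response;
  };
}
```

Scoping the stored entry by user and endpoint keeps one client's keys from colliding with another's.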

Background jobs (Cloud Tasks) standards

Job payloads are versioned contracts

Every task payload MUST include the fields below unless a field is marked conditional or optional:

  • version: integer (start at 1)
  • requestId: correlation ID (or inherit from trace context)
  • idempotencyKey: required when the job causes side effects
  • createdAt: ISO 8601 timestamp (optional)

Validate payloads at the worker boundary using schemas (Zod or @soku/schema).
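
To show what the worker-boundary check amounts to, here is a dependency-free stand-in. In the real code this would be a Zod or @soku/schema schema; parseTaskPayload and its result shape are hypothetical and exist only so the example runs on its own.

```javascript
// Hand-rolled stand-in for a schema: validates the envelope fields listed above.
// `options.sideEffects` marks jobs for which idempotencyKey is mandatory.
function parseTaskPayload(raw, options = {}) {
  if (raw === null || typeof raw !== 'object' || Array.isArray(raw)) {
    return { ok: false, errors: ['payload must be an object'] };
  }
  const errors = [];
  if (!Number.isInteger(raw.version) || raw.version < 1) {
    errors.push('version must be an integer >= 1');
  }
  if (typeof raw.requestId !== 'string' || raw.requestId === '') {
    errors.push('requestId must be a non-empty string');
  }
  if (options.sideEffects && (typeof raw.idempotencyKey !== 'string' || raw.idempotencyKey === '')) {
    errors.push('idempotencyKey is required for jobs with side effects');
  }
  // Loose check: Date.parse accepts more than strict ISO 8601.
  if (raw.createdAt !== undefined
    && (typeof raw.createdAt !== 'string' || Number.isNaN(Date.parse(raw.createdAt)))) {
    errors.push('createdAt must be a timestamp string');
  }
  return errors.length > 0 ? { ok: false, errors } : { ok: true, value: raw };
}
```

A worker that receives an invalid payload should fail permanently rather than ask for a retry, since retrying cannot fix the payload.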

Idempotency and retries (the core rule)

Cloud Tasks retries are normal. Workers MUST be safe under:

  • duplicate deliveries
  • timeouts and partial failures
  • worker redeploys mid-execution

Required pattern:

  • Persist a durable “run record” before calling external platforms.
  • If a run record indicates success already happened, return success without redoing the side effect.
  • Store platform IDs returned by upstreams (tweet/video/post IDs) and treat them as ground truth.
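
The required pattern can be sketched as follows. This is a simplified, synchronous model: runPublish is a hypothetical name, the Map stands in for Firestore, and the status values are assumptions. A real worker would write the run record in a transaction and, when it finds a record stuck in 'started', reconcile with the platform before publishing again.

```javascript
// `runs` is a Map standing in for durable storage keyed by run ID.
// `publish` performs the external side effect and returns the platform ID.
function runPublish(runs, runId, publish) {
  const existing = runs.get(runId);
  if (existing && existing.status === 'succeeded') {
    // Duplicate delivery: report success without repeating the side effect.
    return { skipped: true, platformId: existing.platformId };
  }
  const attempts = existing ? existing.attempts + 1 : 1;
  // Persist the run record before calling the external platform.
  runs.set(runId, { status: 'started', attempts });
  try {
    const platformId = publish();
    // The platform ID returned by the upstream is stored as ground truth.
    runs.set(runId, { status: 'succeeded', attempts, platformId });
    return { skipped: false, platformId };
  } catch (err) {
    runs.set(runId, { status: 'failed', attempts, errorCode: err.code || 'unknown' });
    throw err; // rethrow so the queue can schedule a retry
  }
}
```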

Trace propagation

If a request enqueues a task, it MUST propagate trace context (correlation) to the task payload and/or headers.
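
A sketch of that propagation when building an HTTP-target task is shown below. buildTask is a hypothetical helper and the X-Request-Id header name is an assumption. The traceparent header is the W3C Trace Context header, and the task shape (httpRequest with a base64-encoded body) follows our understanding of the Cloud Tasks Node.js client, so verify it against the client version in use.

```javascript
// Copies correlation data from the incoming request into both the task
// headers and the task payload, so workers can log with the same IDs.
function buildTask(req, url, payload) {
  const headers = {
    'Content-Type': 'application/json',
    'X-Request-Id': req.requestId, // assumed header name
  };
  // Forward the W3C trace context header unchanged when the caller sent one.
  if (req.headers.traceparent) {
    headers.traceparent = req.headers.traceparent;
  }
  const body = JSON.stringify({ ...payload, requestId: req.requestId });
  return {
    httpRequest: {
      httpMethod: 'POST',
      url,
      headers,
      body: Buffer.from(body).toString('base64'),
    },
  };
}
```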

Queue naming and task naming

  • Queue names should be stable and descriptive (e.g. publish-x, publish-tiktok).
  • Task names should be deterministic when possible (so duplicates collapse) and include a stable identifier (submission ID / run ID).
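
A deterministic task name can be derived from stable identifiers, as sketched below. The helper names and the choice of inputs are assumptions. To the best of our knowledge, Cloud Tasks task IDs may contain only letters, digits, hyphens, and underscores (up to 500 characters), and a hashed prefix avoids sequential IDs; check both against the current Cloud Tasks documentation.

```javascript
const crypto = require('node:crypto');

// Same inputs always produce the same ID, so a duplicate enqueue is rejected
// by the queue instead of creating a second task.
function deterministicTaskId(platform, submissionId, runId) {
  const stable = `${platform}:${submissionId}:${runId}`;
  // Short hash prefix keeps IDs well distributed; the readable suffix helps debugging.
  const digest = crypto.createHash('sha256').update(stable).digest('hex').slice(0, 16);
  const readable = stable.replace(/[^A-Za-z0-9_-]/g, '-');
  return `${digest}-${readable}`.slice(0, 500);
}

// Fully qualified task name in the Cloud Tasks resource format.
function taskName(project, location, queue, taskId) {
  return `projects/${project}/locations/${location}/queues/${queue}/tasks/${taskId}`;
}
```

Including the run ID means an intentional re-run gets a new task, while an accidental duplicate enqueue of the same run collapses.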

Firestore standards

  • Single source of truth for status: all async workflows must update durable status documents that power UI + API status endpoints.
  • Prefer bounded data: avoid unbounded arrays in documents for high-churn data.
  • Transactions for concurrency: use transactions when multiple workers may race.
  • Timestamps: use serverTimestamp() for durable events; do not use client-generated timestamps for authoritative state.
  • Pagination: use cursor pagination at the API boundary; define composite indexes as needed.
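
For the pagination rule, the sketch below shows an opaque cursor that encodes the last item's sort key. The in-memory filter is a stand-in for a Firestore query ordered by (createdAt, id) with a start-after position and a limit; listPage and the cursor format are hypothetical.

```javascript
// Opaque cursor: base64url-encoded JSON of the last item's sort key.
function encodeCursor(position) {
  return Buffer.from(JSON.stringify(position)).toString('base64url');
}

function decodeCursor(cursor) {
  return JSON.parse(Buffer.from(cursor, 'base64url').toString('utf8'));
}

// `docs` must already be sorted ascending by (createdAt, id).
// createdAt values are ISO strings, which compare correctly as plain strings.
function listPage(docs, { limit, cursor }) {
  const after = cursor ? decodeCursor(cursor) : null;
  const rest = after
    ? docs.filter((d) => d.createdAt > after.createdAt
        || (d.createdAt === after.createdAt && d.id > after.id))
    : docs;
  const items = rest.slice(0, limit);
  const last = items[items.length - 1];
  // Only hand out a cursor when another page actually exists.
  const nextCursor = rest.length > limit
    ? encodeCursor({ createdAt: last.createdAt, id: last.id })
    : null;
  return { items, nextCursor };
}
```

Using the document ID as a tie-breaker keeps pages stable when several documents share a timestamp.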

Integration standards (external platform APIs)

  • Timeouts + retries: external calls should have explicit timeouts and retry/backoff policy (do not retry non-retryable errors).
  • Error mapping: map platform errors into a small set of internal error codes (auth expired, media invalid, rate limited, platform outage, unknown).
  • Token lifecycle: refresh tokens in a dedicated module; do not refresh inline everywhere.
  • Media normalization: platform-specific constraints should live near platform code; cross-platform normalization should be reusable and tested.
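
The error-mapping and retry rules can be sketched together. The status-to-code mapping below is illustrative only (each platform needs its own translation), the function names are hypothetical, and the backoff omits jitter and per-call timeouts for brevity.

```javascript
// Only these internal codes are worth retrying; everything else fails fast.
const RETRYABLE_CODES = new Set(['rate_limited', 'platform_outage']);

// Illustrative mapping from an upstream HTTP status to an internal error code.
function mapPlatformError(httpStatus) {
  if (httpStatus === 401) return 'auth_expired';
  if (httpStatus === 413 || httpStatus === 415 || httpStatus === 422) return 'media_invalid';
  if (httpStatus === 429) return 'rate_limited';
  if (httpStatus >= 500) return 'platform_outage';
  return 'unknown';
}

// Exponential backoff with a cap: 200 ms, 400 ms, 800 ms, ... up to capMs.
function backoffMs(attempt, baseMs = 200, capMs = 5000) {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

// Calls `fn` until it succeeds, the error is non-retryable, or attempts run out.
// `sleep` is injectable so tests do not have to wait.
async function callWithRetry(fn, options = {}) {
  const maxAttempts = options.maxAttempts || 3;
  const sleep = options.sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await fn(attempt);
    } catch (err) {
      const code = err.code || 'unknown';
      if (!RETRYABLE_CODES.has(code) || attempt >= maxAttempts) throw err;
      await sleep(backoffMs(attempt));
    }
  }
}
```

Inside a Cloud Tasks worker, in-process retries like this should stay short, since the queue already retries failed tasks.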

Logging & observability

  • Every HTTP request MUST have a request ID and trace context.
  • Every job execution MUST log:
    • queue + task identifiers
    • submission/run identifiers
    • target platform/account
    • outcome (success/failure) + error codes
    • timing (duration)

Redaction rules:

  • NEVER log raw secrets (API keys, OAuth tokens).
  • Avoid logging full request bodies in production; log keys, sizes, and IDs.
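
The logging and redaction rules can be combined in a small helper layer. This is a sketch under assumptions: the key pattern, the function names, and the log field names are all hypothetical, and key-name matching is a safety net, not a guarantee that no secret ever reaches a log.

```javascript
// Keys whose values must never be logged. The pattern is an illustrative guess.
const SECRET_KEY_PATTERN = /(token|secret|api[-_]?key|authorization|password)/i;

// Recursively replaces the values of secret-looking keys.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SECRET_KEY_PATTERN.test(key) ? '[REDACTED]' : redact(inner);
    }
    return out;
  }
  return value;
}

// Logs the shape of a body (keys and size) instead of the body itself.
function summarizeBody(body) {
  const json = JSON.stringify(body ?? null);
  const keys = body !== null && typeof body === 'object' ? Object.keys(body) : [];
  return { keys, bytes: Buffer.byteLength(json) };
}

// One structured line per job execution, carrying the fields listed above.
function jobLogEntry(fields) {
  return JSON.stringify(redact(fields));
}
```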

Code review checklist (use in PRs)

  • Boundaries: HTTP code doesn’t embed business rules; platform code doesn’t drive orchestration policy.
  • Validation: all external inputs validated (HTTP + tasks).
  • Idempotency: create/charge/publish paths are safe to retry.
  • Status: durable status updates exist and are testable.
  • Errors: consistent error shape and actionable codes.
  • Observability: correlated logs/traces; no secrets in logs.
  • Tests: unit tests for business logic; integration tests for critical flows (emulators where possible).

Refactor playbook (how to “fix the codebase” safely)

When cleaning existing code, do it in this order to minimize risk:

  1. Add/confirm contracts at boundaries (schemas for HTTP + task payloads).
  2. Make execution idempotent (run records + dedupe keys) before changing structure.
  3. Extract adapters (Firestore/Tasks/Storage wrappers) to reduce scattered vendor coupling.
  4. Extract domain modules (pure logic) and write unit tests.
  5. Reorganize folders once behavior is locked by tests.

LLM Notes

  • When adding new endpoints/jobs, follow the guardrails above and the repo’s existing docs:
    • engineering/api-conventions-rest.md
    • engineering/testing-strategy.md
    • engineering/observability.md
  • Prefer small, composable modules with explicit inputs/outputs.
  • Do not introduce new architectural patterns (CQRS, event sourcing, etc.) without an ADR.