The Engine API is the service-to-service surface used by AI agents and
external trigger sources to drive flows. It sits at /api/v1/engine/* and
is intentionally separate from the public chat widget API documented under
API reference — different audience, different auth, different
rate-limit budget.
Four endpoints are implemented:
| Method | Path | Purpose |
|---|---|---|
| POST | /api/v1/engine/events | Emit a domain event to a tenant; fans out to matching trigger.event flows. |
| GET | /api/v1/engine/intents | List a tenant’s published trigger.chat intent catalog (ETag-cached). |
| POST | /api/v1/engine/triggers/chat | Start a chat-intent execution from an agent. |
| POST | /api/v1/engine/executions/{executionId}/resume | Resume a paused execution with user input. |
The triggers/chat and resume endpoints both return the same
ChatReply envelope, so an agent needs exactly one
response parser for the conversational surface.
Authentication
Every request must carry a bearer token in the Authorization header:
POST /api/v1/engine/triggers/chat HTTP/1.1
Host: flow.example.com
Authorization: Bearer <ENGINE_API_TOKEN>
Content-Type: application/json
The token is a deploy-wide root credential in V1 — set via the
ENGINE_API_TOKEN environment variable and rotated by redeploying with a
new value. The engine_api firewall validates it and grants the synthetic
ROLE_ENGINE_API role; the same token also gates /metrics and the
/api/v1/insights/* and /api/v1/training/* surfaces.
Treat the token like a database password
ENGINE_API_TOKEN is not bound to a tenant: any holder can act for
any tenant via the body tenant_id (see Tenant identity).
Store it in your secrets manager, hand it only to services that
legitimately drive flows, and rotate immediately on suspected leak. As a
detection control, every body-selected tenant is written to the audit log
under the engine.api.call action, so cross-tenant use is visible after
the fact.
A missing or invalid bearer token returns 401 before the request reaches
the controller.
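As a minimal sketch of the header layout above, the following builds (but does not send) an authenticated request with Python's standard library. The `https://` scheme, the `build_engine_request` helper name, and the placeholder token and tenant values are assumptions for illustration; only the host, path, and header names come from the example above.

```python
import json
import urllib.request

# Host taken from the example above; the https scheme is an assumption.
BASE_URL = "https://flow.example.com"


def build_engine_request(path: str, token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST to the Engine API."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_engine_request(
    "/api/v1/engine/triggers/chat",
    token="example-token",  # placeholder; load the real value from your secrets manager
    body={"tenant_id": "TENANT_UUID", "intent_name": "track_order"},
)
```

Sending the request (for example with `urllib.request.urlopen(req)`) is left out so the sketch stays free of network calls.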
Roadmap: per-tenant tokens
Per-tenant tokens (scoped and revocable from the admin UI) are planned. When
they land, the firewall will validate the body tenant_id against the
token’s tenant binding, and a mismatch will return 403. This page will
document issuance and scope rules at that point.
Tenant identity
The request body carries tenant_id explicitly (the bearer token is
deploy-wide, not tenant-scoped). The engine sets it on TenantContext
before any database read so row-level security (RLS) scopes apply for the request thread. The
intents endpoint reads tenant_id from the query string instead, since
it is a GET.
tenant_id is required on every endpoint; omitting it returns
400 invalid_input.
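The split between body and query string can be captured in one small helper. This is a sketch only: `place_tenant_id` is a hypothetical name, and `TENANT_UUID` stands in for a real tenant id.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode


def place_tenant_id(
    method: str, path: str, tenant_id: str, body: Optional[dict] = None
) -> Tuple[str, Optional[dict]]:
    """Return (url_path, json_body) with tenant_id where the endpoint reads it."""
    if method.upper() == "GET":
        # The intents endpoint is a GET, so tenant_id goes in the query string.
        return f"{path}?{urlencode({'tenant_id': tenant_id})}", None
    # Every other endpoint reads tenant_id from the JSON body.
    return path, {"tenant_id": tenant_id, **(body or {})}


get_path, get_body = place_tenant_id("GET", "/api/v1/engine/intents", "TENANT_UUID")
post_path, post_body = place_tenant_id(
    "POST", "/api/v1/engine/events", "TENANT_UUID", {"event_name": "order.paid"}
)
```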
Rate limits
Per-IP cap: 300 requests / minute (sliding window) across all
/api/v1/engine/* endpoints from one source IP. The cap applies even though
the surface requires a bearer token — a leaked token shouldn’t enable an
unlimited firehose. Bursts above the cap return 429:
{ "error": "rate_limited", "message": "Too many requests" }
with a Retry-After header (integer seconds) — the same convention as the
public chat API (see Errors). The cap is
higher than the widget surface because agent traffic is server-side and
typically bursts in tighter windows. If the limiter backend (Redis in
production) is unavailable, the engine fails open rather than rejecting
legitimate traffic.
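A client can honor the 429 convention with a helper along these lines. The function name and the one-second fallback are assumptions, and the sketch expects a plain dict whose key is spelled exactly `Retry-After`.

```python
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], fallback: float = 1.0
) -> Optional[float]:
    """Return how long to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    raw = headers.get("Retry-After")
    try:
        # Retry-After is documented as integer seconds.
        return max(0.0, float(int(raw)))
    except (TypeError, ValueError):
        # Header missing or malformed: fall back to a conservative default.
        return fallback
```

A caller would sleep for the returned number of seconds and then retry the same request.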
Variables
triggers/chat and resume accept an optional top-level variables map —
conversation-scoped values that the engine merges onto the conversation row
and seeds into the run as {{ vars.* }}, visible from the first node.
Conversation scope is the lower-precedence layer, so a flow’s own
set_variable can override a key during the run. When you send variables
on a later turn, the new keys are merged into the persisted set; omitting
variables reuses the previously stored values.
The map is validated before the run starts. A violation returns
422 invalid_input with the offending key in details.key:
| Constraint | Limit |
|---|---|
| Type | Must be a JSON object (not an array). {} and absent both mean “no variables”. |
| Key count | At most 50 keys. |
| Key format | snake_case, 1–64 chars, [a-z][a-z0-9_]* (leading letter required). |
| Value shape | Scalar, null, or nested array. Objects and resources are rejected. |
| Nesting depth | Arrays may nest at most 4 levels deep. |
| Serialized size | At most 4096 bytes (compact JSON). |
"variables": {
"plan": "pro",
"seats": 5,
"preferences": { "theme": "dark" }
}
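Checking the map locally before sending can save a round trip. The sketch below mirrors the limits in the table under several assumptions: the table does not make clear whether nested JSON objects count as "nested arrays", so this version follows the example above and treats both lists and dicts as nested containers; a flat container counts as depth 1; and the size is measured on Python's compact JSON output, which may differ by a few bytes from the engine's serializer. The server remains the authority.

```python
import json
import re

KEY_RE = re.compile(r"[a-z][a-z0-9_]*")
MAX_KEYS, MAX_KEY_LEN, MAX_DEPTH, MAX_BYTES = 50, 64, 4, 4096


def _depth(value) -> int:
    """Nesting depth of a value: scalars are 0, a flat container is 1."""
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        return 0
    return 1 + max((_depth(child) for child in children), default=0)


def check_variables(variables):
    """Return None if the map passes the documented limits, else a reason string."""
    if variables is None:
        return None  # absent means "no variables"
    if not isinstance(variables, dict):
        return "variables must be a JSON object"
    if len(variables) > MAX_KEYS:
        return "too many keys"
    for key, value in variables.items():
        if len(key) > MAX_KEY_LEN or not KEY_RE.fullmatch(key):
            return f"bad key: {key}"
        if _depth(value) > MAX_DEPTH:
            return f"too deep: {key}"
    size = len(json.dumps(variables, separators=(",", ":")).encode("utf-8"))
    if size > MAX_BYTES:
        return "serialized size over 4096 bytes"
    return None
```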
ChatReply envelope
triggers/chat and resume return the same envelope (HTTP 200):
{
"execution_id": "0193f8a1-…",
"conversation_id": "0193f8a1-…",
"status": "waiting_input",
"blocks": [
{
"id": "blk_1",
"type": "message",
"payload": { "text": "How can I help?", "role": "agent", "format": "plain" },
"meta": { "source_node_id": "node-ask" }
}
],
"expected_input": {
"type": "form_submission",
"block_id": "blk_2",
"schema": { "...": "JSON-Schema-shaped object" },
"form_id": "0193f8a1-…"
},
"metadata": {
"wait_token": "…",
"wait_expires_at": "2026-05-28T18:30:00Z"
},
"token_usage": {
"prompt_tokens": 120,
"completion_tokens": 48,
"total_tokens": 168,
"model": "gpt-4o",
"cost_micros": 2100,
"cost_usd": 0.0021
}
}
| Field | Notes |
|---|---|
| execution_id | The execution this turn advanced. Pass it to /resume. |
| conversation_id | Resolved conversation; pass it back on the next triggers/chat to continue the dialogue. |
| status | Current state-machine place; see below. |
| blocks | Renderable units accumulated since the last reply. The engine describes what to render; your agent decides how. See Blocks. |
| expected_input | What to POST to /resume next; null when the execution is terminal. type is one of form_submission, button_choice, file_upload, none; schema is the JSON-Schema-shaped object to validate against locally (the engine re-validates server-side). |
| metadata.wait_token | Present only when status is waiting_input; binds the next /resume to this exact pause. |
| metadata.wait_expires_at | ISO-8601 instant the wait expires, when set. |
| token_usage | LLM token usage and estimated cost for the turn, or null when no tokens were consumed. model is set only when the whole turn used a single model. cost_micros is micro-USD (1e-6 USD); cost_usd is the same value expressed in US dollars. |
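
To track spend across turns, one option is to sum the integer cost_micros values and convert to dollars only for display, which avoids accumulating floating-point error. A minimal sketch, assuming each reply is the decoded envelope; the helper names are hypothetical.

```python
def add_usage(total_micros: int, reply: dict) -> int:
    """Add one turn's cost to a running total, in micro-USD."""
    usage = reply.get("token_usage")
    if usage is None:
        # token_usage is null when the turn consumed no tokens.
        return total_micros
    return total_micros + usage["cost_micros"]


def micros_to_usd(micros: int) -> float:
    """Convert micro-USD (1e-6 USD) to US dollars."""
    return micros / 1_000_000


replies = [
    {"token_usage": {"cost_micros": 2100}},
    {"token_usage": None},
    {"token_usage": {"cost_micros": 900}},
]
total = 0
for reply in replies:
    total = add_usage(total, reply)
```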
status values:
| Status | Meaning |
|---|---|
| waiting_input | Paused at a UI wait point; expected_input + wait_token are set. Call /resume. |
| waiting_time | Paused on a timer; the engine resumes it on its own. |
| completed | Flow finished successfully. |
| failed | Flow errored. |
| aborted | Execution was aborted. |
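
Because both conversational endpoints return the same envelope, an agent can branch on status in one place. The sketch below is one way to do that; the action labels ("resume", "wait", "done") are hypothetical names, not engine values.

```python
TERMINAL_STATUSES = ("completed", "failed", "aborted")


def next_action(reply: dict) -> str:
    """Map a ChatReply status to what the agent should do next."""
    status = reply["status"]
    if status == "waiting_input":
        return "resume"  # collect input, then POST to /resume with the wait_token
    if status == "waiting_time":
        return "wait"  # the engine resumes on its own; nothing to send
    if status in TERMINAL_STATUSES:
        return "done"
    raise ValueError(f"unknown status: {status}")
```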
POST /api/v1/engine/events
Emit a domain event to a tenant. The engine finds every active,
published event subscription matching event_name, evaluates each
subscription’s optional filter expression against the event payload (exposed
as $event.*), and starts a fresh execution per matching subscription. A
broken filter fails closed (the subscription is skipped).
POST /api/v1/engine/events HTTP/1.1
Authorization: Bearer <ENGINE_API_TOKEN>
Content-Type: application/json
{
"tenant_id": "0193f8a1-…",
"event_name": "order.paid",
"data": { "order_id": "4567", "total": 129.00 },
"idempotency_key": "order-4567-paid"
}
| Field | Required | Notes |
|---|---|---|
| tenant_id | yes | Tenant UUID. |
| event_name | yes | Subscription key to fan out on. |
| data | no | Free-form event payload; exposed to subscription filters as $event.*. Defaults to {}. |
| idempotency_key | no | Body field (not a header for this endpoint). Combined with each subscription id to dedupe the per-subscription executions it starts. |
Response — 202 Accepted:
{
"event_id": "evt_0193f8a1-…",
"subscriptions_matched": 2,
"execution_ids": ["0193f8a1-…", "0193f8a2-…"]
}
subscriptions_matched and execution_ids let the caller confirm the
fan-out happened. A missing tenant_id or event_name returns
400 invalid_input.
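A small builder keeps the required fields and the body-level idempotency key in one place. This is a sketch with a hypothetical helper name and placeholder tenant id; it checks locally for the two required fields that would otherwise produce 400 invalid_input.

```python
from typing import Optional


def build_event_body(
    tenant_id: str,
    event_name: str,
    data: Optional[dict] = None,
    idempotency_key: Optional[str] = None,
) -> dict:
    """Assemble the JSON body for POST /api/v1/engine/events."""
    if not tenant_id or not event_name:
        raise ValueError("tenant_id and event_name are required")
    body = {"tenant_id": tenant_id, "event_name": event_name, "data": data or {}}
    if idempotency_key is not None:
        # For this endpoint the key travels in the body, not in a header.
        body["idempotency_key"] = idempotency_key
    return body


body = build_event_body(
    "TENANT_UUID",
    "order.paid",
    data={"order_id": "4567", "total": 129.00},
    idempotency_key="order-4567-paid",
)
```

Deriving the key from the business event (here, the order id and state) means a retried delivery of the same event carries the same key.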
GET /api/v1/engine/intents
List a tenant’s published trigger.chat intent catalog — the menu an agent
matches a user utterance against before calling triggers/chat. The catalog
is ETag-cached so agents can poll cheaply.
GET /api/v1/engine/intents?tenant_id=0193f8a1-… HTTP/1.1
Authorization: Bearer <ENGINE_API_TOKEN>
If-None-Match: W/"a1b2c3d4e5f60718"
tenant_id is read from the query string here. Send the last ETag you
saw in If-None-Match; if the catalog is unchanged the engine returns
304 Not Modified with no body. Otherwise it returns 200:
{
"intents": [
{
"name": "track_order",
"description": "Look up order status",
"examples": ["where is my order", "track my package"],
"required_entities": ["order_id"],
"priority": 10,
"flow_id": "0193f8a1-…",
"flow_version": 4,
"display_label": "Track an order",
"subtitle": "Order status & shipping",
"icon": { "kind": "lucide", "value": "package" },
"accent_color": "#3b82f6",
"style_variant": "solid",
"is_pinned": true
}
],
"etag": "W/\"a1b2c3d4e5f60718\"",
"cache_max_age_seconds": 300
}
| Field | Notes |
|---|---|
| intents[].name | Intent name; pass this as intent_name to triggers/chat. |
| intents[].description | Authored description, or "". |
| intents[].examples | Example utterances for matching. |
| intents[].required_entities | Entities the agent should extract before triggering. |
| intents[].priority | Integer tie-breaker for overlapping matches. |
| intents[].flow_id / flow_version | The published flow + version behind the intent. |
| intents[].display_label / subtitle / icon / accent_color / style_variant / is_pinned | Presentation hints for rendering an intent picker. icon is null or { kind, value }. |
| etag | Weak ETag over the catalog; echo it as If-None-Match on the next poll. |
| cache_max_age_seconds | Suggested client cache window: 300 seconds. |
A missing tenant_id returns 400 invalid_input.
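The polling loop described above can be sketched as a small cache that remembers the last ETag and keeps the previous catalog on 304. The class name is hypothetical and the HTTP call itself is left to the caller, so the sketch only shows the conditional-request bookkeeping.

```python
class IntentCatalogCache:
    """Remember the last catalog and its ETag between polls."""

    def __init__(self):
        self.etag = None
        self.intents = []

    def request_headers(self, token: str) -> dict:
        """Headers for the next GET /api/v1/engine/intents poll."""
        headers = {"Authorization": f"Bearer {token}"}
        if self.etag is not None:
            headers["If-None-Match"] = self.etag
        return headers

    def update(self, status: int, body=None) -> list:
        """Apply a poll result and return the current catalog."""
        if status == 304:
            return self.intents  # unchanged; keep what we already have
        if status == 200:
            self.etag = body["etag"]
            self.intents = body["intents"]
            return self.intents
        raise RuntimeError(f"unexpected status {status}")
```

A caller would wait roughly cache_max_age_seconds between polls, since that is the suggested client cache window.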
POST /api/v1/engine/triggers/chat
Start a chat-intent execution. The engine picks the published flow version
that owns intent_name, creates a fresh execution, drives the synchronous
step loop until the flow pauses or terminates, and returns the
ChatReply envelope.
POST /api/v1/engine/triggers/chat HTTP/1.1
Authorization: Bearer <ENGINE_API_TOKEN>
Content-Type: application/json
Idempotency-Key: 7c1a…-once
{
"tenant_id": "0193f8a1-…",
"intent_name": "track_order",
"conversation_id": "0193f8a1-…",
"context": {
"recent_messages": [{ "role": "user", "text": "where is my order?" }],
"extracted_entities": { "order_id": "4567" }
},
"variables": { "plan": "pro" }
}
| Field | Required | Notes |
|---|---|---|
| tenant_id | yes | Tenant UUID. |
| intent_name | yes | Must match a published trigger.chat intent (see GET /intents). |
| conversation_id | no | Continue an existing conversation; omit to start a new one (the engine returns the resolved id). |
| context | no | Agent-supplied chat context. context.recent_messages[] with { role: "user", text } are recorded as user turns; context.extracted_entities is forwarded to the trigger. Defaults to {}. |
| variables | no | Conversation-scoped variables; see Variables. |
Idempotency is keyed by the Idempotency-Key header on this
endpoint. With a key set, a replay carrying the same payload returns the
cached response; a replay with the same key but a different payload
returns 409 idempotency_conflict. Keys are scoped per tenant and cached for
24 hours.
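The replay rules can be illustrated with a toy in-memory model. This is not the engine's implementation: how the engine fingerprints a payload is not documented here, so hashing the sorted JSON is an assumption, as are the class name and the injected clock.

```python
import hashlib
import json
import time


class IdempotencyStore:
    """Toy model: per-tenant keys, same payload replays, 24-hour TTL."""

    TTL_SECONDS = 24 * 60 * 60

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # (tenant_id, key) -> (payload_hash, response, stored_at)

    def handle(self, tenant_id: str, key: str, payload: dict, run):
        """Return (status, body), calling run(payload) only for new work."""
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")
        ).hexdigest()
        now = self._clock()
        entry = self._entries.get((tenant_id, key))
        if entry is not None and now - entry[2] < self.TTL_SECONDS:
            if entry[0] != digest:
                return 409, {"error": "idempotency_conflict"}
            return 200, entry[1]  # same key, same payload: cached response
        response = run(payload)
        self._entries[(tenant_id, key)] = (digest, response, now)
        return 200, response
```

On the client side the practical consequence is to generate one key per logical attempt and reuse it unchanged on every retry of that attempt.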
Errors:
| Status | error | When |
|---|---|---|
| 400 | invalid_input | Non-object body, or missing tenant_id / intent_name. |
| 404 | intent_not_matched | No published flow matches intent_name. details.available_intents lists the tenant’s intent names. |
| 409 | idempotency_conflict | Idempotency-Key reused with a different payload. |
| 422 | invalid_input | variables failed validation (details.key names the offender). |
POST /api/v1/engine/executions/{executionId}/resume
Resume an execution paused on user input. The engine validates the
wait_token, validates the submitted input.values against the waiting
node’s schema, applies it, routes past the wait point, drives the step loop,
and returns the ChatReply envelope.
{executionId} must be a 36-character UUID (route requirement).
POST /api/v1/engine/executions/0193f8a1-…/resume HTTP/1.1
Authorization: Bearer <ENGINE_API_TOKEN>
Content-Type: application/json
Idempotency-Key: 7c1a…-resume
{
"tenant_id": "0193f8a1-…",
"wait_token": "…",
"input": { "values": { "email": "user@example.com" } },
"context": {},
"variables": { "seats": 5 }
}
| Field | Required | Notes |
|---|---|---|
| tenant_id | yes | Tenant UUID. |
| wait_token | yes | The metadata.wait_token from the envelope that paused this execution. |
| input.values | no | The submission, validated against the waiting node’s expected_input.schema. Defaults to {}. |
| context | no | Optional agent-side context. Defaults to {}. |
| variables | no | Conversation-scoped variables; see Variables. |
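
Everything a resume call needs, apart from the tenant and the user's submission, is already in the envelope that paused the execution. A minimal sketch, with a hypothetical helper name and placeholder ids:

```python
def build_resume_request(reply: dict, tenant_id: str, values: dict):
    """Return (path, body) for resuming the execution described by reply."""
    if reply.get("status") != "waiting_input":
        # Only waiting_input envelopes carry a wait_token to resume with.
        raise ValueError("execution is not waiting for input")
    path = f"/api/v1/engine/executions/{reply['execution_id']}/resume"
    body = {
        "tenant_id": tenant_id,
        "wait_token": reply["metadata"]["wait_token"],
        "input": {"values": values},
    }
    return path, body


paused = {
    "execution_id": "EXECUTION_UUID",
    "status": "waiting_input",
    "metadata": {"wait_token": "WAIT_TOKEN"},
}
path, body = build_resume_request(paused, "TENANT_UUID", {"email": "user@example.com"})
```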
Idempotency works identically to triggers/chat: keyed by the
Idempotency-Key header, same-payload replays return the cached
response, key reuse with a different payload returns 409, 24-hour TTL.
Errors:
| Status | error | When |
|---|---|---|
| 400 | invalid_input | Non-object body, or missing tenant_id. |
| 404 | execution_not_found | No execution with that id for the tenant. |
| 409 | invalid_wait_token | Missing wait_token, token mismatch, execution not in waiting_input, or a concurrent resume was detected. |
| 409 | idempotency_conflict | Idempotency-Key reused with a different payload. |
| 410 | execution_aborted | The execution was aborted and can’t be resumed. |
| 422 | invalid_input | Submitted input.values failed schema validation (details.validation_errors), or variables failed validation. |
Conversions
POST /api/v1/insights/conversions records a deferred conversion — a goal
reached outside the chat, e.g. a sale completed on your site after the
conversation ended — and attributes it back to the originating chat. It lives
on the Insights surface (also gated by ENGINE_API_TOKEN) rather than under
/api/v1/engine/, so the engine per-IP rate limit does not
apply to it.
POST /api/v1/insights/conversions HTTP/1.1
Authorization: Bearer <ENGINE_API_TOKEN>
Content-Type: application/json
{
"tenant_id": "0193f8a1-…",
"goal": "purchase",
"conversation_id": "0193f8a1-…",
"external_ref": "order-4567",
"occurred_at": "2026-05-28T18:00:00Z",
"props": { "customer_id": "u-42", "value": 129.00, "order_ref": "4567" }
}
| Field | Required | Notes |
|---|---|---|
| tenant_id | yes | Tenant UUID (the bearer token is deploy-wide). |
| goal | yes | An active goal code (authored under Insights → Goals). |
| conversation_id | no | Attribute directly to this conversation. |
| external_ref | no | Idempotency key; a replay returns the existing completion. |
| occurred_at | no | ISO-8601; defaults to now. |
| props | no | Free-form conversion payload (stored as-is). customer_id is read from it for window attribution when no conversation_id is given. The conversion is domain-agnostic: there are no built-in value/currency fields. |
The numeric value (if any) is decided by the goal, not by the request: a
goal with value_mode: from_property reads props[<value_property>] (you pick
the key — value, amount, points, …); a fixed goal uses its own amount; a
none goal records no value. Currency, when relevant, comes from the goal’s own
setting. This is the same value model in-chat matching uses, so server and
in-chat conversions behave identically.
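The value model reads more easily as a lookup on the goal's mode. The sketch below is an illustration, not the engine's code: value_mode and value_property are the names used above, while the `amount` key for a fixed goal and the dict shape of a goal are assumptions.

```python
def resolve_value(goal: dict, props: dict):
    """Return the numeric value a conversion would record, or None."""
    mode = goal["value_mode"]
    if mode == "from_property":
        # The goal names which props key holds the value (value, amount, points, ...).
        return props.get(goal["value_property"])
    if mode == "fixed":
        return goal["amount"]  # hypothetical key for the goal's own amount
    if mode == "none":
        return None
    raise ValueError(f"unknown value_mode: {mode}")


props = {"customer_id": "u-42", "value": 129.00, "order_ref": "4567"}
```

With the props from the request above, a from_property goal pointing at `value` records 129.0, while a fixed goal ignores props entirely.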
Responses:
- 201 { "completion_id", "attributed": true|false, "conversation_id": "…|null" }: recorded.
- 200 { "deduped": true, "completion_id" }: external_ref already seen.
- 404 { "error": "unknown_goal" }: no active goal with that code.
- 401: missing or invalid bearer token.
- 422: malformed conversation_id.
In-chat goals need no API call — they’re matched automatically from the
telemetry events the widget sends.
See also
- API reference — interactive OpenAPI/Redoc reference for the public chat API.
- Webhooks — inbound webhook intake that can drive
trigger.event flows, and outbound delivery.
- Blocks — the renderable block types returned in
blocks[].
- Errors — shared error envelope, retry algorithm, and rate-limit handling.