Multi-step workflows

Some apps run a single workflow across hours, days, or weeks: a sales cadence with 6 touches over 3 weeks, a customer onboarding sequence, a multi-day approval flow. The instinct from generic software engineering is to build a state machine — finite states, hardcoded transitions, scheduled timers between them.

Don’t. LinkWorld gives you something better: agent heartbeats. A scheduled tick wakes an agent, the agent sees the current state of everything, and the agent decides what to do next. No transition table, no if status == "AWAITING_REPLY". Just context + judgment.

When to reach for this

Need	Pattern
One-shot scheduled action (“send this in 2h”)	`task_create` with one-time interval
Recurring cron-style action (“daily at 9am”)	`task_create` with cron interval
Multi-step branching workflow that depends on state	Agent heartbeat (this page)
Stateless event response	Event handler (`on_inbound`, `on_chat`)
Pure DAG with no judgment calls	Workflow primitives

If the decisions between steps depend on data that wasn’t known when the workflow started — a reply came in, a deadline passed, a trigger signal appeared — you want a heartbeat, not a state machine.

Canonical solution

1. Declare a heartbeat in your manifest

agents:
  - id: outreach
    name: Outreach Agent
    is_default: true
    system_prompt: |
      You drive sales cadences for the tenant. Each tick, review
      active campaigns and decide for each lead: send next touch
      today, snooze, mark complete, or escalate.
    heartbeats:
      - slug: daily_cadence
        # The heartbeat schema requires a pre-filter (skill + tool)
        # even when invoke_always is set. _internal_app_records with
        # tool=query is the lightest no-op for kv-backed apps — the
        # query returns empty; invoke_always: true makes the agent
        # fire regardless and load state via your own tools.
        skill: _internal_app_records
        tool: query
        params:
          record_type: campaign
          status: active
        cron: "0 9 * * *"          # daily at 09:00 tenant-local
        invoke_always: true
        active_hours: "08:00-18:00"
        cooldown_minutes: 60

See the heartbeats reference for all manifest fields. The shared scheduler picks the row up; you don’t write a loop.

2. Load the current state when the agent wakes

The agent needs your records as context before it can decide anything. Use a tool call (or, for read-heavy apps, a heartbeat pre-filter that returns the rows directly) to load:

All active campaigns
All open leads in those campaigns
Touches sent so far (count, timestamps)
Replies received since last tick
Trigger events (e.g. new LinkedIn post by a lead)

# In a tool the agent calls (heartbeats also call typed tools, not
# raw ctx.kv — keep the persistence layer in one place).
async def load_cadence_state(ctx) -> dict:
    # App-private records live in ctx.kv under `{record_type}:{id}`
    # keys. List by prefix, filter in code for value-matches.
    campaign_items = await ctx.kv.list(
        prefix="campaign:", limit=500, include_values=True,
    )
    campaigns = [
        {"id": item["key"].split(":", 1)[1], "data": item["value"]}
        for item in campaign_items
        if isinstance(item.get("value"), dict)
        and item["value"].get("status") == "active"
    ]
    campaign_ids = {c["id"] for c in campaigns}

    lead_items = await ctx.kv.list(
        prefix="lead:", limit=2000, include_values=True,
    )
    leads = [
        {"id": item["key"].split(":", 1)[1], "data": item["value"]}
        for item in lead_items
        if isinstance(item.get("value"), dict)
        and item["value"].get("campaign_id") in campaign_ids
    ]
    return {"campaigns": campaigns, "leads": leads}

Persistence note. App-private records live in ctx.kv for marketplace apps. Keys follow a {record_type}:{id} convention so ctx.kv.list(prefix=...) reads back all records of a type; equality matching on arbitrary fields happens in code. For hot lookups, denormalize: store a second key per index dimension you need (lead_by_email:<email> → lead_id) and read it with ctx.kv.get directly.
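The denormalized index from the persistence note might look like the sketch below. `InMemoryKV` is a hypothetical stand-in so the example runs on its own; this page does not show the write method of the real `ctx.kv`, so the `set` name and signature are assumptions, as are the `save_lead` and `find_lead_by_email` helpers.

```python
import asyncio
from typing import Any, Optional


class InMemoryKV:
    """Hypothetical stand-in for ctx.kv so the sketch runs locally."""

    def __init__(self) -> None:
        self._data = {}

    async def set(self, key: str, value: Any) -> None:  # assumed method name
        self._data[key] = value

    async def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)


async def save_lead(kv: InMemoryKV, lead_id: str, lead: dict) -> None:
    # Primary record under the {record_type}:{id} convention.
    await kv.set(f"lead:{lead_id}", lead)
    # Secondary index key: one extra write buys a direct lookup by email.
    await kv.set(f"lead_by_email:{lead['email'].lower()}", lead_id)


async def find_lead_by_email(kv: InMemoryKV, email: str) -> Optional[dict]:
    lead_id = await kv.get(f"lead_by_email:{email.lower()}")
    if lead_id is None:
        return None
    return await kv.get(f"lead:{lead_id}")


async def demo() -> Optional[dict]:
    kv = InMemoryKV()
    await save_lead(kv, "42", {"email": "Ana@Example.com", "campaign_id": "c1"})
    # Lookup is case-insensitive because the index key is lowercased.
    return await find_lead_by_email(kv, "ana@example.com")


found = asyncio.run(demo())
```

The two writes in `save_lead` are not atomic in this sketch, so an index key can briefly point at a record that is missing or stale; a lookup should treat a dangling index entry as "not found".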

3. Let the agent decide per lead

The agent’s system prompt should encode the policy — not the state machine. Something like:

For each lead in active campaigns:

1. If a positive reply came in since last tick → mark
   campaign.stage=replied, stop sending, draft a meeting
   request for human approval.
2. If a negative reply / opt-out → mark stage=opted_out,
   add to suppression list, stop.
3. If an OOO auto-reply → snooze 7 days.
4. If no reply, fewer than 5 touches sent, and last touch was
   3+ days ago → send next touch from the cadence template,
   personalized with any new trigger signals you find.
5. If 5 touches sent and no reply → mark stage=cold, stop.

Guardrails:
- Never send more than 20 emails today (across all campaigns)
- Never send two touches to the same lead within 48h
- Always include at least one concrete fact about the lead
  in personalized body

The agent reads state, follows the policy, calls tools (email_send, odoo_update_record, your app’s typed @app.tools that write to ctx.kv), writes a one-line summary back, and exits.
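Guardrails in a prompt are instructions, so one option is to repeat the hard limits inside the send tool as well. The check below is a sketch of that idea, not a LinkWorld API: `can_send_touch` and its arguments are hypothetical names, and the two limits mirror the example policy above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

DAILY_SEND_CAP = 20            # "never send more than 20 emails today"
MIN_GAP = timedelta(hours=48)  # "never send two touches ... within 48h"


def can_send_touch(
    sent_today: int,
    last_touch_at: Optional[datetime],
    now: datetime,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for one proposed touch to one lead."""
    if sent_today >= DAILY_SEND_CAP:
        return False, "daily cap reached"
    if last_touch_at is not None and now - last_touch_at < MIN_GAP:
        return False, "last touch was under 48h ago"
    return True, "ok"


now = datetime(2024, 1, 10, 9, 0, tzinfo=timezone.utc)
print(can_send_touch(3, now - timedelta(hours=30), now))  # (False, 'last touch was under 48h ago')
print(can_send_touch(20, None, now))                      # (False, 'daily cap reached')
print(can_send_touch(3, None, now))                       # (True, 'ok')
```

Returning the reason as text lets the agent include it in its one-line summary instead of retrying the send.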

4. Audit + suppression markers

Follow the LinkWorld heartbeat conventions:

Return HEARTBEAT_OK (in the agent’s text reply) when the tick was a no-op — the platform suppresses the run from “activity” surfaces.
Use <escalate-to-vision/> to wake the agent’s vision-loop (full reasoning + longer-running plan) for substantive ticks.

Without these markers, every tick shows up as activity noise.
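When testing an app locally, it can help to assert which marker a reply carries. The helper below is a hypothetical test utility, not the platform’s implementation: how LinkWorld itself parses replies is not described on this page, so plain substring matching is an assumption.

```python
def classify_tick_reply(reply: str) -> str:
    """Classify an agent's text reply by the heartbeat marker it contains."""
    if "<escalate-to-vision/>" in reply:
        return "escalate"   # substantive tick: hand off to the vision loop
    if "HEARTBEAT_OK" in reply:
        return "no_op"      # nothing to do: suppressed from activity surfaces
    return "activity"       # no marker: shows up as regular activity
```

A test for the cadence prompt could then feed the agent a state with no due leads and check that the reply classifies as `no_op`.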

❌ Anti-patterns

Don’t: write a state machine in your handler

# ❌ DON'T do this
class CadenceStateMachine:
    def transition(self, lead, event):
        if lead.status == "PENDING":
            if event == "SEND_TICK":
                self.send_email(lead, touch=1)
                lead.status = "TOUCH_1_SENT"
        elif lead.status == "TOUCH_1_SENT":
            if event == "REPLY":
                ...  # 200 lines of branching follow

Why it’s wrong:

Every new edge case becomes a new if branch + a new state name.
Adding a behavior (e.g. “react to a LinkedIn post by the lead”) means revisiting the entire state graph.
Bugs are silent: a misnamed state string causes leads to silently fall out of the flow.
No reasoning capacity. The lead’s situation might have changed in ways your states don’t model — the machine sends the next email anyway.

The agent-heartbeat pattern handles all of these by deferring to an LLM that sees the whole state on each tick and writes its decisions back as data, not enum values.

Don’t: use `while True` polling loops

# ❌ DON'T do this
async def on_install(ctx):
    while True:
        await check_and_send_cadence(ctx)
        await asyncio.sleep(3600)

while True loops don’t survive container restarts, can’t be cancelled by uninstall, can’t be observed by the platform’s scheduler, and tie up a worker while they sleep. They’re a sign that you’ve recreated the scheduler. Declare heartbeats: in your manifest instead.

Don’t: chain `task_create` calls as a fake state machine

# ❌ DON'T do this
await ctx.tools.call("task_create", {
    "schedule": {"interval": "3d"},
    "task_type": "send_touch_2",
    "params": {"lead_id": lead_id},
})
# … 3 days later, the task fires and schedules touch_3 …

This is a state machine with extra steps. You’ve moved the transition table into task names, and now a reply that arrives between scheduled tasks can’t cancel the next touch — the scheduled task fires anyway, and you have to add cancellation plumbing for every transition.

The heartbeat sees the reply at the next tick and naturally decides not to send.
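This works because each tick loads the replies that arrived since the previous one. A sketch of that filter follows, assuming replies are stored as dicts with a `lead_id` and an ISO 8601 `received_at` field, and that your app persists a `last_tick_at` timestamp; all three names are hypothetical.

```python
from datetime import datetime


def replies_since(replies: list, last_tick_at: datetime) -> dict:
    """Group replies newer than the last tick by lead_id."""
    fresh = {}
    for reply in replies:
        received = datetime.fromisoformat(reply["received_at"])
        if received > last_tick_at:
            fresh.setdefault(reply["lead_id"], []).append(reply)
    return fresh


last_tick = datetime.fromisoformat("2024-01-09T09:00:00+00:00")
replies = [
    {"lead_id": "42", "received_at": "2024-01-09T15:30:00+00:00", "body": "Let's talk"},
    {"lead_id": "7", "received_at": "2024-01-08T11:00:00+00:00", "body": "Old reply"},
]
# Only lead 42 has a reply newer than the last tick, so only that lead
# is flagged for the agent to take out of the send path.
fresh = replies_since(replies, last_tick)
```

Timestamps here carry an explicit UTC offset so the comparison never mixes naive and aware datetimes.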

Don’t: skip the policy prompt and hope the agent figures it out

# ❌ DON'T do this
system_prompt: "You manage sales cadences. Do the right thing."

The agent needs the policy spelled out — when to send, when to stop, what counts as a positive reply, what guardrails are hard limits. Without this, behavior drifts across model versions and between tenants. Write the policy as you would for a new SDR joining the team.

When this doesn’t apply

Truly deterministic workflows. If every step is the same for every input (e.g. “convert this PDF, save to drive, email the link”), use workflow primitives — they’re cheaper than an LLM tick and have first-class retry + parallelism.
High-frequency events. Heartbeats fire on a cron schedule with a minimum interval of about 15 seconds. If you need faster reactions (sub-second, for example), use a stream consumer or webhook handler, not a heartbeat.
Tenant has no LLM budget. Each heartbeat tick costs an LLM call. For very large tenants (thousands of records to evaluate), rely on the heartbeat’s skill/tool pre-filter and leave invoke_always off, so the LLM only wakes when the cheap query has already found work to do.
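A pre-filter of that kind only needs a cheap, deterministic check. As a sketch, assume each lead dict carries hypothetical `stage`, `last_touch_at` (ISO 8601), and `has_new_reply` fields; a query tool could then return only the leads that might need a decision, and an empty result would leave the LLM asleep.

```python
from datetime import datetime, timedelta

# Stages the example policy treats as finished.
TERMINAL_STAGES = {"replied", "opted_out", "cold"}


def leads_needing_attention(leads: list, now: datetime, gap_days: int = 3) -> list:
    """Cheap pre-filter: keep leads with a new reply or a touch that is due."""
    due = []
    for lead in leads:
        if lead.get("stage") in TERMINAL_STAGES:
            continue
        last = lead.get("last_touch_at")
        # A lead that was never touched counts as due.
        touch_due = last is None or (
            now - datetime.fromisoformat(last) >= timedelta(days=gap_days)
        )
        if lead.get("has_new_reply") or touch_due:
            due.append(lead)
    return due
```

The filter is deliberately generous: it decides only whether the agent should wake, and leaves the choice of action per lead to the policy prompt.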

Related

Heartbeats reference: full manifest schema, pre-filter mechanics, scheduler internals.
Workflow primitives: for deterministic DAGs without judgment calls.
Reacting to emails: the event-driven cousin. When something arrives from outside, use on_inbound, not a heartbeat.