Skip to main content
A long run dies at task 7 of 9 β€” laptop lid, ctrl-C, a provider hiccup. With most tools that is 7 tasks of tokens re-spent. With Nika the run’s own trace is the checkpoint: --resume re-executes the workflow, skipping every task whose recorded work is still valid (ADR-099). No daemon, no run store, no new artifact β€” the reader of a file you already have.

Record, then resume

release-notes.nika.yaml
nika: v1
workflow: release-notes
description: "Collect commits, draft notes, stamp a file."

model: mock/echo                       # swap for ollama/llama3.2:3b or any provider

tasks:
  - id: collect
    exec:
      command: "printf 'feat: resume\nfix: deadline'"

  - id: draft
    depends_on: [collect]
    infer:
      prompt: "Draft release notes from: ${{ tasks.collect.output }}"
      max_tokens: 400

  - id: stamp
    depends_on: [draft]
    exec:
      command: "printf 'stamped'"

outputs:
  notes: "${{ tasks.draft.output }}"
Any --json run is a recording (Traces & replay):
nika run release-notes.nika.yaml --json > .nika/traces/release-notes.ndjson
Resume from it:
$ nika run release-notes.nika.yaml --resume .nika/traces/release-notes.ndjson
  πŸ¦‹ nika Β· release-notes Β· 3 tasks
     permits βœ“ engine floor (no boundary declared)

  βœ”  collect  cache hit
  βœ”  draft    cache hit
  βœ”  stamp    cache hit
  ── 3/3 done Β· $0.000 Β· elapsed 0.0s ────────────────────────────

  resumed Β· 3 skipped (cache hit) Β· 0 ran live
Every skip is visible β€” a cache hit line in the render, a task_cache_hit event in the new trace, and the summary counts skipped vs live. A resumed run never pretends work happened silently.
A trace recorded by an engine version without resume keys is not an error: --resume prints a notice and runs everything live.

The skip rule Β· two hashes, both must match

A task skips iff the trace holds its completed record and
  1. the task definition hashes the same (the verb body, with:, output:, retry: / on_error:, when:, for_each: β€” as now written), and
  2. the resolved inputs hash the same (what its ${{ }} references actually resolved to β€” upstream outputs, vars, env).
Edit a prompt β†’ that task re-runs. Pass a different --var β†’ the tasks that consume it re-run, and the mismatch cascades exactly as far as the data flows β€” untouched sibling branches still skip:
$ nika run brief.nika.yaml --var topic="local-first AI" --resume .nika/traces/brief.ndjson
  resumed Β· 1 skipped (cache hit) Β· 0 ran live

$ nika run brief.nika.yaml --var topic="sovereign memory" --resume .nika/traces/brief.ndjson
  resumed Β· 0 skipped (cache hit) Β· 1 ran live
An infer: / agent: task that matches replays its recorded output β€” that is the point: crash-resume without re-spending tokens. There are no determinism rules, no replay constraints, no workflow versioning: durability is the engine’s problem, never yours. A task that does not match simply re-runs live, side effects included.

--from Β· force a re-run the hashes cannot see

Some changes are invisible to hashing: a rotated secret, external state, an infer: output you want re-rolled. --from <task_id> forces that task and its transitive downstream to re-run even on a match β€” upstream tasks still cache-hit:
$ nika run release-notes.nika.yaml --resume .nika/traces/release-notes.ndjson --from draft
  βœ”  collect  cache hit
  βœ”  draft    infer Β· mock/echo  0ms
  βœ”  stamp    exec Β· printf      4ms
  ── 3/3 done Β· $0.000 Β· elapsed 0.0s ────────────────────────────

  resumed Β· 1 skipped (cache hit) Β· 2 ran live
An unknown task id is refused before the run, like an unknown --var key.
Secrets participate by name, never by value β€” a trace carries no secret-derived material, so a rotated secret does not invalidate the cache. That is the one sharp edge: after a rotation, force the affected node with --from.

The durable human gate Β· pause and answer

A nika:prompt task blocks on a human. Under a non-interactive surface (--json Β· CI) with no usable default:, the run does not hang and does not fail β€” it pauses durably: the trace records a workflow_paused event with the prompt payload, and the process exits with code 4.
gated-ship.nika.yaml
nika: v1
workflow: gated-ship
description: "Build, ask a human, ship only on yes."

tasks:
  - id: build
    exec:
      command: "printf 'build ok'"

  - id: approve
    depends_on: [build]
    invoke:
      tool: "nika:prompt"
      args:
        message: "Ship this build to production?"

  - id: ship
    depends_on: [approve]
    when: "${{ tasks.approve.output == true }}"
    exec:
      command: "printf 'shipped'"
nika run gated-ship.nika.yaml --json > .nika/traces/gated-ship.ndjson
echo $?   # 4 Β· paused β€” not a failure, but non-zero so `&& next` stops here
The paused trace carries everything needed to pick the run back up β€” hours later, on the same machine, by a different process:
the workflow_paused event (one line of the trace Β· reformatted)
{"kind": "workflow_paused", "fields": [
  {"key": "workflow", "value": "gated-ship"},
  {"key": "task",     "value": "approve"},
  {"key": "mode",     "value": "confirm"},
  {"key": "note",     "value": "awaiting a `nika:prompt` answer β€” resume with `--resume <trace> --answer <task>=<value>`"},
  {"key": "message",  "value": "Ship this build to production?"}]}
Resume with the answer bound to the task id via --answer (repeatable):
$ nika run gated-ship.nika.yaml --resume .nika/traces/gated-ship.ndjson --answer approve=true
  βœ”  build    cache hit
  βœ”  approve  invoke Β· nika:prompt  0ms
  βœ”  ship     exec Β· printf         5ms
  ── 3/3 done Β· $0.000 Β· elapsed 0.0s ────────────────────────────

  resumed Β· 1 skipped (cache hit) Β· 2 ran live
The value follows the prompt’s mode:. confirm wants a boolean (--answer approve=true / =false β€” a refusal is a value, and the when: gate downstream decides what happens with it). input takes a string, choice one of the declared choices. Like --var, the value parses as JSON when it parses. Resumed without an answer, a --json run pauses again β€” idempotent, exit 4 every time, so a poller can retry harmlessly. Resumed interactively (a human at the terminal), the prompt simply asks.

Exit codes

ExitMeaning
0Run completed
1Run executed and a task failed unrecovered
2Validation findings β€” the file never ran
3Environment error (also: unknown --var / --from key)
4Paused on a human gate β€” resume with --resume <trace> --answer <task>=<value>