nika:image_generate treats images as workflow citizens: the same declared
permits: boundary that gates file writes gates every save, real spend lands
in the run ledger, and provenance is structural — in the manifest beside the
asset and inside the PNG itself.
The five providers
| provider | default model | wire | what to know |
|---|---|---|---|
| local | stablediffusion | your server’s OpenAI-images route | The sovereign path. One wire covers LocalAI (:8080), Ollama (:11434), stable-diffusion.cpp sd-server (:1234), SGLang Diffusion and vLLM-Omni. Never inferred from model: and always explicit. |
| openai | gpt-image-2 | Images API | Exact sizes (WxH), native n, background: transparent, webp/jpeg compression. |
| gemini | gemini-3.1-flash-image | generateContent | Aspect-ratio classes; may return an interleaved caption (surfaced as provider_text, clamped). |
| xai | grok-imagine-image | Imagine API | Native aspect ratios + resolution: 1k\|2k classes; the -quality model tier is the quality knob; bills exact cost into the run ledger. |
| mock | mock-image-1 | in-process | Real, decodable, deterministic PNG files with zero network and zero keys. CI runs the whole pipeline offline. |
Credentials are read from the environment: OPENAI_API_KEY / GEMINI_API_KEY / XAI_API_KEY (or NIKA_-prefixed),
plus NIKA_IMAGE_LOCAL_URL (+ optional NIKA_IMAGE_LOCAL_API_KEY) for the local provider. Check what’s
wired with nika doctor, which prints an image line naming the ready
providers.
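The readiness rule implied by these variables can be sketched in a few lines of Python. This is an illustration only, not how nika doctor is implemented: the function name ready_providers is hypothetical, and the NIKA_ prefix is assumed to be prepended to the full variable name (for example NIKA_OPENAI_API_KEY).

```python
import os
from typing import Mapping

# Variable names are taken from the text above; the mapping itself is an assumption.
KEYED_PROVIDERS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "xai": "XAI_API_KEY",
}


def ready_providers(env: Mapping[str, str]) -> list[str]:
    """Return the providers that look usable given the environment mapping."""
    ready = ["mock"]  # the mock provider needs no network and no keys
    for provider, var in KEYED_PROVIDERS.items():
        # Assumption: "NIKA_-prefixed" means NIKA_ + the plain variable name.
        if env.get(var) or env.get("NIKA_" + var):
            ready.append(provider)
    if env.get("NIKA_IMAGE_LOCAL_URL"):
        ready.append("local")  # the API key for the local server is optional
    return sorted(ready)


print(ready_providers(os.environ))
```

With an empty environment the sketch reports only mock, which matches the offline CI path described in the table.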
Sovereign quickstart (local server)
nika: v1
The local provider requests response_format: b64_json (LocalAI defaults to URL mode),
refuses url-only answers (result URLs are never fetched, since that would
reopen the SSRF surface the fixed-endpoint design closed), and gives local
renders a 300s default timeout. CPU diffusion can run for minutes, so raise
timeout_ms: up to 600000 when needed. SD-family servers honor a
positive | negative prompt split written directly inside prompt:.
What lands on disk
The manifest beside each asset records, among other fields,
endpoint_host (which server actually rendered it), timing, warnings,
cost_usd, and your metadata: fields. It never contains a credential, by construction.
The PNG itself carries a nika tEXt chunk (tool, engine version,
provider, model, prompt, seed) — so provenance survives cp, the practice
ComfyUI and InvokeAI standardized. Read it back with any PNG tool:
Content credentials (detect-and-preserve)
OpenAI and Google sign their API bytes with C2PA Content Credentials, and C2PA hashes the file’s byte ranges, so any pipeline that writes into a signed render converts valid credentials into « present but tampered ». Nika detects the signals first (PNG caBX · JPEG APP11 JUMBF · RIFF
C2PA · MP3 GEOB): on signed payloads the nika tEXt embed stands
down (their signed manifest outranks our informal chunk — a loud
content_credentials_preserved: warning says so), and the output +
manifest surface content_credentials: "c2pa" plus watermark_declared
(SynthID is a provider fact — only the vendor can detect it). Detection
labels only: the wire never says « verified ». With EU AI Act Article 50
applying from 2026-08-02, preserving machine-readable marks is part of
an operator’s compliance surface; no other workflow engine even looks.
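The detection step can be illustrated with a small Python sketch that looks for three of the container signals named above (PNG caBX, JPEG APP11, RIFF C2PA). It is a coarse presence check under stated assumptions, not Nika's detector: the function name and the returned labels are hypothetical, the MP3 GEOB case is left out, and the JPEG branch only notices that an APP11 segment exists without parsing the JUMBF box inside it. Like the text says, it detects and never verifies.

```python
import struct
from typing import Optional


def detect_c2pa_signal(data: bytes) -> Optional[str]:
    """Return a label for the first C2PA container signal found, else None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        pos = 8
        while pos + 8 <= len(data):
            length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
            if ctype == b"caBX":  # PNG chunk type used for C2PA manifests
                return "png:caBX"
            if ctype == b"IEND":
                break
            pos += 12 + length  # length + type + body + CRC
        return None
    if data.startswith(b"\xff\xd8"):
        pos = 2
        while pos + 4 <= len(data) and data[pos] == 0xFF:
            marker = data[pos + 1]
            if marker == 0xEB:  # APP11, the segment that carries JUMBF boxes
                return "jpeg:app11"
            if marker == 0xDA:  # start of scan: header segments are over
                break
            (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
            pos += 2 + seg_len  # seg_len counts its own two length bytes
        return None
    if data[:4] == b"RIFF":
        pos = 12  # skip "RIFF", the file size, and the form type
        while pos + 8 <= len(data):
            chunk_id, size = struct.unpack("<4sI", data[pos:pos + 8])
            if chunk_id == b"C2PA":
                return "riff:C2PA"
            pos += 8 + size + (size & 1)  # RIFF chunks are padded to even size
        return None
    return None
```

A caller following the detect-and-preserve rule would skip any write into the file whenever this returns a label, and record the label in its own output instead.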
Honesty rules (what the warnings mean)
Every lossy mapping is a stable, visible warning; silent degradation is non-conformant per the spec:
- count_shortfall: the provider returned fewer than n: (Ollama’s compat route ignores n, moderation can filter variants).
- size_conflict: · xai_size_class: · aspect_remapped: your exact size was folded to the provider’s nearest class, loudly.
- seed_unsupported: · quality_folded: · compression_ignored: the knob doesn’t exist on that provider; the arg was dropped, visibly.
- format_mismatch: fires only when you explicitly asked for a format the provider didn’t honor; magic bytes name the real extension either way.
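The format_mismatch rule can be made concrete with a short Python sketch: sniff the real format from the leading bytes, then warn only when a format was explicitly requested and differs. The function names and the exact warning wording are hypothetical; only the rule itself comes from the text above.

```python
from typing import Optional


def sniff_extension(data: bytes) -> Optional[str]:
    """Name the file extension from magic bytes, or None if unrecognized."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def format_mismatch(requested: Optional[str], data: bytes) -> Optional[str]:
    """Return a warning string only when an explicit request was not honored."""
    actual = sniff_extension(data)
    if requested is None or actual is None:
        return None  # nothing was asked for, or the bytes are unrecognized
    wanted = "jpg" if requested.lower() == "jpeg" else requested.lower()
    if wanted == actual:
        return None
    return f"format_mismatch: requested {wanted}, provider returned {actual}"


png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
print(format_mismatch("webp", png_bytes))
# format_mismatch: requested webp, provider returned png
```

In this sketch the saved file would take its extension from sniff_extension in every case, so a mismatch changes the warning list but never produces a mislabeled file.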
Real spend in the ledger
xAI bills images in cost ticks; the engine converts them exactly, and the render’s cost rides the task line, the run total, and the manifest:
cost_usd in its structured output is
metered the same way. Providers that don’t report exact cost show null —
never an estimate dressed as truth.
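To show what an exact conversion with a null fallback can look like, here is a minimal Python sketch using decimal arithmetic rather than floats. Everything specific in it is an assumption: the function name is hypothetical, and the tick size is passed in as a parameter because the provider's real ticks-per-dollar value is not stated here and must come from the provider's documentation.

```python
from decimal import Decimal
from typing import Optional


def ticks_to_usd(ticks: Optional[int], ticks_per_usd: int) -> Optional[Decimal]:
    """Convert integer cost ticks to dollars, or pass None through unchanged.

    None models a provider that reports no exact cost: the result stays
    null instead of being replaced by an estimate.
    """
    if ticks is None:
        return None
    # Decimal division avoids binary floating-point rounding for
    # power-of-ten tick sizes within the default 28-digit precision.
    return Decimal(ticks) / Decimal(ticks_per_usd)


# Illustrative numbers only: 10_000 ticks per dollar is a made-up tick size.
print(ticks_to_usd(2500, 10_000))  # 0.25
print(ticks_to_usd(None, 10_000))  # None
```

Keeping the value as a Decimal until it is serialized means the per-task cost and the run total can be summed without accumulating float error.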
Cookbook (each proven end-to-end against live APIs)
An LLM writes the brief, another provider renders it:
nika: v1