Two provider counts exist, on purpose: the language spec locks the canonical providers (what a conformant engine MUST ship; see the spec catalog), while the reference engine’s catalog goes wider. Nika ships with LLM providers in its catalog, orchestrated behind one unified InferRequest. The infer: verb takes a model: <provider>/<name> string, looks up the capability rules, and dispatches through the right API dialect.
Full inventory with env vars and default models: Providers catalog. This page explains the design — why a single InferRequest works across providers.
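As a rough illustration of the first step, splitting a model: string into its provider and name, here is a minimal sketch. ModelRef and parse_model_ref are hypothetical names, not part of nika-kernel, and splitting on the first slash only is an assumption made so that model names containing further slashes stay intact.

```rust
/// A parsed `model:` reference of the form `<provider>/<name>`.
/// Hypothetical type for illustration; not the engine's own.
#[derive(Debug, PartialEq, Eq)]
pub struct ModelRef<'a> {
    pub provider: &'a str,
    pub name: &'a str,
}

/// Splits on the FIRST `/` only (an assumption), and rejects strings
/// with a missing provider or a missing name.
pub fn parse_model_ref(s: &str) -> Option<ModelRef<'_>> {
    let (provider, name) = s.split_once('/')?;
    if provider.is_empty() || name.is_empty() {
        return None;
    }
    Some(ModelRef { provider, name })
}
```

With this sketch, "ollama/llama3.1" yields provider "ollama" and name "llama3.1", while a string without a slash is rejected before any lookup happens.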

One InferRequest for all providers

#[non_exhaustive]
pub struct InferRequest {
    pub provider: ProviderId,          // e.g., "anthropic", "openai", "moonshot"
    pub model: ModelId,                // e.g., "claude-sonnet-4-6"
    pub messages: Vec<Message>,
    pub response_format: Option<ResponseFormat>,
    pub tools: Vec<ToolSpec>,
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub max_tokens: Option<u32>,
    pub stream: bool,
    // reserved for future minors: memory, budget, baggage, trust_level, trace_id
}
Every L1 provider crate (nika-provider-anthropic, nika-provider-openai, nika-provider-gemini, …) implements the single Provider trait from nika-kernel:
#[trait_variant::make(ProviderDyn: Send)]
pub trait Provider: sealed::Sealed + Send + Sync + 'static {
    async fn infer(&self, req: &InferRequest) -> Result<InferResponse>;
    async fn stream(&self, req: &InferRequest) -> Result<InferStream>;
    fn capabilities(&self, model: &str) -> ModelCapabilities;
    fn provider_id(&self) -> &'static ProviderId;
}
Switching provider is one line in YAML:
tasks:
  - id: first
    infer:
      model: ollama/llama3.1               # <provider>/<name> · swap to mistral/… · anthropic/… · gemini/…
      prompt: "Hello."

API dialects (7)

Providers speak 7 distinct dialects. Each dialect has one provider crate; the openai-chat dialect is an OpenAI-compatible family that covers 23 providers.
| Dialect | Providers | Notes |
| --- | --- | --- |
| openai-chat | openai, mistral, groq, deepseek, xai, openrouter, together, fireworks, cerebras, sambanova, perplexity, moonshot, qwen, minimax, zhipu, nvidia-nim, deepinfra, replicate, hyperbolic, writer, databricks, azure, cloudflare | 23 providers share OpenAI’s /v1/chat/completions shape with per-provider base URL + auth header |
| anthropic | anthropic | Native Messages API, tool use as schema, extended thinking |
| gemini | gemini, vertex | Native generateContent API, safety settings, structured output via response_schema |
| cohere | cohere | Native Command API |
| ai21 | ai21 | Native Jamba API |
| bedrock | bedrock | AWS SigV4, model-prefix routing (anthropic.claude-...) |
| voyage | voyage | Embeddings-only |
Local runner: the native provider uses mistral.rs to load GGUF models on CPU / Metal / CUDA without any HTTP round-trip. Feature-gated in nika-provider-native, off by default.
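The dialect table can be read as a lookup from provider id to dialect. The sketch below hard-codes that lookup for illustration only: ApiDialect and dialect_for are hypothetical names, and the engine presumably reads the mapping from each provider's api_dialect field rather than from a match statement.

```rust
/// The seven dialects from the table. Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiDialect {
    OpenAiChat,
    Anthropic,
    Gemini,
    Cohere,
    Ai21,
    Bedrock,
    Voyage,
}

/// Maps a provider id to its dialect, using only the rows of the table.
/// Providers the table does not list return `None`.
pub fn dialect_for(provider: &str) -> Option<ApiDialect> {
    use ApiDialect::*;
    Some(match provider {
        "openai" | "mistral" | "groq" | "deepseek" | "xai" | "openrouter"
        | "together" | "fireworks" | "cerebras" | "sambanova" | "perplexity"
        | "moonshot" | "qwen" | "minimax" | "zhipu" | "nvidia-nim"
        | "deepinfra" | "replicate" | "hyperbolic" | "writer" | "databricks"
        | "azure" | "cloudflare" => OpenAiChat,
        "anthropic" => Anthropic,
        "gemini" | "vertex" => Gemini,
        "cohere" => Cohere,
        "ai21" => Ai21,
        "bedrock" => Bedrock,
        "voyage" => Voyage,
        _ => return None,
    })
}
```

Because the table is the only source used here, a provider such as ollama (which appears in the YAML example but not in the table) returns None in this sketch.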

Capability-aware routing

Every InferRequest is routed through capability rules before dispatch. The rules decide:
  • What modalities the model accepts (text / image / audio / pdf / …).
  • Whether tool calling + parallel tool calls are supported.
  • Whether JSON mode is unavailable / object / schema.
  • Whether streaming is native or emulated.
  • Whether prompt caching / context caching reduces cost.
  • Which supported parameters are legal (reasoning-effort, thinking-budget, computer-use, citations, …).
If you send tools: [...] to a model whose capability rule says tool_calling = false, Nika fails fast with NIKA-13x before hitting the network — no silent provider-side error.
See Capability rules for the full resolution algorithm (defaults → scan rules in file order → apply matches).
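The resolution order and the fail-fast check can be sketched as follows. Everything named here (Caps, Rule, resolve, preflight, PreflightError) is hypothetical, only two capability fields are modeled, and the choice that a later matching rule overrides an earlier one is an assumption about what "apply matches" means. Prefix matching follows the L0 decision referenced under See also.

```rust
/// A tiny subset of model capabilities. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Caps {
    pub tool_calling: bool,
    pub native_streaming: bool,
}

/// One capability rule: a model-name prefix plus optional overrides.
pub struct Rule {
    pub model_prefix: &'static str,
    pub tool_calling: Option<bool>,
    pub native_streaming: Option<bool>,
}

/// Start from defaults, scan rules in file order, apply every match.
/// Assumption: a later matching rule overrides an earlier one.
pub fn resolve(defaults: Caps, rules: &[Rule], model: &str) -> Caps {
    let mut caps = defaults;
    for rule in rules {
        if model.starts_with(rule.model_prefix) {
            if let Some(v) = rule.tool_calling {
                caps.tool_calling = v;
            }
            if let Some(v) = rule.native_streaming {
                caps.native_streaming = v;
            }
        }
    }
    caps
}

/// Stand-in for the NIKA-13x error family; the exact code is not given here.
#[derive(Debug, PartialEq, Eq)]
pub enum PreflightError {
    ToolCallingUnsupported,
}

/// Rejects a request that carries tools when the model cannot call them.
/// Runs before any network call is made.
pub fn preflight(caps: Caps, tool_count: usize) -> Result<(), PreflightError> {
    if tool_count > 0 && !caps.tool_calling {
        return Err(PreflightError::ToolCallingUnsupported);
    }
    Ok(())
}
```

Keeping resolution and validation as two pure functions means the check needs no provider credentials and no network access, which is what makes failing fast possible.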

Fallback chains (target design)

When a provider fails, Nika is designed to route to a fallback. This is the target design, planned for a future minor, and is not available yet:
tasks:
  - id: primary
    infer:
      model: anthropic/claude-sonnet-4-6   # <provider>/<name>
      prompt: "..."
    fallback:
      - { model: openai/gpt-4o }
      - { model: groq/llama-3.3-70b-versatile }
fallback: is reserved in the kernel DTO but is not yet parsed in the current version. Model fallback chains need a separate runtime policy gate before they become user-facing syntax.
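Since the feature is not implemented, the following is only a sketch of what "try each model in order" could look like, not a description of the engine. infer_with_fallback is a hypothetical helper, it is synchronous for brevity, and it treats every error as a reason to move on, which is exactly the kind of decision the runtime policy gate would have to make.

```rust
/// Tries each model in order and returns the first success.
/// If every attempt fails, returns all errors in attempt order.
/// Hypothetical helper: the real design may differ in every respect.
pub fn infer_with_fallback<T, E>(
    models: &[&str],
    mut call: impl FnMut(&str) -> Result<T, E>,
) -> Result<T, Vec<E>> {
    let mut errors = Vec::new();
    for &model in models {
        match call(model) {
            Ok(value) => return Ok(value),
            // Assumption: any error triggers the next fallback.
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}
```

Collecting every error, rather than only the last one, keeps the primary provider's failure visible even when a fallback also fails.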

Why and not 9

Earlier drafts of this page listed 9 providers. That was a snapshot from mid-2025 — today’s catalog is -deep, covering frontier cloud labs, fast-inference shops (Groq, Cerebras, SambaNova), Chinese labs (Moonshot, Qwen, MiniMax, Zhipu), enterprise platforms (Bedrock, Azure, Vertex), and niche (Voyage embeddings, Writer enterprise). Source of truth lives in the TOML and is build-time validated.

Adding a new provider

1. Add a `[[providers]]` block to `llm-providers.toml`
   Declare id, name, aliases, env_var, key_prefixes, default_model, cheap_model, api_dialect, tags, and at least one [[providers.models]] sub-block.

2. Add capability rules if the provider has unique features
   Put rules in model-capabilities.toml, scoped via scope.providers. Inherit the defaults if the provider is standard (most openai-chat providers need no rules).

3. Implement the L1 provider crate
   If the api_dialect already exists, wire a new provider instance with the correct base URL + auth. If the dialect is new, add a new L1 crate nika-provider-<name>.

4. Pass the engine's admission checklist
   Property test against the provider’s API contract. Criterion bench for token throughput. Capability parity test. See How admission works.
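To make the first step concrete, here is a sketch of the declared fields as a Rust struct, with a small validation pass in the spirit of the build-time validation mentioned earlier. ProviderEntry and validate are hypothetical, the field types are guesses, and only the "at least one model" check comes from this page; the other checks are assumptions.

```rust
/// Mirrors the fields a `[[providers]]` block declares.
/// Field types are assumptions; `models` stands in for the
/// `[[providers.models]]` sub-blocks.
#[derive(Debug, Default)]
pub struct ProviderEntry {
    pub id: String,
    pub name: String,
    pub aliases: Vec<String>,
    pub env_var: String,
    pub key_prefixes: Vec<String>,
    pub default_model: String,
    pub cheap_model: String,
    pub api_dialect: String,
    pub tags: Vec<String>,
    pub models: Vec<String>,
}

/// Returns a list of problems; an empty list means the entry passed.
pub fn validate(entry: &ProviderEntry, known_dialects: &[&str]) -> Vec<String> {
    let mut problems = Vec::new();
    // Assumed check: an entry needs an id.
    if entry.id.is_empty() {
        problems.push("id is empty".to_string());
    }
    // From the page: at least one [[providers.models]] sub-block.
    if entry.models.is_empty() {
        problems.push("at least one [[providers.models]] sub-block is required".to_string());
    }
    // Assumed check: api_dialect must name an existing dialect.
    if !known_dialects.contains(&entry.api_dialect.as_str()) {
        problems.push(format!("unknown api_dialect `{}`", entry.api_dialect));
    }
    problems
}
```

Returning every problem at once, instead of stopping at the first, suits a build-time check where the author wants to fix the whole block in one pass.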

See also

Providers catalog

All providers with env vars, default / cheap models, dialects.

Capability rules

Resolution algorithm and -rule contract.

Concepts · Verbs

The infer: verb and its three siblings (exec / invoke / agent) — fetch is the nika:fetch builtin under invoke:.

L0 decisions

Why capability rules use prefix matching, not regex (Q1).