Providers - Nika

Two provider counts exist, on purpose: the language spec locks the canonical providers (what a conformant engine MUST ship; see the spec catalog), while the reference engine’s catalog goes wider. Nika ships with LLM providers in its catalog, orchestrated behind one unified InferRequest. The infer: verb takes a model: <provider>/<name> string, looks up the capability rules for that model, and dispatches through the right API dialect.

Full inventory with env vars and default models: Providers catalog. This page explains the design — why a single InferRequest works across providers.

One `InferRequest` for all providers

#[non_exhaustive]
pub struct InferRequest {
    pub provider: ProviderId,          // e.g., "anthropic", "openai", "moonshot"
    pub model: ModelId,                // e.g., "claude-sonnet-4-6"
    pub messages: Vec<Message>,
    pub response_format: Option<ResponseFormat>,
    pub tools: Vec<ToolSpec>,
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub max_tokens: Option<u32>,
    pub stream: bool,
    // reserved for future minors: memory, budget, baggage, trust_level, trace_id
}

Every L1 provider crate (nika-provider-anthropic, nika-provider-openai, nika-provider-gemini, …) implements the single Provider trait from nika-kernel:

#[trait_variant::make(ProviderDyn: Send)]
pub trait Provider: sealed::Sealed + Send + Sync + 'static {
    async fn infer(&self, req: &InferRequest) -> Result<InferResponse>;
    async fn stream(&self, req: &InferRequest) -> Result<InferStream>;
    fn capabilities(&self, model: &str) -> ModelCapabilities;
    fn provider_id(&self) -> &'static ProviderId;
}

Switching provider is one line in YAML:

tasks:
  - id: first
    infer:
      model: ollama/llama3.1               # <provider>/<name> · swap to mistral/… · anthropic/… · gemini/…
      prompt: "Hello."
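As an illustration of the `<provider>/<name>` convention, here is a minimal sketch of how such a string could be split. `ModelRef` and `parse_model_ref` are hypothetical names, not part of Nika's API, and splitting on the first `/` only is an assumption made so that model names containing a slash stay intact.

```rust
/// Hypothetical parsed form of a `model: <provider>/<name>` string.
#[derive(Debug, PartialEq, Eq)]
pub struct ModelRef<'a> {
    pub provider: &'a str,
    pub name: &'a str,
}

/// Split on the first `/` only (assumption), so a name that itself
/// contains `/` is kept whole. Returns `None` when either half is empty
/// or when there is no `/` at all.
pub fn parse_model_ref(s: &str) -> Option<ModelRef<'_>> {
    let (provider, name) = s.split_once('/')?;
    if provider.is_empty() || name.is_empty() {
        return None;
    }
    Some(ModelRef { provider, name })
}
```

Under these assumptions, `parse_model_ref("ollama/llama3.1")` yields provider `ollama` and name `llama3.1`, while a string without a slash is rejected before any provider lookup happens.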

API dialects (7)

Providers speak 7 distinct dialects. There is one provider crate per dialect; the OpenAI-compatible dialect alone covers the 23 providers listed in the first row.

Dialect	Providers	Notes
`openai-chat`	openai, mistral, groq, deepseek, xai, openrouter, together, fireworks, cerebras, sambanova, perplexity, moonshot, qwen, minimax, zhipu, nvidia-nim, deepinfra, replicate, hyperbolic, writer, databricks, azure, cloudflare	23 providers share OpenAI’s `/v1/chat/completions` shape with per-provider base URL + auth header
`anthropic`	anthropic	Native Messages API, tool use as schema, extended thinking
`gemini`	gemini, vertex	Native `generateContent` API, safety settings, structured output via `response_schema`
`cohere`	cohere	Native Command API
`ai21`	ai21	Native Jamba API
`bedrock`	bedrock	AWS SigV4, model-prefix routing (`anthropic.claude-...`)
`voyage`	voyage	Embeddings-only
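The table can be read as a lookup from provider id to dialect. The sketch below hard-codes that lookup for illustration only; `Dialect` and `dialect_for` are hypothetical names, and per this page the real mapping comes from the `api_dialect` field in the TOML catalog rather than from code. Provider ids that do not appear in the table return `None` here.

```rust
/// The seven dialects from the table (enum name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Dialect {
    OpenAiChat,
    Anthropic,
    Gemini,
    Cohere,
    Ai21,
    Bedrock,
    Voyage,
}

/// Map a provider id to its dialect, mirroring the table row by row.
pub fn dialect_for(provider: &str) -> Option<Dialect> {
    // The 23 providers that share the `/v1/chat/completions` shape.
    const OPENAI_CHAT: [&str; 23] = [
        "openai", "mistral", "groq", "deepseek", "xai", "openrouter",
        "together", "fireworks", "cerebras", "sambanova", "perplexity",
        "moonshot", "qwen", "minimax", "zhipu", "nvidia-nim", "deepinfra",
        "replicate", "hyperbolic", "writer", "databricks", "azure",
        "cloudflare",
    ];
    match provider {
        "anthropic" => Some(Dialect::Anthropic),
        "gemini" | "vertex" => Some(Dialect::Gemini),
        "cohere" => Some(Dialect::Cohere),
        "ai21" => Some(Dialect::Ai21),
        "bedrock" => Some(Dialect::Bedrock),
        "voyage" => Some(Dialect::Voyage),
        p if OPENAI_CHAT.contains(&p) => Some(Dialect::OpenAiChat),
        _ => None,
    }
}
```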

Local runner: the native provider uses mistral.rs to load GGUF models on CPU / Metal / CUDA without any HTTP round-trip. Feature-gated in nika-provider-native, off by default.

Capability-aware routing

Every InferRequest is routed through capability rules before dispatch. The rules decide:

What modalities the model accepts (text / image / audio / pdf / …).
Whether tool calling + parallel tool calls are supported.
Whether JSON mode is unavailable / object / schema.
Whether streaming is native or emulated.
Whether prompt caching / context caching reduces cost.
Which supported parameters are legal (reasoning-effort, thinking-budget, computer-use, citations, …).

If you send tools: [...] to a model whose capability rule says tool_calling = false, Nika fails fast with NIKA-13x before hitting the network — no silent provider-side error.
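A minimal sketch of that fail-fast check, under stated assumptions: `Caps`, `PreflightError`, and `preflight` are hypothetical names, and the real check lives inside the engine and reports a NIKA-13x code. The point is only that the rejection happens locally, before any request is built.

```rust
/// Hypothetical subset of a model's resolved capabilities.
#[derive(Debug, Clone, Copy, Default)]
pub struct Caps {
    pub tool_calling: bool,
}

/// Hypothetical error type; the engine's own error would carry a NIKA-13x code.
#[derive(Debug, PartialEq, Eq)]
pub enum PreflightError {
    ToolsUnsupported { model: String },
}

/// Reject a request that carries tools when the model cannot call them.
/// Runs before dispatch, so no network round-trip is spent on a doomed call.
pub fn preflight(model: &str, tool_count: usize, caps: Caps) -> Result<(), PreflightError> {
    if tool_count > 0 && !caps.tool_calling {
        return Err(PreflightError::ToolsUnsupported {
            model: model.to_owned(),
        });
    }
    Ok(())
}
```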

See Capability rules for the full resolution algorithm (defaults → scan rules in file order → apply matches).
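The three-step resolution (defaults, scan in file order, apply matches) can be sketched as follows. This is an illustration, not the engine's implementation: the `Rule` and `ResolvedCaps` types, their fields, and the choice that a later matching rule overrides an earlier one are all assumptions. Prefix matching on the model name follows the L0 decision referenced on this page.

```rust
/// Hypothetical resolved capability set; `Default` supplies the defaults.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct ResolvedCaps {
    pub tool_calling: bool,
    pub native_streaming: bool,
}

/// Hypothetical rule: optional provider scope plus a model-name prefix.
/// A `None` field leaves the current value untouched.
pub struct Rule<'a> {
    pub provider: Option<&'a str>,
    pub model_prefix: &'a str,
    pub tool_calling: Option<bool>,
    pub native_streaming: Option<bool>,
}

/// Start from defaults, scan rules in file order, apply every match.
/// Assumption: later matches override earlier ones.
pub fn resolve(rules: &[Rule<'_>], provider: &str, model: &str) -> ResolvedCaps {
    let mut caps = ResolvedCaps::default();
    for rule in rules {
        let provider_ok = rule.provider.map_or(true, |p| p == provider);
        if provider_ok && model.starts_with(rule.model_prefix) {
            if let Some(v) = rule.tool_calling {
                caps.tool_calling = v;
            }
            if let Some(v) = rule.native_streaming {
                caps.native_streaming = v;
            }
        }
    }
    caps
}
```

With prefix matching, a rule scoped to `llama3` applies to `llama3.1` as well, which is why the order of rules in the file matters.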

Fallback chains (target design)

When a provider fails, Nika can route to a fallback:

target design — a future minor

tasks:
  - id: primary
    infer:
      model: anthropic/claude-sonnet-4-6   # <provider>/<name>
      prompt: "..."
    fallback:
      - { model: openai/gpt-4o }
      - { model: groq/llama-3.3-70b-versatile }

fallback: is reserved in the kernel DTO but is not yet parsed in the current version. Model fallback chains need a separate runtime policy gate before they become user-facing syntax.
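Since fallback is a target design, the following is only a sketch of the general technique (try each model in order, stop at the first success), not Nika's runtime policy. `infer_with_fallback` is a hypothetical helper, and the `attempt` closure stands in for a real provider call.

```rust
/// Try each model in `chain` in order and return the first success.
/// On total failure, return every (model, error) pair so the caller
/// can see why each candidate was rejected.
pub fn infer_with_fallback<T, E>(
    chain: &[&str],
    mut attempt: impl FnMut(&str) -> Result<T, E>,
) -> Result<T, Vec<(String, E)>> {
    let mut failures = Vec::new();
    for &model in chain {
        match attempt(model) {
            Ok(value) => return Ok(value),
            Err(err) => failures.push((model.to_string(), err)),
        }
    }
    Err(failures)
}
```

A real policy gate would also have to decide which errors are retryable at all; this sketch treats every error as a reason to move on.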

Why and not 9

Earlier drafts of this page listed 9 providers. That was a snapshot from mid-2025; today’s catalog goes deeper, covering frontier cloud labs, fast-inference shops (Groq, Cerebras, SambaNova), Chinese labs (Moonshot, Qwen, MiniMax, Zhipu), enterprise platforms (Bedrock, Azure, Vertex), and niche providers (Voyage embeddings, Writer enterprise). The source of truth lives in the TOML and is validated at build time.

Adding a new provider

Add a `[[providers]]` block to `llm-providers.toml`

Declare id, name, aliases, env_var, key_prefixes, default_model, cheap_model, api_dialect, tags, and at least one [[providers.models]] sub-block.

Add capability rules if the provider has unique features

Put rules in model-capabilities.toml scoped via scope.providers. Inherit defaults if the provider is standard (most openai-chat providers need no rules).

Implement the L1 provider crate

If the api_dialect already exists, wire a new provider instance with the correct base URL + auth. If the dialect is new, create a new L1 crate nika-provider-<name>.

Pass the engine's admission checklist

Property test against the provider’s API contract. Criterion bench for token throughput. Capability parity test. See How admission works.
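To make the first step concrete, here is a sketch of the shape a `[[providers]]` entry might take once loaded, with a small validation pass. The field names come from the list in that step; the field types, the `ProviderEntry` and `validate` names, and the specific checks are assumptions and do not describe the engine's actual build-time validation.

```rust
/// Hypothetical in-memory form of one `[[providers]]` block.
/// Field names follow the step above; the types are assumptions.
#[derive(Debug, Default)]
pub struct ProviderEntry {
    pub id: String,
    pub name: String,
    pub aliases: Vec<String>,
    pub env_var: String,
    pub key_prefixes: Vec<String>,
    pub default_model: String,
    pub cheap_model: String,
    pub api_dialect: String,
    pub tags: Vec<String>,
    /// One item per `[[providers.models]]` sub-block (names only here).
    pub models: Vec<String>,
}

/// The seven dialect ids listed on this page.
const DIALECTS: [&str; 7] = [
    "openai-chat", "anthropic", "gemini", "cohere", "ai21", "bedrock", "voyage",
];

/// Illustrative checks: non-empty id, known dialect, at least one model.
pub fn validate(entry: &ProviderEntry) -> Result<(), String> {
    if entry.id.is_empty() {
        return Err("id must not be empty".to_string());
    }
    if !DIALECTS.contains(&entry.api_dialect.as_str()) {
        return Err(format!("unknown api_dialect `{}`", entry.api_dialect));
    }
    if entry.models.is_empty() {
        return Err(format!("provider `{}` needs at least one model", entry.id));
    }
    Ok(())
}
```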

See also

Providers catalog

All providers with env vars, default / cheap models, dialects.

Capability rules

Resolution algorithm and -rule contract.

Concepts · Verbs

The infer: verb and its three siblings (exec / invoke / agent) — fetch is the nika:fetch builtin under invoke:.

L0 decisions

Why capability rules use prefix matching, not regex (Q1).
