Models
Route every AI request to the right model family.
Keep one API contract while switching between premium hosted models, faster small models, and local OpenAI-compatible deployments.
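Keeping one contract means only the model name changes between a premium hosted model and a local deployment. A minimal sketch, assuming an OpenAI-style chat body (the `chat_request` helper is illustrative, not the product's actual client):

```python
# The same OpenAI-style request body works for any routed model;
# only the "model" value changes between providers.
def chat_request(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

# Identical payload shape for a premium hosted model and a local one.
premium = chat_request("claude-opus-4", "Summarize this document.")
local = chat_request("llama-3.1-70b", "Summarize this document.")
```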
One message format. Many providers.
Provider credentials stay server-side.
Unsupported models fail with typed API errors.
Streaming chunks preserve OpenAI-compatible shape.
Usage events capture tokens, provider, model, latency, and cost.
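The guarantees above can be sketched in a few lines. The names here (`ChatMessage`, `UsageEvent`, `UnsupportedModelError`, the `SUPPORTED` set) are illustrative assumptions, not the product's actual API:

```python
from dataclasses import dataclass

# Hypothetical typed error: unsupported models fail with a distinct
# error class rather than a generic exception.
class UnsupportedModelError(ValueError):
    def __init__(self, model: str):
        super().__init__(f"unsupported model: {model}")
        self.model = model

# One message format across providers (OpenAI-compatible roles).
@dataclass
class ChatMessage:
    role: str      # "system" | "user" | "assistant"
    content: str

# Usage event capturing tokens, provider, model, latency, and cost.
@dataclass
class UsageEvent:
    provider: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    cost_usd: float

# A stand-in for the router's real model registry.
SUPPORTED = {"gpt-4o", "claude-sonnet-4", "gemini-2.0-flash", "mistral-large"}

def validate_model(model: str) -> str:
    """Raise a typed error for models the router does not know."""
    if model not in SUPPORTED:
        raise UnsupportedModelError(model)
    return model
```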
OpenAI — fast general reasoning, multimodal chat, and broad application compatibility.
gpt-4o, gpt-4o-mini, gpt-4-turbo, o1, o3-mini
Anthropic — long-form reasoning, safer assistant behavior, and agent orchestration workloads.
claude-opus-4, claude-sonnet-4, claude-haiku-4
Google — low-latency multimodal tasks and broad context workflows.
gemini-2.0-flash, gemini-2.5-pro
Regional deployments, open-weight model routing, and local OpenAI-compatible backends.
mistral-large, mistral-small, llama-3.1-70b, qwen2.5-72b
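Routing by model-name prefix can be sketched as a lookup over the families listed above. The `route_model` helper, the prefix table, and the "openai-compatible" family label for local open-weight backends are assumptions for illustration:

```python
# Map model-name prefixes to provider families, per the lists above.
# Open-weight models (llama, qwen) are assumed to route to a local
# OpenAI-compatible backend.
ROUTES = {
    "gpt-": "openai",
    "o1": "openai",
    "o3": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "mistral-": "mistral",
    "llama-": "openai-compatible",
    "qwen": "openai-compatible",
}

def route_model(model: str) -> str:
    """Return the provider family for a model; unknown models fail loudly."""
    for prefix, provider in ROUTES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unsupported model: {model}")
```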
