Gateway API

The Arkonova Gateway provides a unified OpenAI-compatible endpoint for accessing multiple AI providers through a single API key. Route requests to GPT-4o, Claude, Gemini, or any custom model without changing your client code.

Introduction

The Gateway is a reverse proxy that translates your requests into provider-specific calls, applying authentication, quota enforcement, routing policy, and telemetry collection transparently. From your client's perspective it looks like a standard OpenAI API.

Any library that supports a configurable base_url — the OpenAI Python/Node SDK, LangChain, LiteLLM, LlamaIndex, and others — works with the Gateway without modifications.

The Gateway is fully compatible with the OpenAI Chat Completions API (POST /v1/chat/completions). Existing OpenAI integrations only need a changed base_url and a new API key.

Authentication

All Gateway requests must carry an Arkonova API key in the standard HTTP Authorization header. Keys are prefixed with ark-.

HTTP Header
Authorization: Bearer ark-xxxxxxxxxxxxxxxxxxxxxxxx

You can issue and manage API keys from your account dashboard. Each key can be scoped to specific providers or models, have token quotas, and optionally be restricted by IP range.

Never embed your ark- key in client-side code. Keep it server-side or in environment variables.
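One way to keep the key out of source code is to read it from an environment variable at runtime. A minimal sketch (ARKONOVA_API_KEY is an assumed variable name, not a Gateway convention):

```python
import os

def auth_headers() -> dict:
    """Build the Authorization header from an environment variable.

    ARKONOVA_API_KEY is an illustrative name; use whatever your
    deployment's secret-management convention dictates.
    """
    key = os.environ["ARKONOVA_API_KEY"]
    if not key.startswith("ark-"):
        raise ValueError("expected an Arkonova key with the 'ark-' prefix")
    return {"Authorization": f"Bearer {key}"}
```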

Base URL

Gateway Base URL
https://arkonova.network/gateway/v1

Set this as base_url in your SDK configuration. The gateway exposes the same path structure as the OpenAI API, so /chat/completions, /models, and /embeddings all work as expected.

Chat Completions

POST /gateway/v1/chat/completions

The primary endpoint. Accepts the standard OpenAI Chat Completions request body. The model field determines which provider and model the request is routed to.

Request Body

Parameter | Type | Required | Description
model | string | Required | Model ID, e.g. gpt-4o, claude-sonnet-4-6, gemini-2.0-flash
messages | array | Required | Array of message objects with role (system, user, assistant) and content
stream | boolean | Optional | If true, enables SSE token-by-token streaming. Default: false
temperature | number | Optional | Sampling temperature 0–2. Default: 1
max_tokens | integer | Optional | Maximum number of tokens to generate
top_p | number | Optional | Nucleus sampling probability 0–1
stop | string / array | Optional | Stop sequence(s)
n | integer | Optional | Number of completions to generate. Default: 1
x-fallback | array | Optional | Gateway extension: fallback model IDs tried in order if the primary fails, e.g. ["gpt-4o-mini", "claude-haiku-4-5"]

Minimal Example

JSON Request
{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain API gateways in one sentence."}
  ]
}

Response

On success the gateway returns the provider's response wrapped in the standard OpenAI response format:

JSON Response (200 OK)
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1740000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "An API gateway is a single entry point that routes client requests to one or more backend services."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 28, "completion_tokens": 22, "total_tokens": 50},
  "x-gateway": {"provider": "openai", "latency_ms": 312, "routed_model": "gpt-4o"}
}
The extra x-gateway field in the response contains routing metadata: which provider served the request and the measured round-trip latency. This field is always present and does not affect OpenAI SDK compatibility.

Streaming

Setting "stream": true switches the response to Server-Sent Events (SSE). The gateway normalizes the event format across all providers — clients receive the same data: {...} chunks regardless of which provider handles the request.

SSE Stream — individual chunks
data: {"id":"chatcmpl-xyz","object":"chat.completion.chunk","choices":[{"delta":{"content":"An "},"index":0}]}

data: {"id":"chatcmpl-xyz","object":"chat.completion.chunk","choices":[{"delta":{"content":"API "},"index":0}]}

data: {"id":"chatcmpl-xyz","object":"chat.completion.chunk","choices":[{"delta":{"content":"gateway"},"index":0}]}

data: {"id":"chatcmpl-xyz","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}

data: [DONE]
Python — streaming with OpenAI SDK
import openai

client = openai.OpenAI(
    base_url="https://arkonova.network/gateway/v1",
    api_key="ark-..."
)

stream = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
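If you consume the SSE stream without an SDK, the normalized chunks above can be parsed with a few lines of Python. This is a minimal sketch that only handles the single-line data: events shown here, not the full SSE spec (no multi-line events, comments, or reconnection):

```python
import json

def iter_deltas(lines):
    """Yield content fragments from raw 'data: ...' SSE lines
    in the normalized chunk format shown above."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blanks and non-data lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        content = delta.get("content")
        if content:
            yield content
```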

Supported Models

Pass the model ID in the model field. The gateway resolves the provider automatically. Aliases (e.g. gpt-4o) are forwarded as-is; the gateway infers the provider from the model name prefix.

OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, o1, o3, text-embedding-3-small
Anthropic: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, claude-3-5-sonnet, claude-3-5-haiku
Google: gemini-2.0-flash, gemini-2.0-pro, gemini-1.5-flash, gemma-3
Custom / Self-Hosted: custom:<model-id>, ollama:<model-id>

For custom/self-hosted endpoints, prefix the model ID with custom: and configure the target URL in your dashboard. Any OpenAI-compatible endpoint (Together AI, Fireworks, local Ollama) can be registered this way.
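The prefix-based resolution described above might look roughly like the following. The mapping is purely illustrative; the Gateway's actual routing table is internal and may differ:

```python
# Hypothetical prefix-to-provider table, ordered most-specific first.
PREFIXES = {
    "text-embedding-": "openai",
    "gpt-": "openai",
    "o1": "openai",
    "o3": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "gemma-": "google",
    "custom:": "custom",
    "ollama:": "custom",
}

def resolve_provider(model: str) -> str:
    """Infer the provider from the model-name prefix."""
    for prefix, provider in PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unknown model prefix: {model}")
```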

Routing & Fallback

The gateway supports two routing strategies:

Fallback Chain

Include an x-fallback field in the request body with an ordered list of backup model IDs. If the primary model returns a 429 or a 500, or times out, the gateway automatically retries with the next fallback, transparently and with no change to the response format.

JSON — with fallback chain
{
  "model": "gpt-4o",
  "x-fallback": ["claude-sonnet-4-6", "gpt-4o-mini"],
  "messages": [{"role": "user", "content": "Hello"}]
}
A common pattern is to put a frontier model first and a fast, cheap model last as the catch-all fallback: you get maximum quality when the primary is healthy, and the request still succeeds when it is not.

Policy-Based Routing (Dashboard)

In the API key settings you can define a routing policy that applies to all requests from that key.

API Keys & Quotas

Each ark- key has independently configurable limits:

Setting | Description
model_allowlist | Whitelist of model IDs this key may request. Requests for unlisted models are rejected with 403.
token_quota | Monthly token budget (prompt + completion tokens). Requests exceeding the quota return 429.
rpm_limit | Maximum requests per minute. Excess requests return 429 with a Retry-After header.
ip_allowlist | Optional list of allowed source IPs in CIDR notation. Requests from other IPs are rejected with 403.
provider_lock | Forces all requests through a single provider regardless of the model field.

Error Codes

The gateway returns standard HTTP status codes. Error bodies follow the OpenAI error format:

Error Response Body
{
  "error": {
    "message": "Model 'gpt-5' is not in the allowlist for this API key.",
    "type": "invalid_request_error",
    "code": "model_not_allowed"
  }
}
Status | Code | Description
400 | invalid_request_error | Malformed request body: missing required fields or invalid types.
401 | authentication_error | Missing or invalid Authorization header.
403 | model_not_allowed | The requested model is not in the key's allowlist, or the source IP is not whitelisted.
429 | quota_exceeded / rate_limited | Token quota exhausted or RPM limit hit. Check the Retry-After header.
502 | provider_error | Upstream provider returned an error and all fallbacks were exhausted.
504 | provider_timeout | Upstream provider did not respond within the timeout window (30 s default).
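Since 429 responses carry a Retry-After header, a client can back off and retry before giving up. A minimal sketch; the send callable is a stand-in for your HTTP call (e.g. a wrapper around requests.post), injected here so the retry logic stays self-contained:

```python
import time

def post_with_retry(send, payload, max_retries=3):
    """Retry a request on 429, honoring the Retry-After header.

    `send` is any callable returning an object with .status_code
    and .headers; fall back to exponential backoff when the
    header is absent.
    """
    for attempt in range(max_retries + 1):
        resp = send(payload)
        if resp.status_code != 429 or attempt == max_retries:
            return resp
        wait = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
```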

Examples

Python — OpenAI SDK

Python
import openai

client = openai.OpenAI(
    base_url="https://arkonova.network/gateway/v1",
    api_key="ark-..."
)

# Non-streaming
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.choices[0].message.content)

Python — with fallback chain

Python
import requests

resp = requests.post(
    "https://arkonova.network/gateway/v1/chat/completions",
    headers={"Authorization": "Bearer ark-..."},
    json={
        "model": "gpt-4o",
        "x-fallback": ["claude-sonnet-4-6", "gpt-4o-mini"],
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)
data = resp.json()
print(data["choices"][0]["message"]["content"])
print("Served by:", data["x-gateway"]["provider"])

curl

Shell
curl https://arkonova.network/gateway/v1/chat/completions \
  -H "Authorization: Bearer ark-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Node.js — OpenAI SDK

JavaScript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://arkonova.network/gateway/v1",
  apiKey: "ark-...",
});

const response = await client.chat.completions.create({
  model: "gemini-2.0-flash",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

LangChain (Python)

Python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="claude-sonnet-4-6",
    base_url="https://arkonova.network/gateway/v1",
    api_key="ark-..."
)

result = llm.invoke("Explain what an API gateway does.")
print(result.content)
Domain migration

Primary domain: arkonova.network

We are migrating away from arkonova.ru. Please update bookmarks and links.
