feat(gateway): add missing OpenAI-compatible endpoints (models and embeddings) (#53992)

* feat(gateway): add OpenAI-compatible models and embeddings

* docs(gateway): clarify model list and agent routing

* Update index.md

* fix(gateway): harden embeddings HTTP provider selection

* fix(gateway): validate compat model overrides

* fix(gateway): harden embeddings and response continuity

* fix(gateway): restore compat model id handling
Vincent Koc, 2026-03-24 16:53:51 -07:00 (committed by GitHub)
parent 0709224ce3
commit eaad4ad1be
14 changed files with 1011 additions and 27 deletions


@ -19,6 +19,7 @@ Docs: https://docs.openclaw.ai
- CLI/containers: add `--container` and `OPENCLAW_CONTAINER` to run `openclaw` commands inside a running Docker or Podman OpenClaw container. (#52651) Thanks @sallyom.
- Discord/auto threads: add optional `autoThreadName: "generated"` naming so new auto-created threads can be renamed asynchronously with concise LLM-generated titles while keeping the existing message-based naming as the default. (#43366) Thanks @davidguttman.
- Slack/interactive replies: restore rich reply parity for direct deliveries, auto-render simple trailing `Options:` lines as buttons/selects, improve Slack interactive setup defaults, and isolate reply controls from plugin interactive handlers. (#53389) Thanks @vincentkoc.
- Gateway/OpenAI compatibility: add `/v1/models` and `/v1/embeddings`, and forward explicit model overrides through `/v1/chat/completions` and `/v1/responses` for broader client and RAG compatibility. Thanks @vincentkoc.
### Fixes


@ -70,11 +70,35 @@ Default mode is `gateway.reload.mode="hybrid"`.
- One always-on process for routing, control plane, and channel connections.
- Single multiplexed port for:
- WebSocket control/RPC
- HTTP APIs (OpenAI-compatible, Responses, tools invoke)
- HTTP APIs: OpenAI-compatible (`/v1/models`, `/v1/embeddings`, `/v1/chat/completions`, `/v1/responses`) plus `/tools/invoke`
- Control UI and hooks
- Default bind mode: `loopback`.
- Auth is required by default (`gateway.auth.token` / `gateway.auth.password`, or `OPENCLAW_GATEWAY_TOKEN` / `OPENCLAW_GATEWAY_PASSWORD`).
## OpenAI-compatible endpoints
OpenClaw's highest-leverage compatibility surface is now:
- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/chat/completions`
- `POST /v1/responses`
Why this set matters:
- Most Open WebUI, LobeChat, and LibreChat integrations probe `/v1/models` first.
- Many RAG and memory pipelines expect `/v1/embeddings`.
- Agent-native clients increasingly prefer `/v1/responses`.
Planning note:
- Keep `/v1/models` as a flat `provider/model` list for client compatibility.
- Treat agent and sub-agent selection as separate OpenClaw routing concerns, not pseudo-model entries.
- When you need agent-scoped filtering, pass `x-openclaw-agent-id` on both model-list and request calls.
All of these run on the main Gateway port and use the same trusted operator auth boundary as the rest of the Gateway HTTP API.
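For example, a client can probe the agent-scoped model list and pass the returned ids back as OpenAI `model` values. A minimal sketch (TypeScript, Node 18+ global `fetch`; the port, token, and `main` agent id are placeholders for your deployment):
```ts
// Sketch only: host, port, token, and agent id stand in for your own config.
const base = "http://127.0.0.1:18789";
const headers = {
  authorization: "Bearer YOUR_TOKEN",
  // Optional: scope the list to one agent's allowed models.
  "x-openclaw-agent-id": "main",
};

const res = await fetch(`${base}/v1/models`, { headers });
const models = (await res.json()) as { data?: Array<{ id: string }> };
// Ids come back as canonical provider/model strings, e.g. "openai/gpt-5.4",
// and can be sent straight back as the OpenAI `model` field.
console.log(models.data?.map((entry) => entry.id));
```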
### Port and bind precedence
| Setting | Resolution order |


@ -14,6 +14,13 @@ This endpoint is **disabled by default**. Enable it in config first.
- `POST /v1/chat/completions`
- Same port as the Gateway (WS + HTTP multiplex): `http://<gateway-host>:<port>/v1/chat/completions`
When the Gateway's OpenAI-compatible HTTP surface is enabled, it also serves:
- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/responses`
Under the hood, requests are executed as a normal Gateway agent run (same codepath as `openclaw agent`), so routing/permissions/config match your Gateway.
## Authentication
@ -55,6 +62,12 @@ Or target a specific OpenClaw agent by header:
Advanced:
- `x-openclaw-session-key: <sessionKey>` to fully control session routing.
- `x-openclaw-message-channel: <channel>` to set the synthetic ingress channel context for channel-aware prompts and policies.
For `/v1/models` and `/v1/embeddings`, `x-openclaw-agent-id` is still useful:
- `/v1/models` uses it for agent-scoped model filtering where relevant.
- `/v1/embeddings` uses it to resolve agent-specific memory-search embedding config.
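A sketch combining those headers on one request (TypeScript; the session key value is illustrative, any stable string works):
```ts
// Sketch only: pins the target agent, session, and synthetic ingress channel
// explicitly instead of relying on the per-request defaults.
const res = await fetch("http://127.0.0.1:18789/v1/chat/completions", {
  method: "POST",
  headers: {
    authorization: "Bearer YOUR_TOKEN",
    "content-type": "application/json",
    "x-openclaw-agent-id": "main",
    "x-openclaw-session-key": "support-thread-42",
    "x-openclaw-message-channel": "webchat",
  },
  body: JSON.stringify({
    model: "openclaw",
    messages: [{ role: "user", content: "hi" }],
  }),
});
console.log(await res.json());
```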
## Enabling the endpoint
@ -94,6 +107,51 @@ By default the endpoint is **stateless per request** (a new session key is gener
If the request includes an OpenAI `user` string, the Gateway derives a stable session key from it, so repeated calls can share an agent session.
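A sketch of that behavior (TypeScript; host and token are placeholders):
```ts
// Sketch only: the same OpenAI `user` string maps to the same derived
// session key, so the second call can rely on context from the first.
async function chatAs(user: string, content: string) {
  const res = await fetch("http://127.0.0.1:18789/v1/chat/completions", {
    method: "POST",
    headers: {
      authorization: "Bearer YOUR_TOKEN",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "openclaw",
      user,
      messages: [{ role: "user", content }],
    }),
  });
  return res.json();
}

await chatAs("alice", "My favorite color is teal.");
await chatAs("alice", "What is my favorite color?"); // same derived session
```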
## Why this surface matters
This is the highest-leverage compatibility set for self-hosted frontends and tooling:
- Most Open WebUI, LobeChat, and LibreChat setups expect `/v1/models`.
- Many RAG systems expect `/v1/embeddings`.
- Existing OpenAI chat clients can usually start with `/v1/chat/completions`.
- Agent-native clients increasingly prefer `/v1/responses`.
## Model list and agent routing
<AccordionGroup>
<Accordion title="What does `/v1/models` return?">
A flat OpenAI-style model list.
The returned ids are canonical `provider/model` values such as `openai/gpt-5.4`.
These ids are meant to be passed back directly as the OpenAI `model` field.
</Accordion>
<Accordion title="Does `/v1/models` list agents or sub-agents?">
No.
`/v1/models` lists model choices, not execution topology. Agents and sub-agents are OpenClaw routing concerns, so they are selected separately with `x-openclaw-agent-id` or the `openclaw:<agentId>` / `agent:<agentId>` model aliases on chat and responses requests.
</Accordion>
<Accordion title="How does agent-scoped filtering work?">
Send `x-openclaw-agent-id: <agentId>` when you want the model list for a specific agent.
OpenClaw filters the model list against that agent's allowed models and fallbacks when configured. If no allowlist is configured, the endpoint returns the full catalog.
</Accordion>
<Accordion title="How do sub-agents pick a model?">
Sub-agent model choice is resolved at spawn time from OpenClaw agent config.
That means sub-agent model selection does not create extra `/v1/models` entries. Keep the compatibility list flat, and treat agent and sub-agent selection as separate OpenClaw-native routing behavior.
</Accordion>
<Accordion title="What should clients do in practice?">
Use `/v1/models` to populate the normal model picker.
If your client or integration also knows which OpenClaw agent it wants, set `x-openclaw-agent-id` when listing models and when sending chat, responses, or embeddings requests. That keeps the picker aligned with the target agent's allowed model set.
</Accordion>
</AccordionGroup>
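In practice the `model` field can carry either kind of value. A sketch, with `main` as a stand-in agent id:
```ts
// Sketch only: the same endpoint accepts a catalog model id or an
// OpenClaw agent alias in the OpenAI `model` field.
const headers = {
  authorization: "Bearer YOUR_TOKEN",
  "content-type": "application/json",
};
const ask = (model: string) =>
  fetch("http://127.0.0.1:18789/v1/chat/completions", {
    method: "POST",
    headers,
    body: JSON.stringify({ model, messages: [{ role: "user", content: "hi" }] }),
  });

await ask("openai/gpt-5.4"); // explicit model picked from /v1/models
await ask("agent:main"); // route to an agent; its configured model applies
```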
## Streaming (SSE)
Set `stream: true` to receive Server-Sent Events (SSE):
@ -130,3 +188,36 @@ curl -N http://127.0.0.1:18789/v1/chat/completions \
"messages": [{"role":"user","content":"hi"}]
}'
```
List models:
```bash
curl -sS http://127.0.0.1:18789/v1/models \
-H 'Authorization: Bearer YOUR_TOKEN'
```
Fetch one model:
```bash
curl -sS http://127.0.0.1:18789/v1/models/openai%2Fgpt-5.4 \
-H 'Authorization: Bearer YOUR_TOKEN'
```
Create embeddings:
```bash
curl -sS http://127.0.0.1:18789/v1/embeddings \
-H 'Authorization: Bearer YOUR_TOKEN' \
-H 'Content-Type: application/json' \
-H 'x-openclaw-agent-id: main' \
-d '{
"model": "openai/text-embedding-3-small",
"input": ["alpha", "beta"]
}'
```
Notes:
- `/v1/models` returns canonical ids in `provider/model` form so they can be passed back directly as OpenAI `model` values.
- `/v1/models` stays flat on purpose: it does not enumerate agents or sub-agents as pseudo-model ids.
- `/v1/embeddings` supports `input` as a string or array of strings.
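The endpoint also accepts OpenAI's `encoding_format: "base64"`, where each embedding is the base64 of the vector's raw float32 bytes. A decoding sketch (TypeScript; endpoint and token are placeholders):
```ts
// Sketch only: request base64-packed embeddings and unpack them back
// into float32 vectors.
import { Buffer } from "node:buffer";

const res = await fetch("http://127.0.0.1:18789/v1/embeddings", {
  method: "POST",
  headers: {
    authorization: "Bearer YOUR_TOKEN",
    "content-type": "application/json",
  },
  body: JSON.stringify({
    model: "openai/text-embedding-3-small",
    input: "alpha",
    encoding_format: "base64",
  }),
});
const { data } = (await res.json()) as { data: Array<{ embedding: string }> };
const bytes = Buffer.from(data[0].embedding, "base64");
// Copy into a fresh buffer so the Float32Array view is 4-byte aligned.
const vector = new Float32Array(Uint8Array.from(bytes).buffer);
console.log(vector.length);
```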


@ -26,9 +26,19 @@ Operational behavior matches [OpenAI Chat Completions](/gateway/openai-http-api)
- treat the endpoint as full operator access for the gateway instance
- select agents with `model: "openclaw:<agentId>"`, `model: "agent:<agentId>"`, or `x-openclaw-agent-id`
- use `x-openclaw-session-key` for explicit session routing
- use `x-openclaw-message-channel` when you want a non-default synthetic ingress channel context
Enable or disable this endpoint with `gateway.http.endpoints.responses.enabled`.
The same compatibility surface also includes:
- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/chat/completions`
For the canonical explanation of how model listing, agent routing, and sub-agent model selection fit together, see [OpenAI Chat Completions](/gateway/openai-http-api#model-list-and-agent-routing).
## Session behavior
By default the endpoint is **stateless per request** (a new session key is generated each call).
@ -54,9 +64,12 @@ Accepted but **currently ignored**:
- `reasoning`
- `metadata`
- `store`
- `previous_response_id`
- `truncation`
Supported:
- `previous_response_id`: OpenClaw reuses the earlier response session when the request stays within the same auth-subject/agent/requested-session scope.
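A continuity sketch (TypeScript; host, token, and the default `openclaw` model value are placeholders for your setup):
```ts
// Sketch only: chain turns via previous_response_id. Reuse applies only
// while the auth subject, agent, and requested session scope match.
async function respond(input: string, previous_response_id?: string) {
  const res = await fetch("http://127.0.0.1:18789/v1/responses", {
    method: "POST",
    headers: {
      authorization: "Bearer YOUR_TOKEN",
      "content-type": "application/json",
    },
    body: JSON.stringify({ model: "openclaw", input, previous_response_id }),
  });
  return (await res.json()) as { id: string };
}

const first = await respond("Remember the code word: heron.");
await respond("What was the code word?", first.id);
```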
## Items (input)
### `message`


@ -0,0 +1,176 @@
import { afterAll, beforeAll, describe, expect, it, vi } from "vitest";
import { resolveAgentDir } from "../agents/agent-scope.js";
import { getFreePort, installGatewayTestHooks } from "./test-helpers.js";
installGatewayTestHooks({ scope: "suite" });
let startGatewayServer: typeof import("./server.js").startGatewayServer;
let createEmbeddingProviderMock: ReturnType<typeof vi.fn>;
let enabledServer: Awaited<ReturnType<typeof startServer>>;
let enabledPort: number;
beforeAll(async () => {
vi.resetModules();
createEmbeddingProviderMock = vi.fn(async (options: { provider: string; model: string }) => ({
provider: {
id: options.provider,
model: options.model,
embedQuery: async () => [0.1, 0.2],
embedBatch: async (texts: string[]) =>
texts.map((_text, index) => [index + 0.1, index + 0.2]),
},
}));
vi.doMock("../memory/embeddings.js", async () => {
const actual =
await vi.importActual<typeof import("../memory/embeddings.js")>("../memory/embeddings.js");
return {
...actual,
createEmbeddingProvider: createEmbeddingProviderMock,
};
});
({ startGatewayServer } = await import("./server.js"));
enabledPort = await getFreePort();
enabledServer = await startServer(enabledPort, { openAiChatCompletionsEnabled: true });
});
afterAll(async () => {
await enabledServer.close({ reason: "embeddings http enabled suite done" });
vi.resetModules();
});
async function startServer(port: number, opts?: { openAiChatCompletionsEnabled?: boolean }) {
return await startGatewayServer(port, {
host: "127.0.0.1",
auth: { mode: "token", token: "secret" },
controlUiEnabled: false,
openAiChatCompletionsEnabled: opts?.openAiChatCompletionsEnabled ?? false,
});
}
async function postEmbeddings(body: unknown, headers?: Record<string, string>) {
return await fetch(`http://127.0.0.1:${enabledPort}/v1/embeddings`, {
method: "POST",
headers: {
authorization: "Bearer secret",
"content-type": "application/json",
...headers,
},
body: JSON.stringify(body),
});
}
describe("OpenAI-compatible embeddings HTTP API (e2e)", () => {
it("embeds string and array inputs", async () => {
const single = await postEmbeddings({
model: "text-embedding-3-small",
input: "hello",
});
expect(single.status).toBe(200);
const singleJson = (await single.json()) as {
object?: string;
data?: Array<{ object?: string; embedding?: number[]; index?: number }>;
};
expect(singleJson.object).toBe("list");
expect(singleJson.data?.[0]?.object).toBe("embedding");
expect(singleJson.data?.[0]?.embedding).toEqual([0.1, 0.2]);
const batch = await postEmbeddings({
model: "text-embedding-3-small",
input: ["a", "b"],
});
expect(batch.status).toBe(200);
const batchJson = (await batch.json()) as {
data?: Array<{ embedding?: number[]; index?: number }>;
};
expect(batchJson.data).toEqual([
{ object: "embedding", index: 0, embedding: [0.1, 0.2] },
{ object: "embedding", index: 1, embedding: [1.1, 1.2] },
]);
const qualified = await postEmbeddings({
model: "openai/text-embedding-3-small",
input: "hello again",
});
expect(qualified.status).toBe(200);
const qualifiedJson = (await qualified.json()) as { model?: string };
expect(qualifiedJson.model).toBe("openai/text-embedding-3-small");
const lastCall = createEmbeddingProviderMock.mock.calls.at(-1)?.[0] as
| { provider?: string; model?: string }
| undefined;
expect(lastCall).toMatchObject({
provider: "openai",
model: "text-embedding-3-small",
});
});
it("supports base64 encoding and agent-scoped auth/config resolution", async () => {
const res = await postEmbeddings(
{
model: "text-embedding-3-small",
input: "hello",
encoding_format: "base64",
},
{ "x-openclaw-agent-id": "beta" },
);
expect(res.status).toBe(200);
const json = (await res.json()) as { data?: Array<{ embedding?: string }> };
expect(typeof json.data?.[0]?.embedding).toBe("string");
expect(createEmbeddingProviderMock).toHaveBeenCalled();
const lastCall = createEmbeddingProviderMock.mock.calls.at(-1)?.[0] as
| { provider?: string; model?: string; fallback?: string; agentDir?: string }
| undefined;
expect(lastCall?.model).toBe("text-embedding-3-small");
expect(lastCall?.fallback).toBe("none");
expect(lastCall?.agentDir).toBe(resolveAgentDir({}, "beta"));
});
it("rejects invalid input shapes", async () => {
const res = await postEmbeddings({
model: "text-embedding-3-small",
input: [{ nope: true }],
});
expect(res.status).toBe(400);
const json = (await res.json()) as { error?: { type?: string } };
expect(json.error?.type).toBe("invalid_request_error");
});
it("rejects disallowed provider-prefixed model overrides", async () => {
const res = await postEmbeddings({
model: "ollama/nomic-embed-text",
input: "hello",
});
expect(res.status).toBe(400);
const json = (await res.json()) as { error?: { type?: string; message?: string } };
expect(json.error).toEqual({
type: "invalid_request_error",
message: "This agent does not allow that embedding provider on `/v1/embeddings`.",
});
});
it("rejects oversized batches", async () => {
const res = await postEmbeddings({
model: "text-embedding-3-small",
input: Array.from({ length: 129 }, () => "x"),
});
expect(res.status).toBe(400);
const json = (await res.json()) as { error?: { type?: string; message?: string } };
expect(json.error).toEqual({
type: "invalid_request_error",
message: "Too many inputs (max 128).",
});
});
it("sanitizes provider failures", async () => {
createEmbeddingProviderMock.mockRejectedValueOnce(new Error("secret upstream failure"));
const res = await postEmbeddings({
model: "text-embedding-3-small",
input: "hello",
});
expect(res.status).toBe(500);
const json = (await res.json()) as { error?: { type?: string; message?: string } };
expect(json.error).toEqual({
type: "api_error",
message: "internal error",
});
});
});


@ -0,0 +1,252 @@
import { Buffer } from "node:buffer";
import type { IncomingMessage, ServerResponse } from "node:http";
import { resolveAgentDir } from "../agents/agent-scope.js";
import { resolveMemorySearchConfig } from "../agents/memory-search.js";
import { loadConfig } from "../config/config.js";
import { logWarn } from "../logger.js";
import {
createEmbeddingProvider,
type EmbeddingProviderOptions,
type EmbeddingProviderId,
type EmbeddingProviderRequest,
} from "../memory/embeddings.js";
import type { AuthRateLimiter } from "./auth-rate-limit.js";
import type { ResolvedGatewayAuth } from "./auth.js";
import { sendJson } from "./http-common.js";
import { handleGatewayPostJsonEndpoint } from "./http-endpoint-helpers.js";
import { resolveAgentIdFromHeader } from "./http-utils.js";
type OpenAiEmbeddingsHttpOptions = {
auth: ResolvedGatewayAuth;
maxBodyBytes?: number;
trustedProxies?: string[];
allowRealIpFallback?: boolean;
rateLimiter?: AuthRateLimiter;
};
type EmbeddingsRequest = {
model?: unknown;
input?: unknown;
encoding_format?: unknown;
dimensions?: unknown;
user?: unknown;
};
const DEFAULT_EMBEDDINGS_BODY_BYTES = 5 * 1024 * 1024;
const MAX_EMBEDDING_INPUTS = 128;
const MAX_EMBEDDING_INPUT_CHARS = 8_192;
const MAX_EMBEDDING_TOTAL_CHARS = 65_536;
const SAFE_AUTO_EXPLICIT_PROVIDERS = new Set<EmbeddingProviderId>([
"openai",
"gemini",
"voyage",
"mistral",
]);
function coerceRequest(value: unknown): EmbeddingsRequest {
return value && typeof value === "object" ? (value as EmbeddingsRequest) : {};
}
function resolveInputTexts(input: unknown): string[] | null {
if (typeof input === "string") {
return [input];
}
if (!Array.isArray(input)) {
return null;
}
if (input.every((entry) => typeof entry === "string")) {
return input;
}
return null;
}
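// OpenAI's base64 embedding format is the vector's raw float32 bytes,
// so Float32Array -> Buffer -> base64 reproduces it directly.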
function encodeEmbeddingBase64(embedding: number[]): string {
const float32 = Float32Array.from(embedding);
return Buffer.from(float32.buffer).toString("base64");
}
function validateInputTexts(texts: string[]): string | undefined {
if (texts.length > MAX_EMBEDDING_INPUTS) {
return `Too many inputs (max ${MAX_EMBEDDING_INPUTS}).`;
}
let totalChars = 0;
for (const text of texts) {
if (text.length > MAX_EMBEDDING_INPUT_CHARS) {
return `Input too long (max ${MAX_EMBEDDING_INPUT_CHARS} chars).`;
}
totalChars += text.length;
if (totalChars > MAX_EMBEDDING_TOTAL_CHARS) {
return `Total input too large (max ${MAX_EMBEDDING_TOTAL_CHARS} chars).`;
}
}
return undefined;
}
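// Map the requested `model` onto a provider/model pair: bare names use the
// agent's configured provider, while explicit `provider/model` prefixes are
// honored only when the configured provider (or the auto safelist) allows it.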
function resolveEmbeddingsTarget(params: {
requestModel: string;
configuredProvider: EmbeddingProviderRequest;
}): { provider: EmbeddingProviderRequest; model: string } | { errorMessage: string } {
const raw = params.requestModel.trim();
const slash = raw.indexOf("/");
if (slash === -1) {
return { provider: params.configuredProvider, model: raw };
}
const provider = raw.slice(0, slash).trim().toLowerCase() as EmbeddingProviderRequest;
const model = raw.slice(slash + 1).trim();
if (!model) {
return { errorMessage: "Unsupported embedding model reference." };
}
if (params.configuredProvider === "auto") {
if (provider === "auto") {
return { provider: "auto", model };
}
if (SAFE_AUTO_EXPLICIT_PROVIDERS.has(provider)) {
return { provider, model };
}
return {
errorMessage: "This agent does not allow that embedding provider on `/v1/embeddings`.",
};
}
if (provider !== params.configuredProvider) {
return {
errorMessage: "This agent does not allow that embedding provider on `/v1/embeddings`.",
};
}
return { provider: params.configuredProvider, model };
}
export async function handleOpenAiEmbeddingsHttpRequest(
req: IncomingMessage,
res: ServerResponse,
opts: OpenAiEmbeddingsHttpOptions,
): Promise<boolean> {
const handled = await handleGatewayPostJsonEndpoint(req, res, {
pathname: "/v1/embeddings",
auth: opts.auth,
trustedProxies: opts.trustedProxies,
allowRealIpFallback: opts.allowRealIpFallback,
rateLimiter: opts.rateLimiter,
maxBodyBytes: opts.maxBodyBytes ?? DEFAULT_EMBEDDINGS_BODY_BYTES,
});
if (handled === false) {
return false;
}
if (!handled) {
return true;
}
const payload = coerceRequest(handled.body);
const requestModel = typeof payload.model === "string" ? payload.model.trim() : "";
if (!requestModel) {
sendJson(res, 400, {
error: { message: "Missing `model`.", type: "invalid_request_error" },
});
return true;
}
const texts = resolveInputTexts(payload.input);
if (!texts) {
sendJson(res, 400, {
error: {
message: "`input` must be a string or an array of strings.",
type: "invalid_request_error",
},
});
return true;
}
const inputError = validateInputTexts(texts);
if (inputError) {
sendJson(res, 400, {
error: { message: inputError, type: "invalid_request_error" },
});
return true;
}
const cfg = loadConfig();
const agentId = resolveAgentIdFromHeader(req) ?? "main";
const agentDir = resolveAgentDir(cfg, agentId);
const memorySearch = resolveMemorySearchConfig(cfg, agentId);
const configuredProvider = (memorySearch?.provider ?? "openai") as EmbeddingProviderRequest;
const target = resolveEmbeddingsTarget({
requestModel,
configuredProvider,
});
if ("errorMessage" in target) {
sendJson(res, 400, {
error: {
message: target.errorMessage,
type: "invalid_request_error",
},
});
return true;
}
const options: EmbeddingProviderOptions = {
config: cfg,
agentDir,
provider: target.provider,
model: target.model,
// Public HTTP embeddings should fail closed rather than silently mixing
// vector spaces across fallback providers/models.
fallback: "none",
local: memorySearch?.local,
remote: memorySearch?.remote
? {
baseUrl: memorySearch.remote.baseUrl,
apiKey: memorySearch.remote.apiKey,
headers: memorySearch.remote.headers,
}
: undefined,
outputDimensionality:
typeof payload.dimensions === "number" && payload.dimensions > 0
? Math.floor(payload.dimensions)
: memorySearch?.outputDimensionality,
};
try {
const result = await createEmbeddingProvider(options);
if (!result.provider) {
sendJson(res, 503, {
error: {
message: result.providerUnavailableReason ?? "Embeddings provider unavailable.",
type: "api_error",
},
});
return true;
}
const embeddings = await result.provider.embedBatch(texts);
const encodingFormat = payload.encoding_format === "base64" ? "base64" : "float";
sendJson(res, 200, {
object: "list",
data: embeddings.map((embedding, index) => ({
object: "embedding",
index,
embedding: encodingFormat === "base64" ? encodeEmbeddingBase64(embedding) : embedding,
})),
model:
requestModel.includes("/") || target.provider === "auto"
? requestModel
: `${target.provider}/${target.model}`,
usage: {
prompt_tokens: 0,
total_tokens: 0,
},
});
} catch (err) {
logWarn(`openai-compat: embeddings request failed: ${String(err)}`);
sendJson(res, 500, {
error: {
message: "internal error",
type: "api_error",
},
});
}
return true;
}


@ -1,7 +1,16 @@
import { randomUUID } from "node:crypto";
import type { IncomingMessage } from "node:http";
import {
buildAllowedModelSet,
isCliProvider,
modelKey,
parseModelRef,
resolveDefaultModelForAgent,
} from "../agents/model-selection.js";
import { loadConfig } from "../config/config.js";
import { buildAgentMainSessionKey, normalizeAgentId } from "../routing/session-key.js";
import { normalizeMessageChannel } from "../utils/message-channel.js";
import { loadGatewayModelCatalog } from "./server-model-catalog.js";
export function getHeader(req: IncomingMessage, name: string): string | undefined {
const raw = req.headers[name.toLowerCase()];
@ -50,6 +59,51 @@ export function resolveAgentIdFromModel(model: string | undefined): string | und
return normalizeAgentId(agentId);
}
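// Validate an explicit OpenAI-compat `model` against the agent's allowed
// model set; agent aliases and the bare "openclaw" pseudo-model resolve
// elsewhere and pass through with no override.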
export async function resolveOpenAiCompatModelOverride(params: {
agentId: string;
model: string | undefined;
}): Promise<{ modelOverride?: string; errorMessage?: string }> {
const model = params.model;
const raw = model?.trim();
if (!raw) {
return {};
}
if (raw.toLowerCase() === "openclaw") {
return {};
}
if (resolveAgentIdFromModel(raw)) {
return {};
}
const cfg = loadConfig();
const defaultModelRef = resolveDefaultModelForAgent({ cfg, agentId: params.agentId });
const defaultProvider = defaultModelRef.provider;
const parsed = parseModelRef(raw, defaultProvider);
if (!parsed) {
return { errorMessage: "Invalid `model`." };
}
const catalog = await loadGatewayModelCatalog();
const allowed = buildAllowedModelSet({
cfg,
catalog,
defaultProvider,
agentId: params.agentId,
});
const normalized = modelKey(parsed.provider, parsed.model);
if (
!isCliProvider(parsed.provider, cfg) &&
!allowed.allowAny &&
!allowed.allowedKeys.has(normalized)
) {
return {
errorMessage: `Model '${normalized}' is not allowed for agent '${params.agentId}'.`,
};
}
return { modelOverride: raw };
}
export function resolveAgentIdForRequest(params: {
req: IncomingMessage;
model: string | undefined;


@ -0,0 +1,76 @@
import { afterAll, beforeAll, describe, expect, it } from "vitest";
import { getFreePort, installGatewayTestHooks } from "./test-helpers.js";
installGatewayTestHooks({ scope: "suite" });
let startGatewayServer: typeof import("./server.js").startGatewayServer;
let enabledServer: Awaited<ReturnType<typeof startServer>>;
let enabledPort: number;
beforeAll(async () => {
({ startGatewayServer } = await import("./server.js"));
enabledPort = await getFreePort();
enabledServer = await startServer(enabledPort, { openAiChatCompletionsEnabled: true });
});
afterAll(async () => {
await enabledServer.close({ reason: "models http enabled suite done" });
});
async function startServer(port: number, opts?: { openAiChatCompletionsEnabled?: boolean }) {
return await startGatewayServer(port, {
host: "127.0.0.1",
auth: { mode: "token", token: "secret" },
controlUiEnabled: false,
openAiChatCompletionsEnabled: opts?.openAiChatCompletionsEnabled ?? false,
});
}
async function getModels(pathname: string, headers?: Record<string, string>) {
return await fetch(`http://127.0.0.1:${enabledPort}${pathname}`, {
headers: {
authorization: "Bearer secret",
...headers,
},
});
}
describe("OpenAI-compatible models HTTP API (e2e)", () => {
it("serves /v1/models when compatibility endpoints are enabled", async () => {
const res = await getModels("/v1/models");
expect(res.status).toBe(200);
const json = (await res.json()) as { object?: string; data?: Array<{ id?: string }> };
expect(json.object).toBe("list");
expect(Array.isArray(json.data)).toBe(true);
expect((json.data?.length ?? 0) > 0).toBe(true);
expect(
json.data?.every((entry) => typeof entry.id === "string" && entry.id?.includes("/")),
).toBe(true);
});
it("serves /v1/models/{id}", async () => {
const list = (await (await getModels("/v1/models")).json()) as {
data?: Array<{ id?: string }>;
};
const firstId = list.data?.[0]?.id;
expect(typeof firstId).toBe("string");
const res = await getModels(`/v1/models/${encodeURIComponent(firstId!)}`);
expect(res.status).toBe(200);
const json = (await res.json()) as { id?: string; object?: string };
expect(json.object).toBe("model");
expect(json.id).toBe(firstId);
});
it("rejects when disabled", async () => {
const port = await getFreePort();
const server = await startServer(port, { openAiChatCompletionsEnabled: false });
try {
const res = await fetch(`http://127.0.0.1:${port}/v1/models`, {
headers: { authorization: "Bearer secret" },
});
expect(res.status).toBe(404);
} finally {
await server.close({ reason: "models disabled test done" });
}
});
});

src/gateway/models-http.ts (new file, 138 lines)

@ -0,0 +1,138 @@
import type { IncomingMessage, ServerResponse } from "node:http";
import { DEFAULT_PROVIDER } from "../agents/defaults.js";
import type { ModelCatalogEntry } from "../agents/model-catalog.js";
import { buildAllowedModelSet, modelKey, parseModelRef } from "../agents/model-selection.js";
import { loadConfig } from "../config/config.js";
import type { AuthRateLimiter } from "./auth-rate-limit.js";
import type { ResolvedGatewayAuth } from "./auth.js";
import { authorizeGatewayBearerRequestOrReply } from "./http-auth-helpers.js";
import { sendInvalidRequest, sendJson, sendMethodNotAllowed } from "./http-common.js";
import { resolveAgentIdForRequest } from "./http-utils.js";
import { loadGatewayModelCatalog } from "./server-model-catalog.js";
type OpenAiModelsHttpOptions = {
auth: ResolvedGatewayAuth;
trustedProxies?: string[];
allowRealIpFallback?: boolean;
rateLimiter?: AuthRateLimiter;
};
type OpenAiModelObject = {
id: string;
object: "model";
created: number;
owned_by: string;
permission: [];
input?: ModelCatalogEntry["input"];
context_window?: number;
reasoning?: boolean;
};
function toOpenAiModel(entry: ModelCatalogEntry): OpenAiModelObject {
return {
id: modelKey(entry.provider, entry.id),
object: "model",
created: 0,
owned_by: entry.provider,
permission: [],
...(entry.input ? { input: entry.input } : {}),
...(typeof entry.contextWindow === "number" ? { context_window: entry.contextWindow } : {}),
...(typeof entry.reasoning === "boolean" ? { reasoning: entry.reasoning } : {}),
};
}
async function authorizeRequest(
req: IncomingMessage,
res: ServerResponse,
opts: OpenAiModelsHttpOptions,
): Promise<boolean> {
return await authorizeGatewayBearerRequestOrReply({
req,
res,
auth: opts.auth,
trustedProxies: opts.trustedProxies,
allowRealIpFallback: opts.allowRealIpFallback,
rateLimiter: opts.rateLimiter,
});
}
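// Scope the catalog to the agent's allowlist when one is configured; an
// empty allowlist means "no restriction", so serve the full catalog.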
async function loadAllowedCatalog(req: IncomingMessage): Promise<ModelCatalogEntry[]> {
const cfg = loadConfig();
const catalog = await loadGatewayModelCatalog();
const agentId = resolveAgentIdForRequest({ req, model: undefined });
const { allowedCatalog } = buildAllowedModelSet({
cfg,
catalog,
defaultProvider: DEFAULT_PROVIDER,
agentId,
});
return allowedCatalog.length > 0 ? allowedCatalog : catalog;
}
function resolveRequestPath(req: IncomingMessage): string {
return new URL(req.url ?? "/", `http://${req.headers.host || "localhost"}`).pathname;
}
export async function handleOpenAiModelsHttpRequest(
req: IncomingMessage,
res: ServerResponse,
opts: OpenAiModelsHttpOptions,
): Promise<boolean> {
const requestPath = resolveRequestPath(req);
if (requestPath !== "/v1/models" && !requestPath.startsWith("/v1/models/")) {
return false;
}
if (req.method !== "GET") {
sendMethodNotAllowed(res, "GET");
return true;
}
if (!(await authorizeRequest(req, res, opts))) {
return true;
}
const catalog = await loadAllowedCatalog(req);
if (requestPath === "/v1/models") {
sendJson(res, 200, {
object: "list",
data: catalog.map(toOpenAiModel),
});
return true;
}
const encodedId = requestPath.slice("/v1/models/".length);
if (!encodedId) {
sendInvalidRequest(res, "Missing model id.");
return true;
}
let decodedId: string;
try {
decodedId = decodeURIComponent(encodedId);
} catch {
sendInvalidRequest(res, "Invalid model id encoding.");
return true;
}
const parsed = parseModelRef(decodedId, DEFAULT_PROVIDER);
if (!parsed) {
sendInvalidRequest(res, "Invalid model id.");
return true;
}
const key = modelKey(parsed.provider, parsed.model);
const entry = catalog.find((item) => modelKey(item.provider, item.id) === key);
if (!entry) {
sendJson(res, 404, {
error: {
message: `Model '${decodedId}' not found.`,
type: "invalid_request_error",
},
});
return true;
}
sendJson(res, 200, toOpenAiModel(entry));
return true;
}


@ -1,3 +1,5 @@
import fs from "node:fs/promises";
import path from "node:path";
import { afterAll, beforeAll, describe, expect, it } from "vitest";
import { HISTORY_CONTEXT_MARKER } from "../auto-reply/reply/history.js";
import { CURRENT_MESSAGE_MARKER } from "../auto-reply/reply/mentions.js";
@ -45,6 +47,15 @@ async function startServer(port: number, opts?: { openAiChatCompletionsEnabled?:
});
}
async function writeGatewayConfig(config: Record<string, unknown>) {
const configPath = process.env.OPENCLAW_CONFIG_PATH;
if (!configPath) {
throw new Error("OPENCLAW_CONFIG_PATH is required for gateway config tests");
}
await fs.mkdir(path.dirname(configPath), { recursive: true });
await fs.writeFile(configPath, JSON.stringify(config, null, 2), "utf-8");
}
async function postChatCompletions(port: number, body: unknown, headers?: Record<string, string>) {
const res = await fetch(`http://127.0.0.1:${port}/v1/chat/completions`, {
method: "POST",
@ -244,6 +255,54 @@ describe("OpenAI-compatible HTTP API (e2e)", () => {
await res.text();
}
{
mockAgentOnce([{ text: "hello" }]);
const res = await postChatCompletions(port, {
model: "openai/gpt-5.4",
messages: [{ role: "user", content: "hi" }],
});
expect(res.status).toBe(200);
const opts = (agentCommand.mock.calls[0] as unknown[] | undefined)?.[0];
expect((opts as { model?: string } | undefined)?.model).toBe("openai/gpt-5.4");
await res.text();
}
{
await writeGatewayConfig({
agents: {
defaults: {
model: { primary: "openai/gpt-5.4" },
models: {
"openai/gpt-5.4": {},
},
},
},
});
mockAgentOnce([{ text: "hello" }]);
const res = await postChatCompletions(port, {
model: "gpt-5.4",
messages: [{ role: "user", content: "hi" }],
});
expect(res.status).toBe(200);
const opts = (agentCommand.mock.calls[0] as unknown[] | undefined)?.[0];
expect((opts as { model?: string } | undefined)?.model).toBe("gpt-5.4");
await res.text();
await writeGatewayConfig({});
}
{
agentCommand.mockClear();
const res = await postChatCompletions(port, {
model: "openai/",
messages: [{ role: "user", content: "hi" }],
});
expect(res.status).toBe(400);
const json = (await res.json()) as { error?: { type?: string; message?: string } };
expect(json.error?.type).toBe("invalid_request_error");
expect(json.error?.message).toBe("Invalid `model`.");
expect(agentCommand).toHaveBeenCalledTimes(0);
}
{
mockAgentOnce([{ text: "hello" }]);
const res = await postChatCompletions(port, {


@ -27,7 +27,7 @@ import type { AuthRateLimiter } from "./auth-rate-limit.js";
import type { ResolvedGatewayAuth } from "./auth.js";
import { sendJson, setSseHeaders, writeDone } from "./http-common.js";
import { handleGatewayPostJsonEndpoint } from "./http-endpoint-helpers.js";
import { resolveGatewayRequestContext } from "./http-utils.js";
import { resolveGatewayRequestContext, resolveOpenAiCompatModelOverride } from "./http-utils.js";
import { normalizeInputHostnameAllowlist } from "./input-allowlist.js";
type OpenAiHttpOptions = {
@ -102,6 +102,7 @@ function writeSse(res: ServerResponse, data: unknown) {
function buildAgentCommandInput(params: {
prompt: { message: string; extraSystemPrompt?: string; images?: ImageContent[] };
modelOverride?: string;
sessionKey: string;
runId: string;
messageChannel: string;
@ -110,6 +111,7 @@ function buildAgentCommandInput(params: {
message: params.prompt.message,
extraSystemPrompt: params.prompt.extraSystemPrompt,
images: params.prompt.images,
model: params.modelOverride,
sessionKey: params.sessionKey,
runId: params.runId,
deliver: false as const,
@ -432,7 +434,7 @@ export async function handleOpenAiHttpRequest(
const model = typeof payload.model === "string" ? payload.model : "openclaw";
const user = typeof payload.user === "string" ? payload.user : undefined;
const { sessionKey, messageChannel } = resolveGatewayRequestContext({
const { agentId, sessionKey, messageChannel } = resolveGatewayRequestContext({
req,
model,
user,
@ -440,6 +442,16 @@ export async function handleOpenAiHttpRequest(
defaultMessageChannel: "webchat",
useMessageChannelHeader: true,
});
const { modelOverride, errorMessage: modelError } = await resolveOpenAiCompatModelOverride({
agentId,
model,
});
if (modelError) {
sendJson(res, 400, {
error: { message: modelError, type: "invalid_request_error" },
});
return true;
}
const activeTurnContext = resolveActiveTurnContext(payload.messages);
const prompt = buildAgentPrompt(payload.messages, activeTurnContext.activeUserMessageIndex);
let images: ImageContent[] = [];
@ -474,6 +486,7 @@ export async function handleOpenAiHttpRequest(
extraSystemPrompt: prompt.extraSystemPrompt,
images: images.length > 0 ? images : undefined,
},
modelOverride,
sessionKey,
runId,
messageChannel,


@ -17,12 +17,12 @@ let openResponsesTesting: {
responseId: string,
sessionKey: string,
now: number,
scope?: { agentId: string; user?: string; requestedSessionKey?: string },
scope?: { authSubject: string; agentId: string; requestedSessionKey?: string },
): void;
lookupResponseSessionAt(
responseId: string | undefined,
now: number,
scope?: { agentId: string; user?: string; requestedSessionKey?: string },
scope?: { authSubject: string; agentId: string; requestedSessionKey?: string },
): string | undefined;
getResponseSessionIds(): string[];
};
@ -233,6 +233,17 @@ describe("OpenResponses HTTP API (e2e)", () => {
);
await ensureResponseConsumed(resMissingModel);
agentCommand.mockClear();
const resInvalidModel = await postResponses(port, { model: "openai/", input: "hi" });
expect(resInvalidModel.status).toBe(400);
const invalidModelJson = (await resInvalidModel.json()) as {
error?: { type?: string; message?: string };
};
expect(invalidModelJson.error?.type).toBe("invalid_request_error");
expect(invalidModelJson.error?.message).toBe("Invalid `model`.");
expect(agentCommand).toHaveBeenCalledTimes(0);
await ensureResponseConsumed(resInvalidModel);
mockAgentOnce([{ text: "hello" }]);
const resHeader = await postResponses(
port,
@ -267,10 +278,22 @@ describe("OpenResponses HTTP API (e2e)", () => {
expect(resChannelHeader.status).toBe(200);
const optsChannelHeader = (agentCommand.mock.calls[0] as unknown[] | undefined)?.[0];
expect((optsChannelHeader as { messageChannel?: string } | undefined)?.messageChannel).toBe(
"webchat",
"custom-client-channel",
);
await ensureResponseConsumed(resChannelHeader);
mockAgentOnce([{ text: "hello" }]);
const resModelOverride = await postResponses(port, {
model: "openai/text-embedding-3-small",
input: "hi",
});
expect(resModelOverride.status).toBe(200);
const optsModelOverride = (agentCommand.mock.calls[0] as unknown[] | undefined)?.[0];
expect((optsModelOverride as { model?: string } | undefined)?.model).toBe(
"openai/text-embedding-3-small",
);
await ensureResponseConsumed(resModelOverride);
mockAgentOnce([{ text: "hello" }]);
const resUser = await postResponses(port, {
user: "alice",
@ -777,7 +800,7 @@ describe("OpenResponses HTTP API (e2e)", () => {
await ensureResponseConsumed(secondResponse);
});
it("does not reuse prior sessions across different user scopes", async () => {
it("reuses prior sessions across different user values when auth scope matches", async () => {
const port = enabledPort;
agentCommand.mockClear();
agentCommand.mockResolvedValueOnce({
@ -812,8 +835,7 @@ describe("OpenResponses HTTP API (e2e)", () => {
const secondOpts = (agentCommand.mock.calls[1] as unknown[] | undefined)?.[0] as
| { sessionKey?: string }
| undefined;
expect(secondOpts?.sessionKey).not.toBe(firstOpts?.sessionKey);
expect(secondOpts?.sessionKey ?? "").toContain("openresponses-user:bob");
expect(secondOpts?.sessionKey).toBe(firstOpts?.sessionKey);
await ensureResponseConsumed(secondResponse);
});
@ -864,22 +886,22 @@ describe("OpenResponses HTTP API (e2e)", () => {
expect(openResponsesTesting.lookupResponseSessionAt("resp_504", 505)).toBe("session_504");
});
it("does not reuse cached sessions when the user scope changes", () => {
it("does not reuse cached sessions when the auth subject changes", () => {
openResponsesTesting.storeResponseSessionAt("resp_1", "session_1", 100, {
authSubject: "subject:a",
agentId: "main",
user: "alice",
});
expect(
openResponsesTesting.lookupResponseSessionAt("resp_1", 101, {
authSubject: "subject:a",
agentId: "main",
user: "alice",
}),
).toBe("session_1");
expect(
openResponsesTesting.lookupResponseSessionAt("resp_1", 101, {
authSubject: "subject:b",
agentId: "main",
user: "bob",
}),
).toBeUndefined();
});


@ -6,7 +6,7 @@
* @see https://www.open-responses.com/
*/
import { randomUUID } from "node:crypto";
import { createHash, randomUUID } from "node:crypto";
import type { IncomingMessage, ServerResponse } from "node:http";
import type { ImageContent } from "../agents/command/types.js";
import type { ClientToolDefinition } from "../agents/pi-embedded-runner/run/params.js";
@ -35,7 +35,13 @@ import type { AuthRateLimiter } from "./auth-rate-limit.js";
import type { ResolvedGatewayAuth } from "./auth.js";
import { sendJson, setSseHeaders, writeDone } from "./http-common.js";
import { handleGatewayPostJsonEndpoint } from "./http-endpoint-helpers.js";
import { getHeader, resolveGatewayRequestContext } from "./http-utils.js";
import {
getBearerToken,
getHeader,
resolveAgentIdForRequest,
resolveGatewayRequestContext,
resolveOpenAiCompatModelOverride,
} from "./http-utils.js";
import { normalizeInputHostnameAllowlist } from "./input-allowlist.js";
import {
CreateResponseBodySchema,
@ -64,8 +70,8 @@ const DEFAULT_MAX_URL_PARTS = 8;
const RESPONSE_SESSION_TTL_MS = 30 * 60 * 1000;
const MAX_RESPONSE_SESSION_ENTRIES = 500;
type ResponseSessionScope = {
authSubject: string;
agentId: string;
user?: string;
requestedSessionKey?: string;
};
@ -77,23 +83,40 @@ type ResponseSessionEntry = ResponseSessionScope & {
const responseSessionMap = new Map<string, ResponseSessionEntry>();
function normalizeResponseSessionScope(scope: ResponseSessionScope): ResponseSessionScope {
const user = scope.user?.trim();
const authSubject = scope.authSubject.trim();
const requestedSessionKey = scope.requestedSessionKey?.trim();
return {
authSubject,
agentId: scope.agentId,
user: user || undefined,
requestedSessionKey: requestedSessionKey || undefined,
};
}
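// Derive a stable caller identity for response-session scoping: prefer a
// hash of the bearer token, then the trusted-proxy user header, falling
// back to a per-auth-mode bucket.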
function resolveResponseSessionAuthSubject(params: {
req: IncomingMessage;
auth: ResolvedGatewayAuth;
}): string {
const bearer = getBearerToken(params.req);
if (bearer) {
return `bearer:${createHash("sha256").update(bearer).digest("hex")}`;
}
if (params.auth.mode === "trusted-proxy" && params.auth.trustedProxy?.userHeader) {
const user = getHeader(params.req, params.auth.trustedProxy.userHeader)?.trim();
if (user) {
return `trusted-proxy:${user}`;
}
}
return `gateway-auth:${params.auth.mode}`;
}
function createResponseSessionScope(params: {
req: IncomingMessage;
auth: ResolvedGatewayAuth;
agentId: string;
user?: string;
}): ResponseSessionScope {
return normalizeResponseSessionScope({
authSubject: resolveResponseSessionAuthSubject({ req: params.req, auth: params.auth }),
agentId: params.agentId,
user: params.user,
requestedSessionKey: getHeader(params.req, "x-openclaw-session-key"),
});
}
@ -103,8 +126,8 @@ function matchesResponseSessionScope(
scope: ResponseSessionScope,
): boolean {
return (
entry.authSubject === scope.authSubject &&
entry.agentId === scope.agentId &&
entry.user === scope.user &&
entry.requestedSessionKey === scope.requestedSessionKey
);
}
@ -176,14 +199,14 @@ export const __testing = {
responseId: string,
sessionKey: string,
now: number,
scope: ResponseSessionScope = { agentId: "main" },
scope: ResponseSessionScope = { authSubject: "test", agentId: "main" },
) {
storeResponseSession(responseId, sessionKey, normalizeResponseSessionScope(scope), now);
},
lookupResponseSessionAt(
responseId: string | undefined,
now: number,
scope: ResponseSessionScope = { agentId: "main" },
scope: ResponseSessionScope = { authSubject: "test", agentId: "main" },
) {
return lookupResponseSession(responseId, normalizeResponseSessionScope(scope), now);
},
@ -387,6 +410,7 @@ async function runResponsesAgentCommand(params: {
images: ImageContent[];
clientTools: ClientToolDefinition[];
extraSystemPrompt: string;
modelOverride?: string;
streamParams: { maxTokens: number } | undefined;
sessionKey: string;
runId: string;
@ -399,6 +423,7 @@ async function runResponsesAgentCommand(params: {
images: params.images.length > 0 ? params.images : undefined,
clientTools: params.clientTools.length > 0 ? params.clientTools : undefined,
extraSystemPrompt: params.extraSystemPrompt || undefined,
model: params.modelOverride,
streamParams: params.streamParams ?? undefined,
sessionKey: params.sessionKey,
runId: params.runId,
@ -455,6 +480,17 @@ export async function handleOpenResponsesHttpRequest(
const stream = Boolean(payload.stream);
const model = payload.model;
const user = payload.user;
const agentId = resolveAgentIdForRequest({ req, model });
const { modelOverride, errorMessage: modelError } = await resolveOpenAiCompatModelOverride({
agentId,
model,
});
if (modelError) {
sendJson(res, 400, {
error: { message: modelError, type: "invalid_request_error" },
});
return true;
}
// Extract images + files from input (Phase 2)
let images: ImageContent[] = [];
@ -593,15 +629,15 @@ export async function handleOpenResponsesHttpRequest(
user,
sessionPrefix: "openresponses",
defaultMessageChannel: "webchat",
useMessageChannelHeader: false,
useMessageChannelHeader: true,
});
const responseSessionScope = createResponseSessionScope({
req,
auth: opts.auth,
agentId: resolved.agentId,
user,
});
// Resolve session key: reuse previous_response_id only when it matches the
// same agent/user/requested-session scope as the current request.
// same auth-subject/agent/requested-session scope as the current request.
const previousSessionKey = lookupResponseSession(
payload.previous_response_id,
responseSessionScope,
@ -652,6 +688,7 @@ export async function handleOpenResponsesHttpRequest(
images,
clientTools: resolvedClientTools,
extraSystemPrompt,
modelOverride,
streamParams,
sessionKey,
runId: responseId,
@ -903,6 +940,7 @@ export async function handleOpenResponsesHttpRequest(
images,
clientTools: resolvedClientTools,
extraSystemPrompt,
modelOverride,
streamParams,
sessionKey,
runId: responseId,


@ -34,6 +34,7 @@ import {
handleControlUiHttpRequest,
type ControlUiRootState,
} from "./control-ui.js";
import { handleOpenAiEmbeddingsHttpRequest } from "./embeddings-http.js";
import { applyHookMappings } from "./hooks-mapping.js";
import {
extractHookToken,
@ -55,6 +56,7 @@ import {
} from "./hooks.js";
import { sendGatewayAuthFailure, setDefaultSecurityHeaders } from "./http-common.js";
import { getBearerToken } from "./http-utils.js";
import { handleOpenAiModelsHttpRequest } from "./models-http.js";
import { resolveRequestClientIp } from "./net.js";
import { handleOpenAiHttpRequest } from "./openai-http.js";
import { handleOpenResponsesHttpRequest } from "./openresponses-http.js";
@ -772,6 +774,7 @@ export function createGatewayHttpServer(opts: {
rateLimiter,
getReadiness,
} = opts;
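// /v1/models and /v1/embeddings ride along whenever either OpenAI-compat
// surface (chat completions or responses) is enabled.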
const openAiCompatEnabled = openAiChatCompletionsEnabled || openResponsesEnabled;
const httpServer: HttpServer = opts.tlsOptions
? createHttpsServer(opts.tlsOptions, (req, res) => {
void handleRequest(req, res);
@ -812,6 +815,30 @@ export function createGatewayHttpServer(opts: {
name: "hooks",
run: () => handleHooksRequest(req, res),
},
{
name: "models",
run: () =>
openAiCompatEnabled
? handleOpenAiModelsHttpRequest(req, res, {
auth: resolvedAuth,
trustedProxies,
allowRealIpFallback,
rateLimiter,
})
: false,
},
{
name: "embeddings",
run: () =>
openAiCompatEnabled
? handleOpenAiEmbeddingsHttpRequest(req, res, {
auth: resolvedAuth,
trustedProxies,
allowRealIpFallback,
rateLimiter,
})
: false,
},
{
name: "tools-invoke",
run: () =>