mirror of
https://github.com/openclaw/openclaw.git
synced 2026-03-27 09:21:35 +07:00
feat(gateway): add missing OpenAI-compatible endpoints (models and embeddings) (#53992)
* feat(gateway): add OpenAI-compatible models and embeddings
* docs(gateway): clarify model list and agent routing
* Update index.md
* fix(gateway): harden embeddings HTTP provider selection
* fix(gateway): validate compat model overrides
* fix(gateway): harden embeddings and response continuity
* fix(gateway): restore compat model id handling
@@ -70,11 +70,35 @@ Default mode is `gateway.reload.mode="hybrid"`.

- One always-on process for routing, control plane, and channel connections.
- Single multiplexed port for:
  - WebSocket control/RPC
  - HTTP APIs, OpenAI-compatible (`/v1/models`, `/v1/embeddings`, `/v1/chat/completions`, `/v1/responses`, `/tools/invoke`)
  - Control UI and hooks
- Default bind mode: `loopback`.
- Auth is required by default (`gateway.auth.token` / `gateway.auth.password`, or `OPENCLAW_GATEWAY_TOKEN` / `OPENCLAW_GATEWAY_PASSWORD`).

## OpenAI-compatible endpoints

OpenClaw’s highest-leverage compatibility surface is now:

- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/chat/completions`
- `POST /v1/responses`

Why this set matters:

- Most Open WebUI, LobeChat, and LibreChat integrations probe `/v1/models` first.
- Many RAG and memory pipelines expect `/v1/embeddings`.
- Agent-native clients increasingly prefer `/v1/responses`.

Planning note:

- Keep `/v1/models` as a flat `provider/model` list for client compatibility.
- Treat agent and sub-agent selection as separate OpenClaw routing concerns, not pseudo-model entries.
- When you need agent-scoped filtering, pass `x-openclaw-agent-id` on both model-list and request calls.
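As a sketch of that last pattern (the agent id `main`, loopback port, and `YOUR_TOKEN` placeholder mirror the examples later in these docs):

```shell
# Assumed placeholders: gateway on 127.0.0.1:18789, token YOUR_TOKEN, agent id "main".
AGENT_ID=main

# 1) List only the models that agent is allowed to use.
curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H "x-openclaw-agent-id: $AGENT_ID"

# 2) Send the chat request with the same header so routing stays consistent
#    with the list the client showed.
curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H "x-openclaw-agent-id: $AGENT_ID" \
  -H 'Content-Type: application/json' \
  -d '{"model": "openai/gpt-5.4", "messages": [{"role": "user", "content": "hi"}]}'
```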

All of these run on the main Gateway port and use the same trusted operator auth boundary as the rest of the Gateway HTTP API.

### Port and bind precedence

| Setting | Resolution order |
@@ -14,6 +14,13 @@ This endpoint is **disabled by default**. Enable it in config first.

- `POST /v1/chat/completions`
- Same port as the Gateway (WS + HTTP multiplex): `http://<gateway-host>:<port>/v1/chat/completions`

When the Gateway’s OpenAI-compatible HTTP surface is enabled, it also serves:

- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/responses`

Under the hood, requests are executed as a normal Gateway agent run (the same codepath as `openclaw agent`), so routing, permissions, and configuration match the rest of your Gateway.

## Authentication

@@ -55,6 +62,12 @@ Or target a specific OpenClaw agent by header:

Advanced:

- `x-openclaw-session-key: <sessionKey>` to fully control session routing.
- `x-openclaw-message-channel: <channel>` to set the synthetic ingress channel context for channel-aware prompts and policies.
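For instance (a sketch: the session key and channel values are illustrative, host and token follow the examples below):

```shell
# Illustrative values: explicit session key "support-thread-42", channel "webchat".
curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'x-openclaw-session-key: support-thread-42' \
  -H 'x-openclaw-message-channel: webchat' \
  -H 'Content-Type: application/json' \
  -d '{"model": "openai/gpt-5.4", "messages": [{"role": "user", "content": "continue where we left off"}]}'
```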

For `/v1/models` and `/v1/embeddings`, `x-openclaw-agent-id` is still useful:

- `/v1/models` uses it for agent-scoped model filtering where relevant.
- `/v1/embeddings` uses it to resolve agent-specific memory-search embedding config.

## Enabling the endpoint

@@ -94,6 +107,51 @@ By default the endpoint is **stateless per request** (a new session key is generated each call).

If the request includes an OpenAI `user` string, the Gateway derives a stable session key from it, so repeated calls can share an agent session.
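A sketch of that continuity (the `user` value `ada-123` is illustrative; host and token follow the examples below):

```shell
# Both calls pass the same OpenAI "user" string, so the Gateway derives the
# same session key and the second call can see the first call's context.
for prompt in "my name is Ada" "what is my name?"; do
  curl -sS http://127.0.0.1:18789/v1/chat/completions \
    -H 'Authorization: Bearer YOUR_TOKEN' \
    -H 'Content-Type: application/json' \
    -d "{\"model\": \"openai/gpt-5.4\", \"user\": \"ada-123\", \"messages\": [{\"role\": \"user\", \"content\": \"$prompt\"}]}"
done
```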

## Why this surface matters

This is the highest-leverage compatibility set for self-hosted frontends and tooling:

- Most Open WebUI, LobeChat, and LibreChat setups expect `/v1/models`.
- Many RAG systems expect `/v1/embeddings`.
- Existing OpenAI chat clients can usually start with `/v1/chat/completions`.
- Agent-native clients increasingly prefer `/v1/responses`.

## Model list and agent routing

<AccordionGroup>
<Accordion title="What does `/v1/models` return?">
A flat OpenAI-style model list.

The returned ids are canonical `provider/model` values such as `openai/gpt-5.4`. These ids are meant to be passed back directly as the OpenAI `model` field.
</Accordion>
<Accordion title="Does `/v1/models` list agents or sub-agents?">
No.

`/v1/models` lists model choices, not execution topology. Agents and sub-agents are OpenClaw routing concerns, so they are selected separately with `x-openclaw-agent-id` or the `openclaw:<agentId>` / `agent:<agentId>` model aliases on chat and responses requests.
</Accordion>
<Accordion title="How does agent-scoped filtering work?">
Send `x-openclaw-agent-id: <agentId>` when you want the model list for a specific agent.

OpenClaw filters the model list against that agent's allowed models and fallbacks when configured. If no allowlist is configured, the endpoint returns the full catalog.
</Accordion>
<Accordion title="How do sub-agents pick a model?">
Sub-agent model choice is resolved at spawn time from OpenClaw agent config.

That means sub-agent model selection does not create extra `/v1/models` entries. Keep the compatibility list flat, and treat agent and sub-agent selection as separate OpenClaw-native routing behavior.
</Accordion>
<Accordion title="What should clients do in practice?">
Use `/v1/models` to populate the normal model picker.

If your client or integration also knows which OpenClaw agent it wants, set `x-openclaw-agent-id` when listing models and when sending chat, responses, or embeddings requests. That keeps the picker aligned with the target agent's allowed model set.
</Accordion>
</AccordionGroup>
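To make the alias form concrete, a request can select an agent through the `model` field instead of the header (a sketch; the agent id `main` is a placeholder, host and token follow the examples below):

```shell
# "agent:main" (or "openclaw:main") routes to that agent, which then uses
# its own configured model rather than a catalog id from /v1/models.
curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"model": "agent:main", "messages": [{"role": "user", "content": "hi"}]}'
```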

## Streaming (SSE)

Set `stream: true` to receive Server-Sent Events (SSE):
@@ -130,3 +188,36 @@

```bash
curl -N http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openai/gpt-5.4",
    "stream": true,
    "messages": [{"role":"user","content":"hi"}]
  }'
```

List models:

```bash
curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN'
```

Fetch one model (URL-encode the `/` in the id):

```bash
curl -sS http://127.0.0.1:18789/v1/models/openai%2Fgpt-5.4 \
  -H 'Authorization: Bearer YOUR_TOKEN'
```

Create embeddings:

```bash
curl -sS http://127.0.0.1:18789/v1/embeddings \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -H 'x-openclaw-agent-id: main' \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": ["alpha", "beta"]
  }'
```

Notes:

- `/v1/models` returns canonical ids in `provider/model` form so they can be passed back directly as OpenAI `model` values.
- `/v1/models` stays flat on purpose: it does not enumerate agents or sub-agents as pseudo-model ids.
- `/v1/embeddings` supports `input` as a string or an array of strings.
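For example, a client populating a model picker can flatten the list into ids locally (the sample response body below is illustrative, in the standard OpenAI list shape; a real client would capture it from the `curl` above):

```shell
# Illustrative /v1/models response body.
cat > /tmp/models.json <<'EOF'
{"object": "list", "data": [{"id": "openai/gpt-5.4", "object": "model"},
                            {"id": "openai/text-embedding-3-small", "object": "model"}]}
EOF

# Print one canonical provider/model id per line.
python3 -c 'import json; [print(m["id"]) for m in json.load(open("/tmp/models.json"))["data"]]'
```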
@@ -26,9 +26,19 @@ Operational behavior matches [OpenAI Chat Completions](/gateway/openai-http-api):

- treat the endpoint as full operator access for the gateway instance
- select agents with `model: "openclaw:<agentId>"`, `model: "agent:<agentId>"`, or `x-openclaw-agent-id`
- use `x-openclaw-session-key` for explicit session routing
- use `x-openclaw-message-channel` when you want a non-default synthetic ingress channel context

Enable or disable this endpoint with `gateway.http.endpoints.responses.enabled`.

The same compatibility surface also includes:

- `GET /v1/models`
- `GET /v1/models/{id}`
- `POST /v1/embeddings`
- `POST /v1/chat/completions`

For the canonical explanation of how model listing, agent routing, and sub-agent model selection fit together, see [OpenAI Chat Completions](/gateway/openai-http-api#model-list-and-agent-routing).

## Session behavior

By default the endpoint is **stateless per request** (a new session key is generated each call).

@@ -54,9 +64,12 @@ Accepted but **currently ignored**:

- `reasoning`
- `metadata`
- `store`
- `truncation`

Supported:

- `previous_response_id`: OpenClaw reuses the earlier response session when the request stays within the same agent/user/requested-session scope.
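A sketch of chaining two `/v1/responses` calls (the `resp_abc123` id is illustrative; a real client would take the `id` from the first response body, and host and token follow the chat examples):

```shell
# First turn.
curl -sS http://127.0.0.1:18789/v1/responses \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"model": "openai/gpt-5.4", "input": "my name is Ada"}'

# Second turn: pass the id returned by the first call so OpenClaw reuses
# the earlier response session (same agent/user/requested-session scope).
curl -sS http://127.0.0.1:18789/v1/responses \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"model": "openai/gpt-5.4", "previous_response_id": "resp_abc123", "input": "what is my name?"}'
```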

## Items (input)

### `message`