lobehub/docs/development/basic/add-new-image-model.zh-CN.mdx
YuTengjing 463d6c8762 📝 docs: improve development guides to reflect current architecture (#12174)
2026-02-07 22:29:14 +08:00

---
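title: Adding a New Image Model
description: Learn how to add a new image model and make it compatible with the OpenAI request format.
tags:
- Image Models
- AI Image Generation
- OpenAI Compatible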
---
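# Adding a New Image Model

> To learn more about the design of the AI image generation modality, see the [AI Image Generation Modality Design Discussion](https://github.com/lobehub/lobehub/discussions/7442).

## Parameter Standardization

All image generation models must use the standard parameters defined in `packages/model-bank/src/standard-parameters/index.ts`. This keeps parameters consistent across Providers and gives users a more uniform experience.

**Supported standard parameters**:

- `prompt` (required): the prompt used to generate the image
- `aspectRatio`: aspect ratio (e.g. "16:9", "1:1")
- `width` / `height`: image width and height
- `size`: preset size (e.g. "1024x1024")
- `seed`: random seed
- `steps`: number of generation steps
- `cfg`: guidance scale
- For other parameters, see the source file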
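
As a rough illustration of the list above, the sketch below models those parameters as a TypeScript interface and flags keys that are not in the list. The interface name `StandardImageParams`, the key list `LISTED_KEYS`, and the helper `findNonStandardKeys` are assumptions made for this example. They are not the actual definitions in `packages/model-bank/src/standard-parameters/index.ts`, which also contains parameters not shown here.

```ts
// Hypothetical shape for illustration only. The real definitions live in
// packages/model-bank/src/standard-parameters/index.ts and include more parameters.
interface StandardImageParams {
  prompt: string; // required
  aspectRatio?: string; // e.g. "16:9"
  width?: number;
  height?: number;
  size?: string; // e.g. "1024x1024"
  seed?: number;
  steps?: number;
  cfg?: number;
}

// Only the parameters listed in this guide (an assumption, not the full set).
const LISTED_KEYS: readonly string[] = [
  'prompt',
  'aspectRatio',
  'width',
  'height',
  'size',
  'seed',
  'steps',
  'cfg',
];

// Returns the keys of `input` that are not among the listed standard parameters.
function findNonStandardKeys(input: Record<string, unknown>): string[] {
  return Object.keys(input).filter((key) => !LISTED_KEYS.includes(key));
}

const request: StandardImageParams = { prompt: 'a red fox', size: '1024x1024', seed: 42 };
console.log(findNonStandardKeys({ ...request })); // prints []
console.log(findNonStandardKeys({ prompt: 'a red fox', guidance_scale: 7 })); // prints [ 'guidance_scale' ]
```

A check like this can help spot Provider-specific names (such as `guidance_scale`) that should be mapped to a standard parameter rather than exposed directly.
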
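## Models Compatible with the OpenAI Request Format

These are models that can be called with the openai SDK and whose request parameters and return values match those of dall-e and the gpt-image-x series.

Take Zhipu's CogView-4 as an example: it is compatible with the OpenAI request format. You only need to add the model configuration to the corresponding ai models file, `packages/model-bank/src/aiModels/zhipu.ts`, for example:
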
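```ts
const zhipuImageModels: AIImageModelCard[] = [
  // Add the model configuration
  // https://bigmodel.cn/dev/howuse/image-generation-model/cogview-4
  {
    description:
      'CogView-4 is the first open-source text-to-image model from Zhipu that supports generating Chinese characters. It improves semantic understanding, image generation quality, and Chinese and English text rendering across the board, accepts bilingual Chinese and English input of any length, and can generate images at any resolution within a given range.',
    displayName: 'CogView-4',
    enabled: true,
    id: 'cogview-4',
    parameters: {
      prompt: {
        default: '',
      },
      size: {
        default: '1024x1024',
        enum: ['1024x1024', '768x1344', '864x1152', '1344x768', '1152x864', '1440x720', '720x1440'],
      },
    },
    releasedAt: '2025-03-04',
    type: 'image',
  },
];
```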
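
The `default` and `enum` fields in the configuration above describe which sizes the model accepts. As a minimal sketch of how a caller might use them, the hypothetical helper `resolveSize` below (not part of the codebase) falls back to the default when the requested size is not in the enum and splits the result into width and height.

```ts
// Mirrors the `size` entry of the CogView-4 configuration shown above.
interface SizeParamConfig {
  default: string;
  enum: string[];
}

const cogview4Size: SizeParamConfig = {
  default: '1024x1024',
  enum: ['1024x1024', '768x1344', '864x1152', '1344x768', '1152x864', '1440x720', '720x1440'],
};

// Hypothetical helper: picks a supported size and parses it into numbers.
function resolveSize(
  config: SizeParamConfig,
  requested?: string,
): { size: string; width: number; height: number } {
  const size =
    requested !== undefined && config.enum.includes(requested) ? requested : config.default;
  const parts = size.split('x').map(Number);
  return { size, width: parts[0] ?? 0, height: parts[1] ?? 0 };
}

console.log(resolveSize(cogview4Size, '768x1344')); // { size: '768x1344', width: 768, height: 1344 }
console.log(resolveSize(cogview4Size, '999x999')); // { size: '1024x1024', width: 1024, height: 1024 }
```
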
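## Models Not Compatible with the OpenAI Request Format

For image generation models that are not compatible with the OpenAI format, you need to implement a custom `createImage` method. There are two main approaches:

### Approach 1: Use the OpenAI Compatible Factory

Most Providers use `openaiCompatibleFactory` for OpenAI compatibility. You can pass a custom `createImage` function to it (see [PR #8534](https://github.com/lobehub/lobehub/pull/8534)).

**Implementation steps**:

1. **Read the Provider's official documentation and the standard parameter definitions**
   - Review the Provider's image generation API documentation to understand the request and response formats
   - Read `packages/model-bank/src/standard-parameters/index.ts` to learn which parameters are supported
   - Add the image model configuration to the corresponding ai models file
2. **Implement a custom createImage method**
   - Create a standalone image generation function that accepts the standard image generation parameters
   - Convert the standard parameters to the Provider-specific format
   - Call the Provider's image generation endpoint
   - Return a response in the unified format (imageUrl plus optional width and height)
3. **Add tests**
   - Write unit tests that cover the success scenarios
   - Test the various error cases and boundary conditions

**Code example**:
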
```ts
// packages/model-runtime/src/providers/<provider-name>/createImage.ts
export const createProviderImage = async (
  payload: ImageGenerationPayload,
  options: any,
): Promise<ImageGenerationResponse> => {
  const { model, prompt, ...params } = payload;

  // Call the Provider's native API (`callProviderAPI` is a placeholder)
  const result = await callProviderAPI({
    model,
    prompt,
    // Convert the parameter format
    custom_param: params.width,
    // ...
  });

  // Return the unified format
  return {
    created: Date.now(),
    data: [{ url: result.imageUrl }],
  };
};
```
```ts
// packages/model-runtime/src/providers/<provider-name>/index.ts
export const LobeProviderAI = openaiCompatibleFactory({
  constructorOptions: {
    // ... other configuration
  },
  createImage: createProviderImage, // Pass in the custom implementation
  provider: ModelProvider.ProviderName,
});
```
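
The "convert the parameter format" step is where most Provider-specific work happens. The sketch below shows one way to write that mapping as a pure function, which keeps it easy to unit test. The request shape `HypotheticalProviderRequest` and its field names (`text`, `image_size`, `random_seed`, `guidance_scale`) are invented for illustration and do not correspond to any real Provider API.

```ts
// Subset of the standard parameters described earlier in this guide.
interface StandardParams {
  prompt: string;
  width?: number;
  height?: number;
  seed?: number;
  cfg?: number;
}

// Invented request body for an imaginary Provider.
interface HypotheticalProviderRequest {
  model: string;
  text: string;
  image_size?: string;
  random_seed?: number;
  guidance_scale?: number;
}

// Maps standard parameters to the Provider-specific body, skipping unset values.
function toProviderRequest(model: string, params: StandardParams): HypotheticalProviderRequest {
  const body: HypotheticalProviderRequest = { model, text: params.prompt };
  if (params.width !== undefined && params.height !== undefined) {
    body.image_size = `${params.width}x${params.height}`;
  }
  if (params.seed !== undefined) body.random_seed = params.seed;
  if (params.cfg !== undefined) body.guidance_scale = params.cfg;
  return body;
}

console.log(toProviderRequest('example-model', { prompt: 'a lighthouse', width: 1024, height: 768 }));
// { model: 'example-model', text: 'a lighthouse', image_size: '1024x768' }
```

Keeping the mapping separate from the network call means the custom `createImage` function only has to combine the two.
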
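### Approach 2: Implement Directly in the Provider Class

If your Provider has its own class implementation, you can add the `createImage` method directly to the class (see [PR #8503](https://github.com/lobehub/lobehub/pull/8503)).

**Implementation steps**:

1. **Read the Provider's official documentation and the standard parameter definitions**
   - Review the Provider's image generation API documentation
   - Read `packages/model-bank/src/standard-parameters/index.ts`
   - Add the image model configuration to the corresponding ai models file
2. **Implement the createImage method in the Provider class**
   - Add the `createImage` method directly to the class
   - Handle parameter conversion and the API call
   - Return a response in the unified format
3. **Add tests**
   - Write complete test cases for the new method

**Code example**:
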
```ts
// packages/model-runtime/src/providers/<provider-name>/index.ts
export class LobeProviderAI {
  async createImage(
    payload: ImageGenerationPayload,
    options?: ChatStreamCallbacks,
  ): Promise<ImageGenerationResponse> {
    const { model, prompt, ...params } = payload;

    // Call the native API and handle the response
    const result = await this.client.generateImage({
      model,
      prompt,
      // Parameter conversion
    });

    return {
      created: Date.now(),
      data: [{ url: result.url }],
    };
  }
}
```
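
Because the class reaches its Provider through `this.client`, a test can substitute a stub for the real SDK. The following is a minimal sketch of that idea, not the project's actual test setup: `ImageClient`, `ExampleProviderAI`, `createStubClient`, and `checkCreateImage` are hypothetical names, and the response shape simply mirrors the example above.

```ts
// Hypothetical client interface; a real Provider SDK will look different.
interface ImageRequest {
  model: string;
  prompt: string;
}

interface ImageClient {
  generateImage(input: ImageRequest): Promise<{ url: string }>;
}

interface ImageResult {
  created: number;
  data: { url: string }[];
}

class ExampleProviderAI {
  private readonly client: ImageClient;

  constructor(client: ImageClient) {
    this.client = client;
  }

  async createImage(payload: ImageRequest): Promise<ImageResult> {
    const result = await this.client.generateImage(payload);
    return { created: Date.now(), data: [{ url: result.url }] };
  }
}

// Builds a stub that records calls instead of hitting the network.
function createStubClient(): { client: ImageClient; calls: ImageRequest[] } {
  const calls: ImageRequest[] = [];
  const client: ImageClient = {
    generateImage: async (input) => {
      calls.push(input);
      return { url: 'https://example.com/image.png' };
    },
  };
  return { client, calls };
}

// Runs one success-path check and resolves to the returned URL.
async function checkCreateImage(): Promise<string> {
  const { client, calls } = createStubClient();
  const provider = new ExampleProviderAI(client);
  const response = await provider.createImage({ model: 'example-model', prompt: 'a lighthouse' });
  if (calls.length !== 1) throw new Error('client should be called exactly once');
  return response.data[0]?.url ?? '';
}
```

The same stub can be made to reject in order to exercise the error paths called for in the testing step.
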
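### Important Notes

- **Testing requirements**: Add complete unit tests for custom implementations, covering both the success scenarios and the various error cases
- **Error handling**: Use `AgentRuntimeError` consistently to wrap errors so that error messages stay uniform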
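
The error handling note can be illustrated with a small wrapper that turns any thrown value into one consistent shape. This sketch deliberately does not use the project's `AgentRuntimeError`, whose exact API should be checked in the source. `WrappedImageError`, the error type strings, and `wrapImageError` are stand-ins invented for this example.

```ts
// Stand-in error types; the project's real error types may differ.
type HypotheticalErrorType = 'InvalidAPIKey' | 'ProviderError';

interface WrappedImageError {
  errorType: HypotheticalErrorType;
  provider: string;
  message: string;
}

// Normalizes an unknown thrown value into one shape.
// HTTP 401 and 403 are treated as API key problems (an assumption for this sketch).
function wrapImageError(provider: string, error: unknown): WrappedImageError {
  const status =
    typeof error === 'object' && error !== null && 'status' in error
      ? Number((error as { status: unknown }).status)
      : undefined;
  const message = error instanceof Error ? error.message : String(error);
  const errorType: HypotheticalErrorType =
    status === 401 || status === 403 ? 'InvalidAPIKey' : 'ProviderError';
  return { errorType, provider, message };
}

const unauthorized = Object.assign(new Error('Unauthorized'), { status: 401 });
console.log(wrapImageError('example', unauthorized).errorType); // prints InvalidAPIKey
console.log(wrapImageError('example', 'network timeout').errorType); // prints ProviderError
```

Routing every failure through a single wrapper is what keeps error messages uniform across Providers, whichever error class ends up carrying them.
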