---
title: Adding New Image Models
description: Learn how to add new image models and make them compatible with the OpenAI request format.
tags:
  - Image Models
  - AI Image Generation
  - OpenAI Compatible
---

# Adding New Image Models

> To learn more about the design of the AI image generation modality, see the [AI Image Generation Modality Design Discussion](https://github.com/lobehub/lobehub/discussions/7442).

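## Parameter Standardization

All image generation models must use the standard parameters defined in `packages/model-bank/src/standard-parameters/index.ts`. This keeps parameters consistent across Providers and gives users a more uniform experience.

**Supported standard parameters**:

- `prompt` (required): the prompt used to generate the image
- `aspectRatio`: aspect ratio (e.g. "16:9", "1:1")
- `width` / `height`: image width and height
- `size`: preset size (e.g. "1024x1024")
- `seed`: random seed
- `steps`: number of generation steps
- `cfg`: guidance scale
- See the source file for other parameters
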
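To make the parameter list concrete, here is a minimal sketch of what a standardized parameter object might look like. The `StandardImageParams` interface and the `resolveDimensions` helper are hypothetical names used only for illustration; the authoritative definitions are the ones in `packages/model-bank/src/standard-parameters/index.ts`.

```typescript
// Hypothetical shape that mirrors the parameter names listed above.
// The real definitions live in the standard-parameters source file.
interface StandardImageParams {
  prompt: string;
  aspectRatio?: string;
  width?: number;
  height?: number;
  size?: string;
  seed?: number;
  steps?: number;
  cfg?: number;
}

// Illustrative helper: derive pixel dimensions from explicit width/height,
// falling back to a preset size string such as "1024x1024".
function resolveDimensions(
  params: StandardImageParams,
): { width: number; height: number } | undefined {
  if (params.width !== undefined && params.height !== undefined) {
    return { width: params.width, height: params.height };
  }
  const match = params.size ? /^(\d+)x(\d+)$/.exec(params.size) : null;
  if (!match) return undefined;
  return { width: Number(match[1]), height: Number(match[2]) };
}

const params: StandardImageParams = {
  prompt: 'a red fox in the snow',
  seed: 42,
  size: '1024x1024',
};
const dimensions = resolveDimensions(params);
```

Working against one shared shape like this is what lets a Provider-specific implementation translate parameters in a single place instead of every caller knowing each Provider's naming.
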
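## Models Compatible with the OpenAI Request Format

These are models that can be called with the openai SDK and whose request parameters and return values match those of the dall-e and gpt-image-x series.

Take Zhipu's CogView-4 as an example: it is compatible with the OpenAI request format. You only need to add the model configuration to the corresponding ai models file, `packages/model-bank/src/aiModels/zhipu.ts`, for example:
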
```ts
const zhipuImageModels: AIImageModelCard[] = [
  // Add the model configuration
  // https://bigmodel.cn/dev/howuse/image-generation-model/cogview-4
  {
    description:
      'CogView-4 is the first open-source text-to-image model from Zhipu that can generate Chinese characters. It improves across the board in semantic understanding, image generation quality, and Chinese and English text rendering, supports bilingual Chinese and English input of any length, and can generate images at any resolution within the supported range.',
    displayName: 'CogView-4',
    enabled: true,
    id: 'cogview-4',
    parameters: {
      prompt: {
        default: '',
      },
      size: {
        default: '1024x1024',
        enum: ['1024x1024', '768x1344', '864x1152', '1344x768', '1152x864', '1440x720', '720x1440'],
      },
    },
    releasedAt: '2025-03-04',
    type: 'image',
  },
];
```

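As a rough illustration of why a configuration entry is enough for such models, the sketch below shows how the `parameters` block of a model card could be turned into an OpenAI-style image request body. `ImageModelCardSketch`, `OpenAIImageRequestSketch`, and `buildOpenAIImageRequest` are hypothetical names and not part of the codebase; only the `model`, `prompt`, and `size` request fields are taken from the OpenAI Images API.

```typescript
// Hypothetical, trimmed-down view of a model card's parameter schema.
interface ParamSpec<T> {
  default: T;
  enum?: T[];
}

interface ImageModelCardSketch {
  id: string;
  parameters: { prompt: ParamSpec<string>; size?: ParamSpec<string> };
}

// Subset of the OpenAI image generation request body used in this sketch.
interface OpenAIImageRequestSketch {
  model: string;
  prompt: string;
  size?: string;
}

// Fill in defaults from the card and reject sizes the card does not list.
function buildOpenAIImageRequest(
  card: ImageModelCardSketch,
  userParams: { prompt: string; size?: string },
): OpenAIImageRequestSketch {
  const sizeSpec = card.parameters.size;
  const size = userParams.size ?? sizeSpec?.default;
  if (size !== undefined && sizeSpec?.enum && !sizeSpec.enum.includes(size)) {
    throw new Error(`Unsupported size "${size}" for model ${card.id}`);
  }
  const body: OpenAIImageRequestSketch = { model: card.id, prompt: userParams.prompt };
  if (size !== undefined) body.size = size;
  return body;
}

const cogview4: ImageModelCardSketch = {
  id: 'cogview-4',
  parameters: {
    prompt: { default: '' },
    size: { default: '1024x1024', enum: ['1024x1024', '768x1344', '1344x768'] },
  },
};

const request = buildOpenAIImageRequest(cogview4, { prompt: 'a lighthouse at dusk' });
```

Because the request and response already follow the OpenAI format, nothing Provider-specific has to be written for this path; the `enum` list simply constrains which values a user can pick.
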
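## Models Not Compatible with the OpenAI Request Format

Image generation models that are not compatible with the OpenAI format require a custom `createImage` method. There are two main ways to implement it:
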
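### Approach 1: Use the OpenAI Compatible Factory

Most Providers use `openaiCompatibleFactory` for OpenAI compatibility. You can pass a custom `createImage` function to the factory (see [PR #8534](https://github.com/lobehub/lobehub/pull/8534)).

**Implementation steps**:

1. **Read the Provider's official documentation and the standard parameter definitions**
   - Review the Provider's image generation API documentation to understand the request and response formats
   - Read `packages/model-bank/src/standard-parameters/index.ts` to understand the supported parameters
   - Add the image model configuration to the corresponding ai models file

2. **Implement a custom createImage method**
   - Create a standalone image generation function that accepts the standard image generation parameters
   - Convert the standard parameters into the Provider-specific format
   - Call the Provider's image generation API
   - Return a response in the unified format (imageUrl plus optional width and height)

3. **Add tests**
   - Write unit tests that cover the success path
   - Test the various error cases and boundary conditions
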
**Code example**:

```ts
// packages/model-runtime/src/providers/<provider-name>/createImage.ts
export const createProviderImage = async (
  payload: ImageGenerationPayload,
  options: any,
): Promise<ImageGenerationResponse> => {
  const { model, prompt, ...params } = payload;

  // Call the Provider's native API
  const result = await callProviderAPI({
    model,
    prompt,
    // Convert the parameter format
    custom_param: params.width,
    // ...
  });

  // Return the unified format
  return {
    created: Date.now(),
    data: [{ url: result.imageUrl }],
  };
};
```

```ts
// packages/model-runtime/src/providers/<provider-name>/index.ts
export const LobeProviderAI = openaiCompatibleFactory({
  constructorOptions: {
    // ... other options
  },
  createImage: createProviderImage, // Pass in the custom implementation
  provider: ModelProvider.ProviderName,
});
```

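The implementation steps for this approach (convert the standard parameters, call the Provider, return a unified response) can be sketched end to end as follows. Everything Provider-specific here is an assumption made for illustration: the endpoint URL, the `image_size` and `num_inference_steps` field names, and the `images[0].url` response path are hypothetical and should be replaced with whatever the Provider's documentation specifies. The `{ imageUrl, width, height }` return shape follows the description in step 2. The HTTP function is injected so the logic can be unit tested without network access.

```typescript
interface StandardParams {
  prompt: string;
  width?: number;
  height?: number;
  seed?: number;
  steps?: number;
}

interface ImageResult {
  imageUrl: string;
  width?: number;
  height?: number;
}

// Minimal fetch-like signature so a stub can be injected in tests.
type HttpPost = (
  url: string,
  body: unknown,
) => Promise<{ ok: boolean; status: number; json: () => Promise<any> }>;

// Map standard parameter names to the (hypothetical) Provider format.
function toProviderBody(model: string, params: StandardParams): Record<string, unknown> {
  const body: Record<string, unknown> = { model, prompt: params.prompt };
  if (params.width !== undefined && params.height !== undefined) {
    body.image_size = `${params.width}x${params.height}`;
  }
  if (params.steps !== undefined) body.num_inference_steps = params.steps;
  if (params.seed !== undefined) body.seed = params.seed;
  return body;
}

// Call the Provider and normalize its response into the unified shape.
async function createImageSketch(
  model: string,
  params: StandardParams,
  post: HttpPost,
): Promise<ImageResult> {
  const response = await post('https://api.example.com/v1/images', toProviderBody(model, params));
  if (!response.ok) throw new Error(`Provider returned HTTP ${response.status}`);

  const data = await response.json();
  const imageUrl = data?.images?.[0]?.url;
  if (typeof imageUrl !== 'string') {
    throw new Error('Provider response did not contain an image URL');
  }
  return { imageUrl, width: params.width, height: params.height };
}
```

Keeping the parameter mapping in its own pure function makes step 3 easier: the mapping can be tested with plain assertions, and the network call can be tested by passing a stub in place of `post`.
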
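### Approach 2: Implement Directly in the Provider Class

If your Provider has its own class implementation, you can add the `createImage` method directly to that class (see [PR #8503](https://github.com/lobehub/lobehub/pull/8503)).

**Implementation steps**:

1. **Read the Provider's official documentation and the standard parameter definitions**
   - Review the Provider's image generation API documentation
   - Read `packages/model-bank/src/standard-parameters/index.ts`
   - Add the image model configuration to the corresponding ai models file

2. **Implement the createImage method in the Provider class**
   - Add the `createImage` method directly to the class
   - Handle parameter conversion and the API call
   - Return a response in the unified format

3. **Add tests**
   - Write complete test cases for the new method
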
**Code example**:

```ts
// packages/model-runtime/src/providers/<provider-name>/index.ts
export class LobeProviderAI {
  async createImage(
    payload: ImageGenerationPayload,
    options?: ChatStreamCallbacks,
  ): Promise<ImageGenerationResponse> {
    const { model, prompt, ...params } = payload;

    // Call the native API and handle the response
    const result = await this.client.generateImage({
      model,
      prompt,
      // Parameter conversion
    });

    return {
      created: Date.now(),
      data: [{ url: result.url }],
    };
  }
}
```

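Because the method delegates to `this.client`, one way to cover it in tests is to construct the class with a stub client. The sketch below assumes the client can be passed in through the constructor; `ImageClient`, `ProviderSketch`, and the stub are hypothetical stand-ins, not the actual Provider class.

```typescript
// Hypothetical client interface matching the call used in the example above.
interface ImageClient {
  generateImage(req: { model: string; prompt: string }): Promise<{ url: string }>;
}

class ProviderSketch {
  private readonly client: ImageClient;

  constructor(client: ImageClient) {
    this.client = client;
  }

  async createImage(payload: { model: string; prompt: string }) {
    const result = await this.client.generateImage({
      model: payload.model,
      prompt: payload.prompt,
    });
    return { created: Date.now(), data: [{ url: result.url }] };
  }
}

// Stub client that records the calls it receives instead of hitting a real API.
const calls: Array<{ model: string; prompt: string }> = [];
const stubClient: ImageClient = {
  generateImage: async (req) => {
    calls.push(req);
    return { url: 'https://example.com/generated.png' };
  },
};

const provider = new ProviderSketch(stubClient);
const pending = provider.createImage({ model: 'demo-model', prompt: 'a paper boat' });
```

A test can then await `pending` and assert both on the returned URL and on what was recorded in `calls`, which checks the parameter conversion as well as the response mapping.
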
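### Important Notes

- **Testing requirements**: add complete unit tests for custom implementations, covering the success path and the various error cases
- **Error handling**: wrap errors consistently with `AgentRuntimeError` so that error messages stay uniform
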
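The error handling note can be illustrated with a small wrapper that normalizes every failure into one shape. This is only a sketch of the pattern: `RuntimeErrorSketch` is a local stand-in defined here so the example is self-contained, and the `'ProviderBizError'` string is an illustrative label. It does not reproduce the signature of `AgentRuntimeError`, which is what a real implementation should use.

```typescript
// Local stand-in for illustration only; use AgentRuntimeError in the real codebase.
class RuntimeErrorSketch extends Error {
  readonly errorType: string;
  readonly provider: string;

  constructor(errorType: string, provider: string, cause: unknown) {
    super(cause instanceof Error ? cause.message : String(cause));
    this.name = 'RuntimeErrorSketch';
    this.errorType = errorType;
    this.provider = provider;
  }
}

// Wrap any Provider call so that every failure surfaces with the same shape.
async function withUnifiedErrors<T>(provider: string, call: () => Promise<T>): Promise<T> {
  try {
    return await call();
  } catch (error) {
    // Errors that are already normalized pass through unchanged.
    if (error instanceof RuntimeErrorSketch) throw error;
    throw new RuntimeErrorSketch('ProviderBizError', provider, error);
  }
}
```

With a wrapper like this around the Provider call inside `createImage`, unit tests for the error cases only need to assert on the error type and provider name, regardless of what the underlying SDK or HTTP layer throws.
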