---
title: Adding New Image Models
description: >-
  Explore how to add new image models for AI generation with standard
  parameters.
tags:
  - AI Image Generation
  - Image Models
  - OpenAI Compatibility
---

# Adding New Image Models

> Learn more about the AI image generation modal design in the [AI Image Generation Modal Design Discussion](https://github.com/lobehub/lobehub/discussions/7442).

## Parameter Standardization

All image generation models must use the standard parameters defined in `packages/model-bank/src/standard-parameters/index.ts`. This keeps parameters consistent across Providers and gives users a unified experience.

**Supported Standard Parameters**:

- `prompt` (required): Text prompt for image generation
- `aspectRatio`: Aspect ratio (e.g., "16:9", "1:1")
- `width` / `height`: Image dimensions
- `size`: Preset dimensions (e.g., "1024x1024")
- `seed`: Random seed
- `steps`: Generation steps
- `cfg`: Guidance scale

For other parameters, see the source file.

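
To make the list above concrete, here is a minimal sketch of how the listed parameters could be typed and checked. The interface and helper names (`ListedImageParams`, `findUnlistedParams`, `hasPrompt`) are hypothetical and cover only the parameters named in this guide; the authoritative definitions are in `packages/model-bank/src/standard-parameters/index.ts` and may differ.

```ts
// Hypothetical shape covering only the parameters listed in this guide.
// The real definitions live in packages/model-bank/src/standard-parameters/index.ts.
interface ListedImageParams {
  prompt: string; // required
  aspectRatio?: string; // e.g. "16:9"
  width?: number;
  height?: number;
  size?: string; // e.g. "1024x1024"
  seed?: number;
  steps?: number;
  cfg?: number;
}

const LISTED_KEYS: readonly string[] = [
  'prompt',
  'aspectRatio',
  'width',
  'height',
  'size',
  'seed',
  'steps',
  'cfg',
];

// Returns the keys of `input` that are not in the list above.
function findUnlistedParams(input: Record<string, unknown>): string[] {
  return Object.keys(input).filter((key) => !LISTED_KEYS.includes(key));
}

// `prompt` is the only required parameter, so check it explicitly.
function hasPrompt(input: Record<string, unknown>): boolean {
  return typeof input.prompt === 'string' && input.prompt.length > 0;
}

const example: ListedImageParams = { prompt: 'a lighthouse at dusk', aspectRatio: '16:9' };
```

A check like `findUnlistedParams` can flag Provider-specific names (for example a raw `guidance_scale`) that should be mapped to a standard parameter such as `cfg` instead.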
## OpenAI Compatible Models

These models can be called through the OpenAI SDK; their request parameters and return values match those of the DALL-E and GPT-Image-X series.

Take Zhipu's CogView-4, an OpenAI-compatible model, as an example. To add it, add the model configuration to the corresponding AI models file, `packages/model-bank/src/aiModels/zhipu.ts`:

```ts
const zhipuImageModels: AIImageModelCard[] = [
  // Add model configuration
  // https://bigmodel.cn/dev/howuse/image-generation-model/cogview-4
  {
    description:
      'CogView-4 is the first open-source text-to-image model from Zhipu that supports Chinese character generation, with comprehensive improvements in semantic understanding, image generation quality, and Chinese-English text generation capabilities.',
    displayName: 'CogView-4',
    enabled: true,
    id: 'cogview-4',
    parameters: {
      prompt: {
        default: '',
      },
      size: {
        default: '1024x1024',
        enum: ['1024x1024', '768x1344', '864x1152', '1344x768', '1152x864', '1440x720', '720x1440'],
      },
    },
    releasedAt: '2025-03-04',
    type: 'image',
  },
];
```

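
The `parameters` object above declares a `default` and, for `size`, an `enum` of allowed values. As a rough illustration of what such a declaration makes possible, the sketch below fills in defaults and rejects values outside the `enum`. The `ParamSpec` type and `resolveParams` helper are hypothetical and written only for this example; how LobeHub itself consumes the model card is not described here and may differ.

```ts
type ParamValue = string | number;

// Hypothetical, simplified view of one entry in a model card's `parameters`.
interface ParamSpec {
  default: ParamValue;
  enum?: ParamValue[];
}

type ParamSchema = Record<string, ParamSpec>;

// Fill in defaults for missing values and reject values outside `enum`.
function resolveParams(
  schema: ParamSchema,
  input: Record<string, ParamValue | undefined>,
): Record<string, ParamValue> {
  const resolved: Record<string, ParamValue> = {};
  for (const [key, spec] of Object.entries(schema)) {
    const value = input[key] ?? spec.default;
    if (spec.enum && !spec.enum.includes(value)) {
      throw new Error(`Unsupported value for "${key}": ${value}`);
    }
    resolved[key] = value;
  }
  return resolved;
}

// Same values as the CogView-4 configuration above.
const cogviewParameters: ParamSchema = {
  prompt: { default: '' },
  size: {
    default: '1024x1024',
    enum: ['1024x1024', '768x1344', '864x1152', '1344x768', '1152x864', '1440x720', '720x1440'],
  },
};

const resolved = resolveParams(cogviewParameters, { prompt: 'a red fox' });
// resolved.size is '1024x1024' because no size was supplied.
```

Passing `size: '512x512'` to `resolveParams` throws, since that value is not in the declared `enum`.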
## Non-OpenAI Compatible Models

For image generation models that are not compatible with the OpenAI format, you need to implement a custom `createImage` method. There are two main approaches:

### Method 1: Using OpenAI Compatible Factory

Most Providers use `openaiCompatibleFactory` for OpenAI compatibility. You can pass in a custom `createImage` function (see [PR #8534](https://github.com/lobehub/lobehub/pull/8534)).

**Implementation Steps**:

1. **Read Provider documentation and standard parameter definitions**
   - Review the Provider's image generation API documentation to understand request and response formats
   - Read `packages/model-bank/src/standard-parameters/index.ts` to understand supported parameters
   - Add image model configuration in the corresponding AI models file

2. **Implement custom createImage method**
   - Create a standalone image generation function that accepts standard parameters
   - Convert standard parameters to Provider-specific format
   - Call the Provider's image generation API
   - Return a unified response format (imageUrl and optional width/height)

3. **Add tests**
   - Write unit tests covering success scenarios
   - Test various error cases and edge conditions

**Code Example**:

```ts
// packages/model-runtime/src/providers/<provider-name>/createImage.ts
export const createProviderImage = async (
  payload: ImageGenerationPayload,
  options: any,
): Promise<ImageGenerationResponse> => {
  const { model, prompt, ...params } = payload;

  // Call the Provider's native API.
  // `callProviderAPI` and `custom_param` are placeholders for the Provider's own client and field names.
  const result = await callProviderAPI({
    model,
    prompt,
    // Convert parameter format
    custom_param: params.width,
    // ...
  });

  // Return the unified format (check the response type definition for the exact fields)
  return {
    created: Date.now(),
    data: [{ url: result.imageUrl }],
  };
};
```

```ts
// packages/model-runtime/src/providers/<provider-name>/index.ts
export const LobeProviderAI = openaiCompatibleFactory({
  constructorOptions: {
    // ... other configurations
  },
  createImage: createProviderImage, // Pass custom implementation
  provider: ModelProvider.ProviderName,
});
```

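
The "convert standard parameters to Provider-specific format" step is usually where most of the work is. The sketch below shows one way to isolate that conversion in a pure function so it can be tested without network calls. The target request shape (`image_size`, `num_inference_steps`, `guidance_scale`) belongs to an imaginary Provider and is an assumption made for illustration, as are the `StandardParams` type and helper names.

```ts
// Subset of the standard parameters used by this sketch (hypothetical type).
interface StandardParams {
  prompt: string;
  size?: string; // e.g. "1024x1024"
  width?: number;
  height?: number;
  seed?: number;
  steps?: number;
  cfg?: number;
}

// Request body of an imaginary Provider; field names are assumptions.
interface HypotheticalProviderRequest {
  prompt: string;
  image_size?: { width: number; height: number };
  seed?: number;
  num_inference_steps?: number;
  guidance_scale?: number;
}

// Parse "WIDTHxHEIGHT" into numbers; returns undefined for any other format.
function parseSize(size: string): { width: number; height: number } | undefined {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return undefined;
  return { width: Number(match[1]), height: Number(match[2]) };
}

function toProviderRequest(params: StandardParams): HypotheticalProviderRequest {
  const request: HypotheticalProviderRequest = { prompt: params.prompt };

  // Explicit width/height win over the preset `size` string.
  const explicit =
    params.width !== undefined && params.height !== undefined
      ? { width: params.width, height: params.height }
      : undefined;
  const imageSize = explicit ?? (params.size ? parseSize(params.size) : undefined);

  if (imageSize) request.image_size = imageSize;
  if (params.seed !== undefined) request.seed = params.seed;
  if (params.steps !== undefined) request.num_inference_steps = params.steps;
  if (params.cfg !== undefined) request.guidance_scale = params.cfg;
  return request;
}

const converted = toProviderRequest({ prompt: 'a paper boat', size: '768x1344', steps: 20 });
```

Keeping the conversion separate from the API call means a custom `createImage` only has to call `toProviderRequest`, send the result, and map the response.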
### Method 2: Direct Implementation in Provider Class

If your Provider has its own class implementation, you can add the `createImage` method directly to the class (see [PR #8503](https://github.com/lobehub/lobehub/pull/8503)).

**Implementation Steps**:

1. **Read Provider documentation and standard parameter definitions**
   - Review the Provider's image generation API documentation
   - Read `packages/model-bank/src/standard-parameters/index.ts`
   - Add image model configuration in the corresponding AI models file

2. **Implement createImage method in Provider class**
   - Add the `createImage` method directly in the class
   - Handle parameter conversion and API calls
   - Return a unified response format

3. **Add tests**
   - Write comprehensive test cases for the new method

**Code Example**:

```ts
// packages/model-runtime/src/providers/<provider-name>/index.ts
export class LobeProviderAI {
  async createImage(
    payload: ImageGenerationPayload,
    options?: ChatStreamCallbacks,
  ): Promise<ImageGenerationResponse> {
    const { model, prompt, ...params } = payload;

    // Call native API and handle response
    const result = await this.client.generateImage({
      model,
      prompt,
      // Parameter conversion
    });

    return {
      created: Date.now(),
      data: [{ url: result.url }],
    };
  }
}
```

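
For the "add tests" step, one approach is to inject the Provider client so that tests can replace it with a stub. The sketch below is self-contained and uses plain checks instead of a test framework; `ImageClient`, `SketchProviderAI`, and the stub clients are hypothetical names invented for this example and do not mirror the real class.

```ts
// Minimal client contract assumed for this sketch.
interface ImageClient {
  generateImage(request: { model: string; prompt: string }): Promise<{ url: string }>;
}

// Simplified stand-in for a Provider class with an injectable client.
class SketchProviderAI {
  private client: ImageClient;

  constructor(client: ImageClient) {
    this.client = client;
  }

  async createImage(payload: { model: string; prompt: string }) {
    const result = await this.client.generateImage(payload);
    return { created: Date.now(), data: [{ url: result.url }] };
  }
}

// Stub that always succeeds.
const okClient: ImageClient = {
  generateImage: async () => ({ url: 'https://example.com/image.png' }),
};

// Stub that always fails, to exercise the error path.
const failingClient: ImageClient = {
  generateImage: async () => {
    throw new Error('quota exceeded');
  },
};

// Runs one success case and one error case, returning a short report.
async function runChecks(): Promise<string[]> {
  const report: string[] = [];
  const payload = { model: 'demo-model', prompt: 'a red kite' };

  const ok = await new SketchProviderAI(okClient).createImage(payload);
  report.push(ok.data[0].url === 'https://example.com/image.png' ? 'success: ok' : 'success: FAILED');

  try {
    await new SketchProviderAI(failingClient).createImage(payload);
    report.push('error: FAILED (no throw)');
  } catch (error) {
    const matches = error instanceof Error && error.message === 'quota exceeded';
    report.push(matches ? 'error: ok' : 'error: FAILED');
  }
  return report;
}
```

In a real test file, the two branches of `runChecks` would become separate test cases, with further cases for edge conditions such as an empty prompt or a malformed Provider response.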
### Important Notes

- **Testing Requirements**: Add comprehensive unit tests for custom implementations, covering success scenarios and error cases
- **Error Handling**: Wrap errors consistently with `AgentRuntimeError` so that error messages stay uniform
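
The error-handling note can be pictured as a single wrapper that converts any failure into one shape. The sketch below deliberately does not call `AgentRuntimeError`, whose exact signature is not shown in this guide; `RuntimeImageError`, `toRuntimeError`, `withImageErrorHandling`, and the `'ProviderBizError'` value are stand-ins chosen for illustration, and the real helper should be used in actual Provider code.

```ts
// Hypothetical uniform error shape; a stand-in for what the project's
// AgentRuntimeError helper produces.
interface RuntimeImageError {
  errorType: string;
  provider: string;
  error: { message: string };
}

// Normalize any thrown value into the uniform shape.
function toRuntimeError(provider: string, cause: unknown): RuntimeImageError {
  const message = cause instanceof Error ? cause.message : String(cause);
  // 'ProviderBizError' is a placeholder error type for this sketch.
  return { errorType: 'ProviderBizError', provider, error: { message } };
}

// Run an image call and rethrow any failure in the uniform shape.
async function withImageErrorHandling<T>(provider: string, call: () => Promise<T>): Promise<T> {
  try {
    return await call();
  } catch (cause) {
    throw toRuntimeError(provider, cause);
  }
}
```

Wrapping the Provider call in one place means callers can rely on the same fields regardless of which Provider failed.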