dify-docs/plugin-dev-en/0412-model-schema.mdx

---
dimensions:
  type:
    primary: reference
    detail: core
  level: intermediate
standard_title: Model Schema
language: en
title: Model API Interface
description: This document provides detailed interface specifications required for
  Dify model plugin development, including model provider implementation, interface
  definitions for five model types (LLM, TextEmbedding, Rerank, Speech2text, Text2speech),
  and complete specifications for related data structures such as PromptMessage and
  LLMResult. The document serves as a development reference for developers implementing
  various model integrations.
---

This section introduces the interface methods and parameter descriptions that providers and each model type need to implement. Before developing a model plugin, you may first need to read [Model Design Rules](/plugin-dev-en/0411-model-designing-rules) and [Model Plugin Introduction](/plugin-dev-en/0131-model-plugin-introduction).

### Model Provider

Inherit the `__base.model_provider.ModelProvider` base class and implement the following interface:

```python
def validate_provider_credentials(self, credentials: dict) -> None:
    """
    Validate provider credentials
    You can choose any validate_credentials method of model type or implement validate method by yourself,
    such as: get model list api

    if validate failed, raise exception

    :param credentials: provider credentials, credentials form defined in `provider_credential_schema`.
    """
```

* `credentials` (object) Credential information

The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema`, passed in as `api_key`, etc. If validation fails, please throw a `errors.validate.CredentialsValidateFailedError` error. **Note: Predefined models need to fully implement this interface, while custom model providers only need to implement it simply as follows:**

```python
class XinferenceProvider(Provider):
    def validate_provider_credentials(self, credentials: dict) -> None:
        pass
```

### Models

Models are divided into 5 different types, with different base classes to inherit from and different methods to implement for each type.

#### Common Interfaces

All models need to implement the following 2 methods consistently:

* Model credential validation

Similar to provider credential validation, this validates individual models.

```python
def validate_credentials(self, model: str, credentials: dict) -> None:
    """
    Validate model credentials

    :param model: model name
    :param credentials: model credentials
    :return:
    """
```

Parameters:

* `model` (string) Model name
* `credentials` (object) Credential information

The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc. If validation fails, please throw a `errors.validate.CredentialsValidateFailedError` error.

* Invocation error mapping table

When a model invocation exception occurs, it needs to be mapped to a specified `InvokeError` type in Runtime, which helps Dify handle different errors differently. Runtime Errors:

* `InvokeConnectionError` Connection error during invocation
* `InvokeServerUnavailableError` Service provider unavailable
* `InvokeRateLimitError` Rate limit reached
* `InvokeAuthorizationError` Authentication failed
* `InvokeBadRequestError` Incorrect parameters passed

```python
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
    """
    Map model invoke error to unified error
    The key is the error type thrown to the caller
    The value is the error type thrown by the model,
    which needs to be converted into a unified error type for the caller.

    :return: Invoke error mapping
    """
```

You can also directly throw corresponding Errors and define them as follows, so that in subsequent calls you can directly throw exceptions like `InvokeConnectionError`.

#### LLM

Inherit the `__base.large_language_model.LargeLanguageModel` base class and implement the following interface:

* LLM Invocation

Implement the core method for LLM invocation, which can support both streaming and synchronous responses.

```python
def _invoke(self, model: str, credentials: dict,
            prompt_messages: list[PromptMessage], model_parameters: dict,
            tools: Optional[list[PromptMessageTool]] = None, stop: Optional[list[str]] = None,
            stream: bool = True, user: Optional[str] = None) \
        -> Union[LLMResult, Generator]:
    """
    Invoke large language model

    :param model: model name
    :param credentials: model credentials
    :param prompt_messages: prompt messages
    :param model_parameters: model parameters
    :param tools: tools for tool calling
    :param stop: stop words
    :param stream: is stream response
    :param user: unique user id
    :return: full response or stream response chunk generator result
    """
```

* Parameters:
  * `model` (string) Model name
  * `credentials` (object) Credential information

The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.

* `prompt_messages` (array\[[PromptMessage](#promptmessage)]) Prompt list

If the model is of `Completion` type, the list only needs to include one [UserPromptMessage](#userpromptmessage) element; if the model is of `Chat` type, different messages need to be passed in as a list of [SystemPromptMessage](#systempromptmessage), [UserPromptMessage](#userpromptmessage), [AssistantPromptMessage](#assistantpromptmessage), [ToolPromptMessage](#toolpromptmessage) elements

* `model_parameters` (object) Model parameters defined by the model YAML configuration's `parameter_rules`.

* `tools` (array\[[PromptMessageTool](#promptmessagetool)]) \[optional] Tool list, equivalent to `function` in `function calling`. This is the tool list passed to tool calling.

* `stop` (array\[string]) \[optional] Stop sequence. The model response will stop output before the string defined in the stop sequence.

* `stream` (bool) Whether to stream output, default is True
For streaming output, it returns Generator\[[LLMResultChunk](#llmresultchunk)], for non-streaming output, it returns [LLMResult](#llmresult).

* `user` (string) \[optional] A unique identifier for the user that can help the provider monitor and detect abuse.

* Return Value

For streaming output, it returns Generator\[[LLMResultChunk](#llmresultchunk)], for non-streaming output, it returns [LLMResult](#llmresult).

* Pre-calculate input tokens

If the model does not provide a pre-calculation tokens interface, you can directly return 0.

```python
def get_num_tokens(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                   tools: Optional[list[PromptMessageTool]] = None) -> int:
    """
    Get number of tokens for given prompt messages

    :param model: model name
    :param credentials: model credentials
    :param prompt_messages: prompt messages
    :param tools: tools for tool calling
    :return:
    """
```

Parameter explanations are the same as in `LLM Invocation` above. This interface needs to calculate based on the appropriate `tokenizer` for the corresponding `model`. If the corresponding model does not provide a `tokenizer`, you can use the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class for calculation.

* Get custom model rules [optional]

```python
def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
    """
    Get customizable model schema

    :param model: model name
    :param credentials: model credentials
    :return: model schema
    """
```

When a provider supports adding custom LLMs, this method can be implemented to allow custom models to obtain model rules. By default, it returns None.

For most fine-tuned models under the `OpenAI` provider, the base model can be obtained through the fine-tuned model name, such as `gpt-3.5-turbo-1106`, and then return the predefined parameter rules of the base model. Refer to the specific implementation of [OpenAI](https://github.com/langgenius/dify-official-plugins/tree/main/models/openai).

#### TextEmbedding

Inherit the `__base.text_embedding_model.TextEmbeddingModel` base class and implement the following interface:

* Embedding Invocation

```python
def _invoke(self, model: str, credentials: dict,
            texts: list[str], user: Optional[str] = None) \
        -> TextEmbeddingResult:
    """
    Invoke large language model

    :param model: model name
    :param credentials: model credentials
    :param texts: texts to embed
    :param user: unique user id
    :return: embeddings result
    """
```

* Parameters:

* `model` (string) Model name
* `credentials` (object) Credential information

The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.

* `texts` (array\[string]) Text list, can be processed in batch
* `user` (string) \[optional] A unique identifier for the user
Can help the provider monitor and detect abuse.

* Return:

[TextEmbeddingResult](#textembeddingresult) entity.

* Pre-calculate tokens

```python
def get_num_tokens(self, model: str, credentials: dict, texts: list[str]) -> int:
    """
    Get number of tokens for given prompt messages

    :param model: model name
    :param credentials: model credentials
    :param texts: texts to embed
    :return:
    """
```

Parameter explanations can be found in the `Embedding Invocation` section above.

Similar to the `LargeLanguageModel` above, this interface needs to calculate based on the appropriate `tokenizer` for the corresponding `model`. If the corresponding model does not provide a `tokenizer`, you can use the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class for calculation.

#### Rerank

Inherit the `__base.rerank_model.RerankModel` base class and implement the following interface:

* Rerank Invocation

```python
def _invoke(self, model: str, credentials: dict,
            query: str, docs: list[str], score_threshold: Optional[float] = None, top_n: Optional[int] = None,
            user: Optional[str] = None) \
        -> RerankResult:
    """
    Invoke rerank model

    :param model: model name
    :param credentials: model credentials
    :param query: search query
    :param docs: docs for reranking
    :param score_threshold: score threshold
    :param top_n: top n
    :param user: unique user id
    :return: rerank result
    """
```

* Parameters:

* `model` (string) Model name
* `credentials` (object) Credential information
The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
* `query` (string) Query request content
* `docs` (array\[string]) List of segments that need to be reranked
* `score_threshold` (float) \[optional] Score threshold
* `top_n` (int) \[optional] Take the top n segments
* `user` (string) \[optional] A unique identifier for the user
Can help the provider monitor and detect abuse.

* Return:

[RerankResult](#rerankresult) entity.

#### Speech2text

Inherit the `__base.speech2text_model.Speech2TextModel` base class and implement the following interface:

* Invoke

```python
def _invoke(self, model: str, credentials: dict,
            file: IO[bytes], user: Optional[str] = None) \
        -> str:
    """
    Invoke large language model

    :param model: model name
    :param credentials: model credentials
    :param file: audio file
    :param user: unique user id
    :return: text for given audio file
    """
```

* Parameters:

* `model` (string) Model name
* `credentials` (object) Credential information
The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
* `file` (File) File stream
* `user` (string) \[optional] A unique identifier for the user
Can help the provider monitor and detect abuse.

* Return:

String after speech conversion.

#### Text2speech

Inherit the `__base.text2speech_model.Text2SpeechModel` base class and implement the following interface:

* Invoke

```python
def _invoke(self, model: str, credentials: dict, content_text: str, streaming: bool, user: Optional[str] = None):
    """
    Invoke large language model

    :param model: model name
    :param credentials: model credentials
    :param content_text: text content to be translated
    :param streaming: output is streaming
    :param user: unique user id
    :return: translated audio file
    """
```

* Parameters:

* `model` (string) Model name
* `credentials` (object) Credential information
The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
* `content_text` (string) Text content to be converted
* `streaming` (bool) Whether to stream output
* `user` (string) \[optional] A unique identifier for the user
Can help the provider monitor and detect abuse.

* Return:

Audio stream after text conversion.


#### Moderation

Inherit the `__base.moderation_model.ModerationModel` base class and implement the following interface:

* Invoke

```python
def _invoke(self, model: str, credentials: dict,
            text: str, user: Optional[str] = None) \
        -> bool:
    """
    Invoke large language model

    :param model: model name
    :param credentials: model credentials
    :param text: text to moderate
    :param user: unique user id
    :return: false if text is safe, true otherwise
    """
```

* Parameters:

* `model` (string) Model name
* `credentials` (object) Credential information
The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
* `text` (string) Text content
* `user` (string) \[optional] A unique identifier for the user
Can help the provider monitor and detect abuse.

* Return:

False indicates the input text is safe, True indicates it is not.

### Entities

#### PromptMessageRole

Message role

```python
class PromptMessageRole(Enum):
    """
    Enum class for prompt message.
    """
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"
```

#### PromptMessageContentType

Message content type, divided into plain text and images.

```python
class PromptMessageContentType(Enum):
    """
    Enum class for prompt message content type.
    """
    TEXT = 'text'
    IMAGE = 'image'
```

#### PromptMessageContent

Message content base class, used only for parameter declaration, cannot be initialized.

```python
class PromptMessageContent(BaseModel):
    """
    Model class for prompt message content.
    """
    type: PromptMessageContentType
    data: str  # Content data
```

Currently supports two types: text and images, and can support text and multiple images simultaneously.
You need to initialize `TextPromptMessageContent` and `ImagePromptMessageContent` separately.

#### TextPromptMessageContent

```python
class TextPromptMessageContent(PromptMessageContent):
    """
    Model class for text prompt message content.
    """
    type: PromptMessageContentType = PromptMessageContentType.TEXT
```

When passing in text and images, text needs to be constructed as this entity as part of the `content` list.

#### ImagePromptMessageContent

```python
class ImagePromptMessageContent(PromptMessageContent):
    """
    Model class for image prompt message content.
    """
    class DETAIL(Enum):
        LOW = 'low'
        HIGH = 'high'

    type: PromptMessageContentType = PromptMessageContentType.IMAGE
    detail: DETAIL = DETAIL.LOW  # Resolution
```

When passing in text and images, images need to be constructed as this entity as part of the `content` list.
`data` can be a `url` or an image `base64` encoded string.

#### PromptMessage

Base class for all Role message bodies, used only for parameter declaration, cannot be initialized.

```python
class PromptMessage(ABC, BaseModel):
    """
    Model class for prompt message.
    """
    role: PromptMessageRole  # Message role
    content: Optional[str | list[PromptMessageContent]] = None  # Supports two types: string and content list. The content list is for multimodal needs, see PromptMessageContent for details.
    name: Optional[str] = None  # Name, optional.
```

#### UserPromptMessage

UserMessage message body, represents user messages.

```python
class UserPromptMessage(PromptMessage):
    """
    Model class for user prompt message.
    """
    role: PromptMessageRole = PromptMessageRole.USER
```

#### AssistantPromptMessage

Represents model response messages, typically used for `few-shots` or chat history input.

```python
class AssistantPromptMessage(PromptMessage):
    """
    Model class for assistant prompt message.
    """
    class ToolCall(BaseModel):
        """
        Model class for assistant prompt message tool call.
        """
        class ToolCallFunction(BaseModel):
            """
            Model class for assistant prompt message tool call function.
            """
            name: str  # Tool name
            arguments: str  # Tool parameters

        id: str  # Tool ID, only effective for OpenAI tool call, a unique ID for tool invocation, the same tool can be called multiple times
        type: str  # Default is function
        function: ToolCallFunction  # Tool call information

    role: PromptMessageRole = PromptMessageRole.ASSISTANT
    tool_calls: list[ToolCall] = []  # Model's tool call results (only returned when tools are passed in and the model decides to call them)
```

Here `tool_calls` is the list of `tool call` returned by the model after passing in `tools` to the model.

#### SystemPromptMessage

Represents system messages, typically used to set system instructions for the model.

```python
class SystemPromptMessage(PromptMessage):
    """
    Model class for system prompt message.
    """
    role: PromptMessageRole = PromptMessageRole.SYSTEM
```

#### ToolPromptMessage

Represents tool messages, used to pass results to the model for next-step planning after a tool has been executed.

```python
class ToolPromptMessage(PromptMessage):
    """
    Model class for tool prompt message.
    """
    role: PromptMessageRole = PromptMessageRole.TOOL
    tool_call_id: str  # Tool call ID, if OpenAI tool call is not supported, you can also pass in the tool name
```

The base class's `content` passes in the tool execution result.

#### PromptMessageTool

```python
class PromptMessageTool(BaseModel):
    """
    Model class for prompt message tool.
    """
    name: str  # Tool name
    description: str  # Tool description
    parameters: dict  # Tool parameters dict

```

***

#### LLMResult

```python
class LLMResult(BaseModel):
    """
    Model class for llm result.
    """
    model: str  # Actually used model
    prompt_messages: list[PromptMessage]  # Prompt message list
    message: AssistantPromptMessage  # Reply message
    usage: LLMUsage  # Tokens used and cost information
    system_fingerprint: Optional[str] = None  # Request fingerprint, refer to OpenAI parameter definition
```

#### LLMResultChunkDelta

Delta entity within each iteration in streaming response

```python
class LLMResultChunkDelta(BaseModel):
    """
    Model class for llm result chunk delta.
    """
    index: int  # Sequence number
    message: AssistantPromptMessage  # Reply message
    usage: Optional[LLMUsage] = None  # Tokens used and cost information, only returned in the last message
    finish_reason: Optional[str] = None  # Completion reason, only returned in the last message
```

#### LLMResultChunk

Iteration entity in streaming response

```python
class LLMResultChunk(BaseModel):
    """
    Model class for llm result chunk.
    """
    model: str  # Actually used model
    prompt_messages: list[PromptMessage]  # Prompt message list
    system_fingerprint: Optional[str] = None  # Request fingerprint, refer to OpenAI parameter definition
    delta: LLMResultChunkDelta  # Changes in content for each iteration
```

#### LLMUsage

```python
class LLMUsage(ModelUsage):
    """
    Model class for llm usage.
    """
    prompt_tokens: int  # Tokens used by prompt
    prompt_unit_price: Decimal  # Prompt unit price
    prompt_price_unit: Decimal  # Prompt price unit, i.e., unit price based on how many tokens
    prompt_price: Decimal  # Prompt cost
    completion_tokens: int  # Tokens used by completion
    completion_unit_price: Decimal  # Completion unit price
    completion_price_unit: Decimal  # Completion price unit, i.e., unit price based on how many tokens
    completion_price: Decimal  # Completion cost
    total_tokens: int  # Total tokens used
    total_price: Decimal  # Total cost
    currency: str  # Currency unit
    latency: float  # Request time (s)
```

***

#### TextEmbeddingResult

```python
class TextEmbeddingResult(BaseModel):
    """
    Model class for text embedding result.
    """
    model: str  # Actually used model
    embeddings: list[list[float]]  # Embedding vector list, corresponding to the input texts list
    usage: EmbeddingUsage  # Usage information
```

#### EmbeddingUsage

```python
class EmbeddingUsage(ModelUsage):
    """
    Model class for embedding usage.
    """
    tokens: int  # Tokens used
    total_tokens: int  # Total tokens used
    unit_price: Decimal  # Unit price
    price_unit: Decimal  # Price unit, i.e., unit price based on how many tokens
    total_price: Decimal  # Total cost
    currency: str  # Currency unit
    latency: float  # Request time (s)
```

***

#### RerankResult

```python
class RerankResult(BaseModel):
    """
    Model class for rerank result.
    """
    model: str  # Actually used model
    docs: list[RerankDocument]  # List of reranked segments
```

#### RerankDocument

```python
class RerankDocument(BaseModel):
    """
    Model class for rerank document.
    """
    index: int  # Original sequence number
    text: str  # Segment text content
    score: float  # Score
```

## Related Resources

- [Model Design Rules](/plugin-dev-en/0411-model-designing-rules) - Understand the standards for model configuration
- [Model Plugin Introduction](/plugin-dev-en/0411-model-plugin-introduction) - Quickly understand the basic concepts of model plugins
- [Quickly Integrate a New Model](/plugin-dev-en/0211-getting-started-new-model) - Learn how to add new models to existing providers
- [Create a New Model Provider](/plugin-dev-en/0222-creating-new-model-provider) - Learn how to develop brand new model providers

{/*
Contributing Section
DO NOT edit this section!
It will be automatically generated by the script.
*/}

---

[Edit this page](https://github.com/langgenius/dify-docs/edit/main/plugin-dev-en/0412-model-schema.mdx) | [Report an issue](https://github.com/langgenius/dify-docs/issues/new?template=docs.yml)