diff --git a/plugin-dev-en/0411-persistent-storage-kv.mdx b/plugin-dev-en/0411-persistent-storage-kv.mdx
index cf09907e..8a79d7e0 100644
--- a/plugin-dev-en/0411-persistent-storage-kv.mdx
+++ b/plugin-dev-en/0411-persistent-storage-kv.mdx
@@ -7,63 +7,148 @@ dimensions:
standard_title: Persistent Storage KV
language: en
title: Persistent Storage
-description: This document introduces the persistent storage functionality in Dify
- plugins, detailing how to use the KV database in plugins to store, retrieve, and
- delete data. This feature enables plugins to persistently store data within the
- same Workspace, meeting the needs for data preservation across sessions.
+description: Learn how to implement persistent storage in your Dify plugins using the built-in key-value database to maintain state across interactions.
---
-When examining Tools and Endpoints in plugins individually, it's not difficult to see that in most cases, they can only complete single-round interactions: request, return data, and the task ends.
+## Overview
-If there is data that needs to be stored long-term, such as implementing persistent memory, the plugin needs to have persistent storage capabilities. **The persistent storage mechanism allows plugins to have the ability to persistently store data within the same Workspace**. Currently, a KV database is provided to meet storage needs, and more flexible and powerful storage interfaces may be introduced in the future based on actual usage.
+Most plugin tools and endpoints operate in a stateless, single-round interaction model:
+1. Receive a request
+2. Process data
+3. Return a response
+4. End the interaction
-### Storing Keys
+However, many real-world applications require maintaining state across multiple interactions. This is where **persistent storage** becomes essential.
-#### **Entry Point**
+
+The persistent storage mechanism allows plugins to store data persistently within the same workspace, enabling stateful applications and memory features.
+
+
+Dify currently provides a key-value (KV) storage system for plugins, with plans to introduce more flexible and powerful storage interfaces in the future based on developer needs.
+
+## Accessing Storage
+
+All storage operations are performed through the `storage` object available in your plugin's session:
```python
- self.session.storage
+# Access the storage interface
+storage = self.session.storage
```
-#### **Interface**
+## Storage Operations
+
+### Storing Data
+
+Store data with the `set` method:
```python
- def set(self, key: str, val: bytes) -> None:
- pass
+def set(self, key: str, val: bytes) -> None:
+ """
+ Store data in persistent storage
+
+ Parameters:
+ key: Unique identifier for your data
+ val: Binary data to store (bytes)
+ """
+ pass
```
-Note that what is passed in is bytes, so you can actually store files in it.
+
+The value must be in `bytes` format. This provides flexibility to store various types of data, including files.
+
-### Getting Keys
-
-#### **Entry Point**
+#### Example: Storing Different Data Types
```python
- self.session.storage
+# String data (must convert to bytes)
+storage.set("user_name", "John Doe".encode('utf-8'))
+
+# JSON data
+import json
+user_data = {"name": "John", "age": 30, "preferences": ["AI", "NLP"]}
+storage.set("user_data", json.dumps(user_data).encode('utf-8'))
+
+# File data
+with open("image.jpg", "rb") as f:
+ image_data = f.read()
+ storage.set("profile_image", image_data)
```
-#### **Interface**
+### Retrieving Data
+
+Retrieve stored data with the `get` method:
```python
- def get(self, key: str) -> bytes:
- pass
+def get(self, key: str) -> bytes:
+ """
+ Retrieve data from persistent storage
+
+ Parameters:
+ key: Unique identifier for your data
+
+ Returns:
+ The stored data as bytes, or None if key doesn't exist
+ """
+ pass
```
-### Deleting Keys
-
-#### **Entry Point**
+#### Example: Retrieving and Converting Data
```python
- self.session.storage
+# Retrieving string data
+name_bytes = storage.get("user_name")
+if name_bytes:
+ name = name_bytes.decode('utf-8')
+ print(f"Retrieved name: {name}")
+
+# Retrieving JSON data
+import json
+user_data_bytes = storage.get("user_data")
+if user_data_bytes:
+ user_data = json.loads(user_data_bytes.decode('utf-8'))
+ print(f"User preferences: {user_data['preferences']}")
```
-#### **Interface**
+### Deleting Data
+
+Delete stored data with the `delete` method:
```python
- def delete(self, key: str) -> None:
- pass
+def delete(self, key: str) -> None:
+ """
+ Delete data from persistent storage
+
+ Parameters:
+ key: Unique identifier for the data to delete
+ """
+ pass
```
+## Best Practices
+
+- **Use consistent key naming**: Create a consistent naming scheme for your keys to avoid conflicts and make your code more maintainable.
+- **Check for existence**: Always check if data exists before processing it, as the key might not be found.
+- **Serialize complex data**: Convert complex objects to JSON or other serialized formats before storing.
+- **Handle errors**: Wrap storage operations in try/except blocks to handle potential errors gracefully.
+
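The existence-check and error-handling practices can be combined into one small helper. The sketch below is illustrative only: `FakeStorage` is an in-memory stand-in for `self.session.storage`, and `load_json` is a hypothetical wrapper, not part of the plugin SDK:

```python
import json

class FakeStorage:
    """In-memory stand-in for self.session.storage (illustration only)."""
    def __init__(self):
        self._data = {}

    def set(self, key: str, val: bytes) -> None:
        self._data[key] = val

    def get(self, key: str):
        return self._data.get(key)

def load_json(storage, key: str, default=None):
    """Read a JSON value, falling back to a default on missing or corrupt data."""
    try:
        raw = storage.get(key)
        if raw is None:
            return default
        return json.loads(raw.decode("utf-8"))
    except (ValueError, UnicodeDecodeError):
        return default

storage = FakeStorage()
storage.set("prefs", json.dumps({"theme": "dark"}).encode("utf-8"))
print(load_json(storage, "prefs"))        # {'theme': 'dark'}
print(load_json(storage, "missing", {}))  # {}
```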
+## Common Use Cases
+
+- **User Preferences**: Store user settings and preferences between sessions
+- **Conversation History**: Maintain context from previous conversations
+- **API Tokens**: Store authentication tokens securely
+- **Cached Data**: Store frequently accessed data to reduce API calls
+- **File Storage**: Store user-uploaded files or generated content
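As a sketch of the conversation-history use case, the helper below appends messages to a JSON list under a namespaced key. `FakeStorage` and `append_message` are illustrative stand-ins; in a real plugin you would pass `self.session.storage` instead:

```python
import json

class FakeStorage:
    """In-memory stand-in for self.session.storage (illustration only)."""
    def __init__(self):
        self._data = {}

    def set(self, key: str, val: bytes) -> None:
        self._data[key] = val

    def get(self, key: str):
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

def append_message(storage, conversation_id: str, message: dict) -> list:
    """Append one message to the stored history and return the full history."""
    key = f"history:{conversation_id}"  # namespaced key to avoid collisions
    raw = storage.get(key)
    history = json.loads(raw.decode("utf-8")) if raw else []
    history.append(message)
    storage.set(key, json.dumps(history).encode("utf-8"))
    return history

storage = FakeStorage()
append_message(storage, "42", {"role": "user", "content": "Hello"})
history = append_message(storage, "42", {"role": "assistant", "content": "Hi there"})
print(len(history))  # 2

# Remove the history once the conversation ends
storage.delete("history:42")
print(storage.get("history:42"))  # None
```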
+
{/*
Contributing Section
DO NOT edit this section!
diff --git a/plugin-dev-en/0412-model-schema.mdx b/plugin-dev-en/0412-model-schema.mdx
index 71a04720..a8aa980e 100644
--- a/plugin-dev-en/0412-model-schema.mdx
+++ b/plugin-dev-en/0412-model-schema.mdx
@@ -7,386 +7,947 @@ dimensions:
standard_title: Model Schema
language: en
title: Model API Interface
-description: This document provides detailed interface specifications required for
- Dify model plugin development, including model provider implementation, interface
- definitions for five model types (LLM, TextEmbedding, Rerank, Speech2text, Text2speech),
- and complete specifications for related data structures such as PromptMessage and
- LLMResult. The document serves as a development reference for developers implementing
- various model integrations.
+description: Comprehensive guide to the Dify model plugin API including implementation requirements for LLM, TextEmbedding, Rerank, Speech2text, and Text2speech models, with detailed specifications for all related data structures.
---
-This section introduces the interface methods and parameter descriptions that providers and each model type need to implement. Before developing a model plugin, you may first need to read [Model Design Rules](/plugin-dev-en/0411-model-designing-rules) and [Model Plugin Introduction](/plugin-dev-en/0131-model-plugin-introduction).
+## Introduction
-### Model Provider
+This document details the interfaces and data structures required to implement Dify model plugins. It serves as a technical reference for developers integrating AI models with the Dify platform.
-Inherit the `__base.model_provider.ModelProvider` base class and implement the following interface:
+
+Before diving into this API reference, we recommend first reading the [Model Design Rules](/plugin-dev-en/0411-model-designing-rules) and [Model Plugin Introduction](/plugin-dev-en/0131-model-plugin-introduction) for conceptual understanding.
+
-```python
+This reference covers:
+
+- **Model Provider**: How to implement model provider classes for different AI service providers
+- **Model Types**: Implementation details for the supported model types: LLM, Embedding, Rerank, Speech2Text, Text2Speech, and Moderation
+- **Data Structures**: Comprehensive reference for all data structures used in the model API
+- **Error Handling**: Guidelines for proper error mapping and exception handling
+
+## Model Provider
+
+Every model provider must inherit from the `__base.model_provider.ModelProvider` base class and implement the credential validation interface.
+
+### Provider Credential Validation
+
+
+```python Core Implementation
def validate_provider_credentials(self, credentials: dict) -> None:
"""
- Validate provider credentials
- You can choose any validate_credentials method of model type or implement validate method by yourself,
- such as: get model list api
-
- if validate failed, raise exception
-
- :param credentials: provider credentials, credentials form defined in `provider_credential_schema`.
+ Validate provider credentials by making a test API call
+
+ Parameters:
+ credentials: Provider credentials as defined in `provider_credential_schema`
+
+ Raises:
+ CredentialsValidateFailedError: If validation fails
"""
+ try:
+ # Example implementation - validate using an LLM model instance
+ model_instance = self.get_model_instance(ModelType.LLM)
+ model_instance.validate_credentials(
+ model="example-model",
+ credentials=credentials
+ )
+ except Exception as ex:
+        logger.exception("Credential validation failed")
+ raise CredentialsValidateFailedError(f"Invalid credentials: {str(ex)}")
```
-* `credentials` (object) Credential information
-
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema`, passed in as `api_key`, etc. If validation fails, please throw a `errors.validate.CredentialsValidateFailedError` error. **Note: Predefined models need to fully implement this interface, while custom model providers only need to implement it simply as follows:**
-
-```python
+```python Custom Model Provider
class XinferenceProvider(Provider):
def validate_provider_credentials(self, credentials: dict) -> None:
+ """
+ For custom-only model providers, a simple implementation is sufficient
+ as validation happens at the model level
+ """
pass
```
+
-### Models
+**Parameter:**
+
+- `credentials` (object): Credential information as defined in the provider's YAML configuration under `provider_credential_schema`. Typically includes fields like `api_key`, `organization_id`, etc.
+
-Models are divided into 5 different types, with different base classes to inherit from and different methods to implement for each type.
+
+If validation fails, your implementation must raise a `CredentialsValidateFailedError` exception. This ensures proper error handling in the Dify UI.
+
-#### Common Interfaces
+
+For predefined model providers, you should implement a thorough validation method that verifies the credentials work with your API. For custom model providers (where each model has its own credentials), a simplified implementation is sufficient.
+
-All models need to implement the following 2 methods consistently:
+## Models
-* Model credential validation
+Dify supports several distinct model types (LLM, Text Embedding, Rerank, Speech2Text, Text2Speech, and Moderation), each requiring implementation of specific interfaces. However, all model types share some common requirements.
-Similar to provider credential validation, this validates individual models.
+### Common Interfaces
-```python
+Every model implementation, regardless of type, must implement these two fundamental methods:
+
+#### 1. Model Credential Validation
+
+
+```python Implementation
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
- Validate model credentials
-
- :param model: model name
- :param credentials: model credentials
- :return:
+ Validate that the provided credentials work with the specified model
+
+ Parameters:
+ model: The specific model identifier (e.g., "gpt-4")
+ credentials: Authentication details for the model
+
+ Raises:
+ CredentialsValidateFailedError: If validation fails
"""
+ try:
+ # Make a lightweight API call to verify credentials
+ # Example: List available models or check account status
+ response = self._api_client.validate_api_key(credentials["api_key"])
+
+ # Verify the specific model is available if applicable
+ if model not in response.get("available_models", []):
+ raise CredentialsValidateFailedError(f"Model {model} is not available")
+
+ except ApiException as e:
+ raise CredentialsValidateFailedError(str(e))
```
+
-Parameters:
-* `model` (string) Model name
-* `credentials` (object) Credential information
+**Parameters:**
+
+- `model` (string): The specific model identifier to validate (e.g., "gpt-4", "claude-3-opus")
+- `credentials` (object): Credential information as defined in the provider's configuration
+
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc. If validation fails, please throw a `errors.validate.CredentialsValidateFailedError` error.
+#### 2. Error Mapping
-* Invocation error mapping table
-
-When a model invocation exception occurs, it needs to be mapped to a specified `InvokeError` type in Runtime, which helps Dify handle different errors differently. Runtime Errors:
-
-* `InvokeConnectionError` Connection error during invocation
-* `InvokeServerUnavailableError` Service provider unavailable
-* `InvokeRateLimitError` Rate limit reached
-* `InvokeAuthorizationError` Authentication failed
-* `InvokeBadRequestError` Incorrect parameters passed
-
-```python
+
+```python Implementation
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
"""
- Map model invoke error to unified error
- The key is the error type thrown to the caller
- The value is the error type thrown by the model,
- which needs to be converted into a unified error type for the caller.
-
- :return: Invoke error mapping
+ Map provider-specific exceptions to standardized Dify error types
+
+ Returns:
+ Dictionary mapping Dify error types to lists of provider exception types
"""
+ return {
+ InvokeConnectionError: [
+ requests.exceptions.ConnectionError,
+ requests.exceptions.Timeout,
+ ConnectionRefusedError
+ ],
+ InvokeServerUnavailableError: [
+ ServiceUnavailableError,
+ HTTPStatusError
+ ],
+ InvokeRateLimitError: [
+ RateLimitExceededError,
+ QuotaExceededError
+ ],
+ InvokeAuthorizationError: [
+ AuthenticationError,
+ InvalidAPIKeyError,
+ PermissionDeniedError
+ ],
+ InvokeBadRequestError: [
+ InvalidRequestError,
+ ValidationError
+ ]
+ }
+```
+
+
+Dify defines five standardized invocation error types:
+
+- `InvokeConnectionError`: Network connection failures, timeouts
+- `InvokeServerUnavailableError`: Service provider is down or unavailable
+- `InvokeRateLimitError`: Rate limits or quota limits reached
+- `InvokeAuthorizationError`: Authentication or permission issues
+- `InvokeBadRequestError`: Invalid parameters or requests
+
+You can alternatively raise these standardized error types directly in your code instead of relying on the error mapping. This approach gives you more control over error messages.
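For instance, a credential check might raise the standardized type directly. In this sketch the error classes are local stand-ins that mirror the names of Dify's runtime error types, so the snippet is self-contained:

```python
# Local stand-ins mirroring the names of Dify's runtime error types
class InvokeError(Exception):
    pass

class InvokeAuthorizationError(InvokeError):
    pass

def require_api_key(credentials: dict) -> str:
    # Raise the standardized error type directly instead of
    # mapping a provider-specific exception afterwards
    api_key = credentials.get("api_key")
    if not api_key:
        raise InvokeAuthorizationError("API key is missing or empty")
    return api_key

try:
    require_api_key({})
except InvokeError as exc:
    print(type(exc).__name__)  # InvokeAuthorizationError
```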
+
+
+### LLM Implementation
+
+To implement a Large Language Model provider, inherit from the `__base.large_language_model.LargeLanguageModel` base class and implement these methods:
+
+#### 1. Model Invocation
+
+This core method handles both streaming and non-streaming API calls to language models.
+
+
+```python Core Implementation
+def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ prompt_messages: list[PromptMessage],
+ model_parameters: dict,
+ tools: Optional[list[PromptMessageTool]] = None,
+ stop: Optional[list[str]] = None,
+ stream: bool = True,
+ user: Optional[str] = None
+) -> Union[LLMResult, Generator[LLMResultChunk, None, None]]:
+ """
+ Invoke the language model
+ """
+ # Prepare API parameters
+ api_params = self._prepare_api_parameters(
+ model,
+ credentials,
+ prompt_messages,
+ model_parameters,
+ tools,
+ stop
+ )
+
+ try:
+ # Choose between streaming and non-streaming implementation
+ if stream:
+ return self._invoke_stream(model, api_params, user)
+ else:
+ return self._invoke_sync(model, api_params, user)
+
+ except Exception as e:
+ # Map errors using the error mapping property
+ self._handle_api_error(e)
+
+# Helper methods for streaming and non-streaming calls
+def _invoke_stream(self, model, api_params, user):
+ # Implement streaming call and yield chunks
+ pass
+
+def _invoke_sync(self, model, api_params, user):
+ # Implement synchronous call and return complete result
+ pass
+```
+
+**Parameters:**
+
+- `model` (string): Model identifier (e.g., "gpt-4", "claude-3")
+- `credentials` (object): Authentication credentials for the API
+- `prompt_messages` (array[PromptMessage]): Message list in Dify's standardized format:
+  - For `completion` models: Include a single `UserPromptMessage`
+  - For `chat` models: Include `SystemPromptMessage`, `UserPromptMessage`, `AssistantPromptMessage`, and `ToolPromptMessage` as needed
+- `model_parameters` (object): Model-specific parameters (temperature, top_p, etc.) as defined in the model's YAML configuration
+- `tools` (array[PromptMessageTool], optional): Tool definitions for function calling capabilities
+- `stop` (array[string], optional): Stop sequences that will halt model generation when encountered
+- `stream` (bool): Whether to return a streaming response
+- `user` (string, optional): User identifier for API monitoring
+
+**Returns:**
+
+- When `stream=True`: a generator yielding `LLMResultChunk` objects as they become available
+- When `stream=False`: a complete `LLMResult` object with the full generated text
+
+
+We recommend implementing separate helper methods for streaming and non-streaming calls to keep your code organized and maintainable.
+
+
+#### 2. Token Counting
+
+
+```python Implementation
+def get_num_tokens(
+ self,
+ model: str,
+ credentials: dict,
+ prompt_messages: list[PromptMessage],
+ tools: Optional[list[PromptMessageTool]] = None
+) -> int:
+ """
+ Calculate the number of tokens in the prompt
+ """
+ # Convert prompt_messages to the format expected by the tokenizer
+ text = self._convert_messages_to_text(prompt_messages)
+
+ try:
+ # Use the appropriate tokenizer for this model
+ tokenizer = self._get_tokenizer(model)
+ return len(tokenizer.encode(text))
+ except Exception:
+ # Fall back to a generic tokenizer
+ return self._get_num_tokens_by_gpt2(text)
+```
+
+
+
+If the model doesn't provide a tokenizer, you can use the base class's `_get_num_tokens_by_gpt2(text)` method for a reasonable approximation.
+
+
+#### 3. Custom Model Schema (Optional)
+
+
+```python Implementation
+def get_customizable_model_schema(
+ self,
+ model: str,
+ credentials: dict
+) -> Optional[AIModelEntity]:
+ """
+ Get parameter schema for custom models
+ """
+ # For fine-tuned models, you might return the base model's schema
+ if model.startswith("ft:"):
+ base_model = self._extract_base_model(model)
+ return self._get_predefined_model_schema(base_model)
+
+ # For standard models, return None to use the predefined schema
+ return None
+```
+
+
+
+This method is only necessary for providers that support custom models. It allows custom models to inherit parameter rules from base models.
+
+
+### TextEmbedding Implementation
+
+
+Text embedding models convert text into high-dimensional vectors that capture semantic meaning, which is useful for retrieval, similarity search, and classification.
+
+
+To implement a Text Embedding provider, inherit from the `__base.text_embedding_model.TextEmbeddingModel` base class:
+
+#### 1. Core Embedding Method
+
+
+```python Implementation
+def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ texts: list[str],
+ user: Optional[str] = None
+) -> TextEmbeddingResult:
+ """
+ Generate embedding vectors for multiple texts
+ """
+ # Set up API client with credentials
+ client = self._get_client(credentials)
+
+ # Handle batching if needed
+ batch_size = self._get_batch_size(model)
+ all_embeddings = []
+ total_tokens = 0
+ start_time = time.time()
+
+ # Process in batches to avoid API limits
+ for i in range(0, len(texts), batch_size):
+ batch = texts[i:i+batch_size]
+
+ # Make API call to the embeddings endpoint
+ response = client.embeddings.create(
+ model=model,
+ input=batch,
+ user=user
+ )
+
+ # Extract embeddings from response
+ batch_embeddings = [item.embedding for item in response.data]
+ all_embeddings.extend(batch_embeddings)
+
+ # Track token usage
+ total_tokens += response.usage.total_tokens
+
+ # Calculate usage metrics
+ elapsed_time = time.time() - start_time
+ usage = self._create_embedding_usage(
+ model=model,
+ tokens=total_tokens,
+ latency=elapsed_time
+ )
+
+ return TextEmbeddingResult(
+ model=model,
+ embeddings=all_embeddings,
+ usage=usage
+ )
+```
+
+**Parameters:**
+
+- `model` (string): Embedding model identifier
+- `credentials` (object): Authentication credentials for the embedding service
+- `texts` (array[string]): List of text inputs to embed
+- `user` (string, optional): User identifier for API monitoring
+
+**Returns:**
+
+A `TextEmbeddingResult` containing:
+
+- `model`: The model used for embedding
+- `embeddings`: List of embedding vectors corresponding to the input texts
+- `usage`: Metadata about token usage and costs
+
+
+#### 2. Token Counting Method
+
+
+```python Implementation
+def get_num_tokens(
+ self,
+ model: str,
+ credentials: dict,
+ texts: list[str]
+) -> int:
+ """
+ Calculate the number of tokens in the texts to be embedded
+ """
+ # Join all texts to estimate token count
+ combined_text = " ".join(texts)
+
+ try:
+ # Use the appropriate tokenizer for this model
+ tokenizer = self._get_tokenizer(model)
+ return len(tokenizer.encode(combined_text))
+ except Exception:
+ # Fall back to a generic tokenizer
+ return self._get_num_tokens_by_gpt2(combined_text)
+```
+
+
+
+For embedding models, accurate token counting is important for cost estimation, but not critical for functionality. The `_get_num_tokens_by_gpt2` method provides a reasonable approximation for most models.
+
+
+### Rerank Implementation
+
+
+Reranking models help improve search quality by re-ordering a set of candidate documents based on their relevance to a query, typically after an initial retrieval phase.
+
+
+To implement a Reranking provider, inherit from the `__base.rerank_model.RerankModel` base class:
+
+
+```python Implementation
+def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ query: str,
+ docs: list[str],
+ score_threshold: Optional[float] = None,
+ top_n: Optional[int] = None,
+ user: Optional[str] = None
+) -> RerankResult:
+ """
+ Rerank documents based on relevance to the query
+ """
+ # Set up API client with credentials
+ client = self._get_client(credentials)
+
+ # Prepare request data
+ request_data = {
+ "query": query,
+ "documents": docs,
+ }
+
+ # Call reranking API endpoint
+ response = client.rerank(
+ model=model,
+ **request_data,
+ user=user
+ )
+
+ # Process results
+ ranked_results = []
+    for result in response.results:
+ # Create RerankDocument for each result
+ doc = RerankDocument(
+ index=result.document_index, # Original index in docs list
+ text=docs[result.document_index], # Original text
+ score=result.relevance_score # Relevance score
+ )
+ ranked_results.append(doc)
+
+ # Sort by score in descending order
+ ranked_results.sort(key=lambda x: x.score, reverse=True)
+
+ # Apply score threshold filtering if specified
+ if score_threshold is not None:
+ ranked_results = [doc for doc in ranked_results if doc.score >= score_threshold]
+
+ # Apply top_n limit if specified
+ if top_n is not None and top_n > 0:
+ ranked_results = ranked_results[:top_n]
+
+ return RerankResult(
+ model=model,
+ docs=ranked_results
+ )
+```
+
+
+**Parameters:**
+
+- `model` (string): Reranking model identifier
+- `credentials` (object): Authentication credentials for the API
+- `query` (string): The search query text
+- `docs` (array[string]): List of document texts to be reranked
+- `score_threshold` (float, optional): Minimum score threshold for filtering results
+- `top_n` (int, optional): Limit on the number of results to return
+- `user` (string, optional): User identifier for API monitoring
+
+**Returns:**
+
+A `RerankResult` containing:
+
+- `model`: The model used for reranking
+- `docs`: List of `RerankDocument` objects with index, text, and score
+
+
+
+Reranking can be computationally expensive, especially with large document sets. Implement batching for large document collections to avoid timeouts or excessive resource consumption.
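Such batching can be sketched with a simple slicing helper (`iter_batches` and the batch size of 32 are illustrative, not part of the Dify SDK):

```python
from typing import Iterator

def iter_batches(docs: list, batch_size: int = 32) -> Iterator[list]:
    """Yield successive fixed-size slices of the document list."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]

docs = [f"doc-{i}" for i in range(70)]
batches = list(iter_batches(docs, batch_size=32))
print([len(b) for b in batches])  # [32, 32, 6]
```

Each batch can then be sent in its own rerank request, and the per-batch scores merged before applying `score_threshold` and `top_n`.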
+
+
+### Speech2Text Implementation
+
+
+Speech-to-text models convert spoken language from audio files into written text, enabling applications like transcription services, voice commands, and accessibility features.
+
+
+To implement a Speech-to-Text provider, inherit from the `__base.speech2text_model.Speech2TextModel` base class:
+
+
+```python Implementation
+def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ file: IO[bytes],
+ user: Optional[str] = None
+) -> str:
+ """
+ Convert speech audio to text
+ """
+ # Set up API client with credentials
+ client = self._get_client(credentials)
+
+ try:
+ # Determine the file format
+ file_format = self._detect_audio_format(file)
+
+ # Prepare the file for API submission
+ # Most APIs require either a file path or binary data
+ audio_data = file.read()
+
+ # Call the speech-to-text API
+ response = client.audio.transcriptions.create(
+ model=model,
+            file=(f"audio.{file_format}", audio_data),  # filename extension matches the detected format
+ user=user
+ )
+
+ # Extract and return the transcribed text
+ return response.text
+
+ except Exception as e:
+ # Map to appropriate error type
+ self._handle_api_error(e)
+
+ finally:
+ # Reset file pointer for potential reuse
+ file.seek(0)
```
-You can also directly throw corresponding Errors and define them as follows, so that in subsequent calls you can directly throw exceptions like `InvokeConnectionError`.
-
-#### LLM
-
-Inherit the `__base.large_language_model.LargeLanguageModel` base class and implement the following interface:
-
-* LLM Invocation
-
-Implement the core method for LLM invocation, which can support both streaming and synchronous responses.
-
-```python
-def _invoke(self, model: str, credentials: dict,
- prompt_messages: list[PromptMessage], model_parameters: dict,
- tools: Optional[list[PromptMessageTool]] = None, stop: Optional[list[str]] = None,
- stream: bool = True, user: Optional[str] = None) \
- -> Union[LLMResult, Generator]:
+```python Helper Methods
+def _detect_audio_format(self, file: IO[bytes]) -> str:
"""
- Invoke large language model
-
- :param model: model name
- :param credentials: model credentials
- :param prompt_messages: prompt messages
- :param model_parameters: model parameters
- :param tools: tools for tool calling
- :param stop: stop words
- :param stream: is stream response
- :param user: unique user id
- :return: full response or stream response chunk generator result
+ Detect the audio format based on file header
"""
+ # Read the first few bytes to check the file signature
+ header = file.read(12)
+ file.seek(0) # Reset file pointer
+
+ # Check for common audio format signatures
+ if header.startswith(b'RIFF') and header[8:12] == b'WAVE':
+ return 'wav'
+ elif header.startswith(b'ID3') or header.startswith(b'\xFF\xFB'):
+ return 'mp3'
+ elif header.startswith(b'OggS'):
+ return 'ogg'
+ elif header.startswith(b'fLaC'):
+ return 'flac'
+ else:
+ # Default or additional format checks
+ return 'mp3' # Default assumption
+```
+
+**Parameters:**
+
+- `model` (string): Speech-to-text model identifier
+- `credentials` (object): Authentication credentials for the API
+- `file` (IO[bytes]): Binary file object containing the audio to transcribe
+- `user` (string, optional): User identifier for API monitoring
+
+**Returns:**
+
+A string containing the transcribed text from the audio file
+
+
+
+Audio format detection is important for proper handling of different file types. Consider implementing a helper method to detect the format from the file header as shown in the example.
+
+
+
+Some speech-to-text APIs have file size limitations. Consider implementing chunking for large audio files if necessary.
+
+
+### Text2Speech Implementation
+
+
+Text-to-speech models convert written text into natural-sounding speech, enabling applications such as voice assistants, screen readers, and audio content generation.
+
+
+To implement a Text-to-Speech provider, inherit from the `__base.text2speech_model.Text2SpeechModel` base class:
+
+
+```python Implementation
+def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ content_text: str,
+ streaming: bool,
+ user: Optional[str] = None
+) -> Union[bytes, Generator[bytes, None, None]]:
+ """
+ Convert text to speech audio
+ """
+ # Set up API client with credentials
+ client = self._get_client(credentials)
+
+ # Get voice settings based on model
+ voice = self._get_voice_for_model(model)
+
+ try:
+ # Choose implementation based on streaming preference
+ if streaming:
+ return self._stream_audio(
+ client=client,
+ model=model,
+ text=content_text,
+ voice=voice,
+ user=user
+ )
+ else:
+ return self._generate_complete_audio(
+ client=client,
+ model=model,
+ text=content_text,
+ voice=voice,
+ user=user
+ )
+ except Exception as e:
+ self._handle_api_error(e)
```
-* Parameters:
- * `model` (string) Model name
- * `credentials` (object) Credential information
-
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
-
-* `prompt_messages` (array\[[PromptMessage](#promptmessage)]) Prompt list
-
-If the model is of `Completion` type, the list only needs to include one [UserPromptMessage](#userpromptmessage) element; if the model is of `Chat` type, different messages need to be passed in as a list of [SystemPromptMessage](#systempromptmessage), [UserPromptMessage](#userpromptmessage), [AssistantPromptMessage](#assistantpromptmessage), [ToolPromptMessage](#toolpromptmessage) elements
-
-* `model_parameters` (object) Model parameters defined by the model YAML configuration's `parameter_rules`.
-
-* `tools` (array\[[PromptMessageTool](#promptmessagetool)]) \[optional] Tool list, equivalent to `function` in `function calling`. This is the tool list passed to tool calling.
-
-* `stop` (array\[string]) \[optional] Stop sequence. The model response will stop output before the string defined in the stop sequence.
-
-* `stream` (bool) Whether to stream output, default is True
-For streaming output, it returns Generator\[[LLMResultChunk](#llmresultchunk)], for non-streaming output, it returns [LLMResult](#llmresult).
-
-* `user` (string) \[optional] A unique identifier for the user that can help the provider monitor and detect abuse.
-
-* Return Value
-
-For streaming output, it returns Generator\[[LLMResultChunk](#llmresultchunk)], for non-streaming output, it returns [LLMResult](#llmresult).
-
-* Pre-calculate input tokens
-
-If the model does not provide a pre-calculation tokens interface, you can directly return 0.
-
-```python
-def get_num_tokens(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
- tools: Optional[list[PromptMessageTool]] = None) -> int:
+```python Helper Methods
+def _stream_audio(self, client, model, text, voice, user=None):
"""
- Get number of tokens for given prompt messages
-
- :param model: model name
- :param credentials: model credentials
- :param prompt_messages: prompt messages
- :param tools: tools for tool calling
- :return:
+ Implementation for streaming audio output
"""
+ # Make API request with stream=True
+ response = client.audio.speech.create(
+ model=model,
+ voice=voice,
+ input=text,
+ stream=True,
+ user=user
+ )
+
+ # Yield chunks as they arrive
+ for chunk in response:
+ if chunk:
+ yield chunk
+
+def _generate_complete_audio(self, client, model, text, voice, user=None):
+ """
+ Implementation for complete audio file generation
+ """
+ # Make API request for complete audio
+ response = client.audio.speech.create(
+ model=model,
+ voice=voice,
+ input=text,
+ user=user
+ )
+
+ # Get audio data as bytes
+ audio_data = response.content
+ return audio_data
+```
+
+**Parameters:**
+
+- `model` (string): Text-to-speech model identifier
+- `credentials` (object): Authentication credentials for the API
+- `content_text` (string): Text content to be converted to speech
+- `streaming` (bool): Whether to return streaming audio or a complete file
+- `user` (string, optional): User identifier for API monitoring
+
+**Returns:**
+
+- When `streaming=True`: a generator yielding audio chunks as they become available
+- When `streaming=False`: complete audio data as bytes
+
+
+
+Most text-to-speech APIs require you to specify a voice along with the model. Consider implementing a mapping between Dify's model identifiers and the provider's voice options.
+
+
+
+Long text inputs may need to be chunked for better speech synthesis quality. Consider implementing text preprocessing to handle punctuation, numbers, and special characters properly.
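Sentence-level chunking of long inputs could be sketched as follows; the sentence-splitting regex and the 400-character default are illustrative, and real limits depend on the provider:

```python
import re

def chunk_text(text: str, max_chars: int = 400) -> list:
    """Split text on sentence boundaries, then pack sentences into size-limited chunks."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

print(chunk_text("First sentence. Second sentence! Third?", max_chars=20))
# ['First sentence.', 'Second sentence!', 'Third?']
```

Each chunk can then be synthesized separately and the resulting audio segments concatenated or streamed in order.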
+
+
+
+### Moderation Implementation
+
+
+Moderation models analyze content for potentially harmful, inappropriate, or unsafe material, helping maintain platform safety and content policies.
+
+
+To implement a Moderation provider, inherit from the `__base.moderation_model.ModerationModel` base class:
+
+
+```python Implementation
+def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ text: str,
+ user: Optional[str] = None
+) -> bool:
+ """
+ Analyze text for harmful content
+
+ Returns:
+ bool: False if the text is safe, True if it contains harmful content
+ """
+ # Set up API client with credentials
+ client = self._get_client(credentials)
+
+ try:
+ # Call moderation API
+ response = client.moderations.create(
+ model=model,
+ input=text,
+ user=user
+ )
+
+ # Check if any categories were flagged
+ result = response.results[0]
+
+ # Return True if flagged in any category, False if safe
+ return result.flagged
+
+    except Exception as e:
+        # Log the error and fail open (treat the text as safe) when the API
+        # call fails; production systems may prefer to fail closed and block
+        # content while moderation is unavailable
+        logger.error(f"Moderation API error: {str(e)}")  # assumes a module-level logger
+        return False
```
-Parameter explanations are the same as in `LLM Invocation` above. This interface needs to calculate based on the appropriate `tokenizer` for the corresponding `model`. If the corresponding model does not provide a `tokenizer`, you can use the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class for calculation.
-
-* Get custom model rules [optional]
-
-```python
-def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
+```python Detailed Implementation
+def _invoke(
+ self,
+ model: str,
+ credentials: dict,
+ text: str,
+ user: Optional[str] = None
+) -> bool:
"""
- Get customizable model schema
-
- :param model: model name
- :param credentials: model credentials
- :return: model schema
+ Analyze text for harmful content with detailed category checking
"""
+ # Set up API client with credentials
+ client = self._get_client(credentials)
+
+ try:
+ # Call moderation API
+ response = client.moderations.create(
+ model=model,
+ input=text,
+ user=user
+ )
+
+ # Get detailed category results
+ result = response.results[0]
+ categories = result.categories
+
+ # Check specific categories based on your application's needs
+ # For example, you might want to flag certain categories but not others
+ critical_violations = [
+ categories.harassment,
+ categories.hate,
+ categories.self_harm,
+ categories.sexual,
+ categories.violence
+ ]
+
+ # Flag content if any critical category is violated
+ return any(critical_violations)
+
+    except Exception as e:
+        # Delegate to provider-specific error handling; if the handler
+        # does not re-raise, fall back to treating the text as safe
+        self._handle_api_error(e)
+        return False
```
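The category-filtering step above is just a predicate over a set of boolean flags, so it can be factored out and tested in isolation. A runnable sketch, with a plain dict standing in for the provider's category result object (the category names follow the example above, but the helper itself is illustrative, not part of the plugin API):

```python
# Standalone sketch of category-based flagging. A plain dict stands in
# for the provider's category result object; names are illustrative.

CRITICAL_CATEGORIES = ("harassment", "hate", "self_harm", "sexual", "violence")

def is_flagged(categories: dict, critical=CRITICAL_CATEGORIES) -> bool:
    """Return True if any critical category is marked as violated."""
    return any(categories.get(name, False) for name in critical)

# Categories outside the critical set (e.g. "spam") do not trigger a flag.
verdict = is_flagged({"harassment": False, "hate": True, "spam": False})
```

Keeping the critical set in one place makes it easy to tune which categories block content without touching the API-calling code.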
+
-When a provider supports adding custom LLMs, this method can be implemented to allow custom models to obtain model rules. By default, it returns None.
+
+**Parameters:**
+
+- `model` (string): Moderation model identifier
+- `credentials` (object): Authentication credentials for the API
+- `text` (string): Text content to be analyzed
+- `user` (string, optional): User identifier for API monitoring
+
-For most fine-tuned models under the `OpenAI` provider, the base model can be obtained through the fine-tuned model name, such as `gpt-3.5-turbo-1106`, and then return the predefined parameter rules of the base model. Refer to the specific implementation of [OpenAI](https://github.com/langgenius/dify-official-plugins/tree/main/models/openai).
+
+**Returns:**
+
+Boolean indicating content safety:
+- `False`: the content is safe
+- `True`: the content contains harmful material
+
+
-#### TextEmbedding
+
+Moderation is often used as a safety mechanism. Consider the implications of false negatives (letting harmful content through) versus false positives (blocking safe content) when implementing your solution.
+
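One way to make that trade-off explicit is to parameterize the fallback behavior instead of hard-coding fail-open as in the examples above. A hypothetical sketch (`moderate_with_fallback` and `fail_closed` are illustrative names, not part of the plugin API):

```python
# Hypothetical sketch: configurable fallback when the moderation API fails.
# fail_closed=True blocks content during outages (fewer false negatives);
# fail_closed=False lets it through (fewer false positives).

def moderate_with_fallback(call_api, text: str, fail_closed: bool = False) -> bool:
    """Return the moderation verdict, or the fallback policy on API failure."""
    try:
        return call_api(text)
    except Exception:
        return fail_closed

def unavailable_api(text: str) -> bool:
    """Stand-in for a moderation call whose backend is down."""
    raise RuntimeError("moderation service unavailable")
```

With `fail_closed=True`, an outage blocks everything; with the default, it blocks nothing. Choose based on the relative cost of each error type for your application.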
-Inherit the `__base.text_embedding_model.TextEmbeddingModel` base class and implement the following interface:
-
-* Embedding Invocation
-
-```python
-def _invoke(self, model: str, credentials: dict,
- texts: list[str], user: Optional[str] = None) \
- -> TextEmbeddingResult:
- """
- Invoke large language model
-
- :param model: model name
- :param credentials: model credentials
- :param texts: texts to embed
- :param user: unique user id
- :return: embeddings result
- """
-```
-
-* Parameters:
-
-* `model` (string) Model name
-* `credentials` (object) Credential information
-
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
-
-* `texts` (array\[string]) Text list, can be processed in batch
-* `user` (string) \[optional] A unique identifier for the user
-Can help the provider monitor and detect abuse.
-
-* Return:
-
-[TextEmbeddingResult](#textembeddingresult) entity.
-
-* Pre-calculate tokens
-
-```python
-def get_num_tokens(self, model: str, credentials: dict, texts: list[str]) -> int:
- """
- Get number of tokens for given prompt messages
-
- :param model: model name
- :param credentials: model credentials
- :param texts: texts to embed
- :return:
- """
-```
-
-Parameter explanations can be found in the `Embedding Invocation` section above.
-
-Similar to the `LargeLanguageModel` above, this interface needs to calculate based on the appropriate `tokenizer` for the corresponding `model`. If the corresponding model does not provide a `tokenizer`, you can use the `_get_num_tokens_by_gpt2(text: str)` method in the `AIModel` base class for calculation.
-
-#### Rerank
-
-Inherit the `__base.rerank_model.RerankModel` base class and implement the following interface:
-
-* Rerank Invocation
-
-```python
-def _invoke(self, model: str, credentials: dict,
- query: str, docs: list[str], score_threshold: Optional[float] = None, top_n: Optional[int] = None,
- user: Optional[str] = None) \
- -> RerankResult:
- """
- Invoke rerank model
-
- :param model: model name
- :param credentials: model credentials
- :param query: search query
- :param docs: docs for reranking
- :param score_threshold: score threshold
- :param top_n: top n
- :param user: unique user id
- :return: rerank result
- """
-```
-
-* Parameters:
-
-* `model` (string) Model name
-* `credentials` (object) Credential information
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
-* `query` (string) Query request content
-* `docs` (array\[string]) List of segments that need to be reranked
-* `score_threshold` (float) \[optional] Score threshold
-* `top_n` (int) \[optional] Take the top n segments
-* `user` (string) \[optional] A unique identifier for the user
-Can help the provider monitor and detect abuse.
-
-* Return:
-
-[RerankResult](#rerankresult) entity.
-
-#### Speech2text
-
-Inherit the `__base.speech2text_model.Speech2TextModel` base class and implement the following interface:
-
-* Invoke
-
-```python
-def _invoke(self, model: str, credentials: dict,
- file: IO[bytes], user: Optional[str] = None) \
- -> str:
- """
- Invoke large language model
-
- :param model: model name
- :param credentials: model credentials
- :param file: audio file
- :param user: unique user id
- :return: text for given audio file
- """
-```
-
-* Parameters:
-
-* `model` (string) Model name
-* `credentials` (object) Credential information
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
-* `file` (File) File stream
-* `user` (string) \[optional] A unique identifier for the user
-Can help the provider monitor and detect abuse.
-
-* Return:
-
-String after speech conversion.
-
-#### Text2speech
-
-Inherit the `__base.text2speech_model.Text2SpeechModel` base class and implement the following interface:
-
-* Invoke
-
-```python
-def _invoke(self, model: str, credentials: dict, content_text: str, streaming: bool, user: Optional[str] = None):
- """
- Invoke large language model
-
- :param model: model name
- :param credentials: model credentials
- :param content_text: text content to be translated
- :param streaming: output is streaming
- :param user: unique user id
- :return: translated audio file
- """
-```
-
-* Parameters:
-
-* `model` (string) Model name
-* `credentials` (object) Credential information
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
-* `content_text` (string) Text content to be converted
-* `streaming` (bool) Whether to stream output
-* `user` (string) \[optional] A unique identifier for the user
-Can help the provider monitor and detect abuse.
-
-* Return:
-
-Audio stream after text conversion.
-
-
-#### Moderation
-
-Inherit the `__base.moderation_model.ModerationModel` base class and implement the following interface:
-
-* Invoke
-
-```python
-def _invoke(self, model: str, credentials: dict,
- text: str, user: Optional[str] = None) \
- -> bool:
- """
- Invoke large language model
-
- :param model: model name
- :param credentials: model credentials
- :param text: text to moderate
- :param user: unique user id
- :return: false if text is safe, true otherwise
- """
-```
-
-* Parameters:
-
-* `model` (string) Model name
-* `credentials` (object) Credential information
-The credential parameters are defined by the provider YAML configuration file's `provider_credential_schema` or `model_credential_schema`, passed in as `api_key`, etc.
-* `text` (string) Text content
-* `user` (string) \[optional] A unique identifier for the user
-Can help the provider monitor and detect abuse.
-
-* Return:
-
-False indicates the input text is safe, True indicates it is not.
+
+Many moderation APIs provide detailed category scores rather than just a binary result. Consider extending this implementation to return more detailed information about specific categories of harmful content if your application needs it.
+
### Entities