mirror of
https://github.com/LibreChat-AI/librechat.ai.git
synced 2026-03-27 10:48:32 +07:00
chore: Remove forcePrompt, add Moonshot Provider (#495)
* docs: add Moonshot AI configuration documentation and update known endpoints list
* docs: remove forcePrompt configuration from various AI endpoint examples and documentation
@@ -268,7 +268,6 @@ custom:
    titleModel: "gpt-3.5-turbo"
    summarize: false
    summaryModel: "gpt-3.5-turbo"
-    forcePrompt: false
    modelDisplayLabel: "Lite LLM"
```

@@ -183,7 +183,6 @@ options={[
['serverless', 'boolean', 'Specifies if the group is a serverless inference chat completions endpoint from Azure Model Catalog, for which only a model identifier, baseURL, and apiKey are needed. For more info, see serverless inference endpoints.', 'serverless: true'],
['addParams', 'object', 'Adds or overrides additional parameters for Azure OpenAI API requests. Useful for specifying API-specific options as key-value pairs.', 'addParams: {temperature: 0.7}'],
['dropParams', 'array', 'Allows for the exclusion of certain default parameters from Azure OpenAI API requests. Useful for APIs that do not accept or recognize specific parameters. This should be specified as a list of strings.', 'dropParams: [top_p, stop]'],
-['forcePrompt', 'boolean', 'Dictates whether to send a prompt parameter instead of messages in the request body. This option is useful when needing to format the request in a manner consistent with OpenAI API expectations, particularly for scenarios preferring a single text payload.', 'forcePrompt: true'],
]}
/>
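
The options in the table above map onto an Azure group entry roughly like this (a minimal sketch; the group name, key variable, and model identifier are illustrative placeholders, not values from this commit):

```yaml filename="librechat.yaml"
endpoints:
  azureOpenAI:
    groups:
      - group: "mistral-serverless"        # hypothetical group name
        apiKey: "${AZURE_MISTRAL_API_KEY}"
        baseURL: "https://Mistral-large-example.region.inference.ai.azure.com/v1/chat/completions"
        serverless: true                   # serverless endpoints need only apiKey, baseURL, and a model id
        addParams:
          temperature: 0.7                 # added or overridden on every request
        dropParams: ["top_p", "stop"]      # stripped from every request
        models:
          mistral-large: true
```
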
@@ -215,7 +214,6 @@ endpoints:
    dropParams:
      - "frequency_penalty"
      - "presence_penalty"
-    forcePrompt: false
    models:
      # ... (model-level configurations)
```

@@ -583,4 +581,3 @@ endpoints:
- Compatibility with LibreChat relies on parity with OpenAI API specs, which at the time of writing, are typically **"Pay-as-you-go"** or "Models as a Service" (MaaS) deployments on Azure AI Studio, that are OpenAI-SDK-compatible with either `v1/completions` or `models/chat/completions` endpoint handling.
- All models that offer serverless deployments ("Serverless APIs") are compatible from the Azure model catalog. You can filter by "Serverless API" under Deployment options and "Chat completion" under inference tasks to see the full list; however, real time endpoint models have not been tested.
- These serverless inference endpoint/models may or may not support function calling according to OpenAI API specs, which enables their use with Agents.
-- If using legacy "/v1/completions" (without "chat"), you need to set the `forcePrompt` field to `true` in your [group config.](#group-level-configuration)

@@ -24,7 +24,6 @@ description: Example configuration for Anyscale
    titleModel: "meta-llama/Llama-2-7b-chat-hf"
    summarize: false
    summaryModel: "meta-llama/Llama-2-7b-chat-hf"
-    forcePrompt: false
    modelDisplayLabel: "Anyscale"
```

@@ -25,7 +25,6 @@ description: Example configuration for Fireworks
    titleModel: "accounts/fireworks/models/llama-v2-7b-chat"
    summarize: false
    summaryModel: "accounts/fireworks/models/llama-v2-7b-chat"
-    forcePrompt: false
    modelDisplayLabel: "Fireworks"
    dropParams: ["user"]
```

@@ -21,7 +21,6 @@ description: Example configuration for LiteLLM
    titleModel: "gpt-3.5-turbo"
    summarize: false
    summaryModel: "gpt-3.5-turbo"
-    forcePrompt: false
    modelDisplayLabel: "LiteLLM"
```

@@ -27,7 +27,6 @@ description: Example configuration for Apple MLX
    titleModel: "current_model"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
    modelDisplayLabel: "Apple MLX"
    addParams:
      max_tokens: 2000

@@ -0,0 +1,25 @@
+---
+title: Moonshot
+description: Example configuration for Moonshot AI (Kimi)
+---
+
+# [Moonshot](https://www.moonshot.ai/)
+
+> Moonshot API key: [platform.moonshot.ai](https://platform.moonshot.ai)
+
+**Notes:**
+
+- **Known:** icon provided.
+- **Important:** For models with reasoning/thinking capabilities (e.g., `kimi-k2.5`, `kimi-k2-thinking`), the endpoint `name` **must** be set to `"Moonshot"` (case-insensitive) for interleaved reasoning to work correctly with tool calls. Using a different name will result in errors like `thinking is enabled but reasoning_content is missing in assistant tool call message`. See [Moonshot's documentation](https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model#frequently-asked-questions) for more details.
+
+```yaml filename="librechat.yaml"
+    - name: "Moonshot"
+      apiKey: "${MOONSHOT_API_KEY}"
+      baseURL: "https://api.moonshot.ai/v1"
+      models:
+        default: ["kimi-k2.5"]
+        fetch: true
+      titleConvo: true
+      titleModel: "current_model"
+      modelDisplayLabel: "Moonshot"
+```
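
If you also want the thinking-capable model available by default, the same entry can be extended along these lines (a sketch; the extended default-model list is an assumption, not part of this commit):

```yaml filename="librechat.yaml"
    - name: "Moonshot"   # must remain "Moonshot" (case-insensitive) for reasoning models to work with tool calls
      apiKey: "${MOONSHOT_API_KEY}"
      baseURL: "https://api.moonshot.ai/v1"
      models:
        default: ["kimi-k2.5", "kimi-k2-thinking"]   # hypothetical: adds the thinking model alongside the default
        fetch: true
      titleConvo: true
      titleModel: "current_model"
      modelDisplayLabel: "Moonshot"
```
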
@@ -26,7 +26,6 @@ description: Example configuration for NeurochainAI
    titleModel: "current_model"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
    modelDisplayLabel: "NeurochainAI"
    iconURL: "https://raw.githubusercontent.com/LibreChat-AI/librechat-config-yaml/refs/heads/main/icons/NeurochainAI.png"
```

@@ -36,7 +36,6 @@ description: Example configuration for Ollama
    titleModel: "current_model"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
    modelDisplayLabel: "Ollama"
```

@@ -60,7 +59,6 @@ However, in case you experience the behavior where `llama3` does not stop genera
    titleModel: "current_model"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
    modelDisplayLabel: "Ollama"
    addParams:
      "stop": [

@@ -32,7 +32,6 @@ description: Example configuration for Perplexity
    titleModel: "llama-3-sonar-small-32k-chat"
    summarize: false
    summaryModel: "llama-3-sonar-small-32k-chat"
-    forcePrompt: false
    dropParams: ["stop", "frequency_penalty"]
    modelDisplayLabel: "Perplexity"
```

@@ -33,7 +33,6 @@ description: Example configuration for Portkey AI
    titleModel: "current_model"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
    modelDisplayLabel: "Portkey:OpenAI"
    iconURL: https://images.crunchbase.com/image/upload/c_pad,f_auto,q_auto:eco,dpr_1/rjqy7ghvjoiu4cd1xjbf
```

@@ -55,7 +54,6 @@ description: Example configuration for Portkey AI
    titleModel: "current_model"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
    modelDisplayLabel: "Portkey:Llama"
    iconURL: https://images.crunchbase.com/image/upload/c_pad,f_auto,q_auto:eco,dpr_1/rjqy7ghvjoiu4cd1xjbf
```

@@ -24,7 +24,6 @@ description: Example configuration for ShuttleAI
    titleModel: "shuttle-2.5-mini"
    summarize: false
    summaryModel: "shuttle-2.5-mini"
-    forcePrompt: false
    modelDisplayLabel: "ShuttleAI"
    dropParams: ["user", "stop"]
```

@@ -168,6 +168,5 @@ with open("models_togetherai.json", "w") as file:
    titleModel: "togethercomputer/llama-2-7b-chat"
    summarize: false
    summaryModel: "togethercomputer/llama-2-7b-chat"
-    forcePrompt: false
    modelDisplayLabel: "together.ai"
```

@@ -25,7 +25,6 @@ To use [TrueFoundry's AI Gateway](https://www.truefoundry.com/ai-gateway) follow
    titleModel: "current_model"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
    modelDisplayLabel: "TrueFoundry:OpenAI"
```
For more details you can check: [TrueFoundry Docs](https://docs.truefoundry.com/docs/introduction)

@@ -27,7 +27,6 @@ description: Example configuration for vLLM
    titleMessageRole: "user"
    summarize: false
    summaryModel: "current_model"
-    forcePrompt: false
```

The configuration above connects LibreChat to a local vLLM server running on port 8023. It uses the Gemma 3 27B model as the default model, but will fetch all available models from your vLLM server.

@@ -23,7 +23,6 @@ description: Example configuration for xAI
    titleModel: "grok-beta"
    summarize: false
    summaryModel: "grok-beta"
-    forcePrompt: false
    modelDisplayLabel: "Grok"
```

@@ -45,7 +45,6 @@ endpoints:
    titleModel: "meta-llama/Llama-2-7b-chat-hf"
    summarize: false
    summaryModel: "meta-llama/Llama-2-7b-chat-hf"
-    forcePrompt: false
    modelDisplayLabel: "Anyscale"

  # APIpie

@@ -99,7 +98,6 @@ endpoints:
    titleModel: "accounts/fireworks/models/llama-v2-7b-chat"
    summarize: false
    summaryModel: "accounts/fireworks/models/llama-v2-7b-chat"
-    forcePrompt: false
    modelDisplayLabel: "Fireworks"
    dropParams: ["user"]

@@ -148,7 +146,6 @@ endpoints:
    titleModel: "gpt-3.5-turbo"
    summarize: false
    summaryModel: "gpt-3.5-turbo"
-    forcePrompt: false
    modelDisplayLabel: "OpenRouter"

  # Perplexity

@@ -168,7 +165,6 @@ endpoints:
    titleModel: "sonar-medium-chat"
    summarize: false
    summaryModel: "sonar-medium-chat"
-    forcePrompt: false
    dropParams: ["stop", "frequency_penalty"]
    modelDisplayLabel: "Perplexity"

@@ -185,7 +181,6 @@ endpoints:
    titleModel: "gemini-pro"
    summarize: false
    summaryModel: "llama-summarize"
-    forcePrompt: false
    modelDisplayLabel: "ShuttleAI"
    dropParams: ["user"]

@@ -245,7 +240,6 @@ endpoints:
    titleModel: "togethercomputer/llama-2-7b-chat"
    summarize: false
    summaryModel: "togethercomputer/llama-2-7b-chat"
-    forcePrompt: false
    modelDisplayLabel: "together.ai"
```

@@ -379,9 +373,6 @@ endpoints:
    # Summary Model: Specify the model to use if summarization is enabled.
    # summaryModel: "mistral-tiny" # Defaults to "gpt-3.5-turbo" if omitted.
-
-    # Force Prompt setting: If true, sends a `prompt` parameter instead of `messages`.
-    # forcePrompt: false

    # The label displayed for the AI model in messages.
    modelDisplayLabel: 'Mistral' # Default is "AI" when not set.

@@ -228,22 +228,6 @@ addParams:
dropParams: ["stop", "user", "frequency_penalty", "presence_penalty"]
```

-### forcePrompt
-
-**Key:**
-<OptionTable
-options={[
-['forcePrompt', 'Boolean', 'If `true`, sends a `prompt` parameter instead of `messages`. This combines all messages into a single text payload, following OpenAI format, and uses the `/completions` endpoint of your baseURL rather than `/chat/completions`.', ''],
-]}
-/>
-
-**Default:** Not specified
-
-**Example:**
-```yaml filename="endpoints / azureOpenAI / groups / {group_item} / forcePrompt"
-forcePrompt: false
-```
-
### models

**Key:**

@@ -110,18 +110,25 @@ iconURL: https://github.com/danny-avila/LibreChat/raw/main/docs/assets/LibreChat
* If you want to use existing project icons, define the endpoint `name` as one of the main endpoints (case-sensitive):
  - "openAI" | "azureOpenAI" | "google" | "anthropic" | "assistants" | "gptPlugins"
* There are also "known endpoints" (case-insensitive), which have icons provided. If your endpoint `name` matches the following names, you should omit this field:
-  - "Mistral"
-  - "Deepseek"
-  - "OpenRouter"
-  - "groq"
-  - "APIpie"
  - "Anyscale"
+  - "APIpie"
  - "Cohere"
+  - "Deepseek"
  - "Fireworks"
-  - "Perplexity"
-  - "together.ai"
-  - "ollama"
-  - "xai"
+  - "groq"
+  - "Helicone"
+  - "Huggingface"
+  - "Mistral"
+  - "MLX"
+  - "Moonshot"
+  - "ollama"
+  - "OpenRouter"
+  - "Perplexity"
+  - "Qwen"
+  - "ShuttleAI"
+  - "together.ai"
+  - "Unify"
+  - "xai"
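
As a concrete illustration of the rule above (a sketch, not part of this commit): a custom endpoint whose `name` matches a known endpoint picks up the bundled icon, so `iconURL` can be omitted entirely:

```yaml filename="librechat.yaml"
custom:
  - name: "Fireworks"   # matches a known endpoint (case-insensitive), so no iconURL is needed
    apiKey: "${FIREWORKS_API_KEY}"
    baseURL: "https://api.fireworks.ai/inference/v1"
    models:
      default: ["accounts/fireworks/models/llama-v2-7b-chat"]
      fetch: true
```
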

## models

@@ -361,22 +368,6 @@ summarize: false
summaryModel: "mistral-tiny"
```

-## forcePrompt
-
-**Key:**
-<OptionTable
-options={[
-['forcePrompt', 'Boolean', 'If `true`, sends a `prompt` parameter instead of `messages`.', 'Combines all messages into a single text payload or "prompt", following OpenAI format, which uses the `/completions` endpoint of your baseURL rather than `/chat/completions`.'],
-]}
-/>
-
-**Default:** `false`
-
-**Example:**
-```yaml filename="endpoints / custom / forcePrompt"
-forcePrompt: false
-```
-
## modelDisplayLabel

**Key:**

@@ -53,7 +53,6 @@ services:
      titleModel: "current_model"
      summarize: false
      summaryModel: "current_model"
-      forcePrompt: false
      modelDisplayLabel: "OpenRouter"
    - name: "Ollama"
      apiKey: "ollama"
