---
title: Image Generation & Editing
icon: Image
description: Comprehensive guide to LibreChat's built-in image generation and editing tools
---

## Quick Start

Get image generation working in under 5 minutes with OpenAI Image Tools (recommended).

### Create an Agent

Go to the **Agents** panel in LibreChat and create a new agent. Give it a name like "Image Creator".

### Add OpenAI Image Tools

In the agent's **Tools** list, select **OpenAI Image Tools**. This adds both image generation and image editing capabilities.

### Set Your API Key

Add the following to your `.env` file:

```bash filename=".env"
IMAGE_GEN_OAI_API_KEY=sk-your-openai-api-key
```

### Restart and Test

| Deployment | Command |
|------------|---------|
| Docker | `docker compose down && docker compose up -d` |
| Local | Stop (Ctrl+C) then `npm run backend` |

Send a message like "Generate an image of a sunset over mountains" to your agent.

---

LibreChat comes with **built-in image tools** that you can add to an **[Agent](/docs/features/agents)**. Each has its own look, price point, and setup step (usually just an API key or URL).

| Tool | Best for | Needs |
|------|----------|-------|
| **OpenAI Image Tools** | Cutting-edge results (GPT-Image-1). Can also ***edit*** the images you upload. | OpenAI API |
| **Gemini Image Tools** | Google's latest image models with context-aware generation. | Gemini API or Vertex AI |
| **DALL·E (3 / 2)** | Legacy OpenAI image models. | OpenAI API |
| **Stable Diffusion** | Local or self-hosted generation, endless community models. | Automatic1111 API |
| **Flux** | Fast cloud renders, optional fine-tunes. | Flux API |
| **MCP** | Bring your own image generators | MCP server with image output support |

**Notes:**

- API keys can be omitted in favor of allowing the user to enter their own key from the UI.
- Image outputs are sent directly to the LLM as part of the immediate chat context following generation.
- The LLM only gets vision context from images attached to user messages, not from generations/edits, except immediately after generation.
- See [Image Storage and Handling](#image-storage-and-handling) for more details.
- MCP server tool image outputs are supported and may be handled similarly to LibreChat's built-in tools.
- Note: MCP servers may or may not use the correct format when outputting images. See details in the [MCP section below](#6--model-context-protocol-mcp).

---

## 1 · OpenAI Image Tools (recommended)

### Features

"OpenAI Image Tools" is an agent toolkit made up of two separate tools:

- **Image Generation**:
  - **Create** brand-new images from text prompts (no upload required).
- **Image Editing**:
  - **Edit** or **remix** the images you just uploaded—change colours, add objects, extend the canvas, etc.

Both use OpenAI's latest image generation model, **GPT-Image-1**, for superior instruction following, text rendering, detailed editing, and real-world knowledge. See OpenAI's [Image Generation documentation](https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1) for more details.

#### Generation vs. Editing

| Use-case | Invokes |
|----------|---------|
| "Start from scratch" | **Image Generation** |
| "Use existing image(s)" | **Image Editing** |

The agent decides which tool to use based on the context:

- **Image Generation** creates brand-new images from text descriptions only.
- **Image Editing** modifies or remixes existing images using their image IDs.
  - These can be images from the current message or previously generated/referenced images.
  - The LLM keeps track of image IDs as long as they remain in the context window.
  - The referenced image IDs are included in the tool output.
- Both tools are always available, but the LLM chooses the appropriate one based on the user's request.
- Both tools include the generated image ID in the tool output.

⚠️ **Important**

- Image editing relies on image IDs, which are retained in the chat history.
- When files are uploaded with the current request, their image IDs are added to the LLM's context before any tokens are generated.
- Previously referenced or generated image IDs can be used for editing, as long as they remain within the context window.
- The LLM can include any relevant image IDs in the `image_ids` array when calling the image editing tool.
- You can also attach previously uploaded images from the side panel without needing to upload them again.
  - This has the added benefit of providing a vision model with the image context, which can be useful for informing the `prompt` for the image editing tool.

### Parameters

#### Image Generation

• **prompt** – text description (required)
• **size** – `auto` (default), `1024x1024` (square), `1536x1024` (landscape), or `1024x1536` (portrait)
• **quality** – `auto` (default), `high`, `medium`, or `low`
• **background** – `auto` (default), `transparent`, or `opaque` (`transparent` requires PNG or WebP format)

#### Image Editing

• **image_ids** – array of image IDs to use as reference for editing (required)
• **prompt** – text description of the changes (required)
• **size** – `auto` (default), `1024x1024`, `1536x1024`, `1024x1536`, `256x256`, or `512x512`
• **quality** – `auto` (default), `high`, `medium`, or `low`

### Setup

Create or reuse an OpenAI key and add it to `.env`:

```bash
IMAGE_GEN_OAI_API_KEY=sk-...
# optional extras
IMAGE_GEN_OAI_BASEURL=https://...
```

For Azure OpenAI deployments, you will first need access: https://aka.ms/oai/gptimage1access

Then, add your corresponding credentials to your `.env` file:

```bash
IMAGE_GEN_OAI_API_KEY=your-api-key
# optional extras
IMAGE_GEN_OAI_BASEURL=https://deploymentname.openai.azure.com/openai/deployments/gpt-image-1/
IMAGE_GEN_OAI_AZURE_API_VERSION=2025-04-01-preview
```

Then add "OpenAI Image Tools" to your Agent's *Tools* list.

### Advanced Configuration

You can customize the tool descriptions and prompt guidance by setting these environment variables:

```bash
# Image Generation Tool Descriptions
IMAGE_GEN_OAI_DESCRIPTION=...
IMAGE_GEN_OAI_PROMPT_DESCRIPTION=...

# Image Editing Tool Descriptions
IMAGE_EDIT_OAI_DESCRIPTION=...
IMAGE_EDIT_OAI_PROMPT_DESCRIPTION=...
```
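For example, you might nudge the prompt guidance toward a consistent house style. The values below are purely illustrative — they are not LibreChat defaults, and the wording should be tailored to your own use case:

```bash
# Illustrative values only — adapt the guidance to your deployment
IMAGE_GEN_OAI_PROMPT_DESCRIPTION="Prefer flat, minimalist vector illustrations with a muted pastel palette unless the user asks otherwise."
IMAGE_EDIT_OAI_PROMPT_DESCRIPTION="Describe only the changes to make, preserving the original image's composition and style."
```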
### Pricing

See the [GPT-Image-1 pricing page](https://platform.openai.com/docs/models/gpt-image-1) and [Image Generation documentation](https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1#cost-and-latency) for details on costs associated with image generation.

---

## 2 · Gemini Image Tools

Gemini Image Tools integrate Google's latest image generation models, supporting both text-to-image generation and context-aware image editing.

### Features

- **Text-to-Image Generation**: Create high-quality images from detailed text descriptions
- **Image Context Support**: Use existing images as context or inspiration for new generations
- **Image Editing**: Generate new images based on modifications to existing ones (include the original image ID)
- **Multiple Models**: Choose between `gemini-2.5-flash-image` (default) or `gemini-3-pro-image-preview`
- **Dual API Support**: Works with both simple Gemini API keys and Google Cloud Vertex AI

### Parameters

• **prompt** – Detailed text description of the desired image (required, up to 32,000 characters)
• **image_ids** – Optional array of image IDs to use as visual context for generation

### Setup

#### Option 1: Gemini API (Recommended)

Get an API key from [Google AI Studio](https://aistudio.google.com/app/apikey):

```bash
GEMINI_API_KEY=your_api_key_here
```

#### Option 2: Vertex AI (Enterprise)

For Google Cloud users with Vertex AI access:

```bash
GOOGLE_SERVICE_KEY_FILE=/path/to/service-account.json
GOOGLE_CLOUD_LOCATION=us-central1  # optional, default: global
```

### Model Selection

```bash
# Default model (fast and efficient)
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image

# Higher quality model
GEMINI_IMAGE_MODEL=gemini-3-pro-image-preview
```

### Advanced Configuration

Customize tool descriptions via environment variables:

```bash
GEMINI_IMAGE_GEN_DESCRIPTION=...
GEMINI_IMAGE_GEN_PROMPT_DESCRIPTION=...
GEMINI_IMAGE_IDS_DESCRIPTION=...
```
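Putting the pieces together, a minimal `.env` for the Gemini API path (Option 1) might look like the sketch below — values are placeholders:

```bash
# Gemini API key from Google AI Studio (placeholder value)
GEMINI_API_KEY=your_api_key_here
# Optional: make the default model explicit, or swap in the preview model
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image
```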
More details can be found in the dedicated [Gemini Image Gen guide](/docs/configuration/tools/gemini_image_gen).

---

## 3 · DALL·E (legacy)

DALL·E provides high-quality image generation using OpenAI's legacy image models.

### Parameters

• **prompt** – Text description of the desired image (required, up to 4000 characters)
• **style** – `vivid` (hyper-real, dramatic; default) or `natural` (less hyper-real)
• **quality** – `standard` (default) or `hd`
• **size** – `1024x1024` (default, square), `1792x1024` (wide), or `1024x1792` (tall)

### Setup

```bash
# Required
DALLE_API_KEY=sk-...  # or DALLE3_API_KEY=sk-...

# Optional
DALLE_REVERSE_PROXY=https://...              # Alternative endpoint
DALLE3_BASEURL=https://...                   # For Azure or custom endpoints
DALLE3_AZURE_API_VERSION=2023-12-01-preview  # For Azure deployments
DALLE3_SYSTEM_PROMPT=...                     # Custom system prompt for DALL·E
```

### Advanced Configuration

For Azure OpenAI deployments, configure both the base URL and API version:

```bash
DALLE3_BASEURL=https://your-resource-name.openai.azure.com/openai/deployments/your-deployment-name
DALLE3_AZURE_API_VERSION=2023-12-01-preview
DALLE3_API_KEY=your-azure-api-key
```

Enable the **DALL·E** tool for the Agent and start prompting.

### Pricing

See the [DALL·E pricing page](https://platform.openai.com/docs/models/dall-e-3) and [Image Generation documentation](https://platform.openai.com/docs/guides/image-generation?image-generation-model=dall-e-3) for details on costs associated with image generation.

---

## 4 · Stable Diffusion (local)

Run images entirely on your own machine or server. Point LibreChat at any Automatic1111 (or compatible) endpoint and you're set.

### Parameters

• **prompt** – Detailed keywords describing desired elements in the image (required)
• **negative_prompt** – Keywords describing elements to exclude from the image (required)

The Stable Diffusion implementation uses these default parameters:

- `cfg_scale`: 4.5
- `steps`: 22
- `width`: 1024
- `height`: 1024

These values are currently fixed but provide good results for most use cases.

### Setup

```bash
SD_WEBUI_URL=http://127.0.0.1:7860  # URL to your Automatic1111 WebUI
```

No API key required—just a reachable URL.
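LibreChat talks to the WebUI over its REST API, so Automatic1111 generally needs to be started with the API enabled. A minimal sketch, assuming a standard Automatic1111 install:

```bash
# Linux/macOS: start the WebUI with the REST API enabled
./webui.sh --api

# Windows: add the flag in webui-user.bat instead
# set COMMANDLINE_ARGS=--api
```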
More details on setting up Automatic1111 can be found in the dedicated [Stable Diffusion guide](/docs/configuration/tools/stable_diffusion).

---

## 5 · Flux

Cloud generator with an emphasis on speed and optional fine-tuned models.

### Features

- Fast cloud-based image generation
- Support for fine-tuned models
- Multiple quality levels and aspect ratios
- Raw mode for less processed, more natural-looking images

### Parameters

The Flux tool supports three main actions:

1. **generate** – Create a new image from a text prompt
2. **generate_finetuned** – Create an image using a fine-tuned model
3. **list_finetunes** – List available custom models for the user

More details can be found in the dedicated [Flux guide](/docs/configuration/tools/flux#parameters).

### Setup

```bash
FLUX_API_KEY=flux_live_...
FLUX_API_BASE_URL=https://api.us1.bfl.ai  # default is fine for most users
```

Choose the **Flux** tool inside the Agent. Prompts are plain text; one call produces one image.

### Pricing

See the [Flux pricing page](https://docs.bfl.ml/pricing/) for details on costs associated with image generation.

---

## 6 · Model Context Protocol (MCP)

Image outputs are supported from MCP servers. For example, the [Puppeteer MCP Server](https://github.com/modelcontextprotocol/servers/tree/main/src/puppeteer) can be used to generate screenshots of web pages; it outputs images in the expected format and is treated the same as LibreChat's built-in image tools.

> The examples below assume LibreChat is running outside of Docker, directly using Node.js. The Model Context Protocol is a relatively new framework, and many developers are still learning how to properly serve their systems with uv/node for scalable distribution.

> As this technology is still emerging, there are currently few image-generating servers available, and many existing ones have yet to adopt the correct response format for images.

> While many MCP servers do function well within Docker, the following examples do not (or not without more advanced configuration), showcasing some of the current inconsistency between MCP servers.

```yaml
mcpServers:
  puppeteer:
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-puppeteer"
```

The following is an example of an [Image Generation server](https://github.com/GongRzhe/Image-Generation-MCP-Server) that outputs images using the [Replicate API](https://replicate.com/account/api-tokens), but returns URLs of the images, which doesn't conform to MCP's image response standard.

> Note: for this particular server, you need to install the `@gongrzhe/image-gen-server` package globally using npm, i.e. `npm install -g @gongrzhe/image-gen-server`, then point to the package's compiled files as shown below.

```yaml
mcpServers:
  image-gen:
    command: "node"
    # First, install the package globally using npm:
    # `npm install -g @gongrzhe/image-gen-server`
    # Then, point to the location of the installed package,
    # which you can find by running `npm root -g`
    args:
      - "{REPLACE_WITH_NODE_MODULES_LOCATION}/@gongrzhe/image-gen-server/build/index.js"
      # Example with output from `npm root -g`:
      # - "/home/danny/.nvm/versions/node/v20.19.0/lib/node_modules/@gongrzhe/image-gen-server/build/index.js"
    env:
      # Do not hardcode the API token here; use the environment variable instead.
      # The following will pick up the token from your .env file or environment:
      REPLICATE_API_TOKEN: "${REPLICATE_API_TOKEN}"
      MODEL: "google/imagen-3"
```

---

## Image Storage and Handling

All generated images are:

1. Saved according to the configured [**`fileStrategy`**](/docs/configuration/librechat_yaml/object_structure/config#filestrategy)
2. Displayed directly in the chat interface
3. Sent directly to the LLM as part of the immediate chat context following generation.
   - This may create issues if you are using an LLM that does not support image inputs.
   - There will be an option to disable this behavior on a per-agent basis in the future.
   - These outputs are only sent directly to the LLM upon generation, not on every message.
   - To include an image in the chat later, attach it to a message from the side panel.
   - To summarize: the LLM only gets vision context from images attached to user messages, not from generations/edits, except immediately after generation.

---

## Proxy Support

All image generation tools support proxy configuration through the `PROXY` environment variable:

```bash
PROXY=http://proxy-url:port
```

## Error Handling

If any of the tools encounter an error, they return an error message explaining what went wrong; a few quick diagnostic checks are sketched after this list. Common issues include:

- Invalid API key
- API unavailability
- Content policy violations
- Proxy/network issues
- Invalid parameters
- Unsupported image payload (see [Image Storage and Handling](#image-storage-and-handling) above)
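As a starting point for diagnosis, a couple of connectivity checks — adjust URLs and key names to match your own setup:

```bash
# Does the OpenAI key work? (a successful response lists available models)
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $IMAGE_GEN_OAI_API_KEY" | head

# Is a local Automatic1111 endpoint reachable with its API enabled?
curl -s http://127.0.0.1:7860/sdapi/v1/sd-models | head
```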
---

## Prompting

Though you can customize the prompts for [OpenAI Image Tools](#advanced-configuration) and [DALL·E](#advanced-configuration-1), the following tips inform the default prompts supplied by the tools, and are helpful to know for your own writing/prompting.

1. Start with the **subject** and **style** (photo, oil painting, etc.).
2. Add **composition** and **camera / medium** ("wide-angle shot of…", "watercolour…").
3. Mention **lighting & mood** ("golden hour", "dramatic shadows").
4. Finish with **detail keywords** (textures, colours, expressions).
5. Keep negatives positive—describe what should be included, not what to avoid.

Example:

> A cinematic photo of an antique library bathed in warm afternoon sunlight. Tall wooden shelves overflow with leather-bound books, and dust particles shimmer in the light. A single green-shaded banker's lamp illuminates an open atlas on a polished mahogany desk in the foreground. 85 mm lens, shallow depth of field, rich amber tones, ultra-high detail.

## Related Pages

- [Agents](/docs/features/agents) – Create and configure AI agents with custom tools
- [Model Context Protocol (MCP)](#6--model-context-protocol-mcp) – Bring your own tools via Model Context Protocol
- [Gemini Image Gen guide](/docs/configuration/tools/gemini_image_gen) – Detailed setup guide for Google Gemini image generation