---
title: Upload Files as Text
icon: Upload
description: Upload files to include their full content in conversations without requiring OCR configuration.
---

Upload as Text allows you to upload documents and have their full content included directly in your conversation with the AI. This feature works out of the box using text parsing methods, with optional OCR enhancement for improved extraction quality.

## Overview

- **No OCR required** - Uses text parsing with fallback methods by default
- **Enhanced by OCR** - If OCR is configured, extraction quality improves for images and scanned documents
- **Full document content** - Entire file content available to the model in the conversation
- **Works with all models** - No special tool capabilities needed
- **Token limit control** - Configurable via `fileTokenLimit` to manage context usage

## The `context` Capability

Upload as Text is controlled by the `context` capability in your LibreChat configuration.

```yaml
# librechat.yaml
endpoints:
  agents:
    capabilities:
      - "context" # Enables "Upload as Text"
```

**Default:** The `context` capability is included by default. You only need to explicitly add it if you've customized the capabilities list.

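The default rule above can be sketched in a few lines of Python. This is only an illustration, not LibreChat's implementation: `context_enabled` is a hypothetical helper, and it assumes the `librechat.yaml` file has already been parsed into a plain dictionary.

```python
from typing import Any, Mapping


def context_enabled(config: Mapping[str, Any]) -> bool:
    """Return True if "Upload as Text" should be available for this config.

    Hypothetical helper: `config` is assumed to be librechat.yaml parsed
    into a dictionary.
    """
    agents = (config.get("endpoints") or {}).get("agents") or {}
    capabilities = agents.get("capabilities")
    if capabilities is None:
        # No customized list: the defaults apply, and they include "context".
        return True
    # A customized list must name "context" explicitly.
    return "context" in capabilities


print(context_enabled({}))
print(context_enabled({"endpoints": {"agents": {"capabilities": ["file_search"]}}}))
```

The first call has no customized capabilities list, so the default applies; the second customizes the list without `context`, which is the situation where the option disappears from the menu.
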
## How It Works

When you upload a file using "Upload as Text":

1. LibreChat checks the file MIME type against `fileConfig` patterns
2. The processing method is determined by precedence (**OCR > STT > text parsing**):
   - If the file matches `fileConfig.ocr.supportedMimeTypes` AND OCR is configured: **Use OCR**
   - If the file matches `fileConfig.stt.supportedMimeTypes` AND STT is configured: **Use STT**
   - If the file matches `fileConfig.text.supportedMimeTypes`: **Use text parsing**
   - Otherwise: **Fall back to text parsing**
3. The extracted text is truncated to `fileConfig.fileTokenLimit` before prompt construction
4. The extracted text is included in the conversation context

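The precedence described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not LibreChat's actual code: `choose_method` and its boolean flags are hypothetical, the pattern lists are shortened versions of the ones shown on this page, and the patterns are assumed to be evaluated as regular expressions.

```python
import re
from typing import Any, Iterable, Mapping

# Shortened, illustrative pattern lists (assumption: not the full defaults).
FILE_CONFIG = {
    "ocr": {"supportedMimeTypes": [r"^image/(jpeg|gif|png|webp|heic|heif)$", r"^application/pdf$"]},
    "stt": {"supportedMimeTypes": [r"^audio/(mp3|mpeg|wav|ogg|flac|webm)$"]},
    "text": {"supportedMimeTypes": [r"^text/(plain|markdown|csv|json)$"]},
}


def matches(mime_type: str, patterns: Iterable[str]) -> bool:
    """True if the MIME type matches at least one pattern."""
    return any(re.search(pattern, mime_type) for pattern in patterns)


def choose_method(
    mime_type: str,
    file_config: Mapping[str, Any],
    ocr_configured: bool,
    stt_configured: bool,
) -> str:
    """Pick a processing method using the order OCR > STT > text parsing."""
    ocr_patterns = file_config.get("ocr", {}).get("supportedMimeTypes", [])
    stt_patterns = file_config.get("stt", {}).get("supportedMimeTypes", [])
    if ocr_configured and matches(mime_type, ocr_patterns):
        return "ocr"
    if stt_configured and matches(mime_type, stt_patterns):
        return "stt"
    # An explicit text match and the fallback both end in text parsing.
    return "text"


print(choose_method("application/pdf", FILE_CONFIG, ocr_configured=True, stt_configured=False))
print(choose_method("application/pdf", FILE_CONFIG, ocr_configured=False, stt_configured=False))
```

The same PDF is routed to OCR when an OCR service is configured and to text parsing when it is not, which is why the feature works without any external service.
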
### Text Processing Methods

**Text Parsing (Default):**
- Uses a robust parsing library (same as the RAG API)
- Handles PDFs, Word docs, text files, code files, and more
- No external service required
- Works immediately without configuration
- Fallback method if no other match

**OCR Enhancement (Optional):**
- Improves extraction from images, scanned documents, and complex PDFs
- Requires OCR service configuration
- Automatically used for files matching `fileConfig.ocr.supportedMimeTypes` when available
- See [OCR Configuration](/docs/features/ocr)

**STT Processing (Optional):**
- Converts audio files to text
- Requires STT service configuration
- See [Speech-to-Text Configuration](/docs/configuration/librechat_yaml/object_structure/file_config#stt)

## Usage

1. Click the attachment icon in the chat input
2. Select "Upload as Text" from the menu
3. Choose your file
4. File content is extracted and included in your message

**Note:** If you don't see "Upload as Text", ensure the `context` capability is enabled in your [`endpoints.agents.capabilities` configuration](/docs/configuration/librechat_yaml/object_structure/agents#capabilities).

## Configuration

### Basic Configuration

The `context` capability is enabled by default. No additional configuration is required for basic text parsing functionality.

### File Handling Configuration

Control text processing behavior with `fileConfig`:

```yaml
fileConfig:
  # Maximum tokens from text files before truncation
  fileTokenLimit: 100000

  # Files matching these patterns use OCR (if configured)
  ocr:
    supportedMimeTypes:
      - "^image/(jpeg|gif|png|webp|heic|heif)$"
      - "^application/pdf$"
      - "^application/vnd\\.openxmlformats-officedocument\\.(wordprocessingml\\.document|presentationml\\.presentation|spreadsheetml\\.sheet)$"
      - "^application/vnd\\.ms-(word|powerpoint|excel)$"
      - "^application/epub\\+zip$"

  # Files matching these patterns use text parsing
  text:
    supportedMimeTypes:
      - "^text/(plain|markdown|csv|json|xml|html|css|javascript|typescript|x-python|x-java|x-csharp|x-php|x-ruby|x-go|x-rust|x-kotlin|x-swift|x-scala|x-perl|x-lua|x-shell|x-sql|x-yaml|x-toml)$"

  # Files matching these patterns use STT (if configured)
  stt:
    supportedMimeTypes:
      - "^audio/(mp3|mpeg|mpeg3|wav|wave|x-wav|ogg|vorbis|mp4|x-m4a|flac|x-flac|webm)$"
```

**Processing Priority:** OCR > STT > text parsing > fallback

For more details, see [File Config Object Structure](/docs/configuration/librechat_yaml/object_structure/file_config).

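Before changing `supportedMimeTypes`, it can help to check a pattern against the MIME types you expect it to catch. The snippet below is a rough Python sketch that assumes the patterns are evaluated as ordinary regular expressions (they are written with `^` and `$` anchors); `first_match` is a hypothetical helper, not part of LibreChat. Note that `\\.` inside a double-quoted YAML string is the regex escape `\.`, so the Python raw strings use a single backslash.

```python
import re
from typing import Optional, Sequence

# Two OCR patterns from this page, written as Python raw strings.
OCR_PATTERNS = [
    r"^application/pdf$",
    r"^application/vnd\.openxmlformats-officedocument\.(wordprocessingml\.document|presentationml\.presentation|spreadsheetml\.sheet)$",
]


def first_match(mime_type: str, patterns: Sequence[str]) -> Optional[str]:
    """Return the first pattern that matches the MIME type, or None."""
    for pattern in patterns:
        if re.search(pattern, mime_type):
            return pattern
    return None


DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

print(first_match(DOCX, OCR_PATTERNS) is not None)   # .docx is covered
print(first_match("text/plain", OCR_PATTERNS))       # not an OCR type
```

Because the patterns are anchored, a near miss such as a MIME type with extra trailing characters does not match, so a custom pattern should be checked against the exact MIME strings your files report.
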
### Optional: Configure OCR for Enhanced Extraction

OCR is **not required** but enhances extraction quality when configured:

```yaml
# librechat.yaml
ocr:
  strategy: "mistral_ocr"
  apiKey: "${OCR_API_KEY}"
  baseURL: "https://api.mistral.ai/v1"
  mistralModel: "mistral-ocr-latest"
```

See [OCR Configuration](/docs/features/ocr) for full details.

## When to Use Each Upload Option

LibreChat offers three different ways to upload files, each suited for different use cases:

### Use "Upload as Text" when:
- ✅ You want the AI to read the complete document content
- ✅ Working with smaller files that fit in context
- ✅ You need "chat with files" functionality
- ✅ Using models without tool capabilities
- ✅ You want direct content access without semantic search

### Use "Upload for File Search" when:
- ✅ Working with large documents or multiple files
- ✅ You want to optimize token usage
- ✅ You need semantic search for relevant sections
- ✅ Building knowledge bases
- ✅ The `file_search` capability is enabled and toggled ON

### Use standard "Upload Files" when:
- ✅ Using vision models to analyze images
- ✅ Using code interpreter to execute code
- ✅ Files don't need text extraction

## Supported File Types

### Text Files (text parsing)
- Plain text, Markdown, CSV, JSON, XML, HTML
- Programming languages (Python, JavaScript, Java, C++, etc.)
- Configuration files (YAML, TOML, INI, etc.)
- Shell scripts, SQL files

### Documents (text parsing or OCR)
- PDF documents
- Word documents (.docx, .doc)
- PowerPoint presentations (.pptx, .ppt)
- Excel spreadsheets (.xlsx, .xls)
- EPUB books

### Images (OCR if configured)
- JPEG, PNG, GIF, WebP
- HEIC, HEIF (Apple formats)
- Screenshots, photos of documents, scanned images

### Audio (STT if configured)
- MP3, WAV, OGG, FLAC
- M4A, WebM
- Voice recordings, podcasts

## File Processing Priority

LibreChat processes files based on MIME type matching with the following **priority order**:

1. **OCR** - If file matches `ocr.supportedMimeTypes` AND OCR is configured
2. **STT** - If file matches `stt.supportedMimeTypes` AND STT is configured
3. **Text Parsing** - If file matches `text.supportedMimeTypes`
4. **Fallback** - Text parsing as last resort

### Processing Examples

**PDF file with OCR configured:**
- Matches `ocr.supportedMimeTypes`
- **Uses OCR** to extract text
- Better quality for scanned PDFs

**PDF file without OCR configured:**
- OCR is unavailable, so the file falls through to text parsing (fallback)
- **Uses text parsing** library
- Works well for digital PDFs

**Python file:**
- Matches `text.supportedMimeTypes`
- **Uses text parsing** (no OCR needed)
- Direct text extraction

**Audio file with STT configured:**
- Matches `stt.supportedMimeTypes`
- **Uses STT** to transcribe

## Token Limits

Files are truncated to `fileTokenLimit` tokens to manage context window usage:

```yaml
fileConfig:
  fileTokenLimit: 100000 # Default: 100,000 tokens
```

- Truncation happens at runtime before prompt construction
- Helps prevent exceeding model context limits
- Configurable based on your needs and model capabilities
- Larger limits allow more content but use more tokens

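To make the truncation step concrete, here is a small Python sketch. It is an illustration only: `truncate_to_limit` is a hypothetical helper, and splitting on whitespace is a stand-in for a real model tokenizer, so the counts will differ from what LibreChat actually measures.

```python
from typing import Callable, List


def truncate_to_limit(
    text: str,
    limit: int,
    tokenize: Callable[[str], List[str]] = str.split,
    detokenize: Callable[[List[str]], str] = " ".join,
) -> str:
    """Keep at most `limit` tokens of `text`.

    Assumption: whitespace splitting stands in for the real tokenizer.
    """
    tokens = tokenize(text)
    if len(tokens) <= limit:
        # Under the limit: the text is passed through unchanged.
        return text
    # Over the limit: everything after the first `limit` tokens is dropped.
    return detokenize(tokens[:limit])


print(truncate_to_limit("one two three four five", limit=3))
```

Content past the limit is dropped rather than summarized, which is why a very long file can appear cut off in the conversation.
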
## Comparison with Other File Features

| Feature | Capability | Requires Service | Persistence | Best For |
|---------|-----------|------------------|-------------|----------|
| **Upload as Text** | `context` | No (enhanced by OCR) | Single conversation | Temporary document questions |
| **Agent File Context** | `context` | No (enhanced by OCR) | Agent system instructions | Specialized agent knowledge |
| **File Search** | `file_search` | Yes (vector DB) | Stored in vector store | Large documents, semantic search |

### Upload as Text vs Agent File Context

**Upload as Text (`context`):**
- Available in any chat conversation
- Content included in current conversation only
- No OCR service required (text parsing by default)
- Best for one-off document questions

**Agent File Context (`context`):**
- Only available in Agent Builder
- Content stored in agent's system instructions
- No OCR service required (text parsing by default)
- Best for creating specialized agents with persistent knowledge
- See [OCR for Documents](/docs/features/ocr)

### Upload as Text vs File Search

**Upload as Text (`context`):**
- Full document content in conversation context
- Direct access to all text
- Token usage: entire file (up to limit)
- Works without RAG API configuration

**File Search (`file_search`):**
- Semantic search over documents
- Returns relevant chunks via tool use
- Token usage: only relevant sections
- Requires RAG API and vector store configuration
- See [RAG API](/docs/features/rag_api)

## Example Use Cases

- **Document Analysis**: Upload contracts, reports, or articles for analysis
- **Code Review**: Upload source files for review and suggestions
- **Data Extraction**: Extract information from structured documents
- **Translation**: Translate document contents
- **Summarization**: Summarize articles, papers, or reports
- **Research**: Discuss academic papers or technical documentation
- **Troubleshooting**: Share log files for analysis
- **Content Editing**: Review and edit written content
- **Data Processing**: Work with CSV or JSON data files

## Troubleshooting

### "Upload as Text" option not appearing

**Solution:** Ensure the `context` capability is enabled:

```yaml
endpoints:
  agents:
    capabilities:
      - "context" # Add this if missing
```

### File content not extracted properly

**Solutions:**
1. Check if file type is supported (matches `fileConfig` patterns)
2. For images/scanned documents: Configure OCR for better extraction
3. For audio files: Configure STT service
4. Verify file is not corrupted

### Content seems truncated

**Solution:** Increase the token limit:

```yaml
fileConfig:
  fileTokenLimit: 150000 # Increase as needed
```

### Poor extraction quality from images

**Solution:** Configure OCR to enhance extraction:

```yaml
ocr:
  strategy: "mistral_ocr"
  apiKey: "${OCR_API_KEY}"
```

See [OCR Configuration](/docs/configuration/librechat_yaml/object_structure/ocr) for details.

## Related Features

- [File Context](/docs/features/agents#file-context) - Files used as Agent Context
- [OCR for Documents](/docs/features/ocr) - Learn about and configure OCR services
- [File Configuration](/docs/configuration/librechat_yaml/object_structure/file_config) - Configure file handling

---

Upload as Text provides a simple, powerful way to work with documents in LibreChat without requiring complex configuration or external services.