📙 docs: Upload as Text (#381)

* 📚 docs: Add OCR, textParsing, and fileTokenLimit configuration documentation (#376)

  - Added support for `ocr` and `textParsing` configurations in `fileConfig`, allowing users to specify file types for OCR processing and direct text extraction.
  - Introduced `fileTokenLimit` parameter for all endpoints to manage maximum token limits for file processing.

* 📚 docs: Add STT configuration documentation (#380)

  - Added `stt` configuration to `fileConfig` for Speech-to-Text audio file processing, including supported MIME types.
  - Updated changelog to reflect the addition of STT alongside existing OCR and text parsing configurations.

* 📚 docs: finish fileTokenLimit documentation and update changelog (#382)

* refactor: change `textParsing` to `text`
2026-03-27 10:48:32 +07:00 · 2025-10-01 07:38:08 -07:00
parent 86476fe309
commit e2f771cea8
4 changed files with 150 additions and 2 deletions
--- a/components/changelog/content/config_v1.2.8.mdx
+++ b/components/changelog/content/config_v1.2.8.mdx
@@ -86,4 +86,20 @@
  - See [Interface Object Structure - fileSearch](/docs/configuration/librechat_yaml/object_structure/interface#filesearch) for details

 - Improved [Model Specs documentation](/docs/configuration/librechat_yaml/object_structure/model_specs) with parameter support updates:
-  - Added support for `disableStreaming`, `thinking`, `thinkingBudget`, `web_search`, and other parameters
+  - Added support for `disableStreaming`, `thinking`, `thinkingBudget`, `web_search`, and other parameters
+
+- Added OCR, text parsing, and STT separation to `fileConfig`:
+  - Added `ocr` configuration to control which file types use OCR processing
+  - Added `text` configuration to control which file types use direct text extraction
+  - Added `stt` configuration to control which audio file types use Speech-to-Text transcription
+  - Separate processing paths for visual documents (OCR), text files (native parsing), and audio files (STT)
+  - Processing precedence: OCR > STT > text parsing
+  - Default OCR support: images (JPEG, GIF, PNG, WebP, HEIC, HEIF), PDFs, Office documents, EPUB files
+  - Default text parsing support: all text MIME types and common programming languages
+  - Default STT support: audio formats (MP3, WAV, FLAC, OGG, M4A, WebM, etc.)
+  - See [File Config Object Structure](/docs/configuration/librechat_yaml/object_structure/file_config) for details
+
+- Added `fileTokenLimit` parameter support for all endpoints:
+  - Allows setting default and on-the-fly maximum token limits for file processing to control costs and resource usage
+  - Available as a URL query parameter and in endpoint configuration panels, or configurable in the `fileConfig` field of `librechat.yaml`
+  - Runtime behavior: text from attached files is truncated to this limit just before prompt construction (default: 100000)
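
The precedence rule (OCR > STT > text parsing) and the `fileTokenLimit` truncation described in the changelog entries above can be illustrated with a minimal sketch. This is not LibreChat's implementation: `routeFile`, `truncateToTokenLimit`, `RoutingConfig`, and the regex patterns in `exampleConfig` are hypothetical names chosen for illustration, and whitespace splitting stands in for a real tokenizer.

```typescript
// Hypothetical sketch: all names and patterns are illustrative assumptions,
// not taken from the LibreChat codebase.
type Route = "ocr" | "stt" | "text" | "none";

interface RoutingConfig {
  ocr: RegExp[];
  stt: RegExp[];
  text: RegExp[];
}

// Assumed patterns, loosely modeled on the default formats listed in the changelog.
const exampleConfig: RoutingConfig = {
  ocr: [/^image\/(jpeg|gif|png|webp|heic|heif)$/, /^application\/pdf$/],
  stt: [/^audio\/(mpeg|wav|flac|ogg|mp4|webm)$/],
  text: [/^text\//],
};

// Check categories in the documented order: OCR first, then STT, then text parsing.
function routeFile(mimeType: string, config: RoutingConfig): Route {
  const order: Array<Exclude<Route, "none">> = ["ocr", "stt", "text"];
  for (const route of order) {
    if (config[route].some((pattern) => pattern.test(mimeType))) {
      return route;
    }
  }
  return "none";
}

// Truncate extracted text to a token budget before it would reach the prompt.
// Whitespace-separated words approximate tokens here; a real tokenizer would differ.
function truncateToTokenLimit(text: string, fileTokenLimit = 100000): string {
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  if (tokens.length <= fileTokenLimit) {
    return text;
  }
  return tokens.slice(0, fileTokenLimit).join(" ");
}
```

Under these assumptions, a MIME type matched by more than one list is handled by the first category in the precedence order, so a type listed under both OCR and text parsing would go through OCR.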