---
title: LobeChat Supports Vision Recognition
description: >-
  Discover how LobeChat integrates visual recognition capabilities like
  OpenAI's gpt-4-vision and Google Gemini Pro Vision for intelligent
  conversations based on uploaded images.
tags:
- LobeChat
- Model Vision Recognition
- Multimodal Interaction
- Visual Elements
- Intelligent Conversations
---
# Model Vision Recognition
<Image alt={'Model Vision Recognition'} borderless cover src={'https://github.com/user-attachments/assets/18574a1f-46c2-4cbc-af2c-35a86e128a07'} />
LobeChat now supports large language models with visual recognition capabilities, such as OpenAI's [`gpt-4-vision`](https://platform.openai.com/docs/guides/vision), Google Gemini Pro Vision, and Zhipu GLM-4 Vision, giving LobeChat multimodal interaction capabilities. Users can upload or drag and drop images into the chat box, and the assistant will recognize their content and hold conversations based on them, enabling richer and more diverse chat scenarios.
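For readers curious what happens under the hood, the sketch below shows one common way an image is attached to a chat request: the OpenAI Chat Completions multimodal message format, where the user message carries a text part plus an image part encoded as a base64 data URL. This is a minimal illustration of the public API shape, not LobeChat's actual internal code; the function name and placeholder bytes are invented for the example.

```python
import base64


def build_vision_messages(prompt: str, image_bytes: bytes, mime: str = "image/png"):
    """Build a Chat Completions-style message list with an attached image.

    Sketch only: follows the publicly documented OpenAI vision payload
    shape (a "text" part plus an "image_url" part holding a base64 data
    URL). Other providers such as Gemini use different request formats.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime};base64,{encoded}"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]


# Placeholder bytes stand in for a real image file read from disk.
messages = build_vision_messages("What is in this image?", b"\x89PNG")
```

A client would then pass `messages` to a vision-capable model; the assistant's reply describes or reasons about the image content just as it would for plain text.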
This feature opens up new ways of interaction, allowing communication to extend beyond text and encompass rich visual elements. Whether it's sharing images in daily use or interpreting images in specific industries, the assistant can provide an excellent conversational experience.