mirror of
https://github.com/langgenius/dify-docs.git
synced 2026-03-26 13:18:34 +07:00
* updated docs and migration script * update the dify version, add the migration script link, and improve formatting --------- Co-authored-by: DhruvGorasiya <Dhruv.Gorasiya@student.csulb.edu> Co-authored-by: kurokobo <2920259+kurokobo@users.noreply.github.com> Co-authored-by: Riskey <riskey47@dify.ai>
898 lines
25 KiB
Plaintext
---
title: Weaviate Migration Guide upgrading to Client v4 and Server 1.27+
---

> This guide explains how to migrate from Weaviate client v3 to v4.17.0 and upgrade your Weaviate server from version 1.19.0 to 1.27.0 or higher. This migration is required for Dify versions that include the weaviate-client v4 upgrade.
|
||
|
||
## Overview
|
||
|
||
Starting with **Dify v1.9.2**, the weaviate-client has been upgraded from v3 to v4.17.0. This upgrade brings significant performance improvements and better stability, but requires **Weaviate server version 1.27.0 or higher**.
|
||
|
||
<Warning>
|
||
**BREAKING CHANGE**: The new weaviate-client v4 is NOT backward compatible with Weaviate server versions below 1.27.0. If you are running a self-hosted Weaviate instance on version 1.19.0 or older, you must upgrade your Weaviate server before upgrading Dify.
|
||
</Warning>
|
||
|
||
### Who Is Affected?
|
||
|
||
This migration affects:
|
||
|
||
- Self-hosted Dify users running their own Weaviate instances on versions below 1.27.0
|
||
- Users currently on Weaviate server version 1.19.0-1.26.x
|
||
- Users upgrading to Dify versions with weaviate-client v4
|
||
|
||
**Not affected**:

- Cloud-hosted Weaviate users (Weaviate Cloud manages the server version)
- Users already on Weaviate 1.27.0+ (they can upgrade Dify without additional steps)
- Users running Dify's default Docker Compose setup (the Weaviate version is updated automatically)
|
||
|
||
## Breaking Changes
|
||
|
||
### Client v4 Requirements
|
||
|
||
The weaviate-client v4 introduces several breaking changes:
|
||
|
||
1. **Minimum Server Version**: Requires Weaviate server 1.27.0 or higher
|
||
2. **API Changes**: New import structure (`weaviate.classes` instead of `weaviate.client`)
|
||
3. **gRPC Support**: Uses gRPC by default on port 50051 for improved performance
|
||
4. **Authentication Changes**: Updated authentication methods and configuration
|
||
|
||
### Why Upgrade?
|
||
|
||
- **Performance**: Significantly faster query and import operations via gRPC (50051)
|
||
- **Stability**: Better connection handling and error recovery
|
||
- **Future Compatibility**: Access to latest Weaviate features and ongoing support
|
||
- **Security**: Weaviate 1.19.0 is over a year old and no longer receives security updates
|
||
|
||
## Version Compatibility Matrix
|
||
|
||
| Dify Version | Weaviate-client Version | Compatible Weaviate Server Versions |
| ------------ | ----------------------- | ----------------------------------- |
| ≤ 1.9.1      | v3.x                    | 1.19.0 - 1.26.x                     |
| ≥ 1.9.2      | v4.17.0                 | 1.27.0+ (tested up to 1.33.1)       |
|
||
|
||
<Info>
|
||
This migration applies to any Dify version using weaviate-client v4.17.0 or higher.
|
||
</Info>
|
||
|
||
<Info>
|
||
Weaviate server version 1.19.0 was released over a year ago and is now outdated. Upgrading to 1.27.0+ provides access to numerous improvements in performance, stability, and features.
|
||
</Info>
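
The compatibility rule in the matrix can be checked mechanically. The sketch below is illustrative only: it assumes the server version is read from the `version` field of the `/v1/meta` response, and the helper names `parse_version` and `server_is_compatible` are hypothetical, not part of Dify or the Weaviate client. Versions are compared as integer tuples because a plain string comparison would rank `1.9` above `1.27`.

```python
import json


def parse_version(version: str) -> tuple:
    """Turn '1.27.0' (or 'v1.27.0', '1.27.0-rc.1') into a comparable tuple."""
    core = version.strip().lstrip("v").split("-")[0].split("+")[0]
    parts = [int(p) for p in core.split(".")[:3]]
    while len(parts) < 3:
        parts.append(0)  # treat '1.27' as '1.27.0'
    return (parts[0], parts[1], parts[2])


def server_is_compatible(meta_json: str, minimum: str = "1.27.0") -> bool:
    """Check the `version` field of a /v1/meta response against the minimum."""
    version = json.loads(meta_json)["version"]
    return parse_version(version) >= parse_version(minimum)


# Hypothetical /v1/meta payloads, trimmed to the one field used here.
print(server_is_compatible('{"version": "1.19.0"}'))
print(server_is_compatible('{"version": "1.33.1"}'))
```
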
|
||
|
||
## Prerequisites
|
||
|
||
Before starting the migration, complete these steps:
|
||
|
||
1. **Check Your Current Weaviate Version**
|
||
|
||
```bash
|
||
curl http://localhost:8080/v1/meta
|
||
```
|
||
|
||
Look for the `version` field in the response.
|
||
|
||
2. **Backup Your Data**
|
||
|
||
- Create a complete backup of your Weaviate data
|
||
- Backup your Docker volumes if using Docker Compose
|
||
- Document your current configuration settings
|
||
|
||
3. **Review System Requirements**
|
||
|
||
- Ensure sufficient disk space for database migration
|
||
- Verify network connectivity between Dify and Weaviate
|
||
- Confirm gRPC port (50051) is accessible if using external Weaviate
|
||
|
||
4. **Plan Downtime**
|
||
- The migration will require service downtime
|
||
- Notify users if running in production
|
||
- Schedule migration during low-traffic periods
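
For the disk-space item above, a rough check might look like the following. This is a sketch under stated assumptions: the `headroom` factor of 2.0 is an arbitrary illustration (the guide does not state how much space the migration needs), and both function names are hypothetical.

```python
import os
import shutil
import tempfile


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def has_room_for_copy(data_dir: str, headroom: float = 2.0) -> bool:
    """True if free space on the volume is at least `headroom` x the data size.

    The default factor is an assumption for illustration, not a Weaviate figure.
    """
    needed = directory_size_bytes(data_dir) * headroom
    free = shutil.disk_usage(data_dir).free
    return free >= needed


# Demo on a throwaway directory; in practice pass the Weaviate data path,
# e.g. docker/volumes/weaviate from the Docker Compose layout in this guide.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "segment.db"), "wb") as handle:
        handle.write(b"\0" * 1024)
    print(directory_size_bytes(demo_dir), has_room_for_copy(demo_dir))
```
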
|
||
|
||
## Migration Paths
|
||
|
||
Choose the migration path that matches your deployment setup and current Weaviate version.
|
||
|
||
### Choose Your Path
|
||
|
||
- **Path A – Migration with Backup (from 1.19)**: Recommended if you are still on Weaviate 1.19. You will create a backup, upgrade to 1.27+, repair any orphaned data, and then migrate the schema.
|
||
- **Path B – Direct Recovery (already on 1.27+)**: Use this if you already upgraded to 1.27+ and your knowledge bases stopped working. This path focuses on repairing the data layout and running the schema migration.
|
||
|
||
<Warning>
|
||
Do **not** attempt to downgrade back to 1.19. The schema format is incompatible and will lead to data loss.
|
||
</Warning>
|
||
|
||
### Path A: Migration with Backup (From 1.19)
|
||
|
||
<Info>
|
||
Safest path. Creates a backup before upgrading so you can restore if anything goes wrong.
|
||
</Info>
|
||
|
||
#### Prerequisites
|
||
|
||
- Currently running Weaviate 1.19
|
||
- Docker + Docker Compose installed
|
||
- Python 3.11+ available for the [schema migration script](https://github.com/langgenius/dify-docs/blob/main/assets/migrate_weaviate_collections.py)
|
||
|
||
#### Step A1: Enable the Backup Module on Weaviate 1.19
|
||
|
||
Edit `docker/docker-compose.yaml` so the `weaviate` service includes backup configuration:
|
||
|
||
```yaml
weaviate:
  image: semitechnologies/weaviate:1.19.0
  volumes:
    - ./volumes/weaviate:/var/lib/weaviate
    - ./volumes/weaviate_backups:/var/lib/weaviate/backups
  ports:
    - "8080:8080"
    - "50051:50051"
  environment:
    ENABLE_MODULES: backup-filesystem
    BACKUP_FILESYSTEM_PATH: /var/lib/weaviate/backups
    # ... rest of your environment variables
```
|
||
|
||
Restart Weaviate to apply the change:
|
||
|
||
```bash
cd docker
docker compose down
docker compose up -d
sleep 10
```
|
||
|
||
#### Step A2: Create a Backup
|
||
|
||
1. **List your collections**:
|
||
|
||
```bash
curl -s -H "Authorization: Bearer <WEAVIATE_API_KEY>" \
  "http://localhost:8080/v1/schema" | \
  python3 -c "
import json, sys
data = json.load(sys.stdin)
# Single quotes only: double quotes would end the shell string early.
print('Collections:')
for cls in data.get('classes', []):
    print(' -', cls['class'])
"
```
|
||
|
||
2. **Trigger the backup**: include specific collection names if you prefer.
|
||
|
||
```bash
curl -X POST \
  -H "Authorization: Bearer <WEAVIATE_API_KEY>" \
  -H "Content-Type: application/json" \
  "http://localhost:8080/v1/backups/filesystem" \
  -d '{
    "id": "kb-backup",
    "include": ["Vector_index_COLLECTION1_Node", "Vector_index_COLLECTION2_Node"]
  }'
```
|
||
|
||
3. **Check backup status**:
|
||
|
||
```bash
sleep 5
curl -s -H "Authorization: Bearer <WEAVIATE_API_KEY>" \
  "http://localhost:8080/v1/backups/filesystem/kb-backup" | \
  python3 -m json.tool | grep status
```
|
||
|
||
4. **Verify backup files exist**:
|
||
|
||
```bash
|
||
ls -lh docker/volumes/weaviate_backups/kb-backup/
|
||
```
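
The request body and the status check from the steps above can also be handled in Python, which helps when scripting the backup instead of polling by hand. This is a minimal sketch: the helper names are hypothetical, and the status strings (`SUCCESS`, `FAILED`, with anything else treated as still running) are an assumption based on the Weaviate backup API, so confirm them against the response your server returns.

```python
import json
from typing import List, Optional


def build_backup_request(backup_id: str, collections: Optional[List[str]] = None) -> str:
    """Build the JSON body for POST /v1/backups/filesystem.

    Omitting `collections` leaves out `include`, which backs up everything.
    """
    body = {"id": backup_id}
    if collections:
        body["include"] = collections
    return json.dumps(body)


def backup_state(status_response: str) -> str:
    """Classify a backup status response as 'done', 'failed', or 'in-progress'."""
    status = json.loads(status_response).get("status", "")
    if status == "SUCCESS":
        return "done"
    if status == "FAILED":
        return "failed"
    return "in-progress"


print(build_backup_request("kb-backup", ["Vector_index_COLLECTION1_Node"]))
print(backup_state('{"id": "kb-backup", "status": "TRANSFERRING"}'))
```

Only proceed to the upgrade once the state is `done` and the backup files exist on disk.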
|
||
|
||
#### Step A3: Upgrade to Weaviate 1.27+
|
||
|
||
1. **Upgrade Dify to a version that ships Weaviate 1.27+**:
|
||
|
||
```bash
|
||
cd /path/to/dify
|
||
git fetch origin
|
||
git checkout main # or a tagged release that includes the upgrade
|
||
```
|
||
|
||
2. **Confirm the new Weaviate image**:
|
||
|
||
```bash
|
||
grep "image: semitechnologies/weaviate" docker/docker-compose.yaml
|
||
```
|
||
|
||
3. **Restart with the new version**:
|
||
|
||
```bash
|
||
cd docker
|
||
docker compose down
|
||
docker compose up -d
|
||
sleep 20
|
||
```
|
||
|
||
#### Step A4: Fix Orphaned LSM Data (if present)
|
||
|
||
You can fix orphaned LSM data either from the host or inside the container:
|
||
|
||
**Option A: From host (if volumes are mounted)**:
|
||
|
||
```bash
cd docker/volumes/weaviate

for dir in vector_index_*_node_*_lsm; do
  [ -d "$dir" ] || continue

  index_id=$(echo "$dir" | sed -n 's/vector_index_\([^_]*_[^_]*_[^_]*_[^_]*_[^_]*\)_node_.*/\1/p')
  shard_id=$(echo "$dir" | sed -n 's/.*_node_\([^_]*\)_lsm/\1/p')

  mkdir -p "vector_index_${index_id}_node/$shard_id/lsm"
  cp -a "$dir/"* "vector_index_${index_id}_node/$shard_id/lsm/"

  echo "✓ Copied $dir"
done

cd ../../
docker compose restart weaviate
sleep 15
```
|
||
|
||
**Option B: Inside Weaviate container (recommended)**:
|
||
|
||
```bash
cd /path/to/dify/docker
docker compose exec -it weaviate /bin/sh

# Inside container
cd /var/lib/weaviate
for dir in vector_index_*_node_*_lsm; do
  [ -d "$dir" ] || continue

  index_id=$(echo "$dir" | sed -n 's/vector_index_\([^_]*_[^_]*_[^_]*_[^_]*_[^_]*\)_node_.*/\1/p')
  shard_id=$(echo "$dir" | sed -n 's/.*_node_\([^_]*\)_lsm/\1/p')

  mkdir -p "vector_index_${index_id}_node/$shard_id/lsm"
  cp -a "$dir/"* "vector_index_${index_id}_node/$shard_id/lsm/"

  echo "✓ Copied $dir"
done
exit

# Restart Weaviate
docker compose restart weaviate
sleep 15
```
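
To preview what the shell loops above will do before copying anything, the directory-name mapping can be reproduced in Python. This is a sketch that mirrors the two `sed` expressions: it assumes the index ID is exactly five underscore-separated segments, as those patterns require, and the function name `lsm_target` and the sample directory name are hypothetical.

```python
import re
from typing import Optional

# Same shape the sed patterns expect:
#   vector_index_<5 underscore-separated segments>_node_<shard>_lsm
_ORPHAN = re.compile(r"^vector_index_([^_]*(?:_[^_]*){4})_node_([^_]*)_lsm$")


def lsm_target(dir_name: str) -> Optional[str]:
    """Return the nested path an orphaned LSM directory is copied into.

    Returns None when the name does not match the orphaned-directory pattern.
    """
    match = _ORPHAN.match(dir_name)
    if match is None:
        return None
    index_id, shard_id = match.groups()
    return f"vector_index_{index_id}_node/{shard_id}/lsm"


# Made-up directory name, for illustration only.
print(lsm_target("vector_index_a1b2_c3d4_e5f6_a7b8_c9d0_node_XyZ123_lsm"))
```

Running this over `os.listdir()` of the Weaviate data directory gives a dry-run list of source and destination pairs.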
|
||
|
||
#### Step A5: Migrate the Schema
|
||
|
||
1. **Install dependencies** (a temporary virtualenv is fine):
|
||
|
||
```bash
|
||
cd /path/to/dify
|
||
python3 -m venv weaviate_migration_env
|
||
source weaviate_migration_env/bin/activate
|
||
pip install weaviate-client requests
|
||
```
|
||
|
||
2. **Run the [migration script](https://github.com/langgenius/dify-docs/blob/main/assets/migrate_weaviate_collections.py)** either locally or inside the Worker container.\
|
||
**Option A: Run locally (if you have Python 3.11+ and dependencies installed)**:
|
||
|
||
```bash
|
||
python3 migrate_weaviate_collections.py
|
||
```
|
||
|
||
**Option B: Run inside Worker container (recommended for Docker setups)**:
|
||
|
||
```bash
|
||
# Copy script to storage directory
|
||
cp migrate_weaviate_collections.py /path/to/dify/docker/volumes/app/storage/
|
||
|
||
# Enter worker container
|
||
cd /path/to/dify/docker
|
||
docker compose exec -it worker /bin/bash
|
||
|
||
# Run migration script (use --no-cache for Dify 1.11.0+)
|
||
uv run --no-cache /app/api/storage/migrate_weaviate_collections.py
|
||
|
||
# Exit container
|
||
exit
|
||
```
|
||
|
||
<Info>
|
||
The migration script uses environment variables for configuration, making it suitable for running inside Docker containers. For Dify 1.11.0+, if you encounter permission errors with `uv`, use `uv run --no-cache` instead.
|
||
</Info>
|
||
|
||
3. **Restart Dify services**:
|
||
|
||
```bash
|
||
cd docker
|
||
docker compose restart api worker worker_beat
|
||
sleep 15
|
||
```
|
||
|
||
4. **Verify in the UI**: open Dify, test retrieval against your migrated knowledge bases.
|
||
|
||
<Warning>
|
||
For large collections (over 10,000 objects), verify that the object count matches between old and new collections. The migration script will display verification counts automatically.
|
||
</Warning>
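
If you record object counts before and after the migration, the comparison can be automated. The sketch below is illustrative: it assumes you have collected the counts yourself into two dictionaries keyed by whatever name identifies the same knowledge base on both sides, and `find_count_mismatches` is a hypothetical helper, not part of the migration script.

```python
from typing import Dict, Tuple


def find_count_mismatches(
    before: Dict[str, int], after: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Return {name: (expected, actual)} for every collection whose count changed.

    A collection missing from `after` is reported with an actual count of 0.
    """
    mismatches = {}
    for name, expected in before.items():
        actual = after.get(name, 0)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


# Made-up counts for illustration.
before = {"Vector_index_COLLECTION1_Node": 12000, "Vector_index_COLLECTION2_Node": 350}
after = {"Vector_index_COLLECTION1_Node": 12000, "Vector_index_COLLECTION2_Node": 349}
print(find_count_mismatches(before, after))
```

An empty result means every recorded collection kept its object count.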
|
||
|
||
<Info>
|
||
After confirming a healthy migration, you can delete `weaviate_migration_env` and the backup files to reclaim disk space.
|
||
</Info>
|
||
|
||
### Path B: Direct Recovery (Already on 1.27+)
|
||
|
||
<Warning>
|
||
Only use this path if you already upgraded to 1.27+ and your knowledge bases stopped working. You cannot create a 1.19 backup anymore, so you must repair the data in place.
|
||
</Warning>
|
||
|
||
#### Prerequisites
|
||
|
||
- Currently running Weaviate 1.27+ (including 1.33)
|
||
- Docker + Docker Compose installed
|
||
- Python 3.11+ for the [migration script](https://github.com/langgenius/dify-docs/blob/main/assets/migrate_weaviate_collections.py)
|
||
|
||
#### Step B1: Repair Orphaned LSM Data
|
||
|
||
Stop Weaviate and fix orphaned LSM data:
|
||
|
||
```bash
cd /path/to/dify/docker
docker compose stop weaviate

# Option A: From host (if volumes are mounted)
cd volumes/weaviate

for dir in vector_index_*_node_*_lsm; do
  [ -d "$dir" ] || continue

  index_id=$(echo "$dir" | sed -n 's/vector_index_\([^_]*_[^_]*_[^_]*_[^_]*_[^_]*\)_node_.*/\1/p')
  shard_id=$(echo "$dir" | sed -n 's/.*_node_\([^_]*\)_lsm/\1/p')

  mkdir -p "vector_index_${index_id}_node/$shard_id/lsm"
  cp -a "$dir/"* "vector_index_${index_id}_node/$shard_id/lsm/"

  echo "✓ Copied $dir"
done

# Option B: Inside container (recommended)
docker compose run --rm --entrypoint /bin/sh weaviate -c "
cd /var/lib/weaviate
for dir in vector_index_*_node_*_lsm; do
  [ -d \"\$dir\" ] || continue
  index_id=\$(echo \"\$dir\" | sed -n 's/vector_index_\([^_]*_[^_]*_[^_]*_[^_]*_[^_]*\)_node_.*/\1/p')
  shard_id=\$(echo \"\$dir\" | sed -n 's/.*_node_\([^_]*\)_lsm/\1/p')
  mkdir -p \"vector_index_\${index_id}_node/\$shard_id/lsm\"
  cp -a \"\$dir/\"* \"vector_index_\${index_id}_node/\$shard_id/lsm/\"
  echo \"✓ Copied \$dir\"
done
"
```
|
||
|
||
Restart Weaviate:
|
||
|
||
```bash
|
||
docker compose start weaviate
|
||
sleep 15
|
||
```
|
||
|
||
List collections and confirm object counts are non-zero:
|
||
|
||
```bash
curl -s -H "Authorization: Bearer <WEAVIATE_API_KEY>" \
  "http://localhost:8080/v1/schema" | python3 -c "
import sys, json
for cls in json.load(sys.stdin).get('classes', []):
    if cls['class'].startswith('Vector_index_'):
        print(cls['class'])
"

curl -s -H "Authorization: Bearer <WEAVIATE_API_KEY>" \
  "http://localhost:8080/v1/objects?class=YOUR_COLLECTION_NAME&limit=0" | \
  python3 -c "import sys, json; print(json.load(sys.stdin).get('totalResults', 0))"
```
|
||
|
||
#### Step B2: Run the Schema Migration
|
||
|
||
Follow the same commands as [Step A5](#step-a5%3A-migrate-the-schema). You can run the script locally or inside the Worker container:
|
||
|
||
**To run inside Worker container**:
|
||
|
||
```bash
|
||
# Copy script to storage directory
|
||
cp migrate_weaviate_collections.py /path/to/dify/docker/volumes/app/storage/
|
||
|
||
# Enter worker container
|
||
cd /path/to/dify/docker
|
||
docker compose exec -it worker /bin/bash
|
||
|
||
# Run migration script
|
||
uv run --no-cache /app/api/storage/migrate_weaviate_collections.py
|
||
|
||
# Exit and restart services
|
||
exit
|
||
docker compose restart api worker worker_beat
|
||
```
|
||
|
||
<Info>
|
||
The migration script uses cursor-based pagination to safely handle large collections. Verify that object counts match after the migration completes.
|
||
</Info>
|
||
|
||
#### Step B3: Verify in Dify
|
||
|
||
- Open Dify’s Knowledge Base UI.
|
||
- Use Retrieval Testing to confirm queries return results.
|
||
- If errors persist, inspect `docker compose logs weaviate` for additional repair steps (see [Troubleshooting](#troubleshooting)).
|
||
|
||
## Data Migration for Legacy Versions
|
||
|
||
<Warning>
|
||
**CRITICAL: Data Migration Required**
|
||
|
||
**Your existing knowledge bases will NOT work after upgrade without migration!**
|
||
|
||
**Why Migration is Needed**:
|
||
|
||
- Old data: Created with Weaviate v3 client (simple schema)
|
||
- New code: Requires Weaviate v4 format (extended schema)
|
||
- **Incompatible**: Old data missing required properties
|
||
|
||
**Migration Options**:
|
||
|
||
- Option A: Use Weaviate Backup/Restore
|
||
|
||
- Option B: Re-index from Original Documents
|
||
|
||
- Option C: Keep the old Weaviate version (don't upgrade yet) if you can't afford downtime or data loss.
|
||
|
||
</Warning>
|
||
|
||
### Automatic Migration
|
||
|
||
In most cases, Weaviate 1.27.0 will automatically migrate data from 1.19.0:
|
||
|
||
1. Stop Weaviate 1.19.0
|
||
2. Start Weaviate 1.27.0 with the same data directory
|
||
3. Weaviate will detect the old format and migrate automatically
|
||
4. Monitor logs for migration progress and any errors
|
||
|
||
### Manual Migration (If Automatic Fails)
|
||
|
||
If automatic migration fails, use Weaviate's export/import tools:
|
||
|
||
#### 1. Export Data from Old Version
|
||
|
||
Use the Cursor API or backup feature to export all data. For large datasets, use Weaviate's backup API:
|
||
|
||
```bash
# Using backup API (recommended)
curl -X POST "http://localhost:8080/v1/backups/filesystem" \
  -H "Content-Type: application/json" \
  -d '{"id": "pre-migration-backup"}'
```
|
||
|
||
#### 2. Import Data to New Version
|
||
|
||
After upgrading to Weaviate 1.27.0, restore the backup:
|
||
|
||
```bash
curl -X POST "http://localhost:8080/v1/backups/filesystem/pre-migration-backup/restore" \
  -H "Content-Type: application/json"
```
|
||
|
||
<Info>
|
||
For comprehensive migration guidance, especially for complex schemas or large datasets, refer to the official [Weaviate Migration Guide](https://weaviate.io/developers/weaviate/installation/migration).
|
||
</Info>
|
||
|
||
## Configuration Changes
|
||
|
||
### New Environment Variables
|
||
|
||
The following new environment variable is available in Dify versions with weaviate-client v4:
|
||
|
||
#### WEAVIATE_GRPC_ENDPOINT
|
||
|
||
**Description**: Specifies the gRPC endpoint for Weaviate connections. Using gRPC significantly improves performance for batch operations and queries.
|
||
|
||
**Format**: `hostname:port` (NO protocol prefix)
|
||
|
||
**Default Ports**:
|
||
|
||
- Insecure: 50051
|
||
- Secure (TLS): 443
|
||
|
||
**Examples**:
|
||
|
||
```bash
|
||
# Docker Compose (internal network)
|
||
WEAVIATE_GRPC_ENDPOINT=weaviate:50051
|
||
|
||
# External server (insecure)
|
||
WEAVIATE_GRPC_ENDPOINT=192.168.1.100:50051
|
||
|
||
# External server with custom port
|
||
WEAVIATE_GRPC_ENDPOINT=weaviate.example.com:9090
|
||
|
||
# Weaviate Cloud (secure/TLS on port 443)
|
||
WEAVIATE_GRPC_ENDPOINT=your-instance.weaviate.cloud:443
|
||
```
|
||
|
||
<Warning>
|
||
Do NOT include protocol prefixes like `grpc://` or `http://` in the WEAVIATE_GRPC_ENDPOINT value. Use only `hostname:port`.
|
||
</Warning>
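
The `hostname:port` rule can be enforced before the value ever reaches Dify. The following is a minimal validation sketch, not Dify's own parsing logic: `parse_grpc_endpoint` is a hypothetical helper, and it is deliberately strict (it requires an explicit port), which may be stricter than what Dify accepts.

```python
from typing import Tuple


def parse_grpc_endpoint(value: str) -> Tuple[str, int]:
    """Split a WEAVIATE_GRPC_ENDPOINT value into (host, port).

    Raises ValueError for protocol prefixes, a missing port, or a bad port.
    """
    if "://" in value:
        raise ValueError("remove the protocol prefix; use hostname:port only")
    host, sep, port_text = value.rpartition(":")
    if not sep or not host or not port_text.isdigit():
        raise ValueError("expected hostname:port")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return host, port


print(parse_grpc_endpoint("weaviate:50051"))
print(parse_grpc_endpoint("your-instance.weaviate.cloud:443"))
```
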
|
||
|
||
### Updated Environment Variables
|
||
|
||
All existing Weaviate environment variables remain the same:
|
||
|
||
- **WEAVIATE_ENDPOINT**: HTTP endpoint for Weaviate (e.g., `http://weaviate:8080`)
|
||
- **WEAVIATE_API_KEY**: API key for authentication (if enabled)
|
||
- **WEAVIATE_BATCH_SIZE**: Batch size for imports (default: 100)
|
||
- **WEAVIATE_GRPC_ENABLED**: Enable/disable gRPC (default: true in v4)
|
||
|
||
### Complete Configuration Example
|
||
|
||
```bash
|
||
# docker/.env or environment configuration
|
||
VECTOR_STORE=weaviate
|
||
|
||
# HTTP Endpoint (required)
|
||
WEAVIATE_ENDPOINT=http://weaviate:8080
|
||
|
||
# Authentication (if enabled on your Weaviate instance)
|
||
WEAVIATE_API_KEY=your-secret-api-key
|
||
|
||
# gRPC Configuration (recommended for performance)
|
||
WEAVIATE_GRPC_ENABLED=true
|
||
WEAVIATE_GRPC_ENDPOINT=weaviate:50051
|
||
|
||
# Batch Import Settings
|
||
WEAVIATE_BATCH_SIZE=100
|
||
```
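
To see how these two endpoint values relate to a v4 client connection, the sketch below derives the host, port, and TLS settings from them. It is an illustration under assumptions, not Dify's implementation: `connection_params` is a hypothetical helper, and treating gRPC port 443 as "secure" is inferred from the default ports listed in this guide.

```python
from urllib.parse import urlparse


def connection_params(http_endpoint: str, grpc_endpoint: str) -> dict:
    """Derive connection settings from WEAVIATE_ENDPOINT and WEAVIATE_GRPC_ENDPOINT."""
    http = urlparse(http_endpoint)
    http_secure = http.scheme == "https"
    # Fall back to the scheme's default port when none is given.
    http_port = http.port or (443 if http_secure else 80)
    grpc_host, _, grpc_port_text = grpc_endpoint.rpartition(":")
    grpc_port = int(grpc_port_text)
    return {
        "http_host": http.hostname,
        "http_port": http_port,
        "http_secure": http_secure,
        "grpc_host": grpc_host,
        "grpc_port": grpc_port,
        # Assumption: TLS is used exactly when the gRPC port is 443.
        "grpc_secure": grpc_port == 443,
    }


print(connection_params("http://weaviate:8080", "weaviate:50051"))
```

The dictionary keys are chosen to match the keyword arguments of `weaviate.connect_to_custom` in the v4 Python client, so with the client installed the result could be passed as `weaviate.connect_to_custom(**params)`, adding `auth_credentials` when an API key is set.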
|
||
|
||
## Verification Steps
|
||
|
||
After completing the migration, verify everything is working correctly:
|
||
|
||
### 1. Check Weaviate Connection
|
||
|
||
Verify Weaviate is accessible and running the correct version:
|
||
|
||
```bash
|
||
# Check HTTP endpoint and version
|
||
curl http://your-weaviate-host:8080/v1/meta | jq '.version'
|
||
|
||
# Should return 1.27.0 or higher
|
||
```
|
||
|
||
### 2. Verify Dify Connection
|
||
|
||
Check the Dify logs for successful Weaviate connection:
|
||
|
||
```bash
|
||
docker compose logs api | grep -i weaviate
|
||
```
|
||
|
||
Look for messages indicating a successful connection, and confirm that no `No module named 'weaviate.classes'` errors appear.
|
||
|
||
### 3. Test Knowledge Base Creation
|
||
|
||
1. Log into your Dify instance
|
||
2. Navigate to **Knowledge Base** section
|
||
3. Create a new knowledge base
|
||
4. Upload a test document (PDF, TXT, or MD)
|
||
5. Wait for indexing to complete
|
||
6. Check that status changes from "QUEUING" → "INDEXING" → "AVAILABLE"
|
||
|
||
<Info>
|
||
If documents get stuck in "QUEUING" status, check that the Celery worker is running: `docker compose logs worker`.
|
||
</Info>
|
||
|
||
### 4. Test Vector Search
|
||
|
||
1. Create or open a chat application with knowledge base integration
|
||
2. Ask a question that should retrieve information from your knowledge base
|
||
3. Verify that relevant results are returned with correct scores
|
||
4. Check the citation/source links work correctly
|
||
|
||
### 5. Verify gRPC Performance
|
||
|
||
If gRPC is enabled, you should see improved performance:
|
||
|
||
```bash
|
||
# Check if gRPC port is accessible
|
||
docker exec -it dify-api-1 nc -zv weaviate 50051
|
||
|
||
# Monitor query times in logs
|
||
docker compose logs -f api | grep -i "query_time\|duration"
|
||
```
|
||
|
||
<Info>
|
||
With gRPC properly configured, vector search queries should be 2-5x faster compared to HTTP-only connections.
|
||
</Info>
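
Where `nc` is not available in the container, the same reachability check can be done with Python's standard library. This is a minimal sketch; `port_is_open` is a hypothetical helper, and a successful TCP connection only shows that something is listening on the port, not that it speaks gRPC.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

From inside the API container, `port_is_open("weaviate", 50051)` corresponds to the `nc -zv weaviate 50051` check above.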
|
||
|
||
## Troubleshooting
|
||
|
||
### Issue: "No module named 'weaviate.classes'"
|
||
|
||
**Cause**: The weaviate-client v4 is not installed, or v3 is still being used.
|
||
|
||
**Solution**:
|
||
|
||
```bash
|
||
# For Docker installations, ensure you're running the correct Dify version
|
||
docker compose pull
|
||
docker compose down
|
||
docker compose up -d
|
||
|
||
# For source installations
|
||
pip uninstall weaviate-client
|
||
pip install weaviate-client==4.17.0
|
||
```
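
To confirm which client version is actually installed in the environment that raises the error, the package metadata can be queried directly. This sketch uses only the standard library; the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_client_version(package: str = "weaviate-client") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_v4_or_newer(version: str) -> bool:
    """True when the major version number is 4 or higher."""
    return int(version.split(".")[0]) >= 4


version = installed_client_version()
if version is None:
    print("weaviate-client is not installed")
else:
    print(version, "OK" if is_v4_or_newer(version) else "too old, v4 required")
```

Run it with the same interpreter Dify uses (for example inside the `api` container) so the result reflects the right environment.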
|
||
|
||
### Issue: Connection Refused on gRPC Port (50051)
|
||
|
||
**Cause**: Port 50051 is not exposed, not accessible, or Weaviate is not listening on it.
|
||
|
||
**Solution**:
|
||
|
||
1. **For Docker Compose users with bundled Weaviate**:
|
||
The port is available internally between containers. No action needed unless you're connecting from outside Docker.
|
||
|
||
2. **For external Weaviate**:
|
||
|
||
```bash
|
||
# Check if Weaviate is listening on 50051
|
||
docker ps | grep weaviate
|
||
# Look for "0.0.0.0:50051->50051/tcp"
|
||
|
||
# If not exposed, restart with port mapping
|
||
docker run -p 8080:8080 -p 50051:50051 ...
|
||
```
|
||
|
||
3. **Check firewall rules**:
|
||
|
||
```bash
|
||
# Linux
|
||
sudo ufw allow 50051/tcp
|
||
|
||
# Check if port is listening
|
||
netstat -tlnp | grep 50051
|
||
```
|
||
|
||
### Issue: Authentication Errors (401 Unauthorized)
|
||
|
||
**Cause**: API key mismatch or authentication configuration issue.
|
||
|
||
**Solution**:
|
||
|
||
1. Verify API key matches in both Weaviate and Dify:
|
||
|
||
```bash
|
||
# Check Weaviate authentication
|
||
curl http://localhost:8080/v1/meta | jq '.authentication'
|
||
|
||
# Check Dify configuration
|
||
docker compose exec api env | grep WEAVIATE_API_KEY
|
||
```
|
||
|
||
2. If using anonymous access:
|
||
|
||
```yaml
# Weaviate docker-compose.yaml
weaviate:
  environment:
    AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
    AUTHENTICATION_APIKEY_ENABLED: "false"
```
|
||
|
||
Then remove `WEAVIATE_API_KEY` from Dify configuration.
|
||
|
||
### Issue: Documents Stuck in "QUEUING" Status
|
||
|
||
**Cause**: Celery worker not running or not connected to Redis.
|
||
|
||
**Solution**:
|
||
|
||
```bash
|
||
# Check if worker is running
|
||
docker compose ps worker
|
||
|
||
# Check worker logs
|
||
docker compose logs worker | tail -50
|
||
|
||
# Check Redis connection
|
||
docker compose exec api redis-cli -h redis -p 6379 -a difyai123456 ping
|
||
# Should return "PONG"
|
||
|
||
# Restart worker
|
||
docker compose restart worker
|
||
```
|
||
|
||
### Issue: Slow Performance After Migration
|
||
|
||
**Cause**: gRPC not enabled or configured incorrectly.
|
||
|
||
**Solution**:
|
||
|
||
1. Verify gRPC configuration:
|
||
|
||
```bash
|
||
docker compose exec api env | grep WEAVIATE_GRPC
|
||
```
|
||
|
||
Should show:
|
||
|
||
```
|
||
WEAVIATE_GRPC_ENABLED=true
|
||
WEAVIATE_GRPC_ENDPOINT=weaviate:50051
|
||
```
|
||
|
||
2. Test gRPC connectivity:
|
||
|
||
```bash
|
||
docker exec -it dify-api-1 nc -zv weaviate 50051
|
||
# Should return "succeeded"
|
||
```
|
||
|
||
3. If still slow, check network latency between Dify and Weaviate
|
||
|
||
### Issue: Schema Migration Errors
|
||
|
||
**Cause**: Incompatible schema changes between Weaviate versions or corrupted data.
|
||
|
||
**Solution**:
|
||
|
||
1. Check Weaviate logs for specific error messages:
|
||
|
||
```bash
|
||
docker compose logs weaviate | tail -100
|
||
```
|
||
|
||
2. List current schema:
|
||
|
||
```bash
|
||
curl http://localhost:8080/v1/schema
|
||
```
|
||
|
||
3. If necessary, delete corrupted collections (⚠️ this deletes all data):
|
||
|
||
```bash
|
||
# Backup first!
|
||
curl -X DELETE http://localhost:8080/v1/schema/YourCollectionName
|
||
```
|
||
|
||
4. Restart Dify to recreate schema:
|
||
```bash
|
||
docker compose restart api worker
|
||
```
|
||
|
||
<Warning>
|
||
Deleting collections removes all data. Only do this if you have a backup and are prepared to re-index all content.
|
||
</Warning>
|
||
|
||
### Issue: Docker Volume Permission Errors
|
||
|
||
**Cause**: User ID mismatch in Docker containers.
|
||
|
||
**Solution**:
|
||
|
||
```bash
|
||
# Check ownership of Weaviate data directory
|
||
ls -la docker/volumes/weaviate/
|
||
|
||
# Fix permissions (use the UID shown in error messages)
|
||
sudo chown -R 1000:1000 docker/volumes/weaviate/
|
||
|
||
# Restart services
|
||
docker compose restart weaviate
|
||
```
|
||
|
||
### Issue: Permission Denied When Running Migration Script (Dify 1.11.0+)
|
||
|
||
**Cause**: The `/home/dify` directory may not exist in newer Dify versions, causing `uv` cache creation to fail.
|
||
|
||
**Solution**:
|
||
|
||
```bash
|
||
# Option 1: Use --no-cache flag (recommended)
|
||
uv run --no-cache migrate_weaviate_collections.py
|
||
|
||
# Option 2: Run as root user
|
||
docker compose exec -u root worker /bin/bash
|
||
uv run migrate_weaviate_collections.py
|
||
```
|
||
|
||
## Rollback Plan
|
||
|
||
If the migration fails and you need to roll back:
|
||
|
||
### Step 1: Stop Services
|
||
|
||
```bash
|
||
cd /path/to/dify/docker
|
||
docker compose down
|
||
```
|
||
|
||
### Step 2: Restore Backup
|
||
|
||
```bash
|
||
# Remove current volumes
|
||
rm -rf volumes/weaviate
|
||
|
||
# Restore from backup
|
||
tar -xvf ../weaviate-backup-TIMESTAMP.tgz
|
||
```
|
||
|
||
### Step 3: Revert Dify Version
|
||
|
||
```bash
|
||
cd /path/to/dify
|
||
git checkout <previous-version-tag>
|
||
cd docker
|
||
docker compose pull
|
||
```
|
||
|
||
### Step 4: Restart Services
|
||
|
||
```bash
|
||
docker compose up -d
|
||
```
|
||
|
||
### Step 5: Verify Rollback
|
||
|
||
Check that services are running with old versions:
|
||
|
||
```bash
|
||
# Check versions
|
||
docker compose exec api pip show weaviate-client
|
||
curl http://localhost:8080/v1/meta | jq '.version'
|
||
|
||
# Check for errors
|
||
docker compose logs | grep -i error
|
||
```
|
||
|
||
<Info>
|
||
Always test the rollback procedure in a staging environment first if possible. Maintain multiple backup copies before attempting major migrations.
|
||
</Info>
|
||
|
||
## Additional Resources
|
||
|
||
### Official Documentation
|
||
|
||
- [Weaviate Migration Guide](https://weaviate.io/developers/weaviate/installation/migration)
|
||
- [Weaviate v4 Client Documentation](https://weaviate.io/developers/weaviate/client-libraries/python)
|
||
- [Weaviate Backup and Restore](https://weaviate.io/developers/weaviate/configuration/backups)
|
||
- [Dify Self-Hosting Guide](/en/self-host/quick-start/docker-compose)
|
||
- [Dify Environment Variables](/en/self-host/configuration/environments)
|
||
|
||
### Community Resources
|
||
|
||
- [Dify GitHub Repository](https://github.com/langgenius/dify)
|
||
- [Dify GitHub Issues - Weaviate](https://github.com/langgenius/dify/issues?q=is%3Aissue+weaviate)
|
||
- [Weaviate Community Forum](https://forum.weaviate.io/)
|
||
- [Dify Community Forum](https://forum.dify.ai/)
|
||
|
||
### Migration Tools
|
||
|
||
- [Weaviate Python Client v4](https://github.com/weaviate/weaviate-python-client)
|
||
- [Weaviate Backup Tools](https://github.com/weaviate/weaviate/tree/main/tools)
|
||
|
||
## Summary
|
||
|
||
This migration brings important improvements to Dify's vector storage capabilities:
|
||
|
||
- **Better Performance**: gRPC support dramatically improves query and import speeds (2-5x faster)
|
||
|
||
- **Improved Stability**: Enhanced connection handling and error recovery
|
||
|
||
- **Security**: Access to security updates and patches not available in Weaviate 1.19.0
|
||
|
||
- **Future-Proof**: Access to latest Weaviate features and ongoing support
|
||
|
||
While this is a breaking change that requires a server upgrade for users on old versions, the benefits significantly outweigh the migration effort. Most Docker Compose users can complete the migration in under 15 minutes with the automatic update.
|
||
|
||
<Info>
|
||
If you encounter any issues not covered in this guide, please report them on the [Dify GitHub Issues page](https://github.com/langgenius/dify/issues) with the labels "weaviate" and "migration".
|
||
</Info>
|