- Cloud Providers: Added Azure OpenAI tab (with provider type toggle
instructions and link to Entra ID tutorial) and LiteLLM tab
- Local Servers: Added Llama.cpp tab with quick start command and
link to the dedicated Llama.cpp guide
https://claude.ai/code/session_01TPoquFdHG6dZxRrZ4Jormh
- OpenAI page: Focus on OpenAI/Azure only, link to compatible page
for all other providers
- OpenAI-Compatible page: Complete rewrite with tabbed provider guides
- Cloud tabs: Anthropic, Google Gemini, DeepSeek, Mistral, Groq,
Perplexity, MiniMax, OpenRouter, Amazon Bedrock
- Local tabs: Lemonade, LM Studio, vLLM, LocalAI, Docker Model Runner
- Added prominent warning about /models endpoint failing for some
providers (Anthropic, Perplexity, MiniMax) with solution table
- Fixed Google Gemini URL (removed trailing slash)
- Deleted minimax.md and amazon-bedrock.md tutorials (content moved
into the compatible page tabs)
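
The /models warning above can be illustrated with a minimal sketch. It assumes one possible workaround (falling back to a manually configured model list when the provider's /models call fails); the actual fixes are in the solution table on the page, and `resolve_models`, `fetch_models`, and `manual_models` are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, List


def resolve_models(
    fetch_models: Callable[[], Iterable[str]],
    manual_models: Iterable[str],
) -> List[str]:
    """Return model IDs from a provider, or a manual list if discovery fails.

    fetch_models is a hypothetical callable that wraps the provider's
    /models request; it may raise or return nothing for providers that
    do not implement the endpoint.
    """
    try:
        models = list(fetch_models())
    except Exception:
        # Discovery failed (for example, the endpoint is missing),
        # so treat it the same as an empty result.
        models = []
    # Prefer discovered models; otherwise use the manually supplied IDs.
    return models if models else list(manual_models)
```

The fallback keeps the connection usable even when model discovery is unavailable, at the cost of maintaining the manual list by hand.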
- scaling.md: Link each step to relevant troubleshooting sections
(DB corruption, WebSocket errors, login loops, worker crashes,
file access issues, logging, OpenTelemetry)
- multi-replica.mdx: Add scaling guide link in intro and Related Docs,
plus links to Redis and RAG troubleshooting
- performance.md: Add scaling guide link in Scaling Infrastructure section
- redis.md: Add scaling guide link in "When is Redis Required?"
- connection-error.mdx: Add links to Redis tutorial, scaling guide,
and multi-replica WebSocket troubleshooting
- Helm.md: Add scaling guide link alongside existing HA guide link
- Added Step 4 (Switch to External Vector Database) explaining why default
ChromaDB crashes in multi-process setups, with a comparison table of
alternatives (PGVector, Milvus, Qdrant, Pinecone, ChromaDB HTTP mode)
- Corrected Step 5 (file storage): shared filesystem (NFS) is sufficient,
cloud storage (S3) is optional. Files use UUID-based names so no write
conflicts occur between processes.
- Updated architecture diagram, env var examples, and quick reference table
to include vector DB column and clarify storage requirements.
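
The Step 5 point about UUID-based names can be sketched as follows. This is a minimal illustration, not the project's storage code: `store_upload` and `shared_dir` are hypothetical, and the sketch only shows why independent processes writing to one shared directory do not collide.

```python
import uuid
from pathlib import Path


def store_upload(shared_dir: Path, original_name: str, data: bytes) -> Path:
    """Write data under a fresh UUID-based name and return the new path.

    Each call picks its own random name, so two processes saving files
    with the same original name still write to different paths.
    """
    suffix = Path(original_name).suffix  # keep the extension, e.g. ".pdf"
    target = shared_dir / f"{uuid.uuid4()}{suffix}"
    # Mode "xb" creates the file exclusively and raises FileExistsError
    # if the path already exists, so nothing is silently overwritten.
    with open(target, "xb") as fh:
        fh.write(data)
    return target
```

Because the name is chosen per write rather than derived from the upload's original filename, no cross-process locking is needed for the write itself.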
https://claude.ai/code/session_01TPoquFdHG6dZxRrZ4Jormh
ChromaDB's default local PersistentClient uses SQLite, which is not fork-safe.
When uvicorn forks multiple workers, concurrent writes crash workers instantly.
Added warnings and guidance across env config, HA/scaling, performance,
troubleshooting, Docker Swarm, Helm, Redis, RAG, and enterprise architecture docs.