mirror of https://github.com/open-webui/docs.git synced 2026-01-03 18:26:47 +07:00


Samuel Maier 1f70dd5b4b add slim_down.md

2024-07-28 18:28:59 +02:00



sidebar_position	title
10	Slimming down RAM usage

Slimming down RAM usage

If you deploy this image in a RAM-constrained environment, there are a few things you can do to slim down its memory usage.

On a Raspberry Pi 4 (arm64) running version v0.3.10, these changes reduced idle memory consumption from >1GB to ~200MB.

TLDR

Set the following environment variables: RAG_EMBEDDING_ENGINE: ollama, AUDIO_STT_ENGINE: openai.

Longer explanation

Much of the memory consumption comes from loaded ML models. Even if you use an external language model (OpenAI or un-bundled ollama), several models may still be loaded for additional purposes.

As of v0.3.10 this includes:

Speech-to-text (defaults to whisper)
RAG Embedding engine (defaults to local SentenceTransformers model)
Image generation engine (disabled by default)

The first two are enabled and set to local models by default. You can change the models in the admin panel (RAG: in the Documents category, set it to ollama or OpenAI; Speech-to-text: in the Audio section, OpenAI or WebAPI work). If you deploy via docker, you can also set these with the following environment variables: RAG_EMBEDDING_ENGINE: ollama, AUDIO_STT_ENGINE: openai.
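
For a docker compose deployment, the two variables might be set as in the minimal sketch below. Only the two environment variables come from this page. The service name, image tag, and port mapping are assumptions for illustration and should be adapted to your setup. Both settings hand the work to an external service, so this presumably only helps if an ollama instance and an OpenAI-compatible API are reachable and configured.

```yaml
# Sketch of a docker-compose.yml fragment.
# Assumptions (not from this page): service name, image tag, port mapping.
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      # Use ollama for RAG embeddings instead of the local SentenceTransformers model
      RAG_EMBEDDING_ENGINE: ollama
      # Use OpenAI for speech-to-text instead of the local whisper model
      AUDIO_STT_ENGINE: openai
```

After restarting the container, idle memory can be compared before and after with a tool such as docker stats.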
