Merge pull request #11882 from nextcloud/fix/admin/ai/llm2-stt_whisper2-no-gpu

fix(admin/AI): llm2 & stt_whisper2 still don't support GPU yet
Authored by Marcel Klehr on 2024-06-25 08:07:49 +02:00, committed by GitHub
2 changed files with 2 additions and 26 deletions


@@ -18,24 +18,10 @@ Requirements
* This app is built as an External App and thus depends on AppAPI v2.3.0 or higher
* Nextcloud AIO is supported
-* Using GPU processing is supported, but not required; be prepared for slow performance unless you are using a GPU
-  * We currently only support NVIDIA GPUs
-* GPU Sizing
-  * You will need a GPU with enough VRAM to hold the model you choose
-  * Some examples:
-    * for 8B-parameter models, 5-bit-quantized variants and lower should fit in 8GB of VRAM, at the cost of slightly lower quality
-    * for 8B-parameter models, 6-bit-quantized variants and up will need 12GB of VRAM
-    * for a 70B-parameter model, 4-bit-quantized variants will need ~44GB of VRAM
-  * If you want better reasoning capabilities, look for models with more parameters, such as 14B and up, which of course also need more VRAM
-  * To check whether a model fits on your GPU, you can use this `calculator <https://rahulschand.github.io/gpu_poor/>`_
+* Using a GPU is currently not supported
* CPU Sizing
* If you don't have a GPU, this app will utilize your CPU cores
* The more cores you have and the more powerful the CPU, the better; we recommend 10-20 cores
* The app will hog all cores by default, so it is usually better to run it on a separate machine
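
For context on the GPU-sizing figures in the removed lines above: they follow from a simple rule of thumb. A quantized model's weights occupy roughly parameters × bits ÷ 8 bytes, plus runtime overhead for the KV cache and activations. The sketch below reproduces those numbers under an assumed ~25% overhead; this is an illustration, not something from the app or the docs, and the linked gpu_poor calculator models the same question far more precisely.

.. code-block:: python

    # Back-of-the-envelope VRAM estimate for a quantized LLM.
    # The 25% overhead for KV cache, activations and runtime buffers
    # is an assumption for illustration, not a figure from the docs.

    def estimated_vram_gb(params_billions: float, quant_bits: int,
                          overhead: float = 0.25) -> float:
        """Approximate VRAM (GB) needed to serve a quantized model."""
        weights_gb = params_billions * quant_bits / 8  # 1B params @ 8-bit ~ 1 GB
        return weights_gb * (1 + overhead)

    for params, bits in [(8, 5), (8, 6), (70, 4)]:
        print(f"{params}B @ {bits}-bit: ~{estimated_vram_gb(params, bits):.1f} GB")
    # 8B @ 5-bit:  ~6.2 GB  -> fits in 8 GB
    # 8B @ 6-bit:  ~7.5 GB  -> a 12 GB card leaves headroom for longer contexts
    # 70B @ 4-bit: ~43.8 GB -> matches the ~44 GB figure above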


@@ -19,20 +19,10 @@ Requirements
* Minimal Nextcloud version: 28
* This app is built as an External App and thus depends on AppAPI v2.3.0
* Nextcloud AIO is supported
-* Using GPU processing is supported, but not required; be prepared for slow performance unless you are using a GPU
-  * We currently only support NVIDIA GPUs
-* GPU Sizing
-  * You will need a GPU with enough VRAM to hold the model you choose
-    * the small model should fit in 2GB of VRAM
-    * the large-v2 model (the largest and most accurate) will need 6GB of VRAM
-  * The distil-whisper variants have half the parameters of the original models while reportedly staying within 1% of the original error rate (your mileage may vary)
+* Using a GPU is currently not supported
* CPU Sizing
* If you don't have a GPU, this app will utilize your CPU cores
* The more cores you have and the more powerful the CPU, the better; we recommend 10-20 cores
* The app will hog all cores by default, so it is usually better to run it on a separate machine
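
To make the sizing guidance above concrete, here is a minimal CPU-only transcription sketch, assuming the faster-whisper package as the backend (an assumption for illustration; the docs above do not name the app's actual engine). The ``cpu_threads`` argument caps core usage, which is one way to avoid the all-cores-by-default behaviour when the app shares a machine.

.. code-block:: python

    # Minimal CPU-only transcription sketch using faster-whisper.
    # Whether stt_whisper2 uses this library internally is an
    # assumption here, not something stated in the docs above.
    from faster_whisper import WhisperModel

    # "small" fits the ~2 GB budget noted above; "large-v2" needs ~6 GB.
    # cpu_threads caps core usage instead of grabbing every core.
    model = WhisperModel("small", device="cpu", compute_type="int8",
                         cpu_threads=8)

    segments, info = model.transcribe("meeting.wav")  # hypothetical input file
    print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
    for segment in segments:
        print(f"[{segment.start:6.1f}s -> {segment.end:6.1f}s] {segment.text}")

The distil-whisper variants mentioned above load the same way (e.g. ``distil-large-v2``) and roughly halve the compute per transcription.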