diff --git a/admin_manual/ai/app_context_chat.rst b/admin_manual/ai/app_context_chat.rst
index 53eeb8c2c..97e9d3367 100644
--- a/admin_manual/ai/app_context_chat.rst
+++ b/admin_manual/ai/app_context_chat.rst
@@ -13,6 +13,8 @@ Together they provide the ContextChat text processing tasks accessible via the :
 The *context_chat* and *context_chat_backend* apps run only open source models and do so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
 
+This app supports input and output in languages other than English if the language model supports the language.
+
 Requirements
 ------------
diff --git a/admin_manual/ai/app_llm2.rst b/admin_manual/ai/app_llm2.rst
index 2c78c11e4..28131a1dc 100644
--- a/admin_manual/ai/app_llm2.rst
+++ b/admin_manual/ai/app_llm2.rst
@@ -4,17 +4,19 @@ App: Local large language model (llm2)
 
 .. _ai-app-llm2:
 
-The *llm2* app is one of the apps that provide text processing functionality using Large language models in Nextcloud and act as a text processing backend for the :ref:`Nextcloud Assistant app`, the *mail* app and :ref:`other apps making use of the core Translation API`. The *llm2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
+The *llm2* app is one of the apps that provide text processing functionality using Large language models in Nextcloud and act as a text processing backend for the :ref:`Nextcloud Assistant app`, the *mail* app and :ref:`other apps making use of the core Text Processing API`. The *llm2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
 
 This app uses `ctransformers `_ under the hood and is thus compatible with any model in *gguf* format.
 Output quality will differ depending on which model you use, we recommend the following models:
 
-* `Llama2 7b Chat `_ (Slightly older; good quality; good acclaim)
-* `NeuralBeagle14 7B `_ (Newer; good quality; less well known)
+* `Llama3 8b Instruct `_ (reasonable quality; fast; good acclaim; multilingual output may not be optimal)
+* `Llama3 70B Instruct `_ (good quality; good acclaim; good multilingual output)
+
+This app supports input and output in languages other than English if the underlying model supports the language.
 
 Requirements
 ------------
 
-* This app is built as an External App and thus depends on AppAPI v2.3.0
+* This app is built as an External App and thus depends on AppAPI v2.3.0 or higher
 * Nextcloud AIO is supported
 * Using GPU processing is supported, but not required; be prepared for slow performance unless you are using GPU
 
   * We currently only support NVIDIA GPUs
@@ -22,9 +24,14 @@ Requirements
 
   * You will need a GPU with enough VRAM to hold the model you choose
 
-  * for 7B parameter models, 5bit-quantized variants and lower should fit on a 8GB VRAM, but of course have lower quality
-  * for 7B parameter models, 6bit-quantized variants and up will need 12GB VRAM
+  * Some examples:
+
+    * for 8B parameter models, 5bit-quantized variants and lower should fit into 8GB of VRAM, but of course have slightly lower quality
+    * for 8B parameter models, 6bit-quantized variants and up will need 12GB of VRAM
+    * for a 70B parameter model, 4bit-quantized variants will need ~44GB of VRAM
+
+  * If you want better reasoning capabilities, you will need to look for models with more parameters, like 14B and higher, which of course also need more VRAM
+  * To check whether a model fits on your GPU, you can use this `calculator `_
 
 * CPU Sizing
@@ -69,7 +76,8 @@ Nextcloud customers should file bugs directly with our Support system.
 Known Limitations
 -----------------
 
-* We currently only support the English language
+* We currently only support languages that the underlying model supports; correctness of language use in languages other than English may be poor depending on the language's coverage in the model's training data (we recommend Llama 3 or other models explicitly trained on multiple languages)
+* Language models can be bad at reasoning tasks
 * Language models are likely to generate false information and should thus only be used in situations that are not critical. It's recommended to only use AI at the beginning of a creation process and not at the end, so that outputs of AI serve as a draft for example and not as final product. Always check the output of language models before using it.
 * Make sure to test the language model you are using for whether it meets the use-case's quality requirements
 * Language models notoriously have a high energy consumption; if you want to reduce load on your server you can choose smaller models or quantized models in exchange for lower accuracy
diff --git a/admin_manual/ai/app_stt_whisper2.rst b/admin_manual/ai/app_stt_whisper2.rst
index 4ef327dd0..0fddadad5 100644
--- a/admin_manual/ai/app_stt_whisper2.rst
+++ b/admin_manual/ai/app_stt_whisper2.rst
@@ -6,10 +6,12 @@ App: Local Whisper Speech-To-Text (stt_whisper2)
 
 The *stt_whisper2* app is one of the apps that provide Speech-To-Text functionality in Nextcloud and act as a media transcription backend for the :ref:`Nextcloud Assistant app`, the *talk* app and :ref:`other apps making use of the core Translation API`. The *stt_whisper2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
 
+This app supports input and output in languages other than English if the underlying model supports the language.
+
 This app uses `faster-whisper `_ under the hood.
 Output quality will differ depending on which model you use, we recommend the following models:
 
- * OpenAI Whisper large-v2 or v3
- * OpenAI Whisper medium.en
+ * OpenAI Whisper large-v2 or v3 (multilingual)
+ * OpenAI Whisper medium.en (English only)
 
 Requirements
 ------------
@@ -44,7 +46,7 @@ Installation
 
 Supplying alternate models
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-This app allows supplying alternate LLM models as *gguf* files in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face `_ by simply
+This app allows supplying alternate models in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face `_ in the following way:
 
 1. git cloning the respective repository
 2. Copying the folder with the git repository to ``/nc_app_llm2_data`` inside the docker container.
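The two steps at the end of the *stt_whisper2* diff (clone a Systran *faster-whisper* repository, copy it into the container's ``/nc_app_llm2_data`` directory) can be sketched as shell commands. This is a hypothetical illustration, not part of the patch: the ``Systran/faster-whisper-small`` repository and the container name ``nc_app_stt_whisper2`` are assumptions, and the helper only prints the commands so they can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical sketch of the two documented steps. The repository URL and
# container name are assumptions; substitute the model and container you use.
# Prints the commands instead of executing them, for review.
print_supply_steps() {
  repo_url=$1
  container=$2
  model_dir=$(basename "$repo_url")   # folder name created by git clone
  echo "git clone $repo_url"
  echo "docker cp $model_dir $container:/nc_app_llm2_data/"
}

print_supply_steps https://huggingface.co/Systran/faster-whisper-small nc_app_stt_whisper2
```

Printing first and executing by hand keeps the review step explicit; once verified, the two echoed commands can be run directly.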
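The VRAM sizing examples added in the *llm2* requirements (8B at 5-bit fitting into 8GB, 70B at 4-bit needing ~44GB) follow a back-of-the-envelope rule: weight memory is roughly parameters times bits per weight divided by 8, plus runtime overhead. A minimal sketch, assuming a ~20% overhead factor; the ``estimate_vram`` helper is an illustration, not part of the documented apps (for real sizing, use the calculator linked in the diff):

```shell
#!/bin/sh
# Rough VRAM estimate for a quantized model: weights in GB are approximately
# params (billions) * bits-per-weight / 8, plus ~20% overhead (an assumption
# covering KV cache and runtime buffers).
estimate_vram() {
  params_b=$1   # model size in billions of parameters
  bits=$2       # quantization bits per weight
  LC_ALL=C awk -v p="$params_b" -v b="$bits" \
    'BEGIN { printf "%.1f GB\n", p * b / 8 * 1.2 }'
}

estimate_vram 8 5    # roughly consistent with the 8GB guidance in the diff
estimate_vram 70 4   # roughly consistent with the ~44GB guidance in the diff
```

The estimate deliberately errs simple; real usage also depends on context length and backend, which is why the documentation points to a dedicated calculator.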