From 8a82b7341e361856a67dd69264962863f60283cf Mon Sep 17 00:00:00 2001
From: Marcel Klehr
Date: Thu, 13 Jun 2024 09:50:46 +0200
Subject: [PATCH 1/6] fix(admin/AI): Small additions

Signed-off-by: Marcel Klehr
---
 admin_manual/ai/app_context_chat.rst |  2 ++
 admin_manual/ai/app_llm2.rst         | 22 +++++++++++++++-------
 admin_manual/ai/app_stt_whisper2.rst |  8 +++++---
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/admin_manual/ai/app_context_chat.rst b/admin_manual/ai/app_context_chat.rst
index 53eeb8c2c..b3ccc5813 100644
--- a/admin_manual/ai/app_context_chat.rst
+++ b/admin_manual/ai/app_context_chat.rst
@@ -13,6 +13,8 @@ Together they provide the ContextChat text processing tasks accessible via the :
 The *context_chat* and *context_chat_backend* apps run only open source models and do so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
+This app supports input and output in other languages than English, if the language model supports the language.
+
 Requirements
 ------------
diff --git a/admin_manual/ai/app_llm2.rst b/admin_manual/ai/app_llm2.rst
index 2c78c11e4..e074d3545 100644
--- a/admin_manual/ai/app_llm2.rst
+++ b/admin_manual/ai/app_llm2.rst
@@ -4,17 +4,19 @@ App: Local large language model (llm2)
 .. _ai-app-llm2:
-The *llm2* app is one of the apps that provide text processing functionality using Large language models in Nextcloud and act as a text processing backend for the :ref:`Nextcloud Assistant app`, the *mail* app and :ref:`other apps making use of the core Translation API`. The *llm2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
+The *llm2* app is one of the apps that provide text processing functionality using Large language models in Nextcloud and act as a text processing backend for the :ref:`Nextcloud Assistant app`, the *mail* app and :ref:`other apps making use of the core Text Processing API`. The *llm2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
 This app uses `ctransformers `_ under the hood and is thus compatible with any model in *gguf* format.
 Output quality will differ depending on which model you use, we recommend the following models:
-* `Llama2 7b Chat `_ (Slightly older; good quality; good acclaim)
-* `NeuralBeagle14 7B `_ (Newer; good quality; less well known)
+* `Llama3 8b Instruct `_ (reasonable quality; fast; good acclaim; multilingual output may not be optimal)
+* `Llama3 70B Instruct `_ (good quality; good acclaim; good multilingual output)
+
+This app supports input and output in other languages than English, if the underlying model supports the language.
 Requirements
 ------------
-* This app is built as an External App and thus depends on AppAPI v2.3.0
+* This app is built as an External App and thus depends on AppAPI v2.3.0 or higher
 * Nextcloud AIO is supported
 * Using GPU processing is supported, but not required; be prepared for slow performance unless you are using GPU
 * We currently only support NVIDIA GPUs
@@ -22,9 +24,14 @@ Requirements
 * You will need a GPU with enough VRAM to hold the model you choose
- * for 7B parameter models, 5bit-quantized variants and lower should fit on a 8GB VRAM, but of course have lower quality
- * for 7B parameter models, 6bit-quantized variants and up will need 12GB VRAM
+ * here are some examples:
+
+ * for 8B parameter models, 5bit-quantized variants and lower should fit on a 8GB VRAM, but of course have slightly lower quality
+ * for 8B parameter models, 6bit-quantized variants and up will need 12GB VRAM
+ * for a 70B parameter model 4bit-quantized variants will need ~44GB VRAM
+ * If you want better reasoning capabilities, you will need to look for models with more parameters, like 14B and higher, which of course also need more VRAM
+ * To check whether a model fits on your GPU, you can use this `calculator `_
 * CPU Sizing
@@ -69,7 +76,8 @@ Nextcloud customers should file bugs directly with our Support system.
 Known Limitations
 -----------------
-* We currently only support the English language
+* We currently only support languages that the underlying model supports; correctness of language use in other languages than English may be poor depending on the language's coverage in the model's training data (We recommended model Llama 3 or other models explicitly trained on multiple languages)
+* Language models can be bad at reasoning tasks
 * Language models are likely to generate false information and should thus only be used in situations that are not critical. It's recommended to only use AI at the beginning of a creation process and not at the end, so that outputs of AI serve as a draft for example and not as final product. Always check the output of language models before using it.
 * Make sure to test the language model you are using it for whether it meets the use-case's quality requirements
 * Language models notoriously have a high energy consumption, if you want to reduce load on your server you can choose smaller models or quantized models in excahnge for lower accuracy
diff --git a/admin_manual/ai/app_stt_whisper2.rst b/admin_manual/ai/app_stt_whisper2.rst
index 4ef327dd0..0b901a4d2 100644
--- a/admin_manual/ai/app_stt_whisper2.rst
+++ b/admin_manual/ai/app_stt_whisper2.rst
@@ -6,10 +6,12 @@ App: Local Whisper Speech-To-Text (stt_whisper2)
 The *stt_whisper2* app is one of the apps that provide Speech-To-Text functionality in Nextcloud and act as a media transcription backend for the :ref:`Nextcloud Assistant app`, the *talk* app and :ref:`other apps making use of the core Translation API`. The *stt_whisper2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
+This app supports input and output in other languages than English, if the underlying model supports the language.
+
 This app uses `faster-whisper `_ under the hood.
 Output quality will differ depending on which model you use, we recommend the following models:
- * OpenAI Whisper large-v2 or v3
- * OpenAI Whisper medium.en
+ * OpenAI Whisper large-v2 or v3 (multilingual)
+ * OpenAI Whisper medium.en (only supports English)
 Requirements
 ------------
@@ -44,7 +46,7 @@ Installation
 Supplying alternate models
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
-This app allows supplying alternate LLM models as *gguf* files in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face `_ by simply
+This app allows supplying alternate models in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face `_ by simply
 1. git cloning the respective repository
 2. Copying the folder with the git repository to ``/nc_app_llm2_data`` inside the docker container.

From 8d3184980990245c3d7db3e8842eb2887942a6a5 Mon Sep 17 00:00:00 2001
From: Marcel Klehr
Date: Mon, 24 Jun 2024 12:39:21 +0200
Subject: [PATCH 2/6] Update admin_manual/ai/app_stt_whisper2.rst

Co-authored-by: Josh
Signed-off-by: Marcel Klehr
---
 admin_manual/ai/app_stt_whisper2.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/admin_manual/ai/app_stt_whisper2.rst b/admin_manual/ai/app_stt_whisper2.rst
index 0b901a4d2..5b51d1fc6 100644
--- a/admin_manual/ai/app_stt_whisper2.rst
+++ b/admin_manual/ai/app_stt_whisper2.rst
@@ -46,7 +46,7 @@ Installation
 Supplying alternate models
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
-This app allows supplying alternate models in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face `_ by simply
+This app allows supplying alternate models in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face `_ in the following way:
 1. git cloning the respective repository
 2. Copying the folder with the git repository to ``/nc_app_llm2_data`` inside the docker container.

From 5c881f087ce83f9c493d47ea2c313872520bfe26 Mon Sep 17 00:00:00 2001
From: Marcel Klehr
Date: Mon, 24 Jun 2024 12:39:31 +0200
Subject: [PATCH 3/6] Update admin_manual/ai/app_stt_whisper2.rst

Co-authored-by: Josh
Signed-off-by: Marcel Klehr
---
 admin_manual/ai/app_stt_whisper2.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/admin_manual/ai/app_stt_whisper2.rst b/admin_manual/ai/app_stt_whisper2.rst
index 5b51d1fc6..1557036ab 100644
--- a/admin_manual/ai/app_stt_whisper2.rst
+++ b/admin_manual/ai/app_stt_whisper2.rst
@@ -11,7 +11,7 @@ This app supports input and output in other languages than English, if the under
 This app uses `faster-whisper `_ under the hood.
 Output quality will differ depending on which model you use, we recommend the following models:
  * OpenAI Whisper large-v2 or v3 (multilingual)
- * OpenAI Whisper medium.en (only supports English)
+ * OpenAI Whisper medium.en (English only)
 Requirements
 ------------

From 2995d98bf375da6947f59b832224fe85d3a80b83 Mon Sep 17 00:00:00 2001
From: Marcel Klehr
Date: Mon, 24 Jun 2024 12:39:41 +0200
Subject: [PATCH 4/6] Update admin_manual/ai/app_llm2.rst

Co-authored-by: Josh
Signed-off-by: Marcel Klehr
---
 admin_manual/ai/app_llm2.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/admin_manual/ai/app_llm2.rst b/admin_manual/ai/app_llm2.rst
index e074d3545..67b4d68d6 100644
--- a/admin_manual/ai/app_llm2.rst
+++ b/admin_manual/ai/app_llm2.rst
@@ -24,7 +24,7 @@ Requirements
 * You will need a GPU with enough VRAM to hold the model you choose
- * here are some examples:
+ * Some examples:
  * for 8B parameter models, 5bit-quantized variants and lower should fit on a 8GB VRAM, but of course have slightly lower quality
  * for 8B parameter models, 6bit-quantized variants and up will need 12GB VRAM

From 692751789993b77b6778e076910aa339df8473ce Mon Sep 17 00:00:00 2001
From: Marcel Klehr
Date: Mon, 24 Jun 2024 12:39:58 +0200
Subject: [PATCH 5/6] Update admin_manual/ai/app_context_chat.rst

Co-authored-by: Josh
Signed-off-by: Marcel Klehr
---
 admin_manual/ai/app_context_chat.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/admin_manual/ai/app_context_chat.rst b/admin_manual/ai/app_context_chat.rst
index b3ccc5813..97e9d3367 100644
--- a/admin_manual/ai/app_context_chat.rst
+++ b/admin_manual/ai/app_context_chat.rst
@@ -13,7 +13,7 @@ Together they provide the ContextChat text processing tasks accessible via the :
 The *context_chat* and *context_chat_backend* apps run only open source models and do so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
-This app supports input and output in other languages than English, if the language model supports the language.
+This app supports input and output in languages other than English if the language model supports the language.
 Requirements
 ------------

From fe598387c347f880b7394aba93de5e35299ddbde Mon Sep 17 00:00:00 2001
From: Marcel Klehr
Date: Mon, 24 Jun 2024 12:40:25 +0200
Subject: [PATCH 6/6] Apply suggestions from code review

Co-authored-by: Josh
Signed-off-by: Marcel Klehr
---
 admin_manual/ai/app_llm2.rst         | 4 ++--
 admin_manual/ai/app_stt_whisper2.rst | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/admin_manual/ai/app_llm2.rst b/admin_manual/ai/app_llm2.rst
index 67b4d68d6..28131a1dc 100644
--- a/admin_manual/ai/app_llm2.rst
+++ b/admin_manual/ai/app_llm2.rst
@@ -11,7 +11,7 @@ This app uses `ctransformers `_ under the hood
 * `Llama3 8b Instruct `_ (reasonable quality; fast; good acclaim; multilingual output may not be optimal)
 * `Llama3 70B Instruct `_ (good quality; good acclaim; good multilingual output)
-This app supports input and output in other languages than English, if the underlying model supports the language.
+This app supports input and output in languages other than English if the underlying model supports the language.
 Requirements
 ------------
@@ -76,7 +76,7 @@ Nextcloud customers should file bugs directly with our Support system.
 Known Limitations
 -----------------
-* We currently only support languages that the underlying model supports; correctness of language use in other languages than English may be poor depending on the language's coverage in the model's training data (We recommended model Llama 3 or other models explicitly trained on multiple languages)
+* We currently only support languages that the underlying model supports; correctness of language use in languages other than English may be poor depending on the language's coverage in the model's training data (We recommended model Llama 3 or other models explicitly trained on multiple languages)
 * Language models can be bad at reasoning tasks
 * Language models are likely to generate false information and should thus only be used in situations that are not critical. It's recommended to only use AI at the beginning of a creation process and not at the end, so that outputs of AI serve as a draft for example and not as final product. Always check the output of language models before using it.
 * Make sure to test the language model you are using it for whether it meets the use-case's quality requirements
diff --git a/admin_manual/ai/app_stt_whisper2.rst b/admin_manual/ai/app_stt_whisper2.rst
index 1557036ab..0fddadad5 100644
--- a/admin_manual/ai/app_stt_whisper2.rst
+++ b/admin_manual/ai/app_stt_whisper2.rst
@@ -6,7 +6,7 @@ App: Local Whisper Speech-To-Text (stt_whisper2)
 The *stt_whisper2* app is one of the apps that provide Speech-To-Text functionality in Nextcloud and act as a media transcription backend for the :ref:`Nextcloud Assistant app`, the *talk* app and :ref:`other apps making use of the core Translation API`. The *stt_whisper2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request, please talk to your account manager for the possibilities.
-This app supports input and output in other languages than English, if the underlying model supports the language.
+This app supports input and output in languages other than English if the underlying model supports the language.
 This app uses `faster-whisper `_ under the hood.
 Output quality will differ depending on which model you use, we recommend the following models: