Merge pull request #11907 from nextcloud/fix/ai/small-additions2
fix(admin/AI): Small additions
@@ -13,6 +13,8 @@ Together they provide the ContextChat text processing tasks accessible via the :

The *context_chat* and *context_chat_backend* apps run only open source models and do so entirely on-premises. Nextcloud can provide customer support upon request; please talk to your account manager about the possibilities.

+This app supports input and output in languages other than English if the language model supports the language.
+
Requirements
------------

@@ -4,17 +4,19 @@ App: Local large language model (llm2)

.. _ai-app-llm2:

-The *llm2* app is one of the apps that provide text processing functionality using large language models in Nextcloud and act as a text processing backend for the :ref:`Nextcloud Assistant app<ai-app-assistant>`, the *mail* app and :ref:`other apps making use of the core Translation API<tp-consumer-apps>`. The *llm2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request; please talk to your account manager about the possibilities.
+The *llm2* app is one of the apps that provide text processing functionality using large language models in Nextcloud and act as a text processing backend for the :ref:`Nextcloud Assistant app<ai-app-assistant>`, the *mail* app and :ref:`other apps making use of the core Text Processing API<tp-consumer-apps>`. The *llm2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request; please talk to your account manager about the possibilities.

This app uses `ctransformers <https://github.com/marella/ctransformers>`_ under the hood and is thus compatible with any model in *gguf* format (a short loading sketch follows the model list). Output quality will differ depending on which model you use; we recommend the following models:

-* `Llama2 7b Chat <https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF>`_ (slightly older; good quality; good acclaim)
-* `NeuralBeagle14 7B <https://huggingface.co/mlabonne/NeuralBeagle14-7B-GGUF>`_ (newer; good quality; less well known)
+* `Llama3 8b Instruct <https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF>`_ (reasonable quality; fast; good acclaim; multilingual output may not be optimal)
+* `Llama3 70B Instruct <https://huggingface.co/QuantFactory/Meta-Llama-3-70B-Instruct-GGUF>`_ (good quality; good acclaim; good multilingual output)
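
To make the *gguf* compatibility concrete, here is a minimal, hypothetical loading sketch with *ctransformers*, not the llm2 app's actual wiring; the quantized file name, ``gpu_layers`` value and prompt are illustrative assumptions:

.. code-block:: python

    from ctransformers import AutoModelForCausalLM

    # Load a quantized *gguf* model from a Hugging Face repo; the file name
    # is one of the quantization variants published in that repo.
    llm = AutoModelForCausalLM.from_pretrained(
        "TheBloke/Llama-2-7B-Chat-GGUF",
        model_file="llama-2-7b-chat.Q5_K_M.gguf",  # 5bit-quantized variant
        model_type="llama",
        gpu_layers=50,  # offload layers to the GPU; use 0 for CPU-only
    )
    print(llm("Summarize in one sentence: Nextcloud is a self-hosted collaboration platform."))

The same pattern applies to any of the recommended models above by swapping the repo and file name.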

+This app supports input and output in languages other than English if the underlying model supports the language.
+
Requirements
------------

-* This app is built as an External App and thus depends on AppAPI v2.3.0
+* This app is built as an External App and thus depends on AppAPI v2.3.0 or higher
* Nextcloud AIO is supported
* Using GPU processing is supported, but not required; be prepared for slow performance unless you are using a GPU
* We currently only support NVIDIA GPUs
@@ -22,9 +24,14 @@ Requirements

* You will need a GPU with enough VRAM to hold the model you choose

-  * for 7B parameter models, 5bit-quantized variants and lower should fit in 8GB of VRAM, but of course have lower quality
-  * for 7B parameter models, 6bit-quantized variants and up will need 12GB of VRAM
+  * Some examples (see the back-of-the-envelope sketch after this list):
+
+    * for 8B parameter models, 5bit-quantized variants and lower should fit in 8GB of VRAM, but of course have slightly lower quality
+    * for 8B parameter models, 6bit-quantized variants and up will need 12GB of VRAM
+    * for a 70B parameter model, 4bit-quantized variants will need ~44GB of VRAM
+
+  * If you want better reasoning capabilities, you will need to look for models with more parameters, like 14B and higher, which of course also need more VRAM
  * To check whether a model fits on your GPU, you can use this `calculator <https://rahulschand.github.io/gpu_poor/>`_

* CPU Sizing
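
Regarding the GPU sizing bullets above: an assumption-laden back-of-the-envelope check, not the linked calculator's exact method. Quantized weights alone need roughly ``parameters × bits / 8`` bytes; the KV cache and runtime overhead come on top, which is why the recommendations above are higher than these weights-only numbers:

.. code-block:: python

    # Weights-only lower bound on VRAM for a quantized model.
    # Real usage adds KV cache and runtime overhead on top.
    def weights_gib(params_billion: float, bits: int) -> float:
        return params_billion * 1e9 * bits / 8 / 1024**3

    print(f"8B  @ 5-bit: {weights_gib(8, 5):.1f} GiB")   # ~4.7 GiB -> fits an 8GB card
    print(f"8B  @ 6-bit: {weights_gib(8, 6):.1f} GiB")   # ~5.6 GiB -> 12GB card advised
    print(f"70B @ 4-bit: {weights_gib(70, 4):.1f} GiB")  # ~32.6 GiB -> ~44GB with overhead
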
@@ -69,7 +76,8 @@ Nextcloud customers should file bugs directly with our Support system.

Known Limitations
-----------------

-* We currently only support the English language
+* We currently only support languages that the underlying model supports; correctness of language use in languages other than English may be poor, depending on the language's coverage in the model's training data (we recommend Llama 3 or other models explicitly trained on multiple languages)
* Language models can be bad at reasoning tasks
* Language models are likely to generate false information and should thus only be used in situations that are not critical. It's recommended to use AI only at the beginning of a creation process and not at the end, so that AI output serves as a draft, for example, and not as the final product. Always check the output of language models before using it.
* Make sure to test whether the language model you are using meets the use case's quality requirements
* Language models notoriously have high energy consumption; if you want to reduce load on your server, you can choose smaller or quantized models in exchange for lower accuracy

@@ -6,10 +6,12 @@ App: Local Whisper Speech-To-Text (stt_whisper2)

The *stt_whisper2* app is one of the apps that provide Speech-To-Text functionality in Nextcloud and act as a media transcription backend for the :ref:`Nextcloud Assistant app<ai-app-assistant>`, the *talk* app and :ref:`other apps making use of the core Speech-To-Text API<stt-consumer-apps>`. The *stt_whisper2* app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request; please talk to your account manager about the possibilities.

+This app supports input and output in languages other than English if the underlying model supports the language.
+
This app uses `faster-whisper <https://github.com/SYSTRAN/faster-whisper>`_ under the hood (a transcription sketch follows the model list). Output quality will differ depending on which model you use; we recommend the following models:

-* OpenAI Whisper large-v2 or v3
-* OpenAI Whisper medium.en
+* OpenAI Whisper large-v2 or v3 (multilingual)
+* OpenAI Whisper medium.en (English only)
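
For orientation, a minimal transcription sketch with *faster-whisper*; the model name, device settings and audio file are illustrative assumptions, not the app's actual configuration:

.. code-block:: python

    from faster_whisper import WhisperModel

    # "large-v3" is multilingual; "medium.en" is English-only and lighter.
    # Use device="cpu" (and e.g. compute_type="int8") without a GPU.
    model = WhisperModel("large-v3", device="cuda", compute_type="float16")

    segments, info = model.transcribe("meeting.mp3", beam_size=5)
    print(f"Detected language: {info.language}")
    for segment in segments:
        print(f"[{segment.start:.1f}s -> {segment.end:.1f}s] {segment.text}")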
Requirements
------------
@@ -44,7 +46,7 @@ Installation

Supplying alternate models
~~~~~~~~~~~~~~~~~~~~~~~~~~

-This app allows supplying alternate LLM models as *gguf* files in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face <https://huggingface.co/Systran>`_ by simply
+This app allows supplying alternate models in the ``/nc_app_llm2_data`` directory of the docker container. You can use any `*faster-whisper* model by Systran on hugging face <https://huggingface.co/Systran>`_ in the following way (a scripted alternative follows the steps):

1. Git clone the respective repository
2. Copy the folder with the git repository to ``/nc_app_llm2_data`` inside the docker container.
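
As a scripted alternative, assuming the data directory is reachable from where Python runs (e.g. via a mounted volume), the same result can be achieved with ``huggingface_hub``; the model and target path here are illustrative:

.. code-block:: python

    from huggingface_hub import snapshot_download

    # Download a Systran faster-whisper model straight into the app's
    # data directory instead of git-cloning and copying it manually.
    snapshot_download(
        repo_id="Systran/faster-whisper-large-v3",
        local_dir="/nc_app_llm2_data/faster-whisper-large-v3",
    )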