mirror of https://github.com/open-webui/docs.git synced 2026-01-02 17:59:41 +07:00

silentoplayz 2c91ef94ce Update openedai-speech-integration.md

Updates

2024-06-09 17:46:23 +00:00

sidebar_position	title
11	Integrating OpenedAI-Speech with Open WebUI using Docker Desktop

Integrating `openedai-speech` into Open WebUI using Docker Desktop

What is `openedai-speech`?

openedai-speech is an OpenAI API-compatible text-to-speech server that uses Coqui AI's xtts_v2 and/or Piper TTS as the backend. It is free and private, supports custom voice cloning, and implements the OpenAI audio/speech API.

Prerequisites

Docker Desktop installed on your system
Open WebUI running in a Docker container
A basic understanding of Docker and Docker Compose

Option 1: Using Docker Compose

Step 1: Create a new folder for the `openedai-speech` service

Create a new folder, for example, openedai-speech-service, to store the docker-compose.yml and .env files.
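
As a minimal sketch, the folder can be created and entered from a terminal. The folder name is the example name used above; any other name works just as well.

```shell
# Create the working folder (no error if it already exists) and move into it
mkdir -p openedai-speech-service
cd openedai-speech-service
```

The remaining Docker Compose steps assume the commands are run from inside this folder.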

Step 2: Create a `docker-compose.yml` file

In the openedai-speech-service folder, create a new file named docker-compose.yml with the following contents:

services:
  server:
    image: ghcr.io/matatonic/openedai-speech
    container_name: openedai-speech
    env_file: .env
    ports:
      - "8000:8000"
    volumes:
      - tts-voices:/app/voices
      - tts-config:/app/config
    # labels:
    #   - "com.centurylinklabs.watchtower.enable=true"
    restart: unless-stopped

volumes:
  tts-voices:
  tts-config:

Step 3: Create a `.env` file (optional)

In the same openedai-speech-service folder, create a new file named .env with the following contents:

TTS_HOME=voices
HF_HOME=voices
#PRELOAD_MODEL=xtts
#PRELOAD_MODEL=xtts_v2.0.2
#PRELOAD_MODEL=parler-tts/parler_tts_mini_v0.1

Step 4: Run `docker compose` to start the `openedai-speech` service

Run the following command in the openedai-speech-service folder to start the openedai-speech service in detached mode:

docker compose up -d

This will start the openedai-speech service in the background.
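
To confirm that the container actually came up, a small helper along the following lines may be useful. The function name `tts_status` is made up for this sketch; `server` is the service name from the docker-compose.yml above.

```shell
# Hypothetical helper: show the state and recent logs of the "server" service.
# Run it from the openedai-speech-service folder.
tts_status() {
  docker compose ps server              # container state and published ports
  docker compose logs --tail 20 server  # last 20 log lines
}
```

Calling `tts_status` should list the openedai-speech container with port 8000 published. The first start can take a while if models are still being downloaded.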

Option 2: Using Docker Run Commands

You can also use the following Docker run commands to start the openedai-speech service in detached mode:

With GPU (Nvidia) support:

docker run -d --gpus=all -p 8000:8000 -v tts-voices:/app/voices -v tts-config:/app/config --name openedai-speech ghcr.io/matatonic/openedai-speech:latest

Alternative without GPU support:

docker run -d -p 8000:8000 -v tts-voices:/app/voices -v tts-config:/app/config --name openedai-speech ghcr.io/matatonic/openedai-speech-min:latest

Step 5: Configure Open WebUI to use `openedai-speech`

Open the Open WebUI settings and navigate to the TTS Settings under Admin Panel > Settings > Audio. Add the following configuration:

API Base URL: http://host.docker.internal:8000/v1
API Key: sk-111111111 (a dummy value; openedai-speech doesn't require an API key, so any value works in this field)

Step 6: Choose a voice

Under Set Voice, you can choose from the following voices:

alloy
echo
echo-alt
fable
onyx
nova
shimmer
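
Any of these voice names can also be tried directly against the server from the host, independently of Open WebUI. The sketch below assumes the server is reachable at http://localhost:8000 from the host, that it accepts the OpenAI-style model name `tts-1`, and that it returns MP3 audio by default. The helper names `speech_payload` and `speak` are made up for illustration.

```shell
# Base URL as seen from the host (assumption: default port mapping 8000:8000)
BASE_URL="${BASE_URL:-http://localhost:8000/v1}"

# Build the JSON body for POST /audio/speech.
# $1 = voice, $2 = text (this simple sketch does not escape quotes or backslashes)
speech_payload() {
  printf '{"model":"tts-1","voice":"%s","input":"%s"}' "$1" "$2"
}

# Send the request and save the returned audio.
# $1 = voice, $2 = text, $3 = output file (defaults to speech.mp3)
speak() {
  curl -sS -f "$BASE_URL/audio/speech" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer sk-111111111" \
    -d "$(speech_payload "$1" "$2")" \
    -o "${3:-speech.mp3}"
}

# Example call: speak alloy "Hello from Open WebUI" hello.mp3
```

If the call succeeds and the output file plays, the server side is working and any remaining problem is likely in the Open WebUI configuration.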

Step 7: Enjoy natural-sounding voices

You should now be able to use the openedai-speech integration with Open WebUI to generate natural-sounding speech.

Troubleshooting

If you encounter any issues, make sure that:

The openedai-speech service is running and exposed on port 8000.
The host.docker.internal hostname is resolvable from within the Open WebUI container. This hostname is required because openedai-speech is exposed via localhost on your PC, which open-webui cannot normally reach from inside its own container.
The API key is set to a dummy value, as openedai-speech doesn't require an API key.
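
The first two checks can be sketched as shell helpers. The container name `open-webui` is an assumption (check yours with `docker ps`), the function names are made up, and `getent` is assumed to be available inside the Open WebUI image.

```shell
# Assumed name of the Open WebUI container; override it if yours differs
WEBUI_CONTAINER="${WEBUI_CONTAINER:-open-webui}"

# 1. Is the openedai-speech container running, and which ports does it publish?
check_tts_container() {
  docker ps --filter "name=openedai-speech" --format '{{.Names}} {{.Ports}}'
}

# 2. Does host.docker.internal resolve inside the Open WebUI container?
check_host_alias() {
  docker exec "$WEBUI_CONTAINER" getent hosts host.docker.internal
}
```

An empty result from `check_tts_container` suggests the service is not running. A failing `check_host_alias` points at the hostname not being resolvable from inside the container.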

Additional Resources

For more information on openedai-speech, see the GitHub repository: https://github.com/matatonic/openedai-speech

Note: You can change the host port (the left-hand number in the ports mapping in docker-compose.yml) to any open and usable port, but make sure to update the API Base URL under Admin Panel > Settings > Audio in Open WebUI accordingly.
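
As an illustration of how the two settings stay in sync, the snippet below derives the API Base URL from a chosen host port. Port 8001 is only an example (it would correspond to a "8001:8000" mapping), and `api_base_url` is a made-up helper name.

```shell
# Print the API Base URL that matches a given host port
api_base_url() {
  printf 'http://host.docker.internal:%s/v1\n' "$1"
}

api_base_url 8001   # prints http://host.docker.internal:8001/v1
```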
