mirror of
https://github.com/docker/docs.git
synced 2026-03-27 06:18:55 +07:00
dmr: split up content (#23227)
## Description

Split up the content for DMR for better UX and maintainability.

@@ -16,7 +16,7 @@ aliases:
{{< summary-bar feature_name="Docker Model Runner" >}}

-Docker Model Runner makes it easy to manage, run, and
+Docker Model Runner (DMR) makes it easy to manage, run, and
deploy AI models using Docker. Designed for developers,
Docker Model Runner streamlines the process of pulling, running, and serving
large language models (LLMs) and other AI models directly from Docker Hub or any

@@ -39,7 +39,7 @@ with AI models locally.
- Package GGUF files as OCI Artifacts and publish them to any Container Registry
- Run and interact with AI models directly from the command line or from the Docker Desktop GUI
- Manage local models and display logs
-- Display prompts and responses details
+- Display prompt and response details

## Requirements

@@ -75,14 +75,14 @@ Docker Engine only:
{{< /tab >}}
{{< /tabs >}}

-## How it works
+## How Docker Model Runner works

Models are pulled from Docker Hub the first time you use them and are stored
locally. They load into memory only at runtime when a request is made, and
unload when not in use to optimize resources. Because models can be large, the
initial pull may take some time. After that, they're cached locally for faster
access. You can interact with the model using
-[OpenAI-compatible APIs](#what-api-endpoints-are-available).
+[OpenAI-compatible APIs](api-reference.md).

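For example, the caching behavior is visible from the CLI: the first pull downloads
the model, and later runs reuse the local copy (a minimal sketch using the
`ai/smollm2` model from the examples below):

```console
# First use: downloads the model from Docker Hub (may take a while)
$ docker model pull ai/smollm2

# Subsequent runs load the cached copy; the model unloads again when idle
$ docker model run ai/smollm2 "Summarize what Docker Model Runner does."
```
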
> [!TIP]
>
@@ -92,569 +92,6 @@ access. You can interact with the model using
> [Docker Compose](/manuals/ai/compose/models-and-compose.md) now support Docker
> Model Runner.

## Enable Docker Model Runner

### Enable DMR in Docker Desktop

1. In the settings view, go to the **Beta features** tab.
1. Select the **Enable Docker Model Runner** setting.
1. If you use Windows with a supported NVIDIA GPU, you also see and can select
   **Enable GPU-backed inference**.
1. Optional: To enable TCP support, select **Enable host-side TCP support**.
1. In the **Port** field, type the port you want to use.
1. If you interact with Model Runner from a local frontend web app, in
   **CORS Allowed Origins**, select the origins that Model Runner should
   accept requests from. An origin is the URL where your web app runs, for
   example `http://localhost:3131`.

You can now use the `docker model` command in the CLI and view and interact
with your local models in the **Models** tab in the Docker Desktop Dashboard.

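For example, to confirm the CLI integration is active, list your local models
(a minimal check; the exact output columns can vary by version):

```console
$ docker model ls
```
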
> [!IMPORTANT]
>
> For Docker Desktop versions 4.41 and earlier, this setting was under the
> **Experimental features** tab on the **Features in development** page.

### Enable DMR in Docker Engine

1. Ensure you have installed [Docker Engine](/engine/install/).
1. DMR is available as a package. To install it, run:

   {{< tabs >}}
   {{< tab name="Ubuntu/Debian">}}

   ```console
   $ sudo apt-get update
   $ sudo apt-get install docker-model-plugin
   ```

   {{< /tab >}}
   {{< tab name="RPM-based distributions">}}

   ```console
   $ sudo dnf update
   $ sudo dnf install docker-model-plugin
   ```

   {{< /tab >}}
   {{< /tabs >}}

1. Test the installation:

   ```console
   $ docker model version
   $ docker model run ai/smollm2
   ```

> [!NOTE]
> TCP support is enabled by default for Docker Engine on port `12434`.

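Because TCP is on by default, you can sanity-check the runner from the host with a
plain HTTP call (a sketch assuming the default port; the endpoint is described in
the API section of this page):

```console
$ curl http://localhost:12434/engines/llama.cpp/v1/models
```
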
### Update DMR in Docker Engine

To update Docker Model Runner in Docker Engine, uninstall it with
[`docker model uninstall-runner`](/reference/cli/docker/model/uninstall-runner/)
then reinstall it:

```console
$ docker model uninstall-runner --images && docker model install-runner
```

> [!NOTE]
> With the above command, local models are preserved.
> To delete the models during the upgrade, add the `--models` option to the
> `uninstall-runner` command.

## Pull a model

Models are cached locally.

> [!NOTE]
>
> When you use the Docker CLI, you can also pull models directly from
> [HuggingFace](https://huggingface.co/).

{{< tabs group="release" >}}
{{< tab name="From Docker Desktop">}}

1. Select **Models** and select the **Docker Hub** tab.
1. Find the model you want and select **Pull**.

![screenshot showing the Docker Hub view.](./images/dmr-catalog.png)

{{< /tab >}}
{{< tab name="From the Docker CLI">}}

Use the [`docker model pull` command](/reference/cli/docker/model/pull/).
For example:

```bash {title="Pulling from Docker Hub"}
docker model pull ai/smollm2:360M-Q4_K_M
```

```bash {title="Pulling from HuggingFace"}
docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```

{{< /tab >}}
{{< /tabs >}}

## Run a model

{{< tabs group="release" >}}
{{< tab name="From Docker Desktop">}}

1. Select **Models** and select the **Local** tab.
1. Select the play button. The interactive chat screen opens.

![screenshot showing the Local view.](./images/dmr-run.png)

{{< /tab >}}
{{< tab name="From the Docker CLI" >}}

Use the [`docker model run` command](/reference/cli/docker/model/run/).

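For example, you can start an interactive chat session or pass a one-shot prompt
(a sketch reusing the model pulled above):

```console
# Interactive chat session
$ docker model run ai/smollm2

# One-shot prompt
$ docker model run ai/smollm2 "Give me a fact about whales."
```
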
{{< /tab >}}
{{< /tabs >}}

## Troubleshooting

### Display the logs

To troubleshoot issues, display the logs:

{{< tabs group="release" >}}
{{< tab name="From Docker Desktop">}}

Select **Models** and select the **Logs** tab.

![screenshot showing the Logs view.](./images/dmr-logs.png)

{{< /tab >}}
{{< tab name="From the Docker CLI">}}

Use the [`docker model logs` command](/reference/cli/docker/model/logs/).

{{< /tab >}}
{{< /tabs >}}

### Inspect requests and responses

Inspecting requests and responses helps you diagnose model-related issues.
For example, you can evaluate context usage to verify you stay within the model's context
window, or display the full body of a request to control the parameters you are passing to your models
when developing with a framework.

In Docker Desktop, to inspect the requests and responses for each model:

1. Select **Models** and select the **Requests** tab. This view displays all the requests to all models:
   - The time the request was sent
   - The model name and version
   - The prompt/request
   - The context usage
   - The time it took for the response to be generated
2. Select one of the requests to display further details:
   - In the **Overview** tab, view the token usage, response metadata and generation speed, and the actual prompt and response.
   - In the **Request** and **Response** tabs, view the full JSON payload of the request and the response.

> [!NOTE]
> You can also display the requests for a specific model when you select a model and then select the **Requests** tab.

## Publish a model

> [!NOTE]
>
> This works for any Container Registry supporting OCI Artifacts, not only
> Docker Hub.

You can tag existing models with a new name and publish them under a different
namespace and repository:

```console
# Tag a pulled model under a new name
$ docker model tag ai/smollm2 myorg/smollm2

# Push it to Docker Hub
$ docker model push myorg/smollm2
```

For more details, see the [`docker model tag`](/reference/cli/docker/model/tag)
and [`docker model push`](/reference/cli/docker/model/push) command
documentation.

You can also package a model file in GGUF format as an OCI Artifact and publish
it to Docker Hub:

```console
# Download a model file in GGUF format, for example from HuggingFace
$ curl -L -o model.gguf https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/resolve/main/mistral-7b-v0.1.Q4_K_M.gguf

# Package it as OCI Artifact and push it to Docker Hub
$ docker model package --gguf "$(pwd)/model.gguf" --push myorg/mistral-7b-v0.1:Q4_K_M
```

For more details, see the
[`docker model package`](/reference/cli/docker/model/package/) command
documentation.

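To double-check that a publish succeeded, you can pull the artifact back under its
new name (reusing the example names from the snippets above, which assume you can
push to the `myorg` namespace):

```console
$ docker model pull myorg/mistral-7b-v0.1:Q4_K_M
```
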
## Example: Integrate Docker Model Runner into your software development lifecycle

### Sample project

You can now start building your generative AI application powered by Docker
Model Runner.

If you want to try an existing GenAI application, follow these steps:

1. Set up the sample app. Clone and run the following repository:

   ```console
   $ git clone https://github.com/docker/hello-genai.git
   ```

1. In your terminal, go to the `hello-genai` directory.

1. Run `run.sh` to pull the chosen model and run the app.

1. Open your app in the browser at the addresses specified in the repository
   [README](https://github.com/docker/hello-genai).

You see the GenAI app's interface where you can start typing your prompts.

You can now interact with your own GenAI app, powered by a local model. Try a
few prompts and notice how fast the responses are — all running on your machine
with Docker.

### Use Model Runner in GitHub Actions

Here is an example of how to use Model Runner as part of a GitHub workflow.
The example installs Model Runner, tests the installation, pulls and runs a
model, interacts with the model via the API, and deletes the model.

```yaml {title="dmr-run.yml", collapse=true}
name: Docker Model Runner Example Workflow

permissions:
  contents: read

on:
  workflow_dispatch:
    inputs:
      test_model:
        description: 'Model to test with (default: ai/smollm2:360M-Q4_K_M)'
        required: false
        type: string
        default: 'ai/smollm2:360M-Q4_K_M'

jobs:
  dmr-test:
    runs-on: ubuntu-latest
    timeout-minutes: 30

    steps:
      - name: Set up Docker
        uses: docker/setup-docker-action@v4

      - name: Install docker-model-plugin
        run: |
          echo "Installing docker-model-plugin..."
          # Add Docker's official GPG key:
          sudo apt-get update
          sudo apt-get install ca-certificates curl
          sudo install -m 0755 -d /etc/apt/keyrings
          sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
          sudo chmod a+r /etc/apt/keyrings/docker.asc

          # Add the repository to Apt sources:
          echo \
            "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
            $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
            sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
          sudo apt-get update
          sudo apt-get install -y docker-model-plugin

          echo "Installation completed successfully"

      - name: Test docker model version
        run: |
          echo "Testing docker model version command..."
          sudo docker model version

          # Verify the command returns successfully
          if [ $? -eq 0 ]; then
            echo "✅ docker model version command works correctly"
          else
            echo "❌ docker model version command failed"
            exit 1
          fi

      - name: Test model pull and run
        run: |
          MODEL="${{ github.event.inputs.test_model || 'ai/smollm2:360M-Q4_K_M' }}"
          echo "Testing with model: $MODEL"

          # Test model pull
          echo "Pulling model..."
          sudo docker model pull "$MODEL"

          if [ $? -eq 0 ]; then
            echo "✅ Model pull successful"
          else
            echo "❌ Model pull failed"
            exit 1
          fi

          # Test basic model run (with timeout to avoid hanging)
          echo "Testing docker model run..."
          timeout 60s sudo docker model run "$MODEL" "Give me a fact about whales." || {
            exit_code=$?
            if [ $exit_code -eq 124 ]; then
              echo "✅ Model run test completed (timed out as expected for non-interactive test)"
            else
              echo "❌ Model run failed with exit code: $exit_code"
              exit 1
            fi
          }

      - name: Test API endpoint
        run: |
          MODEL="${{ github.event.inputs.test_model || 'ai/smollm2:360M-Q4_K_M' }}"
          echo "Testing API endpoint with model: $MODEL"

          # Test API call with curl
          echo "Testing API call..."
          RESPONSE=$(curl -s http://localhost:12434/engines/llama.cpp/v1/chat/completions \
            -H "Content-Type: application/json" \
            -d "{
              \"model\": \"$MODEL\",
              \"messages\": [
                {
                  \"role\": \"user\",
                  \"content\": \"Say hello\"
                }
              ],
              \"top_k\": 1,
              \"temperature\": 0
            }")

          if [ $? -eq 0 ]; then
            echo "✅ API call successful"
            echo "Response received: $RESPONSE"

            # Check if response contains "hello" (case-insensitive)
            if echo "$RESPONSE" | grep -qi "hello"; then
              echo "✅ Response contains 'hello' (case-insensitive)"
            else
              echo "❌ Response does not contain 'hello'"
              echo "Full response: $RESPONSE"
              exit 1
            fi
          else
            echo "❌ API call failed"
            exit 1
          fi

      - name: Test model cleanup
        run: |
          MODEL="${{ github.event.inputs.test_model || 'ai/smollm2:360M-Q4_K_M' }}"

          echo "Cleaning up test model..."
          sudo docker model rm "$MODEL" || echo "Model removal failed or model not found"

          # Verify model was removed
          echo "Verifying model cleanup..."
          sudo docker model ls

          echo "✅ Model cleanup completed"

      - name: Report success
        if: success()
        run: |
          echo "🎉 Docker Model Runner daily health check completed successfully!"
          echo "All tests passed:"
          echo " ✅ docker-model-plugin installation successful"
          echo " ✅ docker model version command working"
          echo " ✅ Model pull and run operations successful"
          echo " ✅ API endpoint operations successful"
          echo " ✅ Cleanup operations successful"
```

## FAQs

### What models are available?

All the available models are hosted in the [public Docker Hub namespace of `ai`](https://hub.docker.com/u/ai).

### What CLI commands are available?

See [the reference docs](/reference/cli/docker/model/).

### What API endpoints are available?

Once the feature is enabled, new API endpoints are available under the following base URLs:

{{< tabs >}}
{{< tab name="Docker Desktop">}}

- From containers: `http://model-runner.docker.internal/`
- From host processes: `http://localhost:12434/`, assuming TCP host access is
  enabled on the default port (12434).

{{< /tab >}}
{{< tab name="Docker Engine">}}

- From containers: `http://172.17.0.1:12434/` (with `172.17.0.1` representing the host gateway address)
- From host processes: `http://localhost:12434/`

> [!NOTE]
> The `172.17.0.1` interface may not be available by default to containers
> within a Compose project.
> In this case, add an `extra_hosts` directive to your Compose service YAML:
>
> ```yaml
> extra_hosts:
>   - "model-runner.docker.internal:host-gateway"
> ```
> Then you can access the Docker Model Runner APIs at http://model-runner.docker.internal:12434/

{{< /tab >}}
{{< /tabs >}}

Docker Model management endpoints:

```text
POST /models/create
GET /models
GET /models/{namespace}/{name}
DELETE /models/{namespace}/{name}
```

OpenAI endpoints:

```text
GET /engines/llama.cpp/v1/models
GET /engines/llama.cpp/v1/models/{namespace}/{name}
POST /engines/llama.cpp/v1/chat/completions
POST /engines/llama.cpp/v1/completions
POST /engines/llama.cpp/v1/embeddings
```

To call these endpoints via a Unix socket (`/var/run/docker.sock`), prefix their path
with `/exp/vDD4.40`.

> [!NOTE]
> You can omit `llama.cpp` from the path. For example: `POST /engines/v1/chat/completions`.

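For example, combining the socket path and the prefix, the following sketch lists
local models through the Docker socket (assuming the default Docker Desktop socket
location used in the examples below):

```bash
#!/bin/sh

curl --unix-socket $HOME/.docker/run/docker.sock \
    localhost/exp/vDD4.40/models
```
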
### How do I interact through the OpenAI API?

#### From within a container

To call the `chat/completions` OpenAI endpoint from within another container using `curl`:

```bash
#!/bin/sh

curl http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ai/smollm2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Please write 500 words about the fall of Rome."
            }
        ]
    }'
```

#### From the host using TCP

To call the `chat/completions` OpenAI endpoint from the host via TCP:

1. Enable the host-side TCP support from the Docker Desktop GUI, or via the [Docker Desktop CLI](/manuals/desktop/features/desktop-cli.md).
   For example: `docker desktop enable model-runner --tcp <port>`.

   If you are running on Windows, also enable GPU-backed inference.
   See [Enable Docker Model Runner](#enable-dmr-in-docker-desktop).

2. Interact with it as documented in the previous section using `localhost` and the correct port.

   ```bash
   #!/bin/sh

   curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
       -H "Content-Type: application/json" \
       -d '{
           "model": "ai/smollm2",
           "messages": [
               {
                   "role": "system",
                   "content": "You are a helpful assistant."
               },
               {
                   "role": "user",
                   "content": "Please write 500 words about the fall of Rome."
               }
           ]
       }'
   ```

#### From the host using a Unix socket

To call the `chat/completions` OpenAI endpoint through the Docker socket from the host using `curl`:

```bash
#!/bin/sh

curl --unix-socket $HOME/.docker/run/docker.sock \
    localhost/exp/vDD4.40/engines/llama.cpp/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ai/smollm2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Please write 500 words about the fall of Rome."
            }
        ]
    }'
```

## Known issues

### `docker model` is not recognised

@@ -681,4 +118,9 @@ The Docker Model CLI currently lacks consistent support for specifying models by

## Share feedback

-Thanks for trying out Docker Model Runner. Give feedback or report any bugs you may find through the **Give feedback** link next to the **Enable Docker Model Runner** setting.
+Thanks for trying out Docker Model Runner. Give feedback or report any bugs
+you may find through the **Give feedback** link next to the **Enable Docker Model Runner** setting.

## Next steps

[Get started with DMR](get-started.md)

192 content/manuals/ai/model-runner/api-reference.md (new file)
@@ -0,0 +1,192 @@
---
title: DMR REST API
description: Reference documentation for the Docker Model Runner REST API endpoints and usage examples.
weight: 30
keywords: Docker, ai, model runner, rest api, openai, endpoints, documentation
---

Once Model Runner is enabled, new API endpoints are available. You can use
these endpoints to interact with a model programmatically.

### Determine the base URL

The base URL to interact with the endpoints depends
on how you run Docker:

{{< tabs >}}
{{< tab name="Docker Desktop">}}

- From containers: `http://model-runner.docker.internal/`
- From host processes: `http://localhost:12434/`, assuming TCP host access is
  enabled on the default port (12434).

{{< /tab >}}
{{< tab name="Docker Engine">}}

- From containers: `http://172.17.0.1:12434/` (with `172.17.0.1` representing the host gateway address)
- From host processes: `http://localhost:12434/`

> [!NOTE]
> The `172.17.0.1` interface may not be available by default to containers
> within a Compose project.
> In this case, add an `extra_hosts` directive to your Compose service YAML:
>
> ```yaml
> extra_hosts:
>   - "model-runner.docker.internal:host-gateway"
> ```
> Then you can access the Docker Model Runner APIs at http://model-runner.docker.internal:12434/

{{< /tab >}}
{{< /tabs >}}

### Available DMR endpoints

- Create a model:

  ```text
  POST /models/create
  ```

- List models:

  ```text
  GET /models
  ```

- Get a model:

  ```text
  GET /models/{namespace}/{name}
  ```

- Delete a local model:

  ```text
  DELETE /models/{namespace}/{name}
  ```

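For example, with TCP host access enabled, you can call the list endpoint directly
(a minimal sketch assuming the default port; inspect the JSON response with a tool
of your choice):

```console
$ curl http://localhost:12434/models
```
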
### Available OpenAI endpoints

DMR supports the following OpenAI endpoints:

- [List models](https://platform.openai.com/docs/api-reference/models/list):

  ```text
  GET /engines/llama.cpp/v1/models
  ```

- [Retrieve model](https://platform.openai.com/docs/api-reference/models/retrieve):

  ```text
  GET /engines/llama.cpp/v1/models/{namespace}/{name}
  ```

- [Create chat completion](https://platform.openai.com/docs/api-reference/chat/create):

  ```text
  POST /engines/llama.cpp/v1/chat/completions
  ```

- [Create completion](https://platform.openai.com/docs/api-reference/completions/create):

  ```text
  POST /engines/llama.cpp/v1/completions
  ```

- [Create embeddings](https://platform.openai.com/docs/api-reference/embeddings/create):

  ```text
  POST /engines/llama.cpp/v1/embeddings
  ```

To call these endpoints via a Unix socket (`/var/run/docker.sock`), prefix their path
with `/exp/vDD4.40`.

> [!NOTE]
> You can omit `llama.cpp` from the path. For example: `POST /engines/v1/chat/completions`.

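As a quick illustration of the path rules, both of the following target the same
list endpoint over TCP, with and without the `llama.cpp` segment:

```console
$ curl http://localhost:12434/engines/llama.cpp/v1/models
$ curl http://localhost:12434/engines/v1/models
```
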
## REST API examples

### Request from within a container

To call the `chat/completions` OpenAI endpoint from within another container using `curl`:

```bash
#!/bin/sh

curl http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ai/smollm2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Please write 500 words about the fall of Rome."
            }
        ]
    }'
```

### Request from the host using TCP

To call the `chat/completions` OpenAI endpoint from the host via TCP:

1. Enable the host-side TCP support from the Docker Desktop GUI, or via the [Docker Desktop CLI](/manuals/desktop/features/desktop-cli.md).
   For example: `docker desktop enable model-runner --tcp <port>`.

   If you are running on Windows, also enable GPU-backed inference.
   See [Enable Docker Model Runner](get-started.md#enable-dmr-in-docker-desktop).

2. Interact with it as documented in the previous section using `localhost` and the correct port.

   ```bash
   #!/bin/sh

   curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
       -H "Content-Type: application/json" \
       -d '{
           "model": "ai/smollm2",
           "messages": [
               {
                   "role": "system",
                   "content": "You are a helpful assistant."
               },
               {
                   "role": "user",
                   "content": "Please write 500 words about the fall of Rome."
               }
           ]
       }'
   ```

### Request from the host using a Unix socket

To call the `chat/completions` OpenAI endpoint through the Docker socket from the host using `curl`:

```bash
#!/bin/sh

curl --unix-socket $HOME/.docker/run/docker.sock \
    localhost/exp/vDD4.40/engines/llama.cpp/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ai/smollm2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Please write 500 words about the fall of Rome."
            }
        ]
    }'
```

219 content/manuals/ai/model-runner/examples.md (new file)

@@ -0,0 +1,219 @@
---
title: DMR examples
description: Example projects and CI/CD workflows for Docker Model Runner.
weight: 40
keywords: Docker, ai, model runner, examples, github actions, genai, sample project
---

See some examples of complete workflows using Docker Model Runner.

## Sample project

You can now start building your generative AI application powered by Docker
Model Runner.

If you want to try an existing GenAI application, follow these steps:

1. Set up the sample app. Clone and run the following repository:

   ```console
   $ git clone https://github.com/docker/hello-genai.git
   ```

1. In your terminal, go to the `hello-genai` directory.

1. Run `run.sh` to pull the chosen model and run the app (see the shell sketch
   after this list).

1. Open your app in the browser at the addresses specified in the repository
   [README](https://github.com/docker/hello-genai).

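In a POSIX shell, steps 1 to 3 condense to the following sketch (assuming `run.sh`
is executable; otherwise invoke it with `sh run.sh`):

```console
$ git clone https://github.com/docker/hello-genai.git
$ cd hello-genai
$ ./run.sh
```
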
You see the GenAI app's interface where you can start typing your prompts.

You can now interact with your own GenAI app, powered by a local model. Try a
few prompts and notice how fast the responses are — all running on your machine
with Docker.

## Use Model Runner in GitHub Actions

Here is an example of how to use Model Runner as part of a GitHub workflow.
The example installs Model Runner, tests the installation, pulls and runs a
model, interacts with the model via the API, and deletes the model.

```yaml {title="dmr-run.yml", collapse=true}
name: Docker Model Runner Example Workflow

permissions:
  contents: read

on:
  workflow_dispatch:
    inputs:
      test_model:
        description: 'Model to test with (default: ai/smollm2:360M-Q4_K_M)'
        required: false
        type: string
        default: 'ai/smollm2:360M-Q4_K_M'

jobs:
  dmr-test:
    runs-on: ubuntu-latest
    timeout-minutes: 30

    steps:
      - name: Set up Docker
        uses: docker/setup-docker-action@v4

      - name: Install docker-model-plugin
        run: |
          echo "Installing docker-model-plugin..."
          # Add Docker's official GPG key:
          sudo apt-get update
          sudo apt-get install ca-certificates curl
          sudo install -m 0755 -d /etc/apt/keyrings
          sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
          sudo chmod a+r /etc/apt/keyrings/docker.asc

          # Add the repository to Apt sources:
          echo \
            "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
            $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
            sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
          sudo apt-get update
          sudo apt-get install -y docker-model-plugin

          echo "Installation completed successfully"

      - name: Test docker model version
        run: |
          echo "Testing docker model version command..."
          sudo docker model version

          # Verify the command returns successfully
          if [ $? -eq 0 ]; then
            echo "✅ docker model version command works correctly"
          else
            echo "❌ docker model version command failed"
            exit 1
          fi

      - name: Test model pull and run
        run: |
          MODEL="${{ github.event.inputs.test_model || 'ai/smollm2:360M-Q4_K_M' }}"
          echo "Testing with model: $MODEL"

          # Test model pull
          echo "Pulling model..."
          sudo docker model pull "$MODEL"

          if [ $? -eq 0 ]; then
            echo "✅ Model pull successful"
          else
            echo "❌ Model pull failed"
            exit 1
          fi

          # Test basic model run (with timeout to avoid hanging)
          echo "Testing docker model run..."
          timeout 60s sudo docker model run "$MODEL" "Give me a fact about whales." || {
            exit_code=$?
            if [ $exit_code -eq 124 ]; then
              echo "✅ Model run test completed (timed out as expected for non-interactive test)"
            else
              echo "❌ Model run failed with exit code: $exit_code"
              exit 1
            fi
          }

      - name: Test API endpoint
        run: |
          MODEL="${{ github.event.inputs.test_model || 'ai/smollm2:360M-Q4_K_M' }}"
          echo "Testing API endpoint with model: $MODEL"

          # Test API call with curl
          echo "Testing API call..."
          RESPONSE=$(curl -s http://localhost:12434/engines/llama.cpp/v1/chat/completions \
            -H "Content-Type: application/json" \
            -d "{
              \"model\": \"$MODEL\",
              \"messages\": [
                {
                  \"role\": \"user\",
                  \"content\": \"Say hello\"
                }
              ],
              \"top_k\": 1,
              \"temperature\": 0
            }")

          if [ $? -eq 0 ]; then
            echo "✅ API call successful"
            echo "Response received: $RESPONSE"

            # Check if response contains "hello" (case-insensitive)
            if echo "$RESPONSE" | grep -qi "hello"; then
              echo "✅ Response contains 'hello' (case-insensitive)"
            else
              echo "❌ Response does not contain 'hello'"
              echo "Full response: $RESPONSE"
              exit 1
            fi
          else
            echo "❌ API call failed"
            exit 1
          fi

      - name: Test model cleanup
        run: |
          MODEL="${{ github.event.inputs.test_model || 'ai/smollm2:360M-Q4_K_M' }}"

          echo "Cleaning up test model..."
          sudo docker model rm "$MODEL" || echo "Model removal failed or model not found"

          # Verify model was removed
          echo "Verifying model cleanup..."
          sudo docker model ls

          echo "✅ Model cleanup completed"

      - name: Report success
        if: success()
        run: |
          echo "🎉 Docker Model Runner daily health check completed successfully!"
          echo "All tests passed:"
          echo " ✅ docker-model-plugin installation successful"
          echo " ✅ docker model version command working"
          echo " ✅ Model pull and run operations successful"
          echo " ✅ API endpoint operations successful"
          echo " ✅ Cleanup operations successful"
```

## Related pages

- [Models and Compose](../compose/models-and-compose.md)

223 content/manuals/ai/model-runner/get-started.md (new file)
@@ -0,0 +1,223 @@
---
title: Get started with DMR
description: How to install, enable, and use Docker Model Runner to manage and run AI models.
weight: 10
keywords: Docker, ai, model runner, setup, installation, getting started
---

Get started with [Docker Model Runner](_index.md).

## Enable Docker Model Runner

### Enable DMR in Docker Desktop

1. In the settings view, go to the **Beta features** tab.
1. Select the **Enable Docker Model Runner** setting.
1. If you use Windows with a supported NVIDIA GPU, you also see and can select
   **Enable GPU-backed inference**.
1. Optional: To enable TCP support, select **Enable host-side TCP support**.
1. In the **Port** field, type the port you want to use.
1. If you interact with Model Runner from a local frontend web app, in
   **CORS Allowed Origins**, select the origins that Model Runner should
   accept requests from. An origin is the URL where your web app runs, for
   example `http://localhost:3131`.

You can now use the `docker model` command in the CLI and view and interact
with your local models in the **Models** tab in the Docker Desktop Dashboard.

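For example, a quick way to confirm the command is available (the exact output
depends on your Docker Desktop version):

```console
$ docker model version
```
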
> [!IMPORTANT]
>
> For Docker Desktop versions 4.41 and earlier, this setting was under the
> **Experimental features** tab on the **Features in development** page.

### Enable Docker Model Runner in Docker Engine

1. Ensure you have installed [Docker Engine](/engine/install/).
1. Docker Model Runner is available as a package. To install it, run:

   {{< tabs >}}
   {{< tab name="Ubuntu/Debian">}}

   ```console
   $ sudo apt-get update
   $ sudo apt-get install docker-model-plugin
   ```

   {{< /tab >}}
   {{< tab name="RPM-based distributions">}}

   ```console
   $ sudo dnf update
   $ sudo dnf install docker-model-plugin
   ```

   {{< /tab >}}
   {{< /tabs >}}

1. Test the installation:

   ```console
   $ docker model version
   $ docker model run ai/smollm2
   ```

> [!NOTE]
> TCP support is enabled by default for Docker Engine on port `12434`.

### Update Docker Model Runner in Docker Engine

To update Docker Model Runner in Docker Engine, uninstall it with
[`docker model uninstall-runner`](/reference/cli/docker/model/uninstall-runner/)
then reinstall it:

```console
$ docker model uninstall-runner --images && docker model install-runner
```

> [!NOTE]
> With the above command, local models are preserved.
> To delete the models during the upgrade, add the `--models` option to the
> `uninstall-runner` command.

## Pull a model

Models are cached locally.

> [!NOTE]
>
> When you use the Docker CLI, you can also pull models directly from
> [HuggingFace](https://huggingface.co/).

{{< tabs group="release" >}}
{{< tab name="From Docker Desktop">}}

1. Select **Models** and select the **Docker Hub** tab.
1. Find the model you want and select **Pull**.

![screenshot showing the Docker Hub view.](./images/dmr-catalog.png)

{{< /tab >}}
{{< tab name="From the Docker CLI">}}

Use the [`docker model pull` command](/reference/cli/docker/model/pull/).
For example:

```bash {title="Pulling from Docker Hub"}
docker model pull ai/smollm2:360M-Q4_K_M
```

```bash {title="Pulling from HuggingFace"}
docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```

{{< /tab >}}
{{< /tabs >}}

## Run a model

{{< tabs group="release" >}}
{{< tab name="From Docker Desktop">}}

1. Select **Models** and select the **Local** tab.
1. Select the play button. The interactive chat screen opens.

![screenshot showing the Local view.](./images/dmr-run.png)

{{< /tab >}}
{{< tab name="From the Docker CLI" >}}

Use the [`docker model run` command](/reference/cli/docker/model/run/).

{{< /tab >}}
{{< /tabs >}}

## Configure a model

You can configure a model, such as its maximum token limit, using
Docker Compose. See [Models and Compose - Model configuration options](../compose/models-and-compose.md#model-configuration-options).

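As an illustrative sketch only, a Compose file that pins a model and its context
size could look like the following; treat the attribute names (`context_size` in
particular) as assumptions to verify against the linked configuration options:

```yaml
services:
  app:
    image: my-genai-app # hypothetical application image
    models:
      - smollm2

models:
  smollm2:
    model: ai/smollm2
    context_size: 4096 # assumed option name; see the linked reference
```
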
## Publish a model

> [!NOTE]
>
> This works for any Container Registry supporting OCI Artifacts, not only
> Docker Hub.

You can tag existing models with a new name and publish them under a different
namespace and repository:

```console
# Tag a pulled model under a new name
$ docker model tag ai/smollm2 myorg/smollm2

# Push it to Docker Hub
$ docker model push myorg/smollm2
```

For more details, see the [`docker model tag`](/reference/cli/docker/model/tag)
and [`docker model push`](/reference/cli/docker/model/push) command
documentation.

You can also package a model file in GGUF format as an OCI Artifact and publish
it to Docker Hub:

```console
# Download a model file in GGUF format, for example from HuggingFace
$ curl -L -o model.gguf https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/resolve/main/mistral-7b-v0.1.Q4_K_M.gguf

# Package it as OCI Artifact and push it to Docker Hub
$ docker model package --gguf "$(pwd)/model.gguf" --push myorg/mistral-7b-v0.1:Q4_K_M
```

For more details, see the
[`docker model package`](/reference/cli/docker/model/package/) command
documentation.

## Troubleshooting

### Display the logs

To troubleshoot issues, display the logs:

{{< tabs group="release" >}}
{{< tab name="From Docker Desktop">}}

Select **Models** and select the **Logs** tab.

![screenshot showing the Logs view.](./images/dmr-logs.png)

{{< /tab >}}
{{< tab name="From the Docker CLI">}}

Use the [`docker model logs` command](/reference/cli/docker/model/logs/).

{{< /tab >}}
{{< /tabs >}}

### Inspect requests and responses

Inspecting requests and responses helps you diagnose model-related issues.
For example, you can evaluate context usage to verify you stay within the model's context
window, or display the full body of a request to control the parameters you are passing to your models
when developing with a framework.

In Docker Desktop, to inspect the requests and responses for each model:

1. Select **Models** and select the **Requests** tab. This view displays all the requests to all models:
   - The time the request was sent
   - The model name and version
   - The prompt/request
   - The context usage
   - The time it took for the response to be generated
2. Select one of the requests to display further details:
   - In the **Overview** tab, view the token usage, response metadata and generation speed, and the actual prompt and response.
   - In the **Request** and **Response** tabs, view the full JSON payload of the request and the response.

> [!NOTE]
> You can also display the requests for a specific model when you select a model and then select the **Requests** tab.

## Related pages

- [Interact with your model programmatically](./api-reference.md)
- [Models and Compose](../compose/models-and-compose.md)
- [Docker Model Runner CLI reference documentation](/reference/cli/docker/model)

0 content/manuals/ai/model-runner/setup.md (new file)