vendor: github.com/docker/model-cli v1.0.3

See changes in https://github.com/docker/model-runner/compare/cmd/cli/v0.1.44...cmd/cli/v1.0.3.

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: David Karlsson <35727626+dvdksn@users.noreply.github.com>
Authored by Dorin Geman on 2025-12-05 15:18:12 +02:00; committed by David Karlsson
parent 9963fecdaf
commit b6558dd6be
34 changed files with 1186 additions and 86 deletions


@@ -6,6 +6,7 @@ long: |-
pname: docker
plink: docker.yaml
cname:
- docker model bench
- docker model df
- docker model inspect
- docker model install-runner
@@ -14,16 +15,22 @@ cname:
- docker model package
- docker model ps
- docker model pull
- docker model purge
- docker model push
- docker model reinstall-runner
- docker model requests
- docker model restart-runner
- docker model rm
- docker model run
- docker model start-runner
- docker model status
- docker model stop-runner
- docker model tag
- docker model uninstall-runner
- docker model unload
- docker model version
clink:
- docker_model_bench.yaml
- docker_model_df.yaml
- docker_model_inspect.yaml
- docker_model_install-runner.yaml
@@ -32,11 +39,16 @@ clink:
- docker_model_package.yaml
- docker_model_ps.yaml
- docker_model_pull.yaml
- docker_model_purge.yaml
- docker_model_push.yaml
- docker_model_reinstall-runner.yaml
- docker_model_requests.yaml
- docker_model_restart-runner.yaml
- docker_model_rm.yaml
- docker_model_run.yaml
- docker_model_start-runner.yaml
- docker_model_status.yaml
- docker_model_stop-runner.yaml
- docker_model_tag.yaml
- docker_model_uninstall-runner.yaml
- docker_model_unload.yaml


@@ -0,0 +1,69 @@
command: docker model bench
short: Benchmark a model's performance at different concurrency levels
long: |-
Benchmark a model's performance showing tokens per second at different concurrency levels.
This command runs a series of benchmarks with 1, 2, 4, and 8 concurrent requests by default,
measuring the tokens per second (TPS) that the model can generate.
usage: docker model bench [MODEL]
pname: docker model
plink: docker_model.yaml
options:
- option: concurrency
value_type: intSlice
default_value: '[1,2,4,8]'
description: Concurrency levels to test
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: duration
value_type: duration
default_value: 30s
description: Duration to run each concurrency test
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: json
value_type: bool
default_value: "false"
description: Output results in JSON format
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: prompt
value_type: string
default_value: |
Write a comprehensive 100 word summary on whales and their impact on society.
description: Prompt to use for benchmarking
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: timeout
value_type: duration
default_value: 5m0s
description: Timeout for each individual request
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
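As a quick illustration of the options above (flag names are from this reference; the model name is a placeholder):

```shell
# Benchmark a model at two concurrency levels for 10 seconds each,
# emitting machine-readable JSON. "ai/smollm2" is an example model reference.
docker model bench --concurrency 1,4 --duration 10s --json ai/smollm2
```

Without flags, `docker model bench MODEL` runs the default 1, 2, 4, and 8 concurrency sweep for 30s per level.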


@@ -33,9 +33,29 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: runtime-flags
- option: speculative-draft-model
value_type: string
description: raw runtime flags to pass to the inference engine
description: draft model for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-min-acceptance-rate
value_type: float64
default_value: "0"
description: minimum acceptance rate for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-num-tokens
value_type: int
default_value: "0"
description: number of tokens to predict speculatively
deprecated: false
hidden: false
experimental: false


@@ -1,7 +1,7 @@
command: docker model configure
short: Configure runtime options for a model
long: Configure runtime options for a model
usage: docker model configure [--context-size=<n>] MODEL [-- <runtime-flags...>]
usage: docker model configure [--context-size=<n>] [--speculative-draft-model=<model>] [--hf_overrides=<json>] [--reasoning-budget=<n>] MODEL
pname: docker model
plink: docker_model.yaml
options:
@@ -15,6 +15,54 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: hf_overrides
value_type: string
description: HuggingFace model config overrides (JSON) - vLLM only
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: reasoning-budget
value_type: int64
default_value: "0"
description: reasoning budget for reasoning models - llama.cpp only
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-draft-model
value_type: string
description: draft model for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-min-acceptance-rate
value_type: float64
default_value: "0"
description: minimum acceptance rate for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-num-tokens
value_type: int
default_value: "0"
description: number of tokens to predict speculatively
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: true
experimental: false
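A sketch of the new speculative-decoding configuration, using the flags documented above (the model names and values are placeholders, not recommendations):

```shell
# Configure a model to use a smaller draft model for speculative decoding.
docker model configure \
  --context-size 4096 \
  --speculative-draft-model ai/smollm2 \
  --speculative-num-tokens 4 \
  --speculative-min-acceptance-rate 0.5 \
  ai/llama3.2
```

Note that `--hf_overrides` applies only to the vLLM backend and `--reasoning-budget` only to llama.cpp, per the option descriptions.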


@@ -6,6 +6,25 @@ usage: docker model install-runner
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: 'Specify backend (llama.cpp|vllm). Default: llama.cpp'
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
@@ -19,7 +38,17 @@ options:
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda)
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: host
value_type: string
default_value: 127.0.0.1
description: Host address to bind Docker Model Runner
deprecated: false
hidden: false
experimental: false
@@ -30,7 +59,7 @@ options:
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker CE, 12435 for Cloud mode)
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
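For instance, combining the new flags documented above (values are illustrative):

```shell
# Install the runner with the vLLM backend, CUDA GPU support,
# and a non-default bind address.
docker model install-runner --backend vllm --gpu cuda --host 0.0.0.0 --port 12434
```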


@@ -6,15 +6,6 @@ usage: docker model list [OPTIONS]
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: Specify the backend to use (llama.cpp, openai)
deprecated: false
hidden: true
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: json
value_type: bool
default_value: "false"


@@ -1,10 +1,12 @@
command: docker model package
short: |
Package a GGUF file into a Docker model OCI artifact, with optional licenses.
Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact.
long: |-
Package a GGUF file into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.
When packaging a sharded model --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).
usage: docker model package --gguf <path> [--license <path>...] [--context-size <tokens>] [--push] MODEL
Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.
When packaging a sharded GGUF model, --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).
When packaging a Safetensors model, --safetensors-dir should point to a directory containing .safetensors files and config files (*.json, merges.txt). All files will be auto-discovered and config files will be packaged into a tar archive.
When packaging from an existing model using --from, you can modify properties like context size to create a variant of the original model.
usage: docker model package (--gguf <path> | --safetensors-dir <path> | --from <model>) [--license <path>...] [--context-size <tokens>] [--push] MODEL
pname: docker model
plink: docker_model.yaml
options:
@@ -27,9 +29,29 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: dir-tar
value_type: stringArray
default_value: '[]'
description: |
relative path to directory to package as tar (can be specified multiple times)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: from
value_type: string
description: reference to an existing model to repackage
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gguf
value_type: string
description: absolute path to gguf file (required)
description: absolute path to gguf file
deprecated: false
hidden: false
experimental: false
@@ -58,6 +80,15 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: safetensors-dir
value_type: string
description: absolute path to directory containing safetensors files and config
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
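The two new packaging modes described above can be sketched as follows (paths and model references are placeholders):

```shell
# Package a directory of .safetensors files and config files
# (auto-discovered, config packaged as a tar archive) and push it.
docker model package --safetensors-dir /models/my-model --push myorg/my-model:latest

# Create a variant of an existing model with a different context size.
docker model package --from ai/smollm2 --context-size 8192 myorg/smollm2-8k
```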


@@ -0,0 +1,25 @@
command: docker model purge
short: Remove all models
long: Remove all models
usage: docker model purge [OPTIONS]
pname: docker model
plink: docker_model.yaml
options:
- option: force
shorthand: f
value_type: bool
default_value: "false"
description: Forcefully remove all models
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false


@@ -0,0 +1,75 @@
command: docker model reinstall-runner
short: Reinstall Docker Model Runner (Docker Engine only)
long: |
This command removes the existing Docker Model Runner container and reinstalls it with the specified configuration. Models and images are preserved during reinstallation.
usage: docker model reinstall-runner
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: 'Specify backend (llama.cpp|vllm). Default: llama.cpp'
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
description: Do not track models usage in Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: host
value_type: string
default_value: 127.0.0.1
description: Host address to bind Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: port
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false


@@ -0,0 +1,68 @@
command: docker model restart-runner
short: Restart Docker Model Runner (Docker Engine only)
long: |-
This command restarts the Docker Model Runner without pulling container images. Use this command to restart the runner when you already have the required images locally.
For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.
usage: docker model restart-runner
pname: docker model
plink: docker_model.yaml
options:
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
description: Do not track models usage in Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: host
value_type: string
default_value: 127.0.0.1
description: Host address to bind Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: port
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false


@@ -10,18 +10,9 @@ usage: docker model run MODEL [PROMPT]
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: Specify the backend to use (llama.cpp, openai)
deprecated: false
hidden: true
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: color
value_type: string
default_value: auto
default_value: "no"
description: Use colored output (auto|yes|no)
deprecated: false
hidden: false
@@ -39,6 +30,17 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: detach
shorthand: d
value_type: bool
default_value: "false"
description: Load the model in the background without interaction
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: ignore-runtime-memory-check
value_type: bool
default_value: "false"
@@ -72,12 +74,18 @@ examples: |-
Output:
```console
Interactive chat mode started. Type '/bye' to exit.
> Hi
Hi there! It's SmolLM, AI assistant. How can I help you today?
> /bye
Chat session ended.
```
### Pre-load a model
```console
docker model run --detach ai/smollm2
```
This loads the model into memory without interaction, ensuring maximum performance for subsequent requests.
deprecated: false
hidden: false
experimental: false


@@ -0,0 +1,67 @@
command: docker model start-runner
short: Start Docker Model Runner (Docker Engine only)
long: |-
This command starts the Docker Model Runner without pulling container images. Use this command to start the runner when you already have the required images locally.
For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.
usage: docker model start-runner
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: 'Specify backend (llama.cpp|vllm). Default: llama.cpp'
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
description: Do not track models usage in Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: port
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
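The stop/start pair gives a lightweight runner lifecycle, per the descriptions above: stopping removes the containers but keeps images on disk, and starting reuses local images without pulling. A typical sequence:

```shell
# Temporarily stop the runner, keeping container images on disk.
docker model stop-runner

# Later, start it again without re-pulling images.
docker model start-runner --gpu auto
```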


@@ -0,0 +1,27 @@
command: docker model stop-runner
short: Stop Docker Model Runner (Docker Engine only)
long: |-
This command stops the Docker Model Runner by removing the running containers, but preserves the container images on disk. Use this command when you want to temporarily stop the runner but plan to start it again later.
To completely remove the runner including images, use `docker model uninstall-runner --images` instead.
usage: docker model stop-runner
pname: docker model
plink: docker_model.yaml
options:
- option: models
value_type: bool
default_value: "false"
description: Remove model storage volume
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false


@@ -1,6 +1,6 @@
command: docker model uninstall-runner
short: Uninstall Docker Model Runner
long: Uninstall Docker Model Runner
short: Uninstall Docker Model Runner (Docker Engine only)
long: Uninstall Docker Model Runner (Docker Engine only)
usage: docker model uninstall-runner
pname: docker model
plink: docker_model.yaml


@@ -5,25 +5,31 @@ Docker Model Runner
### Subcommands
| Name | Description |
|:------------------------------------------------|:------------------------------------------------------------------------------|
| [`df`](model_df.md) | Show Docker Model Runner disk usage |
| [`inspect`](model_inspect.md) | Display detailed information on one model |
| [`install-runner`](model_install-runner.md) | Install Docker Model Runner (Docker Engine only) |
| [`list`](model_list.md) | List the models pulled to your local environment |
| [`logs`](model_logs.md) | Fetch the Docker Model Runner logs |
| [`package`](model_package.md) | Package a GGUF file into a Docker model OCI artifact, with optional licenses. |
| [`ps`](model_ps.md) | List running models |
| [`pull`](model_pull.md) | Pull a model from Docker Hub or HuggingFace to your local environment |
| [`push`](model_push.md) | Push a model to Docker Hub |
| [`requests`](model_requests.md) | Fetch requests+responses from Docker Model Runner |
| [`rm`](model_rm.md) | Remove local models downloaded from Docker Hub |
| [`run`](model_run.md) | Run a model and interact with it using a submitted prompt or chat mode |
| [`status`](model_status.md) | Check if the Docker Model Runner is running |
| [`tag`](model_tag.md) | Tag a model |
| [`uninstall-runner`](model_uninstall-runner.md) | Uninstall Docker Model Runner |
| [`unload`](model_unload.md) | Unload running models |
| [`version`](model_version.md) | Show the Docker Model Runner version |
| Name | Description |
|:------------------------------------------------|:------------------------------------------------------------------------------------------------|
| [`bench`](model_bench.md) | Benchmark a model's performance at different concurrency levels |
| [`df`](model_df.md) | Show Docker Model Runner disk usage |
| [`inspect`](model_inspect.md) | Display detailed information on one model |
| [`install-runner`](model_install-runner.md) | Install Docker Model Runner (Docker Engine only) |
| [`list`](model_list.md) | List the models pulled to your local environment |
| [`logs`](model_logs.md) | Fetch the Docker Model Runner logs |
| [`package`](model_package.md) | Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact. |
| [`ps`](model_ps.md) | List running models |
| [`pull`](model_pull.md) | Pull a model from Docker Hub or HuggingFace to your local environment |
| [`purge`](model_purge.md) | Remove all models |
| [`push`](model_push.md) | Push a model to Docker Hub |
| [`reinstall-runner`](model_reinstall-runner.md) | Reinstall Docker Model Runner (Docker Engine only) |
| [`requests`](model_requests.md) | Fetch requests+responses from Docker Model Runner |
| [`restart-runner`](model_restart-runner.md) | Restart Docker Model Runner (Docker Engine only) |
| [`rm`](model_rm.md) | Remove local models downloaded from Docker Hub |
| [`run`](model_run.md) | Run a model and interact with it using a submitted prompt or chat mode |
| [`start-runner`](model_start-runner.md) | Start Docker Model Runner (Docker Engine only) |
| [`status`](model_status.md) | Check if the Docker Model Runner is running |
| [`stop-runner`](model_stop-runner.md) | Stop Docker Model Runner (Docker Engine only) |
| [`tag`](model_tag.md) | Tag a model |
| [`uninstall-runner`](model_uninstall-runner.md) | Uninstall Docker Model Runner (Docker Engine only) |
| [`unload`](model_unload.md) | Unload running models |
| [`version`](model_version.md) | Show the Docker Model Runner version |


@@ -0,0 +1,21 @@
# docker model bench
<!---MARKER_GEN_START-->
Benchmark a model's performance showing tokens per second at different concurrency levels.
This command runs a series of benchmarks with 1, 2, 4, and 8 concurrent requests by default,
measuring the tokens per second (TPS) that the model can generate.
### Options
| Name | Type | Default | Description |
|:----------------|:-----------|:--------------------------------------------------------------------------------|:--------------------------------------|
| `--concurrency` | `intSlice` | `[1,2,4,8]` | Concurrency levels to test |
| `--duration` | `duration` | `30s` | Duration to run each concurrency test |
| `--json` | `bool` | | Output results in JSON format |
| `--prompt` | `string` | `Write a comprehensive 100 word summary on whales and their impact on society.` | Prompt to use for benchmarking |
| `--timeout` | `duration` | `5m0s` | Timeout for each individual request |
<!---MARKER_GEN_END-->


@@ -5,11 +5,14 @@ Install Docker Model Runner (Docker Engine only)
### Options
| Name | Type | Default | Description |
|:-----------------|:---------|:--------|:---------------------------------------------------------------------------------------------------|
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda) |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker CE, 12435 for Cloud mode) |
| Name | Type | Default | Description |
|:-----------------|:---------|:------------|:-------------------------------------------------------------------------------------------------------|
| `--backend` | `string` | | Specify backend (llama.cpp\|vllm). Default: llama.cpp |
| `--debug` | `bool` | | Enable debug logging |
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann) |
| `--host` | `string` | `127.0.0.1` | Host address to bind Docker Model Runner |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode) |
<!---MARKER_GEN_END-->


@@ -1,18 +1,23 @@
# docker model package
<!---MARKER_GEN_START-->
Package a GGUF file into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.
When packaging a sharded model --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).
Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.
When packaging a sharded GGUF model, --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).
When packaging a Safetensors model, --safetensors-dir should point to a directory containing .safetensors files and config files (*.json, merges.txt). All files will be auto-discovered and config files will be packaged into a tar archive.
When packaging from an existing model using --from, you can modify properties like context size to create a variant of the original model.
### Options
| Name | Type | Default | Description |
|:------------------|:--------------|:--------|:---------------------------------------------------------------------------------------|
| `--chat-template` | `string` | | absolute path to chat template file (must be Jinja format) |
| `--context-size` | `uint64` | `0` | context size in tokens |
| `--gguf` | `string` | | absolute path to gguf file (required) |
| `-l`, `--license` | `stringArray` | | absolute path to a license file |
| `--push` | `bool` | | push to registry (if not set, the model is loaded into the Model Runner content store) |
| Name | Type | Default | Description |
|:--------------------|:--------------|:--------|:---------------------------------------------------------------------------------------|
| `--chat-template` | `string` | | absolute path to chat template file (must be Jinja format) |
| `--context-size` | `uint64` | `0` | context size in tokens |
| `--dir-tar` | `stringArray` | | relative path to directory to package as tar (can be specified multiple times) |
| `--from` | `string` | | reference to an existing model to repackage |
| `--gguf` | `string` | | absolute path to gguf file |
| `-l`, `--license` | `stringArray` | | absolute path to a license file |
| `--push` | `bool` | | push to registry (if not set, the model is loaded into the Model Runner content store) |
| `--safetensors-dir` | `string` | | absolute path to directory containing safetensors files and config |
<!---MARKER_GEN_END-->


@@ -0,0 +1,14 @@
# docker model purge
<!---MARKER_GEN_START-->
Remove all models
### Options
| Name | Type | Default | Description |
|:----------------|:-------|:--------|:-----------------------------|
| `-f`, `--force` | `bool` | | Forcefully remove all models |
<!---MARKER_GEN_END-->


@@ -0,0 +1,22 @@
# docker model reinstall-runner
<!---MARKER_GEN_START-->
Reinstall Docker Model Runner (Docker Engine only)
### Options
| Name | Type | Default | Description |
|:-----------------|:---------|:------------|:-------------------------------------------------------------------------------------------------------|
| `--backend` | `string` | | Specify backend (llama.cpp\|vllm). Default: llama.cpp |
| `--debug` | `bool` | | Enable debug logging |
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann) |
| `--host` | `string` | `127.0.0.1` | Host address to bind Docker Model Runner |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode) |
<!---MARKER_GEN_END-->
## Description
This command removes the existing Docker Model Runner container and reinstalls it with the specified configuration. Models and images are preserved during reinstallation.


@@ -0,0 +1,23 @@
# docker model restart-runner
<!---MARKER_GEN_START-->
Restart Docker Model Runner (Docker Engine only)
### Options
| Name | Type | Default | Description |
|:-----------------|:---------|:------------|:-------------------------------------------------------------------------------------------------------|
| `--debug` | `bool` | | Enable debug logging |
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann) |
| `--host` | `string` | `127.0.0.1` | Host address to bind Docker Model Runner |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode) |
<!---MARKER_GEN_END-->
## Description
This command restarts the Docker Model Runner without pulling container images. Use this command to restart the runner when you already have the required images locally.
For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.


@@ -7,8 +7,9 @@ Run a model and interact with it using a submitted prompt or chat mode
| Name | Type | Default | Description |
|:--------------------------------|:---------|:--------|:----------------------------------------------------------------------------------|
| `--color` | `string` | `auto` | Use colored output (auto\|yes\|no) |
| `--color` | `string` | `no` | Use colored output (auto\|yes\|no) |
| `--debug` | `bool` | | Enable debug logging |
| `-d`, `--detach` | `bool` | | Load the model in the background without interaction |
| `--ignore-runtime-memory-check` | `bool` | | Do not block pull if estimated runtime memory for model exceeds system resources. |
@@ -45,9 +46,15 @@ docker model run ai/smollm2
Output:
```console
Interactive chat mode started. Type '/bye' to exit.
> Hi
Hi there! It's SmolLM, AI assistant. How can I help you today?
> /bye
Chat session ended.
```
### Pre-load a model
```console
docker model run --detach ai/smollm2
```
This loads the model into memory without interaction, ensuring maximum performance for subsequent requests.


@@ -0,0 +1,23 @@
# docker model start-runner
<!---MARKER_GEN_START-->
Start Docker Model Runner (Docker Engine only)
### Options
| Name | Type | Default | Description |
|:-----------------|:---------|:--------|:-------------------------------------------------------------------------------------------------------|
| `--backend` | `string` | | Specify backend (llama.cpp\|vllm). Default: llama.cpp |
| `--debug` | `bool` | | Enable debug logging |
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann) |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode) |
<!---MARKER_GEN_END-->
## Description
This command starts the Docker Model Runner without pulling container images. Use this command to start the runner when you already have the required images locally.
For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.


@@ -0,0 +1,19 @@
# docker model stop-runner
<!---MARKER_GEN_START-->
Stop Docker Model Runner (Docker Engine only)
### Options
| Name | Type | Default | Description |
|:-----------|:-------|:--------|:----------------------------|
| `--models` | `bool` | | Remove model storage volume |
<!---MARKER_GEN_END-->
## Description
This command stops the Docker Model Runner by removing the running containers, but preserves the container images on disk. Use this command when you want to temporarily stop the runner but plan to start it again later.
To completely remove the runner including images, use `docker model uninstall-runner --images` instead.


@@ -1,7 +1,7 @@
# docker model uninstall-runner
<!---MARKER_GEN_START-->
Uninstall Docker Model Runner
Uninstall Docker Model Runner (Docker Engine only)
### Options