Mirror of https://github.com/docker/docs.git, synced 2026-03-27 14:28:47 +07:00

vendor: github.com/docker/model-cli v1.0.3

See changes in https://github.com/docker/model-runner/compare/cmd/cli/v0.1.44...cmd/cli/v1.0.3.

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: David Karlsson <35727626+dvdksn@users.noreply.github.com>

Commit b6558dd6be (parent 9963fecdaf), committed by David Karlsson
@@ -6,6 +6,7 @@ long: |-
pname: docker
plink: docker.yaml
cname:
- docker model bench
- docker model df
- docker model inspect
- docker model install-runner
@@ -14,16 +15,22 @@ cname:
- docker model package
- docker model ps
- docker model pull
- docker model purge
- docker model push
- docker model reinstall-runner
- docker model requests
- docker model restart-runner
- docker model rm
- docker model run
- docker model start-runner
- docker model status
- docker model stop-runner
- docker model tag
- docker model uninstall-runner
- docker model unload
- docker model version
clink:
- docker_model_bench.yaml
- docker_model_df.yaml
- docker_model_inspect.yaml
- docker_model_install-runner.yaml
@@ -32,11 +39,16 @@ clink:
- docker_model_package.yaml
- docker_model_ps.yaml
- docker_model_pull.yaml
- docker_model_purge.yaml
- docker_model_push.yaml
- docker_model_reinstall-runner.yaml
- docker_model_requests.yaml
- docker_model_restart-runner.yaml
- docker_model_rm.yaml
- docker_model_run.yaml
- docker_model_start-runner.yaml
- docker_model_status.yaml
- docker_model_stop-runner.yaml
- docker_model_tag.yaml
- docker_model_uninstall-runner.yaml
- docker_model_unload.yaml
69 _vendor/github.com/docker/model-runner/cmd/cli/docs/reference/docker_model_bench.yaml generated Normal file
@@ -0,0 +1,69 @@
command: docker model bench
short: Benchmark a model's performance at different concurrency levels
long: |-
Benchmark a model's performance showing tokens per second at different concurrency levels.

This command runs a series of benchmarks with 1, 2, 4, and 8 concurrent requests by default,
measuring the tokens per second (TPS) that the model can generate.
usage: docker model bench [MODEL]
pname: docker model
plink: docker_model.yaml
options:
- option: concurrency
value_type: intSlice
default_value: '[1,2,4,8]'
description: Concurrency levels to test
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: duration
value_type: duration
default_value: 30s
description: Duration to run each concurrency test
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: json
value_type: bool
default_value: "false"
description: Output results in JSON format
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: prompt
value_type: string
default_value: |
Write a comprehensive 100 word summary on whales and their impact on society.
description: Prompt to use for benchmarking
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: timeout
value_type: duration
default_value: 5m0s
description: Timeout for each individual request
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
@@ -33,9 +33,29 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: runtime-flags
- option: speculative-draft-model
value_type: string
description: raw runtime flags to pass to the inference engine
description: draft model for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-min-acceptance-rate
value_type: float64
default_value: "0"
description: minimum acceptance rate for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-num-tokens
value_type: int
default_value: "0"
description: number of tokens to predict speculatively
deprecated: false
hidden: false
experimental: false
@@ -1,7 +1,7 @@
command: docker model configure
short: Configure runtime options for a model
long: Configure runtime options for a model
usage: docker model configure [--context-size=<n>] MODEL [-- <runtime-flags...>]
usage: docker model configure [--context-size=<n>] [--speculative-draft-model=<model>] [--hf_overrides=<json>] [--reasoning-budget=<n>] MODEL
pname: docker model
plink: docker_model.yaml
options:
@@ -15,6 +15,54 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: hf_overrides
value_type: string
description: HuggingFace model config overrides (JSON) - vLLM only
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: reasoning-budget
value_type: int64
default_value: "0"
description: reasoning budget for reasoning models - llama.cpp only
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-draft-model
value_type: string
description: draft model for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-min-acceptance-rate
value_type: float64
default_value: "0"
description: minimum acceptance rate for speculative decoding
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: speculative-num-tokens
value_type: int
default_value: "0"
description: number of tokens to predict speculatively
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: true
experimental: false
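For orientation, the new `docker model configure` flags above can be combined in a single invocation. A hypothetical example following the updated usage line (the flag values are illustrative only, not recommendations):

```console
docker model configure --context-size=4096 --speculative-draft-model=<model> --reasoning-budget=1024 MODEL
```

Note that per the descriptions above, `--hf_overrides` applies only to the vLLM backend, while `--reasoning-budget` applies only to llama.cpp.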
@@ -6,6 +6,25 @@ usage: docker model install-runner
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: 'Specify backend (llama.cpp|vllm). Default: llama.cpp'
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
@@ -19,7 +38,17 @@ options:
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda)
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: host
value_type: string
default_value: 127.0.0.1
description: Host address to bind Docker Model Runner
deprecated: false
hidden: false
experimental: false
@@ -30,7 +59,7 @@ options:
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker CE, 12435 for Cloud mode)
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
@@ -6,15 +6,6 @@ usage: docker model list [OPTIONS]
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: Specify the backend to use (llama.cpp, openai)
deprecated: false
hidden: true
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: json
value_type: bool
default_value: "false"
@@ -1,10 +1,12 @@
command: docker model package
short: |
Package a GGUF file into a Docker model OCI artifact, with optional licenses.
Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact.
long: |-
Package a GGUF file into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.
When packaging a sharded model --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).
usage: docker model package --gguf <path> [--license <path>...] [--context-size <tokens>] [--push] MODEL
Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.
When packaging a sharded GGUF model, --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).
When packaging a Safetensors model, --safetensors-dir should point to a directory containing .safetensors files and config files (*.json, merges.txt). All files will be auto-discovered and config files will be packaged into a tar archive.
When packaging from an existing model using --from, you can modify properties like context size to create a variant of the original model.
usage: docker model package (--gguf <path> | --safetensors-dir <path> | --from <model>) [--license <path>...] [--context-size <tokens>] [--push] MODEL
pname: docker model
plink: docker_model.yaml
options:
@@ -27,9 +29,29 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: dir-tar
value_type: stringArray
default_value: '[]'
description: |
relative path to directory to package as tar (can be specified multiple times)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: from
value_type: string
description: reference to an existing model to repackage
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gguf
value_type: string
description: absolute path to gguf file (required)
description: absolute path to gguf file
deprecated: false
hidden: false
experimental: false
@@ -58,6 +80,15 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: safetensors-dir
value_type: string
description: absolute path to directory containing safetensors files and config
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
25 _vendor/github.com/docker/model-runner/cmd/cli/docs/reference/docker_model_purge.yaml generated Normal file
@@ -0,0 +1,25 @@
command: docker model purge
short: Remove all models
long: Remove all models
usage: docker model purge [OPTIONS]
pname: docker model
plink: docker_model.yaml
options:
- option: force
shorthand: f
value_type: bool
default_value: "false"
description: Forcefully remove all models
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
75 _vendor/github.com/docker/model-runner/cmd/cli/docs/reference/docker_model_reinstall-runner.yaml generated Normal file
@@ -0,0 +1,75 @@
command: docker model reinstall-runner
short: Reinstall Docker Model Runner (Docker Engine only)
long: |
This command removes the existing Docker Model Runner container and reinstalls it with the specified configuration. Models and images are preserved during reinstallation.
usage: docker model reinstall-runner
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: 'Specify backend (llama.cpp|vllm). Default: llama.cpp'
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
description: Do not track models usage in Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: host
value_type: string
default_value: 127.0.0.1
description: Host address to bind Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: port
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
68 _vendor/github.com/docker/model-runner/cmd/cli/docs/reference/docker_model_restart-runner.yaml generated Normal file
@@ -0,0 +1,68 @@
command: docker model restart-runner
short: Restart Docker Model Runner (Docker Engine only)
long: |-
This command restarts the Docker Model Runner without pulling container images. Use this command to restart the runner when you already have the required images locally.

For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.
usage: docker model restart-runner
pname: docker model
plink: docker_model.yaml
options:
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
description: Do not track models usage in Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: host
value_type: string
default_value: 127.0.0.1
description: Host address to bind Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: port
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
@@ -10,18 +10,9 @@ usage: docker model run MODEL [PROMPT]
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: Specify the backend to use (llama.cpp, openai)
deprecated: false
hidden: true
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: color
value_type: string
default_value: auto
default_value: "no"
description: Use colored output (auto|yes|no)
deprecated: false
hidden: false
@@ -39,6 +30,17 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
- option: detach
shorthand: d
value_type: bool
default_value: "false"
description: Load the model in the background without interaction
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: ignore-runtime-memory-check
value_type: bool
default_value: "false"
@@ -72,12 +74,18 @@ examples: |-
Output:

```console
Interactive chat mode started. Type '/bye' to exit.
> Hi
Hi there! It's SmolLM, AI assistant. How can I help you today?
> /bye
Chat session ended.
```

### Pre-load a model

```console
docker model run --detach ai/smollm2
```

This loads the model into memory without interaction, ensuring maximum performance for subsequent requests.
deprecated: false
hidden: false
experimental: false
67 _vendor/github.com/docker/model-runner/cmd/cli/docs/reference/docker_model_start-runner.yaml generated Normal file
@@ -0,0 +1,67 @@
command: docker model start-runner
short: Start Docker Model Runner (Docker Engine only)
long: |-
This command starts the Docker Model Runner without pulling container images. Use this command to start the runner when you already have the required images locally.

For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.
usage: docker model start-runner
pname: docker model
plink: docker_model.yaml
options:
- option: backend
value_type: string
description: 'Specify backend (llama.cpp|vllm). Default: llama.cpp'
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: debug
value_type: bool
default_value: "false"
description: Enable debug logging
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: do-not-track
value_type: bool
default_value: "false"
description: Do not track models usage in Docker Model Runner
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: gpu
value_type: string
default_value: auto
description: Specify GPU support (none|auto|cuda|rocm|musa|cann)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
- option: port
value_type: uint16
default_value: "0"
description: |
Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
27 _vendor/github.com/docker/model-runner/cmd/cli/docs/reference/docker_model_stop-runner.yaml generated Normal file
@@ -0,0 +1,27 @@
command: docker model stop-runner
short: Stop Docker Model Runner (Docker Engine only)
long: |-
This command stops the Docker Model Runner by removing the running containers, but preserves the container images on disk. Use this command when you want to temporarily stop the runner but plan to start it again later.

To completely remove the runner including images, use `docker model uninstall-runner --images` instead.
usage: docker model stop-runner
pname: docker model
plink: docker_model.yaml
options:
- option: models
value_type: bool
default_value: "false"
description: Remove model storage volume
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
deprecated: false
hidden: false
experimental: false
experimentalcli: false
kubernetes: false
swarm: false
@@ -1,6 +1,6 @@
command: docker model uninstall-runner
short: Uninstall Docker Model Runner
long: Uninstall Docker Model Runner
short: Uninstall Docker Model Runner (Docker Engine only)
long: Uninstall Docker Model Runner (Docker Engine only)
usage: docker model uninstall-runner
pname: docker model
plink: docker_model.yaml
@@ -5,25 +5,31 @@ Docker Model Runner

### Subcommands

| Name                                             | Description                                                                     |
|:-------------------------------------------------|:--------------------------------------------------------------------------------|
| [`df`](model_df.md)                              | Show Docker Model Runner disk usage                                            |
| [`inspect`](model_inspect.md)                    | Display detailed information on one model                                      |
| [`install-runner`](model_install-runner.md)      | Install Docker Model Runner (Docker Engine only)                               |
| [`list`](model_list.md)                          | List the models pulled to your local environment                               |
| [`logs`](model_logs.md)                          | Fetch the Docker Model Runner logs                                             |
| [`package`](model_package.md)                    | Package a GGUF file into a Docker model OCI artifact, with optional licenses.  |
| [`ps`](model_ps.md)                              | List running models                                                            |
| [`pull`](model_pull.md)                          | Pull a model from Docker Hub or HuggingFace to your local environment          |
| [`push`](model_push.md)                          | Push a model to Docker Hub                                                     |
| [`requests`](model_requests.md)                  | Fetch requests+responses from Docker Model Runner                              |
| [`rm`](model_rm.md)                              | Remove local models downloaded from Docker Hub                                 |
| [`run`](model_run.md)                            | Run a model and interact with it using a submitted prompt or chat mode         |
| [`status`](model_status.md)                      | Check if the Docker Model Runner is running                                    |
| [`tag`](model_tag.md)                            | Tag a model                                                                    |
| [`uninstall-runner`](model_uninstall-runner.md)  | Uninstall Docker Model Runner                                                  |
| [`unload`](model_unload.md)                      | Unload running models                                                          |
| [`version`](model_version.md)                    | Show the Docker Model Runner version                                           |

| Name                                             | Description                                                                                       |
|:-------------------------------------------------|:---------------------------------------------------------------------------------------------------|
| [`bench`](model_bench.md)                        | Benchmark a model's performance at different concurrency levels                                  |
| [`df`](model_df.md)                              | Show Docker Model Runner disk usage                                                              |
| [`inspect`](model_inspect.md)                    | Display detailed information on one model                                                        |
| [`install-runner`](model_install-runner.md)      | Install Docker Model Runner (Docker Engine only)                                                 |
| [`list`](model_list.md)                          | List the models pulled to your local environment                                                 |
| [`logs`](model_logs.md)                          | Fetch the Docker Model Runner logs                                                               |
| [`package`](model_package.md)                    | Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact.  |
| [`ps`](model_ps.md)                              | List running models                                                                              |
| [`pull`](model_pull.md)                          | Pull a model from Docker Hub or HuggingFace to your local environment                            |
| [`purge`](model_purge.md)                        | Remove all models                                                                                |
| [`push`](model_push.md)                          | Push a model to Docker Hub                                                                       |
| [`reinstall-runner`](model_reinstall-runner.md)  | Reinstall Docker Model Runner (Docker Engine only)                                               |
| [`requests`](model_requests.md)                  | Fetch requests+responses from Docker Model Runner                                                |
| [`restart-runner`](model_restart-runner.md)      | Restart Docker Model Runner (Docker Engine only)                                                 |
| [`rm`](model_rm.md)                              | Remove local models downloaded from Docker Hub                                                   |
| [`run`](model_run.md)                            | Run a model and interact with it using a submitted prompt or chat mode                           |
| [`start-runner`](model_start-runner.md)          | Start Docker Model Runner (Docker Engine only)                                                   |
| [`status`](model_status.md)                      | Check if the Docker Model Runner is running                                                      |
| [`stop-runner`](model_stop-runner.md)            | Stop Docker Model Runner (Docker Engine only)                                                    |
| [`tag`](model_tag.md)                            | Tag a model                                                                                      |
| [`uninstall-runner`](model_uninstall-runner.md)  | Uninstall Docker Model Runner (Docker Engine only)                                               |
| [`unload`](model_unload.md)                      | Unload running models                                                                            |
| [`version`](model_version.md)                    | Show the Docker Model Runner version                                                             |
21 _vendor/github.com/docker/model-runner/cmd/cli/docs/reference/model_bench.md generated Normal file
@@ -0,0 +1,21 @@
# docker model bench

<!---MARKER_GEN_START-->
Benchmark a model's performance showing tokens per second at different concurrency levels.

This command runs a series of benchmarks with 1, 2, 4, and 8 concurrent requests by default,
measuring the tokens per second (TPS) that the model can generate.

### Options

| Name             | Type       | Default                                                                           | Description                            |
|:-----------------|:-----------|:----------------------------------------------------------------------------------|:----------------------------------------|
| `--concurrency`  | `intSlice` | `[1,2,4,8]`                                                                       | Concurrency levels to test             |
| `--duration`     | `duration` | `30s`                                                                             | Duration to run each concurrency test  |
| `--json`         | `bool`     |                                                                                   | Output results in JSON format          |
| `--prompt`       | `string`   | `Write a comprehensive 100 word summary on whales and their impact on society.`   | Prompt to use for benchmarking         |
| `--timeout`      | `duration` | `5m0s`                                                                            | Timeout for each individual request    |

<!---MARKER_GEN_END-->
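As a concrete illustration of the options above, here is a hypothetical invocation following the usage line `docker model bench [MODEL]` (the flag values are examples only; `ai/smollm2` is the model used elsewhere in these reference docs):

```console
docker model bench --concurrency 1,2,4 --duration 10s --json ai/smollm2
```

This would measure tokens per second at concurrency levels 1, 2, and 4, running each level for 10 seconds and emitting the results as JSON.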
@@ -5,11 +5,14 @@ Install Docker Model Runner (Docker Engine only)

### Options

| Name             | Type     | Default | Description                                                                                          |
|:------------------|:----------|:---------|:------------------------------------------------------------------------------------------------------|
| `--do-not-track`  | `bool`   |         | Do not track models usage in Docker Model Runner                                                    |
| `--gpu`           | `string` | `auto`  | Specify GPU support (none\|auto\|cuda)                                                              |
| `--port`          | `uint16` | `0`     | Docker container port for Docker Model Runner (default: 12434 for Docker CE, 12435 for Cloud mode)  |

| Name             | Type     | Default     | Description                                                                                              |
|:------------------|:----------|:-------------|:------------------------------------------------------------------------------------------------------|
| `--backend`       | `string` |             | Specify backend (llama.cpp\|vllm). Default: llama.cpp                                                   |
| `--debug`         | `bool`   |             | Enable debug logging                                                                                    |
| `--do-not-track`  | `bool`   |             | Do not track models usage in Docker Model Runner                                                        |
| `--gpu`           | `string` | `auto`      | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann)                                                |
| `--host`          | `string` | `127.0.0.1` | Host address to bind Docker Model Runner                                                                |
| `--port`          | `uint16` | `0`         | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode)  |

<!---MARKER_GEN_END-->
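A hypothetical invocation combining the new install-runner flags (the values are illustrative; they match the defaults shown in the table above):

```console
docker model install-runner --backend llama.cpp --gpu auto --host 127.0.0.1 --port 12434
```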
@@ -1,18 +1,23 @@

# docker model package


<!---MARKER_GEN_START-->

Package a GGUF file into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.

When packaging a sharded model --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).

Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact, with optional licenses. The package is sent to the model-runner, unless --push is specified.

When packaging a sharded GGUF model, --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).

When packaging a Safetensors model, --safetensors-dir should point to a directory containing .safetensors files and config files (*.json, merges.txt). All files will be auto-discovered and config files will be packaged into a tar archive.

When packaging from an existing model using --from, you can modify properties like context size to create a variant of the original model.


### Options

| Name | Type | Default | Description |
|:------------------|:--------------|:--------|:---------------------------------------------------------------------------------------|
| `--chat-template` | `string` | | absolute path to chat template file (must be Jinja format) |
| `--context-size` | `uint64` | `0` | context size in tokens |
| `--gguf` | `string` | | absolute path to gguf file (required) |
| `-l`, `--license` | `stringArray` | | absolute path to a license file |
| `--push` | `bool` | | push to registry (if not set, the model is loaded into the Model Runner content store) |
| Name | Type | Default | Description |
|:--------------------|:--------------|:--------|:---------------------------------------------------------------------------------------|
| `--chat-template` | `string` | | absolute path to chat template file (must be Jinja format) |
| `--context-size` | `uint64` | `0` | context size in tokens |
| `--dir-tar` | `stringArray` | | relative path to directory to package as tar (can be specified multiple times) |
| `--from` | `string` | | reference to an existing model to repackage |
| `--gguf` | `string` | | absolute path to gguf file |
| `-l`, `--license` | `stringArray` | | absolute path to a license file |
| `--push` | `bool` | | push to registry (if not set, the model is loaded into the Model Runner content store) |
| `--safetensors-dir` | `string` | | absolute path to directory containing safetensors files and config |


<!---MARKER_GEN_END-->

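As an illustration of the flags documented above, packaging a local GGUF file and pushing it to a registry might look like the following. The file paths and model reference are hypothetical, and the trailing model reference assumes the command takes the target as its final positional argument:

```console
docker model package --gguf /absolute/path/to/model.gguf \
  --license /absolute/path/to/LICENSE \
  --push myorg/my-model:latest
```

Without `--push`, the same invocation would instead load the packaged model into the local Model Runner content store.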
14
_vendor/github.com/docker/model-runner/cmd/cli/docs/reference/model_purge.md
generated
Normal file
@@ -0,0 +1,14 @@

# docker model purge


<!---MARKER_GEN_START-->

Remove all models


### Options

| Name | Type | Default | Description |
|:----------------|:-------|:--------|:-----------------------------|
| `-f`, `--force` | `bool` | | Forcefully remove all models |


<!---MARKER_GEN_END-->

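For example, removing all local models in one step using the documented flag (a sketch; `--force` is assumed to skip any confirmation prompt):

```console
docker model purge --force
```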
22
_vendor/github.com/docker/model-runner/cmd/cli/docs/reference/model_reinstall-runner.md
generated
Normal file
@@ -0,0 +1,22 @@

# docker model reinstall-runner


<!---MARKER_GEN_START-->

Reinstall Docker Model Runner (Docker Engine only)


### Options

| Name | Type | Default | Description |
|:-----------------|:---------|:------------|:-------------------------------------------------------------------------------------------------------|
| `--backend` | `string` | | Specify backend (llama.cpp\|vllm). Default: llama.cpp |
| `--debug` | `bool` | | Enable debug logging |
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann) |
| `--host` | `string` | `127.0.0.1` | Host address to bind Docker Model Runner |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode) |


<!---MARKER_GEN_END-->


## Description

This command removes the existing Docker Model Runner container and reinstalls it with the specified configuration. Models and images are preserved during reinstallation.
23
_vendor/github.com/docker/model-runner/cmd/cli/docs/reference/model_restart-runner.md
generated
Normal file
@@ -0,0 +1,23 @@

# docker model restart-runner


<!---MARKER_GEN_START-->

Restart Docker Model Runner (Docker Engine only)


### Options

| Name | Type | Default | Description |
|:-----------------|:---------|:------------|:-------------------------------------------------------------------------------------------------------|
| `--debug` | `bool` | | Enable debug logging |
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann) |
| `--host` | `string` | `127.0.0.1` | Host address to bind Docker Model Runner |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode) |


<!---MARKER_GEN_END-->


## Description

This command restarts the Docker Model Runner without pulling container images. Use this command to restart the runner when you already have the required images locally.

For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.
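A minimal sketch of a restart using the flags documented above (the GPU backend and port values are illustrative):

```console
docker model restart-runner --gpu cuda --port 12434
```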
@@ -7,8 +7,9 @@ Run a model and interact with it using a submitted prompt or chat mode

| Name | Type | Default | Description |
|:--------------------------------|:---------|:--------|:----------------------------------------------------------------------------------|
| `--color` | `string` | `auto` | Use colored output (auto\|yes\|no) |
| `--color` | `string` | `no` | Use colored output (auto\|yes\|no) |
| `--debug` | `bool` | | Enable debug logging |
| `-d`, `--detach` | `bool` | | Load the model in the background without interaction |
| `--ignore-runtime-memory-check` | `bool` | | Do not block pull if estimated runtime memory for model exceeds system resources. |


@@ -45,9 +46,15 @@ docker model run ai/smollm2

Output:

```console
Interactive chat mode started. Type '/bye' to exit.
> Hi
Hi there! It's SmolLM, AI assistant. How can I help you today?
> /bye
Chat session ended.
```

### Pre-load a model

```console
docker model run --detach ai/smollm2
```

This loads the model into memory without interaction, ensuring maximum performance for subsequent requests.

23
_vendor/github.com/docker/model-runner/cmd/cli/docs/reference/model_start-runner.md
generated
Normal file
@@ -0,0 +1,23 @@

# docker model start-runner


<!---MARKER_GEN_START-->

Start Docker Model Runner (Docker Engine only)


### Options

| Name | Type | Default | Description |
|:-----------------|:---------|:--------|:-------------------------------------------------------------------------------------------------------|
| `--backend` | `string` | | Specify backend (llama.cpp\|vllm). Default: llama.cpp |
| `--debug` | `bool` | | Enable debug logging |
| `--do-not-track` | `bool` | | Do not track models usage in Docker Model Runner |
| `--gpu` | `string` | `auto` | Specify GPU support (none\|auto\|cuda\|rocm\|musa\|cann) |
| `--port` | `uint16` | `0` | Docker container port for Docker Model Runner (default: 12434 for Docker Engine, 12435 for Cloud mode) |


<!---MARKER_GEN_END-->


## Description

This command starts the Docker Model Runner without pulling container images. Use this command to start the runner when you already have the required images locally.

For the first-time setup or to ensure you have the latest images, use `docker model install-runner` instead.
19
_vendor/github.com/docker/model-runner/cmd/cli/docs/reference/model_stop-runner.md
generated
Normal file
@@ -0,0 +1,19 @@

# docker model stop-runner


<!---MARKER_GEN_START-->

Stop Docker Model Runner (Docker Engine only)


### Options

| Name | Type | Default | Description |
|:-----------|:-------|:--------|:----------------------------|
| `--models` | `bool` | | Remove model storage volume |


<!---MARKER_GEN_END-->


## Description

This command stops the Docker Model Runner by removing the running containers, but preserves the container images on disk. Use this command when you want to temporarily stop the runner but plan to start it again later.

To completely remove the runner including images, use `docker model uninstall-runner --images` instead.
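For example, stopping the runner and also discarding the model storage volume in one step (a sketch using only the documented flag):

```console
docker model stop-runner --models
```

Omit `--models` to keep downloaded models on disk for the next `docker model start-runner`.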
@@ -1,7 +1,7 @@

# docker model uninstall-runner


<!---MARKER_GEN_START-->

Uninstall Docker Model Runner

Uninstall Docker Model Runner (Docker Engine only)


### Options
