From 9963fecdaf0d4533509077afdf55f2bf66dea2af Mon Sep 17 00:00:00 2001
From: Dorin-Andrei Geman
Date: Fri, 5 Dec 2025 12:13:17 +0200
Subject: [PATCH] gpu: mention Docker Model Runner (#23810)

## Description

Mention Docker Model Runner to test the GPU on Windows with NVIDIA platforms.

Screenshot 2025-12-04 at 17 33 12

## Reviews

- [ ] Technical review
- [ ] Editorial review
- [ ] Product review

---------

Signed-off-by: Dorin Geman
Co-authored-by: Allie Sadler <102604716+aevesdocker@users.noreply.github.com>
---
 content/manuals/desktop/features/gpu.md | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/content/manuals/desktop/features/gpu.md b/content/manuals/desktop/features/gpu.md
index 6e69184205..6ad6b5c3b9 100644
--- a/content/manuals/desktop/features/gpu.md
+++ b/content/manuals/desktop/features/gpu.md
@@ -63,16 +63,32 @@ GPU Device 0: "GeForce RTX 2060 with Max-Q Design" with compute capability 7.5
 = 2724.379 single-precision GFLOP/s at 20 flops per interaction
 ```
 
-## Run a real-world model: Llama2 with Ollama
+## Run a real-world model: SmolLM2 with Docker Model Runner
 
-Use the [official Ollama image](https://hub.docker.com/r/ollama/ollama) to run the Llama2 LLM with GPU acceleration:
+> [!NOTE]
+>
+> Docker Model Runner with vLLM for Windows with WSL2 is available starting with Docker Desktop 4.54.
+
+Use Docker Model Runner to run the SmolLM2 LLM with vLLM and GPU acceleration:
 
 ```console
-$ docker run --gpus=all -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
+$ docker model install-runner --backend vllm --gpu cuda
 ```
 
-Then start the model:
+Check that it's correctly installed:
 
 ```console
-$ docker exec -it ollama ollama run llama2
+$ docker model status
+Docker Model Runner is running
+
+Status:
+llama.cpp: running llama.cpp version: c22473b
+vllm: running vllm version: 0.11.0
+```
+
+Run the model:
+
+```console
+$ docker model run ai/smollm2-vllm hi
+Hello! I'm sure everything goes smoothly here.
 How can I assist you today?
 ```