From 9189223588d5aa0e16d4477a0d2e794bd929fd9d Mon Sep 17 00:00:00 2001
From: ArthurFlag
Date: Mon, 30 Jun 2025 13:52:01 +0200
Subject: [PATCH 1/3] docs: note about quantized models

---
 docs/reference/model_pull.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/reference/model_pull.md b/docs/reference/model_pull.md
index bae41fc3..8da5ade2 100644
--- a/docs/reference/model_pull.md
+++ b/docs/reference/model_pull.md
@@ -22,6 +22,10 @@ docker model pull ai/smollm2
 
 You can pull GGUF models directly from [Hugging Face](https://huggingface.co/models?library=gguf).
 
+**Note about quantization:** If no tag is specified, the command pulls the non-quantized version of the model.
+To specify the quantization, provide the version as a tag, for example:
+`docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S`
+
 ```console
 docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
 ```

From 21c10be6653b3e2c1c09536fe7df404117e3f902 Mon Sep 17 00:00:00 2001
From: ArthurFlag
Date: Mon, 30 Jun 2025 14:01:11 +0200
Subject: [PATCH 2/3] make docs

---
 docs/reference/docker_model_pull.yaml | 4 ++++
 docs/reference/model_pull.md          | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/reference/docker_model_pull.yaml b/docs/reference/docker_model_pull.yaml
index f10e4368..060b1cef 100644
--- a/docs/reference/docker_model_pull.yaml
+++ b/docs/reference/docker_model_pull.yaml
@@ -16,6 +16,10 @@ examples: |-
     You can pull GGUF models directly from [Hugging Face](https://huggingface.co/models?library=gguf).
 
+    **Note about quantization:** If no tag is specified, the command pulls the non-quantized version of the model.
+    To specify the quantization, provide it as a tag, for example:
+    `docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S`
+
     ```console
     docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
     ```
diff --git a/docs/reference/model_pull.md b/docs/reference/model_pull.md
index 8da5ade2..3d9e4613 100644
--- a/docs/reference/model_pull.md
+++ b/docs/reference/model_pull.md
@@ -23,7 +23,7 @@ docker model pull ai/smollm2
 You can pull GGUF models directly from [Hugging Face](https://huggingface.co/models?library=gguf).
 
 **Note about quantization:** If no tag is specified, the command pulls the non-quantized version of the model.
-To specify the quantization, provide the version as a tag, for example:
+To specify the quantization, provide it as a tag, for example:
 `docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S`
 
 ```console

From 03f4eeb192dcd60302407deb5a7032ee043f1807 Mon Sep 17 00:00:00 2001
From: ArthurFlag
Date: Tue, 1 Jul 2025 10:20:25 +0200
Subject: [PATCH 3/3] update

---
 docs/reference/docker_model_pull.yaml | 3 ++-
 docs/reference/model_pull.md          | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/reference/docker_model_pull.yaml b/docs/reference/docker_model_pull.yaml
index 060b1cef..f0843b02 100644
--- a/docs/reference/docker_model_pull.yaml
+++ b/docs/reference/docker_model_pull.yaml
@@ -16,7 +16,8 @@ examples: |-
     You can pull GGUF models directly from [Hugging Face](https://huggingface.co/models?library=gguf).
 
-    **Note about quantization:** If no tag is specified, the command pulls the non-quantized version of the model.
+    **Note about quantization:** If no tag is specified, the command tries to pull the `Q4_K_M` version of the model.
+    If `Q4_K_M` doesn't exist, the command pulls the first GGUF found in the **Files** view of the model on Hugging Face.
     To specify the quantization, provide it as a tag, for example:
     `docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S`
 
diff --git a/docs/reference/model_pull.md b/docs/reference/model_pull.md
index 3d9e4613..246cc59d 100644
--- a/docs/reference/model_pull.md
+++ b/docs/reference/model_pull.md
@@ -22,7 +22,8 @@ docker model pull ai/smollm2
 
 You can pull GGUF models directly from [Hugging Face](https://huggingface.co/models?library=gguf).
 
-**Note about quantization:** If no tag is specified, the command pulls the non-quantized version of the model.
+**Note about quantization:** If no tag is specified, the command tries to pull the `Q4_K_M` version of the model.
+If `Q4_K_M` doesn't exist, the command pulls the first GGUF found in the **Files** view of the model on Hugging Face.
 To specify the quantization, provide it as a tag, for example:
 `docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:Q4_K_S`