Intel/Qwen3-Coder-30B-A3B-Instruct-int4-AutoRound

4-bit precision

weiweiz1 committed 23 days ago

Commit b2459e1 · verified · 1 Parent(s): f7fcbf5

Update README.md

Files changed (1)

README.md +1 -1

@@ -19,7 +19,7 @@ Please follow the license of the original model.
 **vLLM usage**
 ~~~bash
-vllm serve Intel/Qwen3-Coder-30B-A3B-Instruct-int4-AutoRound --tensor-parallel-size 4  # --max-model-len 32768
+vllm serve Intel/Qwen3-Coder-30B-A3B-Instruct-int4-AutoRound --tensor-parallel-size 4  --max-model-len 65536
 ~~~
 **INT4 Inference on CPU/Intel GPU/CUDA**