Update README.md
README.md

```diff
@@ -69,14 +69,14 @@ For local inference, you can use `llama.cpp`, `ONNX`, `MLX` and `MLC`. You can f
 
 ## Evaluation
 
 In this section, we report the evaluation results of SmolLM3 model. All evaluations are zero-shot unless stated otherwise, and we use [lighteval](https://github.com/huggingface/lighteval) to run them.
 
 We highlight the best score in bold and underline the second-best score.
 
 ### Base Pre-Trained Model
 
 #### English benchmarks
 
-Note: All evaluations are zero-shot unless stated otherwise.
+Note: All evaluations are zero-shot unless stated otherwise. For Ruler 64k evaluation, we apply YaRN to the Qwen models with 32k context to extrapolate the context length.
 
 | Category | Metric | SmolLM3-3B | Qwen2.5-3B | Llama3-3.2B | Qwen3-1.7B-Base | Qwen3-4B-Base |
 |----------|--------|------------|------------|-------------|-----------------|---------------|
```
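The added note says YaRN is applied to the Qwen models (native 32k context) to extrapolate to the Ruler 64k length. As a hedged illustration only (the commit does not show the actual evaluation config), this is roughly what such a rope-scaling override looks like in the `transformers`-style `rope_scaling` dict; the scaling factor here is simply the assumed ratio of target to native context:

```python
# Sketch: a YaRN rope-scaling config for extrapolating a model trained with
# a 32k context to a 64k evaluation length. The dict format follows the
# rope_scaling convention used by the transformers library; the exact values
# used in the README's evaluation are an assumption (factor = 64k / 32k).

NATIVE_CONTEXT = 32_768  # Qwen models' trained context length (32k)
TARGET_CONTEXT = 65_536  # Ruler 64k evaluation length

rope_scaling = {
    "rope_type": "yarn",
    "factor": TARGET_CONTEXT / NATIVE_CONTEXT,  # 2.0
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

# Such a dict would typically be passed (or patched into the model config)
# when loading the model, e.g.:
#   AutoModelForCausalLM.from_pretrained(model_id, rope_scaling=rope_scaling)
print(rope_scaling)
```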