Update README.md
README.md CHANGED
@@ -15,9 +15,8 @@ language:
 pipeline_tag: text-generation
 ---

-<img src="https://cdn-uploads.huggingface.co/production/uploads/645ded34a45b4182d7f5c385/EgsjPDWd37LjAtamiICxk.png" width="
+<img src="https://cdn-uploads.huggingface.co/production/uploads/645ded34a45b4182d7f5c385/EgsjPDWd37LjAtamiICxk.png" width="480" height="480" alt="image/png">

-

 ### Disclaimer
 This model is a base model which received aggressive pruning and knowledge distillation. To make it usable for your individual application, it must be fine-tuned.
@@ -66,7 +65,8 @@ Up to 40 % parameter reduction (24 B → 15 B) delivers 2× lower TTFT
 | Tokens / s | 579 | **812** | +40% |


-
+
+<img src="https://cdn-uploads.huggingface.co/production/uploads/645ded34a45b4182d7f5c385/4rDhaeC-1GMj6KWbB27f9.png" width="300" height="300" alt="image/png">

 ### Training scalability (distillation run, MI300A cluster)

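The disclaimer in the diff above notes that this checkpoint is a pruned and distilled base model and must be fine-tuned before use. As a rough illustration only, a minimal supervised fine-tuning sketch with Hugging Face `transformers` might look like the following; the model id, dataset, and hyperparameters are placeholders, not values taken from this repository.

```python
# Minimal causal-LM fine-tuning sketch for a pruned/distilled base model.
# NOTE: "org/pruned-15b-base", the dataset, and all hyperparameters are
# placeholders for illustration, not values from this model card.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "org/pruned-15b-base"  # placeholder: substitute the actual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # many base tokenizers ship without a pad token

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Placeholder corpus: any dataset with a "text" column can be swapped in.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finetuned-base",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,          # assumes bf16-capable hardware
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Instruction tuning or parameter-efficient methods such as LoRA are equally valid routes; the point is only that the raw base checkpoint is not meant to be served as-is.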