🧠 SmolLM3 Collection: Smol, multilingual, long-context reasoner • 12 items
Seq vs Seq: An Open Suite of Paired Encoders and Decoders • Paper • arXiv:2507.11412
Tulu 3 Datasets Collection: All datasets released with Tulu 3, state-of-the-art open post-training recipes • 33 items
💧 LFM2 Collection: LFM2 is a new generation of hybrid models designed for on-device deployment • 15 items
Falcon-H1 Collection: Falcon-H1 family of hybrid-head (Transformer-SSM) language models, including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B variants (pretrained & instruction-tuned) • 38 items
Article: Efficient MultiModal Data Pipeline • By ariG23498 and 4 others
Article: SmolLM3: smol, multilingual, long-context reasoner • By loubnabnl and 22 others
Article: Mixture of Experts Explained • By osanseviero and 5 others • Dec 11, 2023
Article: Vision Language Models (Better, Faster, Stronger) • By merve and 4 others • May 12
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? • Paper • arXiv:2305.07759 • Published May 12, 2023
SmolVLM: Redefining small and efficient multimodal models • Paper • arXiv:2504.05299 • Published Apr 7
Article: SmolLM - blazingly fast and remarkably powerful • By loubnabnl and 2 others • Jul 16, 2024
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • Paper • arXiv:2412.13663 • Published Dec 18, 2024