d - a that113 Collection

d

updated May 12

3.18k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

Paper • 2504.02587 • Published Apr 3 • 33

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 72

microsoft/Magma-8B

Image-Text-to-Text • 9B • Updated May 13 • 2.22k • 408