Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection Paper • 2508.20766 • Published 5 days ago • 13 • 2
Train Long, Think Short: Curriculum Learning for Efficient Reasoning Paper • 2508.08940 • Published 21 days ago • 23 • 2
An Embarrassingly Simple Defense Against LLM Abliteration Attacks Paper • 2505.19056 • Published May 25 • 6 • 2
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think Paper • 2504.20708 • Published Apr 29 • 23 • 2
Towards Data-Efficient Pretraining for Atomic Property Prediction Paper • 2502.11085 • Published Feb 16 • 3 • 3
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch Paper • 2406.14563 • Published Jun 20, 2024 • 31 • 1