Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs Paper • 2507.07996 • Published 26 days ago • 31 • 14
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models Paper • 2506.19697 • Published Jun 24 • 44 • 5
What Matters in Transformers? Not All Attention is Needed Paper • 2406.15786 • Published Jun 22, 2024 • 32 • 3