Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding Paper • 2501.00712 • Published Jan 1, 2025 • 6
Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights? Paper • 2302.12480 • Published Feb 24, 2023
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models Paper • 2306.14048 • Published Jun 24, 2023 • 12
Robust Mixture-of-Expert Training for Convolutional Neural Networks Paper • 2308.10110 • Published Aug 19, 2023 • 2
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design Paper • 2410.19123 • Published Oct 24, 2024 • 15