Avi66's Collections
Papers
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper
•
2504.11536
•
Published
•
62
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper
•
2505.24726
•
Published
•
271
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Paper
•
2503.12605
•
Published
•
36
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
Paper
•
2506.13585
•
Published
•
263
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper
•
2503.12937
•
Published
•
30
Transformer^2: Self-adaptive LLMs
Paper
•
2501.06252
•
Published
•
55
s1: Simple test-time scaling
Paper
•
2501.19393
•
Published
•
126
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Paper
•
2502.06703
•
Published
•
154
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Paper
•
2501.12370
•
Published
•
11
Self-Refine: Iterative Refinement with Self-Feedback
Paper
•
2303.17651
•
Published
•
2
Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval
Paper
•
2410.13339
•
Published
Gorilla: Large Language Model Connected with Massive APIs
Paper
•
2305.15334
•
Published
•
5
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
•
2403.10704
•
Published
•
60
Towards Optimal Learning of Language Models
Paper
•
2402.17759
•
Published
•
18
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
•
2403.03507
•
Published
•
189
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Paper
•
2403.07508
•
Published
•
78
Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning
Paper
•
2301.11660
•
Published
•
1
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries
Paper
•
2406.12824
•
Published
•
21
Scaling and evaluating sparse autoencoders
Paper
•
2406.04093
•
Published
•
3
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Paper
•
2412.10302
•
Published
•
18
LLM Post-Training: A Deep Dive into Reasoning Large Language Models
Paper
•
2502.21321
•
Published
•
1
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
Paper
•
2501.09620
•
Published
Paper
•
2505.09388
•
Published
•
288
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
Paper
•
2505.02707
•
Published
•
86
SmolVLM: Redefining small and efficient multimodal models
Paper
•
2504.05299
•
Published
•
199
EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge
Paper
•
2505.23009
•
Published
•
18
FP4 All the Way: Fully Quantized Training of LLMs
Paper
•
2505.19115
•
Published
•
3