DeBERTa: Decoding-enhanced BERT with Disentangled Attention • arXiv:2006.03654 • Published Jun 5, 2020
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding • arXiv:1810.04805 • Published Oct 11, 2018
RoBERTa: A Robustly Optimized BERT Pretraining Approach • arXiv:1907.11692 • Published Jul 26, 2019
Training language models to follow instructions with human feedback • arXiv:2203.02155 • Published Mar 4, 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models • arXiv:2201.11903 • Published Jan 28, 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness • arXiv:2205.14135 • Published May 27, 2022