- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 84
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 152
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 24
Collections including paper arxiv:2402.17764
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 5
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper • 2210.17323 • Published • 8
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
  Paper • 2211.10438 • Published • 6
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 56
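The papers in this collection all build on the same primitive: mapping floating-point weights to low-bit integers with a per-row (or per-vector) scale. As a rough orientation only, here is a minimal NumPy sketch of the basic absmax int8 scheme; the function names are illustrative, not any library's API, and the outlier-handling and mixed-precision machinery that LLM.int8(), GPTQ, and SmoothQuant add on top is deliberately omitted.

```python
import numpy as np

def quantize_absmax_int8(x: np.ndarray):
    """Per-row absmax quantization: scale each row so its largest
    absolute value maps to 127, then round to int8."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero rows
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, s = quantize_absmax_int8(w)
w_hat = dequantize(q, s)
# Rounding error is bounded by half a quantization step per row.
max_err = np.abs(w - w_hat).max()
```

Because the scale is set by each row's largest magnitude, a single outlier inflates the step size for the whole row; handling such outliers is exactly what the collected papers address in different ways.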
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 624
- nanonets/Nanonets-OCR-s
  Image-Text-to-Text • 4B • Updated • 304k • 1.49k
- black-forest-labs/FLUX.1-Kontext-dev
  Image-to-Image • Updated • 419k • 2.23k
- DeepSite v2
  🐳 Generate any application with DeepSeek • 12.7k
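The 1.58-bit paper in this collection takes quantization to its extreme: every weight becomes one of {-1, 0, +1} (log2(3) ≈ 1.58 bits). A minimal sketch of absmean ternary rounding in that spirit follows; `quantize_ternary` is an illustrative name, and this is a toy reproduction of the idea, not the paper's training procedure (which quantizes during training, not after).

```python
import numpy as np

def quantize_ternary(w: np.ndarray):
    """Scale weights by the mean absolute value, then round and
    clip to the ternary set {-1, 0, +1}."""
    gamma = np.abs(w).mean()
    gamma = gamma if gamma > 0 else 1.0  # guard all-zero input
    q = np.clip(np.round(w / gamma), -1, 1).astype(np.int8)
    return q, gamma

rng = np.random.default_rng(1)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, gamma = quantize_ternary(w)
```

With ternary weights, a matrix multiply reduces to additions and subtractions of activations (the zeros are skipped), which is where the claimed efficiency gains come from.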
- Rewnozom/agent-zero-v1-a-01
  Text Generation • 4B • Updated • 3 • 1
- TheBloke/MythoMax-L2-13B-GGUF
  13B • Updated • 76k • 181
- DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF
  Text Generation • 18B • Updated • 76.1k • 337
- QuantFactory/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF
  Text Generation • 8B • Updated • 20.5k • 112
- Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
  Paper • 2211.04325 • Published • 1
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 21
- On the Opportunities and Risks of Foundation Models
  Paper • 2108.07258 • Published • 1
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
  Paper • 2204.07705 • Published • 2
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 374
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 624
- meta-llama/Llama-4-Scout-17B-16E-Instruct
  Image-Text-to-Text • 109B • Updated • 756k • 1.07k
- keras-io/GauGAN-Image-generation
  Updated • 16 • 4