Article: What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware • By RakshitAralimatti • Aug 8 • 21
An Investigation of FP8 Across Accelerators for LLM Inference • Paper • 2502.01070 • Published Feb 3 • 3
SCBench: A KV Cache-Centric Analysis of Long-Context Methods • Paper • 2412.10319 • Published Dec 13, 2024 • 11
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation • Paper • 2412.07589 • Published Dec 10, 2024 • 48
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation • Paper • 2410.08159 • Published Oct 10, 2024 • 26
Article: Fine-tuning LLMs to 1.58bit: extreme quantization made easy • By medmekk and 5 others • Sep 18, 2024 • 265
Animagine XL 3.1 • Collection • The next iteration of Animagine XL 3.0 in the Animagine XL V3 series • 3 items • Updated Mar 18, 2024 • 35