A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code Paper • 2508.18106 • Published Aug 25 • 340
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5, 2024 • 71
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents Paper • 2507.04009 • Published Jul 5 • 45
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm Paper • 2506.05218 • Published Jun 5 • 2
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published Jul 1 • 78
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks Paper • 2507.01955 • Published Jul 2 • 35