Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge Paper • 2506.21506 • Published Jun 26 • 51
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA Paper • 2505.21115 • Published May 27 • 139
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published Apr 24 • 115
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy Paper • 2503.24388 • Published Mar 31 • 31
Article 🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It? By Kseniase • Mar 17 • 334
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 175
Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems Paper • 2502.11098 • Published Feb 16 • 13
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published Feb 17 • 47
SebastianBodza/flux_lora_aquarel_watercolor Text-to-Image • Updated Aug 17, 2024 • 158 • 28