Yedidia AGNIMO

YedsonUQ

AI & ML interests

[Uncertainty Quantification, "Hallucinations"] in LLMs, Federated Learning

Organizations

None yet

YedsonUQ's collections (22)

Fine-Tuning, PEFT

SingLoRA: Low Rank Adaptation Using a Single Matrix

Paper • 2507.05566 • Published Jul 8 • 111

Understanding LLM Representation

Large Language Models are Locally Linear Mappings

Paper • 2505.24293 • Published May 30 • 15

Test-Time Scaling (TTS)

What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Paper • 2503.24235 • Published Mar 31 • 55

L^2M: Mutual Information Scaling Law for Long-Context Language Modeling

Paper • 2503.04725 • Published Mar 6 • 21

AI-Automated Scientific Research

SurveyX: Academic Survey Automation via Large Language Models

Paper • 2502.14776 • Published Feb 20 • 101

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12, 2024 • 127

Towards an AI co-scientist

Paper • 2502.18864 • Published Feb 26 • 52

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper • 2504.17192 • Published Apr 24 • 115

Distributed Training and Federated Learning

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Paper • 2501.18512 • Published Jan 30 • 30

Large Language Models Think Too Fast To Explore Effectively

Paper • 2501.18009 • Published Jan 29 • 24

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123

Intuitive physics understanding emerges from self-supervised pretraining on natural videos

Paper • 2502.11831 • Published Feb 17 • 20

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published Feb 18 • 73

Linear Correlation in LM's Compositional Generalization and Hallucination

Paper • 2502.04520 • Published Feb 6 • 11

How to Steer LLM Latents for Hallucination Detection?

Paper • 2503.01917 • Published Mar 1 • 11

Are Reasoning Models More Prone to Hallucination?

Paper • 2505.23646 • Published May 29 • 25

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published Mar 7 • 81

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 418

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26 • 73

Baichuan-Omni-1.5 Technical Report

Paper • 2501.15368 • Published Jan 26 • 64

Reinforcement Learning (RL)

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 31

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123

Robust Reward Modeling via Causal Rubrics

Paper • 2506.16507 • Published Jun 19 • 9

Uncertainty Quantification

Evolution and The Knightian Blindspot of Machine Learning

Paper • 2501.13075 • Published Jan 22 • 6

From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

Paper • 2501.03282 • Published Jan 5

Efficient Test-Time Scaling via Self-Calibration

Paper • 2503.00031 • Published Feb 25 • 15

Investigating Human-Aligned Large Language Model Uncertainty

Paper • 2503.12528 • Published Mar 16 • 4

Hallucination Frameworks Ideas

Query decomposition, ambiguity,

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

Paper • 2506.01413 • Published Jun 2 • 15

Efficient Inference

Efficient Inference for Large Reasoning Models: A Survey

Paper • 2503.23077 • Published Mar 29 • 47

MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization

Paper • 2503.16874 • Published Mar 21 • 45

Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity

Paper • 2505.11107 • Published May 16 • 29

Foundational Deep Learning - Architecture

Forgetting Transformer: Softmax Attention with a Forget Gate

Paper • 2503.02130 • Published Mar 3 • 32

L^2M: Mutual Information Scaling Law for Long-Context Language Modeling

Paper • 2503.04725 • Published Mar 6 • 21

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 169

I-Con: A Unifying Framework for Representation Learning

Paper • 2504.16929 • Published Apr 23 • 30

Benchmark and Evaluation

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 76

Benchmarking LLMs for Political Science: A United Nations Perspective

Paper • 2502.14122 • Published Feb 19 • 2

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

Paper • 2503.04644 • Published Mar 6 • 21

ExpertGenQA: Open-ended QA generation in Specialized Domains

Paper • 2503.02948 • Published Mar 4

Explainable AI - Interpretable AI

Analyze Feature Flow to Enhance Interpretation and Steering in Language Models

Paper • 2502.03032 • Published Feb 5 • 61

Theory, Conceptualization, Paradigms

Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12 • 49

I-Con: A Unifying Framework for Representation Learning

Paper • 2504.16929 • Published Apr 23 • 30

Chain-of-Model Learning for Language Model

Paper • 2505.11820 • Published May 17 • 122

Learning Paradigm/Scheme

Feasible Learning

Paper • 2501.14912 • Published Jan 24 • 5

Reasoning - Chain-of-Thought

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17 • 116

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published Jan 20 • 34

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published Jan 16 • 34

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 41

Retrieval Augmented Generation (RAG)

Chain-of-Retrieval Augmented Generation

Paper • 2501.14342 • Published Jan 24 • 60

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

Paper • 2502.01142 • Published Feb 3 • 24

SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model

Paper • 2501.18636 • Published Jan 28 • 31

UniversalRAG: Retrieval-Augmented Generation over Multiple Corpora with Diverse Modalities and Granularities

Paper • 2504.20734 • Published Apr 29 • 63

A Survey on Large Language Models with some Insights on their Capabilities and Limitations

Paper • 2501.04040 • Published Jan 3
