arxiv:2502.10855

Towards Effective Extraction and Evaluation of Factual Claims

Published on Feb 15, 2025
Authors: Dasha Metropolitansky, Jonathan Larson

Abstract

A framework for evaluating claim extraction in fact-checking, along with Claimify, an LLM-based extraction method that addresses the lack of standardized evaluation and improves claim quality.

AI-generated summary

A common strategy for fact-checking long-form content generated by Large Language Models (LLMs) is extracting simple claims that can be verified independently. Since inaccurate or incomplete claims compromise fact-checking results, ensuring claim quality is critical. However, the lack of a standardized evaluation framework impedes assessment and comparison of claim extraction methods. To address this gap, we propose a framework for evaluating claim extraction in the context of fact-checking along with automated, scalable, and replicable methods for applying this framework, including novel approaches for measuring coverage and decontextualization. We also introduce Claimify, an LLM-based claim extraction method, and demonstrate that it outperforms existing methods under our evaluation framework. A key feature of Claimify is its ability to handle ambiguity and extract claims only when there is high confidence in the correct interpretation of the source text.
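
To make the evaluation side concrete, here is a minimal Python sketch of a sentence-level coverage check in the spirit of the framework described above. The prompt wording, the judge model, and the COVERED/MISSED scoring rule are assumptions for illustration, not the paper's actual protocol; an LLM judge is used because coverage is a semantic question, since a claim can restate a sentence's content in different words.

```python
# Hedged sketch of a sentence-level coverage check. All prompt text, the
# judge model, and the COVERED/MISSED scoring rule are assumptions for
# illustration; the paper defines its coverage metric more carefully.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = """You are evaluating claim extraction for fact-checking.
Source sentence: {sentence}
Extracted claims:
{claims}
Answer COVERED if the sentence's verifiable factual content is fully
captured by the claims above, otherwise answer MISSED."""

def coverage(sentences: list[str], claims: list[str]) -> float:
    """Fraction of source sentences whose factual content the claims capture."""
    claims_block = "\n".join(f"- {c}" for c in claims)
    covered = 0
    for sentence in sentences:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed judge model, not the paper's choice
            temperature=0,
            messages=[{
                "role": "user",
                "content": JUDGE_PROMPT.format(sentence=sentence, claims=claims_block),
            }],
        )
        if "COVERED" in response.choices[0].message.content.upper():
            covered += 1
    return covered / len(sentences) if sentences else 0.0
```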

Community

🔥 Claimify extracts claims (facts that can be verified) from text and, where applicable, disambiguates them based on the surrounding context (a rough sketch of this behavior follows the links below).

๐ŸŽ Video presentation by paper author Dasha, who is a research data scientist at Microsoft: https://youtu.be/WTs-Ipt0k-M

Relevant Links:
👉 Microsoft Research publication page: https://www.microsoft.com/en-us/research/publication/towards-effective-extraction-and-evaluation-of-factual-claims/

👉 Claimify blog post: https://www.microsoft.com/en-us/research/blog/claimify-extracting-high-quality-claims-from-language-model-outputs/

🤷‍♂️ I didn't see any GitHub repositories yet!
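
For intuition about how such an extractor might abstain under ambiguity, here is a rough, hypothetical Python sketch. The prompt, the NO_CLAIM sentinel, and the model choice are illustrative assumptions, not Claimify's actual multi-stage pipeline (see the paper and blog post above for that).

```python
# Hedged sketch of context-aware claim extraction with abstention. The
# prompt wording, the NO_CLAIM sentinel, and the model choice are
# illustrative assumptions; Claimify's actual multi-stage pipeline is
# described in the paper and blog post above.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EXTRACT_PROMPT = """Context:
{context}

Sentence: {sentence}

List each self-contained, verifiable factual claim made by the sentence,
one per line, resolving pronouns and vague references using the context.
If the sentence makes no verifiable claim, or its meaning remains ambiguous
even with the context, output exactly NO_CLAIM."""

def extract_claims(sentence: str, context: str) -> list[str]:
    """Return decontextualized claims, or an empty list when abstaining."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model choice
        temperature=0,
        messages=[{
            "role": "user",
            "content": EXTRACT_PROMPT.format(context=context, sentence=sentence),
        }],
    )
    text = response.choices[0].message.content.strip()
    if text == "NO_CLAIM":
        return []  # abstain instead of guessing at an ambiguous sentence
    return [line.lstrip("- ").strip() for line in text.splitlines() if line.strip()]
```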

How can we access the model?
