Volume 18, No. 11
CEDAR: A System for Cost-Efficient Data-Driven Claim Verification
Abstract
We present CEDAR, a system for cost-efficient, data-driven claim verification. CEDAR takes as input a collection of text documents containing claims that can be verified from relational data. The system uses large language models (LLMs) to map claims to SQL queries that can be used for claim verification. While LLMs like GPT4 can now map claims to queries with high accuracy, using them is expensive. This is why CEDAR implements multiple verification approaches, ranging from zero-shot LLM invocations to iterative, agent-based approaches, that realize different tradeoffs between accuracy and cost. The system may apply multiple methods to the same claim, starting with cheaper methods and falling back on more expensive ones when they fail. CEDAR uses cost-based optimization to derive an optimal order of verification methods and an optimal number of retries (with randomization) for each method, enabling users to trade cost for accuracy via tuning parameters. Experiments on real data, including newspaper and Wikipedia articles, show that CEDAR achieves significantly higher accuracy than prior methods for data-driven fact-checking.
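To make the idea of ordering verification methods and choosing retry counts more concrete, the following is a minimal sketch under simplifying assumptions; it is not CEDAR's actual optimizer, which the abstract does not specify. It assumes each method has a fixed cost per attempt and an independent per-attempt success probability, and it searches by brute force for the schedule with the lowest expected cost that still reaches a target success probability. The names `Method`, `evaluate`, `best_schedule`, and all parameters are hypothetical.

```python
from itertools import permutations, product
from typing import NamedTuple


class Method(NamedTuple):
    name: str
    cost: float       # assumed cost per attempt (e.g., LLM fees); hypothetical
    p_success: float  # assumed independent per-attempt success probability


def evaluate(schedule):
    """Return (expected_cost, success_probability) for a schedule.

    A schedule is a list of (Method, tries) pairs executed in order;
    processing of a claim stops at the first successful attempt.
    """
    expected_cost = 0.0
    p_reach = 1.0  # probability that every earlier attempt has failed
    for method, tries in schedule:
        for _ in range(tries):
            expected_cost += p_reach * method.cost
            p_reach *= 1.0 - method.p_success
    return expected_cost, 1.0 - p_reach


def best_schedule(methods, min_success, max_tries=3):
    """Brute-force search over method orders and per-method retry counts.

    Returns (expected_cost, success_probability, schedule) for the cheapest
    schedule whose success probability is at least min_success, or None if
    no schedule within max_tries per method qualifies.
    """
    best = None
    for order in permutations(methods):
        for tries in product(range(max_tries + 1), repeat=len(order)):
            schedule = [(m, t) for m, t in zip(order, tries) if t > 0]
            cost, p = evaluate(schedule)
            if p >= min_success and (best is None or cost < best[0]):
                best = (cost, p, schedule)
    return best
```

In this toy model, raising `min_success` plays the role of a tuning parameter: it pushes the search toward schedules that spend more on expensive methods or extra retries. Exhaustive enumeration is only practical for a handful of methods and is used here purely for clarity.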
PVLDB is part of the VLDB Endowment Inc.