Factual Correctness

Feb 01, 2025

completed

Factual correctness is a metric that evaluates the factual accuracy of an LLM-generated response by comparing it with a reference.

The score ranges from $0$ to $1$, with higher values indicating better performance.
The metric uses an LLM to first break down the response and the reference into claims (statements), and then uses natural language inference to determine the factual overlap between the two sets of claims.
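The two steps described above can be sketched in a few lines of Python. This is a toy illustration, not the Ragas implementation: the claims are written by hand instead of being generated by an LLM, and the hypothetical `is_supported` helper stands in for the natural language inference step by using exact string matching. The sketch also assumes the usual reading of the counts: TP and FP are response claims that are, respectively, supported and unsupported by the reference, and FN are reference claims missing from the response.

```python
def is_supported(claim: str, evidence: list[str]) -> bool:
    """Hypothetical stand-in for an NLI check: exact match after normalization."""
    normalized = {c.strip().lower() for c in evidence}
    return claim.strip().lower() in normalized


def count_overlap(
    response_claims: list[str], reference_claims: list[str]
) -> tuple[int, int, int]:
    """Return (TP, FP, FN) for response claims judged against reference claims."""
    # TP: response claims that the reference supports.
    tp = sum(is_supported(c, reference_claims) for c in response_claims)
    # FP: response claims that the reference does not support.
    fp = len(response_claims) - tp
    # FN: reference claims that the response does not contain.
    fn = sum(not is_supported(c, response_claims) for c in reference_claims)
    return tp, fp, fn


# Hand-written claims (in practice an LLM would produce these).
response_claims = [
    "The Eiffel Tower is in Paris.",
    "The Eiffel Tower was completed in 1899.",
]
reference_claims = [
    "The Eiffel Tower is in Paris.",
    "The Eiffel Tower was completed in 1889.",
    "The Eiffel Tower is made of wrought iron.",
]

tp, fp, fn = count_overlap(response_claims, reference_claims)
print(tp, fp, fn)  # 1 1 2
```

A real NLI model would also accept paraphrases of a claim, which exact matching cannot do; the counting logic stays the same.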

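The factual overlap can be measured using precision, recall, and F1-score. In the formulas below, TP is the number of claims in the response that are supported by the reference, FP is the number of claims in the response that are not supported by the reference, and FN is the number of claims in the reference that are missing from the response.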

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
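As a worked example of the formulas above, the following minimal sketch computes the three scores from raw counts. The function name `factual_scores` is hypothetical, and returning `0.0` when a denominator is zero is an assumption made here for convenience, not something the source specifies.

```python
def factual_scores(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Compute precision, recall, and F1 from claim counts.

    Assumption: a score with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Example: 3 supported response claims, 1 unsupported response claim,
# and 2 reference claims missing from the response.
scores = factual_scores(tp=3, fp=1, fn=2)
print(scores)
```

With these counts, precision is $3/4 = 0.75$, recall is $3/5 = 0.6$, and the F1-score is $0.9/1.35 \approx 0.667$.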

Important

When working with this metric, it is possible to adjust the number of claims (statements) that the LLM generates from a single sentence, for both the response and the reference. This is controlled through the concepts of atomicity and coverage from the Ragas library.

Atomicity: how finely a sentence is broken down into its smallest meaningful components.
Coverage: how comprehensively the claims represent the information in the original sentence.
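To make the two concepts concrete, the sketch below shows how one sentence might decompose under different settings. The claims are hand-written for illustration only; they are not the output of Ragas or of any LLM, and the dictionary layout keyed by `(atomicity, coverage)` is a hypothetical structure chosen for this example.

```python
sentence = (
    "Marie Curie was a Polish-born physicist who discovered polonium "
    "and won two Nobel Prizes."
)

# Keys are (atomicity, coverage); values are hand-written example claims.
claims_by_setting = {
    # Many small claims, all details kept.
    ("high", "high"): [
        "Marie Curie was Polish-born.",
        "Marie Curie was a physicist.",
        "Marie Curie discovered polonium.",
        "Marie Curie won two Nobel Prizes.",
    ],
    # Small claims, but only part of the information is kept.
    ("high", "low"): [
        "Marie Curie was a physicist.",
        "Marie Curie discovered polonium.",
    ],
    # One large claim that keeps every detail.
    ("low", "high"): [sentence],
    # One claim that keeps only the main point.
    ("low", "low"): [
        "Marie Curie was a physicist.",
    ],
}

for (atomicity, coverage), claims in claims_by_setting.items():
    print(f"atomicity={atomicity}, coverage={coverage}: {len(claims)} claim(s)")
```

Because precision, recall, and F1 are computed over claims, the chosen settings change the number of claims being compared and can therefore shift the final score.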
