# Retrieval Metrics

Metrics for evaluating document retrieval quality.

## Available Metrics
| Metric | Measures | When to Use |
|---|---|---|
| Recall@k | Coverage | Ensuring all relevant docs are found |
| Full Recall@k | Complete coverage | All evidence groups must be retrieved |
| Precision@k | Relevance | Minimizing irrelevant results |
| F1@k | Balance | Trading off recall against precision |
| NDCG@k | Ranking | When result order matters |
| MRR | First hit | Single-answer tasks |
| MAP | Overall quality | Comprehensive evaluation |
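
As a concrete reference for the first rows of the table, here is a minimal, library-independent sketch of Recall@k, Precision@k, and F1@k over ranked lists of document IDs. The function names and ID-list representation are illustrative only, not the package's API:

```python
def recall_at_k(retrieved, relevant, k):
    # Fraction of all relevant docs that appear in the top-k results.
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0


def precision_at_k(retrieved, relevant, k):
    # Fraction of the top-k results that are relevant.
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / k if k else 0.0


def f1_at_k(retrieved, relevant, k):
    # Harmonic mean of precision@k and recall@k.
    p = precision_at_k(retrieved, relevant, k)
    r = recall_at_k(retrieved, relevant, k)
    return 2 * p * r / (p + r) if p + r else 0.0
```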
## Common Parameters

All retrieval metrics compare the top-k retrieved results against ground-truth relevance judgments; k sets how many of the highest-ranked results are scored.
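
Concretely, each metric consumes a ranked list of retrieved document IDs, a set of ground-truth relevant IDs, and the cutoff k. Using the sketch above:

```python
retrieved = ["d3", "d7", "d1", "d9", "d4"]  # ranked retrieval results
relevant = {"d1", "d2", "d7"}               # ground-truth judgments

print(recall_at_k(retrieved, relevant, k=3))     # 2/3: d7 and d1 found, d2 missed
print(precision_at_k(retrieved, relevant, k=3))  # 2/3: d3 is irrelevant
```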
## Base Class

Custom metrics subclass `BaseRetrievalMetricConfig` and implement `get_metric_func`, which returns the callable that computes the score:

```python
from dataclasses import dataclass
from autorag_research.evaluation.metrics import BaseRetrievalMetricConfig


def my_metric_function(retrieved_ids, relevant_ids):
    # Example metric (assumed signature): recall over the retrieved ids.
    return len(set(retrieved_ids) & set(relevant_ids)) / max(len(relevant_ids), 1)


@dataclass
class MyMetricConfig(BaseRetrievalMetricConfig):
    def get_metric_func(self):
        # Return the callable that scores one query's retrieval results.
        return my_metric_function
```
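
A quick usage sketch (assuming the base config requires no constructor arguments; adapt if your subclass declares fields):

```python
config = MyMetricConfig()
metric = config.get_metric_func()
print(metric(["d3", "d7", "d1"], ["d1", "d7"]))  # 1.0: both relevant ids retrieved
```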