RAGBench

RAG evaluation benchmark with generation ground truth.

Overview

Field           Value
Modality        Text
Generation GT   Yes
HF Repository   ragbench-dumps

Description

RAGBench provides datasets specifically designed for evaluating full RAG pipelines, including both retrieval and generation components. Unlike retrieval-only benchmarks, it includes expected answers for generation evaluation.

Sub-datasets

Name       Domain
covidqa    COVID-19 Q&A
pubmedqa   Biomedical
techqa     Technical

Download

autorag-research data restore ragbench covidqa_openai-small

Ingest from Source

autorag-research ingest --name=ragbench --extra config=covidqa --embedding-model=openai-small
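
To ingest all three sub-datasets in one pass, the command above can be wrapped in a loop. This is a minimal sketch, not part of the documented tooling: it assumes that pubmedqa and techqa are accepted as `config=` values in the same way as covidqa, and that the same embedding model is wanted for each. The `build_ingest_cmd` helper and the `DRY_RUN` variable are hypothetical names introduced only for this example.

```shell
#!/usr/bin/env bash
# Sketch: ingest every RAGBench sub-dataset with one embedding model.
# Assumption: pubmedqa and techqa take the same flags shown for covidqa.
set -euo pipefail

EMBEDDING_MODEL="openai-small"
SUBSETS=(covidqa pubmedqa techqa)

# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: builds the ingest command line for one sub-dataset.
# $1 = sub-dataset name, passed as the value of config=
build_ingest_cmd() {
  printf 'autorag-research ingest --name=ragbench --extra config=%s --embedding-model=%s' \
    "$1" "$EMBEDDING_MODEL"
}

for subset in "${SUBSETS[@]}"; do
  cmd="$(build_ingest_cmd "$subset")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    # Word splitting is intentional here: the command contains no quoted arguments.
    $cmd
  fi
done
```

Running the script as-is prints one ingest command per sub-dataset, so the commands can be checked before anything is executed. Running it with `DRY_RUN=0` executes each ingest in turn and stops at the first failure because of `set -e`.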

Best For

  • Full RAG pipeline evaluation
  • Generation quality assessment
  • End-to-end benchmarking