RAGBench¶

RAG evaluation benchmark with generation ground truth.

Overview¶

Field	Value
Modality	Text
Generation GT	Yes
HF Repository	`ragbench-dumps`

Description¶

RAGBench provides datasets specifically designed for evaluating full RAG pipelines, including both retrieval and generation components. Unlike retrieval-only benchmarks, it includes expected answers for generation evaluation.

Sub-datasets¶

Name	Domain
covidqa	COVID-19 Q&A
pubmedqa	Biomedical
techqa	Technical

Download¶

autorag-research data restore ragbench covidqa_openai-small

Ingest from Source¶

autorag-research ingest --name=ragbench --extra config=covidqa --embedding-model=openai-small

Best For¶

Full RAG pipeline evaluation
Generation quality assessment
End-to-end benchmarking