Skip to content

FlashRank

Lightweight ONNX-based reranking.

Overview

Field Value
Type ONNX
Library flashrank
Default Model ms-marco-MiniLM-L-12-v2

Installation

pip install "autorag-research[gpu]"
# or
uv add "autorag-research[gpu]"

Configuration

_target_: autorag_research.rerankers.flashrank.FlashRankReranker
model_name: ms-marco-MiniLM-L-12-v2

Options

Option Type Default Description
model_name str ms-marco-MiniLM-L-12-v2 FlashRank model name
max_length int 512 Maximum input sequence length
batch_size int 64 Batch size for multiple queries

Usage

from autorag_research.rerankers import FlashRankReranker

reranker = FlashRankReranker()
results = reranker.rerank("What is RAG?", ["doc1", "doc2", "doc3"], top_k=2)

When to Use

Good for:

  • CPU-only environments
  • Low-latency requirements
  • Production deployments without GPU