Query Rewrite
Rewrite the query with an LLM, then retrieve with an existing retrieval pipeline.
Overview
| Field | Value |
|---|---|
| Type | Retrieval |
| Algorithm | Rewrite-Retrieve |
| Modality | Text |
| Paper | Query Rewriting for Retrieval-Augmented Large Language Models |
How It Works
- Receives a query
- Uses an LLM to rewrite the query into a retrieval-optimized search query
- Passes the rewritten query to a configured retrieval pipeline
- Persists the wrapper pipeline's retrieval outputs as usual
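The steps above can be sketched as a small wrapper function. This is an illustrative outline only, not the library's implementation: `rewrite_then_retrieve`, `fake_llm`, and `fake_retriever` are hypothetical names, the LLM and retriever are plain callables standing in for the real components, and the persistence step is omitted.

```python
from typing import Callable

PROMPT = (
    "Rewrite the following question into a concise search query.\n"
    "Question: {query}\n"
    "Rewritten query:"
)


def rewrite_then_retrieve(
    query: str,
    llm: Callable[[str], str],
    retriever: Callable[[str, int], list[str]],
    top_k: int = 10,
) -> tuple[str, list[str]]:
    """Rewrite the query with the LLM, then retrieve with the rewritten text."""
    prompt = PROMPT.format(query=query)
    rewritten = llm(prompt).strip()
    # Assumption of this sketch: fall back to the original query on an empty rewrite.
    if not rewritten:
        rewritten = query
    return rewritten, retriever(rewritten, top_k)


# Hypothetical stand-ins for a real LLM and a real retrieval pipeline.
def fake_llm(prompt: str) -> str:
    return "  transformer architecture original paper authors \n"


def fake_retriever(search_query: str, top_k: int) -> list[str]:
    corpus = ["doc-attention", "doc-bert", "doc-gpt"]
    return corpus[:top_k]


rewritten, hits = rewrite_then_retrieve(
    "Who wrote the original paper on transformers?", fake_llm, fake_retriever, top_k=2
)
print(rewritten)
print(hits)
```

Because the retriever only ever sees a string, any text-based retrieval pipeline can be swapped in without changing the wrapper.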
This is different from HyDE:
- Query Rewrite changes the query text before retrieval.
- HyDE generates a hypothetical answer passage, embeds it, and searches with that embedding.
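The contrast can be made concrete with a toy sketch. Everything here is hypothetical: `embed` is a bag-of-words stand-in for a real embedding model, and the two lambdas stand in for LLM calls. The point is only the type of what reaches the retriever: text for Query Rewrite, a vector for HyDE.

```python
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def query_rewrite_input(query: str, rewrite: Callable[[str], str]) -> str:
    # Query Rewrite: the retriever still receives text, only rephrased.
    return rewrite(query)


def hyde_input(query: str, generate_passage: Callable[[str], str]) -> Counter:
    # HyDE: the retriever receives the embedding of a generated passage.
    return embed(generate_passage(query))


question = "Who wrote the original paper on transformers?"
text_input = query_rewrite_input(question, lambda q: "transformer original paper authors")
vector_input = hyde_input(question, lambda q: "Vaswani et al. introduced the transformer in 2017")

print(type(text_input).__name__)
print(type(vector_input).__name__)
```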
Scope
This implementation covers the paper's practical inference-time rewrite-retrieve flow only. The paper's trainable/RL rewriter is out of scope for this MVP.
Configuration
_target_: autorag_research.pipelines.retrieval.query_rewrite.QueryRewritePipelineConfig
name: query_rewrite_bm25
llm: openai-gpt5-mini
retrieval_pipeline_name: bm25
prompt_template: |
  Rewrite the following question into a concise search query.
  Keep the original intent, add missing retrieval hints only when helpful, and return only the rewritten query.
  Question: {query}
  Rewritten query:
top_k: 10
batch_size: 128
Options
| Option | Type | Default | Description |
|---|---|---|---|
| name | str | required | Unique pipeline instance name |
| llm | str | required | LLM config name (from configs/llm/) |
| retrieval_pipeline_name | str | required | Existing retrieval pipeline config to wrap |
| prompt_template | str | see below | Template with {query} placeholder |
| top_k | int | 10 | Results per query |
| batch_size | int | 128 | Queries per batch |
Default prompt template:
Rewrite the following question into a concise search query.
Keep the original intent, add missing retrieval hints only when helpful, and return only the rewritten query.
Question: {query}
Rewritten query:
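As a quick illustration of how the `{query}` placeholder is filled, the sketch below substitutes a question into the default template. It assumes Python `str.format`-style substitution, which this page does not confirm; `DEFAULT_PROMPT_TEMPLATE` is a name chosen for the example.

```python
# Assumption: the template is filled with str.format-style substitution.
DEFAULT_PROMPT_TEMPLATE = (
    "Rewrite the following question into a concise search query.\n"
    "Keep the original intent, add missing retrieval hints only when helpful, "
    "and return only the rewritten query.\n"
    "Question: {query}\n"
    "Rewritten query:"
)

prompt = DEFAULT_PROMPT_TEMPLATE.format(
    query="Who wrote the original paper on transformers?"
)
print(prompt)
```

If the pipeline does use `str.format`, a custom template that needs literal braces would have to double them (`{{` and `}}`).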
Usage
Python API
import asyncio

from langchain_openai import ChatOpenAI

from autorag_research.orm.connection import DBConnection
from autorag_research.pipelines.retrieval.bm25 import BM25RetrievalPipeline
from autorag_research.pipelines.retrieval.query_rewrite import QueryRewriteRetrievalPipeline


async def main() -> None:
    db = DBConnection.from_config()
    session_factory = db.get_session_factory()

    # Existing retriever that receives the rewritten query.
    wrapped_retriever = BM25RetrievalPipeline(
        session_factory=session_factory,
        name="bm25",
        tokenizer="bert",
    )

    # Wrapper that rewrites the query with the LLM before retrieving.
    pipeline = QueryRewriteRetrievalPipeline(
        session_factory=session_factory,
        name="query_rewrite_bm25",
        llm=ChatOpenAI(model="gpt-5-mini"),
        retrieval_pipeline=wrapped_retriever,
    )

    # retrieve() is awaited, so it has to run inside an async function.
    results = await pipeline.retrieve("Who wrote the original paper on transformers?", top_k=10)
    print(results)


asyncio.run(main())
YAML / Executor
Create or reuse a wrapped retrieval config such as configs/pipelines/retrieval/bm25.yaml, then point the query rewrite config at it:
# configs/pipelines/retrieval/query_rewrite_bm25.yaml
_target_: autorag_research.pipelines.retrieval.query_rewrite.QueryRewritePipelineConfig
name: query_rewrite_bm25
llm: openai-gpt5-mini
retrieval_pipeline_name: bm25
The executor will resolve retrieval_pipeline_name, instantiate the wrapped retriever, and inject it automatically.
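The resolve-and-inject step can be pictured with the sketch below. It is not the executor's actual code: `QueryRewriteConfig`, `RETRIEVER_FACTORIES`, `bm25_stub`, and `resolve_wrapped_retriever` are hypothetical names, and a plain dictionary stands in for however the executor really looks up pipeline configs.

```python
from dataclasses import dataclass
from typing import Callable

Retriever = Callable[[str, int], list[str]]


@dataclass
class QueryRewriteConfig:
    """Hypothetical stand-in for the real config class."""

    name: str
    llm: str
    retrieval_pipeline_name: str


def bm25_stub(query: str, top_k: int) -> list[str]:
    # Placeholder retriever that returns fake hit IDs.
    return [f"bm25-hit-{i}" for i in range(top_k)]


# Hypothetical registry mapping config names to retriever factories.
RETRIEVER_FACTORIES: dict[str, Callable[[], Retriever]] = {"bm25": lambda: bm25_stub}


def resolve_wrapped_retriever(config: QueryRewriteConfig) -> Retriever:
    """Look up the wrapped retriever by name and instantiate it."""
    try:
        factory = RETRIEVER_FACTORIES[config.retrieval_pipeline_name]
    except KeyError:
        raise ValueError(
            f"Unknown retrieval pipeline: {config.retrieval_pipeline_name!r}"
        ) from None
    return factory()


config = QueryRewriteConfig(
    name="query_rewrite_bm25", llm="openai-gpt5-mini", retrieval_pipeline_name="bm25"
)
retriever = resolve_wrapped_retriever(config)
print(retriever("transformer original paper authors", 2))
```

Failing fast on an unknown name is a design choice of this sketch; it surfaces a misspelled `retrieval_pipeline_name` before any LLM calls are made.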
When to Use
Good for:
- conversational or underspecified queries
- preserving an existing retriever while improving query phrasing
- lightweight LLM-assisted retrieval baselines
Consider other methods when:
- you need the paper's trainable/RL rewriter
- you prefer dense retrieval over rewritten text search
- added LLM latency is unacceptable