BLEU

Bilingual Evaluation Understudy (BLEU): scores generated text by its n-gram precision against a reference.

Overview

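BLEU measures the fraction of n-grams in the generated text that also appear in the reference text. Matches are clipped so that a repeated n-gram cannot be counted more often than it occurs in the reference, precisions are combined across n-gram orders, and a brevity penalty lowers the score of outputs shorter than the reference. The metric was originally designed for machine translation evaluation.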
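To make the n-gram precision idea concrete, here is a minimal, unsmoothed sketch in plain Python. The function names (`ngrams`, `modified_precision`, `simple_bleu`) are hypothetical and are not part of the autorag_research API. The sketch splits on whitespace and applies no smoothing, so its numbers will generally differ from what the configured metric reports.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of size n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def simple_bleu(candidate, reference, max_order=4):
    """Geometric mean of the precisions for orders 1..max_order,
    multiplied by the brevity penalty. No smoothing (assumption of
    this sketch): any zero precision makes the whole score zero."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_order + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_order
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c >= r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()

p1 = modified_precision(candidate, reference, 1)   # 5 of 6 unigrams match
p2 = modified_precision(candidate, reference, 2)   # 3 of 5 bigrams match
score_2 = simple_bleu(candidate, reference, max_order=2)
score_4 = simple_bleu(candidate, reference, max_order=4)  # no 4-gram matches
```

In this example no 4-gram of the candidate appears in the reference, so the unsmoothed score at order 4 collapses to zero even though most words match. Short texts run into this often, which is why options such as `smooth_method` and `effective_order` exist.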

_target_: autorag_research.evaluation.metrics.generation.BleuConfig
tokenize: default
smooth_method: exp
max_ngram_order: 4
effective_order: true

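Option	Type	Default	Description
tokenize	str	`default`	Tokenization method applied to the text before n-grams are counted
smooth_method	str	`exp`	Smoothing method applied when an n-gram order has zero matches
max_ngram_order	int	`4`	Largest n-gram size included in the score
effective_order	bool	`true`	Use the effective n-gram order instead of always averaging over `max_ngram_order` (matters for short texts)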

Good for:

Limitations: