Embeddings¶

Multi-vector embedding models for late interaction retrieval (MaxSim).

Overview¶

Unlike single-vector embeddings that produce one vector per input, multi-vector models produce one vector per token/patch. This enables late interaction retrieval where query-document similarity is computed as a sum of maximum similarities between individual token vectors (MaxSim scoring).

Available Embeddings¶

Embedding	Type	Modality	GPU Required
Infinity	API	Text + Image	No (server-side)
ColPali	Local	Text + Image	Yes
BiPali	Local	Text + Image	Yes

Base Classes¶

All multi-vector embeddings extend from the base classes in autorag_research.embeddings.base:

from autorag_research.embeddings.base import (
    MultiVectorBaseEmbedding,       # Text-only multi-vector
    MultiVectorMultiModalEmbedding, # Text + Image multi-vector
)

Methods¶

Text Embedding¶

Method	Description
`embed_text(text)`	Embed a single text
`aembed_text(text)`	Async embed a single text
`embed_query(query)`	Embed a single query
`aembed_query(query)`	Async embed a single query
`embed_documents(texts)`	Embed multiple texts
`aembed_documents(texts)`	Async embed multiple texts
`embed_documents_batch(texts)`	Embed with automatic batching
`aembed_documents_batch(texts)`	Async embed with automatic batching

Image Embedding (MultiModal only)¶

Method	Description
`embed_image(img)`	Embed a single image
`aembed_image(img)`	Async embed a single image
`embed_images(imgs)`	Embed multiple images
`aembed_images(imgs)`	Async embed multiple images
`embed_images_batch(imgs)`	Embed with automatic batching
`aembed_images_batch(imgs)`	Async embed with automatic batching

Image Input Types¶

Image methods accept any of the following types:

from autorag_research.types import ImageType
# ImageType = str | bytes | Path | BytesIO

str -- file path as string
Path -- pathlib.Path object
bytes -- raw image bytes
BytesIO -- in-memory file-like object