Embeddings
Multi-vector embedding models for late interaction retrieval (MaxSim).
Overview
Unlike single-vector embeddings, which produce one vector per input, multi-vector models produce one vector per token (for text) or per patch (for images). This enables late interaction retrieval: for each query vector, take its maximum similarity over all document vectors, then sum these maxima to get the query-document score (MaxSim scoring).
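The MaxSim scoring described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's implementation: the helper names `dot` and `maxsim` are hypothetical, dot product is assumed as the similarity measure, and both inputs are assumed to be non-empty lists of equal-length vectors.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot-product similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best similarity to any document vector."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy data: two query token vectors and two candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has an exact match per query token
doc_b = [[0.5, 0.5]]                          # a single, less specific vector

score_a = maxsim(query, doc_a)  # 1.0 + 1.0
score_b = maxsim(query, doc_b)  # 0.5 + 0.5
```

Here `doc_a` outranks `doc_b` because each query vector finds a closer match among its token vectors. The score is computed at query time from stored per-token vectors, which is what makes the interaction "late".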
Available Embeddings
| Embedding | Type | Modality | GPU Required |
|---|---|---|---|
| Infinity | API | Text + Image | No (server-side) |
| ColPali | Local | Text + Image | Yes |
| BiPali | Local | Text + Image | Yes |
Base Classes
All multi-vector embeddings extend the base classes in `autorag_research.embeddings.base`:
from autorag_research.embeddings.base import (
    MultiVectorBaseEmbedding,        # Text-only multi-vector
    MultiVectorMultiModalEmbedding,  # Text + Image multi-vector
)
Methods
Text Embedding
| Method | Description |
|---|---|
| `embed_text(text)` | Embed a single text |
| `aembed_text(text)` | Async embed a single text |
| `embed_query(query)` | Embed a single query |
| `aembed_query(query)` | Async embed a single query |
| `embed_documents(texts)` | Embed multiple texts |
| `aembed_documents(texts)` | Async embed multiple texts |
| `embed_documents_batch(texts)` | Embed with automatic batching |
| `aembed_documents_batch(texts)` | Async embed with automatic batching |
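To illustrate what "automatic batching" generally means, the sketch below splits a list of texts into fixed-size chunks and embeds each chunk in turn. It is an assumption-laden illustration, not the library's code: `chunked`, `embed_in_batches`, the `batch_size` parameter, and the stand-in `toy_embed` function are all hypothetical, and how the `*_batch` methods actually choose batch sizes is not specified here.

```python
from typing import Callable, Iterator, Sequence

# One multi-vector embedding: a list of per-token vectors.
MultiVector = list[list[float]]


def chunked(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[Sequence[str]], list[MultiVector]],
    batch_size: int = 2,
) -> list[MultiVector]:
    """Embed texts chunk by chunk, keeping the input order."""
    results: list[MultiVector] = []
    for batch in chunked(texts, batch_size):
        results.extend(embed_fn(batch))
    return results


def toy_embed(batch: Sequence[str]) -> list[MultiVector]:
    """Hypothetical stand-in embedder: one 1-d vector per whitespace token."""
    return [[[float(len(tok))] for tok in text.split()] for text in batch]


texts = ["late interaction", "multi vector retrieval", "maxsim"]
embeddings = embed_in_batches(texts, toy_embed, batch_size=2)
```

Each text yields a list of vectors rather than a single vector, so the result is a list of variable-length lists, one per input text.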
Image Embedding (MultiModal only)
| Method | Description |
|---|---|
| `embed_image(img)` | Embed a single image |
| `aembed_image(img)` | Async embed a single image |
| `embed_images(imgs)` | Embed multiple images |
| `aembed_images(imgs)` | Async embed multiple images |
| `embed_images_batch(imgs)` | Embed with automatic batching |
| `aembed_images_batch(imgs)` | Async embed with automatic batching |
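The async variants are useful when several embedding calls should run concurrently, for example against a remote API. The sketch below shows one common pattern with `asyncio.gather` and a semaphore that caps concurrency. It does not call the library: `fake_aembed_image` is a hypothetical stand-in for a single-image async call, and `embed_concurrently` and its `limit` parameter are illustrative names.

```python
import asyncio


async def fake_aembed_image(img: bytes) -> list[list[float]]:
    """Hypothetical stand-in for an async single-image embedding call."""
    await asyncio.sleep(0)       # yield control, as a real network call would
    return [[float(len(img))]]   # one toy "patch" vector per image


async def embed_concurrently(
    images: list[bytes], limit: int = 4
) -> list[list[list[float]]]:
    """Embed images concurrently, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(img: bytes) -> list[list[float]]:
        async with semaphore:
            return await fake_aembed_image(img)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(worker(img) for img in images))


results = asyncio.run(embed_concurrently([b"a", b"bb", b"ccc"]))
```

Capping concurrency with a semaphore is a design choice that keeps a remote server or a GPU from being flooded when the input list is large.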
Image Input Types
Image methods accept any of the following types:
from autorag_research.types import ImageType
# ImageType = str | bytes | Path | BytesIO
- `str` -- file path as a string
- `Path` -- `pathlib.Path` object
- `bytes` -- raw image bytes
- `BytesIO` -- in-memory file-like object
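As an illustration of how the four accepted input types can be reduced to one representation, the sketch below normalizes any of them to raw bytes. It is a hypothetical helper, not part of the library: `ImageInput` simply mirrors the documented `ImageType` alias, and `read_image_bytes` is an assumed name; how the embedding classes handle these inputs internally is not stated here.

```python
import tempfile
from io import BytesIO
from pathlib import Path
from typing import Union

# Mirrors the documented alias: ImageType = str | bytes | Path | BytesIO
ImageInput = Union[str, bytes, Path, BytesIO]


def read_image_bytes(img: ImageInput) -> bytes:
    """Return the raw bytes for any of the four accepted input types."""
    if isinstance(img, bytes):
        return img
    if isinstance(img, BytesIO):
        return img.getvalue()  # full buffer, regardless of read position
    if isinstance(img, (str, Path)):
        return Path(img).read_bytes()  # str is treated as a file path
    raise TypeError(f"unsupported image input type: {type(img).__name__}")


# All four input forms resolve to the same bytes.
payload = b"\x89PNG placeholder bytes"
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "page.png"
    path.write_bytes(payload)
    loaded = [
        read_image_bytes(source)
        for source in (str(path), path, payload, BytesIO(payload))
    ]
```

Note that a `str` is interpreted as a file path, not as encoded image data, which matches the type list above.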