Contextualized Chunk Embeddings
Model Choices
Voyage currently provides the following contextualized chunk embedding models:
| Model | Per Chunk Context Window (tokens) | Context Length (tokens) | Embedding Dimension | Description |
|---|---|---|---|---|
| In Preview: | 32,000 | 120,000* | 1024 (default), 256, 512, 2048 | Contextualized chunk embeddings optimized for general-purpose and multilingual retrieval quality. |
| | 32,000 | 120,000* | 1024 (default), 256, 512, 2048 | Contextualized chunk embeddings optimized for general-purpose and multilingual retrieval quality. To learn more, see the blog post. |
Python API
Voyage contextualized chunk embeddings are accessible in Python through the voyageai package. Install the package, set up your API key, and use voyageai.Client.contextualized_embed() to vectorize your inputs.
voyageai.Client.contextualized_embed(
    inputs: Union[List[List[str]], List[str]],  # see below for specifics on when to pass one or the other
    model: str,
    input_type: Optional[str] = None,
    output_dimension: Optional[int] = None,
    output_dtype: Optional[str] = "float",
    enable_auto_chunking: Optional[bool] = False,
    chunk_size: Optional[int] = 512,
    chunk_overlap: Optional[int] = 0,
    chunk_fn: Optional[Callable[[str], List[str]]] = None,
)
Parameters
- inputs (List[List[str]] or List[str]) - The input texts to be vectorized.
- model (str) - Name of the model. Recommended options: voyage-context-4.
- input_type (str, optional, defaults to None) - Type of the input text.
  - Options: None, query, document.
  - When input_type is None, the model directly converts the inputs into numerical vectors. For retrieval and search, we recommend setting input_type to query or document. In those cases, Voyage automatically prepends a prompt before vectorizing the input. Embeddings generated with and without input_type are compatible.
- output_dimension (int, optional, defaults to None) - The number of dimensions for resulting embeddings.
  - Options: 2048, 1024 (default), 512, and 256.
- output_dtype (str, optional, defaults to float) - The data type for returned embeddings.
  - Options: float, int8, uint8, binary, ubinary. See Flexible Dimensions and Quantization for details.
    - float: Each embedding is a list of 32-bit floating-point numbers.
    - int8 and uint8: Each embedding is a list of 8-bit integers.
    - binary and ubinary: Each embedding is a list of 8-bit integers representing bit-packed single-bit values. The returned list length is 1/8 of output_dimension. binary uses offset binary.
- enable_auto_chunking (bool, optional, defaults to False) - Whether to automatically chunk each input document on the backend. When True, inputs must be a flat List[str] of full-document strings, and input_type must be document.
- chunk_size (int, optional) - Target chunk size in tokens when enable_auto_chunking=True. If omitted, the server resolves it to 512. chunk_size must not exceed 32K tokens.
- chunk_overlap (int, optional, defaults to 0) - Number of tokens shared between consecutive chunks, used to carry context across chunk boundaries. Only valid when enable_auto_chunking=True. chunk_overlap must be smaller than chunk_size.
  - The values passed for chunk_size and chunk_overlap are upper bounds. The actual chunk size and overlap can be lower than the values passed, but never higher.
  - Overlapping tokens are billed in the same way as input tokens.
- chunk_fn (Callable[[str], List[str]], optional, defaults to None) - A custom client-side chunking function. If provided, it is applied locally to each input string before the request is sent. For convenience, voyageai.default_chunk_fn is available. Use chunk_fn for client-side chunking only; it cannot be combined with enable_auto_chunking=True.
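To make the binary and ubinary description above more concrete, here is a minimal pure-Python sketch of bit-packing and offset binary. It is an illustration only: the helper names pack_bits and to_offset_binary are hypothetical, and the threshold (positive values map to bit 1) and the bit order (most significant bit first) are assumptions, not a statement of how the service quantizes embeddings.

```python
from typing import List


def pack_bits(values: List[float]) -> List[int]:
    """Quantize each dimension to one bit and pack 8 bits into one unsigned byte."""
    if len(values) % 8 != 0:
        raise ValueError("dimension must be a multiple of 8")
    packed = []
    for start in range(0, len(values), 8):
        byte = 0
        for value in values[start:start + 8]:
            # Assumption: a positive component becomes bit 1, anything else bit 0.
            # Assumption: the first dimension lands in the most significant bit.
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte)
    return packed


def to_offset_binary(ubinary: List[int]) -> List[int]:
    """Shift unsigned bytes (0..255) into the signed int8 range (-128..127)."""
    return [b - 128 for b in ubinary]


# A toy 16-dimensional vector packs into 16 / 8 = 2 integers.
embedding = [0.3, -0.1, 0.7, 0.2, -0.5, -0.9, 0.4, 0.1] * 2
ubinary = pack_bits(embedding)
binary = to_offset_binary(ubinary)
```

Under these assumptions a 1024-dimensional embedding packs into 128 integers, which matches the 1/8 length stated above.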
Returns
- A ContextualizedEmbeddingsObject, containing the following attributes:
  - results (List[ContextualizedEmbeddingsResult]) - One result per query or document.
    - embeddings (List[List[float]] or List[List[int]]) - Embeddings corresponding to a query, a document, or chunks from the same document. For document chunks, embeddings are ordered to match chunk order.
    - chunk_texts (List[str]) - Chunk text returned by the Python SDK for chunked document results. If you provide a client-side chunk_fn, these correspond to the chunks produced by that function. When enable_auto_chunking=True they correspond to the backend-generated chunks.
    - index (int) - The index of the query or document in the input list.
  - total_tokens (int) - The total number of tokens in the input texts.
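The attributes listed above can be walked with two nested loops. The sketch below is a hedged illustration: FakeResult and FakeEmbeddingsObject are hypothetical stand-ins that only mirror the documented attribute names so the example runs without an API call, and flatten_results is a helper name introduced here, not part of the SDK.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeResult:
    """Hypothetical stand-in mirroring ContextualizedEmbeddingsResult attributes."""
    embeddings: List[List[float]]
    chunk_texts: Optional[List[str]]
    index: int


@dataclass
class FakeEmbeddingsObject:
    """Hypothetical stand-in mirroring ContextualizedEmbeddingsObject attributes."""
    results: List[FakeResult]
    total_tokens: int


def flatten_results(obj) -> List[Tuple[int, int, Optional[str], List[float]]]:
    """Return one (input index, chunk position, chunk text, embedding) row per embedding."""
    rows = []
    for result in obj.results:
        # chunk_texts is documented for chunked document results; fall back to None otherwise.
        texts = getattr(result, "chunk_texts", None) or [None] * len(result.embeddings)
        for position, (text, vector) in enumerate(zip(texts, result.embeddings)):
            rows.append((result.index, position, text, vector))
    return rows


fake = FakeEmbeddingsObject(
    results=[
        FakeResult(embeddings=[[0.1, 0.2], [0.3, 0.4]], chunk_texts=["first", "second"], index=0),
        FakeResult(embeddings=[[0.5, 0.6]], chunk_texts=None, index=1),
    ],
    total_tokens=12,
)
rows = flatten_results(fake)
```

Because embeddings are ordered to match chunk order, pairing them positionally with chunk_texts keeps each vector next to the text it represents.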
Example: See our quickstart below.
REST API
Voyage contextualized chunk embeddings can be accessed by calling the endpoint POST https://api.voyageai.com/v1/contextualizedembeddings. See the Contextualized Chunk Embeddings API Reference for the specification.
Example
curl -X POST https://api.voyageai.com/v1/contextualizedembeddings \
  -H "Authorization: Bearer $VOYAGE_API_KEY" \
  -H "content-type: application/json" \
  -d '
  {
    "inputs": [
      "This is the SEC filing on Leafy Inc.\u0027s Q2 2024 performance.\nThe company\u0027s revenue increased by 15% compared to the previous quarter.",
      "This is the SEC filing on Elephant Ltd.\u0027s Q2 2024 performance.\nThe company\u0027s revenue decreased by 2% compared to the previous quarter."
    ],
    "input_type": "document",
    "model": "voyage-context-4",
    "enable_auto_chunking": true,
    "chunk_size": 512,
    "chunk_overlap": 0
  }'
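For readers who prefer Python over curl, the same request can be assembled with the standard library. This is a minimal sketch, assuming the endpoint, headers, and body fields shown in the curl example; build_request is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request

API_URL = "https://api.voyageai.com/v1/contextualizedembeddings"


def build_request(documents, api_key, chunk_size=512, chunk_overlap=0):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "inputs": documents,
        "input_type": "document",
        "model": "voyage-context-4",
        "enable_auto_chunking": True,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


documents = [
    "This is the SEC filing on Leafy Inc.'s Q2 2024 performance.\nThe company's revenue increased by 15% compared to the previous quarter.",
    "This is the SEC filing on Elephant Ltd.'s Q2 2024 performance.\nThe company's revenue decreased by 2% compared to the previous quarter.",
]
# The fallback key is a placeholder so the sketch runs without credentials.
request = build_request(documents, os.environ.get("VOYAGE_API_KEY", "placeholder-key"))

# To actually send the request:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```

Letting json.dumps serialize the payload avoids the manual \u0027 escaping that the shell-quoted curl body needs.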
Response Shape
{
  "data": [
    {
      "data": [
        { "embedding": [...], "index": 0, "text": "chunk text here" },
        { "embedding": [...], "index": 1, "text": "..." }
      ],
      "index": 0
    }
  ],
  "model": "voyage-context-4",
  "usage": { "total_tokens": 100 },
  "chunker_version": "1.0.0"
}
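The nested data arrays in the response shape above can be flattened into one row per chunk. The following is a small sketch that assumes exactly the field names shown in that shape; extract_chunks is a hypothetical helper and the sample values are made up for illustration.

```python
from typing import Any, Dict, List


def extract_chunks(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the nested response into one row per chunk embedding."""
    rows = []
    for document in response["data"]:      # outer list: one entry per input
        for item in document["data"]:      # inner list: one entry per chunk
            rows.append({
                "document_index": document["index"],
                "chunk_index": item["index"],
                "text": item.get("text"),
                "embedding": item["embedding"],
            })
    return rows


# Made-up sample that follows the documented shape.
sample = {
    "data": [
        {
            "data": [
                {"embedding": [0.1, 0.2], "index": 0, "text": "chunk text here"},
                {"embedding": [0.3, 0.4], "index": 1, "text": "another chunk"},
            ],
            "index": 0,
        }
    ],
    "model": "voyage-context-4",
    "usage": {"total_tokens": 100},
    "chunker_version": "1.0.0",
}
chunks = extract_chunks(sample)
```

Keeping document_index and chunk_index on every row makes it straightforward to store chunks in a vector database and trace them back to their source document.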
Inputs Validation
| Use Case | inputs shape | input_type | enable_auto_chunking | chunk_fn | Notes |
|---|---|---|---|---|---|
| Embed pre-chunked documents | List[List[str]] | document | False or omitted | Omitted | Each inner list contains one document's chunks |
| Client-side chunking and embedding | List[List[str]] | document | False or omitted | Provided | Common pattern: one full document string per inner list |
| Auto chunking and embedding | List[str] | document | True | Omitted | chunk_size and chunk_overlap are accepted only in this mode |
| Embed queries | List[str] or List[List[str]] | query | False or omitted | Omitted | If nested, each inner list should contain a single query |
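The table can be turned into a local pre-flight check. The sketch below is one possible reading of the rows above, not the SDK's own validation: classify_use_case is a hypothetical helper, and it is deliberately strict (for example, it rejects chunk_fn for queries because the table lists it as omitted, and it rejects input_type values the table does not cover).

```python
from typing import Callable, List, Optional, Union

Inputs = Union[List[str], List[List[str]]]


def classify_use_case(
    inputs: Inputs,
    input_type: Optional[str],
    enable_auto_chunking: bool = False,
    chunk_fn: Optional[Callable[[str], List[str]]] = None,
) -> str:
    """Return the matching use case from the table, or raise ValueError."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    is_flat = all(isinstance(item, str) for item in inputs)
    is_nested = all(
        isinstance(item, list) and all(isinstance(s, str) for s in item)
        for item in inputs
    )
    if not (is_flat or is_nested):
        raise ValueError("inputs must be List[str] or List[List[str]]")

    if enable_auto_chunking:
        if chunk_fn is not None:
            raise ValueError("chunk_fn cannot be combined with enable_auto_chunking=True")
        if input_type != "document":
            raise ValueError("enable_auto_chunking=True requires input_type='document'")
        if not is_flat:
            raise ValueError("auto chunking expects a flat List[str] of full documents")
        return "auto chunking and embedding"

    if input_type == "query":
        if chunk_fn is not None:
            raise ValueError("the table lists chunk_fn as omitted for queries")
        if is_nested and any(len(item) != 1 for item in inputs):
            raise ValueError("each inner list should contain a single query")
        return "embed queries"

    if input_type == "document":
        if is_flat:
            raise ValueError("List[str] document inputs require enable_auto_chunking=True")
        if chunk_fn is not None:
            return "client-side chunking and embedding"
        return "embed pre-chunked documents"

    # Strictness of this sketch: other input_type values are not in the table.
    raise ValueError("combination not covered by the table")
```

For example, classify_use_case(["full document"], "document", enable_auto_chunking=True) returns "auto chunking and embedding", while the same flat input with enable_auto_chunking=False raises ValueError.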
TypeScript Library
Voyage contextualized chunk embeddings are accessible in TypeScript through the Voyage TypeScript Library, which exposes all the functionality of our contextualized chunk embeddings endpoint (see Contextualized Chunk Embeddings API Reference).
Quickstart
This quickstart demonstrates getting started with each of the supported use cases for voyage-context-4. Each section includes a working example snippet and a list of valid parameters for the associated use case.
Auto Chunking and Embedding
Use this when you want Voyage to split each full document into chunks for you.
import voyageai

vo = voyageai.Client()

documents = [
    "This is the SEC filing on Leafy Inc.'s Q2 2024 performance.\nThe company's revenue increased by 15% compared to the previous quarter.",
    "This is the SEC filing on Elephant Ltd.'s Q2 2024 performance.\nThe company's revenue decreased by 2% compared to the previous quarter.",
]

result = vo.contextualized_embed(
    model="voyage-context-4",
    inputs=documents,
    input_type="document",
    enable_auto_chunking=True,
    chunk_size=512,
    chunk_overlap=0,
)
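To build intuition for how chunk_size and chunk_overlap interact, here is a rough fixed-stride sliding-window sketch over token positions. This is an assumption made for illustration only: the backend chunker may choose different boundaries, and actual chunks can be smaller than the values passed. The helper name window_spans is hypothetical.

```python
from typing import List, Tuple


def window_spans(num_tokens: int, chunk_size: int, chunk_overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) token spans for a fixed-stride window with overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if num_tokens <= 0:
        return []
    stride = chunk_size - chunk_overlap  # how far each new chunk advances
    spans = []
    start = 0
    while True:
        end = min(start + chunk_size, num_tokens)
        spans.append((start, end))
        if end >= num_tokens:
            break
        start += stride
    return spans


spans = window_spans(num_tokens=1000, chunk_size=512, chunk_overlap=64)
tokens_in_chunks = sum(end - start for start, end in spans)
repeated_tokens = tokens_in_chunks - 1000
```

In this toy model a 1,000-token document yields three chunks and 128 repeated tokens. Since overlapping tokens are billed like input tokens, a larger chunk_overlap raises the billed token count for the same document.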
Embed Pre-Chunked Documents
Use this when you have already split each document into chunks on the client side.
import voyageai

vo = voyageai.Client()

inputs = [
    [
        "This is the SEC filing on Leafy Inc.'s Q2 2024 performance.",
        "The company's revenue increased by 15% compared to the previous quarter.",
    ],
    [
        "This is the SEC filing on Elephant Ltd.'s Q2 2024 performance.",
        "The company's revenue decreased by 2% compared to the previous quarter.",
    ],
]

result = vo.contextualized_embed(
    model="voyage-context-4",
    inputs=inputs,
    input_type="document",
)
Each inner list is embedded as a group, so each chunk is encoded in the context of the other chunks from the same document.
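If your chunks live in a flat store (for example, rows in a database), they need to be regrouped into one inner list per document, in chunk order, before the call. The sketch below assumes a hypothetical record layout of (document id, chunk position, chunk text); group_chunks is a helper name introduced here.

```python
from typing import Dict, List, Tuple


def group_chunks(records: List[Tuple[str, int, str]]) -> Tuple[List[str], List[List[str]]]:
    """Group flat chunk records into per-document lists ordered by chunk position."""
    grouped: Dict[str, List[Tuple[int, str]]] = {}
    for doc_id, position, text in records:
        grouped.setdefault(doc_id, []).append((position, text))
    doc_ids = list(grouped)  # dicts preserve first-seen order
    inputs = [
        [text for _, text in sorted(grouped[doc_id], key=lambda pair: pair[0])]
        for doc_id in doc_ids
    ]
    return doc_ids, inputs


# Hypothetical records, deliberately out of order.
records = [
    ("leafy", 1, "The company's revenue increased by 15% compared to the previous quarter."),
    ("elephant", 0, "This is the SEC filing on Elephant Ltd.'s Q2 2024 performance."),
    ("leafy", 0, "This is the SEC filing on Leafy Inc.'s Q2 2024 performance."),
    ("elephant", 1, "The company's revenue decreased by 2% compared to the previous quarter."),
]
doc_ids, inputs = group_chunks(records)
```

The resulting inputs list has the List[List[str]] shape used above, and doc_ids lets you map each result's index back to its document.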
Client-Side Chunking and Embedding
Use this when you have full documents and want to handle chunking locally in the Python SDK rather than in the backend.
import voyageai

vo = voyageai.Client()

inputs = [
    [
        "This is the SEC filing on Leafy Inc.'s Q2 2024 performance.\nThe company's revenue increased by 15% compared to the previous quarter.",
    ],
    [
        "This is the SEC filing on Elephant Ltd.'s Q2 2024 performance.\nThe company's revenue decreased by 2% compared to the previous quarter.",
    ],
]

result = vo.contextualized_embed(
    model="voyage-context-4",
    inputs=inputs,
    input_type="document",
    chunk_fn=voyageai.default_chunk_fn,
)
chunk_fn is applied locally to each input string before the request is sent.
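Any function with the signature Callable[[str], List[str]] can be passed as chunk_fn. As a minimal sketch, the hypothetical splitter below (split_on_lines is not part of the SDK) treats every non-empty line as one chunk; real documents usually call for a more careful strategy.

```python
from typing import List


def split_on_lines(text: str) -> List[str]:
    """Split a document into chunks, one per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


document = (
    "This is the SEC filing on Leafy Inc.'s Q2 2024 performance.\n"
    "The company's revenue increased by 15% compared to the previous quarter."
)
chunks = split_on_lines(document)

# Passing it to the SDK would follow the same pattern as default_chunk_fn:
# result = vo.contextualized_embed(
#     model="voyage-context-4",
#     inputs=[[document]],
#     input_type="document",
#     chunk_fn=split_on_lines,
# )
```

Running the function on a sample document before sending a request is a cheap way to confirm the chunk boundaries are what you expect.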
Embedding Queries
Use this when your inputs are search queries.
import voyageai

vo = voyageai.Client()

result = vo.contextualized_embed(
    model="voyage-context-4",
    inputs=[
        "What was the revenue growth for Leafy Inc. in Q2 2024?",
        "What changed in Greenery Corp. between Q1 and Q2 2024?",
    ],
    input_type="query",
)
The following query input shape is also valid and is treated equivalently:
result = vo.contextualized_embed(
    model="voyage-context-4",
    inputs=[
        ["What was the revenue growth for Leafy Inc. in Q2 2024?"],
        ["What changed in Greenery Corp. between Q1 and Q2 2024?"],
    ],
    input_type="query",
)
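Once you have a query embedding and a set of chunk embeddings, a common next step in retrieval is to rank the chunks by similarity to the query. The sketch below uses cosine similarity on toy two-dimensional vectors; cosine and rank_chunks are hypothetical helper names, and the vectors are made up rather than returned by the API.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(
    query_embedding: List[float],
    chunk_embeddings: List[List[float]],
    chunk_texts: List[str],
) -> List[Tuple[float, str]]:
    """Return (score, text) pairs sorted from most to least similar."""
    scored = [
        (cosine(query_embedding, vector), text)
        for vector, text in zip(chunk_embeddings, chunk_texts)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy vectors for illustration only.
ranked = rank_chunks(
    query_embedding=[1.0, 0.0],
    chunk_embeddings=[[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]],
    chunk_texts=["chunk a", "chunk b", "chunk c"],
)
```

In practice the query vector would come from a call with input_type="query" and the chunk vectors from a call with input_type="document", both using the same model and output_dimension.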
Returned Chunk Text
The response contains contextualized embedding results for each query or document. For document chunks, embeddings are ordered to match chunk order. In the Python SDK, returned chunk text is available through chunk_texts on the result object. In the REST API, returned chunk text is available as text on each embedding item.
If you provide a client-side chunking function, the returned chunk text corresponds to the chunks produced by that function. When enable_auto_chunking=True, the response also includes the backend-generated chunk text for each returned embedding so you can inspect and store it.
Input Constraints
The following constraints apply to a request:
- The list must not contain more than 1,000 inputs.
- The total number of tokens across all inputs must not exceed 120K.
- The total number of chunks across all inputs must not exceed 16K.
- chunk_size and chunk_overlap require enable_auto_chunking=True.
- chunk_overlap must be smaller than chunk_size.
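These limits can be checked locally before a request is sent. The following is a rough sketch for pre-chunked inputs, with several assumptions: find_violations is a hypothetical helper, 120K and 16K are read as 120,000 and 16,000, and token counts come from a caller-supplied count_tokens function (the whitespace counter in the example is only a placeholder and will not match the model's tokenizer).

```python
from typing import Callable, List, Optional

MAX_INPUTS = 1_000
MAX_TOTAL_TOKENS = 120_000  # assumption: 120K read as 120,000
MAX_TOTAL_CHUNKS = 16_000   # assumption: 16K read as 16,000


def find_violations(
    inputs: List[List[str]],
    count_tokens: Callable[[str], int],
    enable_auto_chunking: bool = False,
    chunk_size: Optional[int] = None,
    chunk_overlap: Optional[int] = None,
) -> List[str]:
    """Return a list of human-readable constraint violations (empty if none)."""
    problems = []
    if len(inputs) > MAX_INPUTS:
        problems.append(f"{len(inputs)} inputs exceeds {MAX_INPUTS}")
    total_chunks = sum(len(chunks) for chunks in inputs)
    if total_chunks > MAX_TOTAL_CHUNKS:
        problems.append(f"{total_chunks} chunks exceeds {MAX_TOTAL_CHUNKS}")
    total_tokens = sum(count_tokens(chunk) for chunks in inputs for chunk in chunks)
    if total_tokens > MAX_TOTAL_TOKENS:
        problems.append(f"{total_tokens} tokens exceeds {MAX_TOTAL_TOKENS}")
    if (chunk_size is not None or chunk_overlap is not None) and not enable_auto_chunking:
        problems.append("chunk_size and chunk_overlap require enable_auto_chunking=True")
    if chunk_size is not None and chunk_overlap is not None and chunk_overlap >= chunk_size:
        problems.append("chunk_overlap must be smaller than chunk_size")
    return problems


def whitespace_count(text: str) -> int:
    """Placeholder token counter; a real check should use the model's tokenizer."""
    return len(text.split())


ok = find_violations([["a b c", "d e"], ["f"]], whitespace_count)
too_many = find_violations([["x"]] * 1001, whitespace_count)
```

Returning a list of messages instead of raising on the first problem lets a caller report every violated constraint at once.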
Common Invalid Parameter Combinations
- List[str] document inputs with enable_auto_chunking=False are invalid.
- enable_auto_chunking=True requires input_type="document".
- Do not use chunk_fn together with enable_auto_chunking=True.
Tutorial
For a full tutorial on using contextualized chunk embeddings, see Contextualized Chunk Embeddings: Combining Local Detail with Global Context. The Jupyter Notebook for this tutorial is available on GitHub in the GenAI Showcase repository.