Discussions

Unicode character support

We're evaluating Voyage for generating embeddings for later use in semantic search, but it appears that at least some Unicode characters are not supported and cause the embedding API call to return a 400 error: `{"detail":"There was an error parsing the body"}`. For example, `Pyrex Spring Blossom Light Green Beaded Edge Nested Mixing Bowl 402 1 ½ Qt` fails but `Pyrex Spring Blossom Light Green Beaded Edge Nested Mixing Bowl 402 1 Qt` succeeds. The same code succeeds for both inputs against Gemini's text embedding API, so the basic handling of HTTP requests, content types and encodings appears to be correct. The input is JSON encoded in UTF-8, which is the assumed encoding for `application/json`, and declaring the charset explicitly in the Content-Type header does not change the behaviour. Does the embedding API require some transformation of the input text to restrict it to an allowed subset of Unicode, or am I somehow issuing the request incorrectly?
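To help narrow this down, here is a minimal sketch of the two standard ways to serialize a JSON body containing `½`. It only builds the request bytes and makes no network call. The payload field names (`input`, `model`), the helper name `build_bodies`, and the idea that an ASCII-escaped body might sidestep the parse error are all assumptions to test, not confirmed behaviour of the API.

```python
import json

TEXT = "Pyrex Spring Blossom Light Green Beaded Edge Nested Mixing Bowl 402 1 ½ Qt"


def build_bodies(text, model):
    """Return two equivalent JSON request bodies for the same payload.

    The payload shape is an assumption made for illustration.
    """
    payload = {"input": [text], "model": model}
    # Default ensure_ascii=True escapes non-ASCII characters, so the body is
    # pure ASCII and "½" is sent as the six characters \u00bd.
    escaped = json.dumps(payload).encode("ascii")
    # ensure_ascii=False keeps "½" as-is; it becomes the UTF-8 bytes 0xC2 0xBD.
    raw_utf8 = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return escaped, raw_utf8


escaped_body, utf8_body = build_bodies(TEXT, "voyage-multilingual-2")

# Both bodies decode to the same JSON value; only the wire bytes differ.
assert json.loads(escaped_body) == json.loads(utf8_body)
print(len(escaped_body), len(utf8_body))
```

If the escaped body succeeds where the raw UTF-8 body fails, the problem is likely in how the request bytes are decoded on the way in rather than in which characters the model accepts.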

Clustering with multilingual embeddings

Hi there. Do we need to prepend "Cluster the text: " to our texts when using voyage-multilingual-2? Do you have instructions for each supported language? Thank you!

Traditional Chinese support?

Hi, does `voyage-multilingual-2` support Traditional Chinese? Is there a performance metric available for reference?

Where can we find benchmark results for multilingual performance of the language models?

We're trying to create a vector store using Voyage AI embeddings for French text. I saw one blog post vaguely mention that the rerank-1 model has multilingual support. Where can we find more detailed information on the multilingual performance of different models? Is there a Voyage AI embedding model, rather than a reranker, that performs well on French text?

Unable to use voyage-large-2-instruct embeddings in pgvector (for cosine distance)

`table.embedding <=> embeddings(voyage-large-2-instruct)`
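For context, `<=>` is pgvector's cosine distance operator, and it compares a `vector` column against a vector value, not against a model name. A minimal sketch of how a query embedding that has already been fetched from the embedding API might be passed in is shown below. The table name `items`, the column names, and the helpers `to_pgvector_literal` and `cosine_distance` are hypothetical, and the sketch assumes the column was declared as `vector(n)` with `n` equal to the model's output dimension.

```python
import math


def to_pgvector_literal(vec):
    """Format a sequence of numbers as a pgvector text literal, e.g. "[1.0,2.5]"."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


def cosine_distance(a, b):
    """Reference value for what `<=>` computes: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical table and column names; %s is a driver-side placeholder that
# receives the literal string and is cast to the vector type by PostgreSQL.
QUERY = "SELECT id FROM items ORDER BY embedding <=> %s::vector LIMIT 5"

# Stand-in for an embedding returned by the API (real ones are much longer).
query_embedding = [0.25, -0.5, 1.0]
params = (to_pgvector_literal(query_embedding),)
print(QUERY, params)
```

With a PostgreSQL driver, `QUERY` and `params` would be passed together to the cursor's execute call. `cosine_distance` is included only so results can be checked by hand on a few rows.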

Get billing and use data via API

I would like to retrieve cost information (API calls, accumulated cost, etc.) via the API so that I can include it in my Grafana panels.

When will you support JS/TypeScript in your quicklaunch/API?

Thank you!

Do you have a playground/workbench?

Perhaps it's on Hugging Face? I'm looking for a way of experimenting with different rolling windows and retrieval query schemes, comparing performance, etc. If you have this, it would greatly differentiate your product in my book from a usability perspective. Thank you!

Which languages does `voyage-law-2` support? Does it support Russian?

Which languages does `voyage-law-2` support? Does it support Russian?

Compressors

I would love to see an offering of a compressor model like microsoft/llmlingua-2 that we could use for both prompts and RAG results.