Discussions

Do you have a playground/workbench?

Perhaps it's on Hugging Face? I'm looking for a way to experiment with different rolling windows and retrieval query schemes, compare performance, etc.

Which languages does `voyage-law-2` support? Does it support Russian?

Which languages does voyage-law-2 support? Does it support Russian?

Compressors

I would love to see a compressor model offering, similar to microsoft/llmlingua-2, that we could use for both prompts and RAG results.

Asymmetric Embeddings Perform Worse for Code Search

I'm running an internal benchmark and Voyage has been amazing, about 5% better than OpenAI Ada v3. I was just wondering, has the code model also been instruction fine-tuned? I'm finding that if I add the document flag, the overall quality is equal or worse.

VoyageAI Embeddings seem to be very similar for dissimilar documents

I've been experimenting with using VoyageAI embeddings for a project where we are using cosine similarity as a first step in matching semantic equivalence of documents.
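For reference, the cosine similarity step mentioned above can be sketched in plain Python as follows. This is only an illustration of the metric itself; the function name is made up for this example, and the vectors would come from whatever embedding call the project uses.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Because the score depends only on the angle between vectors, comparing the scores of known-unrelated document pairs against known-equivalent pairs is one way to judge whether a given absolute value is actually "high" for a particular model.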

Number of parameters for voyage-2

Hello, I am doing a project for school and am trying to compare model sizes based on parameter counts. Would you be able to tell me how many parameters this model has?

Retrieval performance for various European languages

OpenAI's new embedding models seem to work pretty well across a number of European languages (French, Spanish, Italian, etc.). I am thinking of switching from OpenAI to Voyage for embeddings. Have your models been trained on text data in a number of languages? If so, do you have any performance benchmarks for, say, French vs. English?

Languages supported by Voyage AI embeddings

Which languages are supported by the embedding models offered by Voyage AI?

Examples to embed entire repository

Looking for an example or notebook with best practices for vectorizing and embedding an entire repository using the Voyage SDK, in order to evaluate the code embedding API.
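As a rough starting point, and not an official best practice, one common approach is to walk the repository, split each source file into overlapping line windows, and send the chunks to the embedding API in batches. The sketch below assumes a caller-supplied `embed_batch` function (hypothetical; it would wrap whichever SDK call is used and return one vector per input text). The extension list, window size, overlap, and batch size are illustrative guesses, not recommended values.

```python
import os
from typing import Callable, Iterator, List, Tuple

# Illustrative set of file types to embed; adjust for the repository at hand.
CODE_EXTENSIONS = {".py", ".js", ".ts", ".go", ".java", ".md"}


def iter_chunks(root: str, max_lines: int = 40,
                overlap: int = 10) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, 1-based start line, text) for overlapping line windows."""
    if not 0 <= overlap < max_lines:
        raise ValueError("overlap must be in [0, max_lines)")
    step = max_lines - overlap
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git in place.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in sorted(filenames):
            if os.path.splitext(name)[1] not in CODE_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as fh:
                lines = fh.read().splitlines()
            for start in range(0, len(lines), step):
                yield path, start + 1, "\n".join(lines[start:start + max_lines])
                if start + max_lines >= len(lines):
                    break  # the last window already reached the end of file


def embed_repository(
    root: str,
    embed_batch: Callable[[List[str]], List[List[float]]],  # hypothetical wrapper
    batch_size: int = 64,
) -> List[Tuple[str, int, List[float]]]:
    """Return (path, start line, vector) records for every chunk under root."""
    chunks = list(iter_chunks(root))
    records = []
    for i in range(0, len(chunks), batch_size):
        batch = chunks[i:i + batch_size]
        vectors = embed_batch([text for _, _, text in batch])
        for (path, line, _), vector in zip(batch, vectors):
            records.append((path, line, vector))
    return records
```

Keeping the path and start line next to each vector makes it possible to map a search hit back to its location in the repository. Splitting on function or class boundaries instead of fixed line windows may work better for code, but that depends on the languages involved.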

Rate Limit Tracking

Would it be possible to add a response header or a field in the response body that indicates the remaining rate limit balance? I know you suggest adding a delay between calls, but that causes needless delays when there are only a few requests, and my attempts to track the balance myself tend to drift out of sync with your system.
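To illustrate the client-side tracking described above, here is a minimal sketch of a sliding-window limiter that only sleeps once the local count reaches the limit, so a small number of requests incurs no delay. The class name and the limit values are hypothetical, and because it counts requests locally it can still drift from the server's own accounting, which is the problem a server-reported balance would solve.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds interval."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.max_requests = max_requests
        self.window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()  # times of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            return 0.0
        return self.window - (now - self._stamps[0])

    def acquire(self) -> None:
        """Block only if the local count says the limit is reached."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._stamps.append(self._clock())
```

Calling `limiter.acquire()` before each API request would then replace a fixed delay between calls; the numbers passed to the constructor should come from the rate limits that apply to the account.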