Discussions


Rate Limit Tracking

Would it be possible to add a field, either as a response header or in the response body, indicating the remaining rate-limit balance? I know you suggest adding a delay between calls, but that introduces needless delays when making only a small number of requests, and my attempts to track the balance myself tend to fall out of sync with your system.
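For context, this is the kind of client-side tracking I mean: a minimal token-bucket sketch (my own code, not anything from the Voyage client) that only sleeps once the local budget is spent, so small bursts of requests run with no delay at all. The requests-per-minute figure is whatever your account's limit is.

```python
import time


class TokenBucket:
    """Client-side rate-limit tracker: sleep only when the budget is spent."""

    def __init__(self, requests_per_minute):
        self.capacity = requests_per_minute
        self.tokens = float(requests_per_minute)
        self.fill_rate = requests_per_minute / 60.0  # tokens refilled per second
        self.last = time.monotonic()

    def acquire(self):
        # Refill the bucket based on elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens < 1:
            # Budget exhausted: wait just long enough for one token.
            wait = (1 - self.tokens) / self.fill_rate
            time.sleep(wait)
            self.tokens = 1
        self.tokens -= 1
```

The problem is that this local estimate drifts from the server's actual accounting (other clients, clock skew, retries), which is why an authoritative balance in the response would help.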

Tokenizer

Hi there

voyage-large-02 is not supported

I used LangChain as a wrapper to access Voyage AI embeddings. When I tried voyage-large-02 in my local notebook, there were no issues. But when I deployed it to my CI/CD pipeline, which runs on a Kubernetes pod, there was an error saying:

Is there support for asynchronous requests?

For my use case, I need to make multiple non-blocking embedding calls in parallel. I can accomplish this using an asynchronous HTTP client, like aiohttp, but I'm wondering if it's doable with the Python client.
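This is roughly the aiohttp workaround I have in mind, a sketch rather than a supported pattern: it assumes the `https://api.voyageai.com/v1/embeddings` endpoint with an `input`/`model` request body and a `data[*].embedding` response shape, so the field names may need adjusting against the actual API reference.

```python
import asyncio

import aiohttp  # third-party async HTTP client

API_URL = "https://api.voyageai.com/v1/embeddings"


def build_payload(texts, model="voyage-large-02"):
    # Assumed request body shape; check against the embeddings API docs.
    return {"input": texts, "model": model}


async def embed_batch(session, texts, api_key):
    # One non-blocking POST; many of these can be in flight at once.
    headers = {"Authorization": f"Bearer {api_key}"}
    async with session.post(API_URL, json=build_payload(texts), headers=headers) as resp:
        resp.raise_for_status()
        data = await resp.json()
        return [item["embedding"] for item in data["data"]]


async def embed_all(batches, api_key):
    async with aiohttp.ClientSession() as session:
        # gather() runs every batch request concurrently on one event loop.
        return await asyncio.gather(
            *(embed_batch(session, batch, api_key) for batch in batches)
        )
```

Native async support in the Python client would let me drop the hand-rolled HTTP layer entirely.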

Do Voyage AI embeddings allow dimension reduction?

Do Voyage AI embeddings allow dimension trimming with Matryoshka Representation Learning like the newer OpenAI embedding models do?
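To be concrete, this is what I mean by trimming on the client side (my own sketch, not Voyage code): truncate to the first `dim` coordinates and re-normalize. Whether that preserves quality is exactly what MRL training is supposed to guarantee, which is why I'm asking whether Voyage models support it.

```python
import math


def trim_embedding(vec, dim):
    # MRL-style truncation: keep the leading `dim` coordinates, then
    # re-normalize so cosine similarity remains meaningful. Without
    # Matryoshka training there is no guarantee the prefix is useful.
    trimmed = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in trimmed))
    return [x / norm for x in trimmed] if norm > 0 else trimmed
```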