Is it normal for API responses to take 3-5 seconds?

We're currently using OpenAI, where we're used to response times of a couple hundred milliseconds. In our benchmarks Voyage definitely improves recall, but does it really come at the cost of 10x slower embeddings? Am I doing something wrong?

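import asyncio
import time
import voyageai

async def main():
    voyage_async_client = voyageai.AsyncClient()  # API key from VOYAGE_API_KEY
    t0 = time.perf_counter()
    res = await voyage_async_client.embed(
        texts=["Hello, world!"],
        model="voyage-3-large",
        output_dimension=512,
    )
    print(f"Time taken: {time.perf_counter() - t0:.2f} seconds")
asyncio.run(main())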

This takes 3+ seconds, and on longer prompts it can reach upwards of 5. We do quite a bit of multi-stage retrieval, where each stage waits on an embedding call, so the latency adds up and becomes a large issue.
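
A single timed call cannot show whether the 3+ seconds is paid on every request or only on the first one from a fresh client. Below is a minimal sketch of a repeated measurement that reports the first call separately from the later ones. The names `time_calls`, `summarize`, and `fake_embed` are hypothetical and not part of any library. `fake_embed` is a stand-in that only sleeps, so the sketch runs offline and says nothing about real Voyage latency.

```python
import asyncio
import statistics
import time


async def time_calls(call, n=5):
    """Await call() n times in sequence and return each duration in seconds."""
    durations = []
    for _ in range(n):
        t0 = time.perf_counter()
        await call()
        durations.append(time.perf_counter() - t0)
    return durations


def summarize(durations):
    """Report the first call separately from the median of the later calls."""
    first, rest = durations[0], durations[1:]
    return {"first": first, "warm_median": statistics.median(rest)}


async def fake_embed():
    # Hypothetical stand-in for the real request; replace with a coroutine
    # that awaits the embed call from the snippet above.
    await asyncio.sleep(0.01)


if __name__ == "__main__":
    stats = summarize(asyncio.run(time_calls(fake_embed)))
    print(f"first: {stats['first']:.3f}s, warm median: {stats['warm_median']:.3f}s")
```

To measure the real API, pass a coroutine function that reuses one client and awaits the same `embed` call. If the later calls are much faster than the first, part of the time is one-off setup rather than per-request latency. If they are all similar, the time is being spent on every request.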