Estimate costs #16

Open
cbfrance opened this issue Sep 6, 2023 · 2 comments
Comments

@cbfrance
Contributor

cbfrance commented Sep 6, 2023

How much will this system cost in terms of compute overhead and APIs?

Context: personally, I think we want the best embeddings we can get, even at very high cost; I am happy to spend money for a few percentage points of better quality. But we will probably also want to regenerate embeddings regularly, which means throwing away a lot of those expensive API calls. So iterating on a RAG system could add up. I have never built a system that depends this heavily on external models.

Eventually we will do fine-tuning and potentially even model training, but for now I'm just trying to get my head around the costs of a well-prepared RAG system.

  • For inference / runtime costs, if we had x users and y queries per session, how does it pencil out?

  • For training, how many round-trip requests do we need to refine, split, summarize, vectorize, and tag our metadata (etc.) before the system is ready for inference? I assume we will use OpenAI to generate embeddings, and that pre-processing steps will be needed to get quality embeddings.

  • For other compute costs, hosting and indexes etc, we probably need a spreadsheet with all of our SaaS tools, APIs and costs.
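One way to start on the first two questions is a small cost model that can later be moved into the spreadsheet. The sketch below is only an illustration: the `Pricing` fields, the function names, and every number in the usage lines are hypothetical placeholders, not real OpenAI prices or measured token counts. It also simplifies by charging preparation passes at the prompt rate only, ignoring the tokens those passes generate.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """USD per 1,000 tokens. Placeholder fields; fill in from the provider's price list."""
    embedding_per_1k: float
    prompt_per_1k: float
    completion_per_1k: float


def monthly_inference_cost(users, sessions_per_user, queries_per_session,
                           prompt_tokens, completion_tokens, pricing):
    """Runtime cost: each query pays for one prompt (question plus retrieved
    context) and one completion."""
    queries = users * sessions_per_user * queries_per_session
    per_query = (prompt_tokens / 1000 * pricing.prompt_per_1k
                 + completion_tokens / 1000 * pricing.completion_per_1k)
    return queries * per_query


def ingestion_cost(documents, tokens_per_document, llm_passes, rebuilds, pricing):
    """Preparation cost: each rebuild embeds the whole corpus once and runs
    `llm_passes` LLM passes (summarize, tag, ...) over it.
    Simplification: output tokens of those passes are not counted."""
    corpus_tokens = documents * tokens_per_document
    embed = corpus_tokens / 1000 * pricing.embedding_per_1k
    llm = llm_passes * corpus_tokens / 1000 * pricing.prompt_per_1k
    return rebuilds * (embed + llm)


# Usage with made-up numbers, only to show the shape of the calculation.
example = Pricing(embedding_per_1k=0.001, prompt_per_1k=0.01, completion_per_1k=0.02)
runtime = monthly_inference_cost(100, 10, 5, 2000, 500, example)
prep = ingestion_cost(10_000, 1_000, llm_passes=3, rebuilds=4, pricing=example)
print(f"runtime: ${runtime:,.2f}/month, preparation: ${prep:,.2f}/month")
```

Keeping `rebuilds` as an explicit parameter makes the cost of regenerating embeddings visible, which is the part most likely to dominate while iterating.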

@cbfrance
Contributor Author

cbfrance commented Sep 6, 2023

Gut check: doing napkin math, it could easily be $3k/month in OpenAI costs alone, maybe as high as $10k/month. Does that sound right? It's unclear to me whether we could even get a rate limit that high.
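The rate-limit question can be checked by turning a monthly budget into an average tokens-per-minute figure and comparing it with whatever limit the account actually has. This is a rough sketch: the function names are made up for illustration, a 30-day month is assumed, and the price passed in must come from the provider's current price list (none is assumed here).

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes, assuming a 30-day month


def tokens_for_budget(budget_usd, price_per_1k_tokens):
    """Number of tokens a budget buys at a flat per-1,000-token price."""
    return budget_usd / price_per_1k_tokens * 1000


def average_tokens_per_minute(budget_usd, price_per_1k_tokens, active_fraction=1.0):
    """Average throughput needed to spend the whole budget in one month.
    `active_fraction` is the share of the month with real traffic;
    a smaller fraction concentrates the same tokens into fewer minutes."""
    active_minutes = MINUTES_PER_MONTH * active_fraction
    return tokens_for_budget(budget_usd, price_per_1k_tokens) / active_minutes
```

If the result for the $3k and $10k cases is well below the account's tokens-per-minute limit, the spend is at least feasible on average; bursts during a batch re-embedding run would still need a separate check.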

@cbfrance
Contributor Author

cbfrance commented Sep 6, 2023

To clarify, I am not shy about costs: I am happy to pay a premium for quality and speed of iteration, especially in the first 6 weeks. Later we can talk about reducing the cost of operation over time, e.g. by swapping in cheaper endpoints.
