[Design] Feature Gen AI data ingestion workflow / pipeline #1707

Open
Benvii opened this issue Aug 5, 2024 · 0 comments · May be fixed by #1713
Benvii commented Aug 5, 2024

Feature issue: #1706

Write a design proposal for the Gen AI data ingestion workflow using:

  • GitLab pipeline as the data ingestion scheduler
  • OpenSearch as the vector DB provider
  • AWS Lambda to run the ingestion script with access to the database
  • AWS for infrastructure (this design may also include considerations for GCP GKE)
  • Langfuse as the test dataset storage solution
  • Reuse existing Python tooling as much as possible: tock-llm-indexing-tools
  • Optionally, Ragas for evaluators

The design should be reviewed and approved before any development starts, to make sure we are developing in the right direction.
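To make the discussion concrete, here is a minimal sketch of what the Lambda-side ingestion step could look like. It is not taken from tock-llm-indexing-tools: the function names, the event keys (`index`, `source_id`, `text`, `chunk_size`, `overlap`), the document field names, and the placeholder embedding are all assumptions made for illustration. The only external convention it relies on is the newline-delimited action/document format of the OpenSearch `_bulk` API, and it stops short of any network call.

```python
import hashlib
import json


def chunk_text(text, size=200, overlap=20):
    """Split text into overlapping character windows (hypothetical helper)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the last chunk is never pure overlap.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def fake_embed(chunk, dim=8):
    """Placeholder embedding (assumption): deterministic floats from a SHA-256
    digest. A real pipeline would call an embedding model instead."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]


def build_bulk_body(index, source_id, chunks, embed=fake_embed):
    """Build an NDJSON payload in the OpenSearch _bulk format:
    one action line followed by one document line per chunk."""
    lines = []
    for n, chunk in enumerate(chunks):
        lines.append(json.dumps({"index": {"_index": index, "_id": f"{source_id}-{n}"}}))
        lines.append(json.dumps({"text": chunk, "embedding": embed(chunk), "source": source_id}))
    return "\n".join(lines) + "\n" if lines else ""


def handler(event, context=None):
    """Lambda-style entry point; the event keys are hypothetical."""
    chunks = chunk_text(event["text"], event.get("chunk_size", 200), event.get("overlap", 20))
    body = build_bulk_body(event["index"], event["source_id"], chunks)
    # A real deployment would POST `body` to the OpenSearch _bulk endpoint here.
    return {"chunks": len(chunks), "bulk_body": body}
```

In this sketch, a scheduled GitLab pipeline job would only need to trigger the function with such an event. How the job authenticates to AWS and how the function reaches the OpenSearch cluster are left open for the design itself.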

@Benvii Benvii self-assigned this Aug 5, 2024
Benvii added a commit to CreditMutuelArkea/tock that referenced this issue Aug 5, 2024
Benvii added a commit to CreditMutuelArkea/tock that referenced this issue Aug 5, 2024