Error handling during ingest #104

Open
ppanopticon opened this issue Aug 27, 2024 · 0 comments
Labels
enhancement New feature or request

Task Description

Error handling in vitrivr-engine currently has two major shortcomings.

  • An Operator's implementer decides whether an error is handled gracefully (i.e., log and continue) or not (i.e., throw an exception). This leads to inconsistent behaviour across a pipeline.
  • When an error is handled gracefully, the caller of a pipeline has no way to access the error information, because most of the time errors are simply logged. This is not ideal in cases where vitrivr-engine is used as a library rather than a local service.

I therefore propose three major changes to how errors should be handled:

  • If an error occurs, operators throw an ExtractionException. This exception reports the error condition (retrievable, name of the operator and cause) and (optionally) wraps downstream exceptions. Throwing any other exception from within an Operator is considered a programmer's error, so operators need to catch such exceptions and rethrow them as an ExtractionException.
  • When configuring a pipeline, one can determine which error handling mode should be employed. Currently, I see two modes: CONTINUE and ABORT (we can of course discuss other modes). This will lead to the introduction of transparent error handling stages in the flow.
  • Regardless of what mode is employed, a per-item summary should be provided in some Context object with information about what went wrong. This Context can be accessed by the caller of a pipeline.
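To make the three changes more concrete, here is a minimal sketch of how they could fit together. Everything except the names ExtractionException, CONTINUE and ABORT is an assumption made for illustration: the simplified `Retrievable`, the `Context` class with its `report` method, the `extractLength` operator body and the `ingest` loop are hypothetical and do not reflect the existing vitrivr-engine API.

```kotlin
// Hypothetical stand-in for a retrievable; the real type in vitrivr-engine differs.
data class Retrievable(val id: String)

// Exception as proposed: carries the retrievable, the operator name and an optional cause.
class ExtractionException(
    val retrievable: Retrievable,
    val operatorName: String,
    cause: Throwable? = null
) : Exception("Operator '$operatorName' failed for retrievable '${retrievable.id}'.", cause)

// The two modes named in the proposal.
enum class ErrorHandlingMode { CONTINUE, ABORT }

// Hypothetical per-item error summary that the caller can inspect after a run.
class Context(val mode: ErrorHandlingMode) {
    private val _errors = mutableListOf<ExtractionException>()
    val errors: List<ExtractionException> get() = _errors
    fun report(e: ExtractionException) { _errors += e }
}

// Hypothetical operator body: any downstream failure is wrapped,
// so only ExtractionException leaves the operator.
fun extractLength(retrievable: Retrievable): Int = try {
    require(retrievable.id.isNotBlank()) { "empty id" }
    retrievable.id.length
} catch (e: Exception) {
    throw ExtractionException(retrievable, "LengthExtractor", e)
}

// Hypothetical ingest loop: errors are recorded in both modes, ABORT rethrows.
fun ingest(items: List<Retrievable>, context: Context): List<Int> {
    val results = mutableListOf<Int>()
    for (item in items) {
        try {
            results += extractLength(item)
        } catch (e: ExtractionException) {
            context.report(e)
            if (context.mode == ErrorHandlingMode.ABORT) throw e
        }
    }
    return results
}
```

With CONTINUE, the failing item is skipped and shows up in `context.errors`; with ABORT, the same error is recorded and then rethrown to the caller.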

In addition, one can also have a discussion as to how handled errors should affect Retrievables. It might make sense to include error information at a Retrievable level as well.
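As a starting point for that discussion, one possible shape is sketched below. The `ErrorInfo` record, the `errors` field and the `withError` helper are all hypothetical and only illustrate the idea of a retrievable carrying the errors that were handled on its way through the pipeline.

```kotlin
// Hypothetical error record; the field names are assumptions.
data class ErrorInfo(val operatorName: String, val message: String)

// Hypothetical retrievable that carries handled errors alongside its identity.
data class Retrievable(val id: String, val errors: List<ErrorInfo> = emptyList()) {
    // True if at least one handled error has been attached.
    val hasErrors: Boolean get() = errors.isNotEmpty()

    // Returns a copy with the error appended; the original instance stays unchanged.
    fun withError(error: ErrorInfo): Retrievable = copy(errors = errors + error)
}
```

Downstream stages could then check `hasErrors` to decide whether to persist or skip a partially processed item.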

Currently, this is a discussion issue. I'm open to ideas and input.

Dependencies

None

Boundary Conditions

This should be implemented in such a way that the error handling logic is injected transparently when pipelines are constructed, rather than requiring the operators to manipulate the flow.
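A rough sketch of what transparent injection could look like follows. It is deliberately simplified: operators are modelled as plain per-item functions instead of stages in a flow, and the `Operator` interface, `ErrorHandlingStage` and `Pipeline` builder are hypothetical names, not the existing vitrivr-engine types. The point is only that the wrapping happens in the builder, so operators never deal with the mode themselves.

```kotlin
// Hypothetical stand-ins; the real vitrivr-engine types differ.
data class Retrievable(val id: String)

class ExtractionException(
    val retrievable: Retrievable,
    val operatorName: String,
    cause: Throwable? = null
) : Exception("Operator '$operatorName' failed for retrievable '${retrievable.id}'.", cause)

enum class ErrorHandlingMode { CONTINUE, ABORT }

// Hypothetical operator: transforms one retrievable and may throw ExtractionException.
fun interface Operator {
    fun process(retrievable: Retrievable): Retrievable
}

// Stage injected around every operator; the operator itself is unaware of it.
class ErrorHandlingStage(
    private val delegate: Operator,
    private val mode: ErrorHandlingMode,
    private val errors: MutableList<ExtractionException>
) {
    // Returns null when the item failed and the mode is CONTINUE.
    // Exceptions other than ExtractionException are not caught (programmer's error).
    fun process(retrievable: Retrievable): Retrievable? = try {
        delegate.process(retrievable)
    } catch (e: ExtractionException) {
        errors += e
        if (mode == ErrorHandlingMode.ABORT) throw e
        null
    }
}

class Pipeline(private val mode: ErrorHandlingMode) {
    private val stages = mutableListOf<ErrorHandlingStage>()
    private val _errors = mutableListOf<ExtractionException>()
    val errors: List<ExtractionException> get() = _errors

    // Operators are registered as-is; the error handling stage is added here.
    fun add(operator: Operator): Pipeline = apply {
        stages += ErrorHandlingStage(operator, mode, _errors)
    }

    // Runs every item through all stages; items that failed are dropped from the output.
    fun run(input: List<Retrievable>): List<Retrievable> = input.mapNotNull { item ->
        var current: Retrievable? = item
        for (stage in stages) {
            current = current?.let { stage.process(it) }
        }
        current
    }
}
```

A caller would build the pipeline with `Pipeline(ErrorHandlingMode.CONTINUE).add(...)`, call `run`, and then read `errors` to see which items failed in which operator.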
