Add option to exit on missing Iglu schemas #382

Merged: istreeter merged 2 commits into v2 from feature/exit-on-missing-iglu-schema on Sep 5, 2024

istreeter commented Aug 30, 2024

Before this PR, the loader would generate a failed event if it failed to fetch a required schema from Iglu. However, all events have already passed validation in Enrich, so it is completely unexpected to have an Iglu failure. An Iglu error probably means some type of configuration error or service outage.

After this PR, the loader will crash and exit on an Iglu error, instead of creating a failed event. This is probably the preferred behaviour, while the pipeline operator addresses the underlying infrastructure problem.

If an Iglu schema is genuinely now unavailable, then the pipeline operator can override the default behaviour by setting `exitOnMissingIgluSchema: false` in the configuration file or by listing the missing schema in `skipSchemas`.
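
The decision described above can be sketched as a small, self-contained model. This is only an illustration, not the loader's code: `LoaderConfig`, `Outcome`, `decide`, and the example schema key are hypothetical names, and treating `skipSchemas` as an exact-match set of schema keys is a simplifying assumption. Only the option names `exitOnMissingIgluSchema` and `skipSchemas`, and the default of exiting, come from the PR description.

```scala
object ExitOnMissingSchemaSketch {

  // Hypothetical config model; only the two option names come from the PR.
  final case class LoaderConfig(
    exitOnMissingIgluSchema: Boolean = true, // exiting is the default, per the description
    skipSchemas: Set[String] = Set.empty     // assumption: exact schema keys, no wildcards
  )

  // Hypothetical outcomes of handling a batch with unresolved schemas.
  sealed trait Outcome
  case object Proceed extends Outcome
  final case class EmitFailedEvents(schemas: Set[String]) extends Outcome
  final case class CrashAndExit(schemas: Set[String]) extends Outcome

  // Schemas listed in skipSchemas never count as failures. Anything else
  // either stops the loader or becomes failed events, depending on the flag.
  def decide(config: LoaderConfig, unresolved: Set[String]): Outcome = {
    val relevant = unresolved -- config.skipSchemas
    if (relevant.isEmpty) Proceed
    else if (config.exitOnMissingIgluSchema) CrashAndExit(relevant)
    else EmitFailedEvents(relevant)
  }

  def main(args: Array[String]): Unit = {
    val missing = Set("iglu:com.example/my_entity/jsonschema/1-0-0") // made-up schema key
    println(decide(LoaderConfig(), missing))
    println(decide(LoaderConfig(exitOnMissingIgluSchema = false), missing))
    println(decide(LoaderConfig(skipSchemas = missing), missing))
  }
}
```

In this model, either override on its own is enough to keep the loader running when a schema is permanently gone.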

@@ -151,6 +152,7 @@ object Processing {
_ <- Logger[F].debug(s"Processing batch of size ${events.size}")
v2NonAtomicFields <- NonAtomicFields.resolveTypes[F](env.resolver, entities, env.schemasToSkip ::: env.legacyColumns)
legacyFields <- LegacyColumns.resolveTypes[F](env.resolver, entities, env.legacyColumns)

Nit: if NonAtomicFields.resolveTypes already has some Iglu failures we don't need to do this step

"Exiting because failed to resolve Iglu schemas. Either check the configuration of the Iglu repos, or set the `skipSchemas` config option, or set `exitOnMissingIgluSchema` to false.\n"
val failures = v2NonAtomicFields.igluFailures.map(_.failure) ::: legacyFields.igluFailures.map(_.failure)
val msg = failures.map(_.asJson.noSpaces).mkString(base, "\n", "")
Logger[F].error(base) *> env.appHealth.beUnhealthyForRuntimeService(RuntimeService.Iglu) *> Sync[F].raiseError(

That's very much a detail but I think that setting the app unhealthy should be the very first thing to do, before logging
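
The ordering suggested in this comment can be illustrated with a plain Scala sketch. It does not use the loader's effect types: `Recorder`, `markUnhealthy`, `logError`, and `exitOnIgluFailure` are hypothetical stand-ins that only record the order of calls, assuming the real health monitor and logger can be invoked in sequence in the same way.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

object UnhealthyFirstSketch {

  // Hypothetical stand-in for the health monitor and logger: records call order only.
  final class Recorder {
    val steps: ListBuffer[String] = ListBuffer.empty
    def markUnhealthy(service: String): Unit = steps += s"unhealthy:$service"
    def logError(msg: String): Unit = steps += s"log:$msg"
  }

  // Mark the app unhealthy first, then log, then fail with the same message.
  def exitOnIgluFailure(rec: Recorder, msg: String): Nothing = {
    rec.markUnhealthy("iglu")
    rec.logError(msg)
    throw new RuntimeException(msg)
  }

  def main(args: Array[String]): Unit = {
    val rec    = new Recorder
    val result = Try(exitOnIgluFailure(rec, "failed to resolve Iglu schemas"))
    println(rec.steps.toList) // the unhealthy step is recorded before the log step
    println(result.isFailure)
  }
}
```

Putting the health update first means a health probe would see the unhealthy state even if logging were slow or failed.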

Comment on lines 52 to 53
Emit BadRows for an unknown schema $e12 $e12Legacy
Crash and exit for an unrecognized schema, if exitOnMissingIgluSchema is true $e13 $e13Legacy

Suggested change
- Emit BadRows for an unknown schema $e12 $e12Legacy
- Crash and exit for an unrecognized schema, if exitOnMissingIgluSchema is true $e13 $e13Legacy
+ Emit BadRows for an unknown schema, if exitOnMissingIgluSchema is false $e12 $e12Legacy
+ Crash and exit for an unknown schema, if exitOnMissingIgluSchema is true $e13 $e13Legacy

I was misled into thinking that they were different kinds of tests.

istreeter merged commit 32a1a0e into v2 on Sep 5, 2024
2 checks passed
istreeter deleted the feature/exit-on-missing-iglu-schema branch on September 5, 2024 at 06:32
benjben pushed a commit that referenced this pull request Feb 14, 2025
benjben pushed a commit that referenced this pull request Feb 14, 2025
benjben pushed a commit that referenced this pull request Feb 17, 2025
benjben pushed a commit that referenced this pull request Feb 17, 2025
- Update license to SLULA 1.1
- Cluster by event_name when creating new table (#402)
- Add parallelism to parseBytes and transform (#400)
- Decrease default batching.maxBytes to 10 MB (#398)
- Fix and improve ProcessingSpec for legacy column mode (#396)
- Add legacyColumnMode configuration (#394)
- Add e2e_latency_millis metric (#391)
- Fix startup on missing existing table (#384)
- Add option to exit on missing Iglu schemas (#382)
- Refactor health monitoring (#381)
- Feature flag to support the legacy column style -- bug fixes (#379 #380)
- Require alter table when schema is evolved for contexts
- Allow for delay in Writer discovering new columns
- Stay healthy if BigQuery table exceeds column limit (#372)
- Recover from server-side schema mismatch exceptions
- Improve exception handling immediately after altering the table
- Manage Writer resource to be consistent with Snowflake Loader
benjben pushed a commit that referenced this pull request Feb 17, 2025