Improve resiliency of ETLv2 manage tables #807

Merged
merged 12 commits into from
Feb 21, 2019

smgallo commented Feb 20, 2019

Description

When table definitions generated from ETL configuration files are compared to those stored in the MySQL information_schema, several items, such as column and index names, have been normalized to lowercase by MySQL. If these values were capitalized in the ETL configuration files, the comparison reported a difference every time, so unnecessary ALTER TABLE statements were executed during each ETL execution.

As per https://dev.mysql.com/doc/refman/5.5/en/identifier-case-sensitivity.html

The case sensitivity of the underlying operating system plays a part in the case sensitivity of database, table, and trigger names. However, column, index, stored routine, and event names are not case sensitive on any platform, nor are column aliases.
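
As a rough illustration of the comparison problem (not the code in this PR), the sketch below folds both names to lowercase before comparing them. The helper names `normalizeIdentifier()` and `sameIdentifier()` are hypothetical.

```php
<?php
// Hypothetical helpers for illustration only; not the XDMoD implementation.
// Column and index names are not case sensitive in MySQL, so fold both
// sides to lowercase before comparing a configured name to a discovered one.
function normalizeIdentifier($name)
{
    return strtolower(trim($name));
}

// Returns true when the two names refer to the same column or index.
function sameIdentifier($configured, $discovered)
{
    return normalizeIdentifier($configured) === normalizeIdentifier($discovered);
}

var_dump(sameIdentifier('Resource_ID', 'resource_id')); // bool(true)
var_dump(sameIdentifier('resource_id', 'person_id'));   // bool(false)
```

Without the normalization step, `Resource_ID` in a configuration file would never match `resource_id` from the database, and the table would be altered on every run.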

To aid in identifying items that have changed, logging and error checking were improved. If a table element definition in the configuration differs from the current state of the database, the exact reason is logged:

2019-02-19 20:38:03 [debug] Discover table 'modw.job_tasks'
2019-02-19 20:38:03 [debug] Column local_job_id_raw: values for "type" differ ("int(11)" != "int(11) unsigned")
2019-02-19 20:38:03 [debug] Column resource_id: values for "type" differ ("int(11)" != "int(11) unsigned")
2019-02-19 20:38:03 [debug] Column person_id: values for "type" differ ("int(11)" != "int(11) unsigned")
2019-02-19 20:38:03 [debug] Index uniq: values for "columns" differ ("organization_origin_id,federation_instance_id" != "organization_origin_id,federation_blade_id")
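
A minimal sketch of how messages in the format above could be produced, assuming a hypothetical `describeDifferences()` function. The function name, its parameters, and the choice of which side of `!=` holds the configured value are assumptions, not taken from this PR.

```php
<?php
// Hypothetical sketch; names and argument order are assumptions.
// Compares configured and discovered property values and returns one
// message per differing property, formatted like the debug lines above.
function describeDifferences($kind, $name, array $configured, array $discovered)
{
    $messages = array();
    foreach ($configured as $property => $value) {
        // A property missing on the discovered side is treated as null.
        $other = array_key_exists($property, $discovered) ? $discovered[$property] : null;
        if ($value !== $other) {
            $messages[] = sprintf(
                '%s %s: values for "%s" differ ("%s" != "%s")',
                $kind,
                $name,
                $property,
                $value,
                $other
            );
        }
    }
    return $messages;
}

$messages = describeDifferences(
    'Column',
    'resource_id',
    array('type' => 'int(11)', 'nullable' => 'false'),
    array('type' => 'int(11) unsigned', 'nullable' => 'false')
);
foreach ($messages as $message) {
    echo $message, PHP_EOL;
}
```

Reporting the property name along with both values makes it possible to see from the log alone which configuration entry needs to change.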

NOTE: Since variables/macros can be used to create dynamic column names, and column names are normalized to lowercase as stored in the MySQL information schema, it was necessary to make the variable names themselves case-insensitive (e.g., ${MYVAR} and ${myvar} are treated as the same variable). Because variable substitution occurs after the DbModel has been built, the values of those variables must also be lowercase to avoid needless ALTER TABLE statements.
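
The case-insensitive lookup described in the note can be sketched as follows. This is an illustration only: `substituteVariables()` is a hypothetical function, not the variable handling in XDMoD, and the `${NAME}` pattern it accepts is an assumption.

```php
<?php
// Hypothetical sketch of case-insensitive ${VAR} substitution.
// Variable names are folded to lowercase on both the definition side
// and the lookup side, so ${MYVAR} and ${myvar} resolve to the same value.
function substituteVariables($string, array $variables)
{
    $lookup = array_change_key_case($variables, CASE_LOWER);
    return preg_replace_callback(
        '/\$\{([A-Za-z0-9_]+)\}/',
        function ($matches) use ($lookup) {
            $key = strtolower($matches[1]);
            // Leave unknown variables untouched so they are easy to spot.
            return array_key_exists($key, $lookup) ? $lookup[$key] : $matches[0];
        },
        $string
    );
}

echo substituteVariables('${MYVAR}_id and ${myvar}_name', array('MyVar' => 'resource')), PHP_EOL;
// prints "resource_id and resource_name"
```

Only the variable names are folded here. The value (`resource`) is inserted as given, which is why it must already be lowercase to match what the database reports.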

Motivation and Context

Stop continual ALTER TABLE statements when they are not truly necessary.

Tests performed

Added tests.

$ ../xdmod-qa/travis/build.sh -s
Running only style tests
X11 forwarding request failed on channel 0
Comparing HEAD to upstream/xdmod8.1 (c1e4b1fcd9870ef68322d036f82c7aa7c7f9b21e)
travis_fold:start:syntax
Running syntax tests...
travis_fold:end:syntax
Syntax tests succeeded!

travis_fold:start:style
Running style tests...
travis_fold:end:style
Style tests succeeded!

travis_fold:start:extra
Running extra tests...
travis_fold:end:extra
Extra tests succeeded!

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project as found in the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

smgallo added the enhancement (Enhancement of the functionality of an existing feature) and Category:ETL (Extract Transform Load) labels Feb 20, 2019
smgallo added this to the 8.1.0 milestone Feb 20, 2019
smgallo force-pushed the etl2-db-discovery-comparison branch from 8813ad0 to 9914452 February 21, 2019 16:19
smgallo force-pushed the etl2-db-discovery-comparison branch from 9914452 to 3b19005 February 21, 2019 16:44
smgallo requested review from plessbd and ryanrath February 21, 2019 19:52
// Normalize property values to lowercase to match MySQL behavior
if ( in_array($property, array('time', 'event')) ) {
$value = strtoupper($value);
} elseif ( 'body' == $property && 0 !== stripos($value, "BEGIN") ) {

Quick question, is the 'body' == $property ... portion of this elseif needed? It looks like, based on the switch statement above that it will always be 'body' at this point.

facepalm fall through w/ the other cases. It won't always be body

It could be table, which is case sensitive.

smgallo merged commit 136f3de into xdmod8.1 Feb 21, 2019
smgallo deleted the etl2-db-discovery-comparison branch February 21, 2019 20:46