Verifies the functionality of individual components or units within the data pipeline. It focuses on testing specific functions, transformations, or modules in isolation.
Example:
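A minimal unit-test sketch for a single transformation; `clean_record()` is a hypothetical helper, not a function from the pipeline above:

```python
# Unit-test sketch for one isolated transformation.
# clean_record() is a hypothetical example function.

def clean_record(record: dict) -> dict:
    """Normalize a raw record: strip whitespace and lowercase the email."""
    return {
        "name": record["name"].strip(),
        "email": record["email"].strip().lower(),
    }

def test_clean_record():
    raw = {"name": "  Ada Lovelace ", "email": " ADA@Example.COM "}
    assert clean_record(raw) == {
        "name": "Ada Lovelace",
        "email": "ada@example.com",
    }

test_clean_record()
```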
Verifies how the data pipeline handles errors, exceptions, or unexpected scenarios. It checks whether appropriate error messages are generated and whether error recovery mechanisms are in place.
Ensures that the data pipeline adheres to predefined contracts (rules, constraints, or expectations). It validates data against defined schemas, formats, or business rules to maintain data integrity.
Example:
- `test_loadData` ensures the function `loadData()` creates the output files and validates their schema.
- `test_validator` unit tests the function `validator()`, checking whether it handles valid and invalid inputs correctly. In `main.py`, the same function is used to monitor inputs at run time.
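The dual-use `validator()` pattern described above might look like the following sketch; the schema and field names are illustrative assumptions, not the pipeline's actual contract:

```python
# Sketch of a validator usable both in unit tests and as a run-time
# monitor, mirroring the validator() pattern above. Schema is illustrative.

SCHEMA = {"id": int, "email": str}

def validator(record: dict) -> bool:
    """Return True if the record matches the expected schema."""
    return all(
        field in record and isinstance(record[field], expected_type)
        for field, expected_type in SCHEMA.items()
    )

def test_validator():
    assert validator({"id": 1, "email": "a@b.c"})        # valid input
    assert not validator({"id": "1", "email": "a@b.c"})  # wrong type
    assert not validator({"email": "a@b.c"})             # missing field

test_validator()
```

Because the same function backs both the test and the run-time check, the contract cannot silently drift between the two.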
Ensures that different components of the data pipeline work together correctly. It validates the interaction and compatibility between various stages or modules of the pipeline.
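An integration test wires real stages together and checks the combined result. A minimal sketch with two hypothetical stand-in stages:

```python
# Integration-test sketch: verify that the extract stage's output feeds
# the transform stage correctly. Both stages are hypothetical stand-ins.

def extract() -> list[dict]:
    """Extraction stage returning raw records."""
    return [{"name": " Ada "}, {"name": "Grace"}]

def transform(records: list[dict]) -> list[dict]:
    """Transformation stage: trims whitespace from names."""
    return [{"name": r["name"].strip()} for r in records]

def test_extract_then_transform():
    # Run the stages chained together, as the pipeline would.
    result = transform(extract())
    assert result == [{"name": "Ada"}, {"name": "Grace"}]

test_extract_then_transform()
```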
Verifies the quality, accuracy, and completeness of the data flowing through the pipeline. It checks for anomalies, inconsistencies, missing values, or data integrity issues at run time.
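A run-time data-quality check can be as simple as counting missing and out-of-range values per batch. The field name and range below are assumptions for illustration:

```python
# Data-quality check sketch: flag missing and out-of-range values in a
# batch of records. The 'age' field and 0-120 range are assumptions.

def quality_report(records: list[dict]) -> dict:
    """Count records with a missing or out-of-range 'age' value."""
    missing = sum(1 for r in records if r.get("age") is None)
    out_of_range = sum(
        1
        for r in records
        if r.get("age") is not None and not (0 <= r["age"] <= 120)
    )
    return {"total": len(records), "missing": missing, "out_of_range": out_of_range}

batch = [{"age": 34}, {"age": None}, {"age": 150}]
report = quality_report(batch)
assert report == {"total": 3, "missing": 1, "out_of_range": 1}
```

In practice such a report would feed an alerting threshold rather than a hard assertion, so a few bad rows degrade gracefully instead of halting the pipeline.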