feat: Set up comprehensive Python testing infrastructure with Poetry #341

Status: Open. 1 commit to be merged into base branch master.

@llbbl commented Jun 15, 2025

Add Comprehensive Python Testing Infrastructure

Summary

This PR sets up a complete testing infrastructure for the TensorFlow Transform project, migrating from the legacy setup.py to modern Poetry package management and adding pytest as the primary testing framework.

Changes Made

Package Management Migration

  • Created pyproject.toml with Poetry configuration
  • Migrated all dependencies from setup.py to Poetry format
  • Preserved exact version constraints and Python version requirements
  • Added development dependencies group for testing tools

Testing Framework Setup

  • pytest - Modern, extensible testing framework
  • pytest-cov - Coverage reporting with configurable thresholds
  • pytest-mock - Enhanced mocking capabilities using pytest fixtures

Testing Configuration

Created comprehensive pytest configuration in pyproject.toml:

  • Test discovery patterns for test_*.py and *_test.py files
  • Coverage settings with 80% minimum threshold
  • Multiple coverage report formats (terminal, HTML, XML)
  • Custom markers for test organization:
    • unit - Fast, isolated unit tests
    • integration - Tests requiring external resources
    • slow - Long-running tests

Directory Structure

tests/
├── __init__.py
├── conftest.py              # Shared fixtures and configuration
├── test_setup_validation.py # Full validation tests
├── test_minimal_setup.py    # Minimal tests (work without all deps)
├── README.md               # Testing documentation
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py
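
The layout above could be checked programmatically along these lines. This is a sketch only: the helper name `missing_paths` and the constant `EXPECTED_PATHS` are hypothetical and do not necessarily match what the validation tests in this PR do.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the repository root.
EXPECTED_PATHS = [
    "tests/__init__.py",
    "tests/conftest.py",
    "tests/test_setup_validation.py",
    "tests/test_minimal_setup.py",
    "tests/README.md",
    "tests/unit/__init__.py",
    "tests/integration/__init__.py",
]


def missing_paths(repo_root):
    # Return the expected files that are absent under repo_root.
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).is_file()]
```

An empty list from `missing_paths(".")`, run at the repository root, would indicate that the structure is in place.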

Shared Testing Fixtures

Created comprehensive fixtures in conftest.py:

  • temp_dir - Temporary directory management
  • temp_file - Temporary file handling
  • mock_config - Sample configuration data
  • sample_data - Test data for transformations
  • tf_example_data - TFRecord test data generation
  • mock_preprocessing_fn - Sample preprocessing function
  • mock_schema - TensorFlow metadata schema fixture
  • Auto-reset of TensorFlow state between tests
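
To make the fixture list more concrete, here is one way the two simplest fixtures might look. The fixture names temp_dir and temp_file come from this PR, but the bodies, the helper `make_temp_dir`, the file name, and the example test are assumptions, not the actual conftest.py contents.

```python
import shutil
import tempfile
from pathlib import Path

import pytest


def make_temp_dir():
    # Generator that creates a directory, hands it to the caller,
    # and removes it afterwards even if the test fails.
    path = Path(tempfile.mkdtemp(prefix="tft_test_"))
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def temp_dir():
    # pytest runs the code before the yield as setup and after it as teardown.
    yield from make_temp_dir()


@pytest.fixture
def temp_file(temp_dir):
    # Builds on temp_dir, so cleaning up the directory also removes the file.
    path = temp_dir / "sample.txt"
    path.write_text("hello")
    return path


def test_temp_file_lives_in_temp_dir(temp_dir, temp_file):
    # Hypothetical test: pytest injects both fixtures by argument name.
    assert temp_file.parent == temp_dir
    assert temp_file.read_text() == "hello"
```

Keeping the cleanup logic in a plain generator makes it possible to exercise it outside of pytest as well.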

Development Workflow

Configured Poetry scripts for consistent test execution:

poetry run test    # Run all tests
poetry run tests   # Alternative command (both work)

Both commands support all standard pytest options:

poetry run test -v                    # Verbose output
poetry run test -m unit               # Run only unit tests
poetry run test --no-cov              # Skip coverage
poetry run test tests/specific_test.py  # Run specific file
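
A Poetry script maps a command name to a Python callable in `module:function` form. The PR text does not show that callable, so the following is a sketch under the assumption that both commands point at a single function; the names `build_pytest_command` and `run_tests` are hypothetical.

```python
import subprocess
import sys


def build_pytest_command(extra_args):
    # Run pytest with the same interpreter that Poetry activated,
    # forwarding any extra command-line options unchanged.
    return [sys.executable, "-m", "pytest", *extra_args]


def run_tests():
    # Hypothetical entry point that a Poetry script could reference.
    command = build_pytest_command(sys.argv[1:])
    sys.exit(subprocess.call(command))
```

Forwarding `sys.argv[1:]` untouched is what would let options such as `-v`, `-m unit`, or `--no-cov` reach pytest.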

Documentation

  • Updated .gitignore with testing artifacts and development files
  • Created tests/README.md with:
    • Testing infrastructure overview
    • Instructions for running tests
    • Available fixtures documentation
    • Known issues (ARM64 compatibility)
    • Guidelines for writing new tests

Testing the Setup

Validation tests have been created and verified:

# Run minimal validation (works without all dependencies)
poetry run test tests/test_minimal_setup.py --no-cov

# Run full validation (requires all dependencies)
poetry run test tests/test_setup_validation.py

Known Issues

ARM64 Architecture Support

Some dependencies (particularly tfx-bsl) may not have pre-built wheels for ARM64 architectures. This affects:

  • Apple Silicon Macs (M1/M2)
  • ARM-based Linux systems

Workarounds are documented in tests/README.md.
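
Independently of those documented workarounds, a test module could detect the situation at import time. The sketch below is an assumption, not the approach taken in tests/README.md; the helper names and the `SKIP_TFX_BSL_TESTS` flag are hypothetical.

```python
import importlib.util
import platform


def is_arm64(machine=None):
    # platform.machine() reports "arm64" on Apple Silicon macOS and
    # "aarch64" on most ARM-based Linux systems.
    name = (machine or platform.machine()).lower()
    return name in {"arm64", "aarch64"}


def has_module(module_name):
    # True if the top-level module can be found in the current environment.
    return importlib.util.find_spec(module_name) is not None


# Hypothetical flag that a test module could pass to pytest.mark.skipif.
SKIP_TFX_BSL_TESTS = is_arm64() and not has_module("tfx_bsl")
```

Tests that need tfx-bsl could then be skipped, rather than failing at import, on machines where no wheel was installed.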

Next Steps

With this infrastructure in place, developers can now:

  1. Write unit tests in tests/unit/ directory
  2. Write integration tests in tests/integration/ directory
  3. Use the provided fixtures for common testing patterns
  4. Run tests with coverage reporting to ensure code quality
  5. Use custom markers to organize and selectively run tests

The testing infrastructure is ready for immediate use: new tests can follow the established patterns and reuse the shared fixtures.

- Migrate from setup.py to Poetry package manager in pyproject.toml
- Add pytest, pytest-cov, and pytest-mock as dev dependencies
- Configure pytest with custom markers (unit, integration, slow)
- Set up coverage reporting with 80% threshold and multiple formats
- Create tests/ directory structure with unit/ and integration/ subdirs
- Add comprehensive shared fixtures in conftest.py
- Update .gitignore with testing and development artifacts
- Create validation tests to verify infrastructure setup
- Configure Poetry scripts for 'test' and 'tests' commands
- Document testing setup and known ARM64 compatibility issues

google-cla bot commented Jun 15, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
