Data Factory

Data Factory is my attempt at building a Python library that makes it easy to build data pipelines for my personal projects, inspired by Azure Data Factory.

Features

Easy-to-use Python API for defining and executing data pipelines.
Support for defining activities and orchestrating them in a pipeline.

Installation

You can install Data Factory using Poetry:

poetry install

To validate that the install works, you can run the unit tests:

pytest

Examples

Activities and Pipelines

from data_factory.pipeline import Pipeline, Activity

# Define your activities
def activity1():
    print("Executing Activity 1")

def activity2():
    print("Executing Activity 2")

# Create activities
activity_1 = Activity("Activity 1", activity1)
activity_2 = Activity("Activity 2", activity2)

# Create a pipeline and add activities
my_pipeline = Pipeline("My Pipeline", activities=[activity_1, activity_2])

# Run the pipeline
my_pipeline.run()
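
To make the pattern above concrete without relying on the library's internals, here is a minimal, self-contained sketch. `SketchActivity` and `SketchPipeline` are hypothetical stand-ins written for illustration only; they are not the classes in `data_factory.pipeline`, and the real implementation may differ. The sketch only assumes what the example shows: an activity wraps a name and a callable, and a pipeline runs its activities in list order.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SketchActivity:
    """Hypothetical stand-in for Activity: a name plus a zero-argument callable."""
    name: str
    func: Callable[[], object]

    def run(self) -> object:
        # Call the wrapped function and pass its result through
        return self.func()


@dataclass
class SketchPipeline:
    """Hypothetical stand-in for Pipeline: runs its activities in list order."""
    name: str
    activities: List[SketchActivity] = field(default_factory=list)

    def run(self) -> List[str]:
        # Return the names of the activities in the order they ran
        executed = []
        for activity in self.activities:
            activity.run()
            executed.append(activity.name)
        return executed


# Record side effects in a list so the execution order is visible
log = []
extract = SketchActivity("Extract", lambda: log.append("extract"))
load = SketchActivity("Load", lambda: log.append("load"))

executed = SketchPipeline("Sketch Pipeline", activities=[extract, load]).run()
print(executed)
```

Returning the executed names from `run()` is a choice made for this sketch so the order is easy to inspect; the source does not say what `Pipeline.run()` returns.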

Orchestrating Pipelines

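from data_factory.pipeline import Pipeline, Activity
from data_factory.orchestrator import PipelineOrchestrator

# A third activity, so the second pipeline has something to run
def activity3():
    print("Executing Activity 3")

activity_3 = Activity("Activity 3", activity3)

# Create pipelines from Activity objects
# (activity_1 and activity_2 are defined in the previous example)
pipeline_1 = Pipeline("Pipeline 1", activities=[activity_1, activity_2])
pipeline_2 = Pipeline("Pipeline 2", activities=[activity_3])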

# Create an orchestrator and add pipelines
orchestrator = PipelineOrchestrator([pipeline_1, pipeline_2])

# Run all pipelines sequentially
orchestrator.run_pipelines(verbose=True)

# Get the run statuses of all pipelines
pipeline_statuses = orchestrator.get_run_statuses()
print(pipeline_statuses)
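
As a rough illustration of sequential orchestration with per-pipeline statuses, the sketch below runs a list of pipelines one after another and records an outcome for each. Everything in it is an assumption made for illustration: `run_all`, the `(name, steps)` pipeline shape, and the `"succeeded"` / `"failed: ..."` status strings are hypothetical and do not describe what `PipelineOrchestrator.get_run_statuses()` actually returns.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical pipeline shape for this sketch: a name and a list of zero-argument steps
SketchPipeline = Tuple[str, List[Callable[[], object]]]


def run_all(pipelines: List[SketchPipeline]) -> Dict[str, str]:
    """Run each pipeline in order and record a status string per pipeline name."""
    statuses: Dict[str, str] = {}
    for name, steps in pipelines:
        try:
            for step in steps:
                step()
            statuses[name] = "succeeded"
        except Exception as exc:
            # A failing step marks its pipeline as failed; later pipelines still run
            statuses[name] = f"failed: {exc}"
    return statuses


def failing_step() -> None:
    raise ValueError("boom")


statuses = run_all([
    ("Pipeline 1", [lambda: None]),
    ("Pipeline 2", [failing_step]),
])
print(statuses)
```

Continuing after a failure is one possible design; stopping at the first failed pipeline is equally reasonable, and the source does not say which behavior the library uses.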

License

This project is licensed under the MIT License - see the LICENSE file for details.
