DISNEY-HOTSTAR

We designed ETL pipelines that extract, transform, and load data seamlessly. The pipelines were developed in Apache Spark with the Scala programming language, using YAML configuration files to drive data extraction from various batch sources for the online video streaming platform Disney Hotstar.

YAML Configuration File:-

  1. Create a YAML configuration file that defines the data sources, the target layer, and the transformations to be applied.
  2. The YAML file should include details such as source connection information, target connection information, and transformation rules (see the sketch below).
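
A hypothetical sketch of what such a configuration file might look like; the key names (source, transformations, target) are illustrative assumptions, not a documented schema:

```yaml
# pipeline.yaml -- hypothetical layout; all key names are assumptions
source:
  format: csv                     # csv | json | parquet | jdbc
  path: /data/input/watch_events.csv
  options:
    header: "true"
transformations:
  - filter: "country = 'IN'"
  - selectColumns: [user_id, title, watch_time]
target:
  format: parquet
  path: /data/output/watch_events
  mode: overwrite
```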

Apache Spark with Scala:-

  1. Use Apache Spark, a fast, distributed data processing engine, to implement the ETL pipeline.
  2. Scala, the language Spark itself is written in, is used to write the pipeline code (see the entry-point sketch below).
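
A minimal entry-point sketch, assuming a standard SparkSession bootstrap; the object name and the runPipeline driver are illustrative, not taken from the repository:

```scala
import org.apache.spark.sql.SparkSession

object HotstarEtl {
  def main(args: Array[String]): Unit = {
    // One SparkSession shared by the extract, transform, and load stages.
    val spark = SparkSession.builder()
      .appName("disney-hotstar-etl")
      .getOrCreate()
    try {
      // runPipeline is a hypothetical driver wiring the stages together,
      // with the YAML config path passed on the command line.
      // runPipeline(spark, configPath = args(0))
    } finally {
      spark.stop()
    }
  }
}
```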

Data Extraction:-

  1. Read the YAML configuration file to obtain the details of the data sources.
  2. Use Spark's built-in or custom connectors to extract data from various batch sources such as databases and JSON, Parquet, or CSV files (see the sketch below).
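
A hedged sketch of this step, assuming SnakeYAML as the parser (any YAML library would do) and the hypothetical config layout shown earlier; written for Scala 2.13 (on 2.12, use scala.collection.JavaConverters instead):

```scala
import java.io.FileInputStream
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.yaml.snakeyaml.Yaml
import scala.jdk.CollectionConverters._

// Parse the whole YAML file into a map of sections.
def loadConfig(path: String): Map[String, Any] = {
  val raw: Any = new Yaml().load(new FileInputStream(path))
  raw.asInstanceOf[java.util.Map[String, Any]].asScala.toMap
}

// Extract: drive spark.read from the "source" section of the config.
def extract(spark: SparkSession, config: Map[String, Any]): DataFrame = {
  val source = config("source").asInstanceOf[java.util.Map[String, Any]].asScala
  val options = source.get("options")
    .map(_.asInstanceOf[java.util.Map[String, String]].asScala.toMap)
    .getOrElse(Map.empty[String, String])
  spark.read
    .format(source("format").toString) // e.g. csv, json, parquet, jdbc
    .options(options)
    .load(source("path").toString)
}
```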

Data Transformation:-

  1. Apply the defined transformations to the extracted data.
  2. Use Spark's DataFrame API or Spark SQL to perform transformations like filtering, aggregations, joins, etc.
  3. The transformation rules can be defined in the YAML file, specifying the operations to be performed on the data (see the sketch below).
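
A hedged sketch of a config-driven transform using the DataFrame API; the two rule shapes (filter, selectColumns) match the hypothetical YAML above, and converting the parsed YAML collections into Scala types is elided:

```scala
import org.apache.spark.sql.DataFrame

// Apply the configured rules in order, folding each one over the frame.
def transform(df: DataFrame, rules: Seq[Map[String, Any]]): DataFrame =
  rules.foldLeft(df) { (current, rule) =>
    rule.keys.head match {
      case "filter" =>
        // e.g. filter: "country = 'IN'" -- a SQL expression string
        current.filter(rule("filter").toString)
      case "selectColumns" =>
        // e.g. selectColumns: [user_id, title, watch_time]
        val cols = rule("selectColumns").asInstanceOf[Seq[String]]
        current.select(cols.head, cols.tail: _*)
      case other =>
        throw new IllegalArgumentException(s"Unknown transformation rule: $other")
    }
  }
```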

Data Loading:-

  1. Read the target layer details from the YAML configuration file.
  2. Use Spark's built-in or custom connectors to write the transformed data to the target layer, such as a database or JSON, Parquet, or CSV files (see the sketch below).
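
A short sketch of the loading step under the same assumed config layout:

```scala
import org.apache.spark.sql.DataFrame

// Load: drive df.write from the "target" section of the config.
def load(df: DataFrame, target: Map[String, Any]): Unit =
  df.write
    .format(target("format").toString)                        // e.g. parquet
    .mode(target.getOrElse("mode", "errorifexists").toString) // overwrite | append | ignore
    .save(target("path").toString)
```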

Error Handling and Monitoring:-

  1. Implement error handling mechanisms to catch and log any exceptions or failures during the ETL process, so the pipeline can be monitored (see the sketch below).
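
One way to sketch this is to wrap each stage in Try and log through SLF4J, which is already on Spark's classpath; the stage names are illustrative:

```scala
import org.slf4j.LoggerFactory
import scala.util.{Failure, Success, Try}

val logger = LoggerFactory.getLogger("disney-hotstar-etl")

// Run a pipeline stage, logging its outcome so the job can be monitored;
// failures are re-thrown so the Spark application is marked as failed.
def runStage[A](name: String)(body: => A): A =
  Try(body) match {
    case Success(result) =>
      logger.info(s"Stage '$name' completed")
      result
    case Failure(e) =>
      logger.error(s"Stage '$name' failed: ${e.getMessage}", e)
      throw e
  }

// Usage: val df = runStage("extract")(extract(spark, config))
```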
