GitHub - lucia-vargas-a/movielens: Evaluation

MovieLens dataset analysis

Thank you for reviewing this project!!

The goal is to analyze the MovieLens dataset by designing an effective graph structure and data pipeline.

Project components

An effective graph structure derived from the dataset. The graph was built using the Arrows tool for Neo4j graph databases.

Graph design
The design of a data pipeline to ingest the data into the graph database
An API to retrieve individual nodes in the graph, as well as functionality to search the graph and retrieve the results
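To illustrate the retrieval and search functionality listed above, here is a minimal sketch of how such an API might build its Cypher queries. The function names, the `Movie` label, and the `movieId` and `title` properties are assumptions made for this example; the actual schema is defined by the graph design, and the actual endpoints live in the rest_api folder.

```python
from typing import Any, Dict, Tuple

Query = Tuple[str, Dict[str, Any]]


def _check_identifier(name: str) -> None:
    # Labels and property names cannot be passed as Cypher parameters,
    # so they are validated before being interpolated into the query text.
    if not name.isidentifier():
        raise ValueError(f"not a plain identifier: {name!r}")


def node_by_id_query(label: str, key: str, value: Any) -> Query:
    """Build a parameterized query that fetches a single node (hypothetical helper)."""
    _check_identifier(label)
    _check_identifier(key)
    cypher = f"MATCH (n:{label} {{{key}: $value}}) RETURN n LIMIT 1"
    return cypher, {"value": value}


def search_query(label: str, prop: str, text: str, limit: int = 25) -> Query:
    """Build a case-insensitive substring search over one property (hypothetical helper)."""
    _check_identifier(label)
    _check_identifier(prop)
    cypher = (
        f"MATCH (n:{label}) WHERE toLower(n.{prop}) CONTAINS toLower($text) "
        "RETURN n LIMIT $limit"
    )
    return cypher, {"text": text, "limit": limit}


# Assumed schema: (:Movie {movieId, title})
by_id = node_by_id_query("Movie", "movieId", 1)
search = search_query("Movie", "title", "toy")
```

Each helper returns the query text together with its parameters, so user input is always passed to the database as a parameter instead of being concatenated into the query.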

Bonus:

A unit test to validate that a dataset of movies was loaded completely

Unit test code
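A completeness check of this kind can be sketched by comparing the number of data rows in the source CSV with the number of nodes in the graph. The example below is an illustration, not the project's test.py: the sample rows are illustrative, and `count_movie_nodes` is a hypothetical stand-in for a database call such as `MATCH (m:Movie) RETURN count(m)`.

```python
import csv
import io
import unittest


def count_csv_rows(csv_file) -> int:
    """Count the data rows of a CSV file-like object, skipping the header."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    return sum(1 for row in reader if row)


class LoadCompletenessTest(unittest.TestCase):
    # Illustrative sample; the real test would open the configured CSV file.
    SAMPLE = (
        "movieId,title,genres\n"
        "1,Toy Story (1995),Animation\n"
        "2,Jumanji (1995),Adventure\n"
    )

    def count_movie_nodes(self) -> int:
        # Hypothetical stand-in for querying the graph database.
        return 2

    def test_all_rows_loaded(self):
        expected = count_csv_rows(io.StringIO(self.SAMPLE))
        self.assertEqual(expected, self.count_movie_nodes())
```

Saved to a file, the test can be run with `python -m unittest`.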

Data pipeline code details

Dataset source files

The dataset source files are defined as configuration parameters, so they can be changed without modifying the code.

The configuration file can be found here
The code to read the configuration can be found in the function get_config()
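As a rough sketch of what reading such a configuration can look like, the example below assumes an INI-style file with a `[dataset]` section. The file format, the section name, and the function signature are all assumptions; the project's own get_config() may differ.

```python
import configparser
from typing import Dict


def get_config(path: str, section: str = "dataset") -> Dict[str, str]:
    """Read one section of an INI file into a plain dict (assumed format)."""
    parser = configparser.ConfigParser()
    # read() returns the list of files it parsed; empty means the file is missing.
    if not parser.read(path):
        raise FileNotFoundError(path)
    if not parser.has_section(section):
        raise KeyError(f"missing section [{section}] in {path}")
    return dict(parser[section])
```

With a file containing `[dataset]` followed by `movies = data/movies.csv`, the call `get_config(path)` would return `{"movies": "data/movies.csv"}`, so pointing the pipeline at a different file only requires editing the configuration.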

Dataset consumption

Implemented in the class Moviliens_Consumer, which contains three methods:

__init__ to initialize the class components
create_constraints() to build the constraints required before loading data into the graph database
create_from_dataset() to consume the dataset from the CSV files

The main function to consume the data can be found here
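The three-step flow described above (initialize, create constraints, load from CSV) can be sketched as follows. This is an illustrative stand-in, not the code in consumer.py: the class name `ConsumerSketch`, the `run` callable, and the `Movie` schema are assumptions, and the constraint statement uses Neo4j 5 syntax. Creating the uniqueness constraint first prevents duplicate nodes and gives `MERGE` an index to look up existing ones.

```python
import csv
import io
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical hook: anything that executes a Cypher statement with parameters,
# for example a thin wrapper around a database session.
Runner = Callable[[str, Dict[str, Any]], None]


class ConsumerSketch:
    """Illustrative stand-in for Moviliens_Consumer."""

    def __init__(self, run: Runner) -> None:
        self.run = run

    def create_constraints(self) -> None:
        # Neo4j 5 syntax; older versions use a different form.
        self.run(
            "CREATE CONSTRAINT movie_id IF NOT EXISTS "
            "FOR (m:Movie) REQUIRE m.movieId IS UNIQUE",
            {},
        )

    def create_from_dataset(self, csv_file: Iterable[str]) -> int:
        """Merge one Movie node per CSV row and return the number of rows read."""
        loaded = 0
        for row in csv.DictReader(csv_file):
            self.run(
                "MERGE (m:Movie {movieId: $movieId}) SET m.title = $title",
                {"movieId": int(row["movieId"]), "title": row["title"]},
            )
            loaded += 1
        return loaded


# Usage with a recording runner instead of a live database.
statements: List[Tuple[str, Dict[str, Any]]] = []
consumer = ConsumerSketch(lambda query, params: statements.append((query, params)))
consumer.create_constraints()
loaded = consumer.create_from_dataset(
    io.StringIO("movieId,title\n1,Toy Story (1995)\n")
)
```

Injecting the runner keeps the sketch independent of any driver and makes the loading logic testable without a running database.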

Selected languages

The graph database is implemented in Neo4j
The code to consume the data is written in Python 3
The repository is hosted on GitHub
The container is built using Docker

author: Lucía Vargas
linkedin
