This is a collection of Data Science projects for learning and exploration purposes. The projects are grouped by subject/topic and cover different approaches, algorithms and data-sets. Each project consists of a Jupyter notebook and has its own folder under the notebooks folder.
⚠️ This repository contains experimental code and models; they are not production-ready, reusable, optimised or fine-tuned. It is rather a sandbox or playground for learning and trying out different data science and machine learning techniques and approaches. Models might not perform well, and they may overfit or underfit.
Acknowledgements: This repository was originally inspired by 🤖 Interactive Machine Learning Experiments.
Projects were built using different libraries and tools; the most used were pandas, scikit-learn and TensorFlow 2 with the Keras API. The dependencies for each project are listed in a requirements.txt file. For projects where the data-sets were auto-generated or scraped, a data folder is present.
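The layout described above (one folder per project under notebooks, with a requirements.txt file and an optional data folder) can be checked with a short script. This is only a minimal sketch: the function name `describe_projects` is hypothetical and not part of this repository, and it assumes the notebook files sit at the top level of each project folder.

```python
from pathlib import Path


def describe_projects(notebooks_dir):
    """Summarize each project folder found directly under notebooks_dir.

    Hypothetical helper: assumes one folder per project, each holding its
    notebook(s), a requirements.txt file and, optionally, a data folder.
    """
    root = Path(notebooks_dir)
    if not root.is_dir():
        # Nothing to report if the folder does not exist.
        return {}

    summary = {}
    for project in sorted(root.iterdir()):
        # Skip plain files and hidden folders such as .ipynb_checkpoints.
        if not project.is_dir() or project.name.startswith("."):
            continue
        summary[project.name] = {
            # Notebooks at the top level of the project folder (assumption).
            "notebooks": sorted(p.name for p in project.glob("*.ipynb")),
            "has_requirements": (project / "requirements.txt").is_file(),
            "has_data": (project / "data").is_dir(),
        }
    return summary


if __name__ == "__main__":
    for name, info in describe_projects("notebooks").items():
        print(name, info)
```

Run from the repository root, it prints one line per project showing which notebooks it contains and whether the requirements.txt file and data folder are present.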
Blog: For some projects, I have written a dedicated blog post on my website. The projects with a blog post have a 📝 icon link next to the project name.
Project | Notebook | Tags | Dataset
---|---|---|---
Titanic: Machine Learning from Disaster 📝 | | Classification | Titanic
Credit Card Fraud Detection 📝 | | Imbalanced Classification | Credit Card Fraud Detection
To run the repository locally, I suggest using Docker to launch a Jupyter Notebook server.
The server is based on the jupyter/tensorflow-notebook image, which includes popular packages from the scientific Python ecosystem.
Run Jupyter with docker-compose and open the link shown in your terminal (something like http://localhost:8888).
$ docker-compose up
The projects will be available under the notebooks folder.
Feel free to change any settings of the Jupyter Notebook server by editing the docker-compose.yml file.