This repository contains code and notes on a range of machine learning algorithms. All code is written from scratch, except that it uses the NumPy and PyTorch libraries for their data types (np.ndarray and torch.Tensor, respectively).
The Packages directory contains the code for all algorithms and preprocessing modules. Models are stored in the mlr/Models directory, and preprocessing utilities are located in the mlr/Preprocessing directory. The Packages directory also includes the conda environment file (environment.yml) required to run any of the experiments in the Experiments directory. Instructions for setting up this environment are given in the Environment Setup and Running Experiments section of this file.
This directory contains experiments on datasets using the machine learning packages created in the Packages directory of this repository.
This directory contains Jupyter notebooks with notes on all the algorithms implemented in the Packages directory of this repository.
This directory contains all the datasets used by the experiments in the Experiments directory of this repository. The following datasets are provided:
Grades: The Student Performance Dataset, used for regression models
Iris: The Iris dataset, used for multiclass classification models
Titanic: The Titanic - Machine Learning from Disaster dataset, used for binary classification models
To set up the conda environment used to run the experiments in the Experiments directory, create and activate the environment, then install the mlr package. This can be done from the root of this project as follows:
cd Packages;
conda env create -f environment.yml;
conda activate MLR;
pip install .
After this, experiments in the Experiments directory can be run from within the newly created MLR conda environment.
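As an optional sanity check after the setup steps above, the following minimal sketch reports whether the expected packages can be found in the active environment. It is not part of this repository: the helper name missing_packages is hypothetical, and the sketch assumes the installed package is importable as mlr and that NumPy and PyTorch are importable as numpy and torch.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in the active environment."""
    # find_spec returns None for a top-level name that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: "mlr" is the import name of the package installed by `pip install .`;
    # "numpy" and "torch" are the import names of NumPy and PyTorch.
    missing = missing_packages(["mlr", "numpy", "torch"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Running this from within the activated MLR environment should list nothing as missing if the setup completed; any name it reports points to a step worth repeating.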