YouTube Link - https://youtu.be/lmGOQ3SRoPc
This project shows how to build a classifier that labels emails as clutter or not clutter. Several machine learning models are developed and compared, all built using a model-based approach with the probabilistic programming framework Pyro.
The project uses the Enron Email Dataset, and substantial work goes into preprocessing and preparing the dataset for training the models (all scripts are included).
See the Jupyter notebook for details about the dataset, the preprocessing, the probabilistic models used, and the implementation.
The project was developed using Google Colab, but you can run it on your local machine provided you have Python and Jupyter installed.