TensorFlow-based implementation of Learning both Weights and Connections for Efficient Neural Networks by Han S., Pool J., et al.
Pruning is a model compression technique that shrinks a model to a smaller size with only a marginal loss in accuracy. Pruning also allows the model to be optimized for real-time inference on resource-constrained devices.
For more information on Model Compression and Pruning, please read Model Compression via Pruning.
- Magnitude-Based Pruning.
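As a rough illustration (not the repository's code), magnitude-based pruning ranks weights by absolute value and zeroes those below a threshold chosen to hit a target sparsity. This NumPy sketch assumes a `sparsity` fraction as its only knob; both the function name and the parameter are hypothetical:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights.

    sparsity: fraction of weights to set to 0.0 (an assumed
    parameter, not taken from the repository).
    """
    flat = np.abs(weights).ravel()
    # Pick the threshold at the given percentile of |w|, so roughly
    # `sparsity` of the weights fall below it.
    threshold = np.percentile(flat, sparsity * 100)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

w = np.array([[0.9, -0.05], [0.01, -0.8]])
pruned, mask = magnitude_prune(w, sparsity=0.5)
# The two small-magnitude entries (0.01 and -0.05) are zeroed;
# 0.9 and -0.8 survive.
```

The mask is returned alongside the pruned weights because retraining schemes typically need it to keep pruned connections at zero.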
This implementation uses a dataset that is not publicly available; however, the code can be applied to other datasets.
The code has two different implementations:
- Retrain Attempt: induces sparsity every iteration while retraining.
- Baseline Attempt: induces sparsity by setting weight values below a certain threshold to 0.0, without retraining.
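The two attempts can be sketched as follows. This is a hedged illustration in plain NumPy (the repository itself uses TensorFlow), with a caller-supplied gradient function standing in for an actual training step; all names and the threshold value are assumptions for the sketch:

```python
import numpy as np

def baseline_prune(weights, threshold):
    """Baseline attempt: one-shot pruning -- zero every weight whose
    magnitude falls below the threshold, with no retraining afterwards."""
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def retrain_prune(weights, threshold, grad_fn, lr=0.1, steps=10):
    """Retrain attempt: induce sparsity every iteration -- after each
    gradient update, re-threshold so small weights return to 0.0."""
    w = weights.copy()
    for _ in range(steps):
        w -= lr * grad_fn(w)              # one (toy) training step
        w[np.abs(w) < threshold] = 0.0    # induce sparsity again
    return w

# Toy usage: quadratic loss 0.5*||w - target||^2, whose gradient
# is simply (w - target). The middle weight is driven to zero and
# stays pruned across iterations.
w = np.array([0.5, 0.01, -0.7])
target = np.array([1.0, 0.0, -1.0])
w_baseline = baseline_prune(w, threshold=0.1)
w_retrained = retrain_prune(w, threshold=0.1,
                            grad_fn=lambda x: x - target,
                            lr=0.5, steps=20)
```

The key difference: the baseline zeroes weights once and stops, while the retrain loop lets the surviving weights keep adapting to the loss while re-pruning after every update.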
Author: @Parth Malpathak
All code and implementations are part of the 10605 (Machine Learning for Large Datasets) course requirements. Please review Carnegie Mellon University's academic integrity policy before cloning this repository and duplicating the code.