Model Compression: Pruning

Overview

TensorFlow-based implementation of Learning both Weights and Connections for Efficient Neural Networks by Han S., Pool J., et al.

Pruning is a model compression technique that lets the user shrink a model to a smaller size with only a marginal loss in accuracy. Pruning also allows the model to be optimized for real-time inference on resource-constrained devices.

For more information on Model Compression and Pruning, please read Model Compression via Pruning.

Concepts Utilised

  • Magnitude Based Pruning.
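As an illustrative sketch (not the repository's actual code), magnitude-based pruning amounts to zeroing the smallest-magnitude fraction of a weight tensor; the function name and flat-list representation below are assumptions for brevity:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of weights.

    `weights` is a flat list of floats; returns a new list.
    Illustrative sketch only -- not the repository's actual code.
    """
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [w if abs(w) > threshold else 0.0 for w in weights]

pruned = magnitude_prune([0.05, -0.9, 0.4, -0.01], sparsity=0.5)
# the two smallest-magnitude weights (0.05 and -0.01) are zeroed
```

In a real TensorFlow model the same idea is applied per layer, typically by multiplying each weight tensor with a binary mask.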

Explanation

This implementation uses a dataset that is not publicly available, but the code can be applied to other datasets.

The code has two different implementations:

  • Retrain Attempt: Inducing sparsity at every iteration while retraining.

  • Baseline Attempt: Inducing sparsity by setting weights whose magnitude falls below a certain threshold to 0.0, without retraining.
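To make the difference between the two attempts concrete, here is a minimal numpy toy under stated assumptions (the linear model, learning rate, and variable names are illustrative, not the repository's code): the baseline would stop after a single thresholding pass, while the retrain loop alternates pruning with further gradient steps so the surviving weights can recover accuracy:

```python
import numpy as np

# Toy regression problem: only the first 3 of 8 weights matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
true_w = np.array([2.0, -1.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_w

w = rng.normal(size=8) * 0.1   # model weights
mask = np.ones(8)              # 1 = connection kept, 0 = pruned

for prune_step in range(3):
    # Retrain phase: gradient descent on the surviving (masked) weights.
    for _ in range(200):
        grad = X.T @ (X @ (w * mask) - y) / len(y)
        w = (w - 0.1 * grad) * mask   # pruned weights stay at zero
    # Prune phase: remove the smallest-magnitude surviving weight.
    alive = np.flatnonzero(mask)
    mask[alive[np.argmin(np.abs(w[alive]))]] = 0.0
    w *= mask

loss = float(np.mean((X @ w - y) ** 2))
# Three connections were pruned, yet the fit stays near-perfect
# because the important weights were retrained after each pruning pass.
```

Pruning without the inner retraining loop (the baseline) removes the same connections but leaves the surviving weights unadjusted, which is why the paper reports that iterative prune-and-retrain reaches higher sparsity at the same accuracy.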

Copyright

Author @Parth Malpathak

All code and implementations are part of the 10605 (Machine Learning for Large Datasets) course requirements. Please review Carnegie Mellon University's academic integrity policy before cloning this repository or duplicating the code.
