```bibtex
@inproceedings{
  cai2022network,
  title={Network Augmentation for Tiny Deep Learning},
  author={Han Cai and Chuang Gan and Ji Lin and Song Han},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=TYw3-OlrRm-}
}
```
Requirements:
- Python 3.8.5
- PyTorch 1.8.2
- torchpack
- torchprofile
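A possible way to set up this environment (a sketch only; the environment name and exact install commands are assumptions, not taken from this repo):

```bash
# Hypothetical setup commands; adjust to your own environment.
conda create -n netaug python=3.8.5 -y
conda activate netaug
# Install PyTorch 1.8.2 (LTS); pick the command matching your CUDA version from pytorch.org.
pip install torch torchvision
pip install torchpack torchprofile
```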
Model | #Params | #MACs | ImageNet Top-1 (%) | Pretrained weights
---|---|---|---|---
MobileNetV2-Tiny + NetAug | 0.75M | 23.5M | 53.3 | pth
MCUNet + NetAug | 0.74M | 81.8M | 62.7 | pth
ProxylessNAS-Mobile (w0.35, r160) + NetAug | 1.8M | 35.7M | 60.8 | pth
More are available on Google Drive.
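As a rough sanity check, the #Params and #MACs columns can be reproduced with torchprofile (listed in the requirements above). The sketch below assumes a `build_model` helper, which is a placeholder for however the repo actually constructs its networks:

```python
import torch
from torchprofile import profile_macs

# `build_model` is a hypothetical placeholder for the repo's model construction code.
model = build_model("proxylessnas-0.35")
model.eval()

# image_size=160, matching the evaluation example below
inputs = torch.randn(1, 3, 160, 160)
macs = profile_macs(model, inputs)  # number of multiply-accumulate operations
params = sum(p.numel() for p in model.parameters())
print(f"#Params: {params / 1e6:.2f}M, #MACs: {macs / 1e6:.1f}M")
```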
To evaluate pretrained models, please run eval.py.
Example:
```bash
torchpack dist-run -np 1 python eval.py \
    --dataset imagenet --data_path /dataset/imagenet/ \
    --image_size 160 \
    --model proxylessnas-0.35 \
    --init_from <path_of_pretrained_weight>
```
Scripts for training models with NetAug on ImageNet are available under the folder `bash/imagenet`.
Notes:
- With NetAug, the expand ratio of the augmented model becomes very large. We find that the fan-out (`fout`) initialization strategy does not work well for such models, so we use `nn.init.kaiming_uniform_` initialization when NetAug is enabled.
- At the beginning of each epoch, we sort the channels by their L1 norm, which forces the target model to keep the most important channels.
- We stop augmenting the width multiplier (i.e., the width-multiplier augmentation ratio is fixed to 1.0) in the second half of the training epochs, which slightly improved the results in our early experiments.
- With NetAug, the running mean and running variance in BN layers are not accurate. Therefore, whenever NetAug is used, we re-estimate the BN running statistics on a subset of the training images after obtaining the trained model (see the sketch below).
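A minimal sketch of the BN re-estimation step from the last note above, assuming a standard PyTorch training DataLoader; the actual procedure in this repo (number of images, averaging scheme) may differ:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def reset_bn_statistics(model, data_loader, num_batches=100):
    """Re-estimate BN running mean/var on a subset of the training images."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()   # clear the inaccurate statistics
            m.momentum = None         # accumulate a simple average over all forwarded batches
    model.train()  # BN layers update running stats only in training mode
    for i, (images, _) in enumerate(data_loader):
        if i >= num_batches:
            break
        model(images)  # forward pass only; no loss or backward needed
    model.eval()
```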
To run transfer learning experiments, first download our pretrained weights or train the models on the pretraining dataset yourself. Scripts are available under the folder `bash/transfer/`.
Related projects:
- TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning (NeurIPS'20)
- MCUNet: Tiny Deep Learning on IoT Devices (NeurIPS'20, spotlight)
- Once-for-All: Train One Network and Specialize it for Efficient Deployment (ICLR'20)
- ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (ICLR'19)