Walkthrough: AlexNet
Contents:
- About Minerva owl.net
- About ImageNet
- About AlexNet
- Training AlexNet using Minerva
- Multi-view classification using AlexNet
- Using AlexNet to extract features
owl.net is a DNN training framework built on owl, Minerva's Python interface. The package has two main purposes: 1) to give Minerva users a simple way to train deep neural networks for computer vision problems; 2) to serve as a prototype showing how to build user applications that take advantage of Minerva.
We borrow Caffe's well-defined network and solver configuration file formats, but execution is carried out by the Minerva engine. It showcases Minerva's flexible interface (Caffe's main functionality is rebuilt in several hundred lines) and its computational efficiency (multi-GPU training).
If you are not familiar with the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), please see here. The classification task contains 1.28 million training images belonging to 1000 classes.
To make I/O efficient, we recommend converting the original images to LMDB after you download the dataset. The tool provided by Caffe can do the conversion. After converting the images, we need to compute the per-pixel mean over the dataset. During training, this mean is subtracted from each image to produce zero-mean input. The mean file for ILSVRC12 can be downloaded with the script provided by Caffe.
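To make the mean-subtraction step concrete, here is a minimal sketch in plain Python. It is not owl.net or Caffe code: the function names `compute_mean_image` and `subtract_mean` and the toy pixel values are made up for illustration, and each image is assumed to be a flat list of pixel values of the same length.

```python
def compute_mean_image(images):
    """Return the per-pixel mean over a list of equally sized flat images."""
    if not images:
        raise ValueError("need at least one image")
    n_pixels = len(images[0])
    sums = [0.0] * n_pixels
    for img in images:
        if len(img) != n_pixels:
            raise ValueError("all images must have the same size")
        for i, value in enumerate(img):
            sums[i] += value
    return [s / len(images) for s in sums]


def subtract_mean(image, mean_image):
    """Subtract the dataset mean from one image, pixel by pixel."""
    return [value - m for value, m in zip(image, mean_image)]


# Toy "dataset": three images of four pixels each (hypothetical values).
dataset = [
    [0.0, 10.0, 20.0, 30.0],
    [10.0, 20.0, 30.0, 40.0],
    [20.0, 30.0, 40.0, 50.0],
]
mean_image = compute_mean_image(dataset)
centered = [subtract_mean(img, mean_image) for img in dataset]
```

After this step every pixel position averages to zero across the dataset, which is the zero-mean input described above. In practice the mean is computed once over the whole training set and stored in a file, so it is not recomputed for each batch.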
AlexNet (http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf) was proposed for ILSVRC2012. It was the winning model of the ILSVRC2012 classification task, beating the non-DNN methods by a large accuracy margin. It contains 5 convolutional layers and 3 fully-connected layers. During training, randomness is introduced by the data augmentation process and the dropout layers. These details are defined in the configuration file provided by Caffe. Note that we currently do not support convolutions with more than one group, since a single recently released GPU has enough memory to hold the whole model.
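As a rough illustration of the 5 convolutional + 3 fully-connected structure, the sketch below traces activation shapes through the network. The layer hyperparameters (227x227 input crop, kernel sizes, strides, padding, channel counts) are assumptions taken from Caffe's reference AlexNet definition, not from this page. The table `LAYERS` and the helpers `out_size` and `trace_shapes` are hypothetical names. Normalization, ReLU and dropout layers are left out because they do not change shapes, and groups are ignored, in line with the single-group restriction noted above.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    # Integer division is exact for every layer in the table below.
    return (size + 2 * pad - kernel) // stride + 1


# (name, kind, output channels, kernel, stride, pad); assumed values.
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool5", "pool", None, 3, 2, 0),
    ("fc6", "fc", 4096, None, None, None),
    ("fc7", "fc", 4096, None, None, None),
    ("fc8", "fc", 1000, None, None, None),
]


def trace_shapes(channels=3, size=227):
    """Return {layer name: output shape} for the table above."""
    shapes = {}
    for name, kind, out_ch, kernel, stride, pad in LAYERS:
        if kind == "fc":
            # Fully-connected layers flatten their input to a vector.
            shapes[name] = (out_ch,)
            continue
        size = out_size(size, kernel, stride, pad)
        if kind == "conv":
            channels = out_ch  # pooling keeps the channel count
        shapes[name] = (channels, size, size)
    return shapes


shapes = trace_shapes()
for layer_name, shape in shapes.items():
    print(layer_name, shape)
```

Under these assumptions the last pooling layer produces a 256 x 6 x 6 activation, so the first fully-connected layer sees 9216 inputs.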