Optimizing Millions of Hyperparameters by Implicit Differentiation

This repository is an implementation of Optimizing Millions of Hyperparameters by Implicit Differentiation.
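At its core, the method computes hypergradients with the implicit function theorem (IFT): the gradient of the validation loss with respect to the hyperparameters is the validation-loss gradient times an inverse training-loss Hessian times a mixed second derivative, with the inverse Hessian-vector product approximated by a truncated Neumann series. The sketch below illustrates that computation in PyTorch; the function names and the omission of the direct gradient term are illustrative choices here, not the repository's API.

import torch

def neumann_inverse_hvp(train_loss, params, v, num_terms=10, alpha=0.1):
    # Approximate v H^{-1} with the truncated Neumann series
    # alpha * sum_k (I - alpha H)^k v, using only Hessian-vector products.
    p = [alpha * vi for vi in v]       # current term of the series
    total = [pi.clone() for pi in p]   # running sum
    grads = torch.autograd.grad(train_loss, params, create_graph=True)
    for _ in range(num_terms):
        # Hessian-vector product: d/dw [ (dL_train/dw) . p ]
        hvp = torch.autograd.grad(grads, params, grad_outputs=p, retain_graph=True)
        p = [pi - alpha * hi for pi, hi in zip(p, hvp)]
        total = [ti + pi for ti, pi in zip(total, p)]
    return total

def ift_hypergradient(val_loss, train_loss, params, hparams):
    # dL_val/dw
    v = torch.autograd.grad(val_loss, params, retain_graph=True)
    # (dL_val/dw) H^{-1}, via the Neumann approximation
    v_inv_H = neumann_inverse_hvp(train_loss, params, v)
    # Mixed second derivative: d/dlambda [ (dL_train/dw) . v_inv_H ]
    grads = torch.autograd.grad(train_loss, params, create_graph=True)
    mixed = torch.autograd.grad(grads, hparams, grad_outputs=v_inv_H)
    # IFT hypergradient; the direct term dL_val/dlambda is omitted here
    return [-m for m in mixed]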

Running Experiments

Setup Environment

Create a Python 3.7 environment and install the required packages:

conda create -n ift-env python=3.7
source activate ift-env
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
pip install -r requirements.txt

Install Jupyter lab:

conda install -c conda-forge jupyterlab
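
Optionally, verify that PyTorch and CUDA are visible (a quick sanity check, not part of the repository):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"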

Simple test

Run the following tests to verify that the environment is set up correctly:

mnist_test.py

python mnist_test.py
  --datasize <train set size>
  --valsize <validation set size>
  --lrh <hyperparameter learning rate; must be negative>
  --epochs <min epochs for training the model>
  --hepochs <# of hyperparameter update iterations>
  --l2 <initial log weight decay>
  --restart <whether to reinitialize model weights after each hyperparameter update>
  --model <cnn for a LeNet-like model, mlp for logistic regression and MLPs>
  --dataset <CIFAR10 or MNIST>
  --num_layers <# of hidden layers for the MLP>
  --hessian <KFAC: KFAC estimate; direct: true Hessian and inverse>
  --jacobian <direct: true Jacobian; product: use d_L/d_theta * d_L/d_lambda>

Trained models after each hyperparameter update are stored in the folder defined on line 627 of mnist_test.py. To use conjugate gradient (CG) to compute the inverse of the Hessian, change the hyperparameter updater on line 660 (a CG sketch follows the example command below).

python mnist_test.py --datasize 40000 --valsize 10000 --lrh 0.01 --epochs=100 --hepochs=10 --l2=1e-5 --restart=10 --model=mlp --dataset=MNIST --num_layers=1 --hessian=KFAC --jacobian=direct
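
For reference, the CG alternative mentioned above solves H x = v using only Hessian-vector products. A minimal sketch of the idea, with illustrative names (not the repository's implementation) and assuming a positive-definite Hessian:

import torch

def cg_solve(hvp_fn, v, num_iters=20, tol=1e-10):
    # Solve H x = v by conjugate gradient, where hvp_fn(x) returns H x
    # (e.g., built from torch.autograd.grad on a flattened parameter vector).
    x = torch.zeros_like(v)
    r = v.clone()   # residual: v - H x, with x = 0 initially
    p = r.clone()   # search direction
    rs_old = r.dot(r)
    for _ in range(num_iters):
        Hp = hvp_fn(p)
        alpha = rs_old / p.dot(Hp)
        x = x + alpha * p
        r = r - alpha * Hp
        rs_new = r.dot(r)
        if rs_new.sqrt() < tol:  # converged
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x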

Deployment

First, make sure you are on the master node:

ssh <USERNAME>@q.vectorinstitute.ai

Submit a job to the Slurm scheduler:

srun --partition=gpu --gres=gpu:1 --mem=4GB python mnist_test.py

Or, submit a batch of jobs defined by srun_script.sh:

sbatch --array=0-2 srun_script.sh
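
A minimal array script might look like the following; this is a hypothetical sketch, not the actual contents of srun_script.sh:

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --mem=4GB

# SLURM_ARRAY_TASK_ID ranges over the values passed to --array (0-2 above);
# a real script would typically use it to select a per-job configuration.
echo "Running array task ${SLURM_ARRAY_TASK_ID}"
python mnist_test.py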

View queued jobs for a user:

squeue -u $USERNAME

Cancel jobs for a user:

scancel -u $USERNAME

Cancel a specific job:

scancel $JOBID

Experiments

Below are commands for deploying the experiments, with and without Slurm.

To deploy data generation for all of the experiments:

sbatch run_all.sh

Train Data Augmentation Network and/or Loss Reweighting Network

Data Augmentation Network

python train_augment_net2.py --use_augment_net

Loss Reweighting Network

python train_augment_net2.py --use_reweighting_net --loss_weight_type=softmax
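
Conceptually, the loss reweighting network produces a per-example weight on the training loss, normalized with a softmax over the batch. A minimal sketch of the idea, with illustrative names (not the train_augment_net2.py implementation):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ReweightingNet(nn.Module):
    # Maps each example to a scalar score; a softmax over the batch yields weights.
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):
        scores = self.net(x.view(x.size(0), -1)).squeeze(-1)
        # Scale by batch size so the average weight is 1
        return F.softmax(scores, dim=0) * x.size(0)

def reweighted_loss(model, reweighter, x, y):
    per_example = F.cross_entropy(model(x), y, reduction='none')
    return (reweighter(x) * per_example).mean()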

Regularization Experiments

LSTM Experiments

The LSTM code in this repository is built on the AWD-LSTM codebase. These commands should be run from inside the rnn folder.

First, download the PTB dataset by running:

./getdata.sh

Tune LSTM hyperparameters with 1-step unrolling

python train.py
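
In 1-step unrolling, the hypergradient is obtained by differentiating the validation loss through a single differentiable optimizer step on the training loss. A minimal sketch of this idea (illustrative, not the rnn/train.py implementation):

import torch

def one_step_unrolled_hypergrad(params, hparams, train_loss_fn, val_loss_fn, lr=0.1):
    # One differentiable SGD step on the training loss: w' = w - lr * dL_train/dw
    train_loss = train_loss_fn(params, hparams)
    grads = torch.autograd.grad(train_loss, params, create_graph=True)
    new_params = [p - lr * g for p, g in zip(params, grads)]
    # Backpropagate the validation loss at w' through the step to the hyperparameters
    val_loss = val_loss_fn(new_params)
    return torch.autograd.grad(val_loss, hparams)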

STN Comparison

To train an STN, run the following command from inside the stn folder:

python hypertrain.py --tune_all --save
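
An STN (self-tuning network) replaces ordinary layers with hypermodels whose weights respond to the current hyperparameters, approximating a best-response function w*(lambda). A minimal sketch of such a layer, with illustrative names (cf. stn/hypermodels/hyperlinear.py, whose actual implementation may differ):

import torch
import torch.nn as nn

class HyperLinear(nn.Module):
    # A linear layer whose effective weights shift with the hyperparameters.
    def __init__(self, in_dim, out_dim, num_hparams):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)                   # base weights
        self.response = nn.Linear(in_dim, out_dim, bias=False)   # direction of change
        self.scale = nn.Linear(num_hparams, 1, bias=False)       # response magnitude

    def forward(self, x, hparams):
        return self.base(x) + self.scale(hparams) * self.response(x)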

Train a baseline model to get a checkpoint

python train_checkpoint.py --dataset cifar10 --model resnet18 --data_augmentation

Finetune the trained checkpoint

python finetune_checkpoint.py --load_checkpoint=baseline_checkpoints/cifar10_resnet18_sgdm_lr0.1_wd0.0005_aug1.pt --num_finetune_epochs=10 --wdecay=1e-4

Experiment 1

Explain what the experiment does and which figure in the paper it corresponds to.

To run the Python script:

python script.py

To deploy with Slurm:

srun ...

Project Structure

.
├── HAM_dataset.py
├── README.md
├── cutout.py
├── data_loaders.py
├── finetune_checkpoint.py
├── finetune_ift_checkpoint.py
├── grid_search.py
├── images
├── inverse_comparison.py
├── isic_config.py
├── isic_loader.py
├── kfac.py
├── kfac_utils.py
├── minst_ref.py
├── mnist_test.py
├── models
│   ├── __init__.py
│   ├── resnet.py
│   ├── resnet_cifar.py
│   ├── simple_models.py
│   ├── unet.py
│   └── wide_resnet.py
├── papers
│   ├── haoping_project
│   │   ├── main.tex
│   │   ├── neurips2019.tex
│   │   ├── neurips_2019.sty
│   │   └── references.bib
│   └── nips
│       ├── main.tex
│       ├── neurips_2019.sty
│       └── references.bib
├── random_search.py
├── requirements.txt
├── rnn
│   ├── config_scripts
│   │   ├── dropoute_ift_no_lrdecay.yaml
│   │   ├── dropouto
│   │   │   ├── dropouto_2layer_lrdecay.yaml
│   │   │   ├── dropouto_2layer_no_lrdecay.yaml
│   │   │   ├── dropouto_ift_lrdecay.yaml
│   │   │   ├── dropouto_ift_neumann_1_lrdecay.yaml
│   │   │   ├── dropouto_ift_neumann_1_no_lrdecay.yaml
│   │   │   ├── dropouto_ift_no_lrdecay.yaml
│   │   │   ├── dropouto_lrdecay.yaml
│   │   │   ├── dropouto_no_lrdecay.yaml
│   │   │   └── dropouto_perparam_ift_no_lrdecay.yaml
│   │   └── wdecay
│   │       ├── ift_wdecay_per_param_no_lrdecay.yaml
│   │       ├── wdecay_ift_lrdecay.yaml
│   │       └── wdecay_ift_neumann_1_lrdecay.yaml
│   ├── create_command_script.py
│   ├── data.py
│   ├── embed_regularize.py
│   ├── getdata.sh
│   ├── locked_dropout.py
│   ├── logger.py
│   ├── model_basic.py
│   ├── plot_utils.py
│   ├── rnn_utils.py
│   ├── run_grid_search.py
│   ├── train.py
│   ├── train2.py
│   └── weight_drop.py
├── search_configs
│   ├── cifar100_wideresnet_bern_dropout_sep.yaml
│   ├── cifar100_wideresnet_gauss_dropout_sep.yaml
│   ├── cifar10_resnet32_data_aug.yaml
│   ├── cifar10_resnet32_grid.yaml
│   ├── cifar10_resnet32_random.yaml
│   ├── cifar10_resnet32_wdecay_per_layer.yaml
│   ├── cifar10_wideresnet_bern_dropout.yaml
│   ├── cifar10_wideresnet_bern_dropout_sep.yaml
│   ├── cifar10_wideresnet_gauss_dropout.yaml
│   ├── cifar10_wideresnet_gauss_dropout_sep.yaml
│   ├── isic_grid.yaml
│   └── isic_random.yaml
├── search_scripts
│   ├── cifar100_wideresnet_bern_dropout_sep
│   ├── cifar100_wideresnet_gauss_dropout_sep
│   ├── cifar100_wideresnet_random
│   ├── cifar10_wideresnet_bern_dropout
│   ├── cifar10_wideresnet_bern_dropout_sep
│   ├── cifar10_wideresnet_gauss_dropout
│   └── cifar10_wideresnet_gauss_dropout_sep
├── srun_script.sh
├── stn
│   ├── datasets
│   │   ├── __init__.py
│   │   ├── cifar.py
│   │   └── loaders.py
│   ├── hypermodels
│   │   ├── __init__.py
│   │   ├── alexnet.py
│   │   ├── hyperconv2d.py
│   │   ├── hyperlinear.py
│   │   └── small.py
│   ├── hypertrain.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── alexnet.py
│   │   └── small.py
│   └── util
│       ├── __init__.py
│       ├── cutout.py
│       ├── dropout.py
│       └── hyperparameter.py
├── train.py
├── train_augment_net2.py
├── train_augment_net_graph.py
├── train_augment_net_multiple.py
├── train_augment_net_slurm.py
├── train_baseline.py
├── train_checkpoint.py
└── utils
    ├── csv_logger.py
    ├── discrete_utils.py
    ├── logger.py
    ├── plot_utils.py
    └── util.py

17 directories, 103 files

Authors
