Created using Colaboratory

gan3sh500 · gan3sh500 · commit 18e5d533d8a0 · 2019-05-11T09:29:39.000+05:30
diff --git a/notebook.ipynb b/notebook.ipynb
@@ -12,7 +12,8 @@
     "kernelspec": {
       "name": "python3",
       "display_name": "Python 3"
-    }
+    },
+    "accelerator": "GPU"
   },
   "cells": [
     {
@@ -25,6 +26,18 @@
         "<a href=\"https://colab.research.google.com/github/gan3sh500/mixmatch-pytorch/blob/master/notebook.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "hmES8DZG7pFc",
+        "colab_type": "text"
+      },
+      "source": [
+        "This notebook implements the MixMatch technique from the [paper](https://arxiv.org/pdf/1905.02249.pdf) MixMatch: A Holistic Approach to Semi-Supervised Learning and attempts to reproduce its CIFAR10 results with WideResnet28. \n",
+        "\n",
+        "It depends on PyTorch, NumPy and imgaug. The WideResnet28 model code is taken from [meliketoy](https://github.com/meliketoy/wide-resnet.pytorch/blob/master/networks/wide_resnet.py)'s GitHub repository. Hopefully I can train this on Colab. :)"
+      ]
+    },
     {
       "cell_type": "code",
       "metadata": {
@@ -35,12 +48,22 @@
       "source": [
         "import torch\n",
         "import numpy as np\n",
-        "import imgaug as ia\n",
         "import imgaug.augmenters as iaa"
       ],
       "execution_count": 0,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Z_V6d_r-8QUi",
+        "colab_type": "text"
+      },
+      "source": [
+        "Now that we have the basic imports out of the way, let's get to it. \n",
+        "First we define a function that produces augmented versions of a given batch of images. The function below returns the function that does that. "
+      ]
+    },
     {
       "cell_type": "code",
       "metadata": {
@@ -62,6 +85,16 @@
       "execution_count": 0,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "se8HRC8z8byR",
+        "colab_type": "text"
+      },
+      "source": [
+        "Next we define the sharpening function, which sharpens the prediction obtained by averaging the predictions over all augmentations of an unlabeled image. It has the same effect as applying a temperature within the softmax function, but it is applied to the probabilities. "
+      ]
+    },
     {
       "cell_type": "code",
       "metadata": {
@@ -77,6 +110,16 @@
       "execution_count": 0,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "IhvvJUKN80lU",
+        "colab_type": "text"
+      },
+      "source": [
+        "A simple implementation of mixup from the [paper](https://arxiv.org/pdf/1710.09412.pdf) mixup: Beyond Empirical Risk Minimization, which MixMatch also uses."
+      ]
+    },
     {
       "cell_type": "code",
       "metadata": {
@@ -94,6 +137,16 @@
       "execution_count": 0,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "HU0JHbCh90o5",
+        "colab_type": "text"
+      },
+      "source": [
+        "This covers Algorithm 1 from the paper. "
+      ]
+    },
     {
       "cell_type": "code",
       "metadata": {
@@ -118,6 +171,16 @@
       "execution_count": 0,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "dmSvUmiP94zT",
+        "colab_type": "text"
+      },
+      "source": [
+        "The combined loss for training from the paper."
+      ]
+    },
     {
       "cell_type": "code",
       "metadata": {
@@ -136,7 +199,176 @@
         "    def forward(X, U, p, q):\n",
         "        X_ = np.concatenate([X, U], axis=1)\n",
         "        y_ = np.concatenate([p, q], axis=1)\n",
-        "        return self.xent(preds[:len(p)], p) + self.mse(preds[len(p):], q)"
+        "        return self.xent(preds[:len(p)], p) + \\\n",
+        "                                    self.lambda_u * self.mse(preds[len(p):], q)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "CCqJtpJ--Cik",
+        "colab_type": "text"
+      },
+      "source": [
+        "Now that the MixMatch pieces are done, a few things remain: defining the WideResnet28 model, writing the data and training code, and writing the testing code. \n",
+        "Let's start with the model. The code below is mostly copied from the wide-resnet.pytorch repo by meliketoy. "
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "GIkBy3T15P7l",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "def conv3x3(in_planes, out_planes, stride=1):\n",
+        "    return torch.nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,\n",
+        "                           padding=1, bias=True)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Fud8CmEtCaSN",
+        "colab_type": "text"
+      },
+      "source": [
+        "We will need the init function below later, before training."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "FZBBH5EYCZhi",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "\n",
+        "def conv_init(m):\n",
+        "    classname = m.__class__.__name__\n",
+        "    if classname.find('Conv') != -1:\n",
+        "        torch.nn.init.xavier_uniform_(m.weight, gain=np.sqrt(2))\n",
+        "        torch.nn.init.constant_(m.bias, 0)\n",
+        "    elif classname.find('BatchNorm') != -1:\n",
+        "        torch.nn.init.constant_(m.weight, 1)\n",
+        "        torch.nn.init.constant_(m.bias, 0)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "V_gOfar1CeUx",
+        "colab_type": "text"
+      },
+      "source": [
+        "The basic block for the WideResnet."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "QZ068XQR6LZP",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "class WideBasic(torch.nn.Module):\n",
+        "    def __init__(self, in_planes, planes, dropout_rate, stride=1):\n",
+        "        super(WideBasic, self).__init__()\n",
+        "        self.bn1 = torch.nn.BatchNorm2d(in_planes)\n",
+        "        self.bn2 = torch.nn.BatchNorm2d(planes)\n",
+        "        self.conv1 = torch.nn.Conv2d(in_planes, planes, kernel_size=3,\n",
+        "                                     padding=1, bias=True)\n",
+        "        self.conv2 = torch.nn.Conv2d(planes, planes, kernel_size=3,\n",
+        "                                     stride=stride, padding=1, bias=True)\n",
+        "        self.dropout = torch.nn.Dropout(p=dropout_rate)\n",
+        "        self.shortcut = torch.nn.Sequential()\n",
+        "        if stride != 1 or in_planes != planes:\n",
+        "            self.shortcut = torch.nn.Sequential(\n",
+        "                torch.nn.Conv2d(in_planes, planes, kernel_size=1,\n",
+        "                                stride=stride, bias=True)\n",
+        "            )\n",
+        "\n",
+        "    def forward(self, x):\n",
+        "        out = self.dropout(self.conv1(torch.nn.functional.relu(self.bn1(x))))\n",
+        "        out = self.conv2(torch.nn.functional.relu(self.bn2(out)))\n",
+        "        return out + self.shortcut(x)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "wdew7GNoChmh",
+        "colab_type": "text"
+      },
+      "source": [
+        "And the full model, with default parameters set for CIFAR10."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "YvE9l4W27jTx",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "class WideResNet(torch.nn.Module):\n",
+        "    def __init__(self, depth=28, widen_factor=10,\n",
+        "                 dropout_rate=0.3, num_classes=10):\n",
+        "        super(WideResNet, self).__init__()\n",
+        "        self.in_planes = 16\n",
+        "        n = (depth - 4) // 6\n",
+        "        k = widen_factor\n",
+        "        nStages = [16, 16*k, 32*k, 64*k]\n",
+        "        self.conv1 = conv3x3(3, nStages[0])\n",
+        "        self.layer1 = self.wide_layer(WideBasic, nStages[1], n, dropout_rate,\n",
+        "                                      stride=1)\n",
+        "        self.layer2 = self.wide_layer(WideBasic, nStages[2], n, dropout_rate,\n",
+        "                                      stride=2)\n",
+        "        self.layer3 = self.wide_layer(WideBasic, nStages[3], n, dropout_rate,\n",
+        "                                      stride=2)\n",
+        "        self.bn1 = torch.nn.BatchNorm2d(nStages[3], momentum=0.9)\n",
+        "        self.linear = torch.nn.Linear(nStages[3], num_classes)\n",
+        "    \n",
+        "    def wide_layer(self, block, planes, num_blocks, dropout_rate, stride):\n",
+        "        strides = [stride] + [1] * (num_blocks - 1)\n",
+        "        layers = []\n",
+        "        for stride in strides:\n",
+        "            layers.append(block(self.in_planes, planes, dropout_rate, stride))\n",
+        "            self.in_planes = planes\n",
+        "        return torch.nn.Sequential(*layers)\n",
+        "    \n",
+        "    def forward(self, x):\n",
+        "        out = self.conv1(x)\n",
+        "        out = self.layer3(self.layer2(self.layer1(out)))\n",
+        "        out = torch.nn.functional.relu(self.bn1(out))\n",
+        "        out = torch.nn.functional.avg_pool2d(out, 8)\n",
+        "        out = out.view(out.size(0), -1)\n",
+        "        return self.linear(out)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "EjCTPM8wB-dR",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        ""
       ],
       "execution_count": 0,
       "outputs": []