updates notebook
Jeffrey Ling committed Dec 10, 2015
1 parent 0cda29e commit 9f0aca0
Showing 2 changed files with 95 additions and 93 deletions.
188 changes: 95 additions & 93 deletions TempConv.ipynb
@@ -13,7 +13,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 1,
"metadata": {
"collapsed": false
},
@@ -163,7 +163,7 @@
"collapsed": true
},
"source": [
"The output of the above gives us our feature map $[\\widehat{\\mathbf{c_1}}, \\widehat{\\mathbf{c_2}}, \\ldots, \\widehat{\\mathbf{c_{d'}}}]$. Finally we add a logistic regression layer (with dropout) for predicting the sentiment from this vector of features."
"The output of the above gives us our feature map $[\\widehat{\\mathbf{c_1}}, \\widehat{\\mathbf{c_2}}, \\ldots, \\widehat{\\mathbf{c_{d'}}}]$. Finally we add a logistic regression layer for predicting the sentiment from this vector of features."
]
},
{
@@ -176,7 +176,6 @@
"source": [
"logistic = nn.Sequential()\n",
"\n",
"logistic:add(nn.Dropout(0.5))\n",
"logistic:add(nn.Linear(nd, nY))\n",
"logistic:add(nn.LogSoftMax())\n",
"\n",
@@ -200,10 +199,10 @@
{
"data": {
"text/plain": [
"-1.2024 -0.3573\n",
"-0.6194 -0.7727\n",
"-2.2890 -0.1069\n",
"-2.4340 -0.0918\n",
"-0.6319 -0.7584\n",
"-0.5740 -0.8285\n",
"-0.6088 -0.7853\n",
"-0.7404 -0.6481\n",
"[torch.DoubleTensor of size 4x2]\n",
"\n"
]
@@ -221,8 +220,31 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As expected, we get (log) prediction probabilities for 2 classes for each input.\n",
"\n",
"As expected, we get (log) prediction probabilities for 2 classes for each input."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Include a negative-log-likelihood criterion:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"criterion = nn.ClassNLLCriterion()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also run these modules on a GPU. Specifically, we use the `cudnn` package, which provides GPU-optimized versions of some of the above modules. The convolution step requires modification: `cudnn` has no built-in `TemporalConvolution` module, so we adapt `SpatialConvolution` by reshaping our feature map matrix.\n",
"\n",
"Here's the full implementation on `cudnn` (using batch mode):"
@@ -245,133 +267,113 @@
"\n",
"nd = 10\n",
"h = 3\n",
"S = 10\n",
"conv = nn.Sequential()\n",
"conv:add(nn.Reshape(1, nV, d, false))\n",
"conv:add(nn.Reshape(1, S, d, false))\n",
"conv:add(cudnn.SpatialConvolution(1, nd, d, h))\n",
"conv:add(nn.Reshape(nd, nV-h+1, false))\n",
"conv:add(nn.Reshape(nd, S-h+1, false))\n",
"conv:add(cudnn.ReLU())\n",
"conv:add(nn.Max(3))\n",
"\n",
"cudnn_model:add(conv)\n",
"\n",
"logistic = nn.Sequential()\n",
"\n",
"dropout_p = 0.5\n",
"logistic:add(nn.Dropout(0.5))\n",
"logistic:add(nn.Linear(nd, nY))\n",
"logistic:add(cudnn.LogSoftMax())\n",
"\n",
"cudnn_model:add(logistic)\n",
"\n",
"criterion = nn.ClassNLLCriterion()\n",
"\n",
"-- Move to GPU\n",
"cudnn_model:cuda()"
"cudnn_model:cuda()\n",
"criterion:cuda()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Include a negative-log-likelihood criterion:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"criterion = nn.ClassNLLCriterion()"
"## Training"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Training"
"We train with `adadelta`. In each epoch, we create a closure that returns the loss and its gradient with respect to the parameters."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" 1 1 1 2 3 4 7 1 1 1\n",
" 1 1 1 5 8 4 7 1 1 1\n",
" 1 1 1 9 8 4 7 1 1 1\n",
" 1 1 1 6 8 4 7 1 1 1\n",
"[torch.DoubleTensor of size 4x10]\n",
"\n",
" 1\n",
" 2\n",
" 2\n",
" 2\n",
"[torch.DoubleTensor of size 4]\n",
"\n"
"Epoch:\t1\t1.0637046591443\t\n",
"Epoch:\t2\t0.92574699271742\t\n",
"Epoch:\t3\t0.80757029787584\t\n"
]
},
"execution_count": 16,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
},
{
"data": {
"text/plain": [
"Epoch:\t4\t0.70017095481893\t\n",
"Epoch:\t5\t0.62009300393375\t\n",
"Epoch:\t6\t0.53988474604789\t\n",
"Epoch:\t7\t0.48039989542535\t\n"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"Epoch:\t8\t0.42860197930536\t\n",
"Epoch:\t9\t0.38423555419475\t\n",
"Epoch:\t10\t0.34843210201123\t\n",
"Epoch:\t11\t0.31601698906774\t\n",
"Epoch:\t12\t0.29034451988978\t\n"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"Epoch:\t13\t0.26565687644075\t\n",
"Epoch:\t14\t0.24537762059148\t\n",
"Epoch:\t15\t0.22573977925808\t\n",
"Epoch:\t16\t0.20809219151101\t\n",
"Epoch:\t17\t0.19274937869357\t\n"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"Epoch:\t1\t1.2071026258965\t\n",
"Epoch:\t2\t0.55329548576161\t\n",
"Epoch:\t3\t1.0799819021057\t\n",
"Epoch:\t4\t0.60753844090924\t\n",
"Epoch:\t5\t0.5422639899197\t\n",
"Epoch:\t6\t0.85666510359066\t\n",
"Epoch:\t7\t0.46099172270831\t\n",
"Epoch:\t8\t0.44800228949465\t\n",
"Epoch:\t9\t0.74064167750147\t\n",
"Epoch:\t10\t0.41474709839995\t\n",
"Epoch:\t11\t0.68156844923208\t\n",
"Epoch:\t12\t0.33358810087793\t\n",
"Epoch:\t13\t0.40544432852648\t\n",
"Epoch:\t14\t0.63637760405637\t\n",
"Epoch:\t15\t0.37222033442753\t\n",
"Epoch:\t16\t0.30155980691742\t\n",
"Epoch:\t17\t0.88969589482503\t\n",
"Epoch:\t18\t0.30856292827357\t\n",
"Epoch:\t19\t0.44125077919892\t\n",
"Epoch:\t20\t0.38404520878646\t\n"
"Epoch:\t18\t0.17790250866323\t\n",
"Epoch:\t19\t0.16478886826202\t\n",
"Epoch:\t20\t0.15308249381824\t\n"
]
},
"execution_count": 20,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -407,13 +409,13 @@
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
"source": [
"Note that training error goes down after every epoch, as expected."
]
}
],
"metadata": {
Binary file added TempConv.pdf
Binary file not shown.
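
The shape bookkeeping behind the `cudnn` convolution block in this commit can be written out as below. This is a sketch, not the author's own derivation. It assumes each example enters `conv` as an $S \times d$ matrix (sequence length by embedding width), that the convolution uses stride 1 and no padding (the `SpatialConvolution` defaults), and it ignores any batch dimension. The symbols $S$, $d$, $h$, `nd` and `nY` are the notebook's own.

```latex
\begin{align*}
\text{input} &: S \times d \\
\texttt{Reshape(1, S, d)} &: 1 \times S \times d
  && \text{(one input plane, height } S\text{, width } d\text{)} \\
\texttt{SpatialConvolution(1, nd, d, h)} &: nd \times (S - h + 1) \times (d - d + 1)
  = nd \times (S - h + 1) \times 1 \\
\texttt{Reshape(nd, S-h+1)} &: nd \times (S - h + 1) \\
\texttt{ReLU} &: nd \times (S - h + 1)
  && \text{(elementwise)} \\
\text{max over the time axis} &: nd \\
\texttt{Linear(nd, nY)} + \texttt{LogSoftMax} &: nY
\end{align*}
% With the values set in the diff (S = 10, h = 3, nd = 10):
%   S - h + 1 = 8 time steps per filter, pooled to a length-10 feature vector.
```

Because the kernel width equals the embedding width $d$, the filter can only slide along the sequence axis, which is what makes the spatial convolution behave like a temporal one. With a leading batch dimension the time axis is dimension 3, which is consistent with the `nn.Max(3)` call in the diff.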
