
Would you mind explaining an issue about gradient descent in lecture 1b #10

Open
theanhle opened this issue Mar 2, 2017 · 1 comment



theanhle commented Mar 2, 2017

  • I've read your slides for lecture 1b (Deep Neural Networks are our Friends). On the slide "Gradients are our friends", which explains arg min C(w, b), you have w0, b0 = 2, 2 and C(w0, b0) = 68. That is correct. But after that, I don't understand why the results of the expression sum(-2(ŷ - y)*x) are 8, -40, -72. I think -8, 40, 72 would be correct.
  • By the way, I implemented this simple network, but when I trained it for 100 epochs the cost function did not converge. Here is my code:
import numpy as np 
x=np.array([1,5,6])
y=np.array([0,16,20])
w = 2
b = 2
epoches = 101
learning_rate = 0.05
for epoch in range(epoches):
    out = x*w + b
    cost = np.sum((y - out)**2) 
    if(epoch % 10 ==0):
        print('Epoch:', epoch, ', cost:', cost)
    dcdw = np.sum(-2*(out - y)*x)
    dcdb = np.sum(-2*(out - y))
    w = w - learning_rate*dcdw
    b = b - learning_rate*dcdb

Here is the result:
Epoch: 0 , cost: 68
Epoch: 10 , cost: 1.1268304493e+19
Epoch: 20 , cost: 3.00027905999e+36
Epoch: 30 , cost: 7.98849058743e+53
Epoch: 40 , cost: 2.12700154184e+71
Epoch: 50 , cost: 5.66331713039e+88
Epoch: 60 , cost: 1.50790492101e+106
Epoch: 70 , cost: 4.01492128811e+123
Epoch: 80 , cost: 1.06900592505e+141
Epoch: 90 , cost: 2.84631649237e+158
Epoch: 100 , cost: 7.57855254577e+175

Could you please explain? Thank you in advance!


mleue commented Mar 23, 2017

Hey, two issues here.

First: your gradient calculation is off. When you define the cost as (y - out)**2, the derivative with respect to w is -2*(y - out)*x, not -2*(out - y)*x, so it looks like you just flipped the sign there. The same issue applies to your gradient with respect to b.
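
As a quick sanity check (this snippet is not from the slides, it just evaluates the corrected formula with the x, y, w = 2, b = 2 values from the question):

import numpy as np

x = np.array([1, 5, 6])
y = np.array([0, 16, 20])
out = 2 * x + 2                      # predictions with w = 2, b = 2
print(np.sum((y - out) ** 2))        # 68, matching C(w0, b0) in the question
print(-2 * (y - out) * x)            # [  8 -40 -72], element-wise dC/dw terms
print(np.sum(-2 * (y - out) * x))    # -104, the summed gradient w.r.t. w

So the 8, -40, -72 quoted from the slide are exactly the element-wise terms of -2*(y - out)*x under this sign convention.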

Second: a diverging cost is usually a sign that the learning rate is too high. Try something lower; reduce it in steps of a factor of 10.
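
For completeness, here is a minimal sketch of the loop from the question with both fixes applied; the 0.005 learning rate (0.05 divided by 10) is just one value following the advice above, not something prescribed by the lecture:

import numpy as np

x = np.array([1, 5, 6])
y = np.array([0, 16, 20])
w, b = 2.0, 2.0
epochs = 101
learning_rate = 0.005  # 0.05 / 10, per the advice above

for epoch in range(epochs):
    out = x * w + b                      # forward pass: predictions
    cost = np.sum((y - out) ** 2)        # C(w, b)
    if epoch % 10 == 0:
        print('Epoch:', epoch, ', cost:', cost)
    dcdw = np.sum(-2 * (y - out) * x)    # sign fixed: derivative of (y - out)**2 w.r.t. w
    dcdb = np.sum(-2 * (y - out))        # and w.r.t. b
    w -= learning_rate * dcdw
    b -= learning_rate * dcdb

With these two changes the cost should decrease from 68 instead of blowing up.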
