Skip to content

Latest commit

 

History

History
31 lines (21 loc) · 3.82 KB

ai-ml-math.md

File metadata and controls

31 lines (21 loc) · 3.82 KB

Index

dark

Every single thing that a machine learning algorithm does maps one set of numbers to another set of numbers.

Linear Regression

dark

Given some input vector $\large{\color{Purple}\vec{\mathbf{v}}}$, if you want to connect it to some output vector $\large{\color{Purple}\vec{\mathbf{y}}}$ via a linear model, for some reason you think that the connection between input and output, the regression connection is actually through a linearity connection. In that case let us say our $\large{\color{Purple}h(x)}$ with parameters $\large{\color{Purple}w}$ is assumed to be a linear model, in that case all you do is you take a hypothesis function $\large{\color{Purple}h(x)}$, you say that my guess is $\large{\color{Purple}\hat{\mathbf{y}}}$, you will have already got some ground truth $\large{\color{Purple}\hat{\mathbf{y}}}$ and using these two you calculate the cost function $\large{\color{Purple}\mathrm{J}}$ and you feed it back so as to improve w ok by looking at $\huge {\color{Purple} \frac{\partial J }{\partial w}}$ .

Linear regression

  • In linear regression you have $\large{\color{Purple} \vec{x}}$, you multiply by $\large{\color{Purple} w}$ and run it through a summation $\large{\color{Purple} \sum }$ and you get $\large{\color{Purple} \hat{y}}$, this is linear regression.

Logistic Regression

  • You take $\large{\color{Purple} \vec{x}}$, again the same parameters $\large{\color{Purple} w}$, run it through a summation and we add a one small change, we add a nonlinear function. This is called a non-linear activation function and this gives our $\large{\color{Purple} \hat{y}}$, this is called logistic regression for certain choices of activation functions.
  • An Activision function is on your linear combination you add nonlinearity over this ok, so we will typically denote the non-linear activation function by $\large{\color{Purple} g}$ so $\large{\color{Purple} g()}$ stands for some non-linear function.

Neural net flow

  • More than a layer is called Deep network.
  • So you take your $\large{\color{Purple} \vec{x}}$, run it through a linear combination with some weight $\large{\color{Purple} w}$, run it through a nonlinear function $\large{\color{Purple} g()}$. Then run it through another linear combination with some other weights let us call them $\large{\color{Purple} w_1}$, some other weights $\large{\color{Purple} w_2}$, and another non-linear combination $\large{\color{Purple} g()}$ and so on and so forth and finally you get your output prediction $\large{\color{Purple} \hat{y}}$.

Important notes

  • how do we characterize the output $\large{\color{Purple}y,\hat{y}}$.
  • what is the feed forward model. $\large{\color{Purple} \textit{ Which non-linear function as }\textbf{g()}?}$
  • The 3rd thing is what is the loss function $\large{\color{Purple}J}$
  • How do we calculate $\large{\color{Purple} \frac{\partial J}{\partial w}}$ ? in other words this is the gradient problem
  • There is a 5th problem which will not be discussing very much which is how do we use $\large{\color{Purple} \frac{\partial J}{\partial w}}$ to find better w, Optimization problem.