Feature add cosine proximity loss #30
base: master
Conversation
y_true = l2_normalize(y, axis=-1)
y_pred = l2_normalize(y_pred, axis=-1)
return 1. - np.sum(y_true * y_pred, axis=-1)
I saw two different implementations. The first is
return -np.sum(y_true * y_pred, axis=-1)
and the second is
return 1. - np.sum(y_true * y_pred, axis=-1)
Which one should we choose for our implementation? They are equivalent in nature.
I like the first, since it ranges between -1 and 1, just like the cosine similarity itself.
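For concreteness, here is a minimal sketch of both variants in plain NumPy (the helper names l2_normalize, cosine_proximity, and cosine_distance_loss are illustrative, not the PR's final API). Since the two differ only by an additive constant, their gradients are identical:

import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale x to unit length along `axis`; eps guards against division by zero.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def cosine_proximity(y_true, y_pred):
    # First variant: negative cosine similarity, ranging over [-1, 1].
    y_true = l2_normalize(y_true, axis=-1)
    y_pred = l2_normalize(y_pred, axis=-1)
    return -np.sum(y_true * y_pred, axis=-1)

def cosine_distance_loss(y_true, y_pred):
    # Second variant: 1 - cosine similarity, ranging over [0, 2]. It differs
    # from the first variant only by a constant, so the gradients match.
    return 1. + cosine_proximity(y_true, y_pred)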
return 1. - np.sum(y_true * y_pred, axis=-1)

@staticmethod
def grad(y, y_pred, z, act_fn):
Is there a good way to compute the gradient of the cosine loss?
Doing this from my phone, so please check for errors:
If f(x, y) = (x @ y) / (norm(x) * norm(y)), then
df/dy = x / (norm(x) * norm(y)) - (f(x, y) * y) / (norm(y) ** 2)
where norm is the 2-norm.
Note that since the cosine loss is the negative cosine similarity, you should multiply df/dy by -1.
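Transcribed directly into NumPy, that derivative might look like the following sketch (grad_cosine_loss is a hypothetical name, and the sign flip from the note above is applied):

import numpy as np

def grad_cosine_loss(x, y):
    # Gradient of the cosine loss (negative cosine similarity) w.r.t. y,
    # transcribing the derivation above:
    #   f(x, y) = (x @ y) / (norm(x) * norm(y))
    #   df/dy   = x / (norm(x) * norm(y)) - f(x, y) * y / norm(y) ** 2
    # and flipping the sign because the loss is -f(x, y).
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    f = (x @ y) / (nx * ny)
    return -(x / (nx * ny) - f * y / ny ** 2)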
vector_length_max = 100

for j in range(2, vector_length_max):
    x = np.random.uniform(0., 1., [j, ])
To generate the random test vectors, I set the bounds from 0. to 1.
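Fleshed out, that loop could compare against scipy.spatial.distance.cosine, which computes 1 minus the cosine similarity. This is only a sketch, assuming the negative-similarity variant discussed above:

import numpy as np
from scipy.spatial.distance import cosine as scipy_cosine

def cosine_proximity(y_true, y_pred):
    # Negative cosine similarity for 1-D vectors (first variant above).
    return -(y_true @ y_pred) / (np.linalg.norm(y_true) * np.linalg.norm(y_pred))

vector_length_max = 100

for j in range(2, vector_length_max):
    x = np.random.uniform(0., 1., [j, ])
    y = np.random.uniform(0., 1., [j, ])
    # scipy_cosine(x, y) == 1 - cos_sim(x, y), so the negative-similarity
    # variant should equal scipy_cosine(x, y) - 1.
    assert np.isclose(cosine_proximity(x, y), scipy_cosine(x, y) - 1.)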
@WuZhuoran - Just ping me when this is finished and I'll take a look.
@ddbourgin Thank you. I think I need some help with the gradient of the cosine loss, and with how to test the grad function.
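One common way to test a gradient implementation is a central finite-difference check against the analytic formula from the derivation above. A sketch, with illustrative names, and reusing the hypothetical grad_cosine_loss from earlier in the thread:

import numpy as np

def grad_cosine_loss(x, y):
    # Analytic gradient from the derivation above (loss = -cosine similarity).
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    f = (x @ y) / (nx * ny)
    return -(x / (nx * ny) - f * y / ny ** 2)

def numerical_grad(fn, y, eps=1e-6):
    # Central-difference approximation of d fn(y) / dy, one coordinate at a time.
    g = np.zeros_like(y)
    for i in range(y.size):
        e = np.zeros_like(y)
        e[i] = eps
        g[i] = (fn(y + e) - fn(y - e)) / (2. * eps)
    return g

x = np.random.uniform(0., 1., 5)
y = np.random.uniform(0., 1., 5)
loss = lambda y_: -(x @ y_) / (np.linalg.norm(x) * np.linalg.norm(y_))
assert np.allclose(grad_cosine_loss(x, y), numerical_grad(loss, y), atol=1e-5)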
General comment: It looks like right now the documentation is copied directly from …
I will update the documentation in the next commits. Thanks.
This pull request closes #29.
- What I did
Add the Cosine Proximity loss function.
- How I did it
Refer to the class comments.
- How to verify it
This pull request adds a new feature to Numpy-ml. Ask @ddbourgin to take a look.