CRF head [WIP] #393
Open
singularperturbation wants to merge 8 commits into mratsim:master from singularperturbation:crf_head
Start adding a new CRF layer for sequence tagging / prediction; Viterbi decoding and NLL as the loss value will follow.
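For reference, Viterbi decoding over CRF scores can be sketched as below. This is a minimal pure-Python illustration, not the code in this PR: the function name `viterbi_decode`, the layout `emissions[t][k]` / `transitions[i][j]` (score of moving from tag `i` to tag `j`), and the `bos` / `eos` index arguments are all assumptions made for the example.

```python
def viterbi_decode(emissions, transitions, bos, eos):
    """Return (best_path, best_score) for one sequence.

    emissions[t][k]   : score of tag k at time step t (assumed layout)
    transitions[i][j] : score of moving from tag i to tag j (assumed layout)
    bos, eos          : indices of the begin/end tags in `transitions`
    """
    num_tags = len(transitions)
    # Best score of any path ending in tag j at step 0.
    score = [transitions[bos][j] + emissions[0][j] for j in range(num_tags)]
    backptr = []
    for t in range(1, len(emissions)):
        step_ptr, step_score = [], []
        for j in range(num_tags):
            # Best previous tag for landing on tag j at step t.
            best_i = max(range(num_tags),
                         key=lambda i: score[i] + transitions[i][j])
            step_ptr.append(best_i)
            step_score.append(score[best_i] + transitions[best_i][j]
                              + emissions[t][j])
        backptr.append(step_ptr)
        score = step_score
    # Close every path with its transition into EOS.
    final = [score[j] + transitions[j][eos] for j in range(num_tags)]
    last = max(range(num_tags), key=lambda j: final[j])
    # Follow the back-pointers from the last step to the first.
    path = [last]
    for step_ptr in reversed(backptr):
        path.append(step_ptr[path[-1]])
    path.reverse()
    return path, final[last]
```

The decoder only needs max and argmax per step, so unlike the NLL it does not require a log-sum-exp.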
Creating the transitions matrix properly takes some logic, so add an initializer function (range + Xavier uniform) that disallows Any -> BOS and EOS -> Any transitions.
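A rough sketch of such an initializer, in plain Python rather than the Nim code in this PR. The name `init_transitions`, the `[from][to]` layout, and the `DISALLOWED` constant of -10000 are assumptions for illustration; the PR does not say which value it uses to block a transition.

```python
import math
import random

# Assumed "effectively minus infinity" score for forbidden transitions.
DISALLOWED = -10000.0


def init_transitions(num_tags, bos, eos, seed=None):
    """Return a num_tags x num_tags matrix; entry [i][j] scores i -> j."""
    rng = random.Random(seed)
    # Xavier/Glorot uniform bound with fan_in = fan_out = num_tags.
    bound = math.sqrt(6.0 / (num_tags + num_tags))
    trans = [[rng.uniform(-bound, bound) for _ in range(num_tags)]
             for _ in range(num_tags)]
    for k in range(num_tags):
        trans[k][bos] = DISALLOWED  # Any -> BOS
        trans[eos][k] = DISALLOWED  # EOS -> Any
    return trans
```

A large finite negative value is used here instead of `-inf` so that later additions and subtractions never produce `nan`.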
Following other implementations, the forward pass will compute the path scores and the log partition function, which together give the NLL.
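As a sketch of that computation for a single sequence: the NLL is the log partition function minus the score of the gold path. The function names and the `emissions[t][k]` / `transitions[i][j]` layout below are assumptions for this example, not the signatures used in the PR.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(emissions, transitions, tags, bos, eos):
    """Unnormalized log-score of one tag path (emission + transition terms)."""
    score = transitions[bos][tags[0]] + emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score + transitions[tags[-1]][eos]


def log_partition(emissions, transitions, bos, eos):
    """Forward algorithm: log of the summed exp-scores over all tag paths."""
    num_tags = len(transitions)
    alpha = [transitions[bos][j] + emissions[0][j] for j in range(num_tags)]
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][j] for i in range(num_tags)])
            + emissions[t][j]
            for j in range(num_tags)
        ]
    return logsumexp([alpha[j] + transitions[j][eos] for j in range(num_tags)])


def nll(emissions, transitions, tags, bos, eos):
    """Negative log-likelihood of `tags`: log Z minus the gold path score."""
    return (log_partition(emissions, transitions, bos, eos)
            - path_score(emissions, transitions, tags, bos, eos))
```

Because `exp(-nll)` is the probability of a path, these values sum to 1 over all possible tag paths, which is a convenient sanity check on small inputs.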
Uses the array passed in and only reshapes it if needed (when the new Tensor is larger than the old one). I think this is needed, or at least should help, when calling index_select repeatedly with subsets of the same size. The example here is selecting batch_size entries for each time step of the CRF emissions.
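The buffer-reuse idea can be illustrated with plain Python lists. This is only an analogy under assumed names (`index_select_into` is hypothetical and is not the Arraymancer API): the caller passes in an output buffer, and it grows only when the selection does not fit.

```python
def index_select_into(out, rows, indices):
    """Copy rows[i] for each i in indices into `out`, growing it only if needed.

    Returns the number of valid leading slots in `out`.
    """
    needed = len(indices)
    if len(out) < needed:
        # Grow only when the requested selection does not fit the old buffer.
        out.extend([None] * (needed - len(out)))
    for slot, i in enumerate(indices):
        out[slot] = rows[i]
    return needed
```

When every call selects the same number of rows, as with batch_size entries per time step, the buffer is allocated once and then reused.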
Ran 'nimpretty' to clean up formatting and long lines, and passed more information to the nnp_crf functions.
Implementation of the forward pass is underway, starting with the scores (unnormalized log-probability with emission and transition components).
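A minimal batched version of that score, to show where the emission and transition terms enter. The padded-batch layout, the `lengths` argument, and the function name are assumptions for illustration and may not match how the PR handles batches.

```python
def batch_path_scores(emissions, transitions, tags, lengths, bos, eos):
    """Unnormalized log-score of each gold path in a padded batch.

    emissions[b][t][k] : score of tag k at step t of sequence b
    tags[b][t]         : gold tag at step t of sequence b (padded at the end)
    lengths[b]         : number of real steps in sequence b, assumed >= 1
    """
    scores = []
    for b, length in enumerate(lengths):
        seq_tags = tags[b][:length]  # ignore padding past the true length
        score = transitions[bos][seq_tags[0]] + emissions[b][0][seq_tags[0]]
        # Each transition pairs step t - 1 with step t, so the loop starts
        # at 1 and stops at the sequence's own length, not the padded one.
        for t in range(1, length):
            score += transitions[seq_tags[t - 1]][seq_tags[t]]
            score += emissions[b][t][seq_tags[t]]
        score += transitions[seq_tags[-1]][eos]
        scores.append(score)
    return scores
```

Each result is a single scalar per sequence, so the output has one entry per batch element rather than a matrix.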
Fix some bugs in the CRF unnormalized score calculation (mostly making sure index_select does not return a matrix when it shouldn't). Also fix an out-of-bounds bug in the loop over time steps.
Don't hesitate to ask if you are stuck on a specific thing.
Thanks, appreciate it! I'm going to try and pick this up again tonight/tomorrow and will probably have some better questions once I'm done with the forward pass.
Haven't forgotten about this, just haven't had as much time to work on this as I thought I would.
No problem, I don't have much time myself.
I'm going to need some help refining this (I'm especially unsure how the backward pass will work), but I think I can add a CRF head for sequence prediction that should work with, for example, the GRU layer.
I've been mostly following your guide #331.
TODO: