
Character-level BiLSTM take the first and last hidden state #426

Open
allanj opened this issue Oct 20, 2017 · 2 comments

Comments

allanj commented Oct 20, 2017

I want to implement character embeddings with a BiLSTM, as in the paper Neural Architectures for Named Entity Recognition (Guillaume Lample et al.).
Specifically, I feed the characters of a word into a BiLSTM.
Then I concatenate the last hidden state of the forward LSTM with the first hidden state of the backward LSTM; that concatenation is the representation I want.

However, I found this is hard to do with variable-length words.

Say a word contains 3 characters (1, 2, and 3) and the maximum word length in the batch is 5.
The input to the BiLSTM will then be the embeddings of the following tokens:

1, 2, 3, 0, 0

But if I take the hidden state at the last timestep, I get the state at a padded position, since the last two positions are zero-padding. I can't just tell the model to take the third position, because the true last position differs from word to word.
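To make the setup concrete, here is a minimal sketch of the padded batch described above. `pad_batch` and all sizes are hypothetical, not from any particular framework:

```python
# Build a right-padded batch from variable-length lists of character
# indices, keeping the true lengths alongside the padded tensor.
def pad_batch(words, pad_idx=0):
    """Right-pad each list of character indices to the batch max length."""
    max_len = max(len(w) for w in words)
    lengths = [len(w) for w in words]
    padded = [w + [pad_idx] * (max_len - len(w)) for w in words]
    return padded, lengths

words = [[1, 2, 3], [4, 5, 6, 7, 8]]  # char indices for two words
padded, lengths = pad_batch(words)
# padded  -> [[1, 2, 3, 0, 0], [4, 5, 6, 7, 8]]
# lengths -> [3, 5]
```

Without the `lengths` list, position `max_len - 1` of the first word holds a state computed on padding, which is exactly the problem described.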


tastyminerals commented Oct 22, 2017

This should not be the input to your BiLSTM; you first use a LookupTable to encode character indices into character vectors. Read the section "4.1 Character-based models of words" of your paper again.
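A LookupTable is, in effect, row indexing into an embedding matrix: each character index selects its learned vector. A minimal NumPy sketch of that idea (the vocabulary size and embedding dimension here are hypothetical):

```python
import numpy as np

vocab_size, emb_dim = 50, 4
embedding = np.random.randn(vocab_size, emb_dim)  # one row per character

char_indices = np.array([1, 2, 3, 0, 0])  # the padded word from above
char_vectors = embedding[char_indices]    # shape (5, 4): input to the BiLSTM
```

The BiLSTM then consumes `char_vectors`, not the raw indices.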


allanj commented Oct 22, 2017

Yes, sorry, I left out the embedding layer.
The network is now (embedding layer + BiLSTM), with the same padded input.

But the problem still exists.
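One common workaround (a sketch of the general technique, not something prescribed by this thread): keep each word's true length, run the BiLSTM over the padded batch, then gather the forward output at position `length - 1` and the backward output at position 0 for every word. The shapes and random "outputs" below are hypothetical stand-ins for real LSTM activations:

```python
import numpy as np

batch, max_len, hidden = 2, 5, 3
lengths = np.array([3, 5])                     # true word lengths

fwd = np.random.randn(batch, max_len, hidden)  # forward LSTM outputs
bwd = np.random.randn(batch, max_len, hidden)  # backward LSTM outputs

rows = np.arange(batch)
last_fwd = fwd[rows, lengths - 1]   # forward state at the last real char
first_bwd = bwd[rows, 0]            # backward state after reading the word
char_repr = np.concatenate([last_fwd, first_bwd], axis=1)  # (batch, 2*hidden)
```

Note that with plain padding the backward LSTM still reads the padded zeros before the real characters, so frameworks typically combine this gathering with masking or packed sequences (e.g. `pack_padded_sequence` in PyTorch) so padding never enters the recurrence.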
