How to use MaskZero with LSTM and nn.ClassNLLCriterion for variable length sequences #75
I am currently facing the same issue while implementing the char-rnn example using rnn.
Are you sure you need it? I adapted Karpathy's char-rnn to eat variable-length word sequences and it works fine without any padding or masking. (Sanity-checked with a Reber grammar of lengths 5 to 50.)
@kmnns Using
@jundeng86 The model seems OK. It masks the zeros correctly AFAIK. As for the criterion, MaskZero cannot be used to decorate a criterion. We would need a MaskZeroCriterion for that. Working on it.
@jundeng86 So for the criterion, use MaskZeroCriterion(criterion).
Thanks! You helped me out!
FYI,
@nicholas-leonard just a heads-up:
@jfsantos Yes please! Also, nice meeting you at NIPS!
@nicholas-leonard Are you sure nn.MaskZeroCriterion works when the mask is all zeros? (i.e., an input whose zeros fill the whole sequence)
Hello, I have built this augmented reality model with MaskZero.
@emergix You are only using MaskZero in a small part of the model, not for the whole model. That part of the model probably does not have any issue with having its output set to zeros.
I see. The case you want to address is when you have inputs of different lengths and want a consistent evaluation of the criterion, so you need to adjust the rho of the LSTM to the maximum possible input length. The criterion then has to compute something comparable across inputs of different lengths, e.g. 1/(number of non-zero inputs) * sum of squared distance(input, target) over the non-zero inputs, and you should make sure that an all-zero input can never be a meaningful input.
Am I right?
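The normalization described above can be sketched in plain Python (an illustration only, not the rnn/Torch API; `masked_mse` is a hypothetical helper):

```python
def masked_mse(inputs, targets):
    """Mean squared error over non-padded steps only.

    inputs/targets: lists of floats; a 0.0 in inputs marks padding,
    so padded steps contribute neither to the sum nor to the count.
    """
    losses = [(x - t) ** 2 for x, t in zip(inputs, targets) if x != 0.0]
    return sum(losses) / len(losses) if losses else 0.0
```

Because the divisor is the number of non-zero steps, the resulting loss is comparable between sequences of different lengths.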
@nicholas-leonard
@shuzi
@nicholas-leonard @jnhwkim I see, right-aligned with zero padding on the left. Thanks a lot!
@shuzi you're welcome.
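The right-aligned convention mentioned above can be sketched in plain Python (an illustration only, not the Torch API; `left_pad` is a hypothetical helper):

```python
def left_pad(seq, max_len, pad=0):
    """Right-align a sequence by prepending pad values (zeros) on the left."""
    return [pad] * (max_len - len(seq)) + list(seq)
```

With this convention every sequence ends at the same timestep, so the final LSTM state always corresponds to real data rather than padding.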
Can anyone answer this question? http://stackoverflow.com/questions/38539011/batch-processing-variable-length-sequences-using-element-research-rnn-for-torch
I changed the criterion to MaskZeroCriterion as suggested, then tried to run the example code given by @jundeng86:
require 'rnn'
require 'optim'

inSize = 20
batchSize = 2
hiddenSize = 10
seqLengthMax = 11
numTargetClasses = 5
numSeq = 30

x, y1 = {}, {}
for i = 1, numSeq do
   local seqLength = torch.random(1, seqLengthMax)
   local temp = torch.zeros(seqLengthMax, inSize)
   local targets = {}
   if seqLength == seqLengthMax then
      targets = (torch.rand(seqLength) * numTargetClasses):ceil()
   else
      targets = torch.cat(torch.zeros(seqLengthMax - seqLength), (torch.rand(seqLength) * numTargetClasses):ceil())
   end
   temp[{{seqLengthMax - seqLength + 1, seqLengthMax}}] = torch.randn(seqLength, inSize)
   table.insert(x, temp)
   table.insert(y1, targets)
end

model = nn.Sequencer(
   nn.Sequential()
      :add(nn.MaskZero(nn.FastLSTM(inSize, hiddenSize), 1))
      :add(nn.MaskZero(nn.Linear(hiddenSize, numTargetClasses), 1))
      :add(nn.MaskZero(nn.LogSoftMax(), 1))
)

--criterion = nn.SequencerCriterion(nn.MaskZero(nn.ClassNLLCriterion(), 1))
criterion = nn.SequencerCriterion(nn.MaskZeroCriterion(nn.ClassNLLCriterion(), 1))

output = model:forward(x)
print(output[1])
err = criterion:forward(output, y1)
print(err)
Hello, I am trying to use SequencerCriterion with MaskZeroCriterion on my variable-length batch input (e.g., sentences with different numbers of words). However, I found that if the predictions fed to the criterion are all zeros, the err is zero even though the targets are not all zeros. If so, the global minimum would be for the model to produce all zeros regardless of the input. I hope I am wrong somewhere. Thank you for your help in advance! For example:
err = 0. I get the same result if I pad zeros to the left of the targets.
This may be a bit counter-intuitive, but MaskZeroCriterion masks based on the zeros of the input (so in this case
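The input-based masking just described can be sketched in plain Python (an illustration of the behavior, not the rnn/Torch API; `masked_nll` is a hypothetical helper):

```python
def masked_nll(log_probs, targets):
    """NLL averaged over timesteps whose input row is not all zeros.

    log_probs: list of per-step log-probability rows; targets: 1-indexed
    class labels (as in Torch). A row of all zeros marks padding, so it
    is skipped even when its target is non-zero -- which is why all-zero
    predictions yield err = 0 regardless of the targets.
    """
    losses = []
    for lp, t in zip(log_probs, targets):
        if any(v != 0.0 for v in lp):      # mask comes from the input side
            losses.append(-lp[t - 1])      # 1-indexed target -> 0-indexed
    return sum(losses) / len(losses) if losses else 0.0
```

This matches the observation above: masking only on the input side means a model that emits exact zeros silently escapes the loss, so in practice the zeros should come from the padded data, not from the predictions.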
Hi guys,
I tried to use an LSTM to deal with variable-length sequences, but I failed to do so using the MaskZero function. Could you please help me out? Thanks a lot!
Here is a minimal code example of what I mean: