Validation Error with Speaker Change Detection - ValueError: zero-size array... #243

Closed
prlabu opened this issue Nov 22, 2019 · 6 comments

prlabu commented Nov 22, 2019

When running pyannote-change-detection validate, I'm getting an error that doesn't seem to have much precedent. Full-length stdout is below. Most forum posts on ValueError: zero-size array to reduction operation maximum which has no identity don't apply to our case.

pyannote-change-detection train worked without issue.

I first suspected that I simply hadn't updated the validate paths in database.yml but as far as I can tell they are correct.

I've tried with both Python 3.6 and 3.7, on the develop branch. I'm working on a Google Cloud Platform compute instance, starting from the c2-deeplearning-pytorch image. Using AMI.

Wouldn't be surprised at all if it's something straightforward, but I haven't been able to figure it out.

Feature extraction: 26it [00:38,  1.48s/it]
0iteration [00:00, ?iteration/s]multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/pyannote/audio/applications/change_detection.py", line 175, in validate_helper_func
    return metric(reference, hypothesis, uem=uem)
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/pyannote/metrics/base.py", line 116, in __call__
    components = self.compute_components(reference, hypothesis, **kwargs)
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/pyannote/metrics/segmentation.py", line 206, in compute_components
    return self._process(reference, hypothesis)
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/pyannote/metrics/segmentation.py", line 197, in _process
    detail[CVG_INTER] = np.sum(np.max(K, axis=1)).item()
  File "<__array_function__ internals>", line 6, in amax
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2621, in amax
    keepdims=keepdims, initial=initial, where=where)
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation maximum which has no identity
"""
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/bin/pyannote-change-detection", line 8, in <module>
    sys.exit(main())
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/pyannote/audio/applications/change_detection.py", line 329, in main
    start=start, end=end, every=every, in_order=in_order)
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/pyannote/audio/applications/base.py", line 347, in validate
    validation_data=validation_data)
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/site-packages/pyannote/audio/applications/change_detection.py", line 223, in validate_epoch
    _ = self.pool_.map(validate, validation_data)
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/plb1_rice_edu/.conda/envs/pyannote36/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
ValueError: zero-size array to reduction operation maximum which has no identity
Author

prlabu commented Nov 24, 2019

Update: I was curious if pyannote-speech-detection would encounter the same issue. It does not. Speech detection validation works just fine.

Member

hbredin commented Nov 25, 2019

You did find a bug in pyannote.metrics. This happens when the hypothesized segmentation is empty.

However, I do not understand why this happens in your case.
Why would hypothesis be empty here:

hypothesis = pipeline(current_file)

Can you try to narrow this down?
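
For readers hitting the same message, here is a minimal sketch of the failure and one possible guard. It assumes, based only on the traceback line `np.sum(np.max(K, axis=1)).item()`, that K has one column per hypothesized segment, so an empty hypothesis gives K zero columns. The helper name `coverage_intersection` is hypothetical and is not part of pyannote.metrics.

```python
import numpy as np


def coverage_intersection(K):
    """Sum of row-wise maxima of K, or 0.0 when K has no columns.

    Hypothetical helper: it mirrors the expression seen in the traceback
    and is not the actual pyannote.metrics code.
    """
    K = np.asarray(K, dtype=float)
    if K.shape[1] == 0:
        # Assumed case: no hypothesized segments, so nothing to reduce over.
        return 0.0
    return np.sum(np.max(K, axis=1)).item()


# Three reference segments, zero hypothesized segments (assumed layout).
K_empty = np.zeros((3, 0))

try:
    np.max(K_empty, axis=1)
    raised = False
except ValueError:
    # NumPy refuses to take a maximum over an empty axis.
    raised = True
```

Whether returning 0.0 is the right value for the metric is a design decision for the library; the sketch only shows that the reduction over an empty axis is what raises.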

Author

prlabu commented Nov 25, 2019

I found the issue. I'm not exactly sure how it happened, but the annotation file was cut short and therefore no longer matched the utterances from annotated. It was sneaky because the first half of the file looked normal, but the latter portion was missing. Thanks!
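
A mismatch like this one can be spotted by comparing how far the annotation reaches against the annotated extent of each file. The sketch below assumes the annotation is in RTTM format (file in field 2, onset in field 4, duration in field 5) and the annotated regions are in UEM format (file, channel, start, end). The function names and the `max_gap` threshold are hypothetical choices for illustration.

```python
from collections import defaultdict


def last_times(rttm_lines, uem_lines):
    """Map each file in the UEM to (last RTTM end time, last UEM end time)."""
    rttm_end = defaultdict(float)
    for line in rttm_lines:
        fields = line.split()
        if len(fields) < 5 or fields[0] != "SPEAKER":
            continue
        try:
            start, duration = float(fields[3]), float(fields[4])
        except ValueError:
            continue  # skip lines whose times cannot be parsed
        rttm_end[fields[1]] = max(rttm_end[fields[1]], start + duration)

    uem_end = defaultdict(float)
    for line in uem_lines:
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            end = float(fields[3])
        except ValueError:
            continue
        uem_end[fields[0]] = max(uem_end[fields[0]], end)

    return {uri: (rttm_end.get(uri, 0.0), end) for uri, end in uem_end.items()}


def suspicious_files(rttm_lines, uem_lines, max_gap=60.0):
    """List files whose annotation stops more than max_gap seconds early."""
    times = last_times(rttm_lines, uem_lines)
    return sorted(uri for uri, (r_end, u_end) in times.items()
                  if u_end - r_end > max_gap)
```

A long silent ending would also be flagged, so the result is a list of files worth inspecting by hand, not proof of truncation.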

prlabu closed this as completed Nov 25, 2019

Member

hbredin commented Nov 26, 2019

Glad your issue is solved.

Would you mind sharing annotation and annotated files before and after your fix so that I can better fix the related bug in pyannote.metrics?

Author

prlabu commented Nov 26, 2019

MixHeadset_val_dirty_rttm.txt
MixHeadset_val_dirty_uem.txt
MixHeadset_val_rttm.txt
MixHeadset_val_uem.txt

Here are the files. I had to change them to a .txt extension so that GitHub would let me upload them. The _dirty files are from before the fix. I really do not know how line 1794 got chopped in the middle, but I'm sure that would create issues somewhere.
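
A line chopped in the middle, like the one described above, can be found with a simple per-line check. This sketch assumes standard RTTM lines with 10 whitespace-separated fields, where fields 4 and 5 (onset and duration) are numeric. The function name is hypothetical.

```python
def find_malformed_rttm_lines(lines):
    """Return (line_number, text) pairs for lines that do not look like RTTM."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        fields = text.split()
        ok = len(fields) == 10
        if ok:
            try:
                float(fields[3])  # turn onset
                float(fields[4])  # turn duration
            except ValueError:
                ok = False
        if not ok:
            bad.append((number, text))
    return bad
```

Since an open file yields its lines one by one, it can be passed in directly, for example `find_malformed_rttm_lines(open(path))` with `path` pointing at the RTTM file to check.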

Member

hbredin commented Nov 28, 2019

Thanks.
