Diarization #130

Draft. Wants to merge 4 commits into base: master

@ggerganov (Member) commented Nov 8, 2022

Some unsuccessful experiments with audio embedding clustering

Tried applying fuzzy C-means clustering to:

  • embeddings after the initial convolution in the encoder
  • self KV embeddings from each encoder layer
  • KQV embeddings from each encoder layer
  • embeddings from the last encoder layer
  • cross KV embeddings of each decoder layer
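
For reference, a minimal NumPy sketch of fuzzy C-means as it is usually defined. This is not the code from this PR: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random initialization, and the synthetic "frames" are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Cluster rows of X (n_samples, n_features); return (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    W = rng.random((X.shape[0], n_clusters))
    W /= W.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Wm = W ** m
        # Centers are membership-weighted means of the samples.
        centers = (Wm.T @ X) / Wm.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        # Membership update: proportional to dist^(-2 / (m - 1)), row-normalized.
        inv = dist ** (-2.0 / (m - 1.0))
        W_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(W_new - W).max() < tol
        W = W_new
        if converged:
            break
    return centers, W

# Hypothetical stand-in for per-frame embeddings: two well-separated blobs.
rng = np.random.default_rng(0)
frames = np.vstack([rng.normal(0.0, 0.1, (50, 4)), rng.normal(5.0, 0.1, (50, 4))])
centers, W = fuzzy_c_means(frames, n_clusters=2)
labels = W.argmax(axis=1)  # hard speaker label per frame
```

Each row of `W` holds soft memberships that sum to 1, so a frame can belong partly to several clusters. Taking the `argmax` is only one way to turn them into hard labels.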

Instead of clustering on the full embedding dimension, first reduce the dimensionality using SVD:

  • decompose the embeddings E = USVᵀ
  • scale the singular vectors by the singular values: U' = US
  • project E on U' and take the top few coordinates
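
One possible reading of these steps, sketched in NumPy. It assumes `E` has one row per audio frame and one column per embedding dimension, and that no mean-centering is applied. The helper name `svd_reduce` and the choice of `k` are hypothetical, not taken from the PR.

```python
import numpy as np

def svd_reduce(E, k):
    """Return the top-k SVD coordinates of each row of E (n_frames, n_dims)."""
    # Thin SVD: E = U @ diag(S) @ Vt, singular values in descending order.
    U, S, Vt = np.linalg.svd(E, full_matrices=False)
    # U * S scales column j of U by S[j]; this equals E @ Vt.T,
    # i.e. the rows of E expressed in the singular-vector basis.
    coords = U * S
    # Keep only the k coordinates with the largest singular values.
    return coords[:, :k]

# Hypothetical input: 20 frames of 8-dimensional embeddings.
E = np.random.default_rng(0).normal(size=(20, 8))
reduced = svd_reduce(E, k=3)  # shape (20, 3), ready to be clustered
```

Under this reading, `U' = US` already holds the projected coordinates, so the projection step reduces to keeping its first few columns.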
