Explore audio compatible models #148

Open
deven96 opened this issue Nov 28, 2024 · 1 comment
Assignees
Labels
enhancement New feature or request experimentation Experimenting on things

Comments

@deven96
Owner

deven96 commented Nov 28, 2024

Potential for adding models that can process audio files into embeddings

@deven96 deven96 added enhancement New feature or request experimentation Experimenting on things labels Nov 28, 2024
@HAKSOAT
Collaborator

HAKSOAT commented Jan 9, 2025

We should probably be looking in the direction of:

CLAP: https://huggingface.co/laion/clap-htsat-unfused
Wav2Vec2 (and its variants): https://huggingface.co/facebook/wav2vec2-base-960h

I can also see the HuBERT model: https://huggingface.co/docs/transformers/en/model_doc/hubert and EnCodec: https://huggingface.co/facebook/encodec_24khz, but they do not seem to have a tokenizer and would require training.
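
One practical difference between these candidates, as far as I can tell: CLAP exposes a single fixed-size audio embedding per clip, while Wav2Vec2 and HuBERT return one hidden-state vector per audio frame, so a pooling step would be needed to get one embedding per file. Below is a minimal sketch of that pooling step in plain Python. The frame values are made up and stand in for real model output, and `mean_pool` and `l2_normalize` are hypothetical helper names, not part of any library.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame vectors into one fixed-size vector.

    `frames` stands in for a model's frame-level output
    (one vector per audio frame). Hypothetical helper.
    """
    if not frames:
        raise ValueError("need at least one frame to pool")
    dim = len(frames[0])
    if any(len(frame) != dim for frame in frames):
        raise ValueError("all frames must have the same dimension")
    count = len(frames)
    return [sum(frame[i] for frame in frames) / count for i in range(dim)]


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so cosine similarity becomes a dot product."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # An all-zero vector has no direction; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


# Made-up data: 3 frames of a 2-dimensional "hidden state".
# Real frame vectors would be much wider (assumption: hundreds of dimensions).
fake_frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
clip_embedding = l2_normalize(mean_pool(fake_frames))
```

Mean pooling is only one option here; whether it gives useful embeddings for similarity search would need to be checked as part of the experimentation.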
