We used a language identification model (https://huggingface.co/TalTechNLP/voxlingua107-epaca-tdnn) to identify the language of each utterance in the VoxCeleb2 dev dataset. The per-language utterance counts are listed in lang.txt.
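For reference, the labels can be produced through the SpeechBrain interface to this model. The sketch below is a minimal example, not the exact script used to generate lang.txt: it assumes SpeechBrain is installed, and the dataset glob pattern and output file name are placeholders.

```python
# Minimal sketch (assumed, not the exact script used here): count per-language
# utterances in VoxCeleb2 dev with the VoxLingua107 model via SpeechBrain.
import glob
from collections import Counter

from speechbrain.pretrained import EncoderClassifier

lang_id = EncoderClassifier.from_hparams(
    source="TalTechNLP/voxlingua107-epaca-tdnn",
    savedir="pretrained_models/voxlingua107-epaca-tdnn",
)

counts = Counter()
# Placeholder glob; adjust to your VoxCeleb2 directory layout and audio format.
for path in glob.glob("VoxCeleb2/dev/**/*.wav", recursive=True):
    signal = lang_id.load_audio(path)
    # classify_batch returns (log-probs, best score, predicted index, text labels)
    _, _, _, text_lab = lang_id.classify_batch(signal)
    counts[text_lab[0]] += 1

with open("lang_counts.txt", "w") as f:  # same spirit as lang.txt
    for lang, n in counts.most_common():
        f.write(f"{lang}\t{n}\n")
```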
1. Results in this paper
2. Results in "Pretraining Conformer with ASR for Speaker Verification"
1. Create a Python 3.9 environment
conda create -n sre python=3.9
2. Install PyTorch
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
3. Install the required packages
git clone https://github.com/zds-potato/multilingual-phonetic-sv.git
cd multilingual-phonetic-sv
pip install -r requirements.txt
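Optionally, a quick sanity check (not part of the original steps) to confirm that the pinned PyTorch build and CUDA are visible:

```python
# Optional environment check; the version strings are those pinned above.
import torch
import torchaudio

print(torch.__version__, torchaudio.__version__)  # expect 1.12.1 / 0.12.1
print("CUDA available:", torch.cuda.is_available())
```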
Our code borrows heavily from the following projects:
We will open-source our trained multilingual speech recognition model soon...