# Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training
This repo combines a Tacotron2 model with an ML-VAE and adversarial learning to target accent conversion in TTS settings (pick a speaker A and assign them accent B).
Paper link: TBA
Samples link: https://amaai-lab.github.io/Accented-TTS-MLVAE-ADV/

## Training
First, preprocess your data into mel-spectrogram `.npy` arrays with the `preprocess.py` script. We used L2CMU in this paper, a combination of L2Arctic (24 speakers) and CMUArctic (4 speakers). Then run `CUDA_VISIBLE_DEVICES=X python train.py --dataset L2CMU`.
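For a rough idea of the kind of `.npy` arrays the preprocessing step produces, here is a minimal numpy-only sketch. It is not the repo's `preprocess.py`: the framing parameters, the plain magnitude spectrogram (a real pipeline would apply a mel filterbank, e.g. via librosa), and the output filename are all illustrative assumptions.

```python
import numpy as np

def wav_to_spec_frames(wav, n_fft=1024, hop=256):
    # Slice the waveform into overlapping windowed frames and take the
    # magnitude FFT of each frame -> array of shape (n_fft//2 + 1, n_frames).
    n_frames = 1 + (len(wav) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([wav[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# One second of a 440 Hz tone at 16 kHz as dummy input
wav = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000).astype(np.float32)
spec = wav_to_spec_frames(wav)
np.save("example_spec.npy", spec)  # training then loads arrays like this one
```

The actual script stores proper mel spectrograms per utterance; the point is only that each utterance ends up as a 2-D float array on disk.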

## Inference
Once trained, you can run `extract_stats.py` to retrieve the accent and speaker embeddings of your evaluation set and store them. Then, you can synthesize with one of the synth scripts. :-)
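As intuition for the stats-extraction step, the stored statistics could be something like per-accent (or per-speaker) averages of utterance-level embeddings. This is a hypothetical sketch, not the repo's code: the dictionary keys, the 128-dim embedding size, and the filename are made up.

```python
import numpy as np

# Pretend these are accent embeddings produced by the trained model,
# one 128-dim vector per evaluation utterance.
utt_embeddings = {
    "spk1_accentA_001": np.random.randn(128),
    "spk1_accentA_002": np.random.randn(128),
}

# Average the utterance embeddings into a single accent statistic and store it,
# so synthesis can later condition on this mean vector.
accent_mean = np.mean(list(utt_embeddings.values()), axis=0)
np.save("accentA_stats.npy", accent_mean)
```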

Once trained, you can run `CUDA_VISIBLE_DEVICES=X python synthesize.py --dataset L2Arctic --restore_step [N] --mode [batch/single] --text [TXT] --speaker_id [SPID] --accent [ACC]`
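For readers unfamiliar with this CLI shape, a minimal `argparse` sketch that mirrors the flags above (the parser here is an assumption for illustration, not the repo's `synthesize.py`, and the argument values passed in are made up):

```python
import argparse

# Hypothetical parser matching the synthesize.py flags shown above.
parser = argparse.ArgumentParser(description="accent-conversion synthesis (sketch)")
parser.add_argument("--dataset", type=str, required=True)
parser.add_argument("--restore_step", type=int, required=True)  # checkpoint step N
parser.add_argument("--mode", choices=["batch", "single"], required=True)
parser.add_argument("--text", type=str)        # used in single mode
parser.add_argument("--speaker_id", type=str)  # speaker A to keep
parser.add_argument("--accent", type=str)      # accent B to impose

# Parse an example command line (illustrative values only)
args = parser.parse_args([
    "--dataset", "L2Arctic", "--restore_step", "90000",
    "--mode", "single", "--text", "Hello world",
    "--speaker_id", "SPK1", "--accent", "ACC1",
])
print(args.mode)  # single
```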
