Gosmokeless28
07/02/2022, 8:28 PM
pretrained_model: Vocal_HP_4BAND_3090_arch-124m.pth
window_size: 320
parameter: auto-detect
aggressiveness: 0.5
TTA: on
deepExtraction: on
isVocal: on
download: on or off, it's your choice
export_as_mp3: on or off, it's your choice
(Alternatively, you can use this site, which isn't a notebook: https://www.lalal.ai)
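For anyone who wants to keep these settings somewhere reusable, here is a minimal sketch that stores the values listed above in a plain Python dict and prints them back in the same `key: value` form. The dict and the `format_settings` helper are hypothetical and are not part of the notebook. The `download` and `export_as_mp3` values are placeholders, since those two are left to personal choice.

```python
# Hypothetical container for the settings listed above.
# This is not the notebook's own API, just a convenient way to keep the values.
VOCAL_REMOVER_SETTINGS = {
    "pretrained_model": "Vocal_HP_4BAND_3090_arch-124m.pth",
    "window_size": 320,
    "parameter": "auto-detect",
    "aggressiveness": 0.5,
    "TTA": True,
    "deepExtraction": True,
    "isVocal": True,
    # The next two are up to you; the values here are only placeholders.
    "download": True,
    "export_as_mp3": False,
}


def format_settings(settings):
    """Render settings as 'key: value' lines, showing booleans as on/off."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "on" if value else "off"
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


print(format_settings(VOCAL_REMOVER_SETTINGS))
```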
HiFi-GAN vocoder trainers for Tacotron 2:
For legacy TT2: https://colab.research.google.com/drive/1ume3953K2K-EdNL90vNqPNSWM1KRuwqp
For pipeline TT2: https://colab.research.google.com/drive/1SKu2xRJy5q1wzuP5CSO8dJ-Nf-UIKz0K
Gosmokeless28
07/02/2022, 8:29 PM
Interested if this works for you all. The training time is pretty slow with the 3 speakers, and would be even slower with enough data for the GST to pick up on the interesting stuff, but hopefully this is useful? It also uses ARPAbet, which we've noticed is pretty helpful.
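For anyone unfamiliar with ARPAbet input, here is a rough sketch of what converting text to it can look like. The two-word table and the `to_arpabet` helper are made up for illustration. The curly-brace wrapping follows the convention used by common Tacotron 2 text front ends, and it is an assumption that this notebook expects the same format.

```python
# Tiny hand-written pronunciation table in CMUdict-style ARPAbet.
# Illustration only; a real pipeline would load a full dictionary.
TOY_ARPABET = {
    "hello": "HH AH0 L OW1",
    "world": "W ER1 L D",
}


def to_arpabet(text, table=TOY_ARPABET):
    """Wrap known words as {ARPAbet}; leave unknown words as plain text.

    Punctuation attached to a matched word is dropped in this sketch.
    """
    out = []
    for word in text.split():
        key = word.lower().strip(".,!?")
        if key in table:
            out.append("{" + table[key] + "}")
        else:
            out.append(word)
    return " ".join(out)


print(to_arpabet("Hello world"))
```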
Backup version of the TalkNet output generator notebook that you can use in case the other ones are broken: https://colab.research.google.com/github/justinjohn0306/TalkNET-colab/blob/main/Controllable_TalkNet.ipynb