Uberduck #machine-learning

zwf

03/25/2021, 7:20 PM

nice, good luck. it's a lot of fun

zwf

03/27/2021, 7:09 PM

@User Here's the Glow-TTS repo. They have a notebook ("inference.ipynb") that you can use

zwf

03/27/2021, 7:09 PM

I've never tried it myself

user

03/27/2021, 7:09 PM

How to run inference.ipynb

user

03/27/2021, 7:10 PM

Can we use pretrained model?

zwf

03/27/2021, 7:11 PM

yes, they link a pretrained model in the repo README.md

zwf

03/27/2021, 7:11 PM

Oops, I forgot the link

zwf

03/27/2021, 7:11 PM

https://github.com/jaywalnut310/glow-tts

user

03/27/2021, 7:11 PM

Can we use the michael rosen datasets?

user

03/27/2021, 7:11 PM

You're asking ZWF to hand over his datasets to you?

user

03/27/2021, 7:12 PM

i made a michael R dataset already

zwf

03/27/2021, 7:12 PM

you put together the audio, but you didn't transcribe it, which is the time-consuming part

user

03/27/2021, 7:12 PM

how to transcribe it?

SidPlays_144p

03/27/2021, 7:13 PM

you can use Descript or you can try to do it yourself

zwf

03/27/2021, 7:13 PM

I've found Descript to be really useful

user

03/27/2021, 7:13 PM

what is this?

zwf

03/27/2021, 7:13 PM

Although you still need to go back through and correct the transcriptions

zwf

03/27/2021, 7:14 PM

It's a program that lets you edit audio like text https://www.descript.com/

zwf

03/27/2021, 7:15 PM

So yeah, basically the input to the model is a text file that looks like:

path/to/wav/1.wav|Transcription of the first file.
path/to/wav/2.wav|Transcription of the second file.

zwf

03/27/2021, 7:15 PM

where each individual wav is between 1 and 10 seconds
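As a rough illustration of the format described above, here is a minimal Python sketch that reads a `path|transcription` filelist and flags any wav outside the 1 to 10 second range. The function names (`parse_filelist_line`, `wav_duration_seconds`, `check_filelist`) are made up for this example and are not part of the Glow-TTS repo. It also assumes plain PCM wav files that Python's standard `wave` module can open, and paths that resolve from the current working directory.

```python
import wave

# Duration bounds taken from the chat: each wav should be 1 to 10 seconds.
MIN_SECONDS = 1.0
MAX_SECONDS = 10.0


def parse_filelist_line(line):
    """Split one 'path|transcription' line into (path, text).

    Hypothetical helper: splits on the first '|' only, so the
    transcription itself may contain further '|' characters.
    """
    path, sep, text = line.rstrip("\n").partition("|")
    if not sep or not path.strip() or not text.strip():
        raise ValueError(f"expected 'path|transcription', got: {line!r}")
    return path.strip(), text.strip()


def wav_duration_seconds(path):
    """Return the duration of a PCM wav file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_filelist(filelist_path, min_s=MIN_SECONDS, max_s=MAX_SECONDS):
    """Return (wav_path, duration) for every clip outside [min_s, max_s]."""
    problems = []
    with open(filelist_path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            wav_path, _text = parse_filelist_line(line)
            duration = wav_duration_seconds(wav_path)
            if not (min_s <= duration <= max_s):
                problems.append((wav_path, duration))
    return problems
```

Calling `check_filelist("train.txt")` (the filename is just a placeholder) returns an empty list when every clip is within range. Otherwise it lists the clips that need to be split or merged before training.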

user

03/27/2021, 7:15 PM

also we can write

zwf

03/27/2021, 7:16 PM

I make my datasets using Descript, so if you create a Descript project where each paragraph contains 1 to 10 seconds of audio then I can easily make the training set

user

03/27/2021, 7:17 PM

michael rosen needs to be updated on vo.codes

zwf

03/27/2021, 7:18 PM

yeah, sounds like high-fidelity models are coming on vo.codes though

zwf

03/27/2021, 7:18 PM

cuz they have more funding now

Monero

03/29/2021, 11:46 PM

@zwf have you used NVIDIA NeMo?

zwf

03/29/2021, 11:46 PM

I've seen it, but never used it

Monero

03/30/2021, 12:01 AM

Going to try it out, been downloading the docker container for the last hour 😅

zwf

03/30/2021, 12:06 AM

nice, let us know how it goes!