Uberduck #🎃general🎃

Em Elle

01/19/2021, 6:29 PM

@User Hey, have you worked with HiFi-GAN before?

zwf

01/19/2021, 6:30 PM

I haven't worked with it much but I'm using a pre-trained hifi gan on uberduck.ai.

user

01/19/2021, 6:31 PM

when will we hit 400?

Em Elle

01/19/2021, 6:31 PM

Oh I see, so effectively you just pass your mels on to HiFi-GAN

zwf

01/19/2021, 6:31 PM

I meant people online.

zwf

01/19/2021, 6:31 PM

Yes, I'm using it exactly as I used WaveGlow

Em Elle

01/19/2021, 6:31 PM

and like you don't really need to train the model e2e with hifigan

zwf

01/19/2021, 6:31 PM

I found it's much higher quality than waveglow.

user

01/19/2021, 6:32 PM

I found a WaveGlow + Tacotron 2 Colab, but it's only for LJSpeech

zwf

01/19/2021, 6:32 PM

it would definitely improve the quality to train e2e but it's not necessary to have satisfactory output in my opinion.
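The setup described above is a two-stage pipeline: an acoustic model produces a mel spectrogram, and a separately trained vocoder (WaveGlow or HiFi-GAN) turns it into audio. The following is only a minimal sketch of that idea, not the Uberduck code. The names `synthesize`, `toy_acoustic_model`, and `toy_vocoder` are hypothetical, and the toy stand-ins exist just so the example runs without any model weights. The hop length of 256 samples and the 80 mel bins are assumptions, chosen because they are common in LJSpeech-style configs.

```python
from typing import Callable, List

Mel = List[List[float]]   # frames x mel bins
Wave = List[float]        # raw audio samples

HOP_LENGTH = 256  # assumption: samples of audio produced per mel frame
N_MELS = 80       # assumption: number of mel bins per frame


def synthesize(text: str,
               acoustic_model: Callable[[str], Mel],
               vocoder: Callable[[Mel], Wave]) -> Wave:
    """Run text -> mel -> audio. The vocoder is just a callable on mels,
    so one pre-trained vocoder can be swapped for another."""
    mel = acoustic_model(text)
    return vocoder(mel)


def toy_acoustic_model(text: str) -> Mel:
    # Hypothetical stand-in for Tacotron 2 / FastPitch: one silent frame per character.
    return [[0.0] * N_MELS for _ in text]


def toy_vocoder(mel: Mel) -> Wave:
    # Hypothetical stand-in for WaveGlow / HiFi-GAN: HOP_LENGTH samples per frame.
    return [0.0] * (len(mel) * HOP_LENGTH)


audio = synthesize("hello", toy_acoustic_model, toy_vocoder)
print(len(audio))
```

Swapping vocoders this way only works when both stages agree on the mel parameters (sample rate, number of mel bins, hop length, and so on). Training the two stages together, as mentioned above, is a separate step this sketch does not cover.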

Em Elle

01/19/2021, 6:32 PM

@User okay, I guess I'll train on FastPitch and see what happens. Some of the data processing is annoying; is there a script to get it into a form similar to LJSpeech?

zwf

01/19/2021, 6:33 PM

LOL, that Tom Scott video is exactly what the interactive part of Uberduck is. GPT3 into a synthesizer.

zwf

01/19/2021, 6:33 PM

Yeah, the data processing is extremely annoying. It's the most time-consuming part.
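For reference, LJSpeech ships a `wavs/` folder plus a pipe-delimited `metadata.csv` with three fields per line: clip ID, transcription, and normalized transcription. The sketch below writes that file from a dict of transcripts. It is not the tool mentioned in this chat; `write_ljspeech_metadata` is a hypothetical helper, and it assumes the audio has already been cut into clips whose filenames match the IDs. It also reuses the raw text for the normalized column, which skips real normalization such as expanding numbers.

```python
from pathlib import Path


def write_ljspeech_metadata(transcripts, out_dir):
    """Write an LJSpeech-style metadata.csv.

    transcripts: dict mapping clip ID (the wav filename without .wav) to text.
    Returns the path of the written metadata file.
    """
    out_dir = Path(out_dir)
    (out_dir / "wavs").mkdir(parents=True, exist_ok=True)  # clips go here
    meta_path = out_dir / "metadata.csv"
    with open(meta_path, "w", encoding="utf-8", newline="") as f:
        for clip_id, text in sorted(transcripts.items()):
            clean = " ".join(text.split())  # collapse newlines and repeated spaces
            if "|" in clean or "|" in clip_id:
                raise ValueError(f"pipe character not allowed: {clip_id!r}")
            # Assumption: normalized column is a copy of the raw transcription.
            f.write(f"{clip_id}|{clean}|{clean}\n")
    return meta_path
```

Usage would look like `write_ljspeech_metadata({"clip_0001": "First line."}, "my_dataset")`, after which the matching `clip_0001.wav` goes into `my_dataset/wavs/`.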

zwf

01/19/2021, 6:34 PM

Do you have any transcriptions now?

Em Elle

01/19/2021, 6:34 PM

Also btw is anyone into anime culture here?

user

01/19/2021, 6:34 PM

do you have screenshots of how you work with Tacotron 2?

zwf

01/19/2021, 6:34 PM

I have a tool that I've written for myself, and I know there are some other colabs.

Em Elle

01/19/2021, 6:34 PM

Yeah I have Scarlett Jo and also some anime voice actress

Em Elle

01/19/2021, 6:34 PM

just gotta get around to processing and cleaning some data

user

01/19/2021, 6:35 PM

if I clone someone's voice in Colab, it sounds like a dying person

user

01/19/2021, 6:35 PM

and starts speaking gibberish

zwf

01/19/2021, 6:35 PM

This often happens if you don't have enough data.

user

01/19/2021, 6:36 PM

it needs 5 or 10 secs of audio

zwf

01/19/2021, 6:36 PM

ah, I see, the CorentinJ one.

zwf

01/19/2021, 6:36 PM

yeah that one works differently, small amounts of data are ok.

zwf

01/19/2021, 6:36 PM

But it's not that great for voices that it was not trained on.

Em Elle

01/19/2021, 6:36 PM

@User I heard he nerfed that code so that it would be hard to reproduce the results

zwf

01/19/2021, 6:37 PM

The thing I'm working on now will hopefully provide similar functionality.

user

01/19/2021, 6:37 PM

which voice are you working on?

zwf

01/19/2021, 6:37 PM

I am trying to replicate the results in this paper https://arxiv.org/abs/1910.10838