Uberduck #tacotron-2-support


Eugene

10/21/2022, 11:15 PM

Aight

Reclezon

10/21/2022, 11:17 PM

Phone's about dead so brb ig

Eugene

10/21/2022, 11:17 PM

Thank you guys for helping me :)

Reclezon

10/21/2022, 11:20 PM

Does sound a bit grainy for background noise but the voice is very good otherwise.

Eugene

10/21/2022, 11:24 PM

Is there anything to conclude from these images?

{K EY1} (Kei)

10/21/2022, 11:27 PM

That means the model is very good

{K EY1} (Kei)

10/21/2022, 11:27 PM

The first thing is a map of the generated phonetics. The one on the right is an alignment graph (a straight orange diagonal line is good). The emojis are just whatever is detected in the text.
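
Editor's sketch: the "straight diagonal is good" rule of thumb can be turned into a rough number. The snippet below assumes the alignment is available as a plain list of rows, one row per decoder step, each row holding attention weights over the text positions. The function name `diagonality` and that input layout are hypothetical, chosen for illustration, and are not taken from the Uberduck tooling.

```python
def diagonality(alignment):
    """Score how closely attention peaks follow a straight diagonal.

    alignment: list of rows (one per decoder step); each row is a list of
    attention weights over encoder (text) positions. This layout is an
    assumption made for this sketch.
    Returns a value in [0, 1]; 1.0 means every peak sits on the diagonal.
    """
    n_dec = len(alignment)
    n_enc = len(alignment[0])
    if n_dec < 2 or n_enc < 2:
        raise ValueError("need at least 2 decoder steps and 2 encoder positions")

    total_deviation = 0.0
    for step, row in enumerate(alignment):
        # Text position that this decoder step attends to most strongly.
        peak = max(range(n_enc), key=row.__getitem__)
        # Where the peak would sit on a perfectly straight diagonal.
        expected = step * (n_enc - 1) / (n_dec - 1)
        total_deviation += abs(peak - expected) / (n_enc - 1)

    return 1.0 - total_deviation / n_dec


# A perfectly diagonal alignment versus one stuck on the first text position.
diagonal = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
stuck = [[1, 0, 0, 0]] * 4
print(diagonality(diagonal), diagonality(stuck))
```

A score near 1 matches the clean diagonal described above; a low score suggests the attention is wandering or stuck. Real speech is not perfectly linear in time, so treat this as a coarse check only.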

Eugene

10/21/2022, 11:29 PM

Hmm

Eugene

10/21/2022, 11:30 PM

I do hope that with more voice lines the model could become more realistic.

Eugene

10/21/2022, 11:30 PM

And I also find quite a few words in the output audio hard to understand.

Reclezon

10/21/2022, 11:42 PM

More is always better, though mind that rule about exponential decreases or whatever.

Eugene

10/21/2022, 11:46 PM

What's that rule?

legorunnerkid

10/21/2022, 11:55 PM

I remember having my model sound very weird when I thought I had it.

legorunnerkid

10/21/2022, 11:55 PM

I believe it had something to do with me not waiting enough.

Reclezon

10/22/2022, 12:16 AM

Law of diminishing returns! I remember now

legorunnerkid

10/22/2022, 12:53 AM

What's interesting about the source I'm using is that certain cutscenes with echoes have two audio channels.
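
Editor's sketch: two-channel clips like these usually need to be reduced to one channel before they go into a dataset, on the assumption that the training setup expects mono audio. The helpers below work on interleaved integer samples (left, right, left, right, ...). The names `downmix_to_mono` and `take_channel` are hypothetical, and whether averaging or keeping a single channel is the better choice depends on how the echo is split across the channels, which the thread does not say.

```python
def downmix_to_mono(samples, channels=2):
    """Average each frame of interleaved samples into one mono sample."""
    if len(samples) % channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return [
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples), channels)
    ]


def take_channel(samples, index, channels=2):
    """Keep only one channel, e.g. if the echo lives in the other one."""
    if not 0 <= index < channels:
        raise ValueError("channel index out of range")
    return samples[index::channels]


# Two stereo frames: (100, 200) and (-100, 100).
stereo = [100, 200, -100, 100]
print(downmix_to_mono(stereo))   # averaged frames
print(take_channel(stereo, 0))   # left channel only
```

If one channel turns out to carry the dry voice and the other mostly echo, keeping the dry channel is likely the cleaner option than averaging.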

legorunnerkid

10/22/2022, 1:25 AM

I finally got 200 voiceclips transcribed now.

legorunnerkid

10/22/2022, 1:29 AM

I've got about 3 minutes of voicelines.

🏝✍Berkeley•Ray•Brewski

10/22/2022, 1:48 AM

As someone who went through 165, I can't help but be proud of you!

legorunnerkid

10/22/2022, 1:49 AM

Oh wait, I was meant to post this in #768215837248716819.

legorunnerkid

10/22/2022, 1:49 AM

Since I technically don't need help.

Gamma Prime

10/22/2022, 3:48 AM

Does setting a batch size of 1 hurt models?

Gamma Prime

10/22/2022, 3:48 AM

I did that by accident and didn't realize it until training started. 50 epochs in, the model was stroking pretty bad.

Gosmokeless28

10/22/2022, 5:05 AM

Yes

Gamma Prime

10/22/2022, 5:10 AM

I had a few samples in the dataset that might not have been doing him favors either. Turns out that Adobe Shasta maybe shouldn't be used to clean Transformer voices.

Eugene

10/22/2022, 2:02 PM

Is it hard to make a model that works with reference audio (instead of text to speech)?

AhmadGT

10/22/2022, 3:14 PM

Is it normal that Colab deletes everything when it disconnects?

AhmadGT

10/22/2022, 3:25 PM

Also, is the pipeline better, or the legacy one?

Reclezon

10/22/2022, 3:41 PM

Pipeline