Uberduck #machine-learning

tylerdurdenceketi

09/15/2022, 8:16 AM

hello is there a way to train at lower sample rates without issues?

PixPrucer

09/15/2022, 8:19 AM

Why would you want to train on anything lower than 22.05kHz though?

PixPrucer

09/15/2022, 8:19 AM

I can't imagine any reason other than reducing the training time or matching the parameters to the WAV clips

tylerdurdenceketi

09/15/2022, 8:27 AM

both. currently i am trying to train on a turkish dataset that has 16 kHz wavs. i tried 22 kHz with the international phonetic alphabet but it didn't work

PixPrucer

09/15/2022, 8:28 AM

You can upsample the wavs to be 22kHz easily
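
A minimal sketch of the upsampling step, using only the Python standard library. It uses naive linear interpolation rather than a proper band-limited resampler, and it assumes 16-bit mono PCM input on a little-endian machine. The function names `resample_linear` and `upsample_wav` are hypothetical and not part of any Uberduck tooling. Note that upsampling 16 kHz audio does not add any content above 8 kHz; it only changes the sample grid.

```python
import array
import wave


def resample_linear(samples, src_rate, dst_rate):
    """Resample a sequence of numbers with linear interpolation (naive sketch)."""
    if not samples:
        return []
    n_out = round(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # distance between output samples, in input samples
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        if left >= last:
            # Past the final input sample: hold the last value.
            out.append(float(samples[last]))
            continue
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[left + 1] * frac)
    return out


def upsample_wav(src_path, dst_path, dst_rate=22050):
    """Resample a 16-bit mono PCM WAV file to dst_rate (hypothetical helper)."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        src_rate = src.getframerate()
        pcm = array.array("h")  # native-endian 16-bit ints; assumes little-endian host
        pcm.frombytes(src.readframes(src.getnframes()))

    resampled = resample_linear(pcm, src_rate, dst_rate)
    # Round back to integers and clamp to the 16-bit range.
    out = array.array("h", (max(-32768, min(32767, round(x))) for x in resampled))

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(dst_rate)
        dst.writeframes(out.tobytes())
```

Going from 16 kHz to 22.05 kHz multiplies the number of samples per clip by 22050 / 16000, about 1.38. For real datasets a dedicated audio resampling tool would normally give better quality than this interpolation.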

tylerdurdenceketi

09/15/2022, 8:28 AM

yeah but it increases the training time 😦

PixPrucer

09/15/2022, 8:28 AM

Not by much

PixPrucer

09/15/2022, 8:29 AM

You'll be fine

PixPrucer

09/15/2022, 8:29 AM

The real pain is training on full-resolution 44.1 kHz WAVs. I don't think anyone here has attempted that yet

tylerdurdenceketi

09/15/2022, 8:30 AM

that's overkill

PixPrucer

09/15/2022, 8:31 AM

I actually accidentally trained a 44 kHz model once and it was fine, but at 0.5 tempo 🧑‍🦲

PixPrucer

09/15/2022, 8:31 AM

Oh and the pitch was very much broken, metal growl kind of vibe

tylerdurdenceketi

09/15/2022, 8:32 AM

resampling and nyquist shit probably
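
One plausible reading of the 0.5 tempo and the growl-like pitch, assuming the 44.1 kHz audio was interpreted as 22.05 kHz somewhere in the pipeline (the chat does not confirm this): samples played back at a different rate than they were recorded at change speed and pitch by the same ratio. The helper name `playback_effect` is hypothetical.

```python
import math


def playback_effect(native_rate, assumed_rate):
    """Return (speed factor, pitch shift in semitones) when audio recorded at
    native_rate is played back as if it had been recorded at assumed_rate."""
    speed = assumed_rate / native_rate
    semitones = 12 * math.log2(speed)
    return speed, semitones


# 44.1 kHz samples treated as 22.05 kHz: half speed, one octave down.
speed, semitones = playback_effect(44100, 22050)
print(speed, semitones)  # 0.5 -12.0
```

The opposite mismatch, 22.05 kHz samples treated as 44.1 kHz, gives double speed and one octave up.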

tylerdurdenceketi

09/15/2022, 8:33 AM

what would you suggest about alphabets? should i use IPA or the turkish alphabet (basic cleaner)?

PixPrucer

09/15/2022, 8:33 AM

I've heard IPA produces more accurate results than plain-text transcripts
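
For the basic cleaner option on Turkish text, here is a minimal sketch. It assumes that "basic cleaner" means lowercasing plus whitespace collapsing, as in common Tacotron-style text pipelines; the actual cleaner used in this setup is not shown in the chat. The function name `turkish_basic_cleaner` is hypothetical. The one Turkish-specific detail is the dotted and dotless i, which Python's default `str.lower()` does not handle the Turkish way.

```python
import re

# Turkish casing: "I" lowercases to dotless "ı" (U+0131),
# and dotted capital "İ" (U+0130) lowercases to "i".
_TURKISH_UPPER_I = str.maketrans({"I": "\u0131", "\u0130": "i"})
_WHITESPACE = re.compile(r"\s+")


def turkish_basic_cleaner(text):
    """Lowercase with Turkish i handling, then collapse runs of whitespace."""
    text = text.translate(_TURKISH_UPPER_I).lower()
    return _WHITESPACE.sub(" ", text).strip()


print(turkish_basic_cleaner("IĞDIR   İstanbul "))  # ığdır istanbul
```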

tylerdurdenceketi

09/15/2022, 8:35 AM

how many wavs should i have? i got a dataset with transcriptions from the internet. i cleaned the transcriptions and such, and in the end i got 251 wav files

PixPrucer

09/15/2022, 8:37 AM

I'd count by minutes, because a 250 WAV dataset can have either 7 or 25 minutes of data
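
Counting by minutes can be done with the standard-library `wave` module. This is a minimal sketch that assumes uncompressed PCM WAV files sitting directly in one folder; the helper names and the `wavs` folder name are hypothetical.

```python
import wave
from pathlib import Path


def wav_seconds(path):
    """Return the duration of one WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_minutes(folder):
    """Sum the duration of every .wav file directly inside folder, in minutes."""
    total_seconds = sum(wav_seconds(p) for p in sorted(Path(folder).glob("*.wav")))
    return total_seconds / 60.0


if __name__ == "__main__":
    # "wavs" is a placeholder folder name; a missing folder simply yields 0.0.
    print(f"{dataset_minutes('wavs'):.1f} minutes of audio")
```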

tylerdurdenceketi

09/15/2022, 8:39 AM

that's not enough i suppose. i have tried ipa with 10 speakers and the speech was unrecognizable. i will try to make a synthetic dataset using a tts reader or something

PixPrucer

09/15/2022, 8:45 AM

That will do pretty well as a base model

hecko

09/15/2022, 8:52 AM

i vaguely tried but it seems i'd at least have to make a new base model

hecko

09/15/2022, 8:52 AM

which takes weeks

hecko

09/15/2022, 8:52 AM

especially on t4

hecko

09/15/2022, 8:52 AM

ipa won't work on uberduck without extra coding

PixPrucer

09/15/2022, 9:31 AM

Ah r.i.p

{K EY1} (Kei)

09/15/2022, 1:42 PM

That's why a bunch of my models aren't on uberduck 🤭

Cris140

09/15/2022, 3:03 PM

The way it's been implemented crashes after some time because of RAM. I had to change some stuff to get it working, but now it's working perfectly

zwf

09/15/2022, 3:04 PM

you're the man 💯

HolyArapaima

09/15/2022, 3:04 PM

Before this I accidentally trained a model with a different sample rate from everything else and it came out half demon. When I fixed it, the training somehow came out worse, so I'm gonna do some investigating today and see what's up.

hecko

09/15/2022, 3:24 PM

iirc setting the sample rate to 44100 while the wavs are 44100 makes the audio sound chipmunky

hecko

09/15/2022, 3:25 PM

and to get a proper pitch you have to do 22050 * √2 and even then it's slower than it should be
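
For reference, the value of 22050 * √2 can be worked out as below. This only restates the arithmetic from the message above; it does not explain why that setting produced the right pitch.

```python
import math

BASE_RATE = 22050

# 22050 * sqrt(2) is the geometric mean of 22050 and 44100.
adjusted_rate = BASE_RATE * math.sqrt(2)

# On a pitch scale it sits half an octave (6 semitones) from each end.
semitones_above_base = 12 * math.log2(adjusted_rate / BASE_RATE)
semitones_below_44100 = 12 * math.log2(44100 / adjusted_rate)

print(round(adjusted_rate))  # 31183
```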