Uberduck #tacotron-2-support

Cris140

03/20/2023, 3:07 PM

Check if the notebook you're using is the one from #841437191073955920 , if not, try using it

k24789304

03/20/2023, 4:18 PM

thanks, but im getting another issue

    126 dirs = [x for x in os.scandir(target_path)]
    127 if len(dirs) == 1:
--> 128   with open(sorted(glob.glob(os.path.join(dirs[0].path, "*.txt")))[0]) as f:
    129     already_multispeaker = (len(f.readline().split("|")) == 3)
    130 
IndexError: list index out of range
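
The traceback above fails on `[0]`: `glob.glob(...)` found no `.txt` file directly inside the single subfolder, so the sorted list is empty. Here is a minimal sketch for checking that before running the notebook cell. `txt_files_in_single_subdir` is a hypothetical helper, not part of the notebook, and it only mirrors the two lines visible in the traceback.

```python
import glob
import os


def txt_files_in_single_subdir(target_path):
    """Mirror the notebook's lookup: list *.txt inside the only entry of target_path.

    Returns None when target_path does not contain exactly one entry,
    because the notebook only takes this branch in that case.
    """
    dirs = [entry for entry in os.scandir(target_path)]
    if len(dirs) != 1:
        return None
    # Same pattern as line 128 of the traceback, without indexing [0].
    return sorted(glob.glob(os.path.join(dirs[0].path, "*.txt")))
```

If this returns an empty list, line 128 will raise the same IndexError. In that case the transcript is probably not inside that subfolder, or it does not end in `.txt`.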

k24789304

03/20/2023, 4:22 PM

might be due to the fact that i wrote "transcripts" instead of "transcription"

Cris140

03/20/2023, 4:35 PM

make sure your txt has no blank lines in it

k24789304

03/20/2023, 4:50 PM

hmm still getting the same issue

Cris140

03/20/2023, 5:05 PM

send the txt here

Cris140

03/20/2023, 5:08 PM

It's missing the wavs location

k24789304

03/20/2023, 5:09 PM

does tacotron have its own formatting

k24789304

03/20/2023, 5:09 PM

i used whisper for this extract

Cris140

03/20/2023, 5:10 PM

yes

Cris140

03/20/2023, 5:11 PM

(wav location)|(text)

Cris140

03/20/2023, 5:11 PM

so, something like

k24789304

03/20/2023, 5:11 PM

swag

Cris140

03/20/2023, 5:11 PM

wavs/01.wav|Why are you talking to us yes I am.
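
To make the `(wav location)|(text)` format concrete, here is a rough sketch of a per-line check. It assumes the two-field, single-speaker layout described above. The traceback's check for 3 fields suggests a multispeaker variant exists, and this sketch does not handle it. `parse_filelist_line` is a made-up name for illustration.

```python
def parse_filelist_line(line):
    """Split one 'wav_path|text' line into (wav_path, text).

    Returns None for blank lines, lines without exactly one '|',
    paths that do not end in .wav, or empty text.
    """
    line = line.strip()
    if not line:
        return None  # blank lines are not valid entries
    parts = line.split("|")
    if len(parts) != 2:
        return None  # assumed two-field format only
    wav_path, text = parts[0].strip(), parts[1].strip()
    if not wav_path.lower().endswith(".wav") or not text:
        return None
    return wav_path, text
```

A raw Whisper transcript line with no path in front of it comes back as None, which matches the "missing the wavs location" diagnosis.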

Cris140

03/20/2023, 5:11 PM

Also, there's a bunch of lines with one word, you'll have to clean that by hand

Cris140

03/20/2023, 5:11 PM

One word won't be enough for one audio
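
Finding the one-word lines does not have to be done by eye. Below is a small sketch that lists them so they can be fixed or dropped by hand. The `min_words=2` threshold and the name `short_entries` are assumptions for illustration, not a rule from the notebook.

```python
def short_entries(lines, min_words=2):
    """Return (line_number, line) pairs whose text has fewer than min_words words.

    Lines with no '|' or with empty text count as zero words, so they are flagged too.
    """
    flagged = []
    for number, line in enumerate(lines, start=1):
        # Everything after the first '|' is treated as the transcription.
        _, _, text = line.strip().partition("|")
        if len(text.split()) < min_words:
            flagged.append((number, line.strip()))
    return flagged
```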

k24789304

03/20/2023, 5:18 PM

is the format correct?

k24789304

03/20/2023, 5:49 PM

lol should be .wav

k24789304

03/20/2023, 6:14 PM

cool, still getting the same error

The Watts and the Waves

03/20/2023, 6:15 PM

try adding punctuation at the end of each line

k24789304

03/20/2023, 6:16 PM

is it related to format?

The Watts and the Waves

03/20/2023, 6:17 PM

YEAH I dunno if pipeline does it but legacy always threw a fit when we didn't close a line with at LEAST a period so your first line should prolly be smth like

wavs/cooltext.wav|Maybe now.

The Watts and the Waves

03/20/2023, 6:19 PM

and you can use most punctuation, I've gotten away with having questions and exclamations and it read em just fine
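
Here is a quick sketch of the punctuation advice, assuming the two-field `wav|text` format and treating `.`, `!` and `?` as acceptable endings. Whether the pipeline notebook actually requires this is not confirmed above. `ensure_terminal_punctuation` is a hypothetical helper.

```python
def ensure_terminal_punctuation(line, marks=".!?"):
    """Append a period to the text part of 'wav_path|text' if it lacks end punctuation.

    Lines without a '|' or with empty text are returned unchanged (minus trailing whitespace).
    """
    path, sep, text = line.rstrip().partition("|")
    text = text.rstrip()
    if sep and text and text[-1] not in marks:
        text += "."
    return path + sep + text
```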

k24789304

03/20/2023, 6:30 PM

hmmmmmmmmmmmmmmmmmmm still scuffed

k24789304

03/20/2023, 6:41 PM

maybe i'll just use tacotron's transcribe

k24789304

03/20/2023, 6:55 PM

any cool tips on how to ensure quality of data model? how important is data cleaning? should i make sure non human voice is removed? any way i can avoid paying premium on google colab while transcribing?

Reclezon

03/20/2023, 8:07 PM

Clean data is very important, especially with a small dataset, I wanna say 20 minutes total or less? Bigger ones can handle more, like some whistles or just noise that couldn't be removed in general by cleaners. Effects can be fine as long as it's not too harsh. There's also the Whisper NB for transcripts, which does a decent job for what I've thrown at it.

Reclezon

03/20/2023, 8:07 PM

I have used the large model though so idk how the smaller ones fare

Cris140

03/20/2023, 9:04 PM

You have only 1 wav?

Cris140

03/20/2023, 9:05 PM

each line must point to a different file which corresponds to the transcription after the "|"
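
One way to check the "different file per line" rule is to look for wav paths that show up more than once. This is only a sketch: `repeated_wav_paths` is a hypothetical name, and it takes everything before the first `|` as the path.

```python
from collections import Counter


def repeated_wav_paths(lines):
    """Return the wav paths that appear on more than one line, sorted."""
    # Blank lines are skipped; the path is whatever precedes the first '|'.
    paths = [line.split("|", 1)[0].strip() for line in lines if line.strip()]
    counts = Counter(paths)
    return sorted(path for path, count in counts.items() if count > 1)
```

An empty result means every line names its own file. It does not check that those files actually exist in the wavs folder.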