# tacotron-2-support
  • t

    tylerdurdenceketi

    11/26/2022, 9:20 AM
    Add a new code block and type this: !pip install --upgrade gdown
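For anyone running the same upgrade outside a notebook cell, here is a minimal Python sketch of the equivalent call. It assumes pip is available in the current interpreter. The helper names `pip_upgrade_command` and `pip_upgrade` are made up for illustration and are not part of the notebook.

```python
import subprocess
import sys


def pip_upgrade_command(package):
    """Build the command line equivalent to `!pip install --upgrade <package>`."""
    # sys.executable targets the interpreter that is currently running,
    # so the package lands in the same environment as this script.
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


def pip_upgrade(package):
    """Run the upgrade; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_upgrade_command(package))
```

Calling `pip_upgrade("gdown")` performs the upgrade. Inside Colab, the one-line `!pip install --upgrade gdown` cell from the message above is simpler.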
  • l

    lil beby

    11/26/2022, 10:03 AM
    Guys, how to get in?
  • h

    hecko

    11/26/2022, 12:01 PM
    did you run the previous steps
  • h

    hecko

    11/26/2022, 12:02 PM
    are you using a school computer or school google account? if so then you can't
  • h

    hecko

    11/26/2022, 12:03 PM
    you'd have to use your own device with your own personal account
  • h

    hecko

    11/26/2022, 12:03 PM
    (or take it up with the computer person in your school but i doubt it'll work)
  • u

    9 x x

    11/26/2022, 5:03 PM
    yeah make sure you are not on a work account
  • u

    9 x x

    11/26/2022, 5:03 PM
    also
  • u

    9 x x

    11/26/2022, 5:03 PM
    #841437191073955920
  • u

    9 x x

    11/26/2022, 5:04 PM
    make sure the notebook is there
  • u

    9 x x

    11/26/2022, 5:04 PM
    and when you make a copy you are not switching accounts
  • s

    Sonic2022_mario

    11/26/2022, 6:57 PM
    I Just Uploaded Sonic (Roger Craig Smith, Frontiers) And Tested It And Sounds Like This
  • d

    Daft

    11/26/2022, 11:12 PM
    how come my model is pronouncing certain words wrong
  • l

    Lexi (delulu posts on the daily)

    11/27/2022, 12:42 AM
    because dataset could be silly or u could use arpabet
  • m

    mynameisNegan

    11/27/2022, 1:51 AM
    @hecko What's the best epoch for AITCH using the Uberduck Tacotron 2 pipeline? I have over 547 wav files, 32:19 minutes in total.
  • h

    hecko

    11/27/2022, 1:52 AM
    there's no hard number for things like that
  • h

    hecko

    11/27/2022, 1:52 AM
    just train it for some time and test a few of the checkpoints in the synthesis notebook
  • m

    mynameisNegan

    11/27/2022, 1:52 AM
    Okay.
  • t

    tylerdurdenceketi

    11/27/2022, 11:31 AM
    I have preprocessed files with preprocess_audio.py. Graphs were good. But speeches cut off early. So I ran preprocess_audio.py again with 300 ms padding. But the mel spectrogram shows a black region at the end. I have read something about adding noise to audio tracks in the GitHub issues section of the tacotron2 repo. What should I do?
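The black region is what padding with pure digital silence usually looks like: all-zero samples carry no energy, so the log-mel values drop to the floor. Below is a minimal sketch of the "add noise" idea, padding with very quiet noise instead of zeros. It assumes 16-bit PCM samples. The function names, the `peak` level, and the seed are illustrative choices, not values from preprocess_audio.py or the tacotron2 repo, and whether this helps training is not settled in this thread.

```python
import array
import random


def noise_padding(sample_rate, pad_ms=300, peak=8, seed=0):
    """Return pad_ms of low-level white noise as 16-bit samples.

    peak is the largest absolute sample value (out of 32767), so the
    default is far quieter than speech but not exactly zero.
    """
    rng = random.Random(seed)  # seeded so repeated runs give identical output
    n_samples = sample_rate * pad_ms // 1000
    return array.array("h", (rng.randint(-peak, peak) for _ in range(n_samples)))


def pad_pcm16(samples, sample_rate, pad_ms=300, peak=8, seed=0):
    """Append quiet noise (instead of digital silence) to 16-bit samples."""
    padded = array.array("h", samples)  # copy, so the input is left untouched
    padded.extend(noise_padding(sample_rate, pad_ms, peak, seed))
    return padded
```

At 22050 Hz, 300 ms of padding adds 6615 samples. Checking the mel spectrogram again after padding would show whether the dark band at the end is gone.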
  • u

    (Dawn) Will Draw Fictional Women

    11/27/2022, 11:32 AM
    >But speeches cut off early.
  • u

    (Dawn) Will Draw Fictional Women

    11/27/2022, 11:32 AM
    you overtrained
  • u

    (Dawn) Will Draw Fictional Women

    11/27/2022, 11:32 AM
    even then a few tries would probably yield the full prompt
  • t

    tylerdurdenceketi

    11/27/2022, 11:33 AM
    Well loss ratio says otherwise.
  • t

    tylerdurdenceketi

    11/27/2022, 11:33 AM
    It was 0.30
  • u

    (Dawn) Will Draw Fictional Women

    11/27/2022, 11:33 AM
    if you get it TOO low it could be overtrained
  • u

    (Dawn) Will Draw Fictional Women

    11/27/2022, 11:34 AM
    if you would like tips on a third attempt i suggest multiple sentences in a single wav file
  • u

    (Dawn) Will Draw Fictional Women

    11/27/2022, 11:34 AM
    make sure they dont exceed 12 seconds tho
  • t

    tylerdurdenceketi

    11/27/2022, 11:35 AM
    Well I have 10000 wav files.
  • u

    (Dawn) Will Draw Fictional Women

    11/27/2022, 11:35 AM
    @hecko has a wav merger cell that should alleviate all the labor
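As a rough illustration of the merging advice above (this is not hecko's cell, which is not shown here), a minimal sketch using only the Python standard library: pack consecutive clips into groups of at most 12 seconds, then concatenate each group. All function names are hypothetical, and the sketch assumes every WAV file shares the same sample rate, sample width, and channel count.

```python
import wave

MAX_SECONDS = 12.0  # upper limit suggested in the thread


def group_by_duration(durations, max_seconds=MAX_SECONDS):
    """Greedily pack consecutive clip durations into groups of indices.

    A group is closed as soon as adding the next clip would exceed
    max_seconds. A single clip longer than the limit ends up alone.
    """
    groups, current, total = [], [], 0.0
    for index, seconds in enumerate(durations):
        if current and total + seconds > max_seconds:
            groups.append(current)
            current, total = [], 0.0
        current.append(index)
        total += seconds
    if current:
        groups.append(current)
    return groups


def wav_duration(path):
    """Length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def merge_wavs(paths, out_path):
    """Concatenate WAV files with identical formats into out_path."""
    with wave.open(paths[0], "rb") as first:
        params = first.getparams()
    with wave.open(out_path, "wb") as out:
        out.setparams(params)  # the frame count is corrected when the file closes
        for path in paths:
            with wave.open(path, "rb") as wav:
                out.writeframes(wav.readframes(wav.getnframes()))
```

For example, `group_by_duration([wav_duration(p) for p in paths])` gives the index groups, and each group's files can be passed to `merge_wavs`. The transcript lines for merged clips would need to be joined in the same order.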
  • t

    tylerdurdenceketi

    11/27/2022, 11:36 AM
    I set batch size 32 for 20 epochs. It stalled at 0.30. Then I set it to 128. Then it triggered lr_decay. The learning rate got smaller and the loss started to decrease.