# talknet-support
  • r

    Reclezon

    12/27/2022, 3:34 AM
    Hold the fuck up
  • r

    Reclezon

    12/27/2022, 3:34 AM
    There's a web app for audacity?
  • r

    Reclezon

    12/27/2022, 3:35 AM
  • p

    PixPrucer

    12/28/2022, 6:48 AM
    Not really TalkNet, but HiFi-GAN fine-tuning. I'm currently trying to run the notebook, but I'm getting stuck at the "Compose required files" cell.
  • p

    PixPrucer

    12/28/2022, 6:48 AM
    No matter where I put the Tacotron model, it refuses to load it (I even tried putting it into the /content/hifi-gan/ folder and the same error occurs).
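
Not the notebook's own code, but a quick hedged way to rule out a path problem before the "Compose required files" cell: check that the checkpoint is where you think it is and that it loads at all (the path below is a placeholder, not a known-correct location).

```python
import os
import torch

# Placeholder path -- substitute wherever you actually uploaded the Tacotron checkpoint
ckpt_path = "/content/hifi-gan/tacotron2_statedict.pt"

# Confirm the runtime can actually see the file
print(os.path.exists(ckpt_path), ckpt_path)

# If the file exists but still fails, loading it directly shows the real error
state = torch.load(ckpt_path, map_location="cpu")
print(type(state), list(state.keys())[:5] if isinstance(state, dict) else state)
```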
  • x

    xomnow

    12/30/2022, 12:20 AM
    Question on your recommended settings if I may... I've got the notebook running locally on my GPU and I'm able to run larger batch sizes than the defaults. Is there any benefit to this, or do I run the risk of overfitting my models if I tinker with batch size?
  • g

    Gosmokeless28

    12/30/2022, 2:34 AM
    The only benefit of using a larger batch size is that it makes the training happen faster—but at the risk of underfitting the model.
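
Not advice from the notebook itself, but one widely used rule of thumb when raising the batch size is to scale the learning rate up roughly in proportion, so the larger batches don't quietly undertrain the model. A minimal sketch with placeholder base values (not the notebook's actual defaults):

```python
# Hedged sketch of the linear-scaling rule of thumb; base values are placeholders.
base_batch_size = 32
base_lr = 1e-3

def scaled_lr(batch_size: int) -> float:
    """Scale the learning rate linearly with batch size (one common heuristic)."""
    return base_lr * batch_size / base_batch_size

print(scaled_lr(64))   # 0.002 -- doubling the batch roughly doubles the learning rate
print(scaled_lr(128))  # 0.004
```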
  • x

    xomnow

    12/30/2022, 2:36 AM
    thanks for that - one other question if I may - should the learning rates for the other training steps remain default, or is there benefit in changing to the 1e-4/3e-7 on those as well?
  • g

    Gosmokeless28

    12/30/2022, 2:37 AM
    I've never tinkered with those, actually. In my opinion, it's alright to leave them unchanged.
  • g

    Gosmokeless28

    12/30/2022, 2:37 AM
    I only change the Spectrogram Generator's parameters
  • x

    xomnow

    12/30/2022, 2:39 AM
    ty kindly for the answers - I have a suspicion I've been over-training things since I worked out completely local training. I have solid datasets and fully checked transcriptions but the results haven't been nearly as good as I expected
  • g

    Gosmokeless28

    12/30/2022, 2:40 AM
    It took me months to realize that I had actually been lowering the learning rates instead of raising them, lol.
  • g

    Gosmokeless28

    12/30/2022, 2:41 AM
    I thought 1e-4 & 3e-7 were higher than 1e-3 & 3e-6. It wasn't until recently that I learned that those are negative numbers.
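
For anyone else tripped up by the notation: the digit after the e is a power of ten, so a more negative exponent means a smaller number, not a larger one. A quick check in Python:

```python
# Scientific notation: 1e-4 means 1 * 10**-4
print(1e-3)          # 0.001
print(1e-4)          # 0.0001
print(1e-4 < 1e-3)   # True -- switching from 1e-3 to 1e-4 lowers the learning rate
print(3e-7 < 3e-6)   # True
```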
  • x

    xomnow

    12/30/2022, 2:46 AM
    is there any other "special sauce" advice you think particularly relevant? all my data is from audiobooks with generally clear recordings, but I can't seem to get past the slight metallic tinge when I synthesize
  • g

    Gosmokeless28

    12/30/2022, 2:49 AM
    You should train the TalkNet model's HiFi-GAN vocoder for 3,100 epochs (Not to be confused with 3,100 steps).
  • x

    xomnow

    12/30/2022, 2:50 AM
    I want to make sure I have that right... 3100 epochs? this is contrary to where you said 5k steps (though, to be sure, you mention not to worry about overfitting)
  • g

    Gosmokeless28

    12/30/2022, 2:51 AM
    For HiFi-GAN, training for as many epochs as you can actually causes the vocoder to perform better for the TalkNet model it belongs to.
  • x

    xomnow

    12/30/2022, 2:52 AM
    this is a bit of a weird question - roughly how many steps are in an epoch? I can't for the life of me get my local running copy to output steps/epochs as the colab one does. I ginned up a way to parse the log to see steps, but I don't see epochs
  • g

    Gosmokeless28

    12/30/2022, 2:55 AM
    Good question. That depends on the amount of data you're training the model with. If you're training HiFi-GAN with a large amount of data, there are many steps per epoch.
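
In the usual PyTorch-style training loop, an epoch is one full pass over the dataset and each batch is one optimizer step, so steps per epoch follow from dataset size and batch size. A back-of-the-envelope sketch with placeholder numbers (these are not the notebook's defaults):

```python
import math

# Placeholder numbers -- substitute your own dataset size and batch size
num_clips = 5200      # audio clips in the training set
batch_size = 8

steps_per_epoch = math.ceil(num_clips / batch_size)
print(steps_per_epoch)  # 650 -- one step per batch, one epoch per full pass over the data
```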
  • x

    xomnow

    12/30/2022, 2:56 AM
    I'm trying to look back on old runs on colab - looks like ~650/epoch, something in that neighborhood
  • x

    xomnow

    12/30/2022, 2:57 AM
    wow, so. something akin to ~2mil steps
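
Putting those two numbers together, roughly 650 steps per epoch for 3,100 epochs works out to about two million steps:

```python
steps_per_epoch = 650    # xomnow's estimate from old Colab runs
epochs = 3100            # Gosmokeless28's recommended epoch count for the HiFi-GAN vocoder

total_steps = steps_per_epoch * epochs
print(f"{total_steps:,}")  # 2,015,000 -- about 2 million steps
```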
  • h

    hudmaceachern

    12/31/2022, 2:07 PM
    for some reason now this happens:
    ---------------------------------------------------------------------------
    MessageError                              Traceback (most recent call last)
    in
          1 #@markdown Step 2: Mount Google Drive.
          2 from google.colab import drive
    ----> 3 drive.mount('drive')

    3 frames
    /usr/local/lib/python3.8/dist-packages/google/colab/_message.py in read_reply_from_input(message_id, timeout_sec)
        100             reply.get('colab_msg_id') == message_id):
        101           if 'error' in reply:
    --> 102             raise MessageError(reply['error'])
        103           return reply.get('data', None)
        104
    MessageError: Error: credential propagation was unsuccessful
  • r

    Reclezon

    12/31/2022, 2:54 PM
    It didn't sign into GDrive
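
Re-running the mount cell and completing the Google sign-in pop-up is usually what clears the "credential propagation was unsuccessful" error. The standard Colab mount call, for reference ('/content/drive' is the conventional mount point, and force_remount is optional):

```python
from google.colab import drive

# Re-run this and complete the Google account pop-up when it appears;
# force_remount=True retries even if a previous mount attempt half-succeeded.
drive.mount('/content/drive', force_remount=True)
```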
  • f

    Felipixel

    12/31/2022, 8:29 PM
    Hello, I have a question. I'm training a singing TalkNet model with 228 wavs; what's the best number of steps for it? I trained one model before with 150 wavs and, if I'm not wrong, I had around 4,000 steps. This new 228-wav model was trained for around 5,300 steps and it sounded horrible compared to the 150-wav one. Then I tried to test and train my old 150-wav model again, this time for 10k steps, and it ended up sounding as bad as the 228-wav one. TL;DR: what's the best number of steps when training a 228-wav model?
  • a

    Alexius08

    01/02/2023, 6:40 AM
    The reference audio I'm using is still messy after using vocal isolators on it. Is the "debug pitch" button revealing the version of the reference audio being used to generate the resulting audio? It's something I might be able to clean up.
  • u

    (Dawn) Will Draw Fictional Women

    01/02/2023, 7:03 AM
    no, it's
  • u

    (Dawn) Will Draw Fictional Women

    01/02/2023, 7:03 AM
    debugging the pitch detection
  • u

    (Dawn) Will Draw Fictional Women

    01/02/2023, 7:04 AM
    it takes the clip as a whole because it has to take into account both pitch and duration
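
For anyone who wants to eyeball a reference clip's pitch outside the notebook, here is a rough sketch of whole-clip f0 extraction using librosa's pyin. This is not TalkNet's own pitch-detection code, just an approximation for spotting noisy references (the file path is a placeholder):

```python
import librosa

# Placeholder path -- point this at your reference clip
y, sr = librosa.load("reference.wav", sr=22050)

# Estimate f0 over the whole clip; unvoiced frames come back as NaN
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)
print(f0[:20])  # frame-by-frame pitch in Hz, NaN where no voicing was detected
```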
  • w

    WeegeeFan1

    01/03/2023, 1:14 AM
    The first is SUPPOSED to be the regurgitation of the second. I trained the model on the dataset that includes the second clip. Something isn't normal
  • w

    WeegeeFan1

    01/03/2023, 1:15 AM
    I'm really hoping Diff SVC can take in some of TalkNet and the way it should function