# machine-learning
  • r

    Reclezon

    11/05/2022, 2:44 AM
    3 wavs? I'd aim for a loss value below 0.10 to start
  • r

    Reclezon

    11/05/2022, 2:45 AM
    This is obviously an improvement over sounding like LJSpeech, but it's not right on the mark.
  • a

    Amizade | Pony's voice creator

    11/05/2022, 2:46 AM
    Okay, but where did I go wrong? Was it the epoch?
  • r

    Reclezon

    11/05/2022, 2:48 AM
    Nothing wrong with the epoch, it's mostly the loss you need to watch for. Some models may need more epochs to train on, some may not
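
The advice above (watch the loss, not the epoch count) can be sketched as a simple stopping check. This is only an illustration: the 0.10 target comes from the chat, while the `patience` parameter, the function names, and the loss values in the usage note are assumptions made up for the example and are not part of any Uberduck notebook.

```python
def should_stop(loss_history, target=0.10, patience=3):
    """Return True once the last `patience` losses are all below `target`.

    `patience` is an assumption: it guards against stopping on one lucky epoch.
    """
    if len(loss_history) < patience:
        return False
    return all(loss < target for loss in loss_history[-patience:])


def epochs_needed(losses, target=0.10, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    for epoch in range(1, len(losses) + 1):
        if should_stop(losses[:epoch], target, patience):
            return epoch
    return None
```

With invented per-epoch losses such as `[0.5, 0.3, 0.12, 0.09, 0.08, 0.07]`, `epochs_needed` stops at the sixth epoch, because that is the first point where three consecutive values sit under 0.10. A different voice could reach the same loss in more or fewer epochs, which is why the epoch number alone says little.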
  • a

    Amizade | Pony's voice creator

    11/05/2022, 2:48 AM
    Oh, I understand
  • a

    Amizade | Pony's voice creator

    11/05/2022, 2:49 AM
    but it took me almost 1 hour to wait for the model to be ready
  • r

    Reclezon

    11/05/2022, 3:00 AM
    I've only gotten accidentally lucky twice on each nb so far.. 😅 Consistent testing definitely avoids the risk and any further pain of trying to correct models
  • r

    Reclezon

    11/05/2022, 3:02 AM
    The lucky ones weren't as great as they probably could've really been either imo
  • a

    Amizade | Pony's voice creator

    11/05/2022, 3:03 AM
    Later I will see how epoch works
  • r

    Reclezon

    11/05/2022, 3:18 AM
    Speaking of that, here is a sample from one of those lucky 1-wav attempts.
  • r

    Reclezon

    11/05/2022, 3:19 AM
    Can't dl audio from the first model
  • a

    Amizade | Pony's voice creator

    11/05/2022, 3:19 AM
  • a

    Amizade | Pony's voice creator

    11/05/2022, 3:20 AM
    The voice sounds so cool
  • a

    Amizade | Pony's voice creator

    11/05/2022, 3:20 AM
    not robotic and not repetitive
  • r

    Reclezon

    11/05/2022, 3:24 AM
    I did start training it with a very low LR and kinda fell asleep halfway through so I'm actually surprised it turned out good. Sample using the earliest model i can grab audio from which does use more wavs (even more if I can get my shit together and do it :))
  • a

    Amizade | Pony's voice creator

    11/05/2022, 3:27 AM
    that's nice. you guys from uberduck create a lot of perfect voices
  • w

    WeegeeFan1

    11/10/2022, 4:40 PM
    What does HiFi-GAN actually do?
  • w

    WeegeeFan1

    11/10/2022, 4:41 PM
    I know what the other 2 steps do, but not HiFi-GAN.
  • w

    WeegeeFan1

    11/10/2022, 4:41 PM
    I'm talking in relation to Talknet2 model training
  • h

    hecko

    11/10/2022, 5:04 PM
    basically, talknet doesn't output raw audio, because that's too hard to measure the quality of. instead it outputs something called a spectrogram, which is a 2d image of what frequencies play where, and then it's the job of hifi-gan to turn that into actual audio
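
To make the "2d image of frequencies" idea concrete, here is a minimal sketch that turns a test tone into a magnitude spectrogram using only the Python standard library. It is a simplification under stated assumptions: TTS models in this family typically use mel-scaled spectrograms, whereas this uses plain linear frequency bins, and the frame size, hop, sample rate, and function name are all chosen for illustration. The vocoder step (spectrogram back to audio) is what HiFi-GAN learns and is not shown.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists one magnitude per frequency bin."""
    # Hann window to reduce leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        bins = []
        # Naive DFT over the non-negative frequencies only.
        for k in range(frame_size // 2 + 1):
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            bins.append(abs(total))
        frames.append(bins)
    return frames


# Illustrative input: a 1000 Hz sine sampled at 8000 Hz (assumed values).
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(256)]

spec = magnitude_spectrogram(samples)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64  # bin index times bin width
```

Each row of `spec` is one time slice and each column is one frequency band, so the tone shows up as a bright line at the bin nearest 1000 Hz. A vocoder has to invent the phase information that this representation throws away, which is why it is a trained model rather than a simple inverse.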
  • w

    WeegeeFan1

    11/10/2022, 5:07 PM
    Ahh
  • w

    WeegeeFan1

    11/10/2022, 5:07 PM
    So why does running it longer make it sound better?
  • w

    WeegeeFan1

    11/10/2022, 5:12 PM
    Also, I have a singer who has a somewhat rigid range of singing. Because I have not found any data of him singing certain notes, the AI singing will yell the sounds of a seizure in place of those notes. But once I give it a note I've actually given it, it will do well. Is there a way to artificially generate these spaces in the vocal range? Is there a bit of the training I might be able to train longer to generate this?
  • h

    hecko

    11/10/2022, 5:32 PM
    because you're training it to get good at specifically converting the current model's output into stuff that sounds like the training data. the base model is already pretty good, granted
  • h

    hecko

    11/10/2022, 5:33 PM
    you could try making pitch-shifted copies of the data using something like melodyne or newtone or vocalshifter
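
The arithmetic behind pitch-shifted copies can be sketched as follows. This is a naive resampling approach, not what Melodyne, NewTone, or VocalShifter do: those tools preserve duration and formants, while plain resampling also shortens or lengthens the clip. The function names are hypothetical and the sketch only illustrates the semitone-to-ratio relationship.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a shift of `semitones` half-steps (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def naive_pitch_shift(samples, semitones):
    """Shift pitch by reading the samples faster or slower.

    Uses linear interpolation. Side effect: the duration changes by the
    same ratio, so this is a rough stand-in for a real pitch shifter.
    """
    ratio = semitone_ratio(semitones)
    out_len = int(len(samples) / ratio)
    out = []
    for i in range(out_len):
        pos = i * ratio                      # fractional read position
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)   # clamp at the last sample
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For example, `naive_pitch_shift(clip, 2)` would give a copy roughly one whole step higher and about 11% shorter. Small shifts of a few half-steps tend to be the safer choice for augmentation, since larger ones distort the voice's character more.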
  • w

    WeegeeFan1

    11/10/2022, 5:33 PM
    Would that be cheating or do you just need to do that sometimes?
  • h

    hecko

    11/10/2022, 5:33 PM
    why would it be cheating
  • h

    hecko

    11/10/2022, 5:34 PM
    why would anything be cheating really
  • w

    WeegeeFan1

    11/10/2022, 5:34 PM
    I'll start keeping a note of every half-step I have musical data of
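
Keeping track of which half-steps the dataset covers can be sketched with MIDI note numbers. The helper names and the example notes are assumptions for illustration; detecting the pitch of each recording is a separate step that is not shown here.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency, with A4 = 440 Hz = 69."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))


def midi_to_name(midi):
    """Note name with octave, using the convention that MIDI 60 is C4."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def missing_half_steps(covered_midi):
    """Half-steps between the lowest and highest covered note that have no data."""
    covered = set(covered_midi)
    return [m for m in range(min(covered), max(covered) + 1) if m not in covered]
```

If the recordings covered, say, MIDI notes 60, 62, 63, and 65, then `[midi_to_name(m) for m in missing_half_steps([60, 62, 63, 65])]` would list `C#4` and `E4` as the gaps, which are the notes worth filling with pitch-shifted copies.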
  • h

    hecko

    11/10/2022, 5:34 PM
    if it gives a better result then by all means do it