# tacotron-2-support
  • e

    Eugene

    10/21/2022, 11:15 PM
    Aight
  • r

    Reclezon

    10/21/2022, 11:17 PM
    Phone's about dead so brb ig
  • e

    Eugene

    10/21/2022, 11:17 PM
    Thank you guys for helping me :)
  • r

    Reclezon

    10/21/2022, 11:20 PM
    Does sound a bit grainy for background noise but the voice is very good otherwise.
  • e

    Eugene

    10/21/2022, 11:24 PM
    Is there anything to conclude from these images?
  • u

    {K EY1} (Kei)

    10/21/2022, 11:27 PM
    That means the model is very good
  • u

    {K EY1} (Kei)

    10/21/2022, 11:27 PM
    The first thing is a map of the generated phonetics. The one on the right is an alignment graph (a straight orange diagonal line is good). Emojis are just whatever is detected in the text.
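
    The "straight diagonal is good" check can also be put into a number. Below is a minimal sketch, not the tool's actual code: it assumes the alignment is available as a plain list of rows, one row per decoder (audio) step, each holding attention weights over the encoder (text) positions. The function name `diagonality` and that layout are assumptions made for illustration. Real speech never lines up perfectly, since sounds differ in length, so treat the score as a rough indicator only.

```python
def diagonality(alignment):
    """Score in [0, 1]; 1.0 means every attention peak sits on the diagonal.

    `alignment` is a list of rows (hypothetical layout): one row per decoder
    step, each row holding the attention weights over encoder positions.
    """
    n_dec = len(alignment)
    n_enc = len(alignment[0])
    if n_dec < 2 or n_enc < 2:
        raise ValueError("need at least 2 decoder steps and 2 encoder positions")

    total_dev = 0.0
    for t, row in enumerate(alignment):
        # Encoder position that receives the most attention at this step.
        peak = max(range(n_enc), key=row.__getitem__)
        # Where the peak would be if attention moved along a straight diagonal.
        expected = t * (n_enc - 1) / (n_dec - 1)
        # Deviation from the diagonal, normalized by the text length.
        total_dev += abs(peak - expected) / (n_enc - 1)

    return 1.0 - total_dev / n_dec


# A perfectly diagonal alignment versus one stuck on the first text position.
diagonal = [[1.0 if i == t else 0.0 for i in range(4)] for t in range(4)]
stuck = [[1.0, 0.0, 0.0, 0.0] for _ in range(4)]
print(diagonality(diagonal), diagonality(stuck))
```

    A score that stays well below 1 usually corresponds to a smeared or broken line in the plot.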
  • e

    Eugene

    10/21/2022, 11:29 PM
    Hmm
  • e

    Eugene

    10/21/2022, 11:30 PM
    I do hope that with more voice lines the model could become more realistic.
  • e

    Eugene

    10/21/2022, 11:30 PM
    And I also find quite a few words in the output audio hard to understand.
  • r

    Reclezon

    10/21/2022, 11:42 PM
    More is always better, though mind that rule about exponential decreases or whatever.
  • e

    Eugene

    10/21/2022, 11:46 PM
    What's that rule?
  • l

    legorunnerkid

    10/21/2022, 11:55 PM
    I remember having my model sound very weird when I thought I had it.
  • l

    legorunnerkid

    10/21/2022, 11:55 PM
    I believe it had something to do with me not waiting enough.
  • r

    Reclezon

    10/22/2022, 12:16 AM
    Law of diminishing returns! I remember now
  • l

    legorunnerkid

    10/22/2022, 12:53 AM
    What's interesting about the source I'm using is that certain cutscenes with echoes have two audio channels.
  • l

    legorunnerkid

    10/22/2022, 1:25 AM
    I finally got 200 voiceclips transcribed now.
  • l

    legorunnerkid

    10/22/2022, 1:29 AM
    I've got about 3 minutes of voicelines.
  • u

    🏝✍Berkeley•Ray•Brewski

    10/22/2022, 1:48 AM
    As someone who went through 165, I can't help but be proud of you!
  • l

    legorunnerkid

    10/22/2022, 1:49 AM
    Oh wait, I was meant to post this in #768215837248716819.
  • l

    legorunnerkid

    10/22/2022, 1:49 AM
    Since I technically don't need help.
  • g

    Gamma Prime

    10/22/2022, 3:48 AM
    Does setting a batch size of 1 hurt models?
  • g

    Gamma Prime

    10/22/2022, 3:48 AM
    I did that by accident and didn't realize it until training started. 50 epochs in, the model was stroking pretty bad.
  • g

    Gosmokeless28

    10/22/2022, 5:05 AM
    Yes
  • g

    Gamma Prime

    10/22/2022, 5:10 AM
    I had a few samples in the dataset that might not have been doing him favors either. Turns out that Adobe Shasta maybe shouldn't be used to clean Transformer voices.
  • e

    Eugene

    10/22/2022, 2:02 PM
    Is it hard to make a model that works with reference audio (instead of text to speech)?
  • g

    Gosmokeless28

    10/22/2022, 2:25 PM
    No
  • a

    AhmadGT

    10/22/2022, 3:14 PM
    is it normal that colab deletes everything when it disconnects?
  • a

    AhmadGT

    10/22/2022, 3:25 PM
    also is the pipeline better or the legacy?
  • r

    Reclezon

    10/22/2022, 3:41 PM
    Pipeline