# tacotron-2-support
  • b

    Butt Witch Simp

    12/20/2022, 2:49 AM
  • r

    Reclezon

    12/20/2022, 2:50 AM
    The most annoying part is actually collecting the audio. Up to 99% of the rest, like transcription, can be done automatically, though you'll usually have to correct errors in the automatic output.
  • r

    Reclezon

    12/20/2022, 2:51 AM
    Once you do get the audio, and before cutting it, save some time and run it through a cleaner like mvsep.com so you don't have to clean individual files.
  • r

    Reclezon

    12/20/2022, 2:51 AM
    It may take longer to upload many smaller files than one long one.
  • p

    Pastelgothsloth

    12/20/2022, 2:52 AM
    So I could stitch together the files into a single longer file? And then run it through the cleaner?
  • r

    Reclezon

    12/20/2022, 2:53 AM
    ?
  • r

    Reclezon

    12/20/2022, 2:54 AM
    Uh,
  • r

    Reclezon

    12/20/2022, 2:56 AM
    • Download the source file (mp4, mp3, whatever)
    • If there's background noise, clean the audio
    • Once you have clean audio, identify the correct speaker and cut it into wavs
  • r

    Reclezon

    12/20/2022, 2:57 AM
    Must be 16-bit wavs at 22050 Hz
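
A quick way to confirm that a clip meets the 16-bit, 22050 Hz requirement is to read its header. This is a minimal sketch using Python's standard `wave` module; the function name `wav_format_problems` is made up for illustration and is not part of any Uberduck tooling.

```python
import wave


def wav_format_problems(path, expected_rate=22050, expected_sample_width=2):
    """Return a list of problems; an empty list means the file matches.

    expected_sample_width is in bytes, so 2 means 16-bit samples.
    """
    problems = []
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        if rate != expected_rate:
            problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
        if width != expected_sample_width:
            problems.append(
                f"sample width is {8 * width} bit, expected {8 * expected_sample_width} bit"
            )
    return problems
```

Files that fail the check need to be resampled with an audio tool before training; ffmpeg's `-ar 22050` and `-c:a pcm_s16le` output options are one common way to do that.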
  • r

    Reclezon

    12/20/2022, 2:57 AM
    I'd say stitching them back together after cutting isn't ideal?
  • r

    Reclezon

    12/20/2022, 2:57 AM
    You're still separating them again either way
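
The order described above (clean the long recording first, then cut) could be sketched as follows. This assumes the cleaned recording is already a wav file and that you have worked out the start and end times of each utterance yourself; `cut_wav` and the `clip_{:04d}.wav` naming are hypothetical, not something the thread prescribes.

```python
import wave


def cut_wav(source_path, segments, output_pattern="clip_{:04d}.wav"):
    """Write each (start_seconds, end_seconds) span of source_path to its own wav.

    Returns the list of paths written, in the same order as segments.
    """
    written = []
    with wave.open(str(source_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for index, (start, end) in enumerate(segments, start=1):
            first_frame = int(start * rate)
            frame_count = int(end * rate) - first_frame
            src.setpos(first_frame)
            frames = src.readframes(frame_count)
            out_path = output_pattern.format(index)
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
    return written
```

Because the clips inherit the source's sample rate and bit depth, converting the long file once before cutting means every clip comes out in the right format.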
  • p

    Pastelgothsloth

    12/20/2022, 2:57 AM
    Ok, gotchu
  • b

    Butt Witch Simp

    12/20/2022, 2:59 AM
    I don't understand what any of this means so you'll have to explain that. xD
  • r

    Reclezon

    12/20/2022, 3:01 AM
    Reposting this as a reference, in case it helps show how to format it. Not sure how else to explain.
  • b

    Butt Witch Simp

    12/20/2022, 3:06 AM
    Your destination will be on the right?? 😭
  • p

    Pastelgothsloth

    12/20/2022, 3:09 AM
    Okay, that makes sense!
  • r

    Reclezon

    12/20/2022, 3:12 AM
    Google Maps. Just a random file used as an example.
  • r

    Reclezon

    12/20/2022, 3:14 AM
    If you were to make one yourself from a different voice, you'd follow the same format of
    directory/file.wav|transcript
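
As an illustration of the `directory/file.wav|transcript` format, here is a small sketch that builds and checks such lines. It assumes exactly two fields per line, as in the message above; the helper names and the `wavs/0001.wav` style paths are invented for the example.

```python
def format_filelist_line(wav_path, transcript):
    """Build one 'path|transcript' line, collapsing stray whitespace."""
    if "|" in wav_path or "|" in transcript:
        raise ValueError("the '|' character is reserved as the field separator")
    return f"{wav_path}|{' '.join(transcript.split())}"


def parse_filelist(text):
    """Parse filelist text into (wav_path, transcript) pairs, skipping blank lines."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        parts = line.split("|")
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            raise ValueError(f"line {number} is not 'path|transcript': {line!r}")
        entries.append((parts[0].strip(), parts[1].strip()))
    return entries
```

Running `parse_filelist` over a finished filelist before training is a cheap way to catch a missing transcript or a stray separator.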
  • t

    Trilly

    12/20/2022, 3:18 AM
    Does the API work with all voices or only commercial voices?
  • b

    Butt Witch Simp

    12/20/2022, 3:59 AM
    Do we need to pay money for this in any way btw?
  • r

    Reclezon

    12/20/2022, 4:10 AM
    No.
  • r

    Reclezon

    12/20/2022, 4:15 AM
    Shit's open source too.
  • g

    Gosmokeless28

    12/20/2022, 4:33 AM
    Not unless you want a commission granter to do the work for you
  • b

    Butt Witch Simp

    12/20/2022, 4:49 AM
    Ah okay, then there’s no problem. ^^
  • h

    hecko

    12/20/2022, 1:50 PM
    it's allowed, but if they don't sound like the character then it might make the model worse. if you insist though (and are using pipeline) then i recommend you put them in separate datasets so that the ai knows it's only supplementary data and not the exact voice it's supposed to do
  • h

    hecko

    12/20/2022, 8:17 PM
    ok i figured it out, `prepare_input_sequence` has the input `text_cleaner` which i hadn't set so it defaulted to english. sorry it took so long, i forgot about it until the turkish team stumbled upon the same issue
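
To show why an unset cleaner argument can quietly break non-English input, here is a toy sketch. It is not the real `prepare_input_sequence`: `prepare_text`, `english_like_cleaner` and `basic_cleaner` are hypothetical, and the assumption that an English cleaner reduces text to ASCII is only a stand-in for whatever the actual cleaner does.

```python
import unicodedata


def english_like_cleaner(text):
    """Toy stand-in for an English cleaner: keep only ASCII, then lowercase."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return ascii_only.lower()


def basic_cleaner(text):
    """Language-neutral cleaner: lowercase and collapse whitespace only."""
    return " ".join(text.lower().split())


CLEANERS = {"english": english_like_cleaner, "basic": basic_cleaner}


def prepare_text(text, text_cleaner="english"):
    """Hypothetical helper whose default cleaner is English, as in the bug described."""
    return CLEANERS[text_cleaner](text)


if __name__ == "__main__":
    word = "ışık"  # Turkish for "light"
    print(repr(prepare_text(word)))                        # default cleaner
    print(repr(prepare_text(word, text_cleaner="basic")))  # explicit cleaner
```

With the default, letters outside ASCII are dropped or stripped of their marks without any error, so the model receives different text from what was typed. Passing the cleaner explicitly avoids relying on the default.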
  • p

    PixPrucer

    12/20/2022, 8:18 PM
    Hihi thank you for eventually fixing it 🙏
  • h

    hecko

    12/20/2022, 8:20 PM
    now you're on the bugfinder list
  • p

    PixPrucer

    12/20/2022, 8:20 PM
    nifty
  • p

    PixPrucer

    12/20/2022, 8:20 PM
    I've got to test the synthesis notebook again sometime soon