# machine-learning
  • m

    mepc36

    12/16/2022, 8:31 PM
    has anyone come across a tool that automatically removes unsuitable audio from a tts training dataset?
  • m

    mepc36

    12/16/2022, 8:32 PM
    Here are the classes of unsuitable audio I'd like it to detect: 1. Audio has too much noise in it. 2. Audio's speech does not match the transcription. 3. Speaker is speaking too quickly to be intelligible. 4. Audio captures a different speaker than the labeled speaker. 5. Audio speech contains out-of-vocabulary words.
  • m

    mepc36

    12/16/2022, 8:33 PM
    I'm about to build a solution to do this, so if anyone could save me a month of work by pointing me to an existing one, that'd be great, haha
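Two of the classes listed above (3, speaking too quickly, and 5, out-of-vocabulary words) can be roughly screened from the transcript and clip duration alone. The following is a minimal sketch under assumptions, not an existing tool: the function names, the 4.0 words-per-second threshold, and the vocabulary set are all hypothetical and would need tuning. Classes 1, 2, and 4 (noise, transcript mismatch, wrong speaker) need audio analysis and are not covered here.

```python
from __future__ import annotations

import re


def words_per_second(transcript: str, duration_s: float) -> float:
    """Rough speaking rate: word count divided by clip duration in seconds."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if duration_s <= 0:
        return float("inf")  # treat zero-length clips as unusable
    return len(words) / duration_s


def oov_words(transcript: str, vocabulary: set[str]) -> list[str]:
    """Return transcript words that are missing from the given vocabulary."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return [w for w in words if w not in vocabulary]


def flag_clip(
    transcript: str,
    duration_s: float,
    vocabulary: set[str],
    max_wps: float = 4.0,  # assumed threshold, not a standard value
) -> list[str]:
    """Return a list of reasons the clip looks unsuitable (empty if none)."""
    reasons = []
    if words_per_second(transcript, duration_s) > max_wps:
        reasons.append("too_fast")
    if oov_words(transcript, vocabulary):
        reasons.append("oov")
    return reasons
```

For example, `flag_clip("hello zorp", 2.0, {"hello", "world"})` flags the clip for `"oov"` because `zorp` is not in the vocabulary.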
  • u

    (Dawn) Will Draw Fictional Women

    12/16/2022, 9:11 PM
    >out of vocabulary words
  • u

    (Dawn) Will Draw Fictional Women

    12/16/2022, 9:11 PM
    would that be an issue???
  • h

    haru0l

    12/17/2022, 4:54 PM
    @Gosmokeless28 apologies for the ping ™️ but is it fine if i used your spongebob dataset to train on diff-svc?
  • g

    Gosmokeless28

    12/17/2022, 7:46 PM
    That depends: Which SpongeBob dataset did you use?
  • g

    Gosmokeless28

    12/17/2022, 7:47 PM
    Cuz one of them was originally made by Speaking of AI, not me
  • m

    MegaKeith

    12/18/2022, 1:36 AM
    Hi, I want to train a voice clone for steve jobs and I got an rtx 4090... I just wonder, is this gpu good enough for training?
  • g

    Gosmokeless28

    12/18/2022, 1:38 AM
    I assume so, but why do you need your own GPU? Are you going to train the model locally or something?
  • m

    MegaKeith

    12/18/2022, 1:41 AM
    Oh gotcha I can use Colab!
  • h

    hecko

    12/18/2022, 1:42 AM
    4090 is probably like 5x better than colab
  • h

    hecko

    12/18/2022, 1:42 AM
    that being said it does take some effort and knowledge to set up local training
  • h

    hecko

    12/18/2022, 1:43 AM
    i tried to make the pipeline notebook not depend on colab but it hasn't been tested outside of it
  • h

    hecko

    12/18/2022, 1:44 AM
    and it does still depend on linux, specifically debian/ubuntu/etc
  • m

    MegaKeith

    12/18/2022, 1:46 AM
    oops I do not have linux though 😢
  • h

    hecko

    12/18/2022, 1:49 AM
    though come to think of it, the parts that depend on linux are mostly just the dataset loader, which you probably won't need
  • h

    hecko

    12/18/2022, 1:50 AM
    i know @Justin trains talknet on windows, idk about tacotron though
  • h

    haru0l

    12/18/2022, 2:37 AM
    that would be the recent one in #835647732453605376
  • c

    Cris140

    12/18/2022, 12:20 PM
    It's easy to set up with Anaconda
  • g

    Gosmokeless28

    12/18/2022, 6:41 PM
    But there are two recent ones in datasets. Can you specify which one?
  • g

    GaryThisSide

    12/19/2022, 8:19 AM
    so hi
  • g

    GaryThisSide

    12/19/2022, 8:19 AM
    when i was trying to fit data in the model
  • g

    GaryThisSide

    12/19/2022, 8:19 AM
    random forest regressor
  • g

    GaryThisSide

    12/19/2022, 8:19 AM
    im getting this error
  • g

    GaryThisSide

    12/19/2022, 8:19 AM
    how i can deal with it
  • m

    mepc36

    12/19/2022, 2:30 PM
    I would think it was, why wouldn't it be? I always thought that transcription engines like Kaldi check audio against a pre-defined dictionary. At the very least I've gotten OOV errors when using forced word alignment tools like Gentle: https://github.com/lowerquality/gentle Is your experience different? I'd love to know if so, thanks @(Dawn) Will Draw Fictional Women
  • m

    mepc36

    12/19/2022, 2:32 PM
    On this topic, is there any solution for easily turning Colab notebooks into web-accessible servers with RESTful APIs? I had to spin up a JS server on a GPU from AWS' EC2 and call python3 as a child process in order to do some audio synthesis for the prod env of my app. It was super manual, ugh
  • h

    hecko

    12/19/2022, 2:49 PM
    i've seen some notebooks use ngrok and cloudflare and such, but for production i highly recommend against this, you'd have to manually restart it every day even with pro+
    you should be able to synthesize without a gpu though, last i heard uberduck synthesizes on cpu
  • m

    mepc36

    12/19/2022, 2:50 PM
    oh thats awesome if thats true, ill look into it! thanks
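For the REST question above, one option is to serve synthesis straight from Python instead of spawning python3 from a JS server. The following is a minimal sketch using only the standard library. The `synthesize` function is a hypothetical placeholder (it just echoes the text as bytes) and the `/synthesize` route and port are assumptions; a real setup would swap in the actual model call.

```python
from __future__ import annotations

import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def synthesize(text: str) -> bytes:
    """Hypothetical placeholder: replace with a real TTS call that returns audio bytes."""
    return text.encode("utf-8")


def handle_synthesis_request(body: bytes) -> tuple[int, str, bytes]:
    """Validate a JSON body like {"text": "..."} and return (status, content type, payload)."""
    try:
        text = json.loads(body)["text"]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON body with a 'text' field"}
        return 400, "application/json", json.dumps(error).encode("utf-8")
    if not isinstance(text, str) or not text.strip():
        error = {"error": "'text' must be a non-empty string"}
        return 400, "application/json", json.dumps(error).encode("utf-8")
    return 200, "application/octet-stream", synthesize(text)


class SynthesisHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/synthesize":  # assumed route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, content_type, data = handle_synthesis_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port: int = 8000) -> HTTPServer:
    """Build the server; call .serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), SynthesisHandler)
```

Calling `make_server().serve_forever()` starts it, after which a POST to `/synthesize` with `{"text": "hello"}` returns the bytes from `synthesize`. Keeping the request logic in `handle_synthesis_request` lets it be tested without opening a socket.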