# machine-learning
  • w

    WeegeeFan1

    12/24/2022, 11:28 PM
    And I do all of my synthesis locally so that would take a lot of it
  • w

    WeegeeFan1

    12/24/2022, 11:29 PM
    This was supposed to be twice a WEEK not month
  • a

    Amizade | Pony's voice creator

    12/25/2022, 8:05 PM
    Window size: 320
    Model: Vocal_HP_4BAND_3090_arch-124m
    Parameter: 4band_44100
    Aggressiveness: 0.3
    High end process: mirroring2
    TTA: True
    Deep Extraction: False
    
    loading model... Model loading failed, trying again...
    loading model... Model loading failed, trying again...
    loading model... Model loading failed, trying again...
    loading model... Model loading failed, trying again...
    An error has occurred: [Errno 2] No such file or directory: 'models/v5_new/Vocal_HP_4BAND_3090_arch-124m.pth'
    ---------------------------------------------------------------------------
    FileNotFoundError                         Traceback (most recent call last)
    /usr/lib/python3.8/shutil.py in move(src, dst, copy_function)
        790     try:
    --> 791         os.rename(src, real_dst)
        792     except OSError:
    
    FileNotFoundError: [Errno 2] No such file or directory: 'separated/1_Vocal_HP_4BAND_3090_arch-124m_Instruments.wav' -> '1/1_Vocal_HP_4BAND_3090_arch-124m_Instruments.wav'
    
    During handling of the above exception, another exception occurred:
    
    FileNotFoundError                         Traceback (most recent call last)
    4 frames
    /usr/lib/python3.8/shutil.py in copyfile(src, dst, follow_symlinks)
        262         os.symlink(os.readlink(src), dst)
        263     else:
    --> 264         with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
        265             # macOS
        266             if _HAS_FCOPYFILE:
    
    FileNotFoundError: [Errno 2] No such file or directory: 'separated/1_Vocal_HP_4BAND_3090_arch-124m_Instruments.wav'
  • a

    Amizade | Pony's voice creator

    12/25/2022, 8:05 PM
    how can i fix it?
  • h

    hecko

    12/25/2022, 8:23 PM
    uhhh good question, it seems the file host they used for the model broke. i propose https://mvsep.com/ instead
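The traceback above looks like two failures chained together: the checkpoint at `models/v5_new/...` was never downloaded, so no separated `.wav` was written, and the later `shutil.move` then raised its own `FileNotFoundError`. The notebook's actual code is not shown in this log, so the following is only a minimal sketch of how such a script could fail earlier and more clearly. The helper names `check_model` and `move_if_present` are hypothetical, not part of any notebook or library.

```python
import shutil
from pathlib import Path


def check_model(model_path):
    """Raise a clear error up front if the checkpoint file is not on disk."""
    model_path = Path(model_path)
    if not model_path.is_file():
        raise FileNotFoundError(f"model checkpoint missing: {model_path}")
    return model_path


def move_if_present(src, dst_dir):
    """Move src into dst_dir and return the new path, or None if src is missing."""
    src = Path(src)
    if not src.is_file():
        # Nothing was produced upstream, so skip the move instead of crashing.
        return None
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(dst_dir / src.name)))
```

Calling `check_model(...)` before separation would surface the missing download directly, instead of the secondary error about the `_Instruments.wav` file.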
  • a

    Amizade | Pony's voice creator

    12/25/2022, 8:25 PM
    But do I have to wait in the queue?
  • a

    Amizade | Pony's voice creator

    12/25/2022, 8:25 PM
    I've used this site before and it takes forever to wait
  • h

    hecko

    12/25/2022, 8:29 PM
    odd, every time i used it it was pretty quick
  • h

    hecko

    12/25/2022, 8:30 PM
    you can also run Ultimate Vocal Remover on your own computer but it takes up several gigabytes
  • a

    Amizade | Pony's voice creator

    12/25/2022, 8:32 PM
    Is it from Colab?
  • h

    hecko

    12/25/2022, 8:33 PM
    no
  • h

    hecko

    12/25/2022, 8:33 PM
    colab only hosts the thing
  • h

    hecko

    12/25/2022, 8:34 PM
    but it's the same model as on colab
  • t

    TheRoyalRuby2000

    12/25/2022, 8:34 PM
    I finally have the voice data I need for Elmer. Of course I transcribe the first line as: "Shh! Be very quiet, I'm hunting rabbits." Right?
  • a

    Amizade | Pony's voice creator

    12/25/2022, 8:35 PM
    oh ok
  • a

    Amizade | Pony's voice creator

    12/25/2022, 8:35 PM
    thanks
  • h

    hecko

    12/25/2022, 8:37 PM
    yeah
  • t

    TheRoyalRuby2000

    12/25/2022, 8:40 PM
    At least he's FINALLY going to say Shhh! without the messed up transcription that has "Shh!" removed from Jacob's version. Jacob if you're reading this, no offense dude, but I think that model needs some work and I'm here to help.
  • j

    Justin

    12/27/2022, 4:18 AM
    https://github.com/AbdullahAlfaraj/Auto-Photoshop-StableDiffusion-Plugin
  • m

    mepc36

    12/29/2022, 10:43 PM
    Anyone know a TTS library where you can dictate the rhythm of the output using alphanumeric characters, instead of a reference track?
  • m

    mepc36

    12/29/2022, 10:43 PM
    Something that could be called from a CLI like this:
  • m

    mepc36

    12/29/2022, 10:44 PM
    /path/to/hypotheticalTTSbinary --output_text "Hello world" --output_rhythm 101
  • m

    mepc36

    12/29/2022, 10:45 PM
    The 1s indicate emphasized syllables, and the 0s indicate non-emphasized syllables
  • m

    mepc36

    12/29/2022, 10:45 PM
    I basically need a TTS library whose outputted .wavs have rhythms I can control programmatically by some method other than a reference track
  • m

    mepc36

    12/29/2022, 10:47 PM
    A package that implements Speech Synthesis Markup Language (SSML) might work: https://www.w3.org/TR/speech-synthesis11/
  • h

    hecko

    12/29/2022, 11:07 PM
    you may be looking for arpabet
  • h

    hecko

    12/29/2022, 11:07 PM
    it's supported by some of the voices on uberduck
  • h

    hecko

    12/29/2022, 11:07 PM
    you can convert with https://app.uberduck.ai/g2p
  • h

    hecko

    12/29/2022, 11:08 PM
    and then adjust the numbers: 1 is stress, 0 is no stress, 2 is half stress
  • u

    Uberduck

    12/29/2022, 11:08 PM
    @hecko quacked: hello world. {HH AH0 L OW1} {W ER1 L D}. {HH EH1 L OW0} {W ER1 L D}.
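The idea in this exchange (a digit string such as `101` driving the stress digits of an ARPAbet transcription) can be sketched in a few lines of Python. This is an illustration only: `apply_stress` is a hypothetical helper, not an Uberduck API, and it assumes each word is already a space-separated ARPAbet string in which every vowel ends in a stress digit, as in the bot output above.

```python
import re

# In ARPAbet, only vowels carry a trailing stress digit (0, 1 or 2).
STRESS_DIGIT = re.compile(r"[012]$")


def apply_stress(arpabet_words, pattern):
    """Overwrite vowel stress digits, in order, with the digits in pattern."""
    if not re.fullmatch(r"[012]*", pattern):
        raise ValueError("pattern may only contain the digits 0, 1 and 2")
    digits = iter(pattern)
    out = []
    for word in arpabet_words:
        phones = []
        for ph in word.split():
            if STRESS_DIGIT.search(ph):
                digit = next(digits, None)
                if digit is None:
                    raise ValueError("pattern has fewer digits than vowels")
                ph = ph[:-1] + digit
            phones.append(ph)
        # Curly braces mark ARPAbet input, as in the bot message above.
        out.append("{" + " ".join(phones) + "}")
    if next(digits, None) is not None:
        raise ValueError("pattern has more digits than vowels")
    return " ".join(out)


print(apply_stress(["HH AH0 L OW1", "W ER1 L D"], "101"))
# prints: {HH AH1 L OW0} {W ER1 L D}
```

The result could then be pasted into a voice that supports ARPAbet. How strongly a given voice actually follows the stress digits is not something this log establishes.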