Uberduck #machine-learning

WeegeeFan1

12/24/2022, 11:28 PM

And I do all of my synthesis locally so that would take a lot of it

WeegeeFan1

12/24/2022, 11:29 PM

This was supposed to be twice a WEEK not month

Amizade | Pony's voice creator

12/25/2022, 8:05 PM

Window size: 320
Model: Vocal_HP_4BAND_3090_arch-124m
Parameter: 4band_44100
Aggressiveness: 0.3
High end process: mirroring2
TTA: True
Deep Extraction: False

loading model... Model loading failed, trying again...
loading model... Model loading failed, trying again...
loading model... Model loading failed, trying again...
loading model... Model loading failed, trying again...
An error has occurred: [Errno 2] No such file or directory: 'models/v5_new/Vocal_HP_4BAND_3090_arch-124m.pth'
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/usr/lib/python3.8/shutil.py in move(src, dst, copy_function)
    790     try:
--> 791         os.rename(src, real_dst)
    792     except OSError:

FileNotFoundError: [Errno 2] No such file or directory: 'separated/1_Vocal_HP_4BAND_3090_arch-124m_Instruments.wav' -> '1/1_Vocal_HP_4BAND_3090_arch-124m_Instruments.wav'

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
4 frames
/usr/lib/python3.8/shutil.py in copyfile(src, dst, follow_symlinks)
    262         os.symlink(os.readlink(src), dst)
    263     else:
--> 264         with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
    265             # macOS
    266             if _HAS_FCOPYFILE:

FileNotFoundError: [Errno 2] No such file or directory: 'separated/1_Vocal_HP_4BAND_3090_arch-124m_Instruments.wav'

Amizade | Pony's voice creator

12/25/2022, 8:05 PM

how can i fix it?

hecko

12/25/2022, 8:23 PM

uhhh good question, it seems the file host they used for the model broke. i propose https://mvsep.com/
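The first error in the log above is the missing model file; the later shutil errors most likely follow from it, since no separated output was ever written. A minimal sketch of checking for that before running separation is below. The helper name `missing_model_files` is hypothetical and not part of the Colab notebook; the directory and model name are taken from the error message.

```python
from pathlib import Path


def missing_model_files(model_dir, model_names, suffix=".pth"):
    """Return the expected model paths that are not present on disk.

    Hypothetical helper: the notebook in the log does not provide this.
    """
    expected = [Path(model_dir) / f"{name}{suffix}" for name in model_names]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Path and model name copied from the error message in the log.
    missing = missing_model_files("models/v5_new", ["Vocal_HP_4BAND_3090_arch-124m"])
    for path in missing:
        print(f"model file not found, download probably failed: {path}")
```

If the list is non-empty, the download step failed and rerunning the separation cell will not help until the model file is actually in place.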

Amizade | Pony's voice creator

12/25/2022, 8:25 PM

But do I have to wait in the queue?

Amizade | Pony's voice creator

12/25/2022, 8:25 PM

I've used this site before and it takes forever to wait

hecko

12/25/2022, 8:29 PM

odd, every time i used it it was pretty quick

hecko

12/25/2022, 8:30 PM

you can also run Ultimate Vocal Remover on your own computer but it takes up several gigabytes

Amizade | Pony's voice creator

12/25/2022, 8:32 PM

Is it from Colab?

hecko

12/25/2022, 8:33 PM

colab only hosts the thing

hecko

12/25/2022, 8:34 PM

but it's the same model as on colab

TheRoyalRuby2000

12/25/2022, 8:34 PM

I finally have the voice data I need for Elmer. Of course I transcribe the first line as: "Shh! Be very quiet, I'm hunting rabbits." Right?

Amizade | Pony's voice creator

12/25/2022, 8:35 PM

oh ok

Amizade | Pony's voice creator

12/25/2022, 8:35 PM

thanks

hecko

12/25/2022, 8:37 PM

yeah

TheRoyalRuby2000

12/25/2022, 8:40 PM

At least he's FINALLY going to say Shhh! without the messed up transcription that has "Shh!" removed from Jacob's version. Jacob if you're reading this, no offense dude, but I think that model needs some work and I'm here to help.

Justin

12/27/2022, 4:18 AM

https://github.com/AbdullahAlfaraj/Auto-Photoshop-StableDiffusion-Plugin

mepc36

12/29/2022, 10:43 PM

Anyone know a TTS library where you can dictate the rhythm of the output using alphanumeric characters, instead of a reference track?

mepc36

12/29/2022, 10:43 PM

Something that could be called from a CLI like this:

mepc36

12/29/2022, 10:44 PM

/path/to/hypotheticalTTSbinary --output_text "Hello world" --output_rhythm 101

mepc36

12/29/2022, 10:45 PM

The 1's indicate emphasized syllables, and the 0's indicate non-emphasized syllables

mepc36

12/29/2022, 10:45 PM

I basically need a TTS library whose outputted .wavs have rhythms I can control programmatically by some method other than a reference track

mepc36

12/29/2022, 10:47 PM

A package that implements Speech Synthesis Markup Language (SSML) might work: https://www.w3.org/TR/speech-synthesis11/
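A rough sketch of what the SSML route could look like, assuming a TTS engine that honors the `<emphasis>` element from the SSML 1.1 spec linked above. Note that `<emphasis>` applies to words or phrases, not individual syllables, so this is coarser than the `--output_rhythm 101` idea. The function name `ssml_with_emphasis` is made up for illustration.

```python
from xml.sax.saxutils import escape

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def ssml_with_emphasis(words, flags, lang="en-US"):
    """Build an SSML 1.1 document, wrapping flagged words in <emphasis>.

    words: list of words; flags: list of booleans, True = emphasized.
    """
    if len(words) != len(flags):
        raise ValueError("words and flags must have the same length")
    parts = []
    for word, emphasized in zip(words, flags):
        text = escape(word)  # escape &, <, > so the output stays valid XML
        if emphasized:
            parts.append(f'<emphasis level="strong">{text}</emphasis>')
        else:
            parts.append(text)
    body = " ".join(parts)
    return f'<speak version="1.1" xmlns="{SSML_NS}" xml:lang="{lang}">{body}</speak>'


if __name__ == "__main__":
    print(ssml_with_emphasis(["Hello", "world"], [True, False]))
```

Whether a given engine actually changes the rhythm for `<emphasis>` varies, so this would need testing against the specific TTS backend.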

hecko

12/29/2022, 11:07 PM

you may be looking for arpabet

hecko

12/29/2022, 11:07 PM

it's supported by some of the voices on uberduck

hecko

12/29/2022, 11:07 PM

you can convert with https://app.uberduck.ai/g2p

hecko

12/29/2022, 11:08 PM

and then adjust the numbers: 1 is stress, 0 is no stress, 2 is half stress

Uberduck

12/29/2022, 11:08 PM

@hecko quacked: hello world. {HH AH0 L OW1} {W ER1 L D}. {HH EH1 L OW0} {W ER1 L D}.
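The stress edit shown in the bot output above can be sketched in code: take an ARPAbet string and overwrite the stress digit on each vowel with the digits of a pattern like `101`. This is only an illustration; `apply_stress` is a hypothetical helper, not an Uberduck API, and it assumes vowels are written as two uppercase letters followed by 0, 1, or 2, as in the output above.

```python
import re

# An ARPAbet vowel here: two uppercase letters plus a stress digit (0, 1 or 2).
_VOWEL = re.compile(r"\b([A-Z]{2})[012]\b")


def apply_stress(arpabet, pattern):
    """Replace each vowel's stress digit, in order, with the digits in pattern."""
    if not re.fullmatch(r"[012]*", pattern):
        raise ValueError("pattern may only contain the digits 0, 1 and 2")
    vowel_count = len(_VOWEL.findall(arpabet))
    if vowel_count != len(pattern):
        raise ValueError(
            f"pattern has {len(pattern)} digits but the text has {vowel_count} vowels"
        )
    digits = iter(pattern)
    # Keep the vowel letters, swap in the next digit from the pattern.
    return _VOWEL.sub(lambda m: m.group(1) + next(digits), arpabet)


if __name__ == "__main__":
    print(apply_stress("{HH AH0 L OW1} {W ER1 L D}", "101"))
```

Consonants and the curly braces pass through untouched, so the result can be pasted back into a voice that accepts ARPAbet input.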