Uberduck #machine-learning

hecko

12/29/2022, 11:09 PM

iiiiiit's subtle

(Dawn) Will Draw Fictional Women

12/29/2022, 11:14 PM

i was about to test this with a low data voice of mine but then i realized most of them are pipeline

hecko

12/29/2022, 11:16 PM

and?

(Dawn) Will Draw Fictional Women

12/29/2022, 11:17 PM

the plaintext one would be autoconverted

(Dawn) Will Draw Fictional Women

12/29/2022, 11:17 PM

redundant central

hecko

12/29/2022, 11:18 PM

i mean this one was autoconverted too

(Dawn) Will Draw Fictional Women

12/29/2022, 11:18 PM

i thought chills was legacy?

hecko

12/29/2022, 11:18 PM

was gonna remove it but didn't for i guess clarity

hecko

12/29/2022, 11:18 PM

yeah but arpabet

hecko

12/29/2022, 11:18 PM

in fact he was the arpa base

(Dawn) Will Draw Fictional Women

12/29/2022, 11:18 PM

oh pure arpa?

(Dawn) Will Draw Fictional Women

12/29/2022, 11:18 PM

i thought it was a mixed model

hecko

12/29/2022, 11:18 PM

not sure

(Dawn) Will Draw Fictional Women

12/29/2022, 11:19 PM

i thought you were comparing plaintext to the two arpa strings

hecko

12/29/2022, 11:19 PM

point is he has arpa on and that means everything gets autoconverted

(Dawn) Will Draw Fictional Women

12/29/2022, 11:20 PM

do i have any mixed models up on the site i wonder…

(Dawn) Will Draw Fictional Women

12/29/2022, 11:20 PM

i cant check because profile voice lists still broken

mepc36

12/29/2022, 11:37 PM

Good look. I need to be able to deviate from a word's normal verbal stress though (which is what ARPAbet identifies). This is because I'm creating music out of the text, not just speech, so sometimes a word will deviate from its normal pronunciation, and sometimes we need to ignore certain small words (like "today") altogether.
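A minimal sketch of overriding stress by hand, assuming the phonemes are available as a space-separated ARPAbet string in the CMUdict style, where each vowel ends in a stress digit (0 = none, 1 = primary, 2 = secondary). The `restress` helper is hypothetical, not part of Uberduck or any library.

```python
import re
from typing import Sequence

# ARPAbet vowels are two letters followed by a stress digit; consonants carry no digit.
VOWEL = re.compile(r"^([A-Z]{2})([012])$")

def restress(phonemes: str, stresses: Sequence[int]) -> str:
    """Replace the stress digit on each vowel, in order, with the given values.

    Vowels beyond the end of `stresses` keep their original digit.
    """
    remaining = iter(stresses)
    out = []
    for symbol in phonemes.split():
        match = VOWEL.match(symbol)
        if match:
            digit = next(remaining, match.group(2))
            out.append(f"{match.group(1)}{digit}")
        else:
            out.append(symbol)
    return " ".join(out)

# "today" is normally stressed on the second syllable; move the stress to the first.
print(restress("T AH0 D EY1", [1, 0]))
```

Whether a given voice model actually honors the changed digits depends on how it was trained, so treat the result as something to test by ear.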

mepc36

12/29/2022, 11:37 PM

I think I found the answer though. I'm surprised a search for "SSML" (and "Speech Synthesis Markup Language") on this discord server came up nil.

mepc36

12/29/2022, 11:38 PM

For posterity, I'm trying AWS Polly to do this: https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html. I tried Google's Text-To-Speech but their authentication system is so unnecessarily complicated to me.
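A rough sketch of what that could look like with Polly, assuming the lyrics arrive as (word, emphasis level) pairs. The `build_ssml` and `synthesize` helpers and the "Joanna" voice choice are illustrative assumptions; `<emphasis>` is on Polly's supported tags list but, as far as I know, only for standard voices. Small words are ignored simply by leaving them out of the SSML.

```python
from xml.sax.saxutils import escape

def build_ssml(tokens, skip=frozenset()):
    """Build an SSML string from (word, emphasis_level_or_None) pairs.

    Words whose lowercase form is in `skip` are dropped entirely.
    """
    parts = []
    for word, level in tokens:
        if word.lower() in skip:
            continue
        text = escape(word)  # escape &, <, > so the markup stays well-formed
        if level:
            text = f'<emphasis level="{level}">{text}</emphasis>'
        parts.append(text)
    return "<speak>" + " ".join(parts) + "</speak>"

def synthesize(ssml, voice_id="Joanna"):
    """Send SSML to Polly and return MP3 bytes (needs boto3 and AWS credentials)."""
    import boto3  # imported here so build_ssml works without boto3 installed
    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml, TextType="ssml", OutputFormat="mp3", VoiceId=voice_id
    )
    return response["AudioStream"].read()

ssml = build_ssml([("sing", "strong"), ("today", None), ("loud", None)], skip={"today"})
print(ssml)
```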

mepc36

12/29/2022, 11:39 PM

hecko do you work for uberduck? You're always on top of this stuff, thank you for that

hecko

12/29/2022, 11:39 PM

not really work, just moderate things

Reclezon

12/29/2022, 11:43 PM

Isn't there like, no commonly accepted standard? SSML is made for this application, but not everyone supports it. Arpabet is made only for American English which is not helpful

Reclezon

12/29/2022, 11:43 PM

IPA idek what's the opinion on that

hecko

12/29/2022, 11:44 PM

ssml is supported by all the big players really

hecko

12/29/2022, 11:44 PM

microsoft, amazon, google,

hecko

12/29/2022, 11:44 PM

for phonemes there's also a universal system, ipa, but idk how much that is supported

hecko

12/29/2022, 11:45 PM

i know amazon supports it, and uberduck has an ipa symbol set but apparently there are issues with getting it to work
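For the Amazon side, a small sketch of wrapping a word in an SSML `<phoneme>` tag with an IPA pronunciation, which Polly's supported tags list documents. The `phoneme` helper name is made up for illustration, and none of this applies to Uberduck's own IPA symbol set.

```python
from xml.sax.saxutils import escape, quoteattr

def phoneme(word, ipa):
    """Wrap `word` in an SSML <phoneme> tag carrying an IPA pronunciation."""
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(word)}</phoneme>'

# IPA marks primary stress with ˈ before the stressed syllable.
ssml = "<speak>I like " + phoneme("pecan", "pɪˈkɑːn") + " pie.</speak>"
print(ssml)
```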

Reclezon

12/29/2022, 11:46 PM

Ik people have tried to train IPA, but I haven't heard much from them other than saying they would