Uberduck #machine-learning

user

04/14/2021, 10:32 PM

And I think this was Applejack's encoder on RJ Banks, giving him an accent.

zwf

04/14/2021, 10:47 PM

very interesting! can you post some output from the original models as comparison?

user

04/14/2021, 11:20 PM

RJ Banks, no pony

user

04/14/2021, 11:21 PM

Applejack, no Banks

user

04/14/2021, 11:22 PM

The voice blending was disappointing. It just kinda sounded like the two voices overlapped, rather than like the throat and mouth of the two averaged out, which I suppose makes sense with it being, what, noise vocoded with spectrograms of voices?

user

04/14/2021, 11:24 PM

RJ Banks encoder on pony

user

04/14/2021, 11:24 PM

Accent? Gone

user

04/14/2021, 11:24 PM

At least that part works well enough so something was gained from fucking with it all.

user

04/14/2021, 11:28 PM

Seems that the encoder* parts affect accent, and the decoder* parts carry the voice timbre

user

04/14/2021, 11:32 PM

Here's my mess of a notebook, I ain't cleaned it or nothing since it's a sandbox. The blending part checks all the named entries in the model dict, and any entry that doesn't match a name you entered gets deleted. What's left is used to replace values in another model. The Google Drive links to models in the code cell there are my own, except Applejack, so don't expect them to stick around forever.
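A minimal sketch of that blending step, not the notebook's actual code. It assumes each model's state dict is a plain name-to-value mapping and that names are matched with shell-style patterns such as `encoder*`. The function name `blend_state_dicts`, the toy parameter names, and the string stand-ins for tensors are all hypothetical.

```python
from fnmatch import fnmatchcase


def blend_state_dicts(base, donor, patterns):
    """Return a copy of `base` with every entry whose name matches one of
    `patterns` replaced by the same-named entry from `donor`."""
    # Drop every donor entry whose name does not match a requested pattern.
    kept = {
        name: value
        for name, value in donor.items()
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    }
    # Refuse to add parameters the base model does not have.
    missing = [name for name in kept if name not in base]
    if missing:
        raise KeyError(f"donor entries not present in base: {missing}")
    # Copy the base so the original dict is left untouched, then overwrite.
    blended = dict(base)
    blended.update(kept)
    return blended


# Toy stand-ins for two loaded state dicts (placeholder names and values).
rj_banks = {
    "encoder.lstm.weight": "rj-enc",
    "decoder.attention_rnn.weight": "rj-att",
    "decoder.linear_projection.weight": "rj-proj",
}
applejack = {
    "encoder.lstm.weight": "aj-enc",
    "decoder.attention_rnn.weight": "aj-att",
    "decoder.linear_projection.weight": "aj-proj",
}

# Put the RJ Banks encoder onto the Applejack model.
blended = blend_state_dicts(applejack, rj_banks, ["encoder*"])
print(blended)
```

With real checkpoints the two dicts would come from something like `torch.load('RJBANKS')['state_dict']`, and the blended dict would be written back under the same `'state_dict'` key before saving.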

user

04/14/2021, 11:33 PM

Run

print(torch.load('RJBANKS')['state_dict'].keys())

to get a list of all the parameters in the saved models
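The raw key list is long, so a rough sketch for grouping it by prefix may help when picking which parts to swap. The helper `group_parameter_names` and the example names below are hypothetical placeholders; real keys depend on the model.

```python
from collections import defaultdict


def group_parameter_names(names, depth=1):
    """Group dotted parameter names by their first `depth` components."""
    groups = defaultdict(list)
    for name in names:
        prefix = ".".join(name.split(".")[:depth])
        groups[prefix].append(name)
    return dict(groups)


# Placeholder names; with a real checkpoint you would pass
# torch.load('RJBANKS')['state_dict'].keys() instead.
example_names = [
    "encoder.lstm.weight",
    "encoder.lstm.bias",
    "decoder.attention_rnn.weight",
    "decoder.linear_projection.weight",
]

for prefix, members in group_parameter_names(example_names, depth=2).items():
    print(prefix, len(members))
```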

user

04/14/2021, 11:35 PM

decoder.attention_rnn* seems to handle some of the gruffness and emphasis / stress.

user

04/14/2021, 11:41 PM

decoder.linear_projection* from RJ Banks onto pony, suddenly it's all smoker

zwf

04/15/2021, 12:06 AM

I've tried freezing various layers to try to train better on small datasets but it didn't really work well

F.B

04/15/2021, 12:24 AM

here's a few samples of my Boomerang Announcer model with a whopping 86 audio files

F.B

04/15/2021, 12:24 AM

of data

SidPlays_144p

04/15/2021, 12:27 AM

that's John O'Hurley, right? i think he's got some audiobooks that'd work for data

F.B

04/15/2021, 12:27 AM

yep

F.B

04/15/2021, 1:27 AM

this is what i call: perfect

SidPlays_144p

04/15/2021, 1:51 AM

Amazing

mega b

04/15/2021, 3:24 AM

Are there any better notebooks or programs than https://colab.research.google.com/drive/18lBRBWOs4uV1DjhoW_fVzoydYUw400PW to transcribe audio?

F.B

04/15/2021, 3:06 PM

I realized that "Josie and the Pussycats" had another A in its transcript, so I fixed it

F.B

04/15/2021, 11:10 PM

Now I wait

Toasty

04/15/2021, 11:34 PM

The mystery of figuring out the transcript problem continues

mega b

04/16/2021, 3:08 AM

Would a batch size of 16 be good for 100 wav files?

mega b

04/16/2021, 4:21 AM

new training with hand-written transcription

mega b

04/16/2021, 4:22 AM

much better

mega b

04/16/2021, 4:22 AM

epoch 120

mega b

04/16/2021, 5:23 AM

epoch 200 🎉

mega b

04/16/2021, 5:23 AM

Epoch: 200 Validation loss 1207:  0.103400