# machine-learning
  • u

    user

    04/14/2021, 10:32 PM
    And I think this was Applejack's encoder on RJ Banks, giving him an accent.
  • z

    zwf

    04/14/2021, 10:47 PM
    very interesting! can you post some output from the original models as comparison?
  • u

    user

    04/14/2021, 11:20 PM
    RJ Banks, no pony
  • u

    user

    04/14/2021, 11:21 PM
    Applejack, no Banks
  • u

    user

    04/14/2021, 11:22 PM
    The voice blending was disappointing. It just kinda sounded like the two voices overlapped, rather than like the throat and mouth of the two averaged out, which I suppose makes sense given that it's, what, noise vocoded with spectrograms of voices?
  • u

    user

    04/14/2021, 11:24 PM
    RJ Banks encoder on pony
  • u

    user

    04/14/2021, 11:24 PM
    Accent? Gone
  • u

    user

    04/14/2021, 11:24 PM
    At least that part works well enough so something was gained from fucking with it all.
  • u

    user

    04/14/2021, 11:28 PM
    Seems that the encoder* parts affect the accent, and the decoder* parts carry the voice timbre
  • u

    user

    04/14/2021, 11:32 PM
    Here's my mess of a notebook, I ain't cleaned it or nothing since it's a sandbox. The blending part checks every named entry in the model dict and deletes any entry that doesn't match a name you entered; what's left is used to replace values in another model. The Google Drive links to models in the code cell there are my own, except Applejack, so don't expect them to stick around forever.
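A minimal sketch of the blending step as described in the message above, not the notebook's actual code. It assumes each checkpoint's `state_dict` behaves like a plain mapping from parameter name to value. The function name `blend_state_dicts`, the shell-style patterns, and the toy key names are all hypothetical, chosen for illustration.

```python
from fnmatch import fnmatchcase

def blend_state_dicts(donor, target, patterns):
    """Return a copy of `target` with entries matching `patterns` taken from `donor`."""
    # Keep only the donor entries whose name matches one of the patterns;
    # everything else is dropped, as in the description above.
    kept = {name: value for name, value in donor.items()
            if any(fnmatchcase(name, pattern) for pattern in patterns)}
    # Refuse names the target does not have, so the result still loads cleanly.
    missing = [name for name in kept if name not in target]
    if missing:
        raise KeyError(f"not present in target: {missing}")
    blended = dict(target)   # shallow copy, the original target is left untouched
    blended.update(kept)     # overwrite the matching entries with the donor's
    return blended

# Toy stand-ins for checkpoint['state_dict']; real values would be tensors.
rj_banks = {"encoder.lstm.weight": "rj-enc", "decoder.attention_rnn.weight": "rj-att"}
applejack = {"encoder.lstm.weight": "aj-enc", "decoder.attention_rnn.weight": "aj-att"}

# Applejack's encoder on RJ Banks: only the encoder* entries are replaced.
blended = blend_state_dicts(applejack, rj_banks, ["encoder*"])
print(blended)
```

With real checkpoints the two mappings would presumably come from `torch.load(path)['state_dict']`, and the swap only makes sense when both models share the same architecture, so that matching names hold tensors of the same shape.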
  • u

    user

    04/14/2021, 11:33 PM
    Run
    print(torch.load('RJBANKS')['state_dict'].keys())
    to get a list of all the parameters in the saved models
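The full key list can be long, so it may help to count the names by prefix before deciding what to swap. This is only a sketch: `summarize_keys` is a hypothetical helper, and the key names below are illustrative placeholders rather than output from the actual models.

```python
from collections import Counter

def summarize_keys(keys, depth=2):
    """Count parameter names by their first `depth` dot-separated components."""
    return Counter(".".join(key.split(".")[:depth]) for key in keys)

# Hypothetical key names for illustration; in practice pass the real
# state_dict keys printed by the command above.
keys = [
    "encoder.lstm.weight_ih_l0",
    "encoder.lstm.weight_hh_l0",
    "decoder.attention_rnn.weight_ih",
    "decoder.attention_rnn.bias_ih",
    "decoder.linear_projection.linear_layer.weight",
]

for prefix, count in summarize_keys(keys).items():
    print(f"{prefix}: {count}")
```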
  • u

    user

    04/14/2021, 11:35 PM
    decoder.attention_rnn* seems to handle some of the gruffness and emphasis / stress.
  • u

    user

    04/14/2021, 11:41 PM
    decoder.linear_projection* from RJ Banks onto pony, suddenly it's all smoker
  • z

    zwf

    04/15/2021, 12:06 AM
    I've tried freezing various layers to try to train better on small datasets but it didn't really work well
  • f

    F.B

    04/15/2021, 12:24 AM
    here are a few samples of my Boomerang Announcer model with a whopping 86 audio files
  • f

    F.B

    04/15/2021, 12:24 AM
    of data
  • s

    SidPlays_144p

    04/15/2021, 12:27 AM
    that's John O'Hurley, right? i think he's got some audiobooks that'd work for data
  • f

    F.B

    04/15/2021, 12:27 AM
    yep
  • f

    F.B

    04/15/2021, 1:27 AM
    this is what i call perfect
  • s

    SidPlays_144p

    04/15/2021, 1:51 AM
    Amazing
  • m

    mega b

    04/15/2021, 3:24 AM
    Are there any better notebooks or programs for transcribing audio than https://colab.research.google.com/drive/18lBRBWOs4uV1DjhoW_fVzoydYUw400PW?
  • f

    F.B

    04/15/2021, 3:06 PM
    I realized that "Josie and the Pussycats" had an extra "A" in its transcript, so I fixed it
  • f

    F.B

    04/15/2021, 11:10 PM
    Now I wait
  • t

    Toasty

    04/15/2021, 11:34 PM
    The mystery of figuring out the transcript problem continues
  • m

    mega b

    04/16/2021, 3:08 AM
    Would a batch size of 16 be good for 100 wav files?
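As a rough guide to what that batch size means in practice, the arithmetic can be sketched as below. It assumes all 100 files go into the training set (no validation split) and that each optimizer step consumes one batch; `steps_per_epoch` is a hypothetical helper, not part of any training script mentioned here.

```python
import math

def steps_per_epoch(num_files, batch_size, drop_last=False):
    """Number of optimizer steps in one pass over the training files."""
    if drop_last:
        # Incomplete final batch is skipped.
        return num_files // batch_size
    # Incomplete final batch still counts as a step.
    return math.ceil(num_files / batch_size)

print(steps_per_epoch(100, 16))                  # 7 (the last batch has 4 files)
print(steps_per_epoch(100, 16, drop_last=True))  # 6 (4 files are skipped each epoch)
```

Either way that is only a handful of weight updates per epoch, so a small dataset like this needs many epochs to accumulate a meaningful number of steps.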
  • m

    mega b

    04/16/2021, 4:21 AM
    new training with hand-written transcription
  • m

    mega b

    04/16/2021, 4:22 AM
    much better
  • m

    mega b

    04/16/2021, 4:22 AM
    epoch 120
  • m

    mega b

    04/16/2021, 5:23 AM
    epoch 200 🎉
  • m

    mega b

    04/16/2021, 5:23 AM
    Epoch: 200 Validation loss 1207:  0.103400