while I understand overtraining is a thing, it seems harder to argue against it given the context of voice synthesis, so I've been taking the default duration and pitch epochs x10 (200 and 500 respectively, I think)
obviously the improvements are increasingly negligible, but ideally we do want them to sound literally the same as the wavs they're trained on? 😅