Creating a custom voicebank is expensive, so unless you’ve got a voice that fills a niche, I doubt that Dreamtonics would consider it. The actual cost is hidden behind an NDA, but the Kickstarter for Solaria was around $42,000.
I don’t have access to the inner workings of Dreamtonics, but I wouldn’t place any hopes on being able to create a custom voice in the immediate future. They’ve just released Vocoflex, and I suspect they wouldn’t have gone that route if custom voicebanks were easy to do.
If you really want to clone your voice, you’ve got at least three options, each with caveats:
- SynthesizerV + Vocoflex
- SynthesizerV + RVC
- ACE Studio (obviously not SynthesizerV)
The biggest caveat is that much of what makes a singer distinctive isn’t in their vocal timbre, but in their performance. Just as two people will read the same words differently, different singers will sing the same song differently. If you’ve listened to bad RVC, you’ll know what I’m talking about.
And none of the options I listed captures the vocal performance.
So if you simply replaced a source singer with another singer’s timbre, it’s still going to sound very much like the original singer if they’ve got a distinctive style.
Vocoflex
A careful reading of the Vocoflex description will show it’s not voice cloning software. It’s intended to be used as a “creative” tool.
Still, it might work for you. Or not. The results for me aren’t great, because I’ve got a pretty low voice that doesn’t seem to match the Vocoflex vocal synthesis model very well. That’s fine, because - once again - Vocoflex isn’t really voice cloning software.
RVC
RVC is a popular method for voice cloning. It’s likely to sound more like you than Vocoflex - which doesn’t claim to do voice cloning - but how well this works again depends on how well the underlying performance matches how you sing.
The expensive bit of RVC is the training, which requires hours on a GPU, and laughably long times on a CPU. If you haven’t got an AI-capable video card, you can instead buy some credits to use Google Colab. The actual rendering can be done in realtime if you’ve got a reasonably fast CPU.
ACE Studio
ACE Studio does have an option for voice cloning. Like the other methods, it doesn’t capture performance. So again, the success of the cloning is going to be highly dependent on how much the source “sings” in your style.
Obviously, this is going outside the Dreamtonics ecosystem, but if an all-in-one voice cloning solution is what you’re after, they’ve got that.
Summary
With any option other than Vocoflex, you’ll need lots of voice data. ACE Studio recommends 30 to 100 minutes of audio for best results.
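If you want a quick check on how much audio you’ve actually got before committing to training, something like this rough Python sketch will total it up. It assumes your recordings are uncompressed WAV files sitting in one folder (the standard library’s wave module won’t read MP3 or FLAC), and the function names are just my own. The 30-minute default is the low end of the ACE Studio recommendation above, not a hard rule.

```python
import wave
from pathlib import Path


def total_minutes(folder):
    """Return the combined length, in minutes, of the .wav files directly inside folder."""
    seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration of one file = number of frames / frames per second.
            seconds += wav.getnframes() / wav.getframerate()
    return seconds / 60.0


def meets_minimum(minutes, minimum=30.0):
    """True if the total reaches the assumed minimum (30 minutes by default)."""
    return minutes >= minimum
```

Call total_minutes() with the path to your own recordings folder and pass the result to meets_minimum(). Keep in mind this only counts file length, so silence and retakes will inflate the number.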
These are the options that I’m currently aware of that are available to you. For background vocals or replacing short segments of vocals, using RVC on SynthV output is probably the cheapest option.
Of course, all of this is subject to rapid change as time passes.