I admire your precision, which I don’t have, but I’m not complaining. I can write songs that combine my own singing with AI singing, and I can make the musical accompaniment the way I want it. Synthesizer V Pro is an amazing tool for me: it gives me the AI voices I need, though they only reach as far as my own abilities allow. In my songs it is not so obvious that I can’t reach the level of others, who dare to show their skills on the most famous and popular hits, where a listener hearing the AI-sung version remembers the original and can rate both versions positively. Many of you on the forum are great at this. On top of those skills, you have an unreal depth of knowledge of the subject. Luckily, I am able to create within my abilities and for my own satisfaction. Thank you for the insight your explanations gave me and for helping me find my way around this topic.
True that, but also the downbeat for the syllable is the peak volume of the syllable, not the first audible sample. Many words, in order to SOUND on the grid, must start early so the phoneme can develop. This makes it hard to cut and paste words and phrases, but it gives a much more natural groove. THEN, if you want to advance/delay the onset for feel, you are free to do so, knowing where the flow is.
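To put rough numbers on "must start early", here is a minimal Python sketch. The function names and the 60 ms time-to-peak are made up for illustration; this is not how SynthV computes anything internally, just the arithmetic of placing a syllable's peak on a beat.

```python
def beat_ms(bpm: float) -> float:
    """Length of one beat (quarter note) in milliseconds."""
    return 60_000.0 / bpm


def onset_for_peak_on_beat(bpm: float, beat_index: int, time_to_peak_ms: float) -> float:
    """Time (ms from song start) where the syllable must begin so that
    its volume peak lands exactly on the given beat.

    time_to_peak_ms is an assumed value: how long the phoneme needs
    to develop from first audible sample to peak volume.
    """
    return beat_index * beat_ms(bpm) - time_to_peak_ms


# At 120 BPM a beat is 500 ms, so beat 4 sits at 2000 ms.
# A syllable that needs 60 ms to reach its peak has to start at 1940 ms.
print(onset_for_peak_on_beat(120, 4, 60))  # 1940.0
```

Advancing or delaying for feel is then just an extra offset added on top of that computed onset.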
I find I need to wait till the entire piece is rough-edited before I can confirm the time fluidity. Your mileage may vary.
Sure, many songs need the point of maximum audible volume on the beat, not a starting sound like the breath or glottal stop. What matters is synchronizing with the beat (usually the kick). Note that we are talking about milliseconds here, at 120 BPM.
See this picture:
You need (as SynthV does) to be in sync with point 2.
But with a specific type of music (without lyrics, for example), the start could fall between the two points (e.g. at the glottal stop). It also depends on the BPM, the distance between these 2 points and, overall, the time the singer takes to reach the sound peak.
As a jazz drummer, I find the best groove comes from not following the exact time. For vocals, it depends on what is needed (groove or not). But often, if you need an exact position for fast words on fast music, you have to adapt.
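A rough Python sketch of why the BPM matters: the same distance between the two points takes up a bigger share of the grid as the tempo goes up. The 50 ms distance and the function name are assumptions for illustration only, not measured values.

```python
def lead_as_fraction_of_sixteenth(bpm: float, lead_ms: float) -> float:
    """How much of one sixteenth note the onset-to-peak distance occupies."""
    sixteenth_ms = 60_000.0 / bpm / 4  # a quarter-note beat split in four
    return lead_ms / sixteenth_ms


# Same assumed 50 ms distance between onset and peak, three tempos.
for bpm in (70, 120, 180):
    share = lead_as_fraction_of_sixteenth(bpm, 50)
    print(f"{bpm} BPM: {share:.0%} of a sixteenth note")
```

On a slow ballad the distance is a small part of a sixteenth note, while on fast music with fast words it eats up more than half of one, which is where manual adjustment starts to matter.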
Mostly this is not necessary on a simple ballad with a normal voice (a standard song). This is how SynthV works for singers on most songs. It is a fact built on sound design.
So we agree in principle, then: you could argue SynthV’s phoneme timing is a little ‘generic’, but it is correct.
Absolutely, and I find this is a task I may have to repeat post-mixdown, after listening a few times.