Your Ideal Vocal Synth Engine?

Scarlet Illusion · Apr 20, 2020

If there were a vocal synth that functioned exactly the way you like, what would it be like? I thought this would be a fun discussion.

For me, I’d love something with Piapro Studio’s interface and tuning like UTAU’s pitch edit plugin. Also for dynamics and other parameters, something like the way Vocaloid does it would be great! (Basically UTAU crossed with Piapro.)

Oh, a text-to-speech function and vocoder would be great, too.

peaches2217 · Apr 20, 2020

I'd love to see a SynthV-style editor with UTAU's pitch-bend input method! That's my one gripe with SynthV: it's hard to get the pitch bends to behave exactly as you want them to, which is something I love about UTAU.

kozet · Apr 20, 2020

runs natively on Linux
ideally FOSS but good luck hoping for that
uses X-SAMPA for phonemic input instead of whatever stupid crap SynthV is doing for English and Japanese
TTS function would be handy, I'd admit

Twillby · Apr 20, 2020

~~One that does all the tuning for me~~

CeVIO ~~Pro~~ with a brightness and/or tension parameter, maybe? And with an option to use control points like UTAU because drawing edits is hard ~~and more English banks even if I like IA Eng well enough~~

(Honestly it’s hard for me to say whether SynthV or CeVIO is really more ideal for me now that I think about it but I don’t really understand SynthV's timing features very well and it’s giving me some grief with a cover right now so CeVIO's looking good pffft /tangent)

Krin · Apr 20, 2020

kozet said:
runs natively on Linux

ideally FOSS but good luck hoping for that

that would honestly be the dream. having a synth that's FOSS would be very cool especially to see what people could do with it on Linux!! i wouldn't think it'd be too much to ask for? so far UTAU is the closest we can get, except for it not being open sourced (it's shareware)... luckily there's a wide range of plugins that can be used to mod it. but, it's too bad that the creator hasn't put out an update to the software in 7 years however ameya's still out there doing things and working on resamplers...? so??

ummm that sort of derailed,,,, but yes i think what i'd want in a vocal synth the most is for it to be open sourced. also;

- separate tracks for voice and audio (kinda like how vocaloid handles it) would be cool
- using multiple voices at once???
- like @RoboCheatsyTM said, have a function that you "draw" the pitchbends like UTAU's pitch edit plugin
- a clean ui aesthetic similar to vocaloids. just not UTAU's painfully bright colours that threaten to burn my retina.

uncreepy · Apr 20, 2020

kozet said:
uses X-SAMPA for phonemic input instead of whatever stupid crap SynthV is doing for English and Japanese

English uses Arpasing (I actually like it better than X-SAMPA, I get the results I want quicker). Japanese just uses romaji or hiragana/katakana, which is the same as every Japanese synth.

Anyway, my ideal synth.

Yeah, uh, I'll take one synth that is capable of voice texture without having to use external programs like vocascreamer and a side of fries, please.
Looks like Piapro (how the pitch bends are shown with a red line, hate the guesswork of Vocaloid Editor). Except it runs fast and doesn't lag like crazy.
Has extra phonemes for all voice banks so you can make whatever language you want.
I guess if we're being over-the-top, I'd like it to also be able to do TTS and be a vocoder you can edit.
Not rainbowrifficly eye-burning (looking at default CeVIO). Font size isn't written for leprechauns (aka size 10 *coughSynthVcough*).
Edit: Includes both a standalone version and VST version.

Scarlet Illusion · Apr 20, 2020

Wow! I love everyone’s ideas!
(Also, I guess I’ll add to mine and say it’d be super neat if we got an open-source vocal synth!)

frankensalad · Apr 21, 2020

I just want a synth where you can make your own English voicebanks as easily as you can make Japanese voicebanks in Utau. Preferably with a UI that actually looks like it's from this decade and without the need to change your computer's locale.

lIlI · Apr 21, 2020

A synth that produces English pronunciation indistinguishable from a real singer. It's a lot to ask, but it's endgame for me.

Overcast Immortal · May 2, 2020

My ideal vocal synth engine contains multiple engines or renderers for more or less clarity and more or less robotic output. It is able to work as a sample-based synthesizer or as an analysis of a voice bank like the first version of Vocaloid. It works in any DAW with very low system requirements and can run on every major operating system. It allows users to create their own voice banks with a convenient UI for recording and processing samples. Its interface is like the current Piapro Studio.

It allows cross synthesis between every voice bank ever. It has tools allowing finding and replacing phonemes automatically to allow, for example, Japanese phonemes to be substituted with appropriate Spanish phonemes to quickly allow a Spanish VB to sing in Japanese without editing all the phonemes by hand. It is able to handle voice banks of every brand. It has extensive vibrato customization.
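As a rough illustration of the phoneme find-and-replace idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the X-SAMPA-like symbols, the `JA_TO_ES` pairings, the `ES_INVENTORY` set, and the `substitute_phonemes` helper are hypothetical and not taken from any real synth or voice bank.

```python
# Hypothetical Japanese -> Spanish substitutions (illustrative pairings only).
JA_TO_ES = {
    "M": "u",   # unrounded Japanese u -> closest Spanish vowel
    "4": "r",   # alveolar tap -> Spanish single r
    "h": "x",   # Spanish has no [h]; the "j" sound is a rough stand-in
    "S": "tS",  # no "sh" in Spanish; "ch" is a rough stand-in
}

# Hypothetical list of phonemes the Spanish voice bank was recorded with.
ES_INVENTORY = {"a", "e", "i", "o", "u", "r", "x", "tS", "s", "t", "k", "n", "m"}


def substitute_phonemes(phonemes, mapping, target_inventory):
    """Swap each phoneme for its mapped equivalent.

    Returns the converted sequence plus a sorted list of symbols that had
    no mapping and are not in the target inventory, so the user knows
    which ones still need editing by hand.
    """
    converted = []
    missing = set()
    for symbol in phonemes:
        if symbol in mapping:
            converted.append(mapping[symbol])
        else:
            # Keep the symbol as-is, but flag it if the bank lacks it.
            converted.append(symbol)
            if symbol not in target_inventory:
                missing.add(symbol)
    return converted, sorted(missing)


if __name__ == "__main__":
    lyrics = ["h", "a", "4", "M", "ts", "M"]  # roughly "ha ru tsu"
    result, unresolved = substitute_phonemes(lyrics, JA_TO_ES, ES_INVENTORY)
    print(result)
    print("needs manual editing:", unresolved)
```

A real tool would probably need context-dependent rules too, since the best substitute for a sound can depend on its neighbours.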

It has versatile and effective growl parameters allowing for grit in the voice, vocal fry, false chord screaming, and fry screaming based on analysis of the voice bank. It has a power parameter allowing soft voice banks to sound more powerful and powerful voice banks to sound softer. Perhaps most important of all, it includes a virtual model of the vocal tract and allows users to create new entirely synthetic voices by altering the virtual vocal tract, while also being capable of producing every human speech sound out of the box with no need for voice providers.

Oh, and that last part is not science fiction. Speech synthesis based on modeling of the vocal tract is already a thing. I don't know if anyone has tried getting such a program to sing yet, and in any case, it's cheaper and easier to just use samples. Plus I think fans of singing synthesizers and human singers will consider it sacrilege for a computer program to be able to sing anything in any voice, including their favorite Vocaloids or live singers, but people will get used to it.

FluoroLime · May 2, 2020

My ideal synth would have:
- UTAU's pitch bend editing
- Synth V's method of dealing with multi-syllable words
- CEVIO's way of sounding so realistic but without that engine noise
- pink

Scarlet Illusion · May 2, 2020

Overcast Immortal said:
My ideal vocal synth engine contains multiple engines or renderers for more or less clarity and more or less robotic output. It is able to work as a sample-based synthesizer or as an analysis of a voice bank like the first version of Vocaloid. It works in any DAW with very low system requirements and can run on every major operating system. It allows users to create their own voice banks with a convenient UI for recording and processing samples. Its interface is like the current Piapro Studio.

It allows cross synthesis between every voice bank ever. It has tools allowing finding and replacing phonemes automatically to allow, for example, Japanese phonemes to be substituted with appropriate Spanish phonemes to quickly allow a Spanish VB to sing in Japanese without editing all the phonemes by hand. It is able to handle voice banks of every brand. It has extensive vibrato customization.

It has versatile and effective growl parameters allowing for grit in the voice, vocal fry, false chord screaming, and fry screaming based on analysis of the voice bank. It has a power parameter allowing soft voice banks to sound more powerful and powerful voice banks to sound softer. Perhaps most important of all, it includes a virtual model of the vocal tract and allows users to create new entirely synthetic voices by altering the virtual vocal tract, while also being capable of producing every human speech sound out of the box with no need for voice providers.

Oh, and that last part is not science fiction. Speech synthesis based on modeling of the vocal tract is already a thing. I don't know if anyone has tried getting such a program to sing yet, and in any case, it's cheaper and easier to just use samples. Plus I think fans of singing synthesizers and human singers will consider it sacrilege for a computer program to be able to sing anything in any voice, including their favorite Vocaloids or live singers, but people will get used to it.

FluoroLime said:
My ideal synth would have:
- UTAU's pitch bend editing
- Synth V's method of dealing with multi-syllable words
- CEVIO's way of sounding so realistic but without that engine noise
- pink

Yes to all of this! Maybe we all should learn how to do all this stuff so we can get together and build our ideal vocal synth, lol. XD

frankensalad · May 2, 2020

I would love to see people come together to develop a new vocalsynth, but does anyone on this forum have any programming knowledge?

Scarlet Illusion · May 2, 2020

I don’t have any programming knowledge, but does anyone know what programming language is typically used for vocal synth? I could at least look into it if I knew.

(Also I’m pretty sure @SeleDreams knows a thing or two about programming! Hope you don’t mind I @ you like this!)

kozet · May 2, 2020

I have some programming knowledge, but not much in the area of vocal synthesis.

Älfa Dröttning · May 2, 2020

I know how to program in Java (I’ve taken a few classes and am still learning), but I don’t think that’s very applicable to vocal synths. I think C is usually used for audio software in general (don’t quote me on that). Lua is used for Job plugins in Vocaloid but I’m pretty sure that language is mainly meant for adding onto existing programs, not building them from scratch, so I’m not sure what language would work best.

Edit: Just did some Googling and apparently C++ is a common language for audio software.

Oh and I completely forgot about Matlab. Matlab is a program I’ve used for signal processing. I’m not sure if it would be useful for vocal synths, but from what I’ve done with it, you can use it to get data from audio signals and then change the signal by creating filters and other things like that (upsampling/downsampling/etc). You can also graph the signal and see its different frequencies. I don’t think you would use it to create a vocal synth, but it could be used for analysis of some sort.
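For anyone without Matlab, the same kind of analysis can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the test tone, the 8000 Hz sample rate, and the helper names (`make_tone`, `magnitude_spectrum`, `dominant_frequency`, `downsample_by_two`) are all made up for the example, and the slow textbook DFT and two-sample averaging filter stand in for the optimized routines a real tool would use.

```python
import cmath
import math


def make_tone(partials, sample_rate, n_samples):
    """Sum of sine waves; partials is a list of (frequency_hz, amplitude)."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / sample_rate) for f, a in partials)
        for n in range(n_samples)
    ]


def magnitude_spectrum(samples):
    """Naive O(N^2) discrete Fourier transform, first half of the bins only."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest non-DC bin."""
    mags = magnitude_spectrum(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(samples)


def downsample_by_two(samples):
    """Crude low-pass (average with the previous sample), then keep every other sample."""
    smoothed = [
        (samples[i] + samples[i - 1 if i else 0]) / 2
        for i in range(len(samples))
    ]
    return smoothed[::2]


if __name__ == "__main__":
    rate = 8000
    # 440 Hz fundamental plus a quieter 1200 Hz partial, 400 samples long.
    signal = make_tone([(440, 1.0), (1200, 0.5)], rate, 400)
    print("strongest frequency:", dominant_frequency(signal, rate))

    halved = downsample_by_two(signal)
    print("after downsampling:", dominant_frequency(halved, rate // 2))
```

The frequency resolution here is the sample rate divided by the number of samples (20 Hz), so the test tones were picked to land exactly on a bin.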

Edit 2: I just found a language called ChucK that’s free and supposedly easy to learn: ChucK => Strongly-timed, On-the-fly Music Programming Language

SeleDreams · May 2, 2020

For audio synthesis, C and C++ are often the languages of choice because they're efficient at math-heavy work, allow some neat speed-saving tricks, and have a lot of libraries available. That said, you can still use any language to do anything.
Creating a new synthesis algorithm requires some experience with speech synthesis. However, we can use existing algorithms in any vocal synth (multiple vocal synths can use the same speech synthesis algorithm and still work completely differently).

so someone could take the WORLD source code from here mmorise/World and make a synth that works completely differently from other synths using this algorithm

xuu · May 2, 2020

I feel like the odd one out but having suffered through using UTAU for six years I can't think of anything I'd want less than UTAU's pitchbending system, especially as a default. Control points as an option, sure, but keep the rest of it as far away as possible.

Honestly SynthV is pretty close to my ideal synthesizer, but I wish it had the phoneme flexibility that VOCALOID has, more than one resampler/vocoder and used X-SAMPA because Arpasing is not ideal. The Chinese and Japanese voicebanks could probably go with X-SAMPA too for standardisation or at least have them as aliases. In terms of synthesis machine learning is absolutely the way to go and anything else would have a limited lifespan. It's really early days but I adore the method ByteSing is using, even if I'm unsure if it could be applied to the traditional score editing methods we're used to in VOCALOID and co.

frankensalad · May 3, 2020

SeleDreams said:
so someone could take the WORLD source code from here mmorise/World and make a synth that works completely differently from other synths using this algorithm

That's actually something I was thinking about. I know some of the resamplers people commonly use in Utau are based on open source code and could potentially be used as bases or inspiration for developing an alternative to Utau.
