
Synthesizer V Studio 2

lIlI

Staff member
Administrator
Apr 6, 2018
951
The Lightning Strike
Dreamtonics has revealed more details about their new version of Synthesizer V!


Summary, key points in bold:
  • From user testing, Dreamtonics has concluded that sounding human, and sounding natural, are different things. SV2 is focused on vocals that feel natural.
  • They want to improve dynamics and control.
  • They aren't interested in making an AI composer, but they want to make the editing process as intuitive as possible.
  • They've sped up rendering by 300% on modern machines, 50% on older machines.
  • They showed an A/B test with Natalie in both versions of the software (will link to a timestamp when the livestream has ended). Kevin, Ayame, and Felicia were also demoed.
  • The software is still completely offline (no cloud). The increased speed is achieved in part with parallel processing.
  • Vocal modes are knobs; the vocal mode's impact on pitch, timbre, and pronunciation can now be adjusted separately.
  • Expression now has four points that can be adjusted, labelled 'Vibrant', 'Refined', 'Stable', and 'Raw'. Raw brings results closer to the original dynamics of the training samples, and will likely please users who feel SynthV homogenises their singers.
  • AI retakes now offer four options: pitch, timbre, timing, and all.
  • Phoneme timing and strength are now adjusted in a panel at the bottom of the window, so you can see the duration of every phoneme visualised on the bar.
  • You can add new control points and drag them across AI pitch curves, which will re-render around the point in real time.
  • The pencil tool will try and fill in pitch curves, so you don't need to hand draw everything. You can still draw completely custom pitch curves if you want.
  • Pitch bends move with notes.
  • There is a new parameter called 'mouth opening'. This does what you would expect.
  • Mentioned in the stream chat: it's backwards compatible with Synthesizer V 1 voicebanks.
Overall I'm psyched by this new version. I really like that they're pushing the philosophy of more user control: this will give musicians even more options for creative tuning and interesting performances. The ease-of-use tweaks are much appreciated too; pitch curves moving with notes and phoneme duration visualised on the piano roll are the two biggest selling points for me. One bummer: no glottal effects, it seems.
 

morrysillusion

v flower enthusiast
Jul 14, 2018
863
Socal
morrysillusion.net
i think on some ends this stuff may not seem insanely "new" enough to warrant a new version of the software, but there is a lot more going on here than in the current version, enough that i can only see it being possible as a whole new program. because its not just about the mechanical side itself, but the presentation and the tool built around it-- the UI, being the controls we need for these new functions, cant be shoved into a mold that doesnt fit everything it needs to be. dreamtonics has already made some significant improvements to the current program through updates, but that can only go on so long before it morphs into something very different from what it was, and it would be far too limiting to fit the shell synth v 1 has had for almost 5 years now. (and i also think people would begin protesting synth v 1 turning into what 2 looks like. like if vocaloid 4 just got an update that made it look like vocaloid 5 lol... tho not as extreme)

the AI retakes and vocal mode changes are BIG imo, and while they may be improvements of existing tools (very very big ones) they really bring a huge amount of change to the vocals we can produce, even more than just mixing vocal mode intensity in the parameters window. i could always hear "parts" of the voices id wish to change through vocal mode tone, and these knobs basically cover all those parts i hoped to tweak. AI retakes especially is something i dropped not long after it came out, because there just wasnt much of any control, nor did it give results that felt big enough next to work i had already done manually. seeing it interact seamlessly with the rest of the program's existing parameters makes it much more appealing to reroll takes on parts of the voices.

one thing tho, which relates to things like "glottal" effects and whatnot, is that i really notice a lot of the time imperfections, vocal fry, etc can come up in some of these generated takes... but its always just something that generates by chance. it shouldnt be hard for them to implement control over imperfections thats more than just a preset for the AI retakes, so i hope they will consider that (maybe ill straight up tell them at NAMM tomorrow.... i will have many questions lol). you could go into impossible detail with controlling a voice, to where some of it isnt practical of course, but with them adding a detail like mouth opening, i would hope to see it. i dont doubt there are possibilities in the new software we havent seen clearly yet, but something like that i would expect to be covered, so its a shame we didnt seem to get it.

in general tho i think this is a great step in a new but comfortable direction for the program. hearing them talk about how much of what makes a voice sound good and natural comes down to preference shows in the amount of detail you can go into, and in the changes they made to let voices have more variation with the tools they provide. so i hope the new program will make it even easier for them to keep focusing on and extending tools for that
 

pico

robot enjoyer
Sep 10, 2020
564
The performance improvements, the phoneme timing/strength changes, the improvement to soft/whisper voices, and the 4-axis vibrant/refined/stable/raw expression controls are very very important for me as an end user, so I am very pleased.

My perspective on it being marketed as "a new version" versus "an update" is that because there will be voicebanks with SV2-specific features, the new "version" methodology seems justified. I think it would be confusing to a consumer for there to be certain voicebanks that are fully functional on the version of the software you have installed, and some that are not fully interoperable, with no clear indication of why other than the version numbering. A lot can change in a technology and in its quality that isn't immediately visible to an end-user.

I think this is a look at a piece of software having a 'coming of age' moment. It's becoming more usable and starting to increase its capacity for personal expression as an instrument, which is absolutely the best angle, at least in my opinion. Folks who want quick results will still get them, and those results have gotten better, but there are more tools now, and a more intuitive way for enthusiasts to interact with the system under the hood. Some of these features will probably not be used by many people (mouth opening in particular), but I hope producers will dig into them deeply.

If 'raw' is truly an effective tool for bringing out the unique characteristics of training data, my biggest and most all-encompassing gripe with this suite will be addressed.

Two things struck me as a little odd. First, it seems like the engine got even more fixated on vibrato generation, which to me sounds quite unnatural, though perhaps testers thought otherwise. It works in very operatic and dramatic contexts, but not in all of them; perhaps this kind of performance is best suited to professional contexts. That's something I'm going to keep cranking way down for many projects, and it will continue to stand out to me in Synthesizer V renders published online. Second, I was surprised that Korean was nowhere to be seen. The strangeness of that announcement intensifies.
 

Infoholic

CEO of Chorical, LLC.
Mar 26, 2018
335
I must admit I was rather skeptical when SV2 was announced; not because I thought it would be bad, but I wasn't sure how it could improve enough to warrant an entirely new program. However, after watching the video, I was pleasantly surprised! They exceeded my expectations from a technical standpoint, as all the features and overhauls showcased are, in my opinion, long overdue.

For those of us who frequently use vocal synthesizers and compare different engines more than the average consumer (who may only own one voice on one engine), it had become increasingly clear that the infrastructure or "container" of SV1 was somewhat bottlenecking potential improvements. SV1's UX wasn't originally designed for AI, and with AI evolving so rapidly over the years, the interface became cluttered—something Kanru and Miguel specifically mentioned in the video regarding the multitude of ways to manipulate pitch, many of which have become obsolete and take up valuable space. Personally, I was never a fan of SynthV's UX—especially the tabs—and over time, the problem worsened. I'm glad to see the new editor offers a more simplified and streamlined approach to using the software.

The synthesis for Natalie specifically was insanely clear—you could hear her plosives and (for lack of a better term) "mouth noises." The vocal modes overall seemed more impactful, and the new support for timbre/pronunciation/pitch splitting across them is a welcome addition. I didn't notice as much of a difference with Kevin or Ayame, but Natalie absolutely blew me away.

SV2 seems to address several key points I was dissatisfied with in the platform:
  1. The overall editor layout and clutter.
  2. The lack of intuitive direction (particularly regarding phoneme timing and the abundance of different sliders).
  3. The increasingly noisy plosives over time (e.g., [k] and [p] often sounding like gusts of air).
Hearing about the processing improvements and thought process behind the update proves that this couldn't have been done within the current iteration of the software. My initial fear that SV2 might feel like an unnecessary upgrade—or worse, a cash grab—has now dissipated. I may be more excited for this update than I have been in a long time! That said, I don't think the video did the technology or the update justice. While Dreamtonics has improved their PR, a few aspects of the current rollout leave something to be desired.

1. The Video Presentation

The video's setting was nicely put together—where Kanru and Miguel were sitting looked professional—but the production quality left a lot to be desired. Despite watching in HD, the video still felt like it was recorded on a Nokia flip phone from 2013. The pacing and general vibe also felt off. Did we really need to spend so much time at the beginning discussing the number 5 and things like Windows XP? While referencing the original SynthV1 livestream from AHS was a nice full-circle moment, the callback segment didn't need to be that long.

Transitions between sections could have been smoother, but I believe the real issue lies with the speakers. It was hard to tell if the video was scripted or not, which gave off an impression of "just winging it." Whether scripted or spontaneous, the end result was a general sense of unpreparedness that didn't inspire confidence. The technology itself did a lot of the heavy lifting, making the interview segments feel like a distraction rather than an enhancement. Miguel, in particular, often felt like a bystander, offering generic reactions like "oo," "aa," and "wow that's so good"—even when the take wasn't particularly impressive. When Miguel did provide valuable input, Kanru seemed caught off guard, which leads me to my next point.

2. The Information (or Lack Thereof)

While we did get to see the software in action, it appeared somewhat bare-bones. Combined with the sudden announcement before NAMM and the lack of a concrete timeline or details for long-time customers, it feels like the lengthy intro segment was added to pad the video runtime. With so much left unknown, it makes me wonder if this rollout was premature. As Miguel pointed out in the video, the "three most important questions" were release date, price, and upgrade options.

Once Miguel asked about these "three most important questions," Kanru shut it down with a quick "We know people want upgrades, but we're not sure. Thanks for tuning in!" Showcasing the software without a concrete commercialization plan, combined with the earlier mentioned awkwardness and runtime padding, makes it harder to convince consumers that this upgrade is worth it. In fact, it may have the opposite effect, with some viewers feeling that "this could've been an email." I left the stream feeling more uncertain than when I started, despite the impressive technology—an opinion I know I'm not alone in.

This is purely speculation, but it feels like the rollout was rushed to ensure it was ready for NAMM (SV2 only has three timeslots in Dreamtonics' entire booth schedule). I think this was a mistake. The lack of a clear timeline for release puts current in-production voicebanks in a difficult spot—if a new voicebank releases next month, should people even bother purchasing it if SV2 is about to drop?

TL;DR

The technology speaks for itself and is undoubtedly a worthy successor, justifying a new application. However, the rollout so far is doing it a disservice, making it feel rushed for NAMM and raising many uncertainties. Uncertainties in commercial products do not inspire confidence. Dreamtonics should have waited until they had more concrete details on release dates, upgrade options, and pricing before starting this rollout.

(Oh, and apparently manual mode has been removed entirely, according to people at NAMM... As someone who prefers manual mode for complex and hands-on tuning, I'm not thrilled about this change.)
 

morrysillusion

v flower enthusiast
Jul 14, 2018
863
Socal
morrysillusion.net
i have just come home from NAMM myself and i really wish id been able to get my hands on it in the sense of, like, using it in depth. but that wasnt possible, not that id expect them to just let strangers go ham on a completely new beta software lol... the people were nice though, and they did a good job showing everything, and talked a bit more directly with me when i said id been using vocal synths for 10 years.

and yes, it seems "manual" mode is gone. there is still sing and rap, but it seems their intention is for pitch control to be so integrated, not separated as before, that converting things to "manual" isnt uh, necessary? especially with how pitch drawing/nodes work? i can see what theyre trying to do. it seems most pitch control is on the timeline/pianoroll, on the notes directly, and i didnt see any "nodes" in the way we have them in SV1. so in that regard "manual" mode wouldnt really make sense in the same way we have it now. if people want "manual" in the sense of 'turning off' the auto pitch generation and leaving it flat for tuning, that still seemed possible with some of the slider stuff they had there. hard to express what im thinking and trying to say based on what i saw when i dont have it in front of me lol...

sadly there wasnt too much else there that isnt already covered in the video. i did bring up control over the "imperfections" that can come from AI retakes-- voice breaks, vocal fry, etc. their message to me right now really seems to be "make this easy for people". i, as much as many others, would love detailed control and the ability to induce vocal fry or growls or whatever when we want, and im not arguing against that; it would be great to have. but SV, before and now, has always been very focused on making the software quick and easy for new users to get good results with. the AI retakes will generate those things by chance, but there isnt specific control, and honestly, as it stands, i can understand their focus and why that isnt a thing currently. still, mouth openness is a unique parameter to include, and i hope we will see expansion into other singing details like that.
 

Alphonse

Aspiring Fan
Mar 13, 2021
41
i, as much as many others, would love detailed control and the ability to induce vocal fry or growls or whatever when we want, and im not arguing against that; it would be great to have. but SV, before and now, has always been very focused on making the software quick and easy for new users to get good results with. the AI retakes will generate those things by chance, but there isnt specific control, and honestly, as it stands, i can understand their focus and why that isnt a thing currently. still, mouth openness is a unique parameter to include, and i hope we will see expansion into other singing details like that.
I don't think the existence of depth for people who want it gets in the way of the software being easy to use for people who don't.
 

morrysillusion

v flower enthusiast
Jul 14, 2018
863
Socal
morrysillusion.net
I don't think the existence of depth for people who want it gets in the way of the software being easy to use for people who don't.
thats definitely not something i am trying to say at all.

what i am saying, in its simplest form, is that the state they have shown synth v 2 in right now fits the goal they seem to have in mind overall and for its showcase, so i can see why we dont have the more in depth features people keep asking for, and why im not surprised it wasnt brought up.

i think their priority is ease of access to fast, clear, high quality production for people unfamiliar with this tool. but i did not say that more in depth tools for experienced users shouldnt exist, or that they would get in the way of that other goal (i am, after all, one of those advanced users who would like that in depth stuff). imo i expect that kinda stuff to come up once SV2 has been released and feedback from experienced users starts flowing, i hope
 

Infoholic

CEO of Chorical, LLC.
Mar 26, 2018
335
I mean... at this point Synth V one came out seven years ago. Eventually, they were going to make a successor.
If you're talking about the original SynthesizerV (R1), yes, it came out 7 years ago, but SynthesizerV Studio (R2) was already the successor to that, and it's about to turn 5 in a couple of months. Couple that with the fact that AI was not released as a separate successor to R2, and it's easy to see why some would be confused or say "why not just add it to the current SV like always."
 

lIlI

Staff member
Administrator
Apr 6, 2018
951
The Lightning Strike
Joez on Twitter brought up a few things in more depth:

To create an equivalent to manual mode, you set the voice to Stable in the Expressions panel and reduce the vibrato slider; this flattens the auto-pitch. My impression is that manual and auto-pitch are now part of one spectrum rather than two separate options (a bit like Voisona's auto-pitch slider).

Standard banks are still compatible!

AHS also announced that Miki and Kiyoteru are being developed for SV2, and that this was the cause of their delay.

Overall I think it's impossible to judge the workflow at this stage, I suspect the challenges and advantages of it will become clear when people start using it.
 
