

sapplerx

Dec 26, 2020
Yeah, it seems like major editor versions come out roughly every four years, and it's been ≈2.5 years since V6. Multiple companies have also been caught unaware of upcoming editor updates while updating their voicebanks in the past:
V1: 2004
V2: 2007 (3 years)
V3: 2011 (4 years)
V4: 2014 (3 years)
V5: 2018 (4 years)
V6: 2022 (4 years)
I guess we'll get V7 in 2026 then...
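
The extrapolation is easy to sanity-check; a quick Python sketch of just the gaps in the list above:

```python
# Just arithmetic on the release dates listed above.
years = {"V1": 2004, "V2": 2007, "V3": 2011, "V4": 2014, "V5": 2018, "V6": 2022}

dates = list(years.values())
gaps = [b - a for a, b in zip(dates, dates[1:])]  # [3, 4, 3, 4, 4]
avg = sum(gaps) / len(gaps)                       # 3.6 years on average

print(f"average gap: {avg:.1f} years")
print(f"naive V7 estimate: {dates[-1] + round(avg)}")  # 2022 + 4 = 2026
```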
 
Guess we're getting a whisper parameter for AI vbs soon
[Update Information] VOCALOID6 updater ver6.7.0 will be released soon.

This update will add support for "silent sound (whisper)" in AI Voicebanks.

Raising the Air parameter above a certain level will lower the voiced sound in addition to raising the breath component. Also, raising the Air parameter to 100 (the maximum value) makes it completely silent.
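
Reading between the lines, this sounds like a crossfade: Air keeps raising the breath level, and past some threshold it starts attenuating the voiced component too, with "completely silent" presumably meaning fully unvoiced (whisper only). A toy sketch of that reading; the 0.5 threshold and the linear fades are pure assumptions, not Yamaha's actual code:

```python
def air_mix(voiced, breath, air, threshold=0.5):
    """Toy interpretation of the described Air behavior.

    Below the (assumed) threshold, Air only raises the breath component;
    above it, the voiced component also fades, hitting zero at air == 100,
    which would leave only breath, i.e. a whisper.
    """
    t = air / 100.0
    breath_gain = t                  # breath rises with Air
    if t <= threshold:
        voiced_gain = 1.0            # voiced sound untouched below threshold
    else:
        # fade the voiced component out linearly between threshold and 100
        voiced_gain = 1.0 - (t - threshold) / (1.0 - threshold)
    return voiced_gain * voiced + breath_gain * breath
```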
 

Vector

Mar 6, 2022
I don't think they take advantage of any specialized hardware acceleration. (Too bad, because Apple ships ML accelerator hardware in their systems.) They're very fast to render, though. Usually I edit a note, the little render progress bar zooms by in about half a second to a second, and it's ready to play.

Typically you need beefy hardware to train models, but just running inference on them is fairly resource-light.
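
A hypothetical PyTorch sketch (nothing to do with Yamaha's actual engine) of why a single forward pass is so cheap compared to training:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model; the real V6 network is unknown. The point is
# only that inference needs no gradients, optimizer state, or big batches.
model = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 80))
model.eval()  # switch dropout/normalization layers to inference behavior

frames = torch.randn(1, 200, 64)  # one short phrase: 200 frames of features

with torch.inference_mode():      # skip building the autograd graph entirely
    mel = model(frames)           # single forward pass -> (1, 200, 80)
```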

I'm actually really curious about how the V6 engine works under the hood, and I wish they'd publish a paper or something like they did for the original Vocaloid. The existence of vocalo-changer and the relative file size of the editor seem like a hint... I wonder if there's a set of internal "carrier" samples that the editor uses, like a traditional sample-based bank, and then it alters the timbre with the voicebank's ML model. Or if the editor's phoneme/pitch signals just provoke the model to emit whatever it's been trained to produce as an "ah" sound or whatever.
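
To make the two guesses concrete, here's what they'd look like side by side; this is entirely speculative, and every name and function below is a made-up stub, not anything from the actual editor:

```python
import numpy as np

SR = 44100  # assumed sample rate; everything below is a speculative stub

class StubModel:
    """Stand-in for a voicebank's ML model; the real component is unknown."""
    def convert_timbre(self, audio, pitch):
        return 0.5 * audio  # pretend voice-conversion / timbre transfer
    def synthesize(self, phonemes, pitch):
        t = np.arange(len(phonemes) * int(0.2 * SR)) / SR
        return np.sin(2 * np.pi * pitch * t)  # pretend direct synthesis

# Guess A: splice internal "carrier" samples like a traditional bank, then
# re-timbre them with the model (which would also explain vocalo-changer).
carriers = {p: np.random.randn(int(0.2 * SR)) for p in ("a", "i", "u")}

def hypothesis_a(phonemes, pitch, model):
    carrier = np.concatenate([carriers[p] for p in phonemes])
    return model.convert_timbre(carrier, pitch)

# Guess B: the model maps phoneme/pitch control signals straight to audio,
# emitting whatever it learned an "ah" sounds like.
def hypothesis_b(phonemes, pitch, model):
    return model.synthesize(phonemes, pitch)

print(hypothesis_a(["a", "i"], 220.0, StubModel()).shape)  # (17640,)
print(hypothesis_b(["a", "i"], 220.0, StubModel()).shape)  # (17640,)
```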
 
