tl;dr CeVIO's updated deep-learning engine has DAW integration, balances quality and render time far better than before (rendering a 5-minute song went from taking 10 hours to now running on a laptop), will probably be called "NeoCeVIO", and the vocals appear to be getting "Neo" names (e.g. "Neo Sasara").
Note:
For those of you out of the loop, check out the English/Japanese/Chinese demos from December 2018 here: Reproducing high-quality singing voice
For the most recent demo using English/Japanese, check out this full length song ("Itsuka Kanarazu"): New singing synthesis demo from CeVIO developer Techno-Speech
On October 9th, Techno-Speech teased the upcoming deep-learning-based "NeoCeVIO" (temporary name) at Meiji Kinenkan (a historic venue often used for parties and weddings). The event was free to the public and featured posters and a software demonstration.
Kazuhiro Nakamura is a researcher at Techno-Speech who staffed the booth.
On the linked page, 総務省「ICTイノベーションフォーラム2019」の開催 ("The Ministry of Internal Affairs and Communications holds 'ICT Innovation Forum 2019'"),
it explains that the event was part of "ICT Innovation Forum 2019", hosted by the Ministry of Internal Affairs and Communications to show off technology research and development in the telecommunications sphere.
(Eji is a person who collects a lot of information from Miku/Crypton-related events.)
Eji says that other potential names for the new CeVIO are: CeVIO AI (a name Techno-Speech has apparently used before), CeVIO Pro (heard by a user called PSGOZ), and NeoCeVIO (heard at the event by kM4osM, pronounced "kurosu"). (I'm going to call it NeoCeVIO, because that's what was heard at this event a few days ago.)
The goal of the update is to give the voices more diverse expression. The other points Eji noted are beyond my understanding of vocal synthesis, so I won't try to rephrase them.
(Chiteico is a person related to the Synth V sphere and frequently talks to Amano Kei about it.)
Chiteico thinks that while VOCALOID:AI is aimed at pros, NeoCeVIO is aimed at "DTMers" (the Japanese term for Desktop Musicians who make MIDI music)/Vocaloid producers.
KM4osM says that NeoCeVIO's strong point is the balance it strikes between voice quality and synthesis speed. Tilting that balance too far in either direction changes the product drastically (e.g. very high quality but very slow, or very fast but low quality).
With this tweet, we will move on to KM4osM's blog post. They appear to be the only member of the vocal synth tinfoil hat brigade who actually went to the event and took pictures to share on Twitter. Kazuhiro Nakamura said many people showed up, but I could not find any other tweets besides the ones I shared.
(This is a summary of it, not word for word because of time constraints.)
In 2018, Techno-Speech showed their deep-learning-based vocal synthesis that sounded human.
Upon seeing Kazuhiro Nakamura's tweet, KM4osM felt an obligation to go to the event.
Meiji Kinenkan had a regal air about it, which made sense on account of it being a Ministry of Internal Affairs and Communications event.
At the venue, this was the booth (Nakamura was there):
IT HAS DAW INTEGRATION!!!!!!!!!!!!!!!!! (#1 impression) (It was being used with REAPER at the event.)
It seems that NeoCeVIO is closer to being a full product that can be used with a DAW (and maybe standalone?).
On top of that, NeoCeVIO was even running on a laptop!
Around March of last year, it took 10 hours to render a high-quality 5-minute song ("Itsuka Kanarazu", linked at the start of this post). The render time has since dropped dramatically.
[The main question]
From what KM4osM could see, NeoCeVIO (temporary name) seemed quite complete. (Apparently the singing sounds good even without tuning, so you could compose late at night and feel like "the future is here".)
The parameters it had were volume, pitch, timing (duration), and vibrato, so it isn't very different from the current CeVIO. KM4osM wasn't sure whether more parameters will be added or the existing ones adjusted, since this is a temporary version.
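To make that parameter list concrete, a note in this kind of editor can be thought of as a small record carrying those four values. The sketch below is purely illustrative: it is my own hypothetical data model, not CeVIO's actual internals or API, and every name in it is made up.

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    """Hypothetical note event carrying the four parameter types
    mentioned above (volume, pitch, timing/duration, vibrato).
    Not CeVIO's real data model."""
    pitch: int                   # MIDI note number, e.g. 60 = C4
    start: float                 # onset time in seconds (timing)
    duration: float              # note length in seconds
    volume: float                # 0.0-1.0 dynamics value
    vibrato_depth: float = 0.0   # vibrato excursion in semitones
    vibrato_rate: float = 5.5    # vibrato speed in Hz

# A two-note phrase, edited the way a user might tweak parameters:
phrase = [
    NoteEvent(pitch=60, start=0.0, duration=0.5, volume=0.8),
    NoteEvent(pitch=62, start=0.5, duration=1.0, volume=0.7,
              vibrato_depth=0.3),
]
print(len(phrase))  # → 2
```

The point of the sketch is just that the current parameter set is small and per-note, which matches KM4osM's observation that it isn't far from the existing CeVIO editor.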
At the booth, "Neo Sasara" (temporary name) sang a cover of "Ai o Komete Hanataba o" by Superfly.
^ This is not the Neo Sasara version, this is the real human version.
It seemed like Neo Sasara could sing with a calm tone of voice, and you could feel the expression in her singing. The pitch varied naturally, and there was subtle "shakuri" (a Japanese singing technique where you attack slightly below the note's pitch and ease into the "correct" pitch).
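To make "shakuri" concrete: the voice attacks slightly flat and glides up into the target pitch. Below is a toy numeric model of that kind of onset. It's my own illustration of the technique, not how NeoCeVIO actually generates pitch.

```python
import math

def shakuri_contour(target_semitone, depth=1.0, glide=0.15,
                    length=0.5, step=0.05):
    """Toy pitch contour: start `depth` semitones below the target
    and ease up into it over `glide` seconds (illustration only)."""
    contour = []
    t = 0.0
    while t < length:
        if t < glide:
            # cosine ease-in from (target - depth) up to target
            progress = 0.5 - 0.5 * math.cos(math.pi * t / glide)
            pitch = target_semitone - depth * (1.0 - progress)
        else:
            pitch = target_semitone  # settled on the "correct" pitch
        contour.append(round(pitch, 3))
        t += step
    return contour

curve = shakuri_contour(60.0)
print(curve[0], curve[-1])  # → 59.0 60.0 (flat attack, correct finish)
```

A synthesis engine with convincing expression would produce curves like this (and far subtler ones) automatically, which is presumably what made the demo sound natural without manual tuning.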
[When will it become a product?]
We don't know. The engine seems fine, but there were things that needed work with the GUI.