Cryptonloid voicebank updates, collabs, & concert news (crypton_wat Twitter translations)

Wario94

Passionate Fan
Jan 5, 2019
The first concert for FRIDAY isn't until:
1 pm~2 pm JST (10 pm CST on the 8th)
and
6 pm~7 pm JST (3 am CST on the 9th).
(It's only 7 am in Japan right now, so no concerts have actually started, and I am not going to stay up late to find out what happens.)

On SATURDAY in Japan, the concerts will be:
1 pm~2 pm JST (10 pm CST on the 9th)
and
5 pm~6 pm JST (2 am CST on the 10th).
Okay, for those who live on the Pacific Coast like myself, the times go like this:
The first concert starts on February 8th at 8:00 p.m. PST, the second starts on February 9th at 1:00 a.m. PST, the third starts on February 9th at 8:00 p.m. PST, and the fourth and final concert starts on February 10th at 1:00 a.m. PST. Did I get it right?
Also, congratulations on reaching the 100th post for this thread!
 

Wario94

Passionate Fan
Jan 5, 2019
Okay, it's past 9:00 p.m. PST, so I would like to ask: has Crypton finally told their fans about the new Appends yet?
 

uncreepy

😱
Apr 9, 2018
Presently, we are meeting with Warner [Music Japan] about Miku Symphony, but... we've been continuing to look at #KAITObirthday2019 Tweets, and are now discussing MEIKO & KAITO's relationship. Doing our best with the budget!
Based on the new Miku Symphony poster, it says that Miku and Meiko (among others) will mainly be hosting the event. So that's probably what the budget is related to.


Bonus note: The huge Meiko x Kaito fan that dissed Wat several times before about him not treating Meiko and Kaito the same as the Character Voice Cryptonloids is pleased.
 

uncreepy

😱
Apr 9, 2018
Wat being cryptic about the bumpy progress for the Cryptonloid Appends:
[Monologue] Let's suppose I had mixed feelings of love and hate for the Vocaloid synthesis engine.... Rather than cheating with something similar or two-timing with a substitution, start by being able to correctly sublimate* the way to date a WAV. Just a bit longer and it seems we'll be able to edit the waveforms into a Vocaloid-y ideal.

※ I'm sorry that I can't digest this and that even I am uncertain of the meaning.
*sublimate = divert/channel something into something better

I think he's talking about cheating with something similar/using a substitution like how IA is on CeVIO and Vocaloid.

The WAV/waveforms = the phoneme audio clips they've been editing to sound "Vocaloid-y" and "not necessarily human" in order for listeners to be able to easily understand what the characters are singing.

The "Vocaloid-y ideal" is probably Wat's several-year-long goal of making their Vocaloids sound like Vocaloids rather than completely human, basically. (They have been tweaking Meiko and Kaito's V3 voices and Luka/Rin/Len/Miku's V2 voices in order to get rid of muddy pronunciation, and it doesn't seem like this is going very well. Saying they're coming "soon" could mean months from now; Wat's always late and changes his mind a lot. And the fact that he doesn't even understand the cryptic stuff he's saying makes it seem like their goal is somehow unclear/has an uncertain fate.)


Notable replies:
takamin39hi: The eternal challenge is waveform smoothing rather than neatly synthesizing phonemes, huh.

Eji: Just a bit longer, huh...?
 

Exemplar

Enthusiast
May 17, 2018
😅 Yeah. It sounds like the thought crossed his mind, but he's deciding not to do it after all. It doesn't give me confidence when the boss man admits to wanting to jump the Vocaloid ship.
I still stand by the notion of CFM wanting to make their own engine at some point. Wat probably saw what Kanru was able to do on his own and thought, "hey, I've got more staff than he does, and we certainly have more money & resources too, we could pull this off." Don't be shocked to see Kanru tweet some photos from the Crypton offices in the future.
 

uncreepy

😱
Apr 9, 2018
Wat alerts us to Crypton's Labopton BLOG being updated:
New vocal effector / Introduction to dealing with vocal synthesis

My translation of the blog post:
[start translation]

New vocal effector / Introduction to dealing with vocal synthesis

Hello, everyone. This is T.Ryo at Crypton; I am involved with vocal synthesis-related research and development.

Although we haven't shown it publicly much until now, at our company we have been working on new voice technology development using things like vocal signal processing and deep learning.

Since around last year, several of our research and development members have started attending voice-related academic conferences to check out the latest research.

Lately, we were able to observe things like vocal synthesis using deep learning and research on real-time conversion of voice qualities. So we think there are many voice analysis/synthesis technologies that can be used.

As a matter of fact, our company is also advancing its own voice analysis/synthesis technology and vocal effector research and development.

So in this post, out of the technology we have developed so far, we will introduce two VST vocal effectors.

An effector called "VOCAL DRIVE" that changes the tone of voice
⦁ It simulates the phenomenon that occurs when a voice is distorted, assigning a distortion effect to the voice.
⦁ Depending on the setting, effects ranging from a soft rough voice, to a pop growl, to a death growl can be produced.


A voice analysis synthesis effector called "CHERRY PIE"
⦁ It is an effector that uses real-time high quality voice analysis synthesis technology.
⦁ Through things like unrestricted pitch control, the spectral envelope, and modification of aperiodicity indicators, the voice can be greatly altered.
⦁ Voice quality conversion is achieved through deep learning; by loading various network files, the converted voice can be changed to sound like the voices of different people.


Demonstration Movie
Now, please watch the effector being practically applied in the demonstration movie.
By the way, this movie was shown to the public at ADC (Audio Developers Conference) in London on 11-20-2018. (For that reason, this is an older version than the current one.)


Translator's Note: Here is a screen shot of the settings demonstrating VOCAL DRIVE:

How was it?
We think you could tell how, with the voice quality transformation using deep learning in the final part, the voice is altered to sound like other people.
This is a bit technical, but in the voice quality transformation demonstration in the movie, the female voice input and the male voice input both use the same conversion model (network file), so whether a female voice or a male voice is inputted, we think the altered voice doesn't resemble the original one.
So we may be able to develop the technology to be like "no matter who is singing, the voice quality becomes like 〇〇"!
That was the introduction of two effectors, but we have already developed more than 10 types of effectors being used for voice production.
We are currently considering things like announcements and commercialization of the technology we have been working on. Thank you for your continued support!
[/end translation]
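To make the VOCAL DRIVE description above a little more concrete: distortion effectors of this kind are usually built around waveshaping, where a "drive" setting pushes the signal harder into a nonlinearity. Below is a minimal, generic sketch of that idea in pure Python; the function name, the tanh curve, and the drive values are all my own illustration, not anything Crypton has published.

```python
import math

def vocal_drive(samples, drive):
    """Toy waveshaping distortion via tanh soft-clipping.

    'drive' controls how hard the signal is pushed into the
    nonlinearity: low values give a soft, rough edge, while high
    values flatten the peaks toward harsh, growl-like clipping.
    Output is normalized so peaks stay within [-1, 1].
    """
    return [math.tanh(drive * s) / math.tanh(drive) for s in samples]

# A pure sine "voice" pushed through increasing drive settings
sine = [math.sin(2 * math.pi * 220 * n / 44100) for n in range(128)]
soft = vocal_drive(sine, 1.0)    # gentle saturation (soft rough voice)
hard = vocal_drive(sine, 10.0)   # near-square peaks (death-growl territory)
```

The point of the sketch is only that one continuous parameter can sweep from "soft rough voice" to "death growl", which matches the range the blog post describes.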

My thoughts:
  • The voice coating seems way more natural compared to Crypton's other demos. Hopefully this is what ends up in the new Piapro.
  • I think it sounds like Miku is in the demo video at 1:35? If so, her English seems to have improved a lot. (Unless editing the pitch of the singers just made the voice sound very Miku-like.)
  • I think the female singer sounds like the samples that came with V5, but I checked and couldn't find any of the lines sung in the presets, so maybe I'm imagining it.
  • I checked on my old translations, and around November 20th of last year, they were working on Rin and Len's "patterns" (whatever that means), and then 5 days later said they were working on Kaito. I guess it really does seem like this new tech works on both male and female voices (because the male and female singer were edited to sound identical in the demo) like Wat alluded to before.
  • Hallelujah for death growl! 🙌
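For anyone wondering why CHERRY PIE's three parameters matter: analyzers in this family (WORLD-style vocoders, for example) split a voice into pitch (f0), spectral envelope (timbre), and aperiodicity (breathiness), which is exactly why pitch can be edited without changing who the voice sounds like. Here's a toy sketch of just that data flow; the class, field names, and numbers are my own invention, and the actual analysis/synthesis math is omitted entirely.

```python
from dataclasses import dataclass, replace

@dataclass
class VoiceFrames:
    """Per-frame parameters a WORLD-style analyzer would extract."""
    f0: list            # fundamental frequency per frame (Hz, 0 = unvoiced)
    envelope: list      # spectral envelope per frame (timbre / identity)
    aperiodicity: list  # noise/breathiness ratio per frame

def shift_pitch(frames, semitones):
    """Scale f0 while leaving envelope/aperiodicity untouched, so the
    pitch changes but the timbre (who it sounds like) does not."""
    ratio = 2 ** (semitones / 12)
    new_f0 = [f * ratio if f > 0 else 0.0 for f in frames.f0]
    return replace(frames, f0=new_f0)

# Hypothetical analyzed phrase: three voiced frames and one unvoiced frame
frames = VoiceFrames(f0=[220.0, 233.1, 246.9, 0.0],
                     envelope=[...], aperiodicity=[...])
up_octave = shift_pitch(frames, 12)   # f0 doubles; unvoiced frames stay 0
```

Swapping the envelope for one predicted by a trained network (the "network files" the post mentions) is, roughly, how the same pipeline turns into voice conversion.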
EDIT:
Wat retweeted this article called "Vocals of a different person with deep learning / Crypton's new technology can even be used for BABINIKU?"
Babiniku バ美肉 = Virtual (ba) bishoujo/beautiful girl (bi) incarnation (ku) バーチャル美少女受肉
Which is when a dude is a cute anime girl VTuber (even though they still have a man's voice). So the writer thinks these guys will be able to convert their voice in real time into a cute girl's. Yay! I'm excited.
 

Exemplar

Enthusiast
May 17, 2018
Lately, we were able to observe things like vocal synthesis using deep learning and research on real-time conversion of voice qualities. So we think there are many voice analysis/synthesis technologies that can be used.

As a matter of fact, our company is also advancing its own voice analysis/synthesis technology and vocal effector research and development.
* coughcoughcoughTheyNoticedKanru'sResearchLikeISaidAFewDaysAgocoughcough *
 

mobius017

Aspiring ∞ Creator
Apr 8, 2018
Wow, those are some pretty badass tools. Sounds like they've been working with them for a while. I wonder if they've been using them in some form to produce their voices, but not releasing them to the public, or if they're actually newer than that--the article doesn't really make that clear (at least as I read the translation).

It would be awesome to see that stuff in a new Piapro...or it could be they'll sell them as standalone offerings?

Loving the different flavors of growl--I'll be interested to see if these tools replace it, or at least provide a maybe easier-to-use alternative.
 

uncreepy

😱
Apr 9, 2018
I talked on the Discord about this news, looked back at old Wat tweets, and thought about this some more and have some more stuff to say...

The blog post said both VOCAL DRIVE and CHERRY PIE are VSTs. VST is another word for a plug-in. Wat has been saying the "plugin" (aka VST) was more sensitive than VocaListener, it was for automatic tuning, and that the plugin controlled the pronunciation-- all of which CHERRY PIE demonstrated. I now believe that this whole time, plug-in didn't mean like the ones used in VOCALOID4 Editor (like Idol Style, V3KeroPitch, ZOLA Unison) that are now unsupported in VOCALOID5 Editor, but rather more modern DAW-style VSTs you can further edit the instrument with (like reverb, gain, distortion).

Piapro and VOCALOID5 EDITOR are VSTis (virtual instruments-- stuff like synths and virtual pianos that generate the sounds and are further edited through VSTs). The VSTs that are inside of VOCALOID5 EDITOR are de-esser, equalizer, compressor, gain, reverb, distortion, chorus, phaser, tremolo, auto pan, and delay. The blog post about the new Piapro (I'm just gonna say it's still going to be called Piapro) said that they were writing about 2 VSTs (VOCAL DRIVE and CHERRY PIE) and that there were actually around 10; I just listed 11 from VOCALOID5 Editor. So I believe that this demonstration was using a new Piapro VSTi with modern features for editing the sound further, like VOCALOID5 Editor has with its Audio Effect section (so maybe these features are in pop-ups in the new Piapro).
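The VSTi/VST split being described here boils down to: one instrument generates the audio, and a chain of effects processes it afterward. A toy model of that host-side wiring, in pure Python (every name here is invented for illustration; real plugin hosting goes through the Steinberg VST SDK):

```python
import math

def vsti_render(num_samples):
    """Stand-in VSTi (instrument): *generates* audio from note data,
    the way Piapro/VOCALOID5 synthesizes a singing voice.
    Here it just emits a 440 Hz sine at 44.1 kHz."""
    return [math.sin(2 * math.pi * 440 * n / 44100) for n in range(num_samples)]

def gain(amount):
    """Stand-in VST (effect): *processes* audio it is handed."""
    return lambda buf: [amount * s for s in buf]

def soft_clip(buf):
    """Another stand-in VST, a crude distortion like VOCAL DRIVE."""
    return [math.tanh(s) for s in buf]

# The host chains them: one VSTi as the source, VSTs stacked after it,
# and every effect in the chain is optional.
effect_chain = [gain(2.0), soft_clip]
audio = vsti_render(64)
for effect in effect_chain:
    audio = effect(audio)
```

That optionality is the key point: the instrument sings the same either way, and the effects only reshape what comes out of it.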


I am going to call the VocaListener-esque plug-in the CHERRY PIE VST unless some other news happens.

So, even though the VSTs demonstrated seem shocking, I think that this is still Vocaloid:
  • The VSTs (like reverb, gain, equalizer, and Crypton's VOCAL DRIVE) are optional add-on pop-ups that just edit the voice to sound funky after you're done with it, you don't HAVE to edit Vocaloid singing at all.
  • CHERRY PIE is related to deep-learning in order to tune the Vocaloid's voice based on what it has learned from human samples-- which is also optional, you don't HAVE to tune using it at all, you can draw your own tuning and choose to not make them sound realistic. (That would be like saying you have to use VocaListener in VOCALOID4 Editor in order to make a song.)
  • So that leaves the question of "What would the sound source for Miku be?" In my humble opinion, it would have to be Vocaloid, because Wat said in his monologue on the 24th that they were still using the Vocaloid engine (even though it had complex issues), Wat had a nightmare in July about getting crushed by a giant Vocaloid Editor, the fact that they're editing the Cryptonloids V2 and V3 voice samples, Crypton was on that net history documentary about popularizing Vocaloid for the review of the Heisei era-- Crypton IS the face of Vocaloid.
  • To summarize, I think it's:
    VSTi (Piapro, which will probably be standalone and also connect to a DAW) = Vocaloid engine operating through updated Piapro for synthesizing Cryptonloid voices > VST plugins (VOCAL DRIVE, CHERRY PIE, normal VST editing stuff like reverb) = inside the Piapro VSTi to optionally make the voice sound loud/funky/echoey/realistic
 

mobius017

Aspiring ∞ Creator
Apr 8, 2018
If these two plugins are this good, I've gotten really interested in what the other 8 do, and if/when we'll ever see them. Either they led with some of their best, or probably (you'd think) the ones that are the most polished/finished. And the ones that are left either 1) are unpolished, 2) do little niche things, or 3) do really far-out stuff that will take more development time.

Cherry Pie has to still be a codename. I wonder where it came from/what the final name will end up being.... By its nature, the plugin doesn't seem like it would be related to a particular Cryptonloid, and none of them are associated with cherries AFAIK. Luka's pink, and her anniversary's Mar. 19, but I'm not sure if that's related. Maybe it's some kind of metaphor for how somebody thought of how the thing works/its AI component somehow... Or maybe they just picked a random dessert, like Google seems to for its Android OS version names.
 

GreenFantasy64

カイミクfangirl
Apr 9, 2018
Wat had a nightmare in July about getting crushed by a giant Vocaloid Editor
:arsloid_ani_lili::ring_ani_lili:
Oh gosh, Wat is slowly losing his head....

But the Vocal Drive is very interesting.
Depending on the setting, effects ranging from a soft rough voice, to a pop growl, to a death growl can be produced
Like... O~oh, pop growl to death growl. :mirai_ani_lili: I won't really use death growl a lot perhaps, but pop growl, yes, probably.
 

uncreepy

😱
Apr 9, 2018
This isn't a Wat tweet and he didn't retweet it, but the Sony Store in Sapporo did a collab with Miku, here she is talking about it:


I feel like the way she talks is a bit more emotive and the intonation is better compared to previous talking examples. Maybe they're still improving their voice coating.
 
This isn't a Wat tweet and he didn't retweet it, but the Sony Store in Sapporo did a collab with Miku, here she is talking about it:


I feel like the way she talks is a bit more emotive and the intonation is better compared to previous talking examples. Maybe they're still improving their voice coating.
yeah it sounds a lot better on that front, but also it sounds a lot clearer than the other samples i've heard. like it seems like the software coats their voice in a bunch of static noise or something, so it usually sounds more expressive but is also coated in this layer of noise. is that accurate or am i just imagining that?

EDIT: i have listened to some older ones and it seems like the talkloid videos are very clear and the singing ones with cherry pie are not clear at all but impressive.
 
