
Other Cryptonloid voicebank updates, collabs, & concert news (crypton_wat Twitter translations)

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
A trailer for the Blu-ray and CD of the last Miku Symphony was released, and it has clips of Miku and Luka (and Teto) talking. It's the same method where Miku got her voice coated for the VocaListener-esque plugin while using one of her new Append voices, so I assume that they are also using one of Luka's Append voices. It's kind of choppy like the last Miku clip I shared (for the Sony Store collab), but at least we finally get to hear Luka's voice with it (other than that really hard-to-hear clip inside the shop for her anniversary). However, even in this clip, it is hard to hear with the concert hall echo, so I recommend headphones.


Here are all the times Luka speaks:
1:55-2:09
3:14-3:34 (there's a human choir singing with her)
8:13-8:22
8:31-8:56

On a side note, I'm not really satisfied seeing these animations. (The models are cute, but the animation itself needs work.) The lighting is blown out, their feet don't touch the ground while walking, their hands are hanging down in a weird pose with splayed fingers, they don't blink at all while walking. When they turn to look at each other, they turn their entire bodies like some sort of really old RPG. Hrng... :alien:
 

GreenFantasy64

カイミク || Len English || Arsloid || V5/Piapro
Staff member
Moderator
Apr 9, 2018
667
soundcloud.com
Luka does sound a bit choppy, but I think she sounds much better than Miku. Now I want to hear Kaito's voice. :kaito_ani_lili:

With Wat not talking on Twitter, he and the crew must be really busy (until Golden Week). Still, I wish we knew what they were working on right now: Piapro Studio, one of their voicebanks, box designs, etc.!
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
I was looking up Crypton and Cherry Pie on Google and found something to translate: The LinkedIn profile of a developer at Crypton named Ryo, who is directly involved with working on Cherry Pie (Crypton's real-time voice converter based on deep learning).

Source
He currently works at Crypton (2013 - present), and went to Hokkaido University (2003-2012).

Summary
Currently conducting voice-related research and development, such as real-time vocal analysis synthesis technology and deep learning voice conversion.
Also putting these technologies to practical use as VST/AU audio plug-ins (effectors).*

At graduate school, I did natural language processing (NLP) research.**
I am an expert in computer science.

I make roast beef 2x a week.

Projects
Vocal effector plug-in development
An effector that distorts the voice: "Vocal drive"
A real-time vocal analysis synthesis effector that uses deep learning to convert voice types: "Cherry Pie"
I was in charge of these effectors from the initial idea, through planning the algorithm and the signal processing system, all the way to implementing the GUI.
* (Source) An audio plug-in, in computer software, is a plug-in that can add or enhance audio-related functionality in a computer program. Such functionality may include digital signal processing or sound synthesis. Audio plug-ins usually provide their own user interface, which often contains GUI widgets that can be used to control and visualise the plug-in's audio parameters.

**(Source) Natural language processing = a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.

----
This is Ryo's Twitter: https://twitter.com/Ryo_Jerky
It hadn't been updated for a long time, but started to update around the time that Cherry Pie was announced. I wonder if he wrote Crypton's blog post about it? He talks in detail about how measuring pitch is extremely important in vocal analysis synthesis (vocal coating) and also talks about Deep Neural Networks. I will be translating or summarizing some tweets later. I have been doing a lot of personal translation projects and got behind on sharing my findings, and I just want to get this initial information out sooner rather than later.

Lastly, I assume since Ryo conceptualized Cherry Pie, he might be responsible for the strange name behind it? I wonder if I could @ him and ask at some point? : P
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
I asked Ryo if he would tell us how Cherry Pie got its name and he actually replied!

[We] called it a name that included the meaning of "easily usable".

cherry pie: <American slang> A thing that is easily able to be done
I know of "easy as pie", but I never heard of cherry pie having anything to do with easiness. Cherry or cherry pie can be an adult-related slang... so... er... maybe that's what easy came from? 😱 This is one of those bad translation kind of situations, but I'm not gonna be the one to break the news to him. Not sure how to reply... like, simply "Thanks for explaining! I'm looking forward to Cherry Pie"???
 

DefiantKitsune

Lonely kanon fan
Apr 11, 2018
622
In all fairness, "Pie" would be a pretty bland and generic name
 

razelberii

Bless the Lord, O my Soul
Apr 8, 2018
423
雨リカ
razzyru.com
I would like to ask: which one is easier, VocaListener or Cherry Pie?
It is perhaps difficult to answer at this time, as Cherry Pie has not been released.
They both function a bit differently as well, so it is a bit difficult to compare. I have not used VocaListener, but from my understanding, it affects the VSQ and can help almost tune the VSQ for you, based on a recording of vocals.
For Cherry Pie: From my understanding, there is no VSQ. It appears to work on wav/mp3/audio files, which means you cannot manipulate it with a VSQ (think of an exported wav file from the Vocaloid Editor).
So, editing the audio is most likely done with an effect VST. Based on the video, it seems to be easy to change the parameters.
In that sense, Cherry Pie may be easier in comparison, as parameters can easily be manipulated, and VocaListener may require an audio recording.

Hopefully that helps a bit, sorry I can't give much further beyond this. I have not used either VocaListener or Cherry Pie. They both appear useful, but are built for different functions:
VocaListener - VSQ based on audio
Cherry Pie - VST for audio (specifically vocals)

Also, for anyone, if there are any errors in what I said, please correct me
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
Yeah, no one's gotten to use Cherry Pie yet, so we don't really know. I feel like comparing Cherry Pie to VocaListener is like comparing apples to watermelons at this point, like... they have different functions, purposes, and results. Plus, you can't buy VocaListener anymore, so it doesn't really matter unless you were able to buy it before. I have used VocaListener quite a bit, so here's my opinion on it.

Cherry Pie can be used in real time in presumably any DAW (because it is a VST). You had to upload a .wav of your human singing into VocaListener for it to analyze it and it only worked in Vocaloid 3 or 4.

Cherry Pie is more sensitive compared to VocaListener, because Cherry Pie was made with deep learning to figure out female and male speech patterns. VocaListener just analyzes the .wav and makes the pitch bend and dynamics match what it heard. VocaListener was imperfect, because it would be unable to read low, growly singing. After it analyzed the singing, you had to tell it what the lyrics were and where each lyric started/ended (it could guess, but needed help). I don't know how Cherry Pie knows the lyrics if you are using it for Vocaloid-related music making.

Cherry Pie works with English and Japanese. VocaListener was Japanese-only (seriously, the interface was never translated), and its English version was never released.

Cherry Pie is aimed at not just Vocaloid producers, it's also meant for VTubers (so they can change their gender in real time for livestreams, for example). VocaListener is exclusively for making Vocaloid songs.

Lastly, VocaListener would overwrite a VSQ if you had one imported before analyzing a .wav, so if you wanted to do a cover, you had to make sure you sang it with perfect timing. Because Cherry Pie seems like it's used in real time, I assume it also has this same issue.
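
To make the "analyzes the .wav and matches pitch bend and dynamics" idea a bit more concrete, here is a minimal sketch in Python of measuring the pitch and loudness of one short slice of audio. This is not VocaListener's or Cherry Pie's actual algorithm (neither is public as far as this thread knows); it is plain autocorrelation pitch detection plus RMS loudness, and the function names, the 80-500 Hz search range, and the synthetic 220 Hz test tone are all assumptions made up for illustration.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=80.0, fmax=500.0):
    """Estimate the fundamental frequency of one audio frame.

    Uses plain autocorrelation: the lag at which the frame best matches a
    shifted copy of itself is taken as one period of the voice.
    The fmin/fmax search range is an assumed, rough singing range.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # No positive correlation found (e.g. silence): report "no pitch".
    return sample_rate / best_lag if best_lag else None


def frame_loudness(frame):
    """Root-mean-square level of the frame, a simple stand-in for dynamics."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# Synthetic stand-in for a sung note: a 220 Hz sine at half amplitude.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 220.0 * n / sample_rate) for n in range(800)]

pitch = estimate_pitch_hz(tone, sample_rate)
loudness = frame_loudness(tone)
print(pitch, loudness)
```

Running something like this over many short frames of a recording would give a pitch curve and a loudness curve, which is roughly the kind of data a tool would need before it could write pitch bend and dynamics parameters. Plain autocorrelation also tends to struggle with noisy or growly input, which may be related to the low, growly singing problem mentioned above, though that is only a guess.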
 

Jikyu

Producer in Training
Apr 8, 2018
37
27
USA
linktr.ee
Honestly, I would like to use Cherry Pie to make my voice sound cuter for an English VTuber. Lol.
 

Wario94

Passionate Fan
Jan 5, 2019
219
30
Another question about Cherry Pie: besides Crypton's own Vocaloids, would it be possible for other Vocaloid companies to use their own Vocaloid voice banks with Cherry Pie?
 

razelberii

Bless the Lord, O my Soul
Apr 8, 2018
423
雨リカ
razzyru.com
Because the VST uses audio rather than a Vocaloid voicebank or VSQ, I believe there are not going to be restrictions such as this.
Now, if a company records a Vocaloid and adjusts the audio samples with Cherry Pie, I will be honest, I am not sure if that is allowed or not. It may be.
Perhaps we may have to wait and see regarding this, if there are license details by chance.
Again, I believe it is fine and has no issues, but we may not have a 100% clear answer until the product is released or access to the license is given to see details such as this.
Anyone is free to correct or give their thoughts/perspective on this, just wanted to share my thoughts
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
Cherry Pie talking demo on YouTube by nyanyannya:
@Kazumi alerted me to the existence of this video. It was uploaded on April 3rd. nyanyannya got to test the "system currently in development" (quote from the description of the video). (nyanyannya is known for creating a song series called the Namari Hime Series (鉛姫シリーズ / Lead Princess Series).)


Okaerinasai, MASUTAA. Atarashii seikatsu wa ikaga desu ka? Nanika, komatta koto wa arimasen ka? Kanashii koto wa arimasen ka? Yokatta! MASUTAA ni makenai you ni boku mo uta no renshuu ganbarimasu! Dakara, MASUTAA atarashii uta, utawasete kudasai ne

Welcome home, master. How is your new life? Is there something you're worried about? Something you're sad about? Good! I'll try my best at practicing singing too, so I won't lose to you, master! So please have me sing new songs, master.
Edit: The description in Japanese says "制作中のシステムのテストを兼ねて制作しました" (roughly, "I made this partly as a test of the system currently in development"), which implies they have been working on this test for quite some time now.


nyanyannya's thoughts about the talking attempt on Twitter:
I went back and found a tweet about it from the same day:
If you're interested, I got big brother KAITO to talk for me to this extent.
I think it will probably sound more natural being covered up by things like BGM. I will use this method for singing in my next song. Of course, this method is in the middle of being improved and [we] are aiming for tuning that is more natural and in real-time.

Translated interface reveals Cubase being used:
While translating the interface, I came to realize that the software being used seems to be Cubase Elements (I think version 9?).

Here is a screen shot of Cubase to compare:
The headings are the same, the shape of the Inspector is the same. So, it looks like instead of coming with Studio One, Crypton is jumping over to Cubase? Unless we can use the VSTs (Cherry Pie and Vocal Drive) in any DAW...


My thoughts:
I thought that Kaito was easy to understand and didn't sound super wonked up. You can actually hear intonation and emotion in his voice, he sounds gentle.

Since the demo was from last month, maybe nyanyannya is done with their song by now, but can't post it? Maybe we will get the new Appends this year, after all. After realizing they are using Cubase, I guess it proves Crypton really is just making VSTs and not making their own DAW from scratch. I wonder if any other producers got to test out the VSTs and we just haven't noticed them yet? Also, it's cool that there's artwork of Kaito in the program, it seems more fun being able to see the vocalist while working. Glad we finally got to hear Kaito's updated voice.
 

Kona

Avanna's #1 Fan
Apr 8, 2018
814
USA
I can’t listen to things right now, so I’ll put my full thoughts here later. As for the DAW side of it, I don’t think CFM is leaving Studio One for bundles; nyanyannya is probably just using their DAW of choice, which is Cubase in this case. My guess is that it is like Piapro Studio: just a VSTi that can be used in any DAW. That could be a better choice on their part, since it seems more logical, to me at least, to be able to do everything from your DAW rather than need a separate program for something that could just be a VST.

I hope this gets more news on the release side soon, I’m getting really excited and impatient seeing it all.
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
Yes, I think you're right about Cubase just being nyanyannya's choice of DAW now that you mention it. I did some more research after reading what you wrote to try to prove this theory, and also to stop myself from constantly wondering about the software Crypton used for the initial Cherry Pie demo, and I think I finally figured it out.

Screenshot from the video from a few months ago as a reminder:

I think they were using iZotope (compare the blue power buttons, the upper left menu, the wiggly audio icon for each track):

Definitely just a VST for any DAW.
 

Wario94

Passionate Fan
Jan 5, 2019
219
30
There's one thing that's bugging me: we do not know whether Cherry Pie would include the Crypton voice banks for free, or whether we would have to buy the Crypton Vocaloid voice banks in order to use Cherry Pie.
 

Jikyu

Producer in Training
Apr 8, 2018
37
27
USA
linktr.ee
Interesting. I do wonder about that. Currently, we don't know.
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
Untuned Cherry Pie demo by nyanyannya:
nyanyannya uploaded an untuned version (just using Cherry Pie with no further editing). (Thanks for the tip again, @Kazumi !)



I THINK he's saying:
"dou da, ureshii ka yo, kore" (basically "how's that? happy about this?")
"ii yatsu da, mou vocaloid editor ?? shinakute ii da zo" ("this (Cherry Pie) is good", can't hear it all so I'm assuming it's something like "if I didn't have to use Vocaloid Editor anymore, it would be good")

The comment on the video says that because they uploaded a tuned version, this time they are showing a non-edited version and that the edited version has more precision.

I did notice that they changed the icon to be their OC, the name of the track is "せんせー / sensei / doctor", so I guess I am assuming you can upload your custom icons and save preset voice settings. The title of the video in Japanese says that with deep learning, Doctor Funk Beat (their character from the Lead Princess series) was able to talk to this extent. (Doctor Funk Beat's voice is created from Kaito's in case it's not obvious.)

Here's the tweet about it:


My thoughts:
I think that the untuned version sounds pretty rough. I kept thinking that Crypton's Cherry Pie was going to blow Voidol out of the water, but I guess not. It reminds me of the IA English/Satou Sasara demo made with deep learning that required Melodyne to fix it up. @DefiantKitsune reminded me that the Appends were still in alpha; I checked, and Wat never said they were in beta. I guess I just assumed they were because so much time had passed. Between realizing that and hearing the untuned version, I feel actual disappointment, and I feel like we won't get a commercial version any time soon. The tuned version did sound good, but it makes me wonder how much time is ACTUALLY saved by using Cherry Pie vs. tuning completely by hand.
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
Warning: Technobabble combined with Japanese lingo that might be hard for others to understand.
tl;dr The guy who made Cherry Pie will be talking in 4 hours at a music/computer technology event that will be livestreamed.


Ryo, a developer at Crypton who is directly involved with creating the Cherry Pie and Vocal Drive effectors (from the algorithm, to signal processing, and the GUI) tweeted a bit ago.

Today at 1:30 PM (Japan time), Ryo is demonstrating Crypton's real-time voice analysis synthesis effector (aka Cherry Pie) at an event at Kyoto University sponsored by SIGMUS. SIGMUS stands for Special Interest Group on MUSic and computer; it is a society for data processing and computer science related to music. The event is called 音学シンポジウム2019 (Ongaku Symposium 2019, where there is a play on words for "ongaku" > 音楽 = music, 音学 = sound/music science).

The event he linked to has a lot of things being talked about by various people and it is 2 days long, but he is only talking on the 23rd at 1:30 PM for a "poster session" (a fancy name for a poster presentation at an academic conference).

They are going to livestream the event, but it's only 8:30 AM in Japan right now and the link won't be updated until closer to start. However, some of the events won't be streamed. I will keep my eyes peeled for the URL and try to see if we can get a glimpse of new Cherry Pie information.
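
For anyone trying to work out what 1:30 PM Japan time is on their own clock, here is a rough Python sketch. It only does fixed-offset arithmetic: Japan is UTC+9 and does not observe daylight saving time, but the target offsets in the example (UTC and UTC-7) are just illustrations, and `convert_jst_clock` is a made-up helper name, so check your own current offset before trusting the result.

```python
JST_OFFSET_HOURS = 9  # Japan Standard Time is UTC+9 year-round


def convert_jst_clock(hour, minute, target_utc_offset_hours):
    """Convert a Japan wall-clock time to another fixed UTC offset.

    Returns (hour, minute, day_shift), where day_shift is -1 if the result
    falls on the previous calendar day, 0 for the same day, +1 for the next.
    Daylight saving time is NOT handled; pass the offset that applies today.
    """
    shift_minutes = int((target_utc_offset_hours - JST_OFFSET_HOURS) * 60)
    total = hour * 60 + minute + shift_minutes
    day_shift, remainder = divmod(total, 24 * 60)
    return remainder // 60, remainder % 60, day_shift


# The poster session starts at 13:30 in Japan.
print(convert_jst_clock(13, 30, 0))   # UTC
print(convert_jst_clock(13, 30, -7))  # an assumed UTC-7 viewer
```

A negative day shift means the stream starts on the evening before the 23rd for that viewer.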
 

Wario94

Passionate Fan
Jan 5, 2019
219
30
At long last! The moment of truth is going to be revealed any minute now! We will be the first people (though not at Ongaku, but rather right here on Vocaverse Network) to hear the voice-coating synthesizers for our favorite Crypton family members!
 

mobius017

Aspiring ∞ Creator
Apr 8, 2018
2,044
Nice, thanks @uncreepy ! This is exciting. Hopefully we can get some additional information/demonstration. If they're talking about it publicly like this, I guess a release date might not be entirely close, but it might not be astronomically far away, either...?
 
