
Cryptonloid voicebank updates, collabs, & concert news (crypton_wat Twitter translations)

uncreepy

Visual kei enthusiast
Apr 9, 2018
402
USA
A trailer for the Blu-ray and CD of the last Miku Symphony was released, and it has clips of Miku and Luka (and Teto) talking. It's the same method where Miku got her voice coated for the VocaListener-esque plugin while using one of her new Append voices, so I assume that they are also using one of Luka's Append voices. It's kind of choppy like the last Miku clip I shared (for the Sony Store collab), but at least we finally get to hear Luka's voice with it (other than that really hard-to-hear clip inside the shop for her anniversary). However, even in this clip, she is hard to hear over the concert hall echo, so I recommend headphones.


Here are all the times Luka speaks:
1:55-2:09
3:14-3:34 (there's a human choir singing with her)
8:13-8:22
8:31-8:56

On a side note, I'm not really satisfied seeing these animations. (The models are cute, but the animation itself needs work.) The lighting is blown out, their feet don't touch the ground while walking, their hands hang down in a weird pose with splayed fingers, and they don't blink at all while walking. When they turn to look at each other, they turn their entire bodies like some sort of really old RPG. Hrng... :alien:
 

uncreepy

Visual kei enthusiast
Apr 9, 2018
402
USA
I was looking up Crypton and Cherry Pie on Google and found something to translate: The LinkedIn profile of a developer at Crypton named Ryo, who is directly involved with working on Cherry Pie (Crypton's real-time voice converter based on deep learning).

Source
He currently works at Crypton (2013 - present), and went to Hokkaido University (2003-2012).

Summary
Currently conducting voice-related research and development, such as real-time vocal analysis synthesis technology and deep learning voice conversion.
Also putting these technologies to practical use as VST/AU audio plug-ins (effectors).*

At graduate school, I did natural language processing (NLP) research.**
I am an expert in computer science.

I make roast beef 2x a week.

Projects
Vocal effector plug-in development
"Vocal drive", an effector that distorts the voice
"Cherry Pie", a real-time vocal analysis synthesis effector that can convert voice types through deep learning
I was in charge of these effectors from the initial idea and the design of the algorithm, through the signal processing system, to implementing the GUI.
* (Source) An audio plug-in, in computer software, is a plug-in that can add or enhance audio-related functionality in a computer program. Such functionality may include digital signal processing or sound synthesis. Audio plug-ins usually provide their own user interface, which often contains GUI widgets that can be used to control and visualise the plug-in's audio parameters.

**(Source) Natural language processing = a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.

----
This is Ryo's Twitter: https://twitter.com/Ryo_Jerky
It hadn't been updated for a long time, but started to update around the time that Cherry Pie was announced. I wonder if he wrote Crypton's blog post about it? He talks in detail about how measuring pitch is extremely important in vocal analysis synthesis (vocal coating) and also talks about Deep Neural Networks. I will be translating or summarizing some of his tweets later. I have been doing a lot of personal translation projects and got behind on sharing my findings, so I just want to get this initial information out sooner rather than later.

Lastly, I assume since Ryo conceptualized Cherry Pie, he might be responsible for the strange name behind it? I wonder if I could @ him and ask at some point? : P
 

uncreepy

Visual kei enthusiast
Apr 9, 2018
402
USA
I asked Ryo if he would tell us how Cherry Pie got its name and he actually replied!

[We] called it a name that included the meaning of "easily usable".

cherry pie: <American slang> A thing that is easily able to be done
I know of "easy as pie", but I never heard of cherry pie having anything to do with easiness. Cherry or cherry pie can be an adult-related slang... so... er... maybe that's what easy came from? 😱 This is one of those bad translation kind of situations, but I'm not gonna be the one to break the news to him. Not sure how to reply... like, simply "Thanks for explaining! I'm looking forward to Cherry Pie"???
 

DefiantKitsune

Lonely kanon fan
Apr 11, 2018
278
In all fairness, "Pie" would be a pretty bland and generic name
 

RazzyRu

Designer
Staff member
Administrator
Apr 8, 2018
237
piapro.jp
I would like to ask: which one is easier, Vocalistener or Cherry Pie?
It is perhaps difficult to answer at this time, as Cherry Pie has not been released.
They also function a bit differently, so it is a bit difficult to compare. I have not used Vocalistener, but from my understanding, it affects the VSQ and can more or less tune the VSQ for you, based on a recording of vocals.
For Cherry Pie: From my understanding, there is no VSQ. It appears to work on wav/mp3/audio files, which means you cannot manipulate it with a VSQ (think of an exported wav file from the Vocaloid Editor).
So, editing the audio is most likely done with an effect VST. Based on the video, it seems to be easy to change the parameters.
In that sense, Cherry Pie may be easier in comparison, as parameters can easily be manipulated, and Vocalistener may require an audio recording.

Hopefully that helps a bit, sorry I can't give much further beyond this. I have not used either Vocalistener or Cherry Pie. They both appear useful, but are built for different functions:
Vocalistener - VSQ based on audio
Cherry Pie - VST for audio (specifically vocals)

Also, for anyone, if there are any errors in what I said, please correct me
 

uncreepy

Visual kei enthusiast
Apr 9, 2018
402
USA
Yeah, no one's gotten to use Cherry Pie yet, so we don't really know. I feel like comparing Cherry Pie to VocaListener is like comparing apples to watermelons at this point, like... they have different functions, purposes, and results. Plus, you can't buy VocaListener anymore, so it doesn't really matter unless you were able to buy it before. I have used VocaListener quite a bit, so here's my opinion on it.

Cherry Pie can be used in real time in presumably any DAW (because it is a VST). You had to upload a .wav of your human singing into VocaListener for it to analyze it and it only worked in Vocaloid 3 or 4.

Cherry Pie is more sensitive compared to VocaListener, because Cherry Pie was made with deep learning to figure out female and male speech patterns. VocaListener just analyzes the .wav and makes the pitch bend and dynamics match what it heard. VocaListener was imperfect, because it would be unable to read low, growly singing. After it analyzed the singing, you had to tell it what the lyrics were and where each lyric started/ended (it could guess, but needed help). I don't know how Cherry Pie knows the lyrics if you are using it for Vocaloid-related music making.
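
For anyone curious what "analyzes the .wav and makes the pitch bend and dynamics match" could look like in practice, here is a minimal Python sketch. It is not VocaListener's actual method (its internals are not described in this thread); it assumes a plain autocorrelation pitch estimate and an RMS level for one short frame of audio. Every name in it (`estimate_f0`, `to_note_and_bend`, `rms`, the synthetic 200 Hz tone) is made up for illustration.

```python
import math

def estimate_f0(frame, sample_rate, fmin=80.0, fmax=800.0):
    """Estimate the fundamental frequency (Hz) of one frame via autocorrelation."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Similarity of the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0  # 0.0 means "no pitch found"

def to_note_and_bend(f0):
    """Convert Hz to (nearest MIDI note number, pitch bend in cents)."""
    exact = 69 + 12 * math.log2(f0 / 440.0)  # MIDI note 69 is A4 = 440 Hz
    nearest = round(exact)
    return nearest, (exact - nearest) * 100.0

def rms(frame):
    """Root-mean-square level of a frame, a crude stand-in for dynamics."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

# Hypothetical input: a synthetic 200 Hz tone standing in for a sung note.
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1024)]

f0 = estimate_f0(frame, SAMPLE_RATE)
note, bend_cents = to_note_and_bend(f0)
print(f"f0={f0:.1f} Hz, MIDI note={note}, bend={bend_cents:+.0f} cents, level={rms(frame):.3f}")
```

A real tool would run something like this over many overlapping frames and then still need the lyrics aligned to the notes. A simple estimator like this one also tends to struggle when the input has no clean repeating waveform, which fits with low, growly singing being hard to read.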

Cherry Pie works with English and Japanese. VocaListener was Japanese-only (seriously, the interface was never translated), and its English version was never released.

Cherry Pie is aimed at not just Vocaloid producers, it's also meant for VTubers (so they can change their gender in real time for livestreams, for example). VocaListener is exclusively for making Vocaloid songs.

Lastly, VocaListener would overwrite a VSQ if you had one imported before analyzing a .wav, so if you wanted to do a cover, you had to make sure you sang it with perfect timing. Because Cherry Pie seems like it's used in real time, I assume it also has this same issue.
 

RogerDelmar

Pikachu Music Producer
Apr 8, 2018
11
21
Florida, USA
sites.google.com
Honestly, I would like to use Cherry Pie to make my voice sound cuter as an English VTuber. Lol.
 

Wario94

Aspiring Fan
Jan 5, 2019
26
24
Another question about Cherry Pie: besides Crypton's own Vocaloids, would it be possible for other Vocaloid companies to use their own Vocaloid voice banks with Cherry Pie?
 

RazzyRu

Designer
Staff member
Administrator
Apr 8, 2018
237
piapro.jp
Because the VST uses audio rather than a Vocaloid voicebank or VSQ, I believe there are not going to be restrictions such as this.
Now, if a company records a Vocaloid and adjusts the audio samples with Cherry Pie, I will be honest, I am not sure if that is allowed or not. It may be.
Perhaps we may have to wait and see regarding this, if there are license details by chance.
Again, I believe it is fine and has no issues, but we may not have a 100% clear answer until the product is released or access to the license is given to see details such as this.
Anyone is free to correct or give their thoughts/perspective on this, just wanted to share my thoughts
 

uncreepy

Visual kei enthusiast
Apr 9, 2018
402
USA
Cherry Pie talking demo on YouTube by nyanyannya:
@Kazumi alerted me to the existence of this video. It was uploaded on April 3rd. nyanyannya got to test the "system currently in development" (quote from the description of the video). (nyanyannya is known for creating a song series called the Namari Hime Series (鉛姫シリーズ / Lead Princess Series).)


Okaerinasai, MASUTAA. Atarashii seikatsu wa ikaga desu ka? Nanika, komatta koto wa arimasen ka? Kanashii koto wa arimasen ka? Yokatta! MASUTAA ni makenai you ni boku mo uta no renshuu ganbarimasu! Dakara, MASUTAA atarashii uta, utawasete kudasai ne

Welcome home, Master. How is your new life? Is there anything troubling you? Anything you're sad about? That's good! I'll do my best at singing practice too, so that I don't lose to you, Master! So please let me sing new songs, Master.
Edit: The description in Japanese says "制作中のシステムのテストを兼ねて制作しました", which implies they have been working on this test for quite some time now.


nyanyannya's thoughts about the talking attempt on Twitter:
I went back and found a tweet about it from the same day:
If you're interested, I got big brother KAITO to talk for me to this extent.
I think it will probably sound more natural being covered up by things like BGM. I will use this method for singing in my next song. Of course, this method is in the middle of being improved and [we] are aiming for tuning that is more natural and in real-time.

Translated interface reveals Cubase being used:
While translating the interface, I came to realize that the software being used seems to be Cubase Elements (I think version 9?).
[Attached image: the translated interface]

Here is a screen shot of Cubase to compare:
[Attached image: Cubase screenshot]
The headings are the same, the shape of the Inspector is the same. So, it looks like instead of coming with Studio One, Crypton is jumping over to Cubase? Unless we can use the VSTs (Cherry Pie and Vocal Drive) in any DAW...


My thoughts:
I thought that Kaito was easy to understand and didn't sound super wonked up. You can actually hear intonation and emotion in his voice, he sounds gentle.

Since the demo was from last month, maybe nyanyannya is done with their song by now, but can't post it? Maybe we will get the new Appends this year, after all. After realizing they are using Cubase, I guess it proves Crypton really is just making VSTs and not making their own DAW from scratch. I wonder if any other producers got to test out the VSTs and we just haven't noticed them yet? Also, it's cool that there's artwork of Kaito in the program, it seems more fun being able to see the vocalist while working. Glad we finally got to hear Kaito's updated voice.
 

Kona

Avanna's #1 Fan
Staff member
Moderator
Apr 8, 2018
614
USA
I can’t listen to things right now, so I’ll put my full thoughts here later. At least for the DAW side of it, I don’t think it’s CFM leaving Studio One for bundles; nyanyannya is probably just using their DAW of choice, which is Cubase in this case. My guess is that it is like Piapro Studio, just a VSTi that can be used in any DAW. That could be a better choice on their part, since it seems more logical (to me, at least) to be able to do everything from your DAW rather than need a separate program for something that could just be a VST.

I hope this gets more news on the release side soon, I’m getting really excited and impatient seeing it all.
 

uncreepy

Visual kei enthusiast
Apr 9, 2018
402
USA
Yes, now that you mention it, I think you're right about Cubase just being nyanyannya's choice of DAW. After reading what you wrote, I did some more research to try to prove this theory, and also to stop myself from constantly wondering about the software Crypton used for the initial Cherry Pie demo. I think I finally figured it out.

Screenshot from the video from a few months ago as a reminder:
[Attached image: screenshot from the video]

I think they were using iZotope (compare the blue power buttons, the upper left menu, the wiggly audio icon for each track):
[Attached image: iZotope screenshot]

Definitely just a VST for any DAW.
 
