
Real-time voice conversion software aimed at VTubers called "Voidol"

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
@Wario94 Well, I want to buy at least one add-on voice from Voidol, but I also want to buy at least one Cherry Pie voice (probably Kaito), so I guess I'll be stuck waiting either way. I think there's a chance Voidol might get an update that fixes some issues.



They updated their blog to explain they will be at CEATEC on the 15th-18th to show off Voidol and brAInMelody (it's a thing you wear while listening to music and it analyzes your emotions to make more MIDI notes. I've never mentioned it, because it sounds kind of useless/weird. It's supposed to be so you can create BGM for projects).
 
Reactions: mobius017

mobius017

Aspiring ∞ Creator
Apr 8, 2018
2,036
brAInMelody (it's a thing you wear while listening to music and it analyzes your emotions to make more MIDI notes. I've never mentioned it, because it sounds kind of useless/weird. It's supposed to be so you can create BGM for projects).
Meh, I think it's kind of cool. Seems like it might be related somehow to the spectral centroid, an equation that represents different images/sounds as different frequencies, and which people seem to commonly unconsciously agree upon; the basic idea is that people generally agree that emotion A, shape A, and sound A are all roughly equivalent to each other, so shapes can represent/evoke emotions, sounds can represent/evoke shapes, etc. And people can guess others' emotional states from the sounds/shapes those people produce--much like how we all do with music, of course.

Still, at this stage, I'm not sure how useful it could be unless machine learning has given it some pre-existing melodies to remix/concatenate in response to the brain activity it measures. There has to be some way that it converts the brain activity into meaningful music; if it were just converting electrical activity into MIDI notes or something, it seems like it would be more or less boring or meaningless noise.

And then, how similar would the melodies be for two people who felt similarly, or one person in the same emotional state at different times...? Does it really count as creativity if whatever a person's brain is doing is being re-interpreted/"augmented" by something pre-generated?

Nah, maybe not really useful right now, but cool to speculate about for a few minutes.
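
For reference, the spectral centroid is usually defined as the magnitude-weighted mean frequency of a sound's spectrum, which tends to track how "bright" the sound seems. Here is a minimal sketch in plain Python, assuming a naive DFT and made-up test tones (the function names and the 64 Hz sample rate are only for illustration, and none of this is known to be what brAInMelody actually does):

```python
import cmath
import math

def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of a real signal, via a naive DFT."""
    n = len(samples)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # For a real-valued signal only bins 0..n/2 carry unique information.
    for k in range(n // 2 + 1):
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitude = abs(coeff)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum else 0.0

def tone(freq_hz, sample_rate=64, n=64):
    """Hypothetical test signal: a pure sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

low = spectral_centroid(tone(4), 64)    # dull, low tone
high = spectral_centroid(tone(16), 64)  # brighter, higher tone
print(round(low, 3), round(high, 3))
```

A pure tone has all its energy in one place, so its centroid lands on the tone's own frequency; real sounds with many overtones end up somewhere in between.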
 
Reactions: uncreepy

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
It wasn't announced, but apparently Voidol got an update on September 5th. A notification points you to the download page when you start Voidol. The update appears to only fix issues with the Japanese-language interface.

Yeah, I bought it on August 29th and haven't opened it since recording the tests I did because I was too crushed by the disappointment of waiting for the add-on voices. Still no mention of when those are coming out. Seriously? Imagine buying what is essentially half of the software, expecting to be able to, y'know... use it properly along with the add-on voices the iPhone has been able to use for like... over a year, but being stuck in no-news limbo with no end in sight. And even though they have been notified of customer complaints/installation issues, those people don't even get acknowledged while Crimson Technology goes and runs off across the world to show off their incomplete/buggy software in order to get more people's money. :}
 
Reactions: TheStarPalace

Prism

Enthusiast
Jul 18, 2019
525
I wonder if Crimson Technology is hard up for cash from development costs and is going to the show for promotion and to look for investment.
 

Voidquestions

New Fan
Aug 29, 2019
8
I know Zunko's creator is pretty protective of Zunko and doesn't want her to be defamed. Maybe he's having second thoughts. I wonder what the holdup is with every other voice, though. The only starting girl voice, Iroha, doesn't match a wide range of personalities/bodies. For me, using her as a male, she doesn't even sound female.
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
I know Zunko's creator is pretty protective of Zunko and doesn't want her to be defamed.
Er, have you looked at Zunko's official twitter? They do a great job at sullying her name themselves. (Retweeting creepy fan art and adult doujins of her and the other girls related to her, even though they're tweens or teens.)

Anyway, I don't think that the add-on voices are necessarily the problem. The voices that are already available for purchase for iPhone are Zunko, Peroro, Tsubasa, Yoshida-kun, Miranda, and Tsukasa. The only voice that is for sure confirmed to be a new vocal is Queen Shuffle (whose tweet thread about the matter stopped after sharing the box art and hasn't updated for months). Also, Zunko's Twitter reminded everyone that she would be on the PC version soon after the Windows version finally came out. All of the voices I listed were happily demonstrated on DTM Station's livestream as well (except Queen Shuffle).

I feel like maybe the problem is on Crimson Technology's side? But it's so weird they haven't been able to give us any sort of reason for it being late. All they ever say is a new estimated release date with no explanation/apology. I thought maybe they were having bugs while trying to go from an app to a PC program, but it doesn't make sense. Why are they being so secretive? Do they not have enough workers? And the workers they do have can't figure out the problems/don't have enough time? Why would the free voices (CRIMMZOH, Iroha, Minato) work on the PC ver but not the other ones? Is the issue rather how to sell the voices? I know the iPhone one is through in-app purchases, is selling on Amazon that difficult? Or are they trying to fix people's complaints before releasing the voices so people feel more satisfied? Sorry for asking so many questions that no one knows the answer to.
 
Reactions: Voidquestions

Trevor

?
May 2, 2018
78
I wonder if Crimson Technology is hard up for cash from development costs and is going to the show for promotion and to look for investment.
After researching, I would say it's just a new tech company... similar to how alter/ego was first created. Alter/ego did generate buzz, but the team couldn't compete with powerhouses like Yamaha due to the lack of resources. These things spring up all the time with innovative ideas that lack the technology, resources, funds and people needed to develop a product past the "functional" phase. Voidol was ambitious and the product is a great novelty. It was just too large of a task imo. In the future (as with many starter companies with considerable outside support), we may see a practical version of the software. I wouldn't call it a cash grab. It's just the best the team could do at the time.

P.S. I looked at the papers "published" for brAInMelody. It looked like the passion project of a researcher at a middle-of-the-road school who couldn't generate enough of a response to push it past a primitive toy.
 
Reactions: uncreepy

Prism

Enthusiast
Jul 18, 2019
525
I think it might be on how to sell the voices and making sure it meshes well and is hard to pirate
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
I was stalking the Voidol tag on Twitter and saw Iroha's voice provider was at CEATEC 2019 (where they showed off brAInMelody and Voidol). They had a flyer in English, which she has a backwards picture of.

voidol.jpg
Here's it flipped and edited a little, basically very hard to read.

Voidol changes your voice into a character voice and helps for Vtuber or live streaming.
Windows version of "Voidol Powered by RC voice" is now on
sale. It can change your voice into various character voice like
a cute girl voice or a handsome man voice on real-time.
The application ?? on the paid application category of Mac App Store, amazon Japan and Rakuten ??? in Japan.

"Otomiya Iroha" female character, "Kanade Minato" male character, and CRIMMZO (??? character) are prepared as the ??? in the
software. More voice models will be released in a later date, including popular characters such as "Tohoku Zunko" or "Taka no Tsume Yoshida-kun" and
Voidol original characters such as "Koeno Tsubasa" by Koiwai Kotori.

Helpful Function for live-streaming or contents creation.
???

Functions
Able to change narrator type.
Three preset voice models.
Mixing of background sound
Space Effect
Noise Gate
Welp. CEATEC was on October 15th ~ 18th and this flyer says the add-on voices are coming later. Total time of me waiting to buy Tsukasa so far since this thread was created: 220 days. I hope when they finally release them, the quality will have gone up.
:ring_ani_lili:
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
It's up
Thanks for letting us know!


Upon starting Voidol, there is an update available (from 1.1.1 to 1.2.0), which adds the new voice called "Rice-Chan". (The .zip file includes user guides in English, Japanese, and Chinese.)
Note: Avast kept blocking VoidolSetup.exe, so I had to add the folder it was in as an exception to its protection scan in order to update. orz
rice.png

Clip of me testing her with both the female and male model options:


Source: "Voidol - Powered by リアチェンvoice -" add-on voice models go on sale, including "Eagle Talon Yoshida-kun (CV: FROGMAN)", "Tohoku Zunko (CV: Satomi Satou)", and "Tohoku Kiritan (CV: Himika Akaneya)"

Free default voices (have to buy the base software to use them):
CRIMMZOH (くりむ蔵) / CV: ?
Iroha Otomiya (音宮いろは) / CV: Mayu Tohno (遠野まゆ)
Minato Kanade (奏ミナト) / CV: ?
Rice-chan (ヨネちゃん) / CV: ?

6 out of the 12 planned voices are now available for purchase:
Zunko Tohoku (東北ずん子) / CV: Satomi Satou (佐藤聡美)
Cutie Alien Peroro (キューティ・エイリアン ペロロ) / CV: ?
Tsubasa Koeno (声乃ツバサ) / CV: Kotori Koiwai (小岩井ことり) [Note: Same VP as Meika Hime/Mikoto]
Eagle Talon Yoshida-kun (鷹の爪団 吉田くん) / CV: FROGMAN
Dub fairy Miranda (吹替の妖精ミランダ) / CV: Maya Okamoto (岡本麻弥)
Tsukasa Otoshiro (音城ツカサ) / CV: Takayuki Fujimoto

The other 6 will be released during November:
Queen Shuffle (王女シャッフル) / CV: Tomoko Uzawa (鵜澤朋子)
Jack Blow (ジャック・ブロウ) / CV: I unfortunately don't know the reading of his name (笹井崇裕) [Note: They still haven't added more drawings of him even though we've been waiting since post #13]
Ichiru Senkin (千色いちる) / CV: Mai Kadowaki (門脇舞以)
Zombi-Ko Cafeno (カフェ野ゾンビ子) [Note: This is the only new vocal we didn't know was coming, this is her YouTube channel: Zombi-Ko Channel ]
Gate Jobs (ゲート・ジョブス) / CV: AIJI [Note: I still can't find info on this person other than that blurry picture from the promotional video on Crimson Technology's YouTube]
Kiritan Tohoku (東北きりたん) / CV: Himika Akaneya (茜屋日海夏)
^ The only name that I noticed during the livestream that isn't on this list is "Pepper".

Unfortunately, each voice costs $38.47 (¥4,180)!!! I was expecting half that price based on how much each voice costs on the phone version! With the quality being questionable, I am not sure if I will still splurge for Tsukasa or not. orz
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
The stupid add-on voice catalog doesn't have any actual hyperlinks to hear demos of their voices, so I had to use their search function:

Peroro:

Tsubasa:

Zunko:

Yoshida-kun:

Miranda:

Tsukasa:

Somehow these clips sound way higher quality (as in no crackling) than what normal people can replicate. But the guy's converted voice usually sounds worse, closer to the results people can actually expect. It seems like the further you go in this list, the worse the quality gets, and guys trying to convert their voices seem to get worse results than women converting theirs. But it also seems like women converting to a male voice doesn't sound good either.

Maybe they will have a sale once the other 6 voices are released?
 
Reactions: TheStarPalace

Voidquestions

New Fan
Aug 29, 2019
8
I think the reason the guy-to-girl voice conversion sounds off is that the guy (bottom left in the samples) they used to train the machine learning algorithm hardly has a speck of testosterone in his voice.
 

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
I think the machine was taught by at least 4 people, it seems, not just that one guy (he might not even have taught it, maybe he is just demoing it?).

CRIMMZOH has female 1, female 2, male 1, male 2.
Kanade Minato has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Iroha has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Rice-chan has female 1, male 1.

I think the high pitch conversion = literally just pitched up.
Personally, even though I'm a girl, I've had some luck with the male setting for certain characters. Maybe the female voices that taught it were higher pitched than the typical English-speaking girl's?
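
If "high pitch conversion" really is just the result pitched up (that is only a guess, nothing from Crimson Technology confirms it), the crudest way to do that is plain resampling. Below is a minimal sketch in Python, assuming a made-up `pitch_up` helper and a tiny hypothetical sample rate chosen so the numbers are easy to follow:

```python
import math

def pitch_up(samples, ratio):
    """Naive pitch-up by resampling: read the input `ratio` times faster.

    Every frequency is multiplied by `ratio`, but the clip also gets shorter
    by the same factor, so a real-time tool would have to do more than this.
    """
    out = []
    for i in range(int(len(samples) / ratio)):
        pos = i * ratio
        idx = int(pos)
        frac = pos - idx
        nxt = samples[idx + 1] if idx + 1 < len(samples) else samples[idx]
        # Linear interpolation between the two nearest input samples.
        out.append(samples[idx] * (1 - frac) + nxt * frac)
    return out

SAMPLE_RATE = 64  # hypothetical, tiny on purpose
original = [math.sin(2 * math.pi * 4 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
shifted = pitch_up(original, 2.0)  # one octave up: the 4 Hz tone becomes 8 Hz
print(len(original), len(shifted))
```

Played back at the same sample rate, `shifted` is the same tone an octave higher and half as long.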


Out of curiosity, I checked Crypton's Cherry Pie and each voice (Miku, Rin, Len, at least) appears to have 2 male and 2 female patterns to pick from.
Male 2 CV01, Female 2 CV01, Male 2 CV01 (GAN), Female 2 CV01 (GAN).
Note: 2 = "to", CV01 = Miku, I have no clue what GAN stands for.

One last thing I noticed is that Cherry Pie doesn't have crackling like Voidol does, but it does have muffled parts in the untuned demo.
 
Reactions: Wario94

Voidquestions

New Fan
Aug 29, 2019
8
I think the machine was taught by at least 4 people, it seems, not just that one guy (he might not even have taught it, maybe he is just demoing it?).

CRIMMZOH has female 1, female 2, male 1, male 2.
Kanade Minato has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Iroha has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Rice-chan has female 1, male 1.

I think the high pitch conversion = literally just pitched up.
Personally, even though I'm a girl, I've had some luck with the male setting for certain characters. Maybe the female voices that taught it were higher pitched than the typical English-speaking girl's?


Out of curiosity, I checked Crypton's Cherry Pie and each voice (Miku, Rin, Len, at least) appears to have 2 male and 2 female patterns to pick from.
Male 2 CV01, Female 2 CV01, Male 2 CV01 (GAN), Female 2 CV01 (GAN).
Note: 2 = "to", CV01 = Miku, I have no clue what GAN stands for.

One last thing I noticed is that Cherry Pie doesn't have crackling like Voidol does, but it does have muffled parts in the untuned demo.
All I know is that GAN has something to do with machine learning, but yeah, one thing the reviews on Amazon JP say that rings true is the lack of documentation. :mikoto_lili:
 
Reactions: Wario94

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
All I know is that GAN has something to do with machine learning, but yeah, one thing the reviews on Amazon JP say that rings true is the lack of documentation. :mikoto_lili:
Personally, I don't really understand what kind of documentation they are looking for. The zip file comes with an installation guide PDF and a link to an explanation of features. I don't know if people just can't figure out how to set up their own mic and mess with the small number of settings? Or they think they are using the software wrong because it has crackling, even though that's how bad the quality always sounds?

It would be nice if they had proper demos that weren't misleading in quality or I guess just had a setup video for mics vs AUX set ups for live streaming, though.

Anyway, I looked up GAN, and it must be "Generative Adversarial Network" (a network able to produce new content).
Where generative = generates new data,
adversarial = there are two networks (the generator and the discriminator) that are pitted against each other (the generator makes "fake" things, and the discriminator tries to tell them apart from the real references used for training, I think?).
You're supposed to keep both sides of the GAN relatively balanced, or training goes wrong (for example, a discriminator that gets too good leaves the generator with nothing useful to learn from).
I barely understood what I tried to learn about this, but I'm curious how (for example) the normal "Male 2 CV01" vs "Male 2 CV01 (GAN)" works. Maybe one is higher quality or faster than the other?
I wonder if Voidol uses GAN at all?
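
For anyone curious, the "pitted against each other" part is usually written as a two-player minimax game. This is the standard textbook objective for a GAN, and nothing here is confirmed about how Cherry Pie or Voidol actually work:

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

Here D(x) is the discriminator's estimate of the probability that x is a real reference sample, G(z) is a fake sample the generator builds from random noise z, p_data is the distribution of the real references, and p_z is the noise distribution. The discriminator tries to make V large (it gets real vs. fake right), while the generator tries to make it small (its fakes get mistaken for real).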
 

Prism

Enthusiast
Jul 18, 2019
525
I hate how convoluted it is. I know it was made for live broadcast, but I wish there was a VST version that could be used in DAWs.
 
Reactions: uncreepy

frankensalad

Banned
Feb 27, 2019
103
currently, what are the steps someone would have to take if they wanted to run a pre-recorded audio file through it? I'm interested in the software, but I would prefer if I could record the audio first to make sure I get a good take and THEN run it through Voidol.
 
Reactions: Wario94

uncreepy

👵Escaped from the retirement home
Apr 9, 2018
1,618
currently, what are the steps someone would have to take if they wanted to run a pre-recorded audio file through it? I'm interested in the software, but I would prefer if I could record the audio first to make sure I get a good take and THEN run it through Voidol.
I personally can't figure out how to do it. My setup is:
Voidol + audio box (Behringer) + cardioid mic / have Audacity up and recording "What U Hear" (might be called "Stereo Mix" on other computers) (this is the PC's sound card basically)

Because of this setup, the only way to play pre-recorded audio is off of an mp3 player or phone held up to the mic (because if I played the clip off of the computer, it would be using up "What U Hear" coming out of the computer).

I've tried to have Voidol use pre-recorded audio by holding my mp3 player up to my mic, but the results were the usual level of bad, with parts of the audio dropping out and not being converted to intelligible speech. At least when doing it in real time, you can attempt to say the same line in different ways until it picks it up. (Hopefully this explanation makes sense.)

Note: Voidol is recommended for live streaming and VTubers. They say it's good for making music, but I seriously doubt that. Unless you enjoy listening to music sung by a really crackly voice.

The base cost of Voidol: $22
+ 1 add-on voice: $38 [unless you're okay with the 4 free default voices]
= $60

The base cost to use Crimson Technology's recommended equipment (assuming you don't already own these things):
+ having to buy an audio box: $25ish for entry level one
+ an XLR male to female mic cable: $7ish
+ a mic: $18ish for an entry level one
+ a pop guard: $3
+ a mic stand: $12ish
= $65

Total cost to run Voidol with recommended equipment and an add-on voice: $125
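
The sums above can be double-checked with a few lines of Python (a throwaway sketch; the dictionary and variable names are made up, and the prices are the rounded figures from this post):

```python
# Prices in USD as quoted in the post; the "ish" ones are rough estimates.
software = {"Voidol base": 22, "one add-on voice": 38}
equipment = {
    "audio box": 25,
    "XLR mic cable": 7,
    "mic": 18,
    "pop guard": 3,
    "mic stand": 12,
}

software_total = sum(software.values())
equipment_total = sum(equipment.values())
grand_total = software_total + equipment_total

print(f"software:  ${software_total}")
print(f"equipment: ${equipment_total}")
print(f"total:     ${grand_total}")
```

Dropping the add-on voice from `software` shows the floor for someone happy with the free default voices.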

Even though I bought Voidol, I think the audio quality is sub-par, you can't use Voidol directly in a DAW, and the software has no ability to export/record audio (you have to do it yourself in an external program).

Honestly, you might as well wait and see what Crypton's Cherry Pie has to offer before making a decision. The official demo showed them converting pre-recorded audio to Miku's voice with seemingly the click of a button, you can customize the voice further to be higher/lower pitched/faster/growly/whatever. Plus, it's supposed to also work in real-time. Voidol's only option is switching between the female/male models in terms of customization and everything has to be real-time.
 
