Real-time voice conversion software aimed at VTubers called "Voidol"

uncreepy · Oct 10, 2019

@Wario94 Well, I want to buy at least one add-on voice from Voidol, but I also want to buy at least one Cherry Pie voice (probably Kaito), so I guess I'll be stuck waiting either way. I think Voidol has a chance they might update and fix some issues.

They updated their blog to explain they will be at CEATEC on the 15th-18th to show off Voidol and brAInMelody (it's a thing you wear while listening to music and it analyzes your emotions to make more MIDI notes. I've never mentioned it, because it sounds kind of useless/weird. It's supposed to be so you can create BGM for projects).

mobius017 · Oct 10, 2019

uncreepy said:
brAInMelody (it's a thing you wear while listening to music and it analyzes your emotions to make more MIDI notes. I've never mentioned it, because it sounds kind of useless/weird. It's supposed to be so you can create BGM for projects).

Meh, I think it's kind of cool. Seems like it might be related somehow to the spectral centroid, an equation that represents different images/sounds as different frequencies, and which people seem to commonly unconsciously agree upon; the basic idea is that people generally agree that emotion A, shape A, and sound A are all roughly equivalent to each other, so shapes can represent/evoke emotions, sounds can represent/evoke shapes, etc. And people can guess others' emotional states from the sounds/shapes those people produce--much like how we all do with music, of course.

Still, at this stage, I'm not sure how useful it could be unless machine learning has given it some pre-existing melodies to remix/concatenate in response to the brain activity it measures. There has to be some way that it converts the brain activity into meaningful music; if it were just converting electrical activity into MIDI notes or something, it seems like it would be more or less boring or meaningless noise.

And then, how similar would the melodies be for two people who felt similarly, or one person in the same emotional state at different times...? Does it really count as creativity if whatever a person's brain is doing is being re-interpreted/"augmented" by something pre-generated?

Nah, maybe not really useful right now, but cool to speculate about for a few minutes.
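
(Side note on the spectral centroid mentioned above: as it is usually defined in audio analysis, it is the magnitude-weighted average frequency of a sound's spectrum, often used as a rough measure of how "bright" a sound is. Below is a minimal sketch, assuming a naive DFT and made-up test tones. The function name and test values are hypothetical, and nothing here comes from brAInMelody itself.)

```python
import cmath
import math


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of a real-valued signal.

    Uses a naive O(n^2) DFT so the sketch needs no third-party libraries.
    """
    n = len(samples)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        bin_value = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitude = abs(bin_value)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


# Hypothetical test tones: the frequencies are chosen to land exactly on DFT bins.
SAMPLE_RATE = 800
N = 200
low_tone = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(N)]
mixed_tone = [
    math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
    + math.sin(2 * math.pi * 300 * t / SAMPLE_RATE)
    for t in range(N)
]

print(spectral_centroid(low_tone, SAMPLE_RATE))
print(spectral_centroid(mixed_tone, SAMPLE_RATE))
```

Adding the higher tone pulls the centroid upward, which is why the measure tracks perceived brightness.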

uncreepy · Oct 15, 2019

It wasn't announced, but apparently Voidol got an update on September 5th. A notification points you to the download page when you start Voidol. The update appears to only fix issues with the Japanese language interface.

Yeah, I bought it on August 29th and haven't opened it after recording the tests I did because I was too crushed by the disappointment of waiting for the add-on voices. Still no mention of when those are coming out. Seriously? Imagine buying what is essentially half of the software, expecting to be able to, y'know... use it properly along with the add-on voices the iPhone has been able to use for like... over a year, but being stuck in no-news-limbo with no end in sight. And even though they have been notified of customer complaints/installation issues, those people don't even get acknowledged while Crimson Technology goes and runs off across the world to show off their incomplete/buggy software in order to get more people's money. :}

Prism · Oct 15, 2019

I wonder if Crimson Technology is hard up for cash from development costs and is going to the show for promotion and to look for investment.

Voidquestions · Oct 15, 2019

I know Zunko's creator is pretty protective of Zunko and doesn't want her to be defamed. Maybe he's having second thoughts. I wonder what's the hold up with every other voice though. The only starting girl voice, Iroha, doesn't match a wide range of personalities/bodies. For me, as a male, she doesn't even sound female.

uncreepy · Oct 15, 2019

Voidquestions said:
I know Zunko's creator is pretty protective of Zunko and doesn't want her to be defamed.

Er, have you looked at Zunko's official twitter? They do a great job at sullying her name themselves. (Retweeting creepy fan art and adult doujins of her and the other girls related to her, even though they're tweens or teens.)

Anyway, I don't think that the add-on voices are necessarily the problem. The voices that are already available for purchase for iPhone are Zunko, Peroro, Tsubasa, Yoshida-kun, Miranda, and Tsukasa. The only voice that is for sure confirmed to be a new vocal is Queen Shuffle (whose tweet thread about the matter stopped after sharing the box art and hasn't updated for months). Also, Zunko's twitter reminded everyone that she would be on the PC version soon after the Windows version finally came out. All of the voices I listed were happily demonstrated on DTM Station's livestream as well (except Queen Shuffle).

I feel like maybe the problem is on Crimson Technology's side? But it's so weird they haven't been able to give us any sort of reason for it being late. All they ever say is a new estimated release date with no explanation/apology. I thought maybe they were having bugs while trying to go from an app to a PC program, but it doesn't make sense. Why are they being so secretive? Do they not have enough workers? And the workers they do have can't figure out the problems/don't have enough time? Why would the free voices (CRIMMZOH, Iroha, Minato) work on the PC ver but not the other ones? Is the issue rather how to sell the voices? I know the iPhone one is through in-app purchases, is selling on Amazon that difficult? Or are they trying to fix people's complaints before releasing the voices so people feel more satisfied? Sorry for asking so many questions that no one knows the answer to.

Trevor · Oct 16, 2019

Prism said:
I wonder if Crimson Technology is hard up for cash from development costs and is going to the show for promotion and to look for investment.

After researching, I would say it's just a new tech company... similar to how alter/ego was first created. Alter/ego did generate buzz, but the team couldn't compete with powerhouses like Yamaha due to the lack of resources. These things spring up all the time with innovative ideas that lack the technology, resources, funds and people needed to develop a product past the "functional" phase. Voidol was ambitious and the product is a great novelty. It was just too large of a task imo. In the future (as with many starter companies with considerable outside support), we may see a practical version of the software. I wouldn't call it a cash grab. It's just the best the team could do at the time.

P.S. I looked at the papers "published" for brAInMelody. It looked like the passion project of a researcher at a middle-of-the-road school who couldn't generate enough of a response to push it past a primitive toy.

Prism · Oct 17, 2019

I think it might be on how to sell the voices and making sure it meshes well and is hard to pirate

uncreepy · Nov 10, 2019

I was stalking the Voidol tag on Twitter and saw Iroha's voice provider was at CEATEC 2019 (where they showed off brAInMelody and Voidol). They had a flyer in English, which she has a backwards picture of.

Here it is flipped and edited a little; it's still very hard to read.

Voidol changes your voice into a character voice and helps for Vtuber or live streaming.
Windows version of "Voidol Powered by RC voice" is now on sale. It can change your voice into various character voice like a cute girl voice or a handsome man voice on real-time.
The application ?? on the paid application category of Mac App Store, amazon Japan and Rakuten ??? in Japan.

"Otomiya Iroha" female character, "Kanade Minato" male character, and CRIMMZO (??? character) are prepared as the ??? in the
software. More voice models will be released in a later date, including popular characters such as "Tohoku Zunko" or "Taka no Tsume Yoshida-kun" and
Voidol original characters such as "Koeno Tsubasa" by Koiwai Kotori.

Helpful Function for live-streaming or contents creation.
???

Functions
Able to change narrator type.
Three preset voice models.
Mixing of background sound
Space Effect
Noise Gate

Welp. CEATEC was on October 15th ~ 18th and this flyer says the add-on voices are coming later. Total time of me waiting to buy Tsukasa so far since this thread was created: 220 days. I hope when they finally release them, the quality will have gone up.

Voidquestions · Nov 14, 2019

It's up

Edit: Link with all voice previews

Voidol Voice Models

A list of add-on voice models for Voidol. New voice models are arriving one after another!

crimsontech.jp

uncreepy · Nov 14, 2019

Voidquestions said:
It's up

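Tsubasa Koeno (CV: Kotori Koiwai)

A voice model that lets you take on the voice of popular voice actress Kotori Koiwai. Please use it together with the voice changer Voidol.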

crimsontech.jp

Thanks for letting us know!

Upon starting Voidol, there is an update available (from 1.1.1 to 1.2.0), which adds the new voice called "Rice-Chan". (The .zip file includes user guides in English, Japanese, and Chinese.)
Note: Avast kept blocking the VoidolSetup.exe, I had to add the folder it was in as an exception to its protection scan in order to update. orz

Clip of me testing her with both the female and male model options:

Source: Add-on voice models for "Voidol - Powered by RC voice -" now on sale, including "Eagle Talon Yoshida-kun (CV: FROGMAN)", "Zunko Tohoku (CV: Satomi Satou)", "Kiritan Tohoku (CV: Himika Akaneya)", and more

Free default voices (have to buy the base software to use them):
CRIMMZOH (くりむ蔵) / CV: ?
Iroha Otomiya (音宮いろは) / CV: Mayu Tohno (遠野まゆ)
Minato Kanade (奏ミナト) / CV: ?
Rice-chan (ヨネちゃん) / CV: ?

6 out of the 12 planned voices are now available for purchase:
Zunko Tohoku (東北ずん子) / CV: Satou Satomi (佐藤聡美)
Cutie Alien Peroro (キューティ・エイリアンペロロ) / CV: ?
Tsubasa Koeno (声乃ツバサ) / CV: Kotori Koiwai (小岩井ことり) [Note: Same VP as Meika Hime/Mikoto]
Eagle Talon Yoshida-kun (鷹の爪団　吉田くん) / CV: FROGMAN
Dub fairy Miranda (吹替の妖精ミランダ) / CV: Maya Okamoto (岡本麻弥)
Tsukasa Otoshiro (音城ツカサ) / CV: Takayuki Fujimoto

The other 6 will be released during November:
Queen Shuffle (王女シャッフル) / CV: Tomoko Uzawa (鵜澤朋子)
Jack Blow (ジャック・ブロウ) / CV: I unfortunately don't know the reading to his name (笹井崇裕) [Note: They still haven't added more drawings of him even though we've been waiting since post #13]
Ichiru Senkin (千色いちる) / CV: Mai Kadowaki (門脇舞以)
Zombi-Ko Cafeno (カフェ野ゾンビ子) [Note: This is the only new vocal we didn't know was coming, this is her YouTube channel: Zombi-Ko Channel ]
Gate Jobs (ゲート・ジョブス) / CV: AIJI [Note: I still can't find info on this person other than that blurry picture from the promotional video on Crimson Technology's YouTube]
Kiritan Tohoku (東北きりたん) / CV: Himika Akaneya (茜屋日海夏)
^ The only name that I noticed during the livestream that isn't on this list is "Pepper".

Unfortunately, each voice costs $38.47 (¥4,180)!!! I was expecting half that price based on how much each voice costs on the phone version! With the quality being questionable, I am not sure if I will still splurge for Tsukasa or not. orz

Prism · Nov 14, 2019

For something that questionable I'm going to wait for a sale

uncreepy · Nov 14, 2019

The stupid add-on voice catalog doesn't have any actual hyperlinks to hear demos of their voices, so I had to use their search function:

Peroro:

Cutie Alien Peroro

Add-on voice model for Voidol

crimsontech.jp

Tsubasa:

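Tsubasa Koeno (CV: Kotori Koiwai)

A voice model that lets you take on the voice of popular voice actress Kotori Koiwai. Please use it together with the voice changer Voidol.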

crimsontech.jp

Zunko:

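Zunko Tohoku (CV: Satomi Satou)

A voice model that lets you fully take on the voice of Zunko Tohoku, who is voiced by Satomi Satou. Please use it together with the voice changer Voidol.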

crimsontech.jp

Yoshida-kun:

Eagle Talon Yoshida-kun (CV: FROGMAN)

A voice changer that lets you fully take on the voice of the popular character Eagle Talon Yoshida-kun!

crimsontech.jp

Miranda:

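Dub Fairy Miranda (CV: Maya Okamoto)

A voice model that lets you fully take on the voice of "Dub Fairy Miranda", who is voiced by Maya Okamoto. Please use it together with the voice changer Voidol.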

crimsontech.jp

Tsukasa:

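Tsukasa Otoshiro (CV: Takayuki Fujimoto)

A voice model that lets you fully take on the voice of "Tsukasa Otoshiro", who is voiced by Takayuki Fujimoto. Please use it together with the voice changer Voidol.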

crimsontech.jp

Somehow these clips sound way higher quality (as in no crackling) than what normal people can replicate. But usually the guy's voice being converted sounds worse/closer to the results people can expect. It seems like the further you go in this list, the worse the quality gets, and guys trying to convert their voices seem to get worse results than women converting theirs. But women trying to convert to a male voice doesn't seem to sound good either.

Maybe they will have a sale once the other 6 voices are released?

Voidquestions · Nov 14, 2019

I think the reason the guy-to-girl voice conversion sounds off is because the guy (bottom left in the samples) they used to train the machine learning algorithm hardly has a speck of testosterone in his voice.

uncreepy · Nov 14, 2019

I think the machine was taught by at least 4 people, it seems, not just that one guy (he might not even have taught it, maybe he is just demoing it?).

CRIMMZOH has female 1, female 2, male 1, male 2.
Kanade Minato has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Iroha has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Rice-chan has female 1, male 1.

I think the high pitch conversion = literally just pitched up.
Personally, even though I'm a girl, I've had some luck with the male setting for certain characters. Maybe the female voices used to train it were higher pitched than the typical English-speaking girl's?

Out of curiosity, I checked Crypton's Cherry Pie and each voice (Miku, Rin, Len, at least) appears to have 2 male and 2 female patterns to pick from.
Male 2 CV01, Female 2 CV01, Male 2 CV01 (GAN), Female 2 CV01 (GAN).
Note: 2 = "to", CV01 = Miku, I have no clue what GAN stands for.

One last thing I noticed is that Cherry Pie doesn't have crackling like Voidol does, but it does have muffled parts in the untuned demo.

Voidquestions · Nov 14, 2019

uncreepy said:
I think the machine was taught by at least 4 people, it seems, not just that one guy (he might not even have taught it, maybe he is just demoing it?).

CRIMMZOH has female 1, female 2, male 1, male 2.
Kanade Minato has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Iroha has female 1, female 1 (high pitch conversion), female 2, female 2 (high pitch conversion), male 1, male 2.
Rice-chan has female 1, male 1.

I think the high pitch conversion = literally just pitched up.
Personally, even though I'm a girl, I've had some luck with the male setting for certain characters. Maybe the female voices used to train it were higher pitched than the typical English-speaking girl's?

Out of curiosity, I checked Crypton's Cherry Pie and each voice (Miku, Rin, Len, at least) appears to have 2 male and 2 female patterns to pick from.
Male 2 CV01, Female 2 CV01, Male 2 CV01 (GAN), Female 2 CV01 (GAN).
Note: 2 = "to", CV01 = Miku, I have no clue what GAN stands for.

One last thing I noticed is that Cherry Pie doesn't have crackling like Voidol does, but it does have muffled parts in the untuned demo.

All I know is that GAN has to do with machine learning, but yeah, one complaint in the Amazon JP reviews that rings true is the lack of documentation.

uncreepy · Nov 14, 2019

Voidquestions said:
All I know is that GAN has to do with machine learning, but yeah, one complaint in the Amazon JP reviews that rings true is the lack of documentation.

Personally, I don't really understand what kind of documentation they are looking for. The zip file comes with an installation guide pdf and a link to an explanation of features. I don't know if people just can't figure out how to set up their own mic and mess with the small number of settings? Or they think they are using the software wrong because it has crackling, even though that's how bad the quality always sounds?

It would be nice if they had proper demos that weren't misleading in quality or I guess just had a setup video for mics vs AUX set ups for live streaming, though.

Anyway, I looked up GAN, and it must be "Generative Adversarial Network" (a kind of network able to produce new content).
Where generative = it generates new data,
adversarial = there are two networks pitted against each other: the generator makes "fake" samples, and the discriminator is shown both real references and the generator's fakes and tries to tell them apart, I think?
You're supposed to keep both sides of the GAN relatively balanced, or training doesn't work well (for example, if the discriminator gets too good too quickly, the generator can't learn much from it).
I barely understood what I tried to learn about this, but I'm curious how (for example) the normal "Male 2 CV01" vs "Male 2 CV01 (GAN)" works. Maybe one is higher quality or faster than the other?
I wonder if Voidol uses GAN at all?
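
(For anyone who wants to see the generator-vs-discriminator idea in action, here is a toy sketch, assuming the simplest possible setup: the "real data" is just numbers drawn around 4.0, the generator is a straight line, and the discriminator is a single logistic unit. All names and numbers are made up for illustration, and this says nothing about how Cherry Pie or Voidol actually work.)

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: G(z) = a * z + b
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # a "real" reference sample
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        x_fake = a * z + b            # the generator's "fake" sample

        # Discriminator step: push D(x_real) toward 1 and D(x_fake) toward 0.
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w -= lr * (-(1.0 - d_real) * x_real + d_fake * x_fake)
        c -= lr * (-(1.0 - d_real) + d_fake)

        # Generator step: push D(x_fake) toward 1, i.e. try to fool the
        # discriminator (gradient of the loss -log D(G(z))).
        d_fake = sigmoid(w * x_fake + c)
        grad_x = -(1.0 - d_fake) * w
        a -= lr * grad_x * z
        b -= lr * grad_x
    return (a, b), (w, c)


generator, discriminator = train_toy_gan()
print("generator (a, b):", generator)
print("discriminator (w, c):", discriminator)
```

The two updates pull in opposite directions on the same fake sample, which is the "adversarial" part. Real GANs use deep networks and many samples per step, so treat this only as a picture of the training loop.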

Prism · Nov 14, 2019

I hate how convoluted it is. I know it was made for live broadcast, but I wish there was a VST version that could be used in DAWs.

frankensalad · Nov 15, 2019

currently, what are the steps someone would have to take if they wanted to run a pre-recorded audio file through it? I'm interested in the software, but I would prefer if I could record the audio first to make sure I get a good take and THEN run it through Voidol.

uncreepy · Nov 15, 2019

frankensalad said:
currently, what are the steps someone would have to take if they wanted to run a pre-recorded audio file through it? I'm interested in the software, but I would prefer if I could record the audio first to make sure I get a good take and THEN run it through Voidol.

I personally can't figure out how to do it. My setup is:
Voidol + audio box (Behringer) + cardioid mic / have Audacity up and recording "What U Hear" (might be called "Stereo Mix" on other computers) (this is the PC's sound card basically)

Because of this setup, the only way to play pre-recorded audio is off of an mp3 player or phone held up to the mic (because if I played the clip off of the computer, it would be using up "What U Hear" coming out of the computer).

I've tried to have Voidol use pre-recorded audio by holding my mp3 player up to my mic, but the results were the usual level of bad, with parts of the audio dropping out and not being converted to intelligible speech. At least when doing it in real time, you can attempt to say the same line in different ways until it picks it up. (Hopefully this explanation makes sense.)

Note: Voidol is recommended for live streaming and VTubers. They say it's good for making music, but I seriously doubt that. Unless you enjoy listening to music sung by a really crackly voice.

The base cost of Voidol: $22
+ 1 add-on voice: $38 [unless you're okay with the 4 free default voices]
= $60

The base cost to use Crimson Technology's recommended equipment (assuming you don't already own these things):
+ having to buy an audio box: $25ish for entry level one
+ an XLR male to female mic cable: $7ish
+ a mic: $18ish for an entry level one
+ a pop guard: $3
+ a mic stand: $12ish
= $65

Total cost to run Voidol with recommended equipment and an add-on voice: $125
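
(If you want to redo this tally with your own prices, here is a small sketch. The figures are the rough USD estimates from the list above, and the dictionary labels are just my own names for each item.)

```python
# Rough USD prices quoted in the post above; the "-ish" items are estimates.
software = {
    "Voidol base software": 22,
    "one add-on voice": 38,
}
equipment = {
    "audio box (entry level)": 25,
    "XLR male-to-female mic cable": 7,
    "mic (entry level)": 18,
    "pop guard": 3,
    "mic stand": 12,
}

software_total = sum(software.values())
equipment_total = sum(equipment.values())
grand_total = software_total + equipment_total

print(f"Software:  ${software_total}")
print(f"Equipment: ${equipment_total}")
print(f"Total:     ${grand_total}")
```

Drop any equipment you already own from the second dictionary to get your own number.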

Even though I bought Voidol, I think the audio quality is sub-par, you can't use Voidol directly in a DAW, and the software has no ability to export/record audio (you have to do it yourself in an external program).

Honestly, you might as well wait and see what Crypton's Cherry Pie has to offer before making a decision. The official demo showed them converting pre-recorded audio to Miku's voice with seemingly the click of a button, you can customize the voice further to be higher/lower pitched/faster/growly/whatever. Plus, it's supposed to also work in real-time. Voidol's only option is switching between the female/male models in terms of customization and everything has to be real-time.
