
Hark! Vocaloid Lyrics Wiki is having a discourse.

Vector

Passionate Fan
Mar 6, 2022
238
Context, for those who missed it:


And the wiki discussion: https://vocaloidlyrics.miraheze.org/wiki/Vocaloid_Lyrics_Wiki:General_Discussion#Suspected_AI_Usage:_OKISO_-_CryDie

It seems likely to me, both from the example in the video, and because it should be absolutely trivial to defuse the situation...if one had a VSQ or similar. A screenshot or video of the project takes minutes to make.

Though I do find myself agreeing with the commenter cautioning against cultivating an environment where people automatically presume something is AI until they see proof, because that's also annoying for other reasons. I want neither that situation nor an explosion of poseurs.
 

MagicalMiku

♡Miku♡
Apr 13, 2018
2,528
Sapporo
I totally agree with Vector. Also, while I'm not familiar with that wiki, in general it is better to follow what Wikipedia usually does: put a warning or message at the top of the page, plus notices that the entry "needs more sources".
Or they could create a page listing all producers/songs/videos whose content needs more verification. That way, a user who doesn't know anything about the situation can at least read that page or warning and understand that something is not yet clear.
 

WacoWacko39

VFlower devotee
Jul 22, 2025
114
19
Florida. U.S.A
The last topic the wiki discussed was "Zako," for which the consensus was a moderate response. I have a feeling a harder stance may be taken on this one, though. I only learned of all this today, and I believe I'll sleep on it before giving my full opinion.
 

Luxie

Kagamine Fan
Aug 3, 2022
67
Eastern USA
because it should be absolutely trivial to defuse the situation...if one had a VSQ or similar. A screenshot or video of the project takes minutes to make.
This is what confuses me most about producers facing AI allegations. Are there no demos, VSQs, and no instruments to take a picture of in the DAW? I forget who, but I saw an Akita Neru song that was facing AI allegations (sounding very clean, a bit awkward, and well tuned despite it being the “producer’s” first song and it taking “two weeks”, but I digress). I remember distinctly thinking that if they sent a video or a few pictures of the DAW, then they’d be in the clear!! Not doing so makes them seem guilty.
It’s so confusing to me. :miku2_move:
 

Tortoiseshel

Aspiring Fan
Aug 23, 2021
61
If it turns out that OKISO lied about their songs containing genuine VOCALOID output when the vocals were actually generated by a program like SUNO, I believe the lyrics wiki/VocaDB/any community resource would be fully justified in removing said songs from their catalogs. Same as if someone were to try passing off vocoded or autotuned human vocals as being from a vocal synthesizer. This is a community for synthesized voices, these are not synthesized voices- or at least, not the ones you're saying they are; get outta here you liar.

However, it looks to me like the main complaint the community has in this situation isn't that OKISO (may have) lied about using AI, but that they (may have) used AI at all, period. There's literally a comment on that Lyrics wiki discussion page that says "AI songs do not belong in the Vocaloid community". Which I just find kind of funny because VOCALOID (the product) is AI now. And so is Synthesizer V, and VoiSona, and Neutrino, and VOX Factory, and DiffSinger, and NNSVS, and basically every "modern" vocal synthesis engine. Even Emvoice's most recent vocals use AI! I kinda love you Emvoice, keep being you.

I understand that AI, especially so-called "generative AI", is a real hot topic these days and a lot of people have a lot of strong feelings about it. But whenever I see people in the vocal synth community, particularly the more "fannish" side of the community, talk about how "generative AI" is all soulless garbage... and then in the next breath/tweet talk about how much they love Hatsune Miku or Kasane Teto or whoever, I can't help but be struck by the sheer amount of cognitive dissonance on display.

Because sooner or later, the vocal synth community is going to have to come to terms with the fact that vocal synthesis is "generative AI" now. At least, most of it is. I know there are still some concatenative holdouts like UTAU and retro-style software like Chipspeech, but the vast majority of new releases, including virtually all major commercial releases, are going to use artificial neural network technology. And this really shouldn't be surprising to you at all if you've been paying attention to the history of vocal synthesis development. Stuff like ElevenLabs and 15(dot)ai and even SUNO aren't like, aberrations or "tech bros" infringing on "our turf"; they're the natural evolution of it. VOCALOID, CANTOR, LaLaVoice, DECtalk, Software Automatic Mouth, all the way back to 1961 when researchers at Bell Labs first made an IBM computer sing Daisy Bell, it's all led us to this.

As you might be able to tell by my use of snarky quotation marks, I'm really not a fan of the term "generative AI". I don't think its usage or definition are consistent or coherent enough to be useful as anything other than snarl words to denote any use of AI the speaker doesn't like. But if it were to have a consistent or coherent definition, I cannot imagine it not including stuff like VOCALOID6 or Synthesizer V Studio 2. They use "artificial intelligence" (deep neural networks, to be specific) to generate an output. Like, it's right there!

So if you wanna be one of those people who are staunchly opposed to all "generative AI" no matter what, I would ask that you at least be consistent with your stated values and include (modern) vocal synth software in that. Or! Maybe you could try to approach it with a level of nuance and judge things on more of a case-by-case basis. Maybe "generative AI" can include a lot of really harmful stuff, like spam or dangerous misinformation or malicious deepfakes. But maybe it can also include totally benign stuff, like Vocaloid songs, and we should be able to identify and articulate which is which instead of throwing everything into the Slop Bucket sight unseen. I know which way I'd like to look at it, personally.
 

WacoWacko39

VFlower devotee
Jul 22, 2025
114
19
Florida. U.S.A
This is what confuses me most about producers facing AI allegations. Are there no demos, VSQs, and no instruments to take a picture of in the DAW? I forget who, but I saw an Akita Neru song that was facing AI allegations (sounding very clean, a bit awkward, and well tuned despite it being the “producer’s” first song and it taking “two weeks”, but I digress). I remember distinctly thinking that if they sent a video or a few pictures of the DAW, then they’d be in the clear!! Not doing so makes them seem guilty.
It’s so confusing to me. :miku2_move:
Oddly enough, it reminds me of the Watergate scandal lol.
 

Vector

Passionate Fan
Mar 6, 2022
238
Not to see this continue to derail into a general thread about AI use or what AI even is, but there are some fundamental differences between Vocaloid 6 and something like Suno. (AI is a magically vague and meaningless term in computer science, which I have a bachelor's degree in, but I'm making a promise to myself to not digress too much.)

Vocaloid is, despite using machine learning models to generate sound instead of raw samples stitched together, still a glorified sampler. MIDI in, sound based on a specific thing out. Similarly, nobody is too fussed about 60 GB of piano samples on their hard drive being automatically selected and played back by a MIDI trigger (e.g. a Kontakt library) versus those samples being crunched down into a mathematical model representing the sound of a piano to turn MIDI inputs into a piano sound (a physically modeled instrument). It's just a compression of data to achieve the same goal of making an instrument for a musician to operate.

Suno is to Vocaloid as ordering DoorDash is to making dinner. People will definitely have some things to say if you say you're making them dinner, but order from a restaurant on DoorDash, plate up the food, and hide the boxes. Art is a human process and a form of communication, and people don't like it when someone is a fake. They don't want someone to pretend to make them dinner and lie to them about it.

Typing a DoorDash order into Suno, saying "make me some Miku spaghetti," and getting back something vaguely resembling that is not equivalent to sitting down and actually conceiving of Spaghetti Miku and spending hours of your life painstakingly creating it from your own mind. (Similarly, seeing people's Blender art is really cool. Seeing an image generator like Stable Diffusion puke out a response to someone's prompt is novel at first and eventually tiresome.)

Suno is also antithetical to the continued existence of Miku, Crypton and Vocaloid. If someone made a janky Utau by clipping Miku phonemes from songs, we wouldn't accept it, and neither would Crypton's lawyers. If some company builds a machine that pukes out a facsimile of Crypton's product without licensing it, that's firstly not something an enthusiast community should tolerate and secondly a potential long term problem for the continued development of Miku.
 

Grzesiek11

New Fan
Aug 6, 2025
11
21
Poland
grzesiek11.stary.pc.pl
They use "artificial intelligence" (deep neural networks, to be specific) to generate an output.
It really does matter what the output actually is.

Speech synthesis using machine learning, while technically the output of one kind of text-to-audio model, is still speech synthesis. I doubt anyone here believes there's something wrong with speech synthesis: we speak, and we also taught computers to speak. Depending on the context, you might want a person to speak and not a computer, but in general computers speaking is not something that takes away from humanity.

What I personally, and many others, do not like is when computers try to mimic creativity (which machine learning makes possible, hence "AI" getting lumped into the equation). So while there's nothing technically wrong with text-to-image or text-to-video models, their output is only ever meant to achieve the goal of generating "art". Creativity is a human thing; it comes from work, experience and thought, not from calculating the "most likely" outcome.

Stuff like ElevenLabs and 15(dot)ai and even SUNO aren't like, aberrations or "tech bros" infringing on "our turf"; they're the natural evolution of it.
With that said, I can draw a clear-cut line here. ElevenLabs? Sure, it's just speech synthesis. 15(.)ai? See above. Suno? No, that is just generating "music".

There is another aspect to this, which is that those mentioned machine learning speech synthesis models are not vocal synths. This is just not something we care about - tuning is a craft, those just offer generic (and honestly, in my opinion, kinda low quality) speech. Does that mean they're bad? Nah, just... not interesting.

Take a look at the "AI" guidelines for VocaDB submissions:

Song entries for AI generated songs (AI generated music or AI generated uncontrolled vocal synthesis) are not allowed.
More or less encompasses those two points - creativity, and this being a vocal synth community.

When people say they hate "generative AI", they probably just mean that they hate "AI creativity". That is my case, at least.
 

junky

Aspiring Fan
Apr 30, 2022
54
Agree with this heavily. On top of that, Synthesizer V uses diffusion models just like SUNO and Stable Diffusion do, so there's a decent chance that SynthV itself is also a generative program that simply has an ethical dataset, higher quality vocals, and higher user control. I think AI should be regulated rather than outright banned and that vocal synthesizers are a perfect example of how AI can be responsibly developed and used for good. Blanket hatred of all AI reminds me of when people accused digital artists and music producers of "just clicking a button." People were calling Vocaloid "AI slop" before AI was even capable of content creation.
 

carmenxsleepy

SOLARIA <3
Feb 12, 2025
25
www.youtube.com
If it turns out that OKISO lied about their songs containing genuine VOCALOID output when the vocals were actually generated by a program like SUNO, I believe the lyrics wiki/VocaDB/any community resource would be fully justified in removing said songs from their catalogs.
I just want to bring to light that the mods of VocaDB have decided to remove all of OKISO's songs besides their covers and drama PVs, as those have been verified to use real synths. If anyone is interested, here is the document the mods posted, with their analysis of the songs and the statement explaining how they came to this conclusion.
VocaDB Analysis

I'm really disheartened about this whole situation and the fact that we now even have to verify that art is not AI. AI was supposed to help the synths sound better, not imitate the synths. :miku3_move:
 
