
Dictionary Use and Thoughts on Phonemes

Nokone Miku

Aspiring Lyricist/Producer
So far in all my projects I've found myself building each vocal directly from phonemes. The trouble is that we sing sounds, not words. I basically sing the line out loud and then mimic the sounds in Piapro.

For example, the dictionary has the word "face" as [ f eI s ], which makes sense. But when it's sung it sounds too abrupt, almost just like "ace." To get what I wanted I ended up using [ f e ][ j I ][ - I s ], something like "feh'yihs" (with velocity/opening lowered on the "f"). One struggle was the word "laughs," where the dictionary's [ l0 { f s ] ends up sounding like Miku trying to say "rays" with cotton in her mouth. I finally settled on [ l0 e ](1/16th gap)[ - f s z ] with opening and velocity way low.

Then of course there's the word "the," for which I rarely seem to use [ D V ]. I've ended up using stuff like [ T e ], [ d V ], [ z i: ], [ t I ], [ D Q ], [ th @ ], [ dh e ], all based on context.

Some other odd words I've ended up using:
  • "cries" = [kh @][ r Q I ][ - z z ] like "kur raheez"
  • "chance" = [ Sil tS e ][ - e n z ] like "chenz"
  • "me" = [ Sil m e I j ] like "meh ihy"
  • "for" = [ f_0 f @U ][ - O@ ] like " hfoh wor"
  • "restart" = [ r i: z ](1/16th gap)[ s t O: ][ Q@ t ] like "reez stau'hart"
  • "heart" = [ h Q@ r ][ r @ t ] like "harr'rut"
  • "carve" = [ kh aU ][ - r v ] like "cow wurv"
  • "love" = [ l0 @ ][ V ][ - v v ] like "lar uhvv"

I've also found that when a voicebank sings outside its optimal range, or the note/syllable is really short, the pronunciation changes a bit. For example, when I had the word "for" on a 1/16th note, [ f O@ ] barely produced any sound at all. I tried all kinds of things like [ f @U ][ - r ], [ v @ r ], [ f v Q r ], [ v @U ][ - @U r ], [ f Q ][ u: r ], [ v U h r ], all with mixed results. Even adjusting the portamento alters the sound when the note is that short. I finally ended up just using [ v V ], sort of a "vuh" sound. Your brain sorta fills in the "r" sound without actually having to hear it.
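For a sense of just how little time those short notes give a consonant, the arithmetic is simple. Here's a minimal Python sketch; the 120 BPM value and the assumption of 4/4 with a quarter-note beat are only examples, not from any particular song:

```python
# How long the fractional notes/gaps mentioned in this thread actually last.
# Assumes 4/4 time where one beat is a quarter note; 120 BPM is an example value.
def note_ms(fraction_of_whole_note, bpm):
    beats = fraction_of_whole_note * 4.0   # a whole note spans four beats
    return beats * 60000.0 / bpm           # milliseconds per beat = 60000 / BPM

for frac, name in [(1 / 16, "1/16th"), (1 / 32, "1/32nd"), (1 / 64, "1/64th")]:
    print(f"{name} note at 120 BPM: {note_ms(frac, 120):.1f} ms")
# -> 125.0 ms, 62.5 ms, 31.3 ms: not much room for a consonant to speak.
```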

One recurring high-pitched syllable I struggled with was "two." If you just put [ th u: ] you get a shrill "oo" sound and the "t" isn't really audible. I literally, actually spent at least eight hours on this one word, trying every variation of phonemes, timings, vibrato, portamento, and parameters. What I finally arrived at for the line "the two of us would" was: [ d V Sil ][ T t U ](1/16th gap)[ U w ][ Sil w Q v ][ V ][ - V s ][ s w U ][ - d ], with fast, shallow vibratos and the opening low on the word "two." So like "duh - thTuh -- uw - wuv uhs swood."

One last note: I've read in several places that velocity and opening don't matter much. But I've found they can make a big difference, especially on m, w, d, t, p, z, f, v. And the opening parameter affects almost every vowel sound. On one song I maxed out the opening and boosted the clearness by 20 across the whole song, and it made the vocals sound a little more like they're yelling.

I'm still getting to grips with the oddities of the Japanese phonemes. In some cases putting certain consonants together makes a unique sound. Stringing two or three vowels together with different amounts of vibrato also creates some interesting sounds. In some cases it makes a difference whether it's [ V V ] or [ V ][ V ] or [ V ][ - V ]. Then of course the various uses of the [ *_0 ] devoiced sonorants, both at the ends and in the middle of words, cause noise most of the time but other times give some really interesting pronunciations. And the [ ? ] glottal stop is sorta hit-or-miss as to whether it does anything noticeable.

Sorry if this is a long post. I've just had a lot of this on my mind. I realize that part of this is probably me dealing with a non-native English voicebank. In the future I want to get a native English one, maybe Avanna. Though I also want to get Eleanor Forte at some point. I'll have to learn all the SynthV phonemes. @_@
 

mahalisyarifuddin

Passionate Fan
Using non-native English Vocaloid voicebanks is generally a pain in the butt. Not just because of the odd pronunciations, but also because of the oddities of how Vocaloid works in general. Not to mention the overwhelming use of an American accent in English songs, which only a handful of English Vocaloid voicebanks can fully support. Phoneme changing and note splitting are inevitable in this case. I've never used native English Vocaloid voicebanks myself, but I hear they give you less of a headache, even if not by much?

I personally tend not to rely too much on the built-in dictionaries; I use real dictionaries instead (yeah, I'm not a native speaker wwww). I use the phonemes from Oxford Learner's Dictionaries and the vanilla CMU Pronouncing Dictionary, which was later adopted and modified by Synthesizer V and CeVIO for their respective software. This is also helpful if you want to work with UTAU, where all you can do is input lots of phonetic notation. Oh, and common sense is no less important: for example, if you want the American accent, you don't sing "get it" with a clear t but with a flap.
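For anyone who wants to script those CMU lookups, here's a minimal Python sketch using NLTK's copy of the CMU Pronouncing Dictionary. It assumes nltk is installed and the cmudict corpus has been downloaded once; mapping the ARPABET output onto Vocaloid's X-SAMPA-style symbols is still a manual step.

```python
# Minimal sketch: query ARPABET pronunciations from the CMU Pronouncing Dictionary.
# Requires: pip install nltk, then nltk.download("cmudict") once.
from nltk.corpus import cmudict

pron = cmudict.dict()  # maps lowercase words to lists of ARPABET pronunciations

def arpabet(word):
    """Return the first listed pronunciation, or None if the word isn't in the dict."""
    entries = pron.get(word.lower())
    return entries[0] if entries else None

print(arpabet("face"))    # ['F', 'EY1', 'S']
print(arpabet("laughs"))  # ['L', 'AE1', 'F', 'S']
```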
 

Nezuh

Official Piko Husband
One last note: I've read in several places that velocity and opening don't matter much.
I may be going a bit off topic here, but VEL and OPE (well, mostly VEL) do literally nothing in VOCALOID2 Editor.
Maybe not in V3 either; I don't remember much about that version.

That's why in old songs (like from the V2 era), the only consonants people usually stretched were [n] or [m] in Japanese, since those are separate phonemes.
 

Nokone Miku

Aspiring Lyricist/Producer
I don't think phoneme edits are a bad thing by any means,
I will probably hope to learn to use phoneme replacement to help her pronunciations more than I currently do.
You can edit every last phoneme to your heart's content
I actually am one of those overly neurotic people who phoneme edit probably every single word
I posted this here so I don't hijack the Unpopular Opinions thread.

I was surprised reading this discussion. So far I've been building songs out of nothing but individual phonemes. I just assumed that people often did something similar. I also don't know if it's common to use the cut tool on a lot of words to get the right amount of sustain on the vowels? I work at 1/32nd note grid pretty much the whole time. I also get the impression it's odd that I make extensive use of the Portamento parameter and rarely touch the Pitch Bend parameter.

Here's a sample of some of the more complex bits (extra zoomed in so the phonemes are visible):
[screenshot: piapro-lyric-phoneme-screenshot.jpg]

(I really, really hate to sound like a shill, but) When I posted the Lost Story cover I was really hoping to get feedback on the pronunciation. By the time I was done with it I wasn't sure if it was a substantial improvement over the default pronunciation or if I was just wasting my time on something no one else would even notice? I'd like to know before I finish all my other works-in-progress, because I'm spending a lot of time on them too.

(Sans feedback, I'm just about ready to pay a professional money to critique my stuff. I just gotta find one who's also familiar with Vocaloid.)
 

Leon

AKA missy20201 (Elliot)
I absolutely cut words up, into 2 or 3 notes per syllable (depending) to get the right vowel sustain! Some banks are really bad about getting to the consonant too quickly. Tonio is a really bad culprit of this LOL

I'll give that a listen! :)
 

mobius017

Aspiring ∞ Creator
I was surprised reading this discussion. So far I've been building songs out of nothing but individual phonemes. I just assumed that people often did something similar. I also don't know if it's common to use the cut tool on a lot of words to get the right amount of sustain on the vowels? I work at 1/32nd note grid pretty much the whole time. I also get the impression it's odd that I make extensive use of the Portamento parameter and rarely touch the Pitch Bend parameter.
(I really, really hate to sound like a shill, but) When I posted the Lost Story cover I was really hoping to get feedback on the pronunciation. By the time I was done with it I wasn't sure if it was a substantial improvement over the default pronunciation or if I was just wasting my time on something no one else would even notice? I'd like to know before I finish all my other works-in-progress, because I'm spending a lot of time on them too.
I don't think what you're doing (building out of individual phonemes, splitting syllables) is necessarily unusual--I'm sure I've heard of other people who do both of those things. Like I mentioned in the Unpopular Opinions thread, I think that there are sort of two different approaches to the job here (more default dictionary and more custom phoneme), and you seem to be in more of the latter group. I imagine that you/folks who prefer the custom phoneme approach simply have more specific/stringent ideas about what you want the final pronunciation to be, and that's perfectly fine.

Or maybe...I'm going to revise my opinion a little here. It's not so much that there are two different ideologies about how to do this. I think it's arguable that the dictionary is intended to provide a good starting point, and phoneme replacement is a more advanced technique. Folks new to using a vocal synth can lean on the dictionary until they are more comfortable or until they find the need to replace phonemes in order to achieve the results they desire. (This is more or less where I am. I was being honest in Unpopular Opinions when I said that I'm often ok with the default pronunciations, but I do try to improve them when 1) I spot an issue that bothers me and 2) I can find a way to fix it.)

Alternatively, the dictionary could be viewed as a time-saving device intended to do the brunt of the work. Obviously, how much of the work it ends up doing for a given person depends on his/her requirements.

I listened to your Lost Story cover. It's a little difficult to compare directly without hearing what the default dictionary would have provided, but I thought the pronunciation was really fluid and much better than what I think the default dictionary would give on its own. I think you should be really happy with it!

You mentioned something somewhere, I believe, about looking for tips to make...tuning, I think it was...faster. Phoneme replacement isn't really tuning per se, I guess, but I'd consider making a list of the most common words you end up phoneme-replacing. That way you can refer to the list later rather than having to reinvent the words each time. At some point you'll undoubtedly just start remembering the phonemes anyway, but the list might be helpful until then.
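As a sketch of what that list could look like in practice, here's a tiny Python helper that keeps the words in a JSON file next to your projects. The file name is hypothetical, and the example entries are just ones mentioned earlier in this thread.

```python
# Minimal sketch of the "keep a list of words you phoneme-replace" idea:
# a plain JSON file mapping lyric words to the phoneme strings you settled on.
# The file name and the example entries are illustrative; use whatever notation
# you actually type into the editor.
import json
from pathlib import Path

WORDLIST = Path("my_phoneme_words.json")  # hypothetical file

def load_wordlist():
    return json.loads(WORDLIST.read_text()) if WORDLIST.exists() else {}

def remember(word, phonemes):
    entries = load_wordlist()
    entries[word.lower()] = phonemes
    WORDLIST.write_text(json.dumps(entries, indent=2, ensure_ascii=False))

remember("face", "[ f e ][ j I ][ - I s ]")
remember("the (before vowel)", "[ z i: ]")
print(load_wordlist().get("face"))  # -> "[ f e ][ j I ][ - I s ]"
```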

In that vein, as sort of a side note, seeing someone who's as focused on pronunciation as you seem to be, and who's spent so much time on it, is actually kind of exciting to me. I think you're going to be very good at it. I hope at some point you'll think about writing a guide for the site's Resources section, not necessarily listing the different words/their phonemes, but going over different techniques you use (e.g., stuff like stretching vowels by using the cut tool, or whatever you're doing with Portamento).
 

Nokone Miku

Aspiring Lyricist/Producer
I hope at some point you'll think about writing a guide for the site's Resources section, not necessarily listing the different words/their phonemes, but going over different techniques you use (e.g., stuff like stretching vowels by using the cut tool, or whatever you're doing with Portamento).
You've given me some stuff to think about. And thanks for the feedback. I've thought about writing up a guide about stuff I've discovered in my hours messing with Piapro. I just felt I'm too new to be telling people anything. ^^;

Things I could potentially cover:
  • I was surprised that leaving gaps between connected syllables affects how they're sung. A recent example: [ s t r e ][ - e N ] sounds a bit different if you make it [ s t r e ] (gap) [ - e N ]. Or how [word][word][word] differs from [word] _ [word] _ [word].
  • Cases where doubled-up consonants and combined consonants help. Like ending a word with [ - z z V_0 ] or using [ f v ] to get a stronger "f" sound.
  • Portamento tweaks between words of different pitches are obvious, but I'm still trying to understand how portamento affects adjacent words at the same pitch (without just guessing and messing with it until it sounds right).
  • Smoothing over problematic vowel combinations, awkward stresses, or engine noise with different variations of very fast, very shallow vibrato.
  • To make beginning and ending consonants stronger and clearer we often sing with the consonant tacked onto the following word. It's often useful to replicate this. Like "one of us" becomes "wun nof vus." Or "this is all" becomes "this siz zall."
  • Differences in using [ Sil ] or [ *_0 ] or Pitch Snap Mode for more firm transitions between notes.
  • Using [ j ], [ h ], and [ w ] to alter certain words.
I don't know how much of this is common knowledge or what people care about. I know a lot of people just tune Japanese covers, so I don't know if any of this would be relevant to them. Soon I'm going to do a cover of "If You're Gonna Jump" by omoi using Otomachi Una so that's when I'm really going to focus on methods of using Japanese phonemes to sing in English.
So you really oughta tell me if you're gonna jump
'cause I wanna be right there with you
what, you thought I'd try an' stop ya?
there's a bigger bond b'tween us two

so grab my hand we'll take the plunge
give terra firma a big thumbs up
wait a sec, I got a weird feelin'
there's somethin' that I'm forgetting
oh, that's right! there was a show I wanted to see
let's call this whole flying thing off for a while
 

v3xman

sshhhhh im new here
Not sure about Piapro but I'm gonna drop my 2 cents here. (also as an aside, I work mostly with English VBs so I'm not too familiar with JP phonemes)

I always see the dictionary as a starting point. Sometimes it gets the result you want, other times you have to start tweaking phonemes.

And oh boy, tweaking/tuning is a skill you get with experience. I'll try to comment on your observations based on what I've learned through several years of tuning Vocaloid.

Things I could potentially cover:
  • I was surprised that leaving gaps between connected syllables affects how they're sung. A recent example: [ s t r e ][ - e N ] sounds a bit different if you make it [ s t r e ] (gap) [ - e N ]. Or how [word][word][word] differs from [word] _ [word] _ [word].
You're correct. Phonemes behave differently depending on whether they are "connected" or "gapped".
A recent example I've encountered is Big Al pronouncing [@] differently when there's a connection to a previous note/phoneme. [@] sounds more like 'U' if there's a connection, but sounds like [V] if you put a gap or [sil]/[?] before the note. Consequently, removing the connection also adds more "strength" to that phoneme sound.

I think what we can infer here is that phonemes in a voicebank are recorded in combinations. Once you're aware of that, it all starts making sense why phonemes sound different. For example, a VB could have samples for the following combinations:
- [e] - by itself; no connecting phoneme before or after it
- [bh e]
- [e e] - or basically an e followed by another e, regardless of whether it's on the same note or not.
etc etc.
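As a toy illustration of that inference (not how any real engine is implemented), here's a small Python sketch of a context lookup. The recorded contexts are made up; the point is only that breaking the connection makes the note start from silence, which can pull a different sample.

```python
# Toy model: a bank stores phonemes in context ("after silence", "after e", ...),
# and the "engine" picks whichever recorded context matches, else a fallback.
RECORDED = {("Sil", "e"), ("bh", "e"), ("e", "e"), ("e", "N"), ("e", "Sil")}

def pick_samples(phonemes):
    """Return the (previous, current) contexts a toy engine would look up."""
    picks = []
    prev = "Sil"                      # a gap or rest acts like silence before the note
    for ph in phonemes + ["Sil"]:     # and silence again after the last phoneme
        pair = (prev, ph)
        picks.append(pair if pair in RECORDED else ("fallback", ph))
        prev = ph
    return picks

print(pick_samples(["e", "e", "N"]))                   # connected: reuses the (e, e) context
print(pick_samples(["e"]) + pick_samples(["e", "N"]))  # gapped: each lookup restarts from silence
```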

  • Cases where doubled-up consonants and combined consonants help. Like ending a word with [ - z z V_0 ] or using [ f v ] to get a stronger "f" sound.
On the same project I've got Oliver, whose [eI] phoneme sounds too similar to [e]. I have years of experience with that VB, so I know the workaround there is to do [eI I] or [e I]: the former adds more emphasis to the 'I' and the latter more emphasis to the 'e'.

And yes, doubling the phonemes may help (in some VBs) to add more strength on the consonants or even vowels.


  • Smoothing over problematic vowel combinations, awkward stresses, or engine noise with different variations of very fast, very shallow vibrato.
IMO these are the hardest ones to solve. Techniques that work on one VB don't necessarily apply to other VBs, even in the same language and engine.
Take for example the Oliver workaround [e eI] I mentioned above. On Oliver, the transition is very smooth and you barely notice it. If you do that on Gumi English, you'll notice a jarring transition going from e to eI. That's just how her VB is programmed, and you'll have to think of another workaround for it. There are a whole ton of "broken vowel transition" glitches in Gumi English; I made a laundry list of them back on VocaloidOtaku (RIP) when they were asking for feedback prior to her release lol.

  • To make beginning and ending consonants stronger and clearer we often sing with the consonant tacked onto the following word. It's often useful to replicate this. Like "one of us" becomes "wun nof vus." Or "this is all" becomes "this siz zall."
That's true. Personally, I use the velocity parameter in Vocaloid to adjust the consonant length (or emphasis) on the note. It takes a couple of tries to figure out the right value, but it is what it is.

Synth-V does it infinitely better with a visual waveform and duration sliders per phoneme (!).

I could go on but this topic could easily fill up a book.

What I can say in the end is:
1. You will learn a lot by trial and error ;)
2. Different VBs behave differently.
3. Keep in mind how VBs are recorded and how the engine concatenates/blends them.
 

mobius017

Aspiring ∞ Creator
Glad it helped!

I think all of those topics would make good material for inclusion, if you're so inclined. (Personally, I'm interested in Portamento between words of different pitches, also, since I've never really touched the Portamento parameter, though I can guess how you might use it if I think about it.) People here are of all skill levels--some are absolutely new to vocal synth/digital music software and benefit from guides related to how a DAW and synth plugins work together, and others are more advanced and are looking for trickier/more niche topics like how to make metal screams. (We have guides for both. ;) ) We also have both cover artists and people who work primarily in original songs. So don't be afraid that the material you're sharing is too simple or be too concerned that everyone has a baseline level of knowledge or won't be interested. If you had to work to learn it, it's worth sharing ;) .

Vocal synth work is, to a certain extent, very much a DIY/learn-on-your-own sort of pursuit (though the manuals included with Vocaloid 5 or Piapro, for example, can be quite helpful), and it's my feeling that the digital music space in general, and possibly vocal synth users in particular, truly depend on guides like these to assist their growth. At least if my experience is any indication, we have to go hunting all over for material describing how to do all kinds of different things or how different things work (music theory, mixing, mastering, how to work with the vocal synths themselves, how to work with our DAWs). If we didn't have tutorials about these different things, we wouldn't get anywhere.

It isn't a Pollyanna space, but despite the drama that is often lamented on Twitter and elsewhere, this kind of sharing is a key feature of vocal synth culture (especially on VVN, though you can find guides for other digital music topics all over, as well as collaborations on the Piapro website). We all grow best when we pool what insights we have and grow together.

Good luck on your cover!
 

mobius017

Aspiring ∞ Creator
One other thing I forgot to mention was that the Vocaloid 4 editor supports custom dictionaries. So as you find phoneme combinations you prefer to use, I think you could save them and have the editor apply them automatically. Hypothetically, you could use the V4 editor to do your basic phoneme work, export a Vocaloid 4 file, and import that into Piapro if you wanted to use EVEC or something. The downsides, though, are the increased complexity, the expense of buying a V4 editor, and the fact that the V4 editor is rather difficult to come by after the release of Vocaloid 5. (I'm not sure if Vocaloid 5 supports custom dictionaries; possibly our resident V5 expert, @patuk, would know.) Plus, my own experience suggests that while there are some default dictionary phoneme combinations you'd want to replace every time, in other cases the phonemes you need differ from one use to the next, so how helpful a custom dictionary will be will vary.

Also, I remembered that our Tuning resource has a section on phoneme replacement with links to phoneme charts and a very limited amount of more technique-based guidance. As you keep working on replacing phonemes, you might find it useful, if you haven't seen it already.
 

Nokone Miku

Aspiring Lyricist/Producer
So, my challenge today was the lyric: "Can you read a heart's words like ink on paper?"
After (an unreasonable amount of time) I came up with a couple variations that basically worked. In the end I went with:

[ k { n ][ j U ][ r I d ][ d V ][ - h_0 ]
[ h Q ][ - Q ][ r V t s ][ s w Q ][ - Q ][ r V t z ][ s l0 aI k ]
[ I n k ] [ Q n ] [ p eI ][ - j p ][ ph O: ] [ O: r r_0 ]


Sorta something like:

can you read duh'
haaruts swaarutz slike
ink on payp poh orr'

I had to up the Clearness and Brightness parameters to get strong enough enunciation. And put aggressive vibrato on "haa" and "swaa."
Sometimes Miku says "heart" with no problem. Other times she just does a little "huwd" sound. It took a while to get her to say a strong "paper" instead of "pewpaw."

It's funny how I'll do 90 seconds of a song with no problems and then there will be a 10 second part that takes me forever to get it the way I want it.


(EDIT): Bonus Round - The dictionary has the word "strangling" as [ s t r { N g U l I N ]
After much trial and error I finally arrived at: [ s t r e ][ - N k ][ gh U ][ - U l ][ l0 i: N ] like "strengk gu'l leeng." I was having the hardest time trying to make it not sound like "stuhn galin." @_@

(EDIT part II: the Revenge): It drives me absolutely crazy! When I've been looping sections of a song for a couple hours, the software engine stops rendering things correctly (or something?). Anyway, I'll shut down the program and when I come back later (I guess it does a fresh rendering when I load it) a bunch of the stuff I was working on doesn't sound the same as it did before!

Take the example above. When I exported, saved, and shut down the program last time, the transition from [ - N k ] to [ gh U ] sounded great. When I went back and opened the file later the "k" was suddenly overpowering, when before it was barely audible.
 

Nokone Miku

Aspiring Lyricist/Producer
great, now try using miku v3 english!

i'm joking of course. (i love her though)
Actually, I recently bought Miku V3! There are a few things I want to try with her.

One idea is that I can run two vocal tracks and if Miku V4 is struggling with a word or phrase I can see if Miku V3 works better. I've tried this using her Japanese voicebank, Miku V4 (Original), and it has worked in certain cases. The trick is that you have to draw in the words before and after the word/phrase you want to replace so that the transition curves are the same (and if the notes are really close together you might need the ending/transition vowel to be the same). Then you can use a 1/32nd crossfade on the volume parameter between the notes/words.
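For anyone who'd rather do that swap offline instead of on the volume parameter, here's a rough Python sketch of the same crossfade on two exported stems. It assumes mono, sample-aligned renders of the same part, needs numpy and the soundfile package, and the file names, tempo, and swap point are all made up for the example.

```python
# Rough sketch: fade from a V4 render to a V3 render over a 1/32nd note.
import numpy as np
import soundfile as sf

TEMPO = 120.0                 # BPM, assumption
swap_beat = 16.0              # beat where the replacement word starts (made up)
fade_beats = 1.0 / 8.0        # a 1/32nd note is 1/8 of a quarter-note beat

v4, sr = sf.read("miku_v4_take.wav")   # hypothetical mono exports of the same part
v3, _ = sf.read("miku_v3_take.wav")
n = min(len(v4), len(v3))
v4, v3 = v4[:n], v3[:n]

start = int(swap_beat * 60.0 / TEMPO * sr)
fade = min(int(fade_beats * 60.0 / TEMPO * sr), n - start)

gain = np.ones(n)                                   # 1.0 = all V4
gain[start:start + fade] = np.linspace(1.0, 0.0, fade)
gain[start + fade:] = 0.0                           # 0.0 = all V3 from here on

sf.write("swapped.wav", v4 * gain + v3 * (1.0 - gain), sr)
```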

My other plan is when I want to double-up/layer the vocals I'll try using V3 and V4 together, rather than having two instances of V4. My thinking is that, with the vocals being slightly different from one another, this might cut down a little on resonance? If not then maybe it will at least produce a richer tone.

I also want to try having V3 sing harmony, accompaniment, or duets with V4.

There might also be potential in the doubling technique, where a vocalist records two near-identical takes and one is panned left and the other right in the stereo field. I've tried this using V4, adjusting the dynamics and brightness on the second track and offsetting certain sections by a 1/64th. It sounds fuller, but I would need someone else to listen to it to tell me if it sounds odd.
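And a rough sketch of that doubling idea outside the editor, again assuming two mono exports of the same line (file names and the 120 BPM tempo are placeholders; in practice this is probably easier to do in the DAW):

```python
# Sketch: nudge the second take later by a 1/64th note and pan the takes hard L/R.
import numpy as np
import soundfile as sf

TEMPO = 120.0                          # BPM, assumption
take_a, sr = sf.read("lead_take.wav")  # hypothetical mono exports
take_b, _ = sf.read("double_take.wav")

offset = int((1.0 / 16.0) * 60.0 / TEMPO * sr)        # 1/64th note = 1/16 of a beat
take_b = np.concatenate([np.zeros(offset), take_b])   # delay the double slightly

n = max(len(take_a), len(take_b))
left = np.pad(take_a, (0, n - len(take_a)))
right = np.pad(take_b, (0, n - len(take_b)))

# 0.5 gain leaves headroom; columns are (left, right) channels.
sf.write("doubled_stereo.wav", np.column_stack([left, right]) * 0.5, sr)
```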

I also want to test these ideas using Otomachi Una's two voicebanks. With XSY I might be able to adjust how similar or how different the two sound, based on what I'm trying to accomplish.


(I don't know if this stuff makes sense or if I'm using the right terminology. Maybe I'm overcomplicating things that already have simple solutions that I'm just not aware of? The voicebanks might sound either too similar or too different from each other for it to work. They might clash and make it so that neither pronunciation sounds good. I haven't had time to mess around with Miku V3 yet, so I don't know.

I've had instances in the past where I've made something that sounded interesting, but when someone else listened to it, it just sounded awkward. So, I tend to be an insecure perfectionist about how my stuff sounds. That's why sometimes I spend hours trying to correct a bit of pronunciation that most people would think was perfectly passable. Or I spend an entire evening adjusting EQ and compressor plugins.)
 

mobius017

Aspiring ∞ Creator
One idea is that I can run two vocal tracks and if Miku V4 is struggling with a word or phrase I can see if Miku V3 works better. I've tried this using her Japanese voicebank, Miku V4 (Original), and it has worked in certain cases. The trick is that you have to draw in the words before and after the word/phrase you want to replace so that the transition curves are the same (and if the notes are really close together you might need the ending/transition vowel to be the same). Then you can use a 1/32nd crossfade on the volume parameter between the notes/words.
That sounds like it would work. If you like, a simpler alternative to try might be to use V4 as your primary voice and XSY with V3. For the word/phrase you want to change, automate XSY all the way up so that V4 is as much like V3 as possible. Granted, XSY isn't a straight mixing of the two voices, but with XSY turned all the way up or down, the voice usually seems to end up basically like one or the other.
 

AmazingStrange39

Miku-Avanna-Gumi enthusiast
I also want to try having V3 sing harmony, accompaniment, or duets with V4.

I also want to test these ideas using Otomachi Una's two voicebanks. With XSY I might be able to adjust how similar or how different the two sound, based on what I'm trying to accomplish.
I love doing V3/V4 duets as well as trying XSY between them (though I don't use it much in practice). Their accents have similarities but are so different at the same time and their differences in tone can be so cool?
 
