NEUTRINO Techniques and Ideas

Notes:

This is NOT a tutorial; a basic understanding of NEUTRINO and AI voice synthesizers is required.
This guide is meant to be used together with the NEUTRINO voice support tool.
This guide focuses on general techniques that can be used in practice.

Basic concept of AI singer voice
This may be new to anyone accustomed to conventional GUI-based singing voice synthesis software, but NEUTRINO is what is called a machine-learning "AI singer".

AI singers perform exceptionally well when operating within their optimal range, and quite poorly outside it.
*optimal range ≈ range of the training data
Basically, sounds that are not in the training data cannot be synthesized.

In conventional singing voice synthesis software, you get the notes down first and think about the seasoning afterwards; with AI singers, you control the seasoning at the stage of creating the sequence.

The way of singing can be corrected with an editor, but corrections only help if the basic output voice is already reasonably good. How well you can make an AI singer sing depends on how well you can give instructions through the score. In that sense, the person working with the vocal here is like a "Vocal Operator".

Optimizing performance for your machine when using Run.bat
Although this is a little different from tuning, here is how to use NEUTRINO itself well.
NEUTRINO's singing voice synthesis process obviously requires considerable resources.

-- If synthesis makes your PC too sluggish to do anything else, there is a workaround: limit the resources NEUTRINO uses.
-- If you open Run.bat with Notepad, you will see a parameter called "NumThreads". It sets how many CPU threads to use. The default value is 3, meaning 3 threads.
-- If you lower this number, synthesis takes longer but the load is lighter. Conversely, if your CPU can afford it, you can speed up processing by raising NumThreads, for example to 6.
-- The appropriate value for NumThreads varies from PC to PC. Look up your CPU's model number online to find out how many threads it has.
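As a rough starting point for choosing NumThreads, the sketch below reads the number of logical CPU threads and leaves one free for other work. The helper name `suggest_num_threads` and the "reserve one thread" rule are my own assumptions, not something NEUTRINO provides; treat the result as a first guess and adjust by trial.

```python
import os


def suggest_num_threads(logical_cpus, reserve=1):
    """Return a NumThreads candidate that leaves `reserve` threads free.

    Hypothetical helper for illustration; it only does arithmetic and
    never touches Run.bat.
    """
    return max(1, logical_cpus - reserve)


if __name__ == "__main__":
    cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    print(f"Logical CPU threads: {cpus}")
    print(f"Try NumThreads={suggest_num_threads(cpus)} in Run.bat")
```

If the PC still feels heavy during synthesis, raise `reserve`; if you leave the PC alone while rendering, set it to 0.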

Or you can avoid all of this by using the online version of NEUTRINO with Google Colaboratory.

Google Colaboratory is a web service that lets you run a Google cloud machine from your browser.
Use this service to run NEUTRINO online. ---> How to use NEUTRINO Offline & Online

-- It lets you execute machine learning / deep learning programs in a browser and check the results as if you were writing notes, and it is widely used in data analysis, research, and education.
-- Since everything runs in the web browser, you don't even need a PC; it works on smartphones. On Colab you can also use a GPU for free. Take this opportunity to try high-speed rendering and the latest neural vocoder singing voice synthesis (NSF version).

MuseScore
So far, I have introduced techniques on the NEUTRINO side, but there are also techniques on the MuseScore side.

Utilize the existing sequences
-- It would be easier if you could bring the VSQX and UST files you have already made into NEUTRINO. There is a way to import them.
-- First, open the sequence in VOCALOID or UTAU and export it as MIDI. Then open the exported MIDI in MuseScore.
-- The lyrics will probably be garbled. In that case, selecting "Shift JIS" under "Character code" at the bottom of the screen and applying it should fix them.
-- Once the lyrics are displayed correctly, you can export MusicXML. That makes things a lot easier.

Shortcut for CeVIO users only
-- If you use CeVIO, enter the sequence in CeVIO and export MusicXML from "Export". Simply change the extension of the resulting sequence.xml to sequence.musicxml and you can import it into NEUTRINO.
-- If renaming the extension every time is a hassle, edit Run.bat instead.
-- If you change "SUFFIX = musicxml" in Run.bat to "SUFFIX = xml", you can use the XML exported by CeVIO as is.
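If you would rather leave Run.bat alone, the renaming can also be batched. This is a minimal sketch, assuming the CeVIO exports sit together in one folder; `copy_xml_as_musicxml` is a hypothetical helper name, and it copies rather than renames so the original exports stay untouched.

```python
from pathlib import Path
import shutil


def copy_xml_as_musicxml(folder):
    """Copy every *.xml file in `folder` to a sibling *.musicxml file.

    Existing .musicxml files are never overwritten. Returns the list of
    files that were created.
    """
    created = []
    for src in sorted(Path(folder).glob("*.xml")):
        dst = src.with_suffix(".musicxml")
        if not dst.exists():
            shutil.copyfile(src, dst)
            created.append(dst)
    return created


if __name__ == "__main__":
    # "." is a placeholder: point this at the folder holding your exports.
    for path in copy_xml_as_musicxml("."):
        print("created", path.name)
```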

Create a sequence with your favorite software and convert
I don't think many people are used to creating sequences for singing voice synthesis on a staff score. If you are not, a standard DAW will do the job perfectly well.
So basically, I think the easy way is to create the sequence in the software you normally use and convert it to MusicXML with MuseScore.

File name
It may sound trivial, but the file name can cause an error. Using a simple name is the way to go.
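What counts as a "simple" name is not spelled out above, so the sketch below takes a conservative guess: keep only ASCII letters and digits and join them with underscores. Both that rule and the helper name `simple_name` are my assumptions.

```python
import re


def simple_name(name):
    """Reduce a file name (without extension) to ASCII letters, digits, '_'.

    Assumption: a "simple" name means no spaces, symbols or non-ASCII text.
    """
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_")
    return cleaned or "score"  # fall back when nothing ASCII is left


if __name__ == "__main__":
    print(simple_name("My Song (final) ver.2"))
```

A name made only of non-ASCII characters falls back to "score", so rename those by hand if you need to tell them apart.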

Timing voice tech
In NEUTRINO, you can control the length of phonemes and the timing of pronunciation by editing the LAB file. This corresponds to VEL in VOCALOID, consonant velocity in UTAU, and TMG in CeVIO.
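To make the idea of editing the LAB file concrete, here is a small sketch that moves the boundary between two neighbouring phonemes. It assumes each line of the file has the form "start end phoneme" with integer times; check your own LAB files before relying on that layout. All function names here are hypothetical.

```python
def parse_lab(text):
    """Parse lines of 'start end phoneme' into [start, end, phoneme] lists.

    Assumed layout: three whitespace-separated fields, integer times.
    """
    rows = []
    for line in text.splitlines():
        if line.strip():
            start, end, phoneme = line.split()
            rows.append([int(start), int(end), phoneme])
    return rows


def move_boundary(rows, index, delta):
    """Move the boundary between rows[index] and rows[index + 1] by `delta`.

    A positive delta lengthens rows[index] and shortens the next phoneme;
    a negative delta does the opposite.
    """
    new_time = rows[index][1] + delta
    if not rows[index][0] < new_time < rows[index + 1][1]:
        raise ValueError("boundary would leave a phoneme with no length")
    rows[index][1] = new_time
    rows[index + 1][0] = new_time
    return rows


def format_lab(rows):
    """Turn the rows back into 'start end phoneme' lines."""
    return "\n".join(f"{s} {e} {p}" for s, e, p in rows)


if __name__ == "__main__":
    rows = parse_lab("0 1000 s\n1000 3000 a\n")
    print(format_lab(move_boundary(rows, 0, -200)))
```

Moving a boundary earlier shortens the consonant and starts the vowel sooner, which is the same kind of adjustment as consonant velocity in the other tools.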

Tips for using NEUTRINO voice support tool
I tried the tool by following the article, but I stumbled on a few points the first time I used it, so I will summarize them here.

If there are two characters in one note, pack them together or shift them back.
The number one reason I personally use the voice support tool is timing adjustment, so I'll write a little more about it.

This is what a score looks like when there are two characters on one note.

You often see this in English songs, but it is surprisingly difficult to make it sing naturally. You need to devise something, such as splitting the note and adjusting the lengths, but it often does not turn out the way you imagined.

This is where the NEUTRINO voice support tool comes in.
With the voice support tool, you can deal with such expressions flexibly.

I think the reason "star" is difficult to adjust on the score is that the "u" in "su" is devoiced or hardly pronounced, so I correct it with the tool.

As shown in the image, you can specify a singing style in which the "u" is almost silent by using the tool to pack the "u" sound.

In addition to the above example, you can also specify in the tool how consecutive vowels are sung, to make them more natural.

In "Nai" in the above score, the vowels "a" and "i" are consecutive and the sound changes smoothly, but this is difficult to express in the score.

Again, use the tool and make corrections so that the singing voice sounds more natural.

This is just my impression, but I think shifting the later vowel of consecutive vowels back gives a more natural singing voice.
**By shifting the "i" sound back, the change in sound across the consecutive vowels is expressed!

"tsu" (sokuon) is expressed by rests
This also applies when using other singing voice synthesis tools:
it is better to enter "tsu" (sokuon) in a slightly different way.

That said, it's easy to do.
All you have to do is replace the "tsu" in the score with a rest.

For example, where the lyric is "sour" ("suppai"), put a rest in place of the "tsu" and enter just "pai" on the following note.

**In some cases, depending on the singing voice you prefer, it may be better to type "tsu" as it is. Keep this in mind as one more trick.
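The sokuon replacement described above can be pictured on a simplified note list. In this sketch a note is just a (lyric, duration) pair and a rest is represented by `None`; both choices are assumptions for illustration and have nothing to do with how MuseScore or NEUTRINO store notes.

```python
REST = None  # hypothetical marker for a rest in this sketch


def sokuon_to_rests(notes):
    """Replace notes whose lyric is a lone small tsu with rests.

    `notes` is a list of (lyric, duration) pairs; durations are kept as is.
    """
    return [
        (REST, duration) if lyric in ("っ", "ッ") else (lyric, duration)
        for lyric, duration in notes
    ]


if __name__ == "__main__":
    phrase = [("き", 1), ("っ", 1), ("て", 2)]
    print(sokuon_to_rests(phrase))
```

The rest keeps the duration of the note it replaces, so the timing of the phrase does not change.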

Vowel dropout
-- When singing「Bokura」or「Ashita」, you fairly often want it sung as「bo k ra」or「a sh ta」, right? The middle vowel is dropped.
-- Normally you would type [bo] [ku] [ra] or [a] [shi] [ta], but then the vowels are sung clearly, as in "bo ku ra" and "a shi ta".
-- If you type [boku ’ ] [ - ra] or [ashi ’ ] [ta] instead, it will sing「bo k ra」and「a sh ta」. This alone makes it sound like the singer "gets it".

Note split / vowel split
-- Do not divide a long tone into [A] [A] or [A] [ー]. In NEUTRINO, a note divided like this is clearly re-articulated anyway. It comes out as "Ah, ah" instead of "Ah".
-- Give up on producing pitch inflections by splitting notes. It is best to wait until the singer produces them on its own, or to render the audio and then edit the pitch.

Diphthong vowels
-- This is similar to note splitting / vowel splitting: diphthongs such as「あぃ」"ai" and「おぅ」"ou" often come out separated into "a, i" and "o, u" when you type them normally.
-- The workaround is to make the note of the secondary vowel short enough. For example, type「たーーい」rather than「たーいー」. How short it should be depends on the case: the song, the tempo, the key and, of course, luck. Shorter is not always better.
-- You can also process the exported audio with iZotope VocalSynth 2 or Vocalizer to morph it. It's still very difficult.
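For the "make the secondary vowel short enough" workaround, the sketch below splits one note length into a long main vowel and a short secondary vowel. The default fraction of 1/8 is an arbitrary starting value of mine, not a recommendation from NEUTRINO; as noted above, the right length has to be found by trial.

```python
def split_diphthong(total_ticks, secondary_fraction=0.125, min_ticks=1):
    """Split one note into (main_vowel_ticks, secondary_vowel_ticks).

    The secondary vowel gets `secondary_fraction` of the note, but at
    least `min_ticks`, and the main vowel always keeps at least
    `min_ticks` as long as the note is long enough to allow it.
    """
    secondary = max(min_ticks, round(total_ticks * secondary_fraction))
    secondary = min(secondary, total_ticks - min_ticks)
    return total_ticks - secondary, secondary


if __name__ == "__main__":
    # A 480-tick note: a long "a" followed by a short "i".
    print(split_diphthong(480))
    # Try several fractions and listen to each result.
    for fraction in (0.0625, 0.125, 0.25):
        print(fraction, split_diphthong(480, fraction))
```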

Pitch tuning
Once the score sounds good, move on to sig's editor (NEUTRINO voice support tool) to synthesize it and adjust the pitch. Pitch tuning is no different from conventional singing voice synthesis, so carry on as you always have.

Please check the following article for details on how to use it.
How to use NEUTRINO voice support tool

Tuning gacha
-- If you are not satisfied with the AI singer's singing style, try pulling the tuning gacha.
-- If you tweak the score a little, the singing may become more stable, or it may suddenly sing very well.
-- Recently, it has come to sing well without pulling much gacha. Gacha is useful when you want a song sung that a particular AI voicebank is not good at.

Key change + pitch change gacha
The "key change + pitch change gacha" is effective: type in the entire song raised by *n keys, then lower it by *n keys with NEUTRINO's pitch change function.
-- This is a technique for learning-based singing voice synthesis that is also used with CeVIO and Sinsy. You change the key at the typing stage until it sings well, and afterwards return to the correct pitch with the pitch change.

For example, type the notes in 2 keys lower, then raise the pitch by 2 keys to return to the original key.
-- Repeat this kind of trial, typing lower and raising the pitch back, until a good sound is produced. Incidentally, because this does not pitch-shift the output audio itself, sound quality is unlikely to degrade here.

Note:
* n is the number of keys; adjust it freely.
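The bookkeeping for one pull of this gacha is simple arithmetic, sketched below on MIDI note numbers (60 = middle C). Whether NEUTRINO's pitch change expects semitones or a frequency ratio is not stated here, so the sketch reports both; `key_change_plan` is a hypothetical helper name.

```python
def key_change_plan(midi_notes, n):
    """Transpose the typed notes by n keys and report the compensation.

    n > 0 types the song higher, n < 0 types it lower. The compensating
    pitch change is always the opposite, -n keys.
    """
    return {
        "typed_notes": [note + n for note in midi_notes],
        "shift_semitones": -n,
        # Equal temperament: one key (semitone) is a factor of 2 ** (1 / 12).
        "shift_ratio": 2 ** (-n / 12),
    }


if __name__ == "__main__":
    melody = [60, 64, 67]
    for n in (-2, -1, 1, 2):  # one entry per gacha pull
        print(n, key_change_plan(melody, n))
```

Each value of n gives a different rendering of the same final pitches, which is what makes it a gacha.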

Range gacha
-- NEUTRINO is a little weak outside the range it is good at (≈ the range of the training data). Since the update it can sing most ranges, but the singing is sometimes unstable. The key change and pitch change can be used in such cases too.
-- First, change the key and type the song in so that its highest / lowest note falls within the range the singer is good at. It may sing well if you then restore the key with the pitch change at synthesis time.
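Finding the key change for the range gacha can be sketched as follows, again on MIDI note numbers. The comfortable range of a given singer is not documented here, so the numbers in the example are placeholders you would replace by ear; `fit_to_range` is a hypothetical helper.

```python
def fit_to_range(song_low, song_high, comfort_low, comfort_high):
    """Return the smallest key change that fits the song into the range.

    Positive means type the song higher, negative means lower. Returns
    None when the song is wider than the comfortable range.
    """
    if song_high - song_low > comfort_high - comfort_low:
        return None
    if song_low < comfort_low:
        return comfort_low - song_low    # type it higher
    if song_high > comfort_high:
        return comfort_high - song_high  # type it lower (negative)
    return 0


if __name__ == "__main__":
    # Placeholder range: 57 to 81. Replace with what sounds stable to you.
    n = fit_to_range(55, 79, 57, 81)
    print(f"type the song {n} keys up, then change the pitch by {-n} keys")
```

When the function returns None, no single key change fits the whole song, and you would have to treat the high and low sections separately.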

Strong and weak gacha
-- NEUTRINO does not have a parameter to control strength, so I do my best with the key change + pitch change gacha.
-- In the case of AI Kiritan, low notes tend to be sung weakly and high notes strongly. If you want a weaker voice, type the notes in lower and raise the pitch; if you want a stronger voice, type them in higher and lower the pitch.

"Tsu" gacha
-- In CeVIO and Sinsy, typing "tsu" generally extends the consonant immediately after it forward. In NEUTRINO, "tsu" does not extend the consonant much, but it may change the pronunciation a little.
-- It rarely comes up, but if a pronunciation cannot be fixed, writing "tsu" into the lyrics of the note you want to fix, or of the note just before it, may work.

Yoon gacha
-- To begin with, the pronunciation of yoon (contracted sounds) such as「ゃゅょぁぃぅぇぉ」"Ya~yu~yo~a~i~u~e~o" may not be supported, because sounds that are not in the training data cannot be synthesized.
-- In this case, you may be able to handle it in a way similar to diphthongs. AI Kiritan can't sing [Nya] (so "Nyanyanyanyanyanyanya!" cannot be sung at all), but typing it as [ni] [a] or [ni] [ya] and making [ni] as short as possible may work. Another option is to use iZotope VocalSynth 2 or Vocalizer.

Devoiced gacha

-- In NEUTRINO, unintended devoicing may occur on rare occasions. This is known to be "prone to happen if there is a rest immediately before". With a recent update, this error has become unlikely.
-- The workaround is to "remove the preceding rest". If the devoicing is in the middle of a phrase, just fill in the preceding rest. If you absolutely want a short break there, apply the breath symbol to the preceding note. This leaves space for the breath without a rest, so you can avoid devoicing.
-- If the devoicing is at the beginning of a phrase, place a dummy note before it (such as a note with the lyrics "tsu": the note is there but makes no sound).

The problem that silence positions do not change
This is a caveat when using the tool.
Generating audio takes minutes to tens of minutes, so the voice support tool has a lightweight mode to make checking easier.

To use the lightweight mode, just select "KIRITAN_FAST" from the gear icon.

However, this feature also has a weakness.
In the lightweight mode, changes to silence timing are not reflected in the output audio. (Even if you change the length of a silence, the audio from before the change is produced.)

Therefore, when you change a silence, I think it is safer to output the singing voice with the normal "KIRITAN".
In short:
Silence timing change
　　→ Run with "KIRITAN" (the one that is not fast)
Phoneme timing change
　　→ Run with "KIRITAN_FAST" (the fast version is OK!)
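The rule above can be written as a small check that compares the timing before and after an edit. It assumes timings are held as (start, end, phoneme) rows and that silence is labelled "sil" or "pau"; both are assumptions, so adjust the labels to whatever your LAB files actually use. `choose_model` is a hypothetical helper.

```python
SILENCE = {"sil", "pau"}  # assumed silence labels; adjust to your LAB files


def choose_model(before, after):
    """Pick the model name for re-rendering after a timing edit.

    `before` and `after` are lists of (start, end, phoneme) tuples.
    Any change to a silence row means the fast model cannot be used.
    """
    silence_before = [row for row in before if row[2] in SILENCE]
    silence_after = [row for row in after if row[2] in SILENCE]
    return "KIRITAN" if silence_before != silence_after else "KIRITAN_FAST"


if __name__ == "__main__":
    before = [(0, 100, "pau"), (100, 200, "k"), (200, 400, "a")]
    phoneme_edit = [(0, 100, "pau"), (100, 250, "k"), (250, 400, "a")]
    silence_edit = [(0, 150, "pau"), (150, 200, "k"), (200, 400, "a")]
    print(choose_model(before, phoneme_edit))
    print(choose_model(before, silence_edit))
```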

Happy producing.
