Notes:
- This tutorial focuses on Windows.
- This guide covers both the offline version and the online version.
About NEUTRINO
NEUTRINO is a singing voice synthesizer that uses a neural network. The software is freeware.
A neural network estimates the vocalization timing, pitch, voice quality, and breathiness (huskiness) of the voice from the score. A vocoder then synthesizes the voice from these estimated parameters.
The name of this software reflects the desire to pioneer songs and genres that have never been heard before.
What to Prepare
1. NEUTRINO
2. Voicebanks
3. MuseScore, for entering the lyrics and creating MusicXML files
4. Notepad, for editing the batch file
5. (Optional) Any MIDI sequencer that can export a MIDI file, if you are not comfortable writing sheet music.
Vocaloid Editor, UTAU, or simply any DAW will do.
6. (Recommended) NEUTRINO voice support tool (NEUTRINO Editor)
7. (Recommended) Hiragana / phoneme list
Download
NEUTRINO
VoiceBanks
MuseScore
Notepad++ (the standard Notepad that comes with the operating system also works)
NEUTRINO Voice Support Tool
More detail: NEUTRINO -Neural singing synthesizer
Installation
NEUTRINO
Unzip the downloaded file to any folder.
There is no installer. The software runs directly from the unzipped folder, so unzip it to the location where you will actually use it, not to a temporary folder.
To uninstall, delete the whole folder.
MuseScore
What is MuseScore?
Finale and Sibelius are among the best-known music notation programs, but they are expensive, and their free versions have limited functionality, are not updated, and are difficult to use. MuseScore, on the other hand, is open-source music notation software.
It is updated frequently and runs on multiple platforms: Windows, Mac, and Linux.
It is easy to use and has no functional restrictions. It can import MIDI, so you can try out MIDI files and sheet music created with other DAWs and GUIs relatively easily.
MIDI import
Please refer to the following for MIDI import.
You can work quickly by importing an existing MIDI file, or by exporting a sequence created in other software to MIDI and importing that.
The following settings are recommended.
Channel: 1 only
Quantize: 16th note or 32nd note
Voice part: 1
Tuplet: 3, 4, 5, 7, 9
Simplified note: ON
Staccato display: OFF
Dotted note: ON
Display tempo text: ON
NEUTRINO Modules
About each module
musicXML_to_label
Converts MusicXML to the label format used for neural network input.
Code:
Input  : score/musicxml/*.musicxml
Output : label/full/*.lab
       : label/mono/*.lab
MuseScore is recommended for score creation. You can output a MusicXML file by selecting [File] -> [Export] and setting the file type to uncompressed MusicXML file. Other software outputs "*.xml", so in that case change SUFFIX in Run.bat to xml.
NEUTRINO
Estimates the utterance timing, pitch, voice quality, and breathiness of the voice from the labels.
Code:
# predict timing
Input  : label/full/*.lab
       : model/KIRITAN/*.bin
Output : label/timing/*.lab
# predict acoustic feature
Input  : label/full/*.lab
       : label/timing/*.lab
       : model/KIRITAN/*.bin
Output : output/*.f0, *.mgc, *.bap
ModelDir selects the sound source (voice model).
NumThreads sets the number of processors (threads) used.
Example
Code:
# Before
set ModelDir=KIRITAN
set NumThreads=3
↓
# After
set ModelDir=YOKO
set NumThreads=4
WORLD
A vocoder (WORLD) synthesizes a voice waveform based on the pitch, voice quality, and breathiness of the voice.
Code:
Input  : output/*.f0, *.mgc, *.bap
Output : output/*.wav
NumThreads sets the number of processors (threads) used.
PitchShift changes the pitch.
FormantShift changes the voice quality. Raise it for a more childlike voice, and lower it for a more mature one. (Around 0.85-1.15 is recommended.)
Example
Code:
# Before
set PitchShift=1.0
set FormantShift=1.0
set NumThreads=3
↓
# After
set PitchShift=0.944
set FormantShift=1.05
set NumThreads=4
Key table for Pitch Shift
Key | -6 | -5 | -4 | -3 | -2 | -1 | ±0 | +1 | +2 | +3 | +4 | +5 | +6 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Pitch shift | 0.707 | 0.749 | 0.794 | 0.841 | 0.891 | 0.944 | 1.000 | 1.059 | 1.122 | 1.189 | 1.260 | 1.335 | 1.414 |
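The values in the key table match the equal-temperament relation, in which a shift of n semitones multiplies the frequency by 2^(n/12). The following is a minimal Python sketch that reproduces the table. The function names are made up for illustration and are not part of NEUTRINO.

```python
def pitch_shift_ratio(semitones: int) -> float:
    """Return the frequency ratio for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)


def key_table(low: int = -6, high: int = 6) -> dict:
    """Map each key offset to its PitchShift value, rounded to three decimals."""
    return {n: round(pitch_shift_ratio(n), 3) for n in range(low, high + 1)}


if __name__ == "__main__":
    # Print one line per key offset, e.g. the value to put after "set PitchShift=".
    for key, ratio in key_table().items():
        print(f"{key:+d} -> {ratio:.3f}")
```

For keys outside the range of the table, the same formula applies, although the formant and sound quality may suffer with large shifts.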
NSF_IO (Windows / online version only)
A neural network (NSF) synthesizes a voice waveform based on the pitch, voice quality, and breathiness of the voice.
Code:
Input  : label/full/*.lab
       : label/timing/*.lab
       : output/*.f0, *.mgc, *.bap
       : model/KIRITAN/NSF.jsn
Output : output/*.wav
NSF is a method for generating high-quality voice waveforms close to a real voice at high speed using a neural network. The sound is clear with a strong attack, and the low range is not over-smoothed. Within the proper vocal range there is almost no variation in sound quality, and the quality is stable and high.
WORLD, on the other hand, is robust to processing such as pitch shift and formant shift, and also to notes that are far outside the proper range. Each has advantages and disadvantages.
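The input/output listings of the modules can be summarized as one data flow. The Python sketch below only builds the file names for a given score; it does not run NEUTRINO. It assumes that every "*" in the listings stands for the base name of the score, which the listings themselves do not state, and the function name and parameters are hypothetical.

```python
def pipeline_files(basename: str, model: str = "KIRITAN", suffix: str = "musicxml") -> dict:
    """List the files each module reads and writes for one score.

    Assumption: every "*" in the module listings is the score's base name.
    """
    features = [f"output/{basename}.{ext}" for ext in ("f0", "mgc", "bap")]
    full = f"label/full/{basename}.lab"
    timing = f"label/timing/{basename}.lab"
    wav = f"output/{basename}.wav"
    return {
        "musicXML_to_label": {
            "input": [f"score/musicxml/{basename}.{suffix}"],
            "output": [full, f"label/mono/{basename}.lab"],
        },
        # Timing prediction and acoustic feature prediction, listed together.
        "NEUTRINO": {
            "input": [full, f"model/{model}/*.bin"],
            "output": [timing] + features,
        },
        "WORLD": {"input": features, "output": [wav]},
        "NSF_IO": {
            "input": [full, timing] + features + [f"model/{model}/NSF.jsn"],
            "output": [wav],
        },
    }


if __name__ == "__main__":
    for module, files in pipeline_files("my_song").items():
        print(module, "reads", files["input"], "writes", files["output"])
```

WORLD and NSF_IO both write the final wav from the same acoustic features, which is why either one can be used as the last step.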
Recommended System Specifications
Operating environment | Windows 10, macOS (Apple M1 compatible), Online, Linux (Ubuntu) |
CPU | Intel Core i5, AMD Ryzen 5, Apple M1 |
GPU (optional) | NVIDIA GPU (3 GB or more GPU memory recommended) |
Memory | 8 GB or more |
Free disk space | 10 GB or more |
Remarks
An NVIDIA GPU (3 GB or more GPU memory recommended) is required for some features of the Windows version (GPU-accelerated rendering and synthesis with NSF). Update the NVIDIA driver to the latest version before use.
** An ordinary GPU for PC games and graphics works without any problems.
You can use all the features of NEUTRINO in the online version. Since everything runs in the web browser, it does not require an NVIDIA GPU and can even be operated from a smartphone.
How to use (Windows version)
Make a sequence
Create a sequence to load into NEUTRINO. I use MuseScore.
MuseScore is not singing voice synthesis software but pure music notation software. Basically, you write notes on the staff. The score, including the lyrics, is completed here.
When you are done, export the score in MusicXML format (*.musicxml). This .musicxml sequence is what NEUTRINO loads. You can also import MIDI.
Creating a score (MusicXML)
Create xxx.musicxml with score creation software such as MuseScore. If you are not comfortable writing sheet music, you can use a MIDI sequencer instead: create the melody and lyrics in Vocaloid Editor, UTAU, or similar and export a MIDI file, or create the melody in any DAW, import the MIDI into MuseScore, and then enter the lyrics manually in MuseScore.
Load the file exported from the MIDI sequencer into MuseScore.
If the lyrics are unreadable because of a character encoding mismatch, change the character encoding from "UTF-8" to "Shift_JIS".
MuseScore can now read the Japanese hiragana.
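Why the encoding setting matters can be seen in a few lines of Python: the same hiragana is stored as different bytes in UTF-8 and Shift_JIS, so text written in one encoding becomes unreadable when read as the other. This is only an illustration of the mismatch, not of what MuseScore does internally, and the helper name is hypothetical.

```python
def decode_lyrics(raw: bytes) -> str:
    """Decode lyric bytes, trying UTF-8 first and falling back to Shift_JIS."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8, so assume a Shift_JIS exporter wrote them.
        return raw.decode("shift_jis")


if __name__ == "__main__":
    lyric = "さくら"
    sjis_bytes = lyric.encode("shift_jis")  # bytes as a Shift_JIS exporter would write them
    utf8_bytes = lyric.encode("utf-8")
    print(sjis_bytes != utf8_bytes)         # the two encodings produce different bytes
    print(decode_lyrics(sjis_bytes))
```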
If there are multiple parts, uncheck "Import" for every part except the one you want to output, and remove them with "Apply".
Select File → Export.
Export the file as an "uncompressed MusicXML file" ONLY.
Use an English file name without spaces to prevent errors.
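The file name rule above can be checked before exporting. The sketch below treats a name as safe when it contains only ASCII letters, digits, underscores, and hyphens. That exact character set is an assumption chosen for illustration; the guide itself only asks for English names without spaces.

```python
import re

# Assumed "safe" alphabet: ASCII letters, digits, underscore, hyphen.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_basename(name: str) -> bool:
    """Return True if the name (without extension) uses only safe characters."""
    return _SAFE_NAME.fullmatch(name) is not None


def make_safe_basename(name: str) -> str:
    """Replace every run of unsafe characters with one underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
    return cleaned or "score"  # fall back to a generic name if nothing is left


if __name__ == "__main__":
    for candidate in ("my_song", "my song 01", "さくら"):
        print(candidate, is_safe_basename(candidate), make_safe_basename(candidate))
```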
Create a singing voice file with NEUTRINO
Placement of the MusicXML file
Move the exported MusicXML file to "NEUTRINO/score/musicxml".
Adjust NEUTRINO settings
The part of NEUTRINO that the user works with is "Run.bat" in the NEUTRINO folder. You need to edit this file to synthesize a singing voice. I use Notepad++ to edit Run.bat.
Open Run.bat in Notepad and specify the name of the sequence to be read, then change the pitch, gender (formant), and so on. Run.bat looks up the sequence based on this information and passes it to the synthesis engine.
Modify the bat file
Open "Run.bat" in the NEUTRINO folder with Notepad or a similar editor, and change the value after "set BASENAME=" to the file name of your MusicXML file (without the extension).
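If you switch between songs often, the same edit can be scripted. The sketch below rewrites the value on the "set BASENAME=" line of a batch file's text and leaves every other line untouched. It assumes Run.bat contains exactly such a line; the sample text in the example is made up and is not the real Run.bat.

```python
import re

# Matches "set BASENAME=" at the start of a line and the value up to the line end.
_BASENAME_LINE = re.compile(
    r"^([ \t]*set[ \t]+BASENAME[ \t]*=)[^\r\n]*",
    re.IGNORECASE | re.MULTILINE,
)


def set_basename(bat_text: str, basename: str) -> str:
    """Return bat_text with the value of the 'set BASENAME=' line replaced."""
    new_text, count = _BASENAME_LINE.subn(lambda m: m.group(1) + basename, bat_text)
    if count == 0:
        raise ValueError("no 'set BASENAME=' line found")
    return new_text


if __name__ == "__main__":
    # Made-up sample content, for illustration only.
    sample = "@echo off\r\nset BASENAME=sample1\r\nset NumThreads=3\r\n"
    print(set_basename(sample, "my_song"))
```

To apply it to the real file, read Run.bat as text, pass it through `set_basename`, and write the result back. Keep a backup copy of the original first.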
Start NEUTRINO and synthesize the singing voice
Double-click Run.bat to start it. All you have to do is wait.
Synthesis takes some time, roughly 1 to 4 times the length of the song on a decent PC. No progress bar is displayed, so you cannot tell how long the synthesis will take. Do other work while you wait.
When Run.bat finishes, there should be audio in the output folder.
When the text "end" is displayed and the prompt returns (that is, when you can type again), the process has finished.
Rendering is complete when "YourFileName.wav" is generated in the NEUTRINO/output folder.
Confirm that the wav file was written to the NEUTRINO/output folder, and you are good to go.
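Since no progress bar is shown, a small script can confirm that the result exists. The sketch below checks for a non-empty wav file named after the score in the output folder. The helper names are hypothetical, and the check says nothing about whether the audio itself is correct.

```python
from pathlib import Path


def rendered_wav(neutrino_dir, basename: str) -> Path:
    """Return the path where the rendered wav is expected."""
    return Path(neutrino_dir) / "output" / f"{basename}.wav"


def is_rendered(neutrino_dir, basename: str) -> bool:
    """Return True if the expected wav file exists and is not empty."""
    wav = rendered_wav(neutrino_dir, basename)
    return wav.is_file() and wav.stat().st_size > 0


if __name__ == "__main__":
    # "NEUTRINO" and "my_song" are placeholders for your folder and BASENAME.
    print(is_rendered("NEUTRINO", "my_song"))
```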
Then load it into your DAW and be impressed!
I am grateful to have come across such wonderful software.
_______________________________________________________________________________________________
How to use the online version
Overview
Google Colaboratory (Colab) is a web service that lets you use one of Google's cloud machines from your browser.
Use this service to run NEUTRINO online.
It lets you run machine learning / deep learning programs in the browser and check the results as if you were taking notes, and it is widely used in data analysis, research, and education.
Since everything runs in the web browser, you do not even need a PC; it also works on smartphones. Colab also lets you use a GPU for free. Take this opportunity to try high-speed rendering and the latest neural vocoder singing voice synthesis (NSF version).
What you need
- Google account: if you do not have one, please create one.
- Web browser: works with major browsers. Chrome and Firefox are recommended.
Import method
1. Download and unzip NEUTRINO (online version)
2. Access Google Drive
3. Create a "Colab Notebooks" folder in My Drive and copy the NEUTRINO folder you unzipped earlier into it
4. Go to the NEUTRINO folder, right-click and select "Other"-> "Add App"
5. Search for and add "Colaboratory".
6. Double-click "NEUTRINO.ipynb" in the NEUTRINO folder, or right-click it and select "Open with app" → "Google Colaboratory", to open the notebook.
7. Open the notebook settings by selecting "Runtime"-> "Change Runtime Type" from the menu at the top of the screen. Make sure the GPU is selected as the hardware accelerator.
* Specifications are subject to change without notice due to the cloud environment.
8. Press the [ ] or play button displayed to the left of each cell, in order, to execute the cells.
If a user authentication link is displayed, click the link to authenticate. A verification code will be displayed, so copy and paste it.
9. After that, if you execute the cells in order, the audio will be output under NEUTRINO/output.
10. If you want to use your own MusicXML, upload it to the score/musicxml folder, change BASENAME, and run No. 4 again.
11. When using acoustic feature files (f0, mgc, bap) or label files created on a local PC, upload the score folder / output folder as they are, change BASENAME, and run NSF (No. 5) again.
** Please note that if you run with the same file name, the existing output will be overwritten!
Remarks
Processing is slow on the first run (0.5 to 1.0 times speed) because the setup is performed then, but it is fast (about 4 times speed) from the second run onward.
It may seem complicated, but the initial setup takes about 5 to 10 minutes, and after that it is almost the same as the conventional workflow.
Please experience high-speed rendering and the latest neural vocoder singing voice synthesis (NSF version).
Frequently Asked Questions
- Audio is not output normally, or processing stops in the middle
First, check that all the files have been synchronized.
If files are missing, the output will not be produced correctly.
Empty folders may not be uploaded, depending on the browser. Changing the browser may help.
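A quick way to spot folders that were dropped during upload is to list the ones that are missing. The sketch below could be run in a notebook cell or locally. The folder list is taken from the module input/output listings in this guide and is an assumption; the actual package may contain more folders, and the function names are hypothetical.

```python
from pathlib import Path

# Folders named in the module input/output listings (assumed, possibly incomplete).
EXPECTED_DIRS = (
    "score/musicxml",
    "label/full",
    "label/mono",
    "label/timing",
    "output",
    "model",
)


def missing_dirs(neutrino_dir) -> list:
    """Return the expected folders that do not exist under neutrino_dir."""
    root = Path(neutrino_dir)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


def create_missing_dirs(neutrino_dir) -> list:
    """Create the missing folders and return the names that were created."""
    created = missing_dirs(neutrino_dir)
    for name in created:
        (Path(neutrino_dir) / name).mkdir(parents=True, exist_ok=True)
    return created


if __name__ == "__main__":
    # "NEUTRINO" is a placeholder for the path of your NEUTRINO folder.
    print(missing_dirs("NEUTRINO"))
```

Creating an empty "model" folder does not restore the voice model files themselves; if that folder is reported as missing, upload it again.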
There is also an alternative third-party editor for NEUTRINO that looks and works much like UTAU and Vocaloid Editor.
You can find NEUTRINO Editor and how to use it here: NEUTRINO Editor