I just discovered a cure for SynthV Studio’s harshness. (Or at least a cure for Saki AI, but I suspect that this works for all SynthV Studio voicebanks, including those running in the original SynthV, which also exhibited harshness.)
Normally I use Z-Noise to try to reduce some of SynthV’s harshness, but today I got to thinking: does Z-Noise work better at higher sample rates? So instead of exporting Saki AI at 44.1 kHz, I decided to try 88.2 kHz, except there was no such option. All I could choose from was 44.1, 48, and 96 kHz. So I gave 96 kHz a spin.
But after doing so, and even before I engaged Z-Noise, I noticed how much cleaner Saki AI had become. The harshness had disappeared without the need for extra post-processing!
Owing to the lack of an 88.2 kHz option, I then wondered if SynthV preferred multiples of 48 kHz, so I tried exporting at that sample rate. But when doing so, I noticed that SynthV did not re-render the audio before saving. In fact, once the vocal track had been internally rendered, it could be saved as 44.1, 48, and 96 kHz in rapid succession without any extra CPU grinding. This led me to wonder if SynthV is always running internally at 96 kHz, and if SynthV’s harshness is caused by poor sample-rate conversion.
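For reference, here is the arithmetic behind those rates as a small Python sketch. It assumes, purely for illustration, that the audio really is rendered at 96 kHz internally (a guess, not something confirmed), and the helper name `resample_ratio` is made up for this example. It shows why 88.2 kHz would have been the "natural" partner for 44.1 kHz, while 96 kHz to 44.1 kHz needs a less convenient ratio.

```python
from fractions import Fraction

def resample_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Return dst/src in lowest terms: numerator = upsample factor,
    denominator = downsample factor for a rational resampler."""
    return Fraction(dst_hz, src_hz)

# Assumed internal rate of 96 kHz (hypothetical), plus the 88.2 kHz
# rate that was not offered as an export option.
for src, dst in [(96_000, 48_000), (96_000, 44_100), (88_200, 44_100)]:
    r = resample_ratio(src, dst)
    print(f"{src} Hz -> {dst} Hz: up {r.numerator}, down {r.denominator}")
```

A 1:2 ratio only needs a low-pass filter and dropping every other sample, while 147:320 needs a proper rational resampler, so there is more room for a converter to cut corners in that case.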
A quick listen to the 48 kHz export option seemed to confirm this, as it sounded nearly as harsh as the 44.1 kHz export. So I tried down-sampling the 96 kHz export to 44.1 kHz using Audacity and compared it to SynthV’s native 44.1 kHz export. The difference was not subtle, with the 44.1 kHz Audacity output sounding essentially identical to the 96 kHz source.
If you are a SynthV user, I recommend exporting at 96 kHz and then down-converting to 44.1 kHz in Audacity, your DAW, or whatever tool you prefer. Just avoid exporting natively at 44.1 or 48 kHz and you’ll have much less harsh-sounding vocals. The difference is very much audible.
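If you would rather script the down-conversion than do it by hand, something like the following Python sketch should work. It is only one possible route and not what Audacity does internally: it assumes NumPy and SciPy are installed, uses SciPy's `resample_poly` polyphase resampler with its default filter, and assumes a 16-bit, 32-bit integer, or float WAV exported at 96 kHz. The function names and file paths are placeholders.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import resample_poly

def downsample_96k_to_44k1(samples: np.ndarray) -> np.ndarray:
    """Resample 96 kHz audio to 44.1 kHz (44100/96000 = 147/320)."""
    # axis=0 handles both mono (n,) and multichannel (n, channels) arrays.
    return resample_poly(samples.astype(np.float64), up=147, down=320, axis=0)

def convert_file(src_path: str, dst_path: str) -> None:
    """Read a 96 kHz WAV, resample it, and write a 32-bit float 44.1 kHz WAV."""
    rate, data = wavfile.read(src_path)
    if rate != 96_000:
        raise ValueError(f"expected a 96 kHz file, got {rate} Hz")
    # Scale signed integer PCM to roughly [-1, 1]; float data is left as-is.
    if np.issubdtype(data.dtype, np.integer):
        data = data / np.iinfo(data.dtype).max
    out = downsample_96k_to_44k1(data)
    wavfile.write(dst_path, 44_100, out.astype(np.float32))

if __name__ == "__main__":
    # Placeholder file names; substitute your own export.
    convert_file("saki_96k.wav", "saki_44k1.wav")
```

How the result compares with Audacity's or a DAW's converter will depend on the filter settings, so it is worth checking the output by ear.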
And yes, sample-rate conversion (SRC) does vary from software to software, as evidenced by the data presented at this website:
src.infinitewave.ca