Here’s my latest collaboration, and my first to use Saki AI:
As per usual, りくりくり created the (rather lengthy) lyrics. Importantly, they were written first, and I had to add music to what was provided. This proved challenging because there is a near-total absence of a regular syllabic meter. Nevertheless, I managed to create something that more or less fits the words, and I even added moments of word painting.
The biggest learning experience was discovering a cure for SynthV’s harsh audio output. Ever since I began using the original SynthV, I have struggled to eliminate its brittleness, often with the help of Z-Noise. Unfortunately, the same harshness can be heard in SynthV Studio, which led me to believe that it was just a quirk of the synthesis engine.
However, through some luck and investigation, I now hypothesize that this brittleness is caused by poor sample-rate conversion. Although I don’t know what goes on inside the SynthV synthesis engine, evidence points to it always running internally at 96 kHz. When you export to 44.1 or 48 kHz, the software seems to just do a simple sample-rate conversion rather than re-synthesizing for the target rate. If so, then this conversion process (decimation?) is flawed.
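To illustrate why a careless conversion could sound brittle, here is a minimal Python sketch. It is not SynthV's or Audacity's actual code, and I am only assuming that the 96 kHz signal contains energy above the 22.05 kHz limit of a 44.1 kHz file. The function names (`naive_decimate`, `resample_sinc`, `tone`, `peak`) are made up for this example. The crude converter simply picks the nearest source sample, so a 30 kHz test tone survives at full strength and folds back into the audible range (mostly around 14.1 kHz). The band-limited converter applies a Hann-windowed sinc low-pass filter first, so the same tone is almost entirely removed while a 1 kHz tone passes through unchanged.

```python
import math

def naive_decimate(samples, src_rate, dst_rate):
    """Pick the nearest source sample for each output sample (no filtering)."""
    n_out = len(samples) * dst_rate // src_rate
    last = len(samples) - 1
    return [samples[min(last, round(n * src_rate / dst_rate))]
            for n in range(n_out)]

def resample_sinc(samples, src_rate, dst_rate, zero_crossings=32):
    """Band-limited resampling with a Hann-windowed sinc kernel."""
    # Cutoff as a fraction of the source Nyquist frequency. When downsampling,
    # it drops to the destination Nyquist so content above it is filtered out.
    cutoff = min(1.0, dst_rate / src_rate)
    radius = zero_crossings / cutoff  # kernel half-width in source samples
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for n in range(n_out):
        center = n * src_rate / dst_rate  # output instant, in source samples
        lo = max(0, math.ceil(center - radius))
        hi = min(len(samples) - 1, math.floor(center + radius))
        acc = 0.0
        for k in range(lo, hi + 1):
            x = k - center
            arg = math.pi * cutoff * x
            sinc = 1.0 if arg == 0.0 else math.sin(arg) / arg
            hann = 0.5 * (1.0 + math.cos(math.pi * x / radius))
            acc += samples[k] * cutoff * sinc * hann
        out.append(acc)
    return out

def tone(freq_hz, rate, n):
    """Unit-amplitude sine test tone of n samples."""
    return [math.sin(2.0 * math.pi * freq_hz * i / rate) for i in range(n)]

def peak(samples, margin=200):
    """Largest absolute value, ignoring the edges where the kernel is cut off."""
    return max(abs(s) for s in samples[margin:-margin])

if __name__ == "__main__":
    SRC, DST = 96_000, 44_100
    ultrasonic = tone(30_000, SRC, 4800)  # inaudible at 96 kHz
    audible = tone(1_000, SRC, 4800)
    print("30 kHz, naive:   ", peak(naive_decimate(ultrasonic, SRC, DST)))
    print("30 kHz, filtered:", peak(resample_sinc(ultrasonic, SRC, DST)))
    print("1 kHz, filtered: ", peak(resample_sinc(audible, SRC, DST)))
```

The pure-Python loops are slow and only meant for short test signals. For real audio, a dedicated resampler such as the one in Audacity is the practical choice.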
To fix the problem, I now export at 96 kHz and then downsample to 44.1 kHz using Audacity. The difference is very, very noticeable. And while I’m in Audacity, I also edit out the weird clicks that happen during quiet moments, and export to FLAC format. (Why was FLAC support removed from SynthV Studio? It was in the original SynthV.) The result is an audio file that no longer needs Z-Noise, and sounds way better than Z-Noise could ever achieve with a native SynthV 44.1 kHz export.
And to think, I only just discovered this one day before finalizing the above video. Good timing! Although the timing could have been better; I could have discovered this two years ago. (Assuming that this same fix also works for the original SynthV, which I haven’t tested.)