Ever since Vocaloid5 came out, I've been wondering how they achieved the vocal fry effect. Before V5 was released, I assumed the only way to get a convincing vocal fry was to record actual fry samples from the voice provider and work them into a voicebank through something similar to cross-synthesis. Given this, I believed that if vocal fry were ever added to Vocaloid's repertoire, it would only work with new voicebanks that included recorded fry samples.
However, it became clear that my assumptions were totally wrong when the V5 vocal fry feature was announced as backward compatible with old voicebanks. And it's driving me crazy wondering how they did it.
The only decent attempt at vocal fry on a Vocaloid before V5 that I know of was posted by the producer PSGOZ back in 2015, and even that one is a little rough around the edges and not as natural sounding as what Vocaloid5 achieves. Additionally, AFAIK, PSGOZ never revealed how he did it, so no clues there either.
TL;DR: Vocaloid5's vocal fry is artificially generated rather than built from recorded fry samples. Do any of you have any insight into how exactly this effect might be accomplished under the hood?