I'm seeing a lack of threads talking about AI text-to-speech and voice, especially now that it has gotten to this point
https://files.catbox.moe/r5xvmrsjyunc.wav
>https://files.catbox.moe/r5xvmrsjyunc.wav
Sorry here's the actual link
https://litter.catbox.moe/zv5zte.wav
Tch, you're the lewd one, idiot little sister
I wonder why the little sister calls Anonymous "senpai"?
More resources
https://github.com/CjangCjengh/MoeGoe
https://huggingface.co/spaces/skytnt/moe-tts
https://voca.ro/1kHbkFInvzGl
but it's japanese
also the characterai fiasco made me lose my faith in ai waifus
how many times do people have to try to "prevent nsfw" before they realize it means worsening everything?
>but it's japanese
There are English models
>the characterai fiasco made me lose my faith in ai waifus
You can run this locally with a GUI using https://github.com/CjangCjengh/MoeGoe_GUI and the models can be found at https://github.com/CjangCjengh/TTSModels
thanks, gonna check it out in a bit
You're welcome. I'm going to make an English fork since the GUI is in Japanese
the model with thousands of voices is very impressive. this mixed with koboldai would be funny
Thanks, made some adjustments to the pretty bad Python script and now it works quite well:
https://files.catbox.moe/fipw1y.wav
The English voices and accents were quite bad.
The Japanese ones were good. It works so well with short phrases that you could literally use the output in games right now. Anime voice actors shaking as much as the artists drawing the frames.
Impressive stuff.
So some random Japanese/Chinese individuals trained these? Is this some new groundbreaking method, or why are all the English voice synthesis models absolute dogshit? Even the professional ones.
There are many reasons the English voices are harder to train, including that English is a much more complicated language and that japs love conformity, which means every other voice actor does one of 5 archetypes and nothing else. This allows for a very large corpus of very limited phrases (it's also why it's so easy to learn a lot of Japanese purely by consuming animu, while you can't do the same for most other languages without also studying things in parallel). This is a perfect setting for deep learning.
Chinese might also work well because, while it doesn't have the same conformity as japs, it has so much data that it might be able to overcome that issue. That is because, while the setting is similar to English, Chinese is more regular and has less international presence, thus less overall variation.
Imagine being so wrong in just a single post
Holy shit
I accept your surrender.
The BOT mascot is 小岩井よつば from the manga よつばと!
https://www.BOT.org/flash
Anime Site.
>2005 otakon intro
>delicious cake
>Electiongays come over a decade later
>>go back to /a/ anime gay
Learn your history, chud.
OK election tourist. Whatever helps you sleep at night.
Ok, Fox News tourist.
have a nice day
Why does the literal truth make chuds so mad? Stop trying to skinsuit BOT. It will never happen.
It's people desperate for attention. Just ignore them.
Since 2004 BOT has been in my bookmarks folder named "Anime discussion forums"
go back 2 gaia