AI Voice Synthesis Posted on January 29, 2023 by Anonymous https://vocaroo.com/156LekEhAlrA https://beta.elevenlabs.io/
every garden gnometuber with 20 bucks is using these things so that they don't have to get on a mic now
How long will this thread last?
hopefully not that long
Record your voice and let's find out.
If you can make it sing, you can be Moonman 2.0 Electric Chair Boogaloo
Someone should get this to the vaxx jamz
Imagine having a system where it's synthesized in real-time. This shit could unironically start the next WW3.
>ww3 starts with shitposters making world leaders say they're going to open the borders of israel through the fires of nuclear holocaust
Does anyone have the one where he says "slam dunk moron babies into the trash can"?
Thanks sweet bro
doesn't really sound like him
is it for anybody
The futa one got me good
i am fucking dying
he's gonna play with one of these on the podcasts eventually, guarantee it
>jamie pull up that BOT post with the AI talking in my voice
It's so strange.
It sounds just like him,
But it sounds like he's reading off a piece a paper, or from a "copy".
Did you use prompt audio of him reading an ad?
Because it sounds like he's reading from a book or something.
It was probably trained on voices reading something, which is why it sounds like that
That's what I was thinking.
Fuck cannot unhear now.
aww that's too bad was so close to being perfect
It's courtesy of /v/. They're playing with it like no tomorrow.
I think if you turn down the voice stability setting it might sound more natural.
The future is now.
the flawless japanese is funny as fuck
>=[ Chinese cartoons
Blew my sides into orbit.
Holy fucking shit I just spit out my coffee
For me it’s the McChicken.
I can imagine him casually shilling McDonald's in the middle of his speeches like that
Sounds way too calm.
You should train it on one of his many monologues where he loses his shit.
It seems that the synthesizer doesn't do that. I tried that sergeant from Full Metal Jacket and the result was pretty disappointing
Why did the AI shit all come out like at the same time?
AI images, AI writing and now AI voices all blew up at the same time in the last few months? Like why all at once
Because it was all already developed with military tech,
And the breakdown of narrative control (really just the energy required to maintain it) is more than the energy that would be required for the cabal government to lead us if we had absolutely no chance of knowing what the truth is.
Easier to guide the blind than to distract someone with sight.
Because the algorithms were developed decades ago, and they're being disclosed instead of being kept hidden.
Imagine industry developed the means of production but kept it a secret to all the workers. Shit metaphor but I'm tired and horny.
Transformer network architecture (attention etc) was the big thing that made it way faster. That was discovered a couple years ago and the few people who realized just how much better it was were keeping the models to themselves until pretty recently.
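For anyone wondering what "attention" refers to here, below is a minimal sketch of scaled dot-product attention, the core operation of the transformer architecture. It is plain Python for readability, and the toy queries, keys and values are made up for illustration; none of it comes from any model discussed in this thread.

```python
import math

def softmax(row):
    """Turn a list of scores into weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy data (illustrative only)
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
outputs, weights = attention(queries, keys, values)
```

Each query ends up with a set of weights over all keys, and those weights can be computed for every position independently, which is a large part of why the approach parallelizes well on GPUs.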
I'm experimenting with a fundamental algorithm change that could improve the memory/time efficiency by 10-100x but I'm keeping it to myself since my employer technically owns all the IP I create (even on my own time) and I want this for myself.
ML is the new tcp/ip
I really think a decent language model that you could cram onto a PC could replace google.
Then the internet becomes only interesting for commerce and messaging, that would change things a lot.
Oh shit I didn't even think of that. These companies are finished. Your own personal Google
Still working within the bounds of ANNs, or are you designing your own AA architecture too?
imo, ANNs are actually pretty bad for anyone but big companies. The networks are basically black boxes, and extracting "knowledge" from them is usually more complex than just retraining a new net from scratch, which can cost in the millions. Seems like a great way to keep cutting-edge nets proprietary while dripping stuff out to consumers, limiting what they can do while simultaneously disincentivizing people from training their own, more capable "AI"s, since the commercial net makes the first 90% of development basically useless/unsellable (since the "free" commercial nets already do it, your own net is worthless until it implements a new feature, which won't happen till you're practically finished with development).
TL;DR: we need an ML architecture that supports transfer of data without retraining or frankenstein'ing the nets
thisisnotjordanpetersen.com was up like 4 years ago, this tech is a decade old on the academic side
military in glow moron and moron project budge had already developed such tech's years ago.
Now random gay garden gnomes that were good parasites were given the intelectual property.
so leaders can call any new leak a deepfake to protect themselves. also to prevent one or a few parties from destroying everything you just allow everyone to have the tech and let everyone know shit can be faked so the world doesn't believe the wrong thing one morning and cause armageddon.
also shit like this has been out for awhile but was only half disclosed and only one fourth available.
>Why did the AI shit all come out like at the same time?
because it's all based on the same fundamental tech. it's all just neural nets. people have been playing with it for a long time and the more they do the better it gets. practice makes perfect and all that
You're just dumb
AI voices have been a thing for years now
Stable Diffusion's only been a thing since last year
Is there any real reason we shouldn't deepfake a series of Biden videos, with audio,
Where he says he will begin world war three?
Two or three really good ones, just enough to terrify the world, and then profit on the stocks?
He speaks too clearly without any of his stuttering or brain farts where he stops to think and also he’s too monotone without his cringe and slow emphasis on certain words
Need to train the bot on recent speeches then
Needs more "not a joke" and whispering
>Needs more "not a joke" and whispering
you have to admit the whispering is based as fuck.
If someone could figure out how they watermark the audio and remove it or edit it over,
And then capture a real Biden impersonator saying something similar, actually acting it out, and one of those deepfake videos with the facial scan.....
Almost perfect, sir.
Please change "dog pussy" to "dussy" and reupload.
So proud of you, when it's done spam it because I don't want to miss it. Glad you picked my audio too.
That actually creeped me out during the first 15 seconds.
I feel like that was okay to listen to but I shouldn't have listened to it.
He's WAY too comprehensive and eloquent.. needs more stuttering and weird digressions
Add camera clicking sounds in the background and that's it donezo finito
How are you getting these different voices? Pay $20 bux?
Last call at the Bee & Barb
Nazeem is pissed
Do they have any offline versions of these things?
what yah up to mate?
Yeah, do you have a datacenter ready to go with 8000 high-end enterprise GPUs?
I'm sure a local instance can handle a single request with enough training data like the art ML programs
just make him say moron, darkest dumbest etc.
Bros I can’t wait to see some old StarCraft Maps get the old voice actors deep faked into those fan made campaigns
I cant wait for when Ai video is ready, This with these voices will be amazing.
>At Eleven, we believe that we should strive to make the most of new technologies, but not at all cost. As we develop them, we make every effort to implement appropriate safeguards which minimize the risk of harmful abuse. With this in mind, we’re fully committed both to respecting intellectual property rights and to actioning misuse. We will also watermark all audio generated by our model, so that it is instantly identifiable as AI-generated.
Sounds like the sort of thing you can sniff out with a spectrum analyzer.
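As a rough idea of what "sniffing it out with a spectrum analyzer" would look like, here is a minimal sketch using a naive DFT on a synthetic signal. Everything in it is an assumption for illustration: the stand-in "voice" tones, the 3 kHz marker, and the idea that a watermark would be a single fixed tone at all. Nothing in this thread confirms how (or whether) the actual service marks its audio, and a spread-out or perceptually shaped mark would not show up this cleanly.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, illustrative
N = 400              # samples, so each DFT bin is 8000 / 400 = 20 Hz wide

def make_signal():
    """Synthetic stand-in: two loud 'voice' tones plus a quiet 3 kHz marker."""
    out = []
    for t in range(N):
        x = t / SAMPLE_RATE
        voice = 0.5 * math.sin(2 * math.pi * 200 * x) + 0.3 * math.sin(2 * math.pi * 440 * x)
        marker = 0.05 * math.sin(2 * math.pi * 3000 * x)  # hypothetical marker tone
        out.append(voice + marker)
    return out

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns normalized magnitudes for bins 0 .. n/2 - 1."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                  for t, s in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags

mags = dft_magnitudes(make_signal())

# Look only above 2 kHz, where the stand-in voice has no energy
high_band = range(2000 * N // SAMPLE_RATE, len(mags))
peak_bin = max(high_band, key=lambda k: mags[k])
peak_hz = peak_bin * SAMPLE_RATE / N
print(f"strongest component above 2 kHz: {peak_hz} Hz")
```

A real analysis would use an FFT library and windowed frames rather than one naive transform, but the principle is the same: a narrowband tone shows up as an isolated spike.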
Maybe, but I doubt you could do anything about it.
With audio, it's very easy to add a frequency of the audio and then compress it in a way that makes it impossible to remove or edit.
You probably know that already, but I guess I'm saying it anyway.
It would be like removing salt from a stew. Or at least it could be, If they are smart about it. Maybe I should keep my mouth shut
If you find it once, you can write a little script to automatically search for it across the frequency range and mute the watermark.
Thanks insider anon.
No, I don't think you could.
That's what I'm saying. Unless they are very stupid about it, and just add a little "blip" or add a sound.
But I doubt it.
What I'm saying is that if I wanted to add a watermark to audio like this to make sure I could always tell that it was generated by my scripts (and prove it's fake) I could do that in a way, through audio editing, that would make it impossible to remove from the audio. That is possible to do and it's not difficult. Compressing two audio sources together results in a final waveform that is impossible to get either of the originals from, there would be no way to ever do it, or to remove traces of one of the files from the other. That's how audio compression works, think of it like an encryption that has no decryption key.
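A toy illustration of the "can't get the originals back" point, assuming "compressing together" simply means summing the samples of two sources (an assumption, the post doesn't specify). The sample values are made up. Two different pairs of sources produce the identical mix, so the mix on its own does not pin down what went into it. That ambiguity only holds when neither component is known in advance.

```python
def mix(a, b):
    """Sum two equal-length sample lists, element by element."""
    return [x + y for x, y in zip(a, b)]

# Two unrelated (source, mark) pairs; integer samples for exact arithmetic
source_a, mark_a = [3, -1, 4, 0], [1, 1, -1, 2]
source_b, mark_b = [2, 2, 0, 1], [2, -2, 3, 1]

mix_a = mix(source_a, mark_a)
mix_b = mix(source_b, mark_b)

# Same output from different inputs: the mix alone is ambiguous
print(mix_a == mix_b)
```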
I have noticed checking the biden ones that you can hear applause etc when he is supposed to take a pause or breath. I assume it's a ghost of what the bot was trained on. So there would be telltale signs if you look close enough.
But theoretically we could get around that with carefully chosen training audio.
I was thinking it might be best to actually cut and shape the training prompt from multiple sources, or from multiple pieces of a much longer speech, in order to match the pacing and syntax of the text we want read. Certainly to make the timing, pace and pauses.
But that's not what I imagine their watermark is. I imagine the watermark is something like a reversed version of the audio, with an EQ gate above maybe 15k, then added into a compression somewhere in the audio chain, so it's impossible to remove from the transients or something. Or a combination of that which couldn't be edited by low pass.
I don't know if that's what you meant, but there's that.
yes on both points, it would require a better model to make it a better outcome, and you could do a freq watermark, but I assume if you knew what it was it would be an encryption key at that point, and with the limited amount of freq that humans can hear, you could make it only be picked up by an analyzer.
there is no watermark. they just run it through their classifier that tells you the probability the clip was generated by their AI
Why hasn’t anyone trained it on Hitler and made a “secret” never before heard speech where he says garden gnomes made him do the whole thing?
Be the change you want to see in the world. the sky is the limit.
they already did that, but it was in 1945 and they made him say he hated garden gnomes
Make it Tim Pool
ok so https://beta.elevenlabs.io/ this has some voices but how the fuck did you get jordan peterson's voice?
you can upload mp3 samples of someones voice to make a custom voice, it's how i did belathor and nazeem
what happens if you do a coombait tester one based on like some ASMR chick or something does it work?
jesus christ, the AI threads (art and now voice) always being full of beggars (do this do that!) have just confirmed to me how many fucking lazy fucks there are in the world. The type of people who watch people play video games, rather than do it themselves. What the fuck you lazy piece of shit, there's a link in OPs post, go try it yourself.
The duality of 4chan
sorry for askin, i was just curious if it fucks up when it tries to analyse whispers or something, im doing a fuckload of multitasking right now and just browsing 4chan as im doing other shit, i dont want to have to sign up for shit right now or download/upload anything.
Most people are phonegayging these days.
>the AI threads (art and now voice) always being full of beggars (do this do that!) have just confirmed to me how many fucking lazy fucks there are in the world. The type of people who watch people play video games, rather than do it themselves. What the fuck you lazy piece of shit
It's okay anon
It's always been this way
When I was a photographer people wanted shit, couldn't describe it and if they could might take thousands of dollars of equipment on site, or maybe a dozen hours in ps/or if it was realistic at all
and they wanted it for as close to free as possible
It's always been this way, dumb lazy people who want want want
You underestimate the amount of phoneshitters that make up the internet today.
a little bit more specifics about the sample i gotta upload? like does it have to have all the vowels or how long should it be?
I improved it a little, but the chud gets scuffed
Oh man, seems like i could absolutely use this tech for AI generating vocal samples of shit that doesnt actually exist.
Anybody aware of any alex jones ones? i want to hear how well it does his ranting.
how do you add in popular voices?
hi-tech, low-life is now.
its fucking terrifying tbh, people are going to deny reality, because virtually they can be anything they want, fatal escapism
>people are going to deny reality
The cream will rise to the top.
You are cream right anon?
I bet your tasty cream... You better be
oooooooo randy savage
All im thinking about now is realtime gpt mixed with real time vocal synthesis crossed with realtime stable diffusion fed into a vr headset using upscaling technology, could literally transport you into a reality of your own construction, doesnt even seem that far off either, 20 years maybe max?
waifus are going to be real i guess, well close to real, synthetic waifus.
ONLY GOOD BUG IS A DEAD BUG
Can you please get peterson to say the..pasta
>Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like?
which one? I'm running this locally so generating an entire paragraph will take a while, a sentence takes about 5-10 minutes
yeah figured it was kind of involved, no worries, im going to have a mess with this shit when i can, im just kind of browsing here as im doing other important shit that periodically needs focus, then i get 5 minutes downtime, cant really do anything too involved right now.
kek, sounds like a normal peterson speech
Already chewed through my monthly limit.
Holy shiittttttt that is goooodddd
Sounds like Biden 16 years ago. Wont fool anyone because it's too coherent.
Some kino dumbledore stuff.
I recommend the third one. That's the best I've done.
Share mp3's and settings
Kek. Someone should send this to the Kremlin.
Really needs a "Ha!" at the end
my sides! best thread up
my favorite so far
Best in thread
That's it. Nothing can top this, ever.
amazing, so funny
I'M FUCKING DYING
Holy fucking nailed it
I cant fucking breathe
You did it, haha! xD
My fucking sides
reposting this from /v/eddit
moments before January 6, 2021
address to twitter, scrubbed immediately after posting
Imagine the truther community hears this and nobody tells them it's AI
Dan Carlin on Foot Fetishism
guys it won't be long until you have something like chat gpt, mixed with this voice synthesis, interfaced with an operating system like windows.
seems like Her, the film, is just 10 years away, even if it's not perfect and makes mistakes.
yeah that was my initial thought, when all this shit mixed with stable diff becomes real time its going to be nuts
they've been using it for 10+ years
you have probably watched a CGI person on TV before
2, it will be made by Microsoft and it will be obsolete the day it launches
Hey friendly reminder
Adobe demoed shit like this ages ago. Called Photoshop for voice. Never let the public at it:
The implication should be clear
You guys have been hearing ai audio for a long time
2016 really makes ya think
yeah, the weapons finally made the black market.
do a follow up
>i have autohrized strikes on tel aviv
You know why this one's so good?
Bush actually spoke in this exact monotonous voice every single speech
HOLY FKKKK im done. you win.
Its like being 14 all over again.
how are you guys cloning the voices?
I started doing this a few days ago. I'm baking Molyneux, Tim Pool, Stephen Fry and JP. I can share the voice data, but who should I do next?
I would be interested in seeing how it renders out the extremes, like whispers or shouting etc, or voices with a lot of sssss type vowels etc. It seems pretty convincing but i think like photoshop and the ai image generation you can kind of spot it if you know what to look for, its bloody convincing though.
Seems like the better the quality the input, the better the output.
That can be done, but it needs to be done on a different AI system that gives you cadence controls
I played with it a bit and it can be done with the current AI as well.
You can just put in different samples of the same person shouting for 1 voice.
Then different samples of the same person talking normally for another voice.
You can also use brackets and periods to change pacing. That worked with Dan Carlin for me
Does the ML take in punctuation as a prompt like ? or !
thats interesting information, will check it out myself when i have time to spend a few hours involved messing with it, actually kind of interested in it from a music creation standpoint, might be cool to generate samples of spoken word that are basically not real samples, so unsure how the copyright would work.
Jeremy Clarkson doing Hitler speeches
Jeremy Clarkson would be memes as suggested by
Howard Stern though, that would seriously fuck with boomers.
So would Vince McMahon, Steve Austin would be fucking based, Martin Luther King, Malcolm X, Bill Gates, any celebrity would work.
This is it, gentlemen.
The only thing separating us from fully Turing-test passing synthetics now is the relatively janky state of robotics, but as soon as they figure life-like movement the sex robot market will fucking EXPLODE
the original sounds completely different:
ok I got my speech , how do I put it in vocaroo?
I absolutely and 100% fear for the future of evidence-based justice, and I think it's clear that big tech doesn't even care.
Everything made with AI should have some sort of detectable digital watermark.
you fucking retard
why do you think tptb have been pushing "muh feels" and "muh lived experience" and "muh bodies"?
Go ahead and spell it out for me.
you will know nothing, less than nothing, and you will be happy
if you knew the first thing about the justice system you wouldn't believe in it to begin with. look up case prosecution and closure rates, not to mention family courts. in the justice system, actual justice was always just an accidental byproduct.
Even if big tech companies all agreed to add a "digital watermark" the methods and datasets used to train these advanced models are already well known. It's become approachable enough that even a (knowledgeable) individual can make a model on their own, let alone a group with any sort of funding. The cat's out of the bag and there will be more and more models popping up now. It's already at a point where training a model is a weekend project for a software developer.
penalty of death for this shit and we'll see how easy will it be for you
dont worry, we will collapse before it becomes an issue.
Then you just accept everything without a watermark as fact? This is a tool like any other, if you refuse to use it then you'll just be at the mercy of people who do.
No, you're gonna have to learn that recordings and pictures don't constitute reality and the only thing that has value is what you see with your own eyes, like the good old days
rather hard for the moroncattle, but fuck them
technology is human extinction.
why doesnt it sound like JBP despite feeding it two mp3 files 5 minutes long with only him talking as training source?
Hahahaha, the ending was the best, the way it went all sassy, valley girl.
Actually that was quite full and rich of words, just like the werewolves would tear apart the young ripe breasts from a female body not caring if she was a virgin or if she was still breathing when the breasts came back to the rightful owner of slayn indecent but yet docile cattle. Once the wolf has captured the breast indifferently of the dignity of the woman, yet their bountiful flesh that is stuck on their chest like a sore thumb to their purity yet signals and dignifies its purity. The woman will become pure flesh that has to be taken like a hunter sees a gatherer picking barries from a tree.
And said tree has no stem but a hole, above the heart, right next to another hole. And i will stick my magic stick into one of said holes and her heart will be broken.
cool thread plz make another one after limit
I tried making a Metal Gear raiden voice of Flynn but it doesn't wanna cooperate, wat I do wrong 🙁
Good luck on your law suits
seems like the cats out of the bag on this kind of tech like deepfakes etc, its only going to improve
Black Wizard of putrid folly.
Pusillanimous pip-squeak of perfidy and prejudice.
Idiot ideolog of Dunning-Kruger deranged moral superiority.
I straighten my back and take my meds.
looks like a dash acts like a stop, almost perfect
Thats insane how good that actually is, some of the rogan ones are incredible too.
tried to improve OP
I haven't laughed this much in years, literally some of the funniest shit I've ever listened to
Wasn't creative enough to come up with better text. Suggestions welcome
Anyone have the "be thankful for the jab" copypasta?
>The real war hasn't even begun
Someone do an Elon Musk one
Laughing until I couldnt take it anymore. Someone please send this to him.
tried with luther pierce but it just doesnt sound right
Im sure there is a lot more biden models to work on for the cadence for each one. Thats why his sounds more human and not read than the other ones.
The dedicated models for JP and AJ are going to be amazing.
this would be best in the voice of Rorschach from that movie
sorry forgot to say that the first one is with biden and the second wlp
Do one of Scott Adams
Previous was funnier, OP.
Here's a request. Do a Biden voice:
My fellow Americans, dignitaries and foreign allies, we are at a never before seen crossroad in history. Unfortunately, gnomish aggression headed by the Military Industrial Complex and Private Central Bank regime is set on undermining our most principled core Western values and continues to disregard international law and recognized sovereign borders. We can not allow lawlessness and aggression to lead the world into the twenty first century. We must act, and we will do so swiftly and with strength. The United States was the Republic and had freedom a long time ago, along with our allies, we will not falter in our beliefs. This is why, as your elected president, I am calling on all branches of our armed forces to directly counter this illegal provocation by unelected officials in Washington DC. I understand that the stakes have never been higher but we have no choice, we must act, and we must do so now. I have called on our nuclear arsenal to remain at the highest readiness. Our strength is unshakable and we will demonstrate this with our full might to defend our values, our beliefs, and our people.
That is why I also resign from the post of the President of the United States.
God bless America and may God lead us through.
Someone do G-man from HL.
Will this tech kill voice actors, i mean fuck, with AI Generated art and ai generated voice acting, <6 man teams could make pretty large in scope videogames again.
You sure do like vocaroos and morons.
Mike tyson is a goat.
>the breaths and meter
Fuck bros I think we're in trouble
> they neuter LE BAD WORDS in 3.. 2...1...
Local users are based
sad thing there's no local audio-ai, for now.
also it's prb the same clusterfuck shit as LLM's requiring 8 nvidia A100's just to run it.
I literally just made it say "I hate morons".
how the hell are you guys doing these real people voices? does the paid version have a bunch of celebrities already or are you making them yourself with the settings
i think you feed it text, you feed it a vocal sample and it generates the text to speech in the style of the vocal sample
It lets you upload custom voices you dumb sirs
thank you sir
HAHAHHA HOLY SHIT
really doesn't do any justice to the original
I agree, fuck Belgium
wow this site is pretty good (i don't know if you ever heard of taylor marshall)
Oh that's funny.
Alright, I've got a basic Tim Pool, but I used his crigler response video so its sombre energy
Turn up Tupacs new jam.
Riding Hard 4 My Homiez
Damn, thats pretty good.
> American culture is centered around ni-
Joshua Graham speak out about the decline of r/femboys
>adding the new vegas ambience
You fucker my sides weren't ready
fuck me, the potential to make add ons to old games with full voice acting is real. I need to hear max payne doing an ai generated monologue.
Going to need Chris Chan..
I want this copy pasta with Morgan freemans voice really bad
that is a lot of references deep.
what a rap battle
so now that we know for a fact voice can be easily recreated with AI, do we have even a single shred of proof that this was actually kanye?
Not proof, but that definitely wasn't the real Kanye.
i went back and listened to lift yourself
the song from kanye that's about literal shit
i'm not a huge fan or anything, but i know kanye's voice and now in hindsight i believe that that song wasn't made by kanye
just listen to it
Here's the real Kanye
Sounds nothing like the poop-di-di-scoop guy, or the guy who sat at Alex Jones' desk
yup sounds completely different
plus, its just shady to wear a mask to hide your face, a big jacket to hide your body, and gloves to hide your hands
one reason i can think of why kanye would do that, is that he himself can later deny that that was actually him, which would be a genius move
The disguise also helps keep people from switching off.
It's not so much about making sure the garden gnomes can't abuse your image, it's about making sure people can't decide if it's really you who's saying it, thus they keep listening until they feel the answer.
Without the disguise, people would've turned it off after 5 seconds because they got verification.
With the disguise, it might take an hour or two.
I still think it wasn't him though.
I want to believe
He's a Mossad cyborg. So is Nick Fuentes, it's why his emotions seem so fake.
alex jones literally named the actor playing kanye, supposedly as a joke
i actually forgot who it was, but it was a black actor from i think walking dead
Can you make Alex Jones admit to being gay?
Historic thread. Incredible to witness
Its only just begun
Can anyone make Jared Taylor and/or Patrick Bateman do the Sneed?
Anyone know a fix for the file extension? I'm using a random youtube video ripper to convert a video to MP3, but the drag and drop for the voice AI keeps saying I have to change the file extension. Even though it is already MP3.
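One thing worth ruling out with the extension problem above: some rippers save a different container and just name it .mp3. Below is a minimal sketch of a header check, assuming the uploader looks at file contents rather than the name (that is a guess, nothing in this thread confirms it). `looks_like_mp3` and `file_looks_like_mp3` are hypothetical helper names, and the check is only a heuristic: some other formats (ADTS AAC, for example) also begin with the same sync bits.

```python
def looks_like_mp3(header: bytes) -> bool:
    """Heuristic check on the first bytes of a file (hypothetical helper)."""
    # An ID3v2 tag is commonly placed in front of the audio frames
    if header[:3] == b"ID3":
        return True
    # Otherwise expect an MPEG audio frame header: 11 sync bits set
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0

def file_looks_like_mp3(path: str) -> bool:
    """Read the first few bytes of a file and apply the check above."""
    with open(path, "rb") as f:
        return looks_like_mp3(f.read(4))

# Example headers (first bytes only)
checks = {
    "id3_tagged": looks_like_mp3(b"ID3\x03"),
    "raw_frame": looks_like_mp3(b"\xff\xfb\x90\x00"),
    "wav": looks_like_mp3(b"RIFF"),
    "mp4_container": looks_like_mp3(b"\x00\x00\x00\x20"),
}
print(checks)
```

If the check fails on the ripped file, re-encoding it to MP3 with a proper converter is more likely to help than renaming it.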
Has anyone made a model based on some popular voice actor for audio books? I could use that lol.
voice actors on suicide watch