https://twitter.com/justlv/status/1610343308831920128
>TorToiSe (TTS) trained on only 30s of audio of Sam Harris
>https://github.com/neonbjb/tortoise-tts
okay
It's over, biobros.
Can I sound like a cute girl yet or not
Have 3-5 different 10-second WAV clips of a cute girl, then feed them to the system.
You can try the colab notebook.
Just upload your own samples.
https://github.com/neonbjb/tortoise-tts
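For anyone who actually wants to try it: a "voice" in tortoise-tts is just a named folder of short WAV clips. Here's a minimal Python sketch of that convention; the voice name and paths are made up for illustration, and the clips here are silent placeholders built with the stdlib `wave` module. With the real repo you'd drop actual recordings in and run `do_tts.py --voice <name>`.

```python
# Sketch of the tortoise-tts custom-voice convention: a voice is a folder
# of short WAV clips. These are silent dummy clips; use real recordings.
import struct
import wave
from pathlib import Path

voice_dir = Path("tortoise/voices/cutegirl")  # hypothetical voice name
voice_dir.mkdir(parents=True, exist_ok=True)

for i in range(3):
    with wave.open(str(voice_dir / f"clip{i}.wav"), "wb") as w:
        w.setnchannels(1)       # mono
        w.setsampwidth(2)       # 16-bit PCM
        w.setframerate(22050)   # tortoise expects 22.05 kHz audio
        # 10 seconds of silence as placeholder frames
        w.writeframes(struct.pack("<h", 0) * 22050 * 10)

print(sorted(p.name for p in voice_dir.iterdir()))
```

Then `python tortoise/do_tts.py --text "whatever" --voice cutegirl --preset fast` against a local clone would render that text in the new voice.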
How is that 15.ai guy coping?
>AI trained on only 30 seconds of a clip!!!!
>Used ChatGPT for an initial draft,
chatGPT is for the text input of a generic Sam Harris talk; you can write whatever you want as the text input. That's the whole point of TTS: Text To Speech.
>Bro pretrained models are totally different!!!
>These models were trained on a small cluster of 8 NVIDIA RTX-3090s over the period of ~ 1 year.
>I started with the LibriTTS and HiFiTTS datasets, which combined contain ~896 hours of transcribed speech. I built an additional, “extended” dataset of 49,000 hours of speech audio from audiobooks and podcasts scraped from the internet.
Cope further, retards, muh 30 seconds is a meme. How are we this far in and brainlets still don't know the difference between training and a function's result?
Redditspacer + dumb moron lol
>Resorts to 4chan-tier insults when no arguments left
I accept your concession, brainlet.
>Nope, you just put a few of your few-second audio clips into your own custom voice folder and you get your result
Not "Nope", input is not the same thing as training data. OP's initial statement of "muh trained only 30 seconds!!" is inherently wrong, jesus christ why is this board so retarded. This is like me claiming some autoencoder or DALL-E was trained on just one single image to produce le epic AI art XYZ.
It's not wrong. The guy took 30 seconds of a random Sam Harris audio and then generated the full breadth of speech.
The pretrained model + new custom audio of your choice = your audio result.
The anon you're responding to is right, though. It was not "trained on only 30s of audio", it was fine-tuned or one-shot learned or whatever that system does with the given data. It's cool that you can do it, but you just can't use standard terminology in a non-standard way.
>It's not wrong. The guy took 30 seconds of a random Sam Harris audio and then generated the full breadth of speech.
This isn't "trained on only 30 seconds" though, you retards. Generation != training. Is chatGPT training for 5 seconds when it gives you a response based on your input? NO
If you have chatGPT speak like Elon Musk with just a few sentences of input, then when you ask any question the answer is given as if Elon Musk is writing, then sure it is
>then when you ask any question the answer is given as if Elon Musk is writing, then sure it is
If you're too retarded to understand the terminology then I don't have the time to explain it to you. Anyone above 80 IQ should understand the difference. It's like saying an Olympic swimmer only trained for 2 minutes because they performed the task in 2 minutes! wow! only 2 minutes!
No.
It's like taking a random person, showing them a 30-second clip of judo techniques, and then they become a judoka.
Before you show them the video, they're just a random person. So 30 seconds gives them enough training to turn a rando into a master
>It's like taking a random person, showing them a 30-second clip of judo techniques, and then they become a judoka.
No you FUCKING RETARD, IT'S A LANGUAGE MODEL TRAINED TO SPEAK LANGUAGES
It's a generalized voice model, with a specified custom voice folder that you yourself can fill and get output in the fashion you desire.
you're right. too bad you're a moron loving rëddit gay. go back.
To add onto that, we don't even know if Sam Harris podcasts were included in the training data. If they were, that makes the whole ebin achievement even less impressive. This is why you don't trust random twitterati to do actual research: they take the code-bootcamp approach to advanced topics and feel smart if they write 3 pages of a "paper" with shitty graphs
You're correct, but frogposting over
>Used ChatGPT for an initial draft,
was just as retarded as the people who responded. By the way, I'm trans
Nope, you just put a few of your few-second audio clips into your own custom voice folder and you get your result
This is not training, this is finetuning you absolute fucking retard. Your model already contains pretty much everything it needs to produce the output, the small snippet is basically used to specify what part of the possible solution space you wanna get.
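That anon's point can be shown with a toy sketch (this is NOT tortoise itself, just an illustration with made-up names and a fake 3-number "embedding" table): "training" is the expensive one-time fit over a big corpus; the short clip you supply at inference time only conditions the frozen model, no weights change.

```python
# Toy illustration of conditioning vs. training. The "model" below is a
# frozen table pretend-learned from hours of audio; synthesize() never
# updates it, the clip is just an input that selects a region of the
# learned solution space.
import math

# pretend this table was fit once over ~900 hours of speech ("training")
speaker_embeddings = {
    "alice": [0.1, 0.9, 0.3],
    "bob":   [0.8, 0.2, 0.5],
    "carol": [0.4, 0.4, 0.9],
}

def synthesize(text, conditioning_clip):
    """Inference: the clip picks the nearest learned voice; nothing in
    speaker_embeddings is modified (conditioning != training)."""
    n = len(conditioning_clip)
    clip_vec = [sum(frame[i] for frame in conditioning_clip) / n
                for i in range(3)]
    nearest = min(
        speaker_embeddings,
        key=lambda name: math.dist(speaker_embeddings[name], clip_vec),
    )
    return f"[{nearest}-style audio for: {text!r}]"

# pretend 30 seconds of Bob, chopped into 5 frames
clip = [speaker_embeddings["bob"]] * 5
print(synthesize("Hello", clip))  # -> [bob-style audio for: 'Hello']
```

However long `synthesize` takes to run, calling that "30 seconds of training" would be wrong in exactly the way the anons above are arguing about.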
Man you're a dumb moron
How can non chuds train their own AIs all the time? Chudbros are being left in the dust
The cool thing about it is you can have your own voice read books in a pretty decent way.
Why are people who are into AI unable to understand basic technology?
Do they think all technology is just magic and beyond human comprehension? is that why they worship AI?
>Do they think all technology is just magic and beyond human comprehension? is that why they worship AI?
yes. these low iq retards lost big time on the cryptocoin game, so now they're all gathering around machine learning websites hoping to find the next cash cow. they think they're so intelligent because they can copy/paste text into a window. since these people have the english skills of 4yos, chatgpt looks like a sentient being. the problem that these dangerously low iq monkeys face is that the monkeys don't own the datasets, nor do they have the computing power or bandwidth to create their own.
>understand basic technology
what the fuck does that even mean? OP is a troglodyte but so are you.
eh the jordan peterson ai was funnier
>jordan peterson ai
Which one?
The funny one, of course
If you can get some clean jordan peterson audio clips, you can make your own jordan peterson audio with this TTS AI too.
still a meme that will never replace voice actors
Indie game devs should be using it for their own needs.
Okay, but can I use it to make ASMR?
Someone please help me understand this repo.
Which files are the models?
The files in the model folders are .wav, so they're either generated samples or training samples or w/e. They can't be the actual model, can they?
It says in the readme that you can run it locally, so they should be there, right?
Or does it just do web queries?
nvm, it's clvp2.pth and a few others that get downloaded during install
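For anyone else confused: the .wav files in the voice folders are conditioning samples, not weights; the weights are .pth files fetched on first run. A quick sketch to check what's cached locally; the file names and default cache path below are assumptions based on the repo's api.py and may differ in your checkout.

```python
# Hedged sketch: list which tortoise-tts weight files (assumed names) are
# already in the assumed default cache directory.
from pathlib import Path

EXPECTED_WEIGHTS = [
    "autoregressive.pth",     # main text->speech-token model
    "clvp2.pth",              # candidate re-ranking model
    "diffusion_decoder.pth",  # token->mel diffusion model
    "vocoder.pth",            # mel->waveform vocoder
]
models_dir = Path.home() / ".cache" / "tortoise" / "models"  # assumed default

for name in EXPECTED_WEIGHTS:
    state = "cached" if (models_dir / name).exists() else "not downloaded yet"
    print(f"{name}: {state}")
```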
cant wait to retire off of my entirely AI generated podcast that normies will eat up