OpenAI's Whisper

Posted on September 28, 2022 by Anonymous

tested it on the final battle scene from 8 Mile
is it good or shit?

2 years ago

Reply

Anonymous

Pretty cool.
2 years ago

Reply

Anonymous

Its understanding of grammar is impressive, I haven’t seen any other transcriber on that level.
Funny how its most noticeable mistake is on a pretty easy line (ain’t no such thing as halfway crooks), probably because of the crowd’s noise, while it nails the last part where Eminem is speaking quickly without music.
2 years ago

Reply

Anonymous

Where does it store the downloaded models on Linux?
- 2 years ago
  
  Reply
  
  Anonymous
  
  oh it's ~/.cache/whisper
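
To check what has already been downloaded, here is a minimal sketch that lists the model files in that cache directory. The `~/.cache/whisper` path comes from the reply above; the helper name `list_cached_models` is made up for illustration, and it assumes the checkpoints are stored as `.pt` files.

```python
from pathlib import Path


def list_cached_models(cache_dir=None):
    """Return (file name, size in MiB) for each .pt file in the cache directory."""
    # Default location reported in the thread; pass another path to override it.
    cache = Path(cache_dir) if cache_dir else Path.home() / ".cache" / "whisper"
    if not cache.is_dir():
        return []  # nothing downloaded yet, or a different cache location
    return sorted(
        (p.name, round(p.stat().st_size / 2**20, 1)) for p in cache.glob("*.pt")
    )


if __name__ == "__main__":
    for name, size_mib in list_cached_models():
        print(f"{name}: {size_mib} MiB")
```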
2 years ago

Reply

Anonymous

Is there some place I can download the models and put them on a flash drive? My internet is not good at home.
- 2 years ago
  
  Reply
  
  Anonymous
  
  You can create a docker image, save it (docker save), and load it (docker load) whenever you want.
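
A lighter alternative to Docker, sketched under assumptions: on a machine with a decent connection, let Whisper download the model once, then copy the checkpoint files out of the cache onto the flash drive. The helper name `copy_models` and the mount point in the usage comment are hypothetical. `whisper.load_model` takes a `download_root` argument, so on the offline machine it can be pointed at the copied directory.

```python
import shutil
from pathlib import Path


def copy_models(src_dir, dest_dir):
    """Copy every .pt checkpoint from src_dir to dest_dir; return the copied names."""
    src, dest = Path(src_dir).expanduser(), Path(dest_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for checkpoint in sorted(src.glob("*.pt")):
        shutil.copy2(checkpoint, dest / checkpoint.name)  # copy2 keeps timestamps
        copied.append(checkpoint.name)
    return copied


# Hypothetical usage (the mount point is made up):
#   copy_models("~/.cache/whisper", "/media/usb/whisper-models")
# Then, on the offline machine with Whisper installed:
#   import whisper
#   model = whisper.load_model("base", download_root="/media/usb/whisper-models")
```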
2 years ago

Reply

Anonymous

it's not "pops" though. it's pac.
2 years ago

Reply

Anonymous

>pops
pac.
>but all six of you jumped
by all six of you chumps
>we did frick my girl
wink did frick my girl
>this hot plate groups
as halfway crooks.
>frick a pop, a doc
frick papa doc
2 years ago

Reply

Anonymous

Model where
2 years ago

Reply

Anonymous

parents doesn't rhyme with marriage
2 years ago

Reply

Anonymous

Is this another of these shits I can't run if I don't have dedicated gaymer GPU?
- 2 years ago
  
  Reply
  
  Anonymous
  
  You can run with CPU but it's slower
  - 2 years ago
    
    Reply
    
    Anonymous
    
    THANK GOD, god bless you anon
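
A small sketch of falling back to the CPU when no CUDA GPU is present. The helper name `pick_device` is made up; the `device` argument to `whisper.load_model` and the CLI's `--device` option are what actually select where the model runs.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # Whisper itself depends on PyTorch
    except ImportError:
        # Whisper needs PyTorch to run at all; this only keeps the sketch importable.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Would load the model on: {pick_device()}")
    # With Whisper installed:
    #   import whisper
    #   model = whisper.load_model("base", device=pick_device())
```

On the CPU, the smaller models (`tiny`, `base`) keep the wait tolerable.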
2 years ago

Reply

Anonymous

How difficult would it be to write a program to use this for speech to text? I'm a victim of US schools and never learned how to spell. To this day I sometimes need to use Google voice typing for some words and have been looking a first programming project.
- 2 years ago
  
  Reply
  
  Anonymous
  
  >and have been looking a first programming project
  >looking a first
  Less effort to learn to spell (and some grammar apparently).
  Erm, can I translate, let me try:
  You finna speak first fool, then spell `my homie`.
  - 2 years ago
    
    Reply
    
    Anonymous
    
    you're a loser for having the time to respond like this
  - 2 years ago
    
    Reply
    
    Anonymous
    
    >Less effort to learn to spell
    I have been told my entire life that I can't. I have been looking for a way to improve my spelling for a long time and the best I can do is flyspell in Emacs. I will still misspell some words so badly I end up using my (non-free) phone to spell it for me.
    In the past, I was dependent on Dragon NaturallySpeaking but now I only need my phone or search engine about every other sentence. Most normal gays think this is an acceptable way to live.
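
On the original question of how hard a speech-to-text program would be: the Python side is only a few lines. A minimal sketch, assuming Whisper and ffmpeg are installed; the helper name `transcribe_to_text` and the file name `recording.wav` are made up for illustration.

```python
def transcribe_to_text(model, audio_path):
    """Run model.transcribe on audio_path and return the stripped transcript text."""
    result = model.transcribe(audio_path)  # Whisper returns a dict with a "text" key
    return result["text"].strip()


# Usage with the real library (needs Whisper installed and ffmpeg on the PATH):
#   import whisper
#   model = whisper.load_model("base")  # smaller models are faster but less accurate
#   print(transcribe_to_text(model, "recording.wav"))
```

Taking the model as an argument means it is loaded once and reused for every file, which matters because loading is the slow part. Recording from a microphone would be a separate step that this sketch does not cover.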
2 years ago

Reply

Anonymous

Very excited for this. I have some archived videos and always forget where I heard something. With this I can probably create a transcript for each video next to it, and then index over all of them to create something searchable.
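
A rough sketch of that indexing idea, assuming each video already has a plain-text transcript saved as a `.txt` file in one directory. The function names `build_index` and `search` are made up, and this is a simple word-to-files inverted index, not a full search engine.

```python
import re
from collections import defaultdict
from pathlib import Path


def build_index(transcript_dir):
    """Map each lowercase word to the set of transcript file names containing it."""
    index = defaultdict(set)
    for path in Path(transcript_dir).glob("*.txt"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(path.name)
    return index


def search(index, query):
    """Return the sorted file names that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

This only tells you which file mentions the words. Finding the position inside the video would need the segment timestamps as well.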
2 years ago

Reply

Anonymous

It's pretty good from what I've tested
2 years ago

Reply

Anonymous

bretty neat, could be really cool to have autogenerated subtitles on every video/lyrics on all music/etc. because of a simple addition/plugin to various websites and programs.
2 years ago

Reply

Anonymous

There is GPU (CUDA) support, which is way faster. Too bad my 3070 can only handle models up to medium.
It does Japanese OK too.
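
For picking a model size that fits a given card, a sketch using the approximate VRAM figures from the Whisper README at release. Treat the numbers as rough guidance, not guarantees; the names `APPROX_VRAM_GB` and `largest_model_for` are made up.

```python
# Approximate VRAM needs in GB, as listed in the Whisper README at release.
APPROX_VRAM_GB = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def largest_model_for(vram_gb):
    """Return the largest model whose approximate requirement fits, or None."""
    fitting = [name for name, need in APPROX_VRAM_GB if need <= vram_gb]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    print(largest_model_for(8))  # an 8 GB card such as the 3070
```

With 8 GB this returns `medium`, which lines up with the 3070 report above.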
2 years ago

Reply

Anonymous

OP should test Whisper on those fast speaking commercials you see on TV or hear on radio. What about livestock auctioneers?
- 2 years ago
  
  Reply
  
  Anonymous
  
  I tried it with this:
2 years ago

Reply

Anonymous

.. does it work on porn?
- 2 years ago
  
  Reply
  
  Anonymous
  
  anything that has people speaking, yeah
  - 2 years ago
    
    Reply
    
    Anonymous
    
    i have several terabytes of porn to run this on now
    
    4090 it is i think
2 years ago

Reply

Anonymous

Doesn't work on RISC-V
2 years ago

Reply

Anonymous

How can we know it wasn't trained on that specific song?
2 years ago

Reply

Anonymous

Could it be used to autogenerate literal subtitles for anime?
>generate japanese subtitles
>substitute each japanese word with the closest english equivalent or an AI generated glyph
- 2 years ago
  
  Reply
  
  Anonymous
  
  yes, "--task translate" translates the speech into English, and the command-line tool writes out a subtitles file
- 2 years ago
  
  Reply
  
  Anonymous
  
  Yes it already can do that. It doesn't work like that though.
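
To make the "doesn't work like that" part concrete: the translate task produces English text for whole segments of speech, each with a start and end time, not a word-by-word substitution. A sketch of turning such segments into SRT subtitles, assuming each segment is a dict with `start`, `end` and `text` keys as in the result of Whisper's `transcribe`. The function names and the file name `episode.mkv` are made up.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn dicts with "start", "end" and "text" keys into SRT text."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)


# Usage with the real library (needs Whisper installed):
#   import whisper
#   model = whisper.load_model("medium")
#   result = model.transcribe("episode.mkv", task="translate")
#   print(segments_to_srt(result["segments"]))
```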
