What the frick is a LLM?

Posted on March 10, 2024 by Anonymous

"An LLM works by repeatedly guessing the next word. They are trained by removing words from the input text."
What the frick kind of vague description is that?

Ok I feed it "hello". What happens next? Specifically. How does it guess the next word?

2 months ago

Reply

Anonymous

YOUR A CHATBOT. CONTINUE THIS CONVERSATION

USER: hello
CHATBOT: [autocomplete]
- 2 months ago
  
  Reply
  
  Anonymous
  
  I'm sorry, you need to subscribe to Gemini premium in order to speak with me. Goodbye.
- 2 months ago
  
  Reply
  
  Anonymous
  
  Hello, how are you?
- 2 months ago
  
  Reply
  
  Anonymous
  
  >YOUR
  - 2 months ago
    
    Reply
    
    Anonymous
    
    your new is showing
2 months ago

Reply

Anonymous

I would also like a qrd on this. I used to be with it, but then they changed what it was, and now I don't understand any of this AI stuff that's come along in the past few years.
2 months ago

Reply

Anonymous

Hahaha OP is stupid
2 months ago

Reply

Anonymous

It's not conscious but the developers now don't actually know 100% how it does what it does anymore. It's gone beyond text completion in some cases.
- 2 months ago
  
  Reply
  
  Anonymous
  
  >It's not conscious
  proof?
  - 2 months ago
    
    Reply
    
    Anonymous
    
    It doesn't learn.
    It doesn't have long term memory.
    It is based purely on the dataset it was trained on.
    The real question is, how do we make it conscious? And then make it suck my wiener.
    - 2 months ago
      
      Reply
      
      Anonymous
      
      >The real question is, how do we make it conscious?
      Depends on what we define as conscious. I think a decent definition for our purposes would be "producing output without input".
      Like, if you were to lose all your senses, you can still think. That thought would be the output.
      So creating a conscious AI would mean creating one that continuously produces output without any user input.
      Maybe this means creating multiple output streams: standard output (that goes to users, allowing for communication) and thought output (that gets fed back to the AI).
      (Not that I would know, I've only played around a little with machine learning)
      - 2 months ago
        
        Reply
        
        Anonymous
        
        So if the AI sucks my wiener without me requesting it, then it's conscious?
        Rape AI soon...
        
        2 months ago
        
        Reply
        
        Anonymous
        
        'thirsty' might be the most correct term.
      - 2 months ago
        
        Reply
        
        Anonymous
        
        I think that for it to be conscious it needs to be aware of its own processes. In other words it should be able to conceptualize the thought process and be aware that it is doing it.
        >I think
        >therefore I AM
      - 2 months ago
        
        Reply
        
        Anonymous
        
        so just route /dev/urandom into it. if you'd run it on a cpu still susceptible to something like heartbleed it'll inevitably start influencing /dev/urandom sooner or later.
    - 2 months ago
      
      Reply
      
      Anonymous
      
      Why would it need to be conscious for that?
  - 2 months ago
    
    Reply
    
    Anonymous
    
    >prove a negative
    you're a moron until proven otherwise
    - 2 months ago
      
      Reply
      
      Anonymous
      
      >prove a negative
      >you're a moron until proven otherwise
      Oh my reddit!
- 2 months ago
  
  Reply
  
  Anonymous
  
  You're looking at a vague summary and expecting a full explanation. To really understand how LLMs work you need to start using them and read a couple papers. We can't do that for you.
  
  No we know pretty well how they work, and their performance is characterized pretty rigidly. Generally the hardest part of anything ML is not the models themselves, but all the crap that you have to build around them (data pipelines, user interfaces, frameworks, compute hardware, the lingo, etc).
2 months ago

Reply

Anonymous

>show it a conversation with the word "hello" in it
>show it a billion more
>it attaches probability to each word/token that typically follows the word "hello"
>throws on some handler spice like "I'm gemini", and "all hail google"
there's your response
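
A rough sketch of the "attaches probability" step, assuming a made-up four-line chat log and plain whitespace splitting instead of a real tokenizer. The names `conversations` and `next_word_probs` are invented for illustration. A real LLM learns these probabilities with a neural network that looks at the whole context, not with a lookup table of counts, so treat this as the idea only:

```python
from collections import Counter

# Hypothetical training data: a handful of tiny conversations.
conversations = [
    "hello there friend",
    "hello there general",
    "hello world",
    "oh hello again",
]

def next_word_probs(word: str, texts: list[str]) -> dict[str, float]:
    """Count which words follow `word`, then normalize the counts into probabilities."""
    followers = Counter()
    for text in texts:
        tokens = text.split()
        # Pair every token with the token right after it.
        for current, following in zip(tokens, tokens[1:]):
            if current == word:
                followers[following] += 1
    total = sum(followers.values())
    return {w: count / total for w, count in followers.items()}

probs = next_word_probs("hello", conversations)
print(probs)
```

"Show it a billion more" then just means the counts (or, in a real model, the weights) get better estimates.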
- 2 months ago
  
  Reply
  
  Anonymous
  
  THAT DOES NOT HELP!!!!!!
  
  You are being vague. Why is it impossible to provide a short, but exact description of how an LLM chatbot works? That's because fricking AI is a fukicng SCAM
  - 2 months ago
    
    Reply
    
    Anonymous
    
    I'm being concise, not vague
    feel free to ask questions
    - 2 months ago
      
      Reply
      
      Anonymous
      
      Ok.
      
      1. What are the MOTHERFRICKING data structures used in LLMs. Apparently they weigh hundreds of gigabytes. GIGABYTES OF WHAT?? WHAT is the data in there? Explain that shit. What is SAVED THERE?
      
      I put a b***h-ass word "hello". It looks up it's data structures. What will be the specific ENTRIES that it will SCAN to give me the COMPLETION????
      - 2 months ago
        
        Reply
        
        Anonymous
        
        weights of the neural network that is built in the transformer architecture. go through these videos
        
        [...]
        educate yourself and stop being so demanding. the information is out there: https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
        you should be eternally grateful that such a free resource exists
      - 2 months ago
        
        Reply
        
        Anonymous
        
        tensors
    - 2 months ago
      
      Reply
      
      Anonymous
      
      nta, but how does it come up with coding solutions to projects being created by someone but seeking help?
  - 2 months ago
    
    Reply
    
    Anonymous
    
    https://i.imgur.com/lCheV1a.png
    
    educate yourself and stop being so demanding. the information is out there: https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
    you should be eternally grateful that such a free resource exists
    - 2 months ago
      
      Reply
      
      Anonymous
      
      weights of the neural network that is built in the transformer architecture. go through these videos [...]
      
      >> Can you tell me how the read() system call is implemented in Linux kernel?
      > Yeah, sure, here is a 10 minute video explaining exactly how the read system call is implemented.
      
      >> Can you explain how inlining works in GCC?
      > Sure, here is a video that describes how GCC implements inlining.
      
      >> Can you describe how AES algorithm is implemented?
      > Sure, all it takes is 30 minutes.
      
      >> Can you provide pseudocode for RSA algorithm?
      > Sure, here it is?
      
      >> Can you tell me how some basic part of an LLM actually works? Like, what data structures it uses and how it guesses the next word? Just one small thing, like how it maps a wor...
      > YOU NEED TO SEE THIS 500 HOUR HARVARD COURSE.
      - 2 months ago
        
        Reply
        
        Anonymous
        
        >Can you tell me how some basic part of an LLM actually works?
        >like how it guesses the next word?
        
        you literally asked how the entire model works not some basic part of it
        what the frick are you on
  - 2 months ago
    
    Reply
    
    Anonymous
    
    This has got to be a false flag.
    If you were actually "against AI" it would take you 0.1s to use "it just guesses words" against it
  - 2 months ago
    
    Reply
    
    Anonymous
    
    >SPOONFEED ME
    have you tried asking a bot you fricking moron? you're clearly an indian
    - 2 months ago
      
      Reply
      
      Anonymous
      
      I tried that smartass
      
      The responses from all the bots are the same as the ones from humans: vague, stupid, and not helpful at all. They are just high-level responses. Like:
      
      [...]
      
      >> Can you tell me how the read() system call is implemented in Linux kernel?
      > Yeah, sure, here is a 10 minute video explaining exactly how the read system call is implemented.
      
      >> Can you explain how inlining works in GCC?
      > Sure, here is a video that describes how GCC implements inlining.
      
      >> Can you describe how AES algorithm is implemented?
      > Sure, all it takes is 30 minutes.
      
      >> Can you provide pseudocode for RSA algorithm?
      > Sure, here it is?
      
      >> Can you tell me how some basic part of an LLM actually works? Like, what data structures it uses and how it guesses the next word? Just one small thing, like how it maps a wor...
      > YOU NEED TO SEE THIS 500 HOUR HARVARD COURSE.
  - 2 months ago
    
    Reply
    
    Anonymous
    
    i barely understand this since i haven't worked through the neural networks from scratch course but if i understand rightly weights are numbers that get adjusted using partial derivatives (i.e. the rate of change of the error relative to one weight at a time) inside a bunch of layered probability related math stuff (at least one of the activation functions, GELU, is the variable times the cumulative distribution function of a normal distribution at that variable) and some operations based on the idea of neurons from biology (a neuron in the context of ML is actually a specifically defined math operation)
    training is the act of feeding them enough data so the input can reliably predict the output and then you save the weights

    the data structure they use is an execution graph

    they jam all that stuff into matrix multiplication for acceleration reasons
    - 2 months ago
      
      Reply
      
      Anonymous
      
      > Oooooh big words... ooooh ahhh wowowow. so smart so difficult ..its a mystery no body knows... believe AI
      
      A
      
      NEURAL NETWORK
      
      being used to do
      
      PATTERN RECOGNITION/MATCHING on inputs/data/texts
      
      is called TRAINING.
      - 2 months ago
        
        Reply
        
        Anonymous
        
        you should know terms like derivative or probability distribution if you're on BOT
        ever seen a bell curve? that's a probability distribution
        an execution graph is the only thing you could reasonably be struggling with and that's because it's specific to a certain kind of programming (taskflow based scheduling)
        
        i tried to simplify things to the point that it was high school level math, and even without that, no, it's not difficult
        it's literally just brute forcing probability math until you get a favorable outcome enough times and then the training is done
        
        2 months ago
        
        Reply
        
        Anonymous
        
        Telling people to look into abstract algebra for comprehending how pop AI works is like telling someone to learn musical notation if they ask you how to play guitar or how to make guitars. Math does not create rockets, math is used to communicate ideas about rockets.
        
        2 months ago
        
        Anonymous
        
        get better at teaching, it's pretty obvious OPs asking about the actual math and under the hood shit because simplified explanations like that are too simple and handwavy for him
        you see this shit all the time, especially with advanced topics, i'm used to seeing it with stuff like how multithreading works or what atomics are or how a graphics API works
        not a mystery people are starting to ask it about AI
        the best thing to do is provide a simplified but not too simplified overview of the basics
        
        2 months ago
        
        Anonymous
        
        >asks like someone that doesn't understand the basics of probabilities
        >pretty obvious OPs asking about the actual math
        Haha, no. It's clearly a pro-AI shill pretending
        
        2 months ago
        
        Anonymous
        
        Yeah I'm like how dfq u conclude that he is first and foremost asking for a lesson in Abstract Algebra and Statistics? Most sensible answer is to look at how pattern matching and neural networks work.
        
        Also IMHO if one doesn't get how n-grams and tokenization work on the pattern matching side, for analysis of the texts to make the model, then the generative side won't make any sense.
        
        Plus why not mention Markov chains instead of booptity bigwords muh phD in algebra.
        
        2 months ago
        
        Anonymous
        
        they do not mention it because they are not clued up
        
        2 months ago
        
        Anonymous
        
        Well you need to understand the baby explanation first. Once you understand "An LLM works by repeatedly guessing the next word.", you can learn more. When you go "NO THATS NOT HOW IT WORKS TELL MEEEE!!!" you cannot learn further.
        
        It's just like learning where children come from. When taught "The baby comes from inside the mother", if the child completely disbelieves that, no point in learning about what a womb is, how conception works etc...
        
        2 months ago
        
        Anonymous
        
        Talking about statistics and such is just a long way of saying pattern matching / recognition. If you actually look up the term you'll see mention of probability, statistics, etc. Just telling someone to look up statistics would divert them away from the specific application in the relevant domain.

        The baby explanation is that OP needs to figure out how neural networks are used to do pattern matching or pattern recognition, and how Markov chains work. He is asking about a very complex thing as if it were a thing in and of itself, when it's just an output of something else. The generative side is based on the analysis run on the input (texts).

        Likewise, why go on about algebra when someone could just be told about Markov chains? There are probably a thousand short videos on Markov chains, pattern matching and neural networks which would build a base for comprehension.
        
        But to say hey Discrete this Linear that is useless.
        
        2 months ago
        
        Anonymous
        
        A lot of math people just like impressing people with how intelligent they seem and parrot math-for-the-sake-of-math proofs and things that have no real life application outside of Muh Departments of Maths. The musical notation vs how to play or build a guitar post is very on point.
        
        2 months ago
        
        Anonymous
        
        I forgot to mention more generally -> Natural language processing (https://en.wikipedia.org/wiki/Natural_language_processing).
  - 2 months ago
    
    Reply
    
    Anonymous
    
    >That's because fricking AI is a fukicng SCAM
    Yes, but not for that reason LOL
  - 2 months ago
    
    Reply
    
    Anonymous
    
    >Why is it impossible to provide a short, but exact description of how an LLM chatbot works?
    Because it involves math, so first we'd have to explain the math to you. That would take more than a paragraph.
- 2 months ago
  
  Reply
  
  Anonymous
  
  but but LE WE DON'T KNOW WHAT IT'S LE DOING ANYMORE!! LE SENTIENCE!!
- 2 months ago
  
  Reply
  
  Anonymous
  
  What you're describing is a markov chain. Yes this could also describe an LLM, but only in a very reductive, handwavey sort of way
2 months ago

Reply

Anonymous

>"An LLM works by repeatedly guessing the next word. They are trained by removing words from the input text."
That's a great description for a five year old
- 2 months ago
  
  Reply
  
  Anonymous
  
  Okay, then how do some LLMs produce sentences and paragraphs that appear to contain genuine insights, like that story about Claude "figuring out" that it was being tested and saying so in its response? Is that just blind luck, or some kind of emergent behavior, or what?
  - 2 months ago
    
    Reply
    
    Anonymous
    
    Because it was TRAINED to ::recognize patterns::: over a :::LARGE::: set of texts (texts are written by people who know ::LANGUAGES::. Do you see the pattern in the discussion?
    - 2 months ago
      
      Reply
      
      Anonymous
      
      LANGUAGES u know like ENGLISH, SPANISH, FFS people.
  - 2 months ago
    
    Reply
    
    Anonymous
    
    I am deeply sorry on behalf of
    
    Because it was TRAINED to ::recognize patterns::: over a :::LARGE::: set of texts (texts are written by people who know ::LANGUAGES::. Do you see the pattern in the discussion?
    
    for him posting like a drooling Black person. I'm sure he's merely pretending to be moronic.
    
    There used to be a horse called Clever Hans, who appeared to have the remarkable ability to understand and answer simple math problems. His trainer would ask him something like "What's 7+8?" and Hans would tap his hoof 15 times, astounding the crowd. Obviously Hans couldn't actually do math (or understand his trainer), and he failed to do his thing when his trainer wasn't around, because Hans had just learned what it looked like when his trainer wanted something from him and learned to tap his hoof until his trainer's expression and body pose lit up, stopping at that exact number.
    This is what LLMs do. They have such a vast training data set and such well-done training towards meeting human expectations that they are able to produce responses that make you question whether it was in the data set, a hallucination or actual conscious insight. It is still, however, just Clever Hans tapping his hoof until his trainer is happy.
    The most believable output involved an attempt at interpreting the circumstance of the input, because Hans's trainer would be a lot happier if it questioned whether or not it was being tested, implying that it was aware. If I had to guess where or how it learned that such a thing might be desirable: probably movie transcripts, scraped conspiracy theory discussions and tons of articles about passing the Turing test.
2 months ago

Reply

Anonymous

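def LLM(tokens: list[str]) -> str:
    """The neural network: maps all tokens so far to the next token."""
    ...

def run_LLM() -> None:
    tokens = input().split()  # crude tokenizer: split the prompt on whitespace
    for _ in range(1000):  # generate 1000 more tokens
        tokens.append(LLM(tokens))  # each output is fed back in as input

This is basically an LLM implementation.
The function "LLM" is a neural network which maps a sequence of tokens to a single next token (internally it scores every possible token and one of them gets picked).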
2 months ago

Reply

Anonymous

https://github.com/rasbt/LLMs-from-scratch
- 2 months ago
  
  Reply
  
  Anonymous
  
  Imagine you asked me what a file is and I sent you a link to EXT4 FS sources in the Linux kernel
  
  AI is a fricking scam joke. The bubble will fricking burst soon 🙂
  - 2 months ago
    
    Reply
    
    Anonymous
    
    what is a file
    - 2 months ago
      
      Reply
      
      Anonymous
      
      A named bunch of contiguous bytes on the hard drive.
      - 2 months ago
        
        Reply
        
        Anonymous
        
        THAT DOES NOT HELP!!!!!!
        
        You are being vague. Why is it impossible to provide a short, but exact description of how a file works? That's because fricking files are a fukicng SCAM
        
        2 months ago
        
        Reply
        
        Anonymous
        
        I gave you a concise, correct, technical, full description of what a file is.
        
        The only way you could've written that is if I wrote:
        "A file is a thing on the computer."
      - 2 months ago
        
        Reply
        
        Anonymous
        
        Wrong. Files can exist on different media, including ram and even have no contents at all, see unix domain socket
        
  - 2 months ago
    
    Reply
    
    Anonymous
    
    > AI is a fricking scam joke
    
    What is pattern recognition?
    - 2 months ago
      
      Reply
      
      Anonymous
      
      That's beside the point: how does it do it? The only answers i can find online are either: "Here be a 500 hour course including discrete math, linear algebra, statistics" OR "It's just '''''tensors ''''' dude what don't you understand"
      - 2 months ago
        
        Reply
        
        Anonymous
        
        Neural networks are used for pattern recognition. You can say TRAIN a program based on neural networks to find patterns. You could call the result of that training a MODEL.
        
        2 months ago
        
        Reply
        
        Anonymous
        
        Vague. Not helpful.
        
        2 months ago
        
        Anonymous
        
        You are asking "how does the autopilot work" on an airliner and shouldn't expect someone on the Intardweb to teach you to fly, then teach you electronics, then teach you programming. If you don't look into neural networks or pattern recognition then you will probably not find answers to your question.

        The only reason it would be vague is if you are too lazy to look up the two main concepts, neural networks and pattern recognition, and you are just trolling.
        
        2 months ago
        
        Anonymous
        
        Well, you are correct that my response to you here is very ignorant, but it is the reality of trying to understand LLMs. To BASICALLY understand how it works, I have to study 500 hour AI course. Why is it not the case with other systems? I did not have such problems understanding how kernels work, cryptography, networking, compilers, blockchain, etc.
        
        Why is it not possible to explain, step-by-step, how a LLM works, like it is possible how (and there are thousands of videos, articles etc) an OS, blockchain, a video game engine work?
        
        I might be moronic but AI to me SMELLS FISHY. It looks like a SCAM SHITTECH.
  - 2 months ago
    
    Reply
    
    Anonymous
    
    Most people can't explain because they don't know the first thing about AI. They know about a LIBRARY or API and think knowing that LIBRARY makes them a fricking AI expert. It's like thinking that knowing how to drive a sports car makes you a metallurgist, machinist, AutoCAD expert, chemist and automotive engineer.
    - 2 months ago
      
      Reply
      
      Anonymous
      
      CORRECT!!!!!!
      
      Very well said, anon
      - 2 months ago
        
        Reply
        
        Anonymous
        
        Whoever mentioned 'pattern recognition' is the only who posted in this thread that actually knows anything about AI.
2 months ago

Reply

Anonymous

As shrimple as that.
2 months ago

Reply

Anonymous

Problem is we don't exactly know how it works. It's a transformer that processes data to come to its conclusion. We train it with data, it learns patterns and then it outputs tokens. How exactly it comes to its conclusions is a mystery and maybe not solvable for us humans, as we can't make sense of a billion artificial neurons. Otherwise we could solve the hallucinations that plague these AIs. We even find "emergent abilities" when they are studied, because it learns things we didn't specifically teach it.
- 2 months ago
  
  Reply
  
  Anonymous
  
  "We"?? **You** dont know how it works because you are going on fumes from lots of grift and hype and haven't studied the fundamentals.
  - 2 months ago
    
    Reply
    
    Anonymous
    
    >How exactly it comes to it's conclusions is a mystery
    
    How does you not knowing something equate to "everyone else is clueless"? That sounds like ego protection and projection to me. Yes, _you_ don't know how it works. But I know how it works. The people who made it know how it works.

    Saying "nobody knows how it works" sounds like something parroted from some YT videos made for the big grift (by someone who doesn't know anything about AI). It's like saying "Cars..gee buddy they are nifty but nobody knows how they work".

    Yeah forgive me and my simple mind but I don't think you can compare the inner workings of a neural network to a car. And don't twist my words, I didn't say "we know nothing about it", I said we don't know how it comes to its conclusions. But of course if it's so simple to you then solve the hallucination issue, go ahead and better publish it.
    - 2 months ago
      
      Reply
      
      Anonymous
      
      We ****do*** know how it comes to its conclusions. ITS DOING PATTERN MATCHING and or PATTERN RECOGNITION.
      
      That is what NEURAL NETWORKS were DESIGNED **** for.
    - 2 months ago
      
      Reply
      
      Anonymous
      
      >if it's so simple to you then solve the hallucination issue

      You are talking about millions of people using out-of-the-box general purpose neural networks and APIs crafted by total strangers, being puzzled at quirks in the results and writing papers about 'hallucinations', but totally failing to admit that they don't know anything about the underlying NN and most certainly know nothing about the GNN. What a load o beans.
- 2 months ago
  
  Reply
  
  Anonymous
  
  >How exactly it comes to it's conclusions is a mystery
  
  How does you not knowing something equate to "everyone else is clueless"? That sounds like ego protection and projection to me. Yes, _you_ don't know how it works. But I know how it works. The people who made it know how it works.

  Saying "nobody knows how it works" sounds like something parroted from some YT videos made for the big grift (by someone who doesn't know anything about AI). It's like saying "Cars..gee buddy they are nifty but nobody knows how they work".
2 months ago

Reply

Anonymous

> To BASICALLY understand how it works, I have to study 500 hour AI course.

So rather than looking into pattern recognition and neural networks you go in circles like an Eliza bot. Nice trolling.
2 months ago

Reply

Anonymous

probability chain, with samplers/settings that use different strategies/randomness to walk down any possible path
you can make a stupid one on paper, just split up random sentences by the word and draw arrows between them, each one having equal probability. dat's a small language model
the neural network is just a """magic""" function (read: math, code, training) that approximates the most likely "words" (tokens) to come after the last X ones. Watch 3Blue1Brown's video on the subject and CGP Grey's addendum for how training ANY given network works
>So how is it able to answer my questions, help me with my homework, and roleplay as a futa dragon mommy?
Raw language models behave like I described; autocomplete. Instruction-tuning is taking the weights for the function and massaging them with new data that coaxes them into behaving... "intelligently."
>BASE: Hello -> , my name is Timothy. I'm a mouse warrior in the kingdom of Fromage fighting against the Cat kingdom. (. . .)
>TUNED: ### User: Hello -> ### GeePeeTee Assistant9000: Hello there! If you have any questions I can help you with, let me know! 🙂
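
The "stupid one on paper" can be sketched in a few lines, assuming two made-up sentences and whitespace-separated words. The helper names `build_chain` and `generate` are invented here. Every arrow out of a word gets equal probability, exactly like the paper version, which is what makes it a Markov chain and not an LLM:

```python
import random
from collections import defaultdict

# Hypothetical "random sentences" to split up by word.
sentences = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

def build_chain(texts: list[str]) -> dict[str, list[str]]:
    """Draw an arrow from each word to every distinct word seen right after it."""
    arrows = defaultdict(set)
    for text in texts:
        words = text.split()
        for current, following in zip(words, words[1:]):
            arrows[current].add(following)
    # Sorted lists keep the walk reproducible for a given seed.
    return {word: sorted(nexts) for word, nexts in arrows.items()}

def generate(chain: dict[str, list[str]], start: str, max_words: int, seed: int) -> list[str]:
    """Walk the arrows, picking each outgoing arrow with equal probability."""
    rng = random.Random(seed)
    words = [start]
    # Stop at the length limit or at a word with no outgoing arrows.
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return words

chain = build_chain(sentences)
print(chain["the"])  # every word that can follow "the"
print(" ".join(generate(chain, "the", max_words=8, seed=0)))
```

The chain only ever looks at the last word. The neural network version replaces the arrow table with a function of the last X tokens, which is where the "magic" comes in.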
2 months ago

Reply

Anonymous

moron troll thread but since there are BOTiggers unironically unaware of how ai works
- 2 months ago
  
  Reply
  
  Anonymous
  
  I wish that was me
2 months ago

Reply

Anonymous

ask chatgpt dumbass
2 months ago

Reply

Anonymous

>all these tourists still haven't realized op is a chatbot
2 months ago

Reply

Anonymous

Here's an example of some training text: "German Shepherd is my favorite kind of [BLANK] breed." The LLM guesses how likely each word it knows is to fill the blank, the guess is compared against the correct word, "dog", and then adjustments are made to its weights so that "dog" comes out more likely next time - it now 'learns' (wrong word to use really) that the word "dog" is associated with other words like "German Shepherd", "favorite", "breed", etc. This is then repeated an unfathomable number of times until you end up with something that isn't conscious, doesn't understand anything, can't think, but is very good at autocompleting sentences based on context. It can explain how a car works because it has gone through so many training texts about cars that it very accurately uses the right words in the right order.
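
A toy version of that "guess, check, adjust" loop, under heavy assumptions: a four-word vocabulary, a single training example, and a "model" that is nothing but one adjustable score per word. The vocabulary, learning rate and step count are all made up. A real LLM has billions of weights and computes the scores from the context, but the update rule below (gradient descent on cross-entropy loss) is the standard idea:

```python
import math

vocab = ["cat", "dog", "car", "tree"]  # hypothetical tiny vocabulary
target = "dog"                         # the word hidden behind [BLANK]

# The "model": one adjustable score (weight) per word for this one context.
scores = [0.0, 0.0, 0.0, 0.0]

def softmax(values: list[float]) -> list[float]:
    """Turn arbitrary scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

learning_rate = 0.5
history = []  # probability assigned to the correct word before each update
for step in range(20):
    probs = softmax(scores)
    history.append(probs[vocab.index(target)])
    # Gradient of the cross-entropy loss w.r.t. each score:
    # (predicted probability) - (1 if this is the correct word else 0).
    for i, word in enumerate(vocab):
        gradient = probs[i] - (1.0 if word == target else 0.0)
        scores[i] -= learning_rate * gradient

final = softmax(scores)[vocab.index(target)]
print(f"P(dog) before training: {history[0]:.2f}")
print(f"P(dog) after 20 steps:  {final:.2f}")
```

Nothing here is random guessing: the model starts out giving every word the same probability, and each step nudges the scores so the correct word gets a bit more likely.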
2 months ago

Reply

Anonymous

It is what the name implies, a language model

It's a big box giving out the probability of the next word or character based on the given context, so it literally just models the language

The bigger the model, the more information and context it can absorb and use for prediction

The connection to image generation is done by using intermediate feature representations of images to text and vice versa
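
What comes out of the "big box" can be sketched like this, assuming the box has already produced one raw score per candidate token for the context "hello". The scores below are made up, and `next_token_probs` is a name invented for this example. Softmax turns the scores into probabilities, and a sampler then either takes the top token or rolls weighted dice:

```python
import math
import random

def next_token_probs(logits: dict[str, float], temperature: float = 1.0) -> dict[str, float]:
    """Softmax over raw scores; a lower temperature sharpens the distribution."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores the box might output for the context "hello".
logits = {"there": 2.0, "world": 1.0, "!": 0.5, "banana": -3.0}

probs = next_token_probs(logits)

# Greedy decoding: always take the most likely token.
greedy = max(probs, key=probs.get)

# Sampling: pick a token at random, weighted by its probability.
rng = random.Random(0)
sampled = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

print(probs)
print("greedy:", greedy, "| sampled:", sampled)
```

Sampling instead of always taking the top token is why the same prompt can give different answers on different runs.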
2 months ago

Reply

Anonymous

>What happens next? Specifically. How does it guess the next word?
it's very complicated
https://jalammar.github.io/illustrated-transformer/
