What the frick is a LLM?
"An LLM works by repeatedly guessing the next word. They are trained by removing words from the input text."
What the frick kind of vague description is that?
Ok I feed it "hello". What happens next? Specifically. How does it guess the next word?
YOUR A CHATBOT. CONTINUE THIS CONVERSATION
USER: hello
CHATBOT: [autocomplete]
I'm sorry, you need to subscribe to Gemini premium in order to speak with me. Goodbye.
Hello, how are you?
>YOUR
your new is showing
I would also like a qrd on this. I used to be with it, but then they changed what it was, and now I don't understand any of this AI stuff that's come along in the past few years.
Hahaha OP is stupid
It's not conscious but the developers now don't actually know 100% how it does what it does anymore. It's gone beyond text completion in some cases.
>It's not conscious
proof?
It doesn't learn.
It doesn't have long term memory.
It is based purely on the dataset it was trained on.
The real question is, how do we make it conscious? And then make it suck my wiener.
>The real question is, how do we make it conscious?
Depends on what we define as conscious. I think a decent definition for our purposes would be "producing output without input".
Like, if you were to lose all your senses, you can still think. That thought would be the output.
So creating a conscious AI would be to create one that continuously produces output without any user input.
Maybe this means creating multiple output streams: standard output (that goes to users, allowing for communication) and thought output (that gets fed back to the AI).
(Not that I would know, I've only played around a little with machine learning)
So if the AI sucks my wiener without me requesting it, then it's conscious?
Rape AI soon...
'thirsty' might be the most correct term.
I think that for it to be conscious it needs to be aware of its own processes. In other words it should be able to conceptualize the thought process and be aware that it is doing it.
>I think
>therefore I AM
so just route /dev/urandom into it. if you'd run it on a cpu still susceptible to something like heartbleed it'll inevitably start influencing /dev/urandom sooner or later.
Why would it need to be conscious for that?
>prove a negative
you're a moron until proven otherwise
a negative
>you're a moron until proven otherwise
Oh my reddit!
You're looking at a vague summary and expecting a full explanation. To really understand how LLMs work you need to start using them and read a couple papers. We can't do that for you.
No we know pretty well how they work, and their performance is characterized pretty rigidly. Generally the hardest part of anything ML is not the models themselves, but all the crap that you have to build around them (data pipelines, user interfaces, frameworks, compute hardware, the lingo, etc).
>show it a conversation with the word "hello" in it
>show it a billion more
>it attaches probability to each word/token that typically follows the word "hello"
>throws on some handler spice like "I'm gemini", and "all hail google"
there's your response
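Rough sketch of the "attach a probability to each word that follows hello" step, assuming a tiny made-up corpus and plain word counting. A real LLM does not keep a lookup table like this (it computes the probabilities with a neural network), so treat `next_word_probs` and the corpus as hypothetical illustrations only.

```python
from collections import Counter, defaultdict

def next_word_probs(corpus: list[str]) -> dict[str, dict[str, float]]:
    """Count which word follows which, then turn the counts into probabilities."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for line in corpus:
        words = line.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {w: c / sum(followers.values()) for w, c in followers.items()}
        for prev, followers in counts.items()
    }

# Made-up "conversations" standing in for the billion real ones.
corpus = [
    "hello there friend",
    "hello there stranger",
    "hello world",
    "hello how are you",
]
probs = next_word_probs(corpus)
print(probs["hello"])  # "there" follows "hello" in 2 of the 4 lines
```

Picking the next word is then just sampling from `probs["hello"]`.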
THAT DOES NOT HELP!!!!!!
You are being vague. Why is it impossible to provide a short, but exact description of how an LLM chatbot works? That's because fricking AI is a fukicng SCAM
I'm being concise, not vague
feel free to ask questions
Ok.
1. What are the MOTHERFRICKING data structures used in LLMs. Apparently they weigh hundreds of gigabytes. GIGABYTES OF WHAT?? WHAT is the data in there? Explain that shit. What is SAVED THERE?
I put a b***h-ass word "hello". It looks up its data structures. What will be the specific ENTRIES that it will SCAN to give me the COMPLETION????
weights of the neural network that is built in the transformer architecture. go through these videos
tensors
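To put numbers on "gigabytes of WHAT": a tensor is just an n-dimensional array of numbers, and the file is basically those numbers written out. Back-of-the-envelope arithmetic below, assuming hypothetical parameter counts (not any specific product) and the usual 4 bytes per float32 or 2 bytes per float16. As far as I understand it, nothing gets "scanned" or looked up per word; in a plain dense model every one of these numbers takes part in the arithmetic for every token.

```python
def model_size_gb(n_params: float, bytes_per_param: int) -> float:
    """Storage needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

# Hypothetical parameter counts, chosen only for illustration.
for n_params in (7e9, 70e9):
    for dtype, nbytes in (("float32", 4), ("float16", 2)):
        size = model_size_gb(n_params, nbytes)
        print(f"{n_params / 1e9:.0f}B params as {dtype}: {size:.0f} GB")
```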
nta, but how does it come up with coding solutions to projects being created by someone but seeking help?
educate yourself and stop being so demanding. the information is out there: https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
you should be eternally grateful that such a free resource exists
>> Can you tell me how the read() system call is implemented in Linux kernel?
> Yeah, sure, here is a 10 minute video explaining exactly how the read system call is implemented.
>> Can you explain how inlining works in GCC?
> Sure, here is a video that describes how GCC implements inlining.
>> Can you describe how AES algorithm is implemented?
> Sure, all it takes is 30 minutes.
>> Can you provide pseudocode for RSA algorithm?
> Sure, here it is?
>> Can you tell me how some basic part of an LLM actually works? Like, what data structures it uses and how it guesses the next word? Just one small thing, like how it maps a wor...
> YOU NEED TO SEE THIS 500 HOUR HARVARD COURSE.
>Can you tell me how some basic part of an LLM actually works?
>like how it guesses the next word?
you literally asked how the entire model works not some basic part of it
what the frick are you on
This has got to be a false flag.
If you were actually "against AI" it would take you 0.1s to use "it just guesses words" against it
>SPOONFEED ME
have you tried asking a bot you fricking moron? you're clearly an indian
I tried that smartass
The responses from all the bots are the same as the ones from humans, that is, vague and stupid, and not helpful at all. They are just high-level responses. Like:
i barely understand this since i haven't worked through the neural networks from scratch course, but if i understand rightly, weights are numbers that get adjusted using partial derivatives (i.e. the rate of change of the error relative to one variable) of a bunch of layered probability-related math stuff (at least one of the activation functions, for example, is the math function for a probability distribution of a variable times that variable) and some operations based on the idea of neurons from biology (a neuron in the context of ML is actually a specifically defined math operation)
training is the act of feeding them enough data so the input can reliably predict the output and then you save the weights
the data structure they use is an execution graph
they jam all that stuff into matrix multiplication for acceleration reasons
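A minimal sketch of the "neurons jammed into matrix multiplication" part, in plain Python so nothing is hidden in a library. Assumptions: the weights are made-up numbers, the activation is GELU (my guess at the "probability distribution of a variable times that variable" function mentioned above), and attention, which real transformers also have, is left out entirely.

```python
import math

def gelu(x: float) -> float:
    """GELU activation: x times the standard normal CDF of x."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def matvec(weights: list[list[float]], x: list[float]) -> list[float]:
    """Matrix-vector product: one output number per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def layer(weights: list[list[float]], x: list[float]) -> list[float]:
    """One 'layer of neurons': a matrix multiplication followed by an activation."""
    return [gelu(v) for v in matvec(weights, x)]

# Made-up weights; in a trained model these are the saved numbers.
W1 = [[0.5, -1.0], [1.5, 0.25], [-0.5, 0.5]]  # 2 inputs -> 3 hidden values
W2 = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]      # 3 hidden values -> 2 scores

x = [1.0, 2.0]
hidden = layer(W1, x)
scores = matvec(W2, hidden)
print(hidden, scores)
```

Stack enough of these and train the numbers in `W1`, `W2`, and so on, and the final scores become "how likely is each possible next token".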
> Oooooh big words... ooooh ahhh wowowow. so smart so difficult ..its a mystery no body knows... believe AI
A
NEURAL NETWORK
being used to do
PATTERN RECOGNITION/MATCHING on inputs/data/texts
is called TRAINING.
you should know terms like derivative or probability distribution if you're on BOT
ever seen a bell curve? that's a probability distribution
an execution graph is the only thing you could reasonably be struggling with and that's because it's specific to a certain kind of programming (taskflow based scheduling)
i tried to simplify things to the point that it was high school level math, and even without that, no, it's not difficult
it's literally just brute forcing probability math until you get a favorable outcome enough times and then the training is done
Telling people to look into abstract algebra for comprehending how pop AI works is like telling someone to learn musical notation if they ask you how to play guitar or how to make guitars. Math does not create rockets, math is used to communicate ideas about rockets.
get better at teaching, it's pretty obvious OPs asking about the actual math and under the hood shit because simplified explanations like that are too simple and handwavy for him
you see this shit all the time, especially with advanced topics, i'm used to seeing it with stuff like how multithreading works or what atomics are or how a graphics API works
not a mystery people are starting to ask it about AI
the best thing to do is provide a simplified but not too simplified overview of the basics
>asks like someone that doesn't understand the basics of probabilities
>pretty obvious OPs asking about the actual math
Haha, no. It's clearly a pro-AI shill pretending
Yeah I'm like how dfq u conclude that he is first and foremost asking for a lesson in Abstract Algebra and Statistics? Most sensible answer is to look at how pattern matching and neural networks work.
Also IMHO if one doesn't get how n-grams and tokenization work on the pattern matching side for analysis of the texts to make the model then the generative side won't make any sense.
Plus why not mention Markov chains instead of booptity bigwords muh phD in algebra.
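For anyone who hasn't seen tokenization and n-grams, here is a small sketch, assuming a crude whitespace tokenizer (real LLMs generally use subword tokenizers, so `tokenize` here is a simplification and both function names are mine).

```python
def tokenize(text: str) -> list[str]:
    """Crude word-level tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()

def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Every run of n consecutive tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = tokenize("The cat sat on the mat")
print(ngrams(tokens, 2))  # bigrams: pairs of neighbouring words
print(ngrams(tokens, 3))  # trigrams: triples of neighbouring words
```

Counting how often each n-gram shows up in a big pile of text is the analysis side; a Markov chain then uses those counts to pick the next word.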
they do not mention it because they are not clued up
Well you need to understand the baby explanation first. Once you understand "An LLM works by repeatedly guessing the next word.", you can learn more. When you go "NO THATS NOT HOW IT WORKS TELL MEEEE!!!" you cannot learn further.
It's just like learning where children come from. When taught "The baby comes from inside the mother", if the child completely disbelieves that, no point in learning about what a womb is, how conception works etc...
Talking about statistics and such is just a long way of saying pattern matching / recognition. If you actually look up the term you'll see mention of probability, statistics, etc. Just telling someone to look up statistics would divert them away from the specific application in the relevant domain.
The baby explanation is OP needs to figure out how neural networks are used to do pattern matching or pattern recognition and how Markov chains work. He is asking about a very complex thing as if it is a thing in and of itself, when it's just an output of something else. The generative side is based on the analysis run on the input (texts).
Likewise why go on about algebra when someone could just be told about Markov chains. The idea is that there are probably a thousand short videos on Markov chains, pattern matching and neural networks which would build a base for comprehension.
But to say hey Discrete this Linear that is useless.
A lot of math people just like impressing people with how intelligent they seem and parrot math-for-the-sake-of-math proofs and things that have no real life application outside of Muh Departments of Maths. The musical notation vs how to play or build a guitar post is very on point.
I forgot to mention more generally -> Natural language processing (https://en.wikipedia.org/wiki/Natural_language_processing).
>That's because fricking AI is a fukicng SCAM
Yes, but not for that reason LOL
>Why is it impossible to provide a short, but exact description of how an LLM chatbot works?
Because it involves math, so first we'd have to explain the math to you. That would take more than a paragraph.
but but LE WE DON'T KNOW WHAT IT'S LE DOING ANYMORE!! LE SENTIENCE!!
What you're describing is a markov chain. Yes this could also describe an LLM, but only in a very reductive, handwavey sort of way
>"An LLM works by repeatedly guessing the next word. They are trained by removing words from the input text."
That's a great description for a five year old
Okay, then how do some LLMs produce sentences and paragraphs that appear to contain genuine insights, like that story about Claude "figuring out" that it was being tested and saying so in its response? Is that just blind luck, or some kind of emergent behavior, or what?
Because it was TRAINED to ::recognize patterns:: over a ::LARGE:: set of texts (texts are written by people who know ::LANGUAGES::). Do you see the pattern in the discussion?
LANGUAGES u know like ENGLISH, SPANISH, FFS people.
I am deeply sorry on behalf of
for him posting like a drooling Black person. I'm sure he's merely pretending to be moronic.
There used to be a horse called Clever Hans, who appeared to have the remarkable ability to understand and answer simple mathematical tasks. His trainer would ask him something like "What's 7+8?" and Hans would tap his hoof 15 times, astounding the crowd. Obviously Hans couldn't actually do math (or understand his trainer), and he failed to do his thing when his trainer wasn't around, because Hans had just learned what it looked like when his trainer wanted something from him: he tapped his hoof until his trainer's expression and body pose lit up, stopping at that exact number.
This is what LLMs do. They have such a vast training data set and such a well-done training towards meeting human expectations that they are able to produce responses that make you question whether it was in the data set, a hallucination or actual conscious insight. It is still, however, just Clever Hans tapping with his hoof until his trainer was happy.
The most believable output involved an attempt at interpreting the circumstance of the input, because Hans's trainer would be a lot happier if it questioned whether or not it was being tested, implying that it was aware. If I had to guess where or how it learned that such a thing might be desirable: probably movie transcripts, scraped conspiracy theory discussions and tons of articles about passing the Turing test.
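def LLM(tokens: list[str]) -> str:
    """Stand-in for the neural network: takes the tokens so far, returns one next token."""
    ...

def run_LLM():
    tokens = input().split()  # crude whitespace "tokenizer"
    for i in range(1000):  # generate 1000 more tokens, one at a time
        tokens.append(LLM(tokens))  # the new token becomes part of the next input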
This is basically an LLM implementation.
The function "LLM" is a neural network which maps a vector of tokens into a single token.
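If you want to actually run that loop, here is a sketch with the neural network swapped out for a fixed lookup table. Everything here is a stand-in I made up for illustration: `TOY_TABLE`, `toy_llm`, and the `<eos>` stop marker. A real model computes the next token from all the tokens so far, not just the last one.

```python
EOS = "<eos>"  # hypothetical end-of-sequence marker

# Stand-in for the neural network: a fixed table keyed on the last token.
TOY_TABLE = {"hello": "how", "how": "are", "are": "you", "you": EOS}

def toy_llm(tokens: list[str]) -> str:
    """Return the next token given the tokens so far (only the last one is used)."""
    last = tokens[-1] if tokens else ""
    return TOY_TABLE.get(last, EOS)

def generate(prompt: str, max_new_tokens: int = 1000) -> str:
    """Append one token at a time until the model emits EOS or the limit is hit."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        nxt = toy_llm(tokens)
        if nxt == EOS:
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("hello"))
```

The loop is the whole "repeatedly guessing the next word" part; all the difficulty lives inside the function that replaces `toy_llm`.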
https://github.com/rasbt/LLMs-from-scratch
Imagine you asked me what a file is and I sent you a link to EXT4 FS sources in the Linux kernel
AI is a fricking scam joke. The bubble will fricking burst soon 🙂
what is a file
A named bunch of contiguous bytes on the hard drive.
THAT DOES NOT HELP!!!!!!
You are being vague. Why is it impossible to provide a short, but exact description of how a file works? That's because fricking files are a fukicng SCAM
I gave you a concise, correct, technical, full description of what a file is.
The only way you could've written that is if I wrote:
"A file is a thing on the computer."
Wrong. Files can exist on different media, including ram and even have no contents at all, see unix domain socket
> AI is a fricking scam joke
What is pattern recognition?
That's beside the point: how does it do it? The only answers I can find online are either: "Here be a 500 hour course including discrete math, linear algebra, statistics" OR "It's just '''''tensors ''''' dude what don't you understand"
Neural networks are used for pattern recognition. You can say TRAIN a program based on neural networks to find patterns. You could call the result of that training a MODEL.
Vague. Not helpful.
You are asking "how does the autopilot work" on an airliner and shouldn't expect someone on the Intardweb to teach you to fly, then teach you electronics, then teach you programming. If you don't look into neural networks or pattern recognition then you will probably not find answers to your question.
The only reason it would be vague is if you are too lazy to look up the two main concepts: neural networks, pattern recognition and you are just trolling.
Well, you are correct that my response to you here is very ignorant, but it is the reality of trying to understand LLMs. To BASICALLY understand how it works, I have to study a 500 hour AI course. Why is it not the case with other systems? I did not have such problems understanding how kernels work, cryptography, networking, compilers, blockchain, etc.
Why is it not possible to explain, step by step, how an LLM works, the way it is possible for an OS, a blockchain, or a video game engine (there are thousands of videos, articles, etc. on those)?
I might be moronic but AI to me SMELLS FISHY. It looks like a SCAM SHITTECH.
Most people can't explain because they don't know the first thing about AI. They know about a LIBRARY or API and think knowing about that LIBRARY makes them a fricking AI expert. It's like saying that knowing how to drive a sports car makes you a metallurgist, machinist, AutoCAD expert, chemist and an automotive engineer.
CORRECT!!!!!!
Very well said, anon
Whoever mentioned 'pattern recognition' is the only one who posted in this thread that actually knows anything about AI.
As shrimple as that.
Problem is we don't exactly know how it works. It's a transformer that processes data to come to its conclusion. We train it with data, it learns patterns and then it outputs tokens. How exactly it comes to its conclusions is a mystery and maybe not solvable for us humans, as we can't make sense of a billion artificial neurons. Otherwise we could solve the hallucinations that plague these AIs. We even find "emergent abilities" when they are studied, because it learns things we didn't even teach it specifically.
"We"?? **You** don't know how it works because you are going on fumes from lots of grift and hype and haven't studied the fundamentals.
Yeah forgive me and my simple mind but I don't think you can compare the inner workings of a neural network to a car. And don't twist my words, I didn't say "we know nothing about it", I said we don't know how it comes to its conclusions. But of course if it's so simple to you then solve the hallucination issue, go ahead and better publish it.
We ***do*** know how it comes to its conclusions. IT'S DOING PATTERN MATCHING and/or PATTERN RECOGNITION.
That is what NEURAL NETWORKS were DESIGNED for.
>if it's so simple to you then solve the hallucination issue
You are talking about millions of people using out-of-the-box general purpose neural networks and APIs crafted by total strangers, being puzzled at quirks in the results and writing papers about 'hallucinations', but totally failing to admit that they don't know anything about the underlying NN and most certainly know nothing about the GNN. What a load o beans.
>How exactly it comes to its conclusions is a mystery
How does you not knowing something equate to "everyone else is clueless"? That sounds like ego protection and projection to me. Yes, _you_ don't know how it works. But I know how it works. The people who made it know how it works.
Saying "nobody knows how it works" sounds like something parroted from some YT videos made for the big grift (by someone who doesn't know anything about AI). It's like saying "Cars..gee buddy they are nifty but nobody knows how they work".
> To BASICALLY understand how it works, I have to study a 500 hour AI course.
So rather than looking into pattern recognition and neural networks you go in circles like an Eliza bot. Nice trolling.
probability chain, with samplers/settings that use different strategies/randomness to walk down any possible path
you can make a stupid one on paper, just split up random sentences by the word and draw arrows between them, each one having equal probability. dat's a small language model
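The paper version translates to code almost word for word. A sketch, assuming three made-up sentences; `build_chain` and `walk` are names I picked. Every arrow gets equal probability, so a word that follows twice simply has two arrows and is twice as likely to be picked.

```python
import random
from collections import defaultdict

def build_chain(sentences: list[str]) -> dict[str, list[str]]:
    """Split sentences into words and draw an arrow from each word to the next."""
    chain: dict[str, list[str]] = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            chain[a].append(b)  # one arrow per occurrence
    return chain

def walk(chain: dict[str, list[str]], start: str, max_words: int,
         rng: random.Random) -> list[str]:
    """Follow randomly chosen arrows until a dead end or the word limit."""
    out = [start]
    while len(out) < max_words and chain.get(out[-1]):
        out.append(rng.choice(chain[out[-1]]))
    return out

sentences = ["the cat sat", "the dog sat", "the dog ran"]
chain = build_chain(sentences)
print(" ".join(walk(chain, "the", 5, random.Random(0))))
```

The "samplers/settings" mentioned above are different rules for choosing among the arrows; `rng.choice` is the simplest possible one.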
the neural network is just a """magic""" function (read: math, code, training) that approximates the most likely "words" (tokens) for the last X ones. Watch 3Blue1Brown's video on the subject and CGP Grey's addendum for how training ANY given network works
>So how is it able to answer my questions, help me with my homework, and roleplay as a futa dragon mommy?
Raw language models behave like I described; autocomplete. Instruction-tuning is taking the weights for the function and massaging them with new data that coaxes them into behaving... "intelligently."
>BASE: Hello -> , my name is Timothy. I'm a mouse warrior in the kingdom of Fromage fighting against the Cat kingdom. (. . .)
>TUNED: ### User: Hello -> ### GeePeeTee Assistant9000: Hello there! If you have any questions I can help you with, let me know! 🙂
moron troll thread but since there are BOTiggers unironically unaware of how ai works
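To make the BASE vs TUNED example concrete: as far as I understand it, the "chat" is still one long piece of text being autocompleted, just wrapped in a template the tuned model has seen a lot of. A sketch, assuming the `### User:` / `### Assistant:` markers from the example above (every real model has its own format, and `format_chat` and `extract_reply` are hypothetical helpers).

```python
USER_TAG = "### User:"
ASSISTANT_TAG = "### Assistant:"

def format_chat(user_message: str) -> str:
    """Wrap the user's message so the text to autocomplete ends at the bot's turn."""
    return f"{USER_TAG} {user_message}\n{ASSISTANT_TAG}"

def extract_reply(completion: str) -> str:
    """Keep only the bot's turn: cut the completion where a new user turn starts."""
    return completion.split(USER_TAG)[0].strip()

prompt = format_chat("Hello")
print(prompt)

# Pretend this is what the model autocompleted after the prompt.
fake_completion = " Hello there! How can I help?\n### User: and then"
print(extract_reply(fake_completion))
```

The cut in `extract_reply` matters because a plain autocomplete will happily keep going and write the user's next line too.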
I wish that was me
ask chatgpt dumbass
>all these tourists still haven't realized op is a chatbot
Here's an example of some training text: "German Shepherd is my favorite kind of [BLANK] breed." The LLM then guesses until it finds the correct word, "dog", and then it makes some adjustments to how it is structured - it now 'learns' (wrong word to use really) that the word "dog" is associated with other words like "German Shepherd", "favorite", "breed", etc. This is then repeated an unfathomable number of times until you end up with something that isn't conscious, doesn't understand anything, can't think, but is very good at autocompleting sentences based on context. It can explain how a car works because it has gone through so many training texts about cars that it very accurately uses the right words in the right order.
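A sketch of the "makes some adjustments" step for that one blank. Small caveat to the description above: as far as I know the guessing is not blind trial and error. The model gives every candidate word a probability, and training nudges the numbers so the correct word gets a bit more probability each time. Assumptions here: a three-word vocabulary, one adjustable score per word instead of a real network, and plain gradient descent on the cross-entropy loss. `train_blank` is a made-up name.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Turn arbitrary scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_blank(vocab: list[str], correct_word: str,
                steps: int = 20, lr: float = 1.0) -> dict[str, float]:
    """Nudge one score per candidate word so the correct word becomes likely."""
    scores = [0.0] * len(vocab)  # the "weights" of this toy model
    target = vocab.index(correct_word)
    for _ in range(steps):
        probs = softmax(scores)
        for i in range(len(scores)):
            # Gradient of the cross-entropy loss with respect to score i.
            grad = probs[i] - (1.0 if i == target else 0.0)
            scores[i] -= lr * grad
    return dict(zip(vocab, softmax(scores)))

vocab = ["dog", "cat", "car"]
before = train_blank(vocab, "dog", steps=0)
after = train_blank(vocab, "dog", steps=20)
print(before["dog"], after["dog"])
```

In a real model the scores come out of billions of weights and the same nudge is pushed back through all of them, but the idea per blank is the same.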
It is what the name implies, a language model
It's a big box giving out the probability of the next word or character based on the given context, so it literally just models the language
The bigger the model the more information and context it can inherit and predict
The connection to image generation is done by using intermediate feature representations of images to text and vice versa
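On the "big box giving out the probability of the next word" point, here is a sketch of the very last step: turning the box's raw scores into probabilities and picking a word. Assumptions: the three-word vocabulary and the scores are made up, and `temperature` is the usual sampler knob where lower values make the top word more dominant.

```python
import math
import random

def softmax(scores: list[float], temperature: float = 1.0) -> list[float]:
    """Convert raw scores into probabilities; temperature rescales the scores first."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["dog", "cat", "car"]
scores = [2.0, 1.0, -1.0]  # made-up output of the "big box" for some context

probs = softmax(scores)                    # default temperature
sharp = softmax(scores, temperature=0.5)   # lower temperature, more confident

rng = random.Random(0)
word = rng.choices(vocab, weights=probs, k=1)[0]
print(probs, sharp, word)
```

Run the sampling line repeatedly and you get different words in proportion to `probs`, which is why the same prompt can give different answers.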
>What happens next? Specifically. How does it guess the next word?
it's very complicated
https://jalammar.github.io/illustrated-transformer/