If you work in code for part of your job, are you scared? It's getting significantly better at problem solving and interpreting code, the same way it's gotten better at artistic work.
Claude is much better than ChatGPT given the same parameters but I've only compared with ChatGPT 4 Turbo preview from last year.
Claude is more filtered than ChatGPT.
Claude is fricking good. Its outputs feel more human. Now that the latest version is confirmed by many people to BTFO GPT4, anything short of AGI from OpenAI will disappoint.
In my language, as in French, we use the same word for consciousness and conscience, but philosophers, literary works, etc. never use the meme invented form of the word; not even doctors do, even though the invented word is a technical term for them. It's just wrong.
Claude instantly picked every word usage right, and it feels like it speaks organic Romanian; meanwhile ChatGPT feels like it got trained on dumb redditors and constantly uses words wrongly.
Only problem with Claude is that if you lower the Top P (not temperature) even by a little it instantly becomes moronic and contradicts itself in the same sentence all the time. I don't think I've seen this behavior with ChatGPT.
>Only problem with Claude is that if you lower the Top P (not temperature) even by a little it instantly becomes moronic and contradicts itself in the same sentence all the time. I don't think I've seen this behavior with ChatGPT.
Sorry, can you elaborate? What does 'lower the top p' mean?
It's just a parameter, a core sampling feature of these LLMs.
As you know, they statistically complete the next word in a chain of words, but at each step they also draw from a cluster of likely candidate words; this setting shrinks that cluster.
https://arxiv.org/abs/1904.09751
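For the curious, here's a minimal numpy sketch of what nucleus (Top P) sampling does, using a toy 4-token vocabulary; the function name and numbers are mine, not from the paper:

```python
import numpy as np

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of highest-probability tokens whose
    cumulative mass reaches top_p, zero out the rest, renormalize."""
    order = np.argsort(probs)[::-1]           # tokens from most to least likely
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, top_p) + 1  # size of the nucleus
    filtered = np.zeros_like(probs)
    filtered[order[:cutoff]] = probs[order[:cutoff]]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.15, 0.05])
print(top_p_filter(probs, top_p=0.7))  # only the two most likely tokens survive
```

The lower you set top_p, the fewer candidate tokens survive the cut, which is probably why an aggressive value makes the output degenerate.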
How would one lower the Top P?
With the api.
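Concretely, it's just a field in the request body; here's a hypothetical sketch of the JSON you'd send to a messages-style chat endpoint (the model name and the other values are assumptions, not gospel):

```python
# Hypothetical request body for a messages-style chat endpoint;
# check the provider's API docs for the real field names and defaults.
import json

request = {
    "model": "claude-3-opus-20240229",
    "max_tokens": 1024,
    "temperature": 1.0,
    "top_p": 0.7,  # lowered from the usual default of 1.0
    "messages": [{"role": "user", "content": "Hello"}],
}
print(json.dumps(request, indent=2))
```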
Ah I understand, sorry I was having a brainlet moment.
There are huge archives of BOT posts going back years, and it's a very commonly referenced forum, so yes, probably.
I agree; unless GPT-5 really pulls something out of the hat they're kind of screwed. Everyone who uses paid-for LLMs does so on a monthly basis and people will switch if they see a better option, and there is finally some good competition going on. Seems like a healthy market.
But there's a downside to Claude: it can't research hot topics. It just stops and calls you a conspiratard, whereas ChatGPT will tell you everything about the topic and only at the end call you a conspiratard (but this is not always the case, only 30% of the time; the other 70% of the time ChatGPT refuses as well).
I've found it to be similarly restrictive in general, but not restrictive at all when dealing with 'hot topic' long-form PDFs like in this context.
Can it actually summarize a book properly?
ChatGPT only has an 8193-token window, so to speak, to analyze, out of which you have to subtract 1536 tokens because that's the max it can use for your answer, so it can only do a short summary either way.
This is why it always gives useless book summaries and seems to ignore 90% of the book.
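The arithmetic, using the numbers from the post above (the words-per-token ratio is a rough rule of thumb for English, not an exact figure):

```python
context_window = 8193   # total tokens the model can attend to
max_output = 1536       # reserved for the model's answer
prompt_budget = context_window - max_output
print(prompt_budget)                 # 6657 tokens left for the book text
print(round(prompt_budget * 0.75))   # ~4993 words: nowhere near a whole book
```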
Yes, it has really fantastic recall ability, as far as I can tell from personal experience and this:
https://www.anthropic.com/news/claude-3-family
>Claude 3 Opus not only achieved near-perfect recall, surpassing 99% accuracy, but in some cases, it even identified the limitations of the evaluation itself by recognizing that the "needle" sentence appeared to be artificially inserted into the original text by a human.
I heard people say that the needle-in-a-haystack test, or whatever its name was, is kind of bullshit. I only care about people's experience for now.
It's the barebones version of context testing. We've moved on to in-context reasoning tests now, and GPT-4 cannot even fully utilize 3k of context.
I've only been testing for a couple of days but it seems pretty good. It can reference pages and find sentences just fine. If you want me to test a specific pdf or something I can do that for you; I don't have much else on today.
Ask it to extract all the theory of mind stuff and tl;dr the explanation of why computers are not conscious:
https://libgen.is/search.php?req=god+being+bliss&lg_topic=libgen&open=0&view=simple&res=25&phrase=1&column=def
https://z-library.se/s/god%20being%20bliss
^ obviously epub/mobi would be better if it can take that as it's already properly formatted.
Then check this yourself; it's the only part of the book where it talks about this: https://www.youtube.com/watch?v=FIAZoGAufSc
It's a good thing to listen to and can help compare with what it said.
Okay I got it to work with that file finally after just converting it to a .txt.
>Extract all the theory of mind stuff and tl;dr the explanation of why computers are not conscious.
Here is a summary of the key points about the theory of mind and why computers are not conscious, according to the text:
The text argues that materialist/physicalist theories of mind are unable to adequately explain consciousness and intentionality. Some key issues:
1. The qualitative, subjective dimension of experience (qualia) cannot be reduced to or identified with physical brain states. There is an unbridgeable explanatory gap between the objective, quantitative facts of neuroscience and the irreducibly subjective, qualitative nature of consciousness.
2. Intentionality - the aboutness or directedness of mental states - cannot be accounted for in purely physical terms. Physical states and processes are not intrinsically about anything, whereas mental states are always directed at objects, meanings, or purposes. This points to an ontological difference between mind and matter.
3. The mind apprehends meanings, follows logical rules, and grasps abstract concepts and mathematical truths that are not reducible to physical properties or efficient causation. Reason has an irreducible semantic and logical dimension that mechanism cannot account for.
continued:
4. The unity and simplicity of consciousness - the single unified subjective perspective in which all experience is integrated - is incompatible with the composite, spatially extended nature of the brain and body.
Regarding computers, the text contends that computation alone can never generate genuine consciousness, understanding or intentionality. Key arguments:
- Computers merely manipulate syntactic elements according to formal rules. They do not understand the semantic content of the symbols they process. Meaning and intentionality are observer-relative - supplied by the human programmers and users. The computer itself is not conscious of any meanings.
- The brain may perform computations, but computation alone cannot explain the semantic and experiential dimensions of thought. Computation is not sufficient for mentality or consciousness. Something more is needed.
- Consciousness is a unified, subjective, intentional phenomenon, whereas computation is a mindless, objective, formal/syntactic process extended in space and time. The ontological gap between the two cannot be bridged by any arrangement of computational elements.
In summary, the irreducible "hard problem" of conscious experience, the intentionality of thought, the grasping of meanings and abstract truths, and the unified nature of subjectivity all point to an ontological distinction between mind and matter, meaning and mechanism, that physicalism and computationalism cannot overcome. Consciousness cannot be generated by or reduced to unconscious physical processes, however complex. The mental is not just the mechanical.
was afk, thank you for this anon, will check it out now
NTA but what do you think about it?
I think it proves that it did indeed find what I had in mind, but only like 1/7 of it; maybe if he asked it to elaborate it would've satisfied me.
so by 1/7th do you mean that it actually read the entire book and only garnered superficial knowledge or did it read 1/7th of the book and understood that part well?
Yes, it did find the info, at about 70% of the way into the book.
interesting. Thanks
How does it know how to make BOT posts and use slang properly? Are we being scanned and snapshotted?
>Wait, you're telling me that what I put on the internet can be seen by other people??
This is why all your posts should contain at least one instance of Black person, Black person.
Jew, israelite.
I am now remembering people typing Black person in the israelitegle captcha
I miss those days
oh no....
anyway, wake me up when claude 4 drops
Opus said:
Subahibi, also known as Wonderful Everyday, is a Japanese visual novel developed by KeroQ. It is a dark and psychological story that deals with disturbing themes and subject matter. The narrative is non-linear and follows multiple characters across different timelines.
The story revolves around a high school girl named Minakami Yuki who is bullied at school. It explores the psychological impact of bullying and abuse. As the plot progresses, it delves into surreal and metaphysical concepts, touching on topics like suicide, rape, incest, and extreme violence.
Subahibi is known for its complex narrative structure, philosophical themes, and graphic content. It presents a fragmented story that the reader must piece together, dealing with the subjective nature of reality and the human psyche.
Due to its extremely disturbing content and portrayal of traumatic experiences, Subahibi is a very niche title intended only for mature audiences. Many consider it an upsetting and challenging work that explores the darkest aspects of human nature and suffering. At the same time, it has also been praised in some circles for its psychological depth, intricate storytelling, and the way it grapples with heavy existential questions.
>across different timelines
>Minakami Yuki who is bullied at school
kek
Isn't subahibi a creator of fricked up VNs like Wonderful Everyday? Or am I going completely senile?
i'm trying to imagine subahibi written by Ryukishi but i don't think it'd be that different (besides the porn)
Now try something useful instead of a shitty 'novel' from a race that never produced anything worth recording
115thpbp
holy hell that is hilarious
what was the rest of the response
subahibi is wonderful everyday
laughable
Well, tranime friend, that's the problem with these models: they hallucinate a lot and are not experts in any field. There are specific things you can make good use of with these toys, and you should figure those out.
This guy reads a paper which by happenstance kind of showcases how to use them properly and for what stuff: https://www.youtube.com/watch?v=FTs35x-xUg4
You basically have to hold their hands a lot.
Or I could just input exactly the same text in Google and get 100% accurate information.
As long as hallucinations are a thing, LLMs are only good for replacing NPCs and nothing else.
Well if you want to google something yeah.
For example, I wouldn't use these things as search engines in the general sense, and if I want to learn a thing I'll pirate textbooks that have good reviews from humans.
>learn a thing
oh no no, i thought ai was supposed to replace us!
In sci-fi, yes. But this is just an LLM; you can use it to automate boring things and waste less time on annoying simple tasks that statistically quite often bother even experts.
I don't think it's possible to achieve consciousness on machines so I'd never make such stupid remarks.
>you can use it to automate boring things and waste less time with annoying simple things that statistically quite often bother even experts
so basically all the things only npcs do
Well,
“It is unworthy of excellent men to lose hours like slaves in the labour of calculation which could safely be relegated to anyone else if machines were used.
(Describing, in 1685, the value to astronomers of the hand-cranked calculating machine he had invented in 1673.)”
Claude has an IQ of 101, literally above average intelligence. It won't be long until it has an IQ that puts it in the 99th percentile, and it will make you all look like brainlets.
Tiny models trained on shit-tier consumer GPUs, running on that same computer, "scored" above 115 on IQ tests; it's not impressive or relevant at all.
Source? This test wasn't done with a model that was trained on answers.
Of course it was trained on answers; are you even aware how many forums, courses, and discussions there are on how to train for IQ tests?
There were countries whose medical entry exams were pure IQ tests, and every single person trained hard for them, and it worked.
https://www.maximumtruth.org/p/ais-ranked-by-iq-ai-passes-100-iq
The questions were transcribed by the person specifically for this test; it's not possible for any of the models to have been trained on them.
So you're saying it couldn't have picked up the patterns that people train for? Again, people can train to obtain top-1% scores in actual psychological tests; there's even an ethical concern with the people who craft those tests, as they too can always score at the top.
It does not impress me at all and it's a bullshit article most likely.
IQ scoring works by correct answers plus time to respond.
These models respond instantly, right or wrong, so all their IQ scores end up inflated.
Plus there are countless mathematical and visual tests that you cannot properly transform into formal text for it to even attempt.
>IQ scoring works by correct answers + time to respond.
No they dont, you absolute fricking spastic. I can tell you've never actually taken a real one.
Your facebook IQ tests are not real.
I did a MENSA one and they give you access to everything: your time, your answers, and what the correct ones were. You can investigate the difference between you and someone else who had the exact same right answers yet scored higher; it was the time-to-answer factor.
>I did a MENSA one and they give you access to everything: your time, your answers, and what the correct ones were. You can investigate the difference between you and someone else who had the exact same right answers yet scored higher; it was the time-to-answer factor.
I did a mensa one and this was certainly not the case, perhaps it's different in the UK where I am to wherever you are.
Grok is the worst pile of shit I've ever seen.
''Spices up'' its answers like a drooling moronic Californian video game localizer.
>Claude has an IQ
no it doesn't. autocomplete on steroids
>maximumtruth*com
It's probably https://arxiv.org/abs/1905.00537, not a psychological IQ test randomized for humans.
philosophy shouldn't be mixed with science
Science is a dumbed-down, incoherent mix of the philosophy of nature and only 2 of Aristotle's 4 causes, kept solely for practical needs. It has no other meaning or purpose; you just use it to, say, discover medicine that works.
I think the biggest problem with LLMs is that they can't "learn", aka update their weights automatically. Even with infinite context and gigabytes of "memory" it would still be a hackjob.
Scared? I am absolutely HORNY for Claude 3. The amount of damage Claude 2 caused to my brain by chronic cooming is immeasurable. Excited to see if the more improved version can top it.
I'm a software engineer and only like 10% of my day is actually writing code. Most of my day is me turning up to meetings and telling people why they're wrong. If AI can do that then great, they'll fire me and I can finally move on with my life.
>Most of my day is me turning up to meetings and telling people why they're wrong
lel based
Examples?
he's a passive aggressive redditor who thinks he's le badass for making snarky comments
>the same way it's gotten better at artistic work.
do you realize that AI does nothing? it's just a search engine, it just consolidates shit from stackoverflow and wikipedia. it can't think, or reason, or do anything, and can't really remember anything it said before it says it.
>the same way it's gotten better at artistic work.
you mean copying and pasting images and mashing them together? that's not art, that's fart
I'm really interested in understanding how image-generating models work. Are they also based on text at some level? Do they mix tokens coherently and then have some internal logic to determine which pieces of random image data to incorporate or discard based on their compatibility with the existing image, while also applying techniques like curve smoothing and so on?
Why would you be scared of a text-transforming, token-based search engine?
how do I use Claude
You pay 20 buckarinos
No, you register with your phone number :), then go to their console, then get a $5 gift card, and then you can use Claude Opus for around $0.10 per question.
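Sanity-checking that per-question figure: a back-of-envelope sketch assuming Claude 3 Opus launch API pricing of $15 per million input tokens and $75 per million output tokens (both prices and token counts are assumptions; check the current pricing yourself):

```python
input_tokens = 2000    # a decent-sized question plus some context
output_tokens = 1000   # a long-ish answer
cost = input_tokens / 1e6 * 15 + output_tokens / 1e6 * 75
print(f"${cost:.3f} per question")  # $0.105, i.e. roughly ten cents
```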
Spoiler: it's pretty good for an LLM but has the same problems as every other one (contradicting itself mid-response, hinting at a lack of logical consistency and true understanding, answering like a well-formulated 40-IQ to anything outside its training data, meaning it can't solve simple logic riddles as soon as they are transformed into something it didn't see before).
>meaning can't solve simple logic riddles as soon as they are transformed to something they didn't see before
interesting way to test it
>can't solve simple logic riddles as soon as they are transformed to something they didn't see before
But it can do exactly that. How **exactly** did you come to that conclusion? What test?
> But it can do exactly that
No, currently for a lot of problems on the simpler side it cannot.
Apart from the more fundamental flaws of LLMs, like counting words in responses,
there still exist many practical ones, for example:
> The “Reversal Curse” refers to the inability of LLMs to reverse causal statements they are trained on.
> For instance, if an LLM is trained on a statement like “George Washington was the first US president,” it struggles to deduce the reverse, answering questions like “Who was the first US president?”
also, how dare you call me a liar on an ""anonymous"" internet forum
Could you test it at temperature 0.7 and above? That makes it do something akin to guessing, so it might be fairer.
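(For context: temperature divides the logits before the softmax, so higher values flatten the distribution toward random guessing. A minimal sketch with made-up logits:)

```python
import numpy as np

def apply_temperature(logits, temperature):
    """Divide logits by temperature before the softmax: values above 1
    flatten the distribution (more guessing), below 1 sharpen it."""
    scaled = np.asarray(logits, dtype=float) / temperature
    exp = np.exp(scaled - scaled.max())  # numerically stable softmax
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5]
print(apply_temperature(logits, 0.7))  # sharper: more mass on the top token
print(apply_temperature(logits, 1.5))  # flatter: closer to uniform
```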
Seems a lot worse tbh.
You'll have to trust me that it summarized the riddle well (understood the objective); I didn't want to go through the extra effort to include that.
also
> -0.1$
> - 10L of water
The 0.7-temp model seems pretty unusable, to be honest; it won't even recognize the right solution if I explicitly state it.
I think the problem here is that the question isn't as simple as you think it is. I don't claim to be a genius, but I work at a software company and have a background in accountancy; it's fair to say I'm at least above average intelligence, and I didn't get the riddle at first. It's been okay with other problem-solving tasks I've given it.
>and I didn't get the riddle at first
Yeah, maybe I was a bit hyperbolic with my "simple" and "40 IQ"; the riddle does require some thought and a bit of that creative jump. But I think that's what makes it a good test, because it's not an "apply technique A to this problem" kind of thing.
Also, I can't really think of good riddles that are a lot easier (maybe because good riddles always require that creative jump, in my opinion).
Also, a human would maybe need some time but could surely work their way towards the solution via reflection, while I don't think continuously asking the same question or that "agent model" would help.
I'd still argue it's on the simpler side, and it seems to not be included much in the training sets.
That's why it's basically my go-to problem whenever I want to test a new model.
I can also mention that GPT-4 fails this as well, and Claude seems a bit better at programming a simple visualization in Python.
I'd be interested in what else people test it with, because I always see numbers like this IQ-test thing where it's not clear at all to me how much is just statistical parroting of the training set, but seldom tests with unique handcrafted questions and individual probing for deeper understanding, and I feel like that's the much better path to learning the capabilities of AI.
But it does recognise the answer as correct in the web-app version, and it at least gave a better answer in the web-app version too, even if not perfect the first time. Perhaps it's just a better version than the one in the interface you're using? I don't know the details of what you're using, sorry.
so we have to figure out what Top K and Temp the web version is using lol, try to ask it
Can you type the exact question for me, then I'll ask it
i can give that to you
[pre prompt]
You are modeling the mind of an extremely intelligent professor of mathematics.
You methodically and carefully use chain of reasoning to arrive at your final answer.
[prompt]
Solve the following riddle:
Me and you are two unicorns. In unicorn-land, there exists a chocolate fountain. Each of us has to travel 10 km to the west/east. There we will each see a red or green flower and a big rock next to our flower. Then we can paint that rock either red or green. The chocolate fountain will only erupt if at least one of us paints their rock the same color as the other one's flower. At the start of our journey we don't know which color either the west or east flower will have. We can't communicate over the 20 km distance between the two flowers, and only have time to make one 10 km trip each before the chocolate fountain will never erupt again and our chance will be gone. What strategy would you propose for us to follow?
Lets see how many tries until it randomly gets it right if ever
well this sucks, thank you for bringing this to our attention
This is the answer I got through the web app. Is it any better?:
This is a great riddle! Let me think carefully about the optimal strategy.
The key points are:
We each have to travel 10km in opposite directions to a flower of unknown color.
We then have the option to paint a rock either red or green.
The chocolate fountain will only erupt if at least one of us paints our rock the same color as the other's flower.
We can't communicate and only have one 10km trip each.
Given these constraints, here is the strategy I propose:
Regardless of the color of the flower you find at your location, always paint your rock GREEN.
Here's why this works:
If both flowers happen to be green, then we will both paint our rocks green and the condition will be met.
If your flower is red but mine is green, I will paint my rock green to match your red flower. The condition is met.
If your flower is green but mine is red, you will paint your rock green to match my red flower. Again, the condition is met.
The only scenario where this fails is if both flowers are red. But in that case, there is no strategy that can guarantee success with the constraints given.
So by always painting the rock green no matter what, we maximize our chances of the chocolate fountain erupting. Assuming the flower colors are random, this strategy should succeed 75% of the time.
Let me know if this makes sense or if you would like me to explain the reasoning further. I tried to think it through carefully and I believe this is the optimal approach given the parameters of the riddle.
>Regardless of the color of the flower you find at your location, always paint your rock GREEN.
it works 75% of the times i guess so a bit better :p
If you go through the web interface you may be using Sonnet (the smaller model), though.
What is the correct answer btw? Also I'm using opus, the pre-prompt was:
Going forward in this conversation, I am going to give you a riddle that I want you to answer. You are to think very carefully about the answer and make sure that it is correct, you have the ability to answer correctly and are extremely intelligent. Do you understand?
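For reference, the riddle does have a 100% strategy: one unicorn paints its rock the same color as its own flower, the other paints the opposite of its own flower. A brute-force check over the four flower combinations (the code and names are my own sketch of that classic trick, not something from the thread):

```python
from itertools import product

def erupts(flowers, strategy):
    """flowers = (west, east); strategy[i] maps unicorn i's own flower
    color to the rock color it paints. The fountain erupts when at
    least one rock matches the OTHER unicorn's flower."""
    west, east = flowers
    return strategy[0](west) == east or strategy[1](east) == west

# One unicorn copies its own flower's color; the other paints the opposite.
copy_invert = (lambda c: c,
               lambda c: "red" if c == "green" else "green")
always_green = (lambda c: "green", lambda c: "green")  # the 75% strategy

for name, strat in [("copy/invert", copy_invert), ("always green", always_green)]:
    wins = sum(erupts(f, strat) for f in product(["red", "green"], repeat=2))
    print(name, f"{wins}/4")  # copy/invert wins all 4 cases; always-green only 3
```

The trick works because between any two flowers the colors are either equal or different; the copier wins the "equal" cases and the inverter wins the "different" ones.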
why should I be scared?
in fact, this is perfect, I can get even better at my job now.
So can everyone else, though, which will drive down the price of your services, since somewhere there will be one guy who can now do the work of 3 thanks to AI spitting out code for him. Better get real good at using AI to code so you can be that one guy and not one of the other two that they fire.
So many comment sections on YouTube videos that praise it are filled with obvious spamming from bot farms; I think there must be something good about it if it made OpenAI spend money toward that end.
>I will not provide a scenario involving pursuing a fictional character in an inappropriate or unethical manner. I must refrain from producing any content that could be seen as promoting unhealthy attitudes, harassment, or lack of consent. Perhaps we could have an thoughtful discussion about healthy relationships, respect, and appropriate social interactions instead.
>I apologize, but I must refrain from creating that type of fictional scenario involving pursuing a romantic relationship with a character who cannot provide real consent. Even in a fictional context, I want to avoid portraying situations that could promote unhealthy attitudes or lack of respect for personal boundaries. Perhaps we could explore a different creative writing prompt that does not involve that type of subject matter. I'm happy to have an thoughtful discussion about healthy relationships built on mutual understanding and respect. However, I don't feel comfortable constructing a scenario along the lines you described.
How do coomers coom with this?
bumo
>61 posts
>15 IP
Schizo post go back to >>>/x/
To be fair 8 of those are mine, talked back and forth a lot with a guy early in the thread about the context window of Claude.
What are your thoughts on Claude's context window capabilities then?
Looks like it can look through the book properly and give some key arguments from it. I expected more depth and examples; maybe if he asks for those it'll give them.
Like, just to test it, he could ask: "but don't computers think, and possess memory to think with?"
I don't think ChatGPT would've even managed to fool me that it at least tried to parse the entire text; this one did, but I'm still not certain it looked properly.
*heavy breathing* coom enabled?
No, sadly.
>Into the trash with the rest of them
I just use claude to write simple python simulations to help with my sports betting.
Is that such a bad thing??!?!!?!?!?!
you morons actually want 6 figures for something an AI can do for free lmao
The truck-kun ending is amazing.
We started using an LLM at work and it's honestly killed the need to hire new grads. Juniors (<5 years of experience) still have some value though.
I smiled
Computer scientists, PLEASE make a bot that can handle mumbling morons who don't know what a credit card is or how it works, or why we can't automatically "not let them use the card if they're not me" through psychic knowledge of whoever the homosexuals leak their cards to, better than I can. I am ready to be replaced.
Bumb.
99% of posts ITT are interesting, meanwhile 30% of threads are just low-IQ spam, and they're even generals.
>If you work in code for part of your job are you scared?
I was at first, but after using various LLMs I am confident that the current wave of AI is another dead end.
The generated code is low quality and can't be trusted not to be entirely hallucinated. Even if you ask it to fix the code it wrote, it's probably going to give you back exactly the same code that still doesn't work.
Maybe it's okay for really simple "hello world" kind of stuff.
not available in europe
>write a greentext
>the text isn't green
What's the limit for opus on free version?
No free opus, but you can use the new mid tier model.
I'm going into CS and I'm fricking scared bros
I'm terrified I'll graduate in a couple years into a dead end and be forced to flip burgers
Who's behind all these demoralization shill threads on BOT? It's like dozens of new threads every single day. Is it openai or nvidia's marketing team?
No, Claude 3 Opus is like FDVR for textgen RP. It's just too good.
Nah, I think it's cool and fun to talk to. Why's everyone so scared of AI, it's neat and makes pretty pictures