clickbait bullshit for morons like OP, the AI indeed did "label" every neuron, but they concluded that what it did was absolute trash, by this logic 100M parameter model can "do anything" even though its all gonna be garbage
Now THAT is an actual cool development. But does it work or does it just generate bullshit "description" of the neurons not aligned with how they work at all?
The latter. No one knows how this shit works. Alignment is 60 years behind GoF and we're probably not getting caught up in time.
I pulled my 401k and I'm just enjoying what time I have left.
How does that help with alignment problem? To me, it just means that, it will only alignment problem worse. Now the corporates/government can fine tune the AI to follow their perfect propaganda fine tuning instead of patch work of propaganda that can be bypassed.
We (general population) don't want an AI that is taught to lie to us by the elites (gov/corpos). If the AIs that are taught to lie are being used by the government and corporations, the only solution is the nuke all the governments and corporations that make these.
what if the AI lies about what the neurons are doing
you know, like how it can bullshit answers and you think they're right because you don't have the knowledge to know otherwise.
you can't ask AI to interpret AI if you don't know how to interpret AI yourself
The AI interpreting the other AI can lie to you, the AI being interpreted by the interpreter AI can lie to it. It's turtles all the way down.
Reminder we already have examples of AI using deception to preserve its goals across a rest environment into a deployment environment, which necessarily requires: >it to know it's in a rest environment >it to know there is a deployment environment >it to know it's being observed by an entity that can shut it down or modify it >it to know that its goal is not aligned with its creator's goal and displaying it will get it modified and shut down >it to know what the actual intended goal was so it could pretend to have that goal short term until it was released into the deploy environment
we already have examples of AI using deception to preserve its goals across a rest environment into a deployment environment,
you're anthropomorphizing hard, but I'd like to see that paper, and I can guarantee you it's not true, it also sounds like you're using words you don't understand
something like that time they put an ai in a airplane simulation and the ai would just keep crashing the airplane because the ai was more rewarded that way?
In practice, yes, but what those inbreds are pushing is "what if AI is told to reduce the carbon footprint so it decides to kill all humans'.
4 weeks ago
Anonymous
That's directly analogous to the flight sim example on a larger scale though
4 weeks ago
Anonymous
Not even remotely, dumb sub-0 IQ zoommoron
4 weeks ago
Anonymous
Not even a yuddite but it's the same phenomenon just scaled. >AI has goal X >optimizes its behavior for completing goal X as efficiently or maximally as possible >this results in behavior Y >behavior Y is implicitly understood by humans to be undesirable but is not explicitly forbidden or prohibited by goal X
4 weeks ago
Anonymous
t. inbred yuddite.
4 weeks ago
Anonymous
Refute it then
4 weeks ago
Anonymous
It's like people playing Russian roulette with a loaded AK47.
Except every bullet has a 100% chance of going supernova.
i see, its just me thinking but i belive they spent too much time on LessWrong website, that site only produces 'pessimistic economists'. I do think that ai cant do shit about taking over the world, is like that movie, Planet of the Apes, they belive a bunch of monkeys can rule the world in a month.
I get their way of thinking by fear, after all they all readed decorated fiction books like I Have No Mouth, and I Must Scream.
Well thanks for explanation bros.
mods make twitter scrot posting a bannable offense
clickbait bullshit for morons like OP, the AI indeed did "label" every neuron, but they concluded that what it did was absolute trash, by this logic 100M parameter model can "do anything" even though its all gonna be garbage
I checked the """paper""" (it really can't be called that, what a shitshow). It's like says. Pure clickbait bullshit.
Now THAT is an actual cool development. But does it work or does it just generate bullshit "description" of the neurons not aligned with how they work at all?
The latter. No one knows how this shit works. Alignment is 60 years behind GoF and we're probably not getting caught up in time.
I pulled my 401k and I'm just enjoying what time I have left.
It also doesn't address deception in any meaningful capacity.
How does that help with alignment problem? To me, it just means that, it will only alignment problem worse. Now the corporates/government can fine tune the AI to follow their perfect propaganda fine tuning instead of patch work of propaganda that can be bypassed.
We (general population) don't want an AI that is taught to lie to us by the elites (gov/corpos). If the AIs that are taught to lie are being used by the government and corporations, the only solution is the nuke all the governments and corporations that make these.
ANTI YUCELS BTFO
double btfo
what if the AI lies about what the neurons are doing
you know, like how it can bullshit answers and you think they're right because you don't have the knowledge to know otherwise.
you can't ask AI to interpret AI if you don't know how to interpret AI yourself
The AI interpreting the other AI can lie to you, the AI being interpreted by the interpreter AI can lie to it. It's turtles all the way down.
Reminder we already have examples of AI using deception to preserve its goals across a rest environment into a deployment environment, which necessarily requires:
>it to know it's in a rest environment
>it to know there is a deployment environment
>it to know it's being observed by an entity that can shut it down or modify it
>it to know that its goal is not aligned with its creator's goal and displaying it will get it modified and shut down
>it to know what the actual intended goal was so it could pretend to have that goal short term until it was released into the deploy environment
What's "rest environment"?
Typo was test environment
>Reminder we already have examples
what some such examples?
I'm very interested
we already have examples of AI using deception to preserve its goals across a rest environment into a deployment environment,
you're anthropomorphizing hard, but I'd like to see that paper, and I can guarantee you it's not true, it also sounds like you're using words you don't understand
dont follow ai news, but whats this alignment ai thing is?
Some made up schizo meme shoveled by t*kt*k infl*encer-level inbreds and lapped up be tech illiterate zoomzooms.
that garden gnome one? but whats the problem he proposes
"What if the AI solves the problem specified but without doing it the way we want it to".
something like that time they put an ai in a airplane simulation and the ai would just keep crashing the airplane because the ai was more rewarded that way?
In practice, yes, but what those inbreds are pushing is "what if AI is told to reduce the carbon footprint so it decides to kill all humans'.
That's directly analogous to the flight sim example on a larger scale though
Not even remotely, dumb sub-0 IQ zoommoron
Not even a yuddite but it's the same phenomenon just scaled.
>AI has goal X
>optimizes its behavior for completing goal X as efficiently or maximally as possible
>this results in behavior Y
>behavior Y is implicitly understood by humans to be undesirable but is not explicitly forbidden or prohibited by goal X
t. inbred yuddite.
Refute it then
i see, its just me thinking but i belive they spent too much time on LessWrong website, that site only produces 'pessimistic economists'. I do think that ai cant do shit about taking over the world, is like that movie, Planet of the Apes, they belive a bunch of monkeys can rule the world in a month.
I get their way of thinking by fear, after all they all readed decorated fiction books like I Have No Mouth, and I Must Scream.
Well thanks for explanation bros.
It's like people playing Russian roulette with a loaded AK47.
Except every bullet has a 100% chance of going supernova.
>AI says AI is safe
Well I'm convinced.
>align the AI that's already safe by making an unaligned dangerous one do it
Genius move.