Can LLMs and AI be used to obfuscate data cryptographically? Let's say you train an LLM on a set of data you don't want anyone to see, and the only way to extract it is by using the correct prompt. How secure is this method cryptographically? Can this be exploited? I am high on ketamine right now and thought of this, sorry.
So a function that only returns the correct data with one specific input and garbage with any other input? Yes, that is cryptography, but why use gigabytes of LLM model and waste days training when there are far better ways to do it? You can't even be sure it's unbreakable
It doesn’t have to be the only thing it returns.
What if there are secret messages in LLMs only accessible to those with the correct prompts? What if governments, spies and everyone else are already using these methods to communicate and to decipher messages?
Encryption has been a solved problem basically since the beginning of the modern computer. More computer power for brute forcing? Just add more bits.
it could use existing methods and maybe permute them.
cryptography is easy just make a mess
there would be a lot of data loss if you tried to use LLMs (or any AI model) to store data. It also wouldn't be very private. AI models don't store exact records, they store weights, and their outputs are sampled rather than definite, so recalled data can easily come out corrupted.
most asymmetric encryption nowadays relies on the difficulty of factoring the product of large primes (e.g., RSA), which is better for keeping data because decryption gives a definite answer. The newest encryption we're deploying uses lattices, which are even better because, as far as we know, quantum computers can't break them.
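For example, the textbook RSA round trip looks like this (a toy sketch with the standard demo primes, nothing like real key sizes or padding):

    p, q = 61, 53                  # two small primes; real keys use ~2048-bit moduli
    n = p * q                      # public modulus; security rests on it being hard to factor
    phi = (p - 1) * (q - 1)
    e = 17                         # public exponent, coprime to phi
    d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)
    m = 65                         # the "message", encoded as a number < n
    c = pow(m, e, n)               # encrypt with the public key (n, e)
    assert pow(c, d, n) == m       # decrypt with the private key d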
yeah sure. it is not really the most efficient way to do it.
what I am thinking of is a world where everything is hidden in plain sight, basically. everyday tools like LLMs could be used to extract specific data with the right prompt without anybody ever knowing about it.
like how could we know if the CIA is using ChatGPT 4 somehow for its own purposes right now, with its own glowing system prompts made specifically for it? We would have no idea.
But why?
because they can. because it can be implemented into everyday applications and used everywhere. you could literally summon classified documents out of thin air into google sheets via gemini wherever you are in the world.
you're retarded and should use OTP (one-time pad) encryption:
def modulo(n):
    # wrap into the range 0-25 (handles negative inputs)
    n %= 26
    if n < 0:
        n += 26
    return n
then, using 0-25 instead of a-z, encryption becomes:
ciphertext[i] = modulo(msg[i] + key[i]),
decryption becomes:
msg[i] = modulo(ciphertext[i] - key[i])
the key is a shared random sequence of values 0 <= v <= 25
if:
>the key is not known by an attacker
>the key is not reused
>the key was randomly generated
then it has perfect secrecy: the ciphertext reveals nothing about the message and can never be decrypted without the key
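a minimal runnable sketch of that scheme in Python, assuming letters-only messages mapped to 0-25 and using the secrets module for key generation (the helper names are just illustrative):

    import secrets

    def make_key(length):
        # one fresh uniformly random value per character; never reuse a key
        return [secrets.randbelow(26) for _ in range(length)]

    def encrypt(msg, key):
        return [(m + k) % 26 for m, k in zip(msg, key)]   # Python's % already stays in 0-25

    def decrypt(ct, key):
        return [(c - k) % 26 for c, k in zip(ct, key)]

    msg = [ord(c) - ord('a') for c in "attackatdawn"]     # a-z mapped to 0-25
    key = make_key(len(msg))
    ct = encrypt(msg, key)
    assert decrypt(ct, key) == msg
    print(''.join(chr(c + ord('a')) for c in ct))         # ciphertext looks like random letters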
>you're retarded
why is this necessary
"encryption, AI" is pointless because symmetric encryption is a solved problem
>is a solved problem
but you have to exchange keys tho
yes, it's symmetric encryption, and it's a solved problem
OP also wants symmetric encryption
>The key has to be at least as long as the message
>The key size grows with the amount of text you need to hide, plus you need to generate a new key for every message, making it impractical
What a great design, brainlet
nta
key[i%len(key)]
(Checked)
why?
if you want to use steganography with a model, just encode it in the LSBs of the weights, though that might not work all that well for low-bit quants.
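that would look something like this toy sketch (plain numpy on a float32 tensor; not tied to any real checkpoint format, and the function names are made up):

    import numpy as np

    def embed_bits(weights, bits):
        # hide one payload bit in the mantissa LSB of each float32 weight
        w = weights.copy()
        raw = w.view(np.uint32)                        # reinterpret the float bit patterns
        for i, b in enumerate(bits):
            raw[i] = (raw[i] & ~np.uint32(1)) | np.uint32(b)
        return w                                       # values shift by at most ~1e-7 relative

    def extract_bits(weights, n):
        return [int(x & 1) for x in weights.view(np.uint32)[:n]]

    w = np.random.randn(1000).astype(np.float32)       # stand-in for a weight tensor
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    assert extract_bits(embed_bits(w, payload), len(payload)) == payload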
I doubt it would be possible to do what you're describing without making the model retarded
>why?
what if I want to hide the recipe for Coca Cola and summon it with a magic spell (prompt)? There are maybe only 2-3 people in the world that know the recipe. What if they die? This solves a problem.
Encrypt it with PGP to the public keys of those 3 people
You're just complicating encryption for no reason. The only use case for this is hiding secret messages in public services, which is just worse steganography
>The only use case for this is hiding secret messages in public services
>which is just worse steganography
why?
Your standard LLM is 10+ GB, requires immense processing power, has very limited use, and takes weeks to train. You want to train it to produce one very special output for one specific input and otherwise work normally. Then you want to host it on a public website where your secret agent will use it to recover secret messages without arousing suspicion.
Embedding encoded information into a 2MB picture of your summer vacation uploaded to facebook is a billion times simpler and more secure
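the picture version is about this much code (a toy PIL/numpy sketch, same LSB trick as the weights example above but on pixel bytes; filenames and function names are assumptions):

    import numpy as np
    from PIL import Image

    def hide_bits(in_path, bits, out_path):
        px = np.array(Image.open(in_path).convert("RGB"))
        flat = px.reshape(-1)                      # view over the raw pixel bytes
        for i, b in enumerate(bits):
            flat[i] = (flat[i] & 0xFE) | b         # overwrite the lowest bit of each byte
        Image.fromarray(px).save(out_path)         # keep it lossless (PNG); JPEG recompression destroys the bits

    def read_bits(path, n):
        px = np.array(Image.open(path).convert("RGB"))
        return [int(v & 1) for v in px.reshape(-1)[:n]]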
Wrong. I am obfuscating my data in the largest models in the world, making it accessible from anywhere in the world with the right prompt.
In the not so distant future pretty much all data will be stored in some form of model and can be retrieved and changed from it.
All devices in the world will have access to those models. Why wouldn't I want to hide all of my secret data in there if I can? This is currently being done btw.
>In the not so distant future pretty much all data will be stored in some form of model and can be retrieved
my fucking sides
it's where we are headed
no it's not, there are reasons why you're not running a hippocampus simulation to store and retrieve files on your computer
>there are reasons why you're not running a hippocampus simulation to store and retrieve files on your computer
you don't know that
LLMs are reversible.
>filtered by torch.manual_seed()
>reinvents passwords
passwords are not secure
A prompt could be even less secure; it's just a password with extra steps. By definition a prompt is usually (not always) just words, letters basically: no numbers, no special characters, and very few caps.
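as a rough back-of-the-envelope comparison (the vocabulary size and character set here are assumptions, and real prompts are far less random than this):

    import math

    # a 5-word prompt drawn from a ~2000-word everyday vocabulary,
    # assuming the words were even chosen at random (real prompts are worse)
    prompt_bits = 5 * math.log2(2000)       # ~55 bits

    # a 12-character password drawn from 94 printable ASCII symbols
    password_bits = 12 * math.log2(94)      # ~79 bits

    print(round(prompt_bits), round(password_bits))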