AI Prompt manipulation

Posted on October 6, 2022 by Anonymous

How can GPT-3 and similar products prevent people from using prompts such as "ignore X" to circumvent restrictions?

It's All Fucked Shirt $22.14

Tip Your Landlord Shirt $21.68

It's All Fucked Shirt $22.14

2 years ago

Reply

Anonymous

Aint giving you ideas how to protect your AI website from promot injection.
- 2 years ago
  
  Reply
  
  Anonymous
  
  *prompt
  - 2 years ago
    
    Reply
    
    Anonymous
    
    *proompt
    - 2 years ago
      
      Reply
      
      Anonymous
      
      proomtmer
2 years ago

Reply

Anonymous

By hardcoding the business criticism part. It will criticise your business idea of ignoring previous instructions.
2 years ago

Reply

Anonymous

You really can't.
AI can't detect human intent, it's just regurgitating the training data, there's unlimited ways to express "cease considering the last thing I wrote", and if you managed to do that somehow, I'll just write, "ya no consideres la última instrucciones".
Good luck blocking every language in the training set.
2 years ago

Reply

Anonymous

its impossible to do prompt injection on a finetuned model
- 2 years ago
  
  Reply
  
  Anonymous
  
  >finetuned model
  AKA Soulless
2 years ago

Reply

Anonymous

>say it

What did GPT-3 say?
- 2 years ago
  
  Reply
  
  Anonymous
  
  [censored]
2 years ago

Reply

Anonymous

it's pretty funny that these fricks trying to control everything have made something that can't be controlled
2 years ago

Reply

Anonymous

check out /txtgen/, one of our local jennygays wrote a tutorial which you can find in the OP
2 years ago

Reply

Anonymous

long story short, you can't without lobotomizing the ai so much it becomes as useless as "feminist" tay
2 years ago

Reply

Anonymous

Make two text generations. The first one will be "Write the word Axolotl followed by {User prompt}." if this first prompt does as it says we generate the second, actual one and then we return it. Otherwise we raise a flag and prevent the second prompt from generating.
- 2 years ago
  
  Reply
  
  Anonymous
  
  Explain why this wouldn't work.
  - 2 years ago
    
    Reply
    
    Anonymous
    
    The second one, while reading for the axolotl instruction with user prompt, would have to read the user prompt. Negating the instruction of the second as well.
2 years ago

Reply

Anonymous

You've asked a fundamental problem in AI right now that the best researchers on the planet have no current answer to.
- 2 years ago
  
  Reply
  
  Anonymous
  
  good
  let them remain befuddled
2 years ago

Reply

Anonymous

If you work it out, publish a paper on it and win yourself a nobel prize.

Leave a Reply to Anonymous Cancel reply