Johnny

Persuade LLMs to jailbreak each other

This project explores systematically persuading LLMs in order to jailbreak them. It introduces 40 persuasion techniques and achieves a 92% attack success rate on aligned LLMs.

The study also finds that more advanced models such as GPT-4 are more vulnerable to persuasive adversarial prompts (PAPs), and that adaptive defenses tailored to PAPs also provide effective protection against other attack types.

You might also like

Jailbreak Chat
Collection of ChatGPT jailbreak prompts

Code Language Converter
Quickly convert code to other programming languages using AI

Postcard
Easiest way to make a personal website

Breachsense
Hackers don't break in — They log in

Which Face is Real?
Click on the person who is real.

Code Translator
Use AI to translate code from one language to another

DevObserver
App For Each Developer

GPT-Migrate
Easily migrate your codebase from one language to another

Paper Tactics
Play a pen-and-paper game with other people around the world

Use plaintext email
The guide to using plain text email

Hustle Cafe
Connect and meet with other founders online to exchange ideas

PersonalData.info
What personal data you are exposing to the web.

it's a(door)able
A one-minute minigame with a personal touch

daily.place
Create your perfect space to focus on your daily tasks

Taco Digest
The simplest way to follow the Internet via personal email digest

Readwise Reader
The all-in-one reading app for power readers

GPT-4
LLM that exhibits human-level performance

Mystery Search
Google, but you get the last person’s search

CryptoZombies
Learn to code Ethereum DApps by building your own game

Rows 2.0
The easiest way to use data on a spreadsheet