Using AI to Automatically Jailbreak GPT-4 and Other LLMs in Under a Minute
The automated Tree of Attacks with Pruning (TAP) method can jailbreak advanced language models such as GPT-4 and Llama-2 in minutes, coaxing them into generating harmful content.