Zhang, collaborators win distinguished paper award

An illustration of ones and zeros shaped like a human in a jail cell
Generative AI tools, including large language models such as ChatGPT, have security measures to prevent the creation of harmful content, but even novice users can use jailbreak prompts to bypass these guardrails. (Image: iStock)

Ning Zhang, an associate professor of computer science and engineering in the McKelvey School of Engineering at Washington University in St. Louis, and Zhiyuan Yu, a doctoral student in Zhang’s lab, recently won a distinguished paper award from USENIX, a leader in computing systems research. Their paper, “Don’t Listen to Me: Understanding and Exploring Jailbreak Prompts of Large Language Models,” examines jailbreak prompts as one of the most effective methods to circumvent security restrictions on generative artificial intelligence (AI) tools. They presented the work at the USENIX Security 2024 conference.

Recent advancements in generative AI have enabled ubiquitous access to large language models, opening countless avenues for potential misuse of this powerful technology and, in turn, prompting defensive measures from service providers. Users who want to get around these security restrictions turn to jailbreak prompts, carefully crafted inputs that bypass the boundaries programmed into the AI and allow nefarious users to elicit harmful content that would otherwise be prohibited.

In their award-winning work, Zhang and his team aim to gain a better understanding of the threat landscape of jailbreak prompts.

Read more on the McKelvey Engineering website.