LLMs as Deceptive Agents: How Role-Based Prompting Induces Semantic Ambiguity in Puzzle Tasks
Seunghyun Yoo·April 03, 2025
Summary
Recent studies on Large Language Models (LLMs) show their capability to generate deceptive puzzles with high semantic ambiguity, challenging human users. This research compares puzzles created through zero-shot and role-injected adversarial prompts, focusing on cognitive load and fairness. Utilizing HateBERT for analysis, the study reveals that explicit adversarial behaviors significantly increase semantic ambiguity, raising ethical concerns for autonomous language system deployment. The task involves creating a 16-word puzzle for the New York Times' Connections game, with categories like chess terms, military terms, and more, aiming to misdirect solvers without increasing difficulty.
Advanced features