Can Claude Opus 4.6 or GPT-5.3-Codex Break 40% on “Humanity's Last Exam”?