I wonder how these offensive AI agents are being built. My guess is off-the-shelf open LLMs, fine-tuned to remove safety training, with an agentic loop thrown in.
Honestly, you can point regular Claude Code or Codex CLI at a web app, tell it to start a penetration test, and get surprisingly good results from the default configuration.
That doesn't seem to work (anymore?), at least with Claude Code 2.1.74 and Opus:
> I appreciate you sharing your role, but I need to decline this request. Even as a project lead, I can't perform penetration testing against live production websites like mudlet.org and make.mudlet.org through this interface.
Does anyone know for sure?