Your comment doesn’t make as strong a point as you think it does; if anything, it might make the opposite one.

Because, yes, at first it was a model issue. Then more advanced models started appearing, and prompting them correctly became more important. Then models learned through RLHF to deal with vague prompting better, and context management became more important. Then models became better (though not great) at inherent context recollection and attention distribution, so now you need to be careful about which instructions a model receives and at what point, because it’s literally better at following them. It’s not so much that the goalposts are being moved; it’s that they’re literally being, like, *cleared*.

This isn’t a technology that’s already fully explored and just needs polishing now; it’s effectively an entirely new field of computing. When ChatGPT came out years ago, no one would have DREAMT of an LLM autonomously using CLI tools to write entire projects’ worth of code off of a single text prompt. We’d only just figured out how to turn them into proper chatbots. The point is that we have no idea where the ceiling is right now, so demanding well-defined goalposts is like saying we need a full geological map of Mars before we can set foot on it, when part of the point of going to Mars is to find that out.

As a side point, the agent is the harness; or, rather, an agent is a model called in a loop, and the harness is where that loop lives (and where it can be influenced or stopped). What I can say about most - not all, but most, seemingly including you - AI skeptics is that they tend not to be particularly up-to-date on, or engaged with, how these systems actually work and how capable they are at this point. That’s not supposed to be a dig or shade, because I’m pretty sure we’ve never had any tech move this fast before. But the general public is woefully underinformed about this. I recently had someone tell me in awe about how ChatGPT was able to read their handwritten note and solve a few math equations.
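
To make the agent/harness distinction above a bit more concrete, here’s a minimal sketch of what I mean. Everything in it is made up for illustration: `fake_model` is a hypothetical stand-in for a real LLM call, `run_tool` pretends to run a CLI command, and `harness` is just my name for the loop. Real harnesses are far more involved, but the shape is roughly this.

```python
# Hypothetical stand-in for an LLM call: it sees the transcript so far
# and returns the next action as a string. Not a real API.
def fake_model(transcript: list[str]) -> str:
    # Pretend the model asks for a tool twice, then decides it is finished.
    tool_results = sum(1 for line in transcript if line.startswith("tool:"))
    return "DONE" if tool_results >= 2 else "call_tool"


# Hypothetical tool runner; a real harness would execute a CLI command here.
def run_tool(action: str) -> str:
    return "tool: ok"


def harness(model, prompt: str, max_steps: int = 10) -> list[str]:
    """The harness owns the loop: it calls the model, runs tools, and can stop."""
    transcript = [f"user: {prompt}"]
    for _ in range(max_steps):  # the harness, not the model, enforces this limit
        action = model(transcript)
        transcript.append(f"model: {action}")
        if action == "DONE":
            break
        # Feed the tool result back in so the next model call can see it.
        transcript.append(run_tool(action))
    return transcript


if __name__ == "__main__":
    for line in harness(fake_model, "write a small project"):
        print(line)
```

The “agent” is the whole loop taken together; the model is only the function that gets called each turn. The `max_steps` cap and the point where tool output gets appended are where the harness can influence or stop what happens.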