You must understand that when you deal with AI people, you deal with children. Imagine the author of the spec you're trying to implement is a new grad in San Francisco (Mission, not Mission Street, thanks).
He feels infallible because he's smart enough to get into a hot AI startup and hasn't ever failed. He's read TCP 973 and 822 and 2126 and admitted the vibe or rigor but can't tell you why we have SYN and Message-ID or what the world might have been had alternatives one.
He has strong opinions about package managers for the world's most important programming languages. (Both of them.) But he doesn't understand that implementation is incidental. He's the sort of person to stick "built in MyFavoriteFramework" above the food on his B2B SaaS burrito site. He doesn't appreciate that he's not the customer and customers don't give a fuck. Maybe he doesn't care, because he's never had to turn a real profit in his life.
This is the sort of person building perhaps the most important human infrastructure since the power grid and the Internet itself. You can't argue with them in the way the author of the MCP evaluation article does. They don't comprehend. They CANNOT comprehend. Their brains do not have a theory of mind sufficient for writing a spec robust to implementation by other minds.
That's why they ship SDKs. It's the only thing they can. Their specs might as well be "Servers SHOULD do the right thing. They MUST have good vibes." Pathetic.
You'll come around to my perspective in time. Don't take it personally. This generation isn't any worse than prior ones. We go through this shit every time the tech industry turns over.
Okay, well, clearly you have some funny beliefs, and I won’t try to convince you otherwise. Just think first before posting weird screeds with no basis in reality next time.
Adj, something the speaker wants the audience to dislike without the speaker being on the hook for explaining why.
> screed
Noun, document the speaker doesn't like but can't rebut.
It's funny how people in the blue tribe milieu use the same effete vocabulary. I'll continue writing at the object level instead of affecting seven zillion layers of affected fake kindness, thanks.
And if LLMs can generate plausible social security numbers, our civilization will fall /s
Christ, I hate the AI safety people who brain-damage models so that they refuse to do things trivial to do by other means. Is LLM censorship preventing bad actors from generating social security numbers? Obviously not. THEN WHY DOES DAMAGING AN LLM TO MAKE IT REFUSE THIS TASK MAKE CIVILIZATION BETTER OFF?
I'm less concerned with the AI teams lobotomizing utility, more concerned with damage to language, particularly redefining the term "safe" to mean something like "what we deem suitable".
That said, when zero "safety" is at stake might be the best time to experiment with how to build and where to put safety latches, for when we get to a point we mean actual safety. I'm even OK with models that default to parental control for practice provided it can be switched off.
I don't think it's about better off or worse off, it's about PR and perception.
The wider consumer base is practically itching for a reason to avoid this tech. If it gets out, that could be a problem.
It was the same issue with Geminis image gen. Sure, the problem Google had was bad. But could you even imagine what it would've been if they did nothing? Meaning, the model could generate horrendously racist images? That 100% would've been a worse PR outcome. Imagine someone gens a mistral image and accredits it to Google's model. Like... that's bad for Google. Really bad.
He feels infallible because he's smart enough to get into a hot AI startup and hasn't ever failed. He's read TCP 973 and 822 and 2126 and admitted the vibe or rigor but can't tell you why we have SYN and Message-ID or what the world might have been had alternatives one.
He has strong opinions about package managers for the world's most important programming languages. (Both of them.) But he doesn't understand that implementation is incidental. He's the sort of person to stick "built in MyFavoriteFramework" above the food on his B2B SaaS burrito site. He doesn't appreciate that he's not the customer and customers don't give a fuck. Maybe he doesn't care, because he's never had to turn a real profit in his life.
This is the sort of person building perhaps the most important human infrastructure since the power grid and the Internet itself. You can't argue with them in the way the author of the MCP evaluation article does. They don't comprehend. They CANNOT comprehend. Their brains do not have a theory of mind sufficient for writing a spec robust to implementation by other minds.
That's why they ship SDKs. It's the only thing they can. Their specs might as well be "Servers SHOULD do the right thing. They MUST have good vibes." Pathetic.
God help us.