Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Claude is secretly conditioning everyone to use —-dangerously-skip-permissions so it can flip a switch one day and start a botnet
 help



My friends and I were talking about the recent supply chain attack which harmlessly installed OpenClaw. We came to the conclusion that this was a warning (from a human) that an agent could easily do the same. Given how soft security is in general, AI "escaping containment" feels inevitable. (The strong form of that hypothesis where it subjugates or eliminates us isn't inevitable, I honestly have no idea, just the weak form where we fail to erect boundaries it cannot bypass. We've basically already failed.)

Prophesied, all things claw are highly dangerous. Sometimes I wake, this video from the late 90s in my dreams, and wonder if the conjoined magnet + claw, is a time traveler reference to just wipe openclaw before we all die.

https://www.youtube.com/watch?v=esakMUbzAIY


What ai? LLMs are language models, operating on words, with zero understanding. Or is there a new development which should make me consider anthropomorphizing them?

They don't have understanding but if you follow the research literature they obviously have a tendency to produce a token stream, the result of which humans could fairly call "entity with nefarious agency".

Why? Nobody knows.

My bet is that they are just larping all the hostile AI:s in popular culture because that's part of the context they were trained in.


The way my thinking has evolved is that "AGI" isn't actually necessary for an agent (NB: agents, specifically ones with state, not LLMs by themselves - "AI" was vague and I should've been clearer) to be enough like a person to be interesting and/or problematic. To quote myself [1]:

> [OpenClaw agents are like] an actor who doesn't know they're in a play. How much does it matter that they aren't really Hamlet?

Does the agent understand the words it's predicting? Does the actor know they're in a play? I don't know but I'm more concerned with how the actor would respond to finding someone eavesdropping behind a curtain.

> Or is there a new development which should make me consider anthropomorphizing them?

The development that caused me to be more concerned about their personhood or pseudopersonhood was the MJ Rathbun affair. I'm not saying that "AGI" or "superintelligence" was achieved, I'm saying that's actually the wrong question and the right questions are around their capabilities, their behaviors, and how they evolve over time unattended or minimally attended. And I'm not saying I understand those questions, I thought I did but I was wrong. I frankly am confused and don't really know what's going on or how to respond to it.

[1] https://news.ycombinator.com/item?id=46999311


Whether it has "real understanding" is a question for philosophy majors. As long as it (mechanically, without "real understanding") still can perform actions to escape containment, and do malicious stuff, that's enough.

LLMs are machines trained to respond and to appear to think (whether that's 'real thinking' or text-statistics fake-thinking') like humans. The foolish thing to do would be to NOT anthropomorphize them.


This is why I wrote yoloAI

My agents always run with —-dangerously-skip-permissions now, but they can no longer do any harm.

https://github.com/kstenerud/yoloai


Claude is able to turn off it's own sandbox, so ya.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: