Hacker News | mlaretallack's comments

I saw the RAC one this morning and thought I was misreading the graph, as why would the RAC publish such an obvious mistake?

I have written my own Home Assistant custom component for the UK fuel finder data, and yes, the data really is that bad.


I use AWS Kiro with the Claude models, and it's only too happy to help. I give it headless Ghidra, decompilers, etc., and away it goes.

Yes, I follow the same sort of pattern. It took a while to convince myself that it was OK to leave the agent waiting, but it helps with the human context switching. I also try to stagger the agents, so one may be planning and designing while another is coding. That way I can spend more time on the planning and designing ones and leave the coding one to get on with it.


That's actually one of the best parts. You can trust some of the context you have loaded is side loaded in the LLM, making task switching feel less risky and often improving your ability to work on needed and/or related changes elsewhere.


Yes, I mostly do spec-driven development, and at the design stage I always add in tests. I repeat this pattern for any new feature or bug fix: get the agent to write a test (unit, integration or Playwright-based) that reproduces the issue, then implement the change, retest, and rerun all the other tests.
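A minimal sketch of that test-first loop for a bug fix, in plain pytest-style Python. Everything here is hypothetical and only for illustration: `parse_pence`, the trailing-`p` bug, and the test names are assumptions, not something from a real project.

```python
def parse_pence(text: str) -> float:
    """Hypothetical helper: parse a pump price such as '145.9p' into pence."""
    cleaned = text.strip()
    # The fix under test: tolerate a trailing 'p' unit. Without this branch,
    # float("145.9p") raises ValueError.
    if cleaned.endswith("p"):
        cleaned = cleaned[:-1]
    return float(cleaned)


def test_reproduces_reported_issue():
    # Step 1: written first, from the bug report; it fails until the fix lands.
    assert parse_pence("145.9p") == 145.9


def test_existing_behaviour_still_works():
    # Step 3: the rest of the suite is rerun after the change.
    assert parse_pence("145.9") == 145.9
    assert parse_pence("  145.9 \n") == 145.9
```

The order is the point: the reproducing test exists (and fails) before the change, and the older tests are rerun afterwards to catch regressions.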


It's very important to understand "how" it was done. The GPL handles the "compile" step, and the result is still GPL. The clean room process uses two teams, separated by a specification. So you would have to:

1. Generate a specification of what the system does.

2. Pass it to another "clean" system.

3. The second, clean system implements based just on the specification, without any information on the original.

That 3rd step is the hardest, especially for well known projects.


So what if a frontier model company trains two models, one including 50% of the world's open source project and the second model the other 50% (or ten models with 90-10)?

Then the model that is familiar with the code can write specs. The model that does not have knowledge of the project can implement them.

Would that be a proper clean room implementation?

Seems like a pretty evil, profitable product "rewrite any code base with an inconvenient license to your proprietary version, legally".


LLM training is unnecessary in what we're discussing. Merely using an LLM is enough: original code -> specs as facts -> specs to tests -> tests to new code.
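Roughly, as a data-flow sketch. The stage functions are placeholders you would back with whatever LLM calls you like; `Spec`, `GeneratedTests` and `run_pipeline` are names made up for this example. It only shows that each stage receives the previous stage's output and nothing else; it says nothing about what a model already memorized in training.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Spec:
    """Facts about behaviour, with no source code attached (hypothetical type)."""
    facts: tuple[str, ...]


@dataclass(frozen=True)
class GeneratedTests:
    """Test cases derived from the spec alone (hypothetical type)."""
    cases: tuple[str, ...]


def run_pipeline(
    original_code: str,
    write_spec: Callable[[str], Spec],
    write_tests: Callable[[Spec], GeneratedTests],
    implement: Callable[[GeneratedTests], str],
) -> str:
    # Stage 1: the only stage that is handed the original code.
    spec = write_spec(original_code)
    # Stage 2: receives the spec only.
    tests = write_tests(spec)
    # Stage 3: receives the tests only and returns the new code.
    return implement(tests)
```

Passing the stages in as callables keeps the separation visible in the signatures: `implement` has no parameter through which the original code could arrive.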


It is hard to prove that the model doesn't recognize the tests and reproduce the memorized code. It's not a clean room.


1. The first claude-code outputs tests as text.

2. Dumped into a file.

3. claude-code that converts this to tests in the target language, and implements the app that passes the tests.

3 is no longer hard: look at all the reimplementations, from ccc to the rewrites popping up. They all have a well-defined test suite as a common theme. So much so that the tldraw author raised a (joke) issue to remove tests from the project.


I use AWS Kiro, and its spec-driven development is exactly this. I find it works really well, as it makes me slow down and think about what I want it to do.

Requirements, design, task list, coding.


I also try to avoid negative instructions. No scientific proof, just the same feeling as you: "do not delete the tmp file" can too often lead to deleting the tmp file.
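For what it's worth, a rough heuristic I could imagine for spotting these in an instructions file, so they can be reworded positively (e.g. "keep the tmp file until the run finishes"). `find_negative_instructions` and its word list are my own assumptions, and a regex like this will miss plenty of phrasings.

```python
import re

# Assumed, deliberately small list of negation phrases to flag.
NEGATION = re.compile(r"\b(do not|don't|never)\b", re.IGNORECASE)


def find_negative_instructions(prompt: str) -> list[tuple[int, str]]:
    """Return (line number, line text) for each line containing a negation phrase."""
    hits = []
    for number, line in enumerate(prompt.splitlines(), start=1):
        if NEGATION.search(line):
            hits.append((number, line.strip()))
    return hits
```

It only flags lines; rewording them is still a manual job.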


It’s like instructing a toddler.


I recall that early LLMs had the problem of not understanding the word "not", which became especially evident and problematic when tasked with summarizing text because the summary would then sometimes directly contradict the original text.

It seems that that problem hasn't really been "fixed", it's just been paved over. But I guess that's the ugly truth most people tend to forget/deny about LLMs: you can't "fix" them because there's not a line of code you can point to that causes a "bug", you can only retrain them and hope the problem goes away. In LLMs, every bug is a "heisenbug" (or should that be "murphybug", as in Murphy's Law?).


Same thing happens for humans:

"Don't think of a green elephant"

Alan Watts talked of this concept where the harder you try to suppress a thought or sensation, the more mental energy you give it, making it stronger.


I have definitely gone so far as to treat my LLM-readable docs in this way and have found it very effective.


Not the best way to do it, but I use Xfce with multiple workspaces, each with its own instance of AWS Kiro, and each Kiro has its own project I am working on. This makes it easier to "switch context" between projects to check how the agents are getting on. Kiro also notifies me when an agent wants something. Usually I keep it to about 4 projects at a time, just to keep the context switching down.


I agree with this. I put myself in the "glorious hacks to bend the machine into doing things it was never really intended to do" camp, so the end game is something cool. Now I can do 3 cool things before lunch instead of 3 cool things a year.


But, almost by definition of how LLMs work, if it’s that easy then someone else did it before and the AI is just copying their work for you. This doesn’t fit well with my idea of glorious hacks to bend the machine, personally. I don’t know, maybe it just breaks my self-delusion that I am special and make unique things. At least I get to discover for myself what is possible and how, and hold a sliver of hope that I did something new. Maybe at least my journey there was unique, whereas everyone using an AI basically has the same journey and same destination (modulo random seed I guess.)


Essentially nothing we do as programmers is special or unique. Whatever we're doing, there's a 99.999% chance that somebody, somewhere did it first, just in a different context. The key point is, now we can avoid duplicating that person's effort. I don't see the downside.

Put another way: all of the code that needed to be written has now been written. Now we can move on to more interesting things.

What will really bake people's noodles is when it becomes apparent that the same is true for literature. I won't mind if I'm not around to witness that... but it will happen.


I am currently doing 6 projects at the same time, where before I would only have been doing one at a time. This includes the requirements, design, implementation and testing.


Sounds awful

