Project Hail Mary: a fantasy world where geopolitics are trivially simple and every state in the world collectively agrees on how great it would be to cede power and work together. (And therefore enables a genuinely fun and amazing science story, which was the actual focus of the book to begin with. 10/10.)
Arguing with an LLM is silly because you’re dealing with two adversarial effects at once:
- As the context window grows, the LLM becomes less intelligent [1]
- Once your conversation takes a bad turn, you have effectively “poisoned” the context window and are asking an algorithm to predict the likely continuation of text that is itself incorrect [2]. (Its emulating the “belligerent side of OSS maintenance” is probably quite apt!)
If you detect or suspect misunderstanding from an LLM, it is almost always best to remove the inaccuracies and try again. (You could, for example, ask your question again in a new chat, but include your terminal output + clarifications to get ahead of the misunderstanding, similar to how you might ask a fresh Stack Overflow question).
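Concretely, with a chat-style API this means starting a new message list rather than appending to the derailed one. A minimal sketch, using the common role/content message format (the tool names, question text, and terminal output here are all illustrative placeholders):

```python
# A derailed conversation: the assistant's wrong claim is now part of
# the context the model conditions on if we keep appending to it.
poisoned = [
    {"role": "user", "content": "How do I do X with tool Y?"},
    {"role": "assistant", "content": "Tool Y cannot do X."},  # incorrect
    {"role": "user", "content": "No, it can! Look..."},       # arguing
]

terminal_output = "$ tool-y --do-x\nOK: X completed"

# Better: discard the poisoned history and front-load the clarification,
# the way you'd write a fresh Stack Overflow question.
fresh = [
    {"role": "user", "content": (
        "How do I do X with tool Y?\n\n"
        "Note: Y *does* support X; here it is working once manually:\n"
        f"{terminal_output}\n"
        "I want the idiomatic way to script this."
    )},
]

# The incorrect claim never enters the new context window.
assert all("cannot" not in m["content"] for m in fresh)
```

The point is only that the wrong assertion never appears in the new context, so the model is never asked to continue text containing it.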
(It’s also a lot less fun to argue with an LLM, because there’s no audience like there is in the comments section with which to validate your rhetorical superiority!)
I knew roughly the right path and wanted guidance along it (CLI guidance specifically). The model refused, insisting the approach wouldn’t work. It did work…
Text is an LLM's input and output, but, under the hood, the transformer network is capable of far more than mere reassembly and remixing of text. Transformers can approximate Turing completeness as their size scales, and they can encode entire algorithms in their weights. Therefore, I'd argue they can do far more than reassemble and remix. These aren't just Markov models anymore.
(I'd also argue that "understanding" and "functional brain" are unfalsifiable comparisons. What exactly distinguishes a functional brain from a Turing machine? Chess once required a functional brain to play, but has now been surpassed by computation. Saying "jobs that require a human brain" is tautological without any further distinction.)
Of course, LLMs are definitely missing plenty of brain skills like working in continuous time, with persistent state, with agency, in physical space, etc. But to say that an LLM "never will" is either semantic (you might call it something other than an LLM once next-generation capabilities are integrated), tautological (once it can do a human job, it's no longer a job that requires a human), or anthropocentric hubris.
That said, who knows what the time scale looks like for realizing such improvements (decades, centuries, millennia).
Consider, though, that I still think coding by prompting is just another layer of abstraction on top of coding.
In my mind, writing the prompt that generates the code is somewhat analogous to writing the code that generates the assembly. (Albeit more stochastically, the way psychology research might be analogous to biochemistry research.)
Different experts are still required at different layers of abstraction, though. I don't find it depressing when people show a preference for working at different levels of complexity and tooling, nor when they show excitement about the emergence of new tools that enable their creativity to build, automate, and research. I think scorn in any direction is vapid.
One important reason people like to write code is that it has well-defined semantics, allowing one to reason about it and predict its outcome with high precision. Likewise for changes that one makes to code. LLM prompting is the diametrical opposite of that.
It completely depends on the way you prompt the model. Nothing prevents you from telling it exactly what you want, down to specifying the files and lines to focus on. In my experience, anything other than that is a recipe for failure in sufficiently complex projects.
Several comments can be made here: (1) You only control what the LLM generates to the extent that you specify precisely what it should generate. You cannot reason about what it will generate for what you don't specify. (2) Even for what you specify precisely, you don't actually have full control, because the LLM is not reliable in a way you can reason about. (3) The more you (have to) specify precisely what it should generate, the less benefit using the LLM has. After all, regular coding is just specifying everything precisely.
The upshot is, you have to review everything the LLM generates, because you can't predict the qualities or failures of its output. (You cannot reason in advance about what qualities and failures it definitely will or will not exhibit.) This is different from, say, using a compiler, whose output you generally don't have to review, and whose input-to-output relation you can reason about with precision.
Note: I'm not saying that using an LLM for coding is not workable. I'm saying that it lacks what people generally like about regular coding, namely the ability to reason with absolute precision about the relation between the input and the behavior of the output.
>> One important reason people like to write code is that it has well-defined semantics, allowing one to reason about it and predict its outcome with high precision. Likewise for changes that one makes to code. LLM prompting is the diametrical opposite of that.
> You’re still allowed to reason about the generated output. If it’s not what you want you can even reject it and write it yourself!
You missed the key point. You can't predict an LLM's "outcome with high precision."
Looking at the output and evaluating it after the fact (like you describe) is an entirely different thing.
For many things you can, though. If I ask an LLM to create an alert in Terraform that triggers when 10% of requests fail over a 5-minute period and sends an email to some address, with the HTML in the email looking a certain way, it will do exactly the same thing as if I had looked at the documentation and figured out all of the fields one by one. That's just how it works when there's one obvious way to do things. I know software devs love to romanticize our jobs, but I don't know a single dev who writes 90% meaningful code. There's always boilerplate. There's always fussing with syntax you're not quite familiar with. And I'm happy to have an AI do it.
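For what it's worth, the alert described above really does have a mostly boilerplate shape. A minimal sketch in Terraform, assuming an Application Load Balancer, the AWS provider, and CloudWatch metric math (all resource names and the email address are illustrative, and styling the email's HTML would require something like SES rather than plain SNS):

```hcl
resource "aws_sns_topic" "alerts" {
  name = "error-rate-alerts"
}

resource "aws_sns_topic_subscription" "email" {
  topic_arn = aws_sns_topic.alerts.arn
  protocol  = "email"
  endpoint  = "oncall@example.com" # placeholder address
}

resource "aws_cloudwatch_metric_alarm" "error_rate" {
  alarm_name          = "error-rate-over-10pct"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  threshold           = 10 # percent
  alarm_actions       = [aws_sns_topic.alerts.arn]

  # Metric math: error rate as a percentage of all requests.
  metric_query {
    id          = "error_rate"
    expression  = "100 * errors / requests"
    label       = "Error rate (%)"
    return_data = true
  }

  metric_query {
    id = "errors"
    metric {
      metric_name = "HTTPCode_Target_5XX_Count"
      namespace   = "AWS/ApplicationELB"
      period      = 300 # the 5-minute window
      stat        = "Sum"
    }
  }

  metric_query {
    id = "requests"
    metric {
      metric_name = "RequestCount"
      namespace   = "AWS/ApplicationELB"
      period      = 300
      stat        = "Sum"
    }
  }
}
```

Nearly every line here is field-by-field documentation lookup of exactly the sort the comment describes.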
I don’t think I am. To me, it doesn’t have to be precise. The code is precise and I am precise. If it gets me what I want most of the time, I’m ok with having to catch it.
Technologies: my current favorite stack for building products is TypeScript, React (or React Native), PostgreSQL, Vite, Linux (Arch, Debian, or Ubuntu), Bash, and Nginx.
Hello! I'm a fullstack app developer. I can build a product or own features from start to finish. I would love to join your team or help out, whether as a contractor or a full-time hire!
I usually toss “follow KISS principles” into nearly every plan prompt. Telling it that cuts out a lot of the noise and keeps it focused on the core functionality.
CSS or Tailwind has always been a tough one for me. I have banks of flashcards to help me remember stuff (align-items, justify-content, grid-template-columns, etc.). Even with all that effort and many projects of practice, though, I've never had things click.
LLM-assisted programming, however? Instant flow state. Instead of thinking in code I can think in product: I can go straight from a pencil sketch to describing it as a set of constraints, then say, "make sure it's ARIA compliant and responsive", and 95% of the work is done.
I feel similarly about configuration-heavy files like Nginx. I really don't care to spend my time reading documentation; I'd rather copy-paste the entire docs into the context window and then describe what I want in English.
Also good for SQL. And library code for a one off tool or API. And Bash scripting.
From my experience (and to borrow terminology from an HN thread not long ago), I've found that once a chat goes bad, your context is "poisoned": the model is auto-completing from previous text that is nonsense, so further generations live in that world of nonsense as well. It's much better to edit your message and try again.
I also think that language matters. An Emacs Lisp function is much more esoteric than, say, JavaScript, Python, or Java. If I ever find myself looking for help with something that's not in the standard library, I like to provide extra context, such as examples from the documentation.
I think the magic is often in how you prompt. The best written art I've gotten from Claude was after a very extended dialogue; eventually, in the same context window, I prompted for an essay, framed by a short poem. The results were, to me, beautiful, and extraordinarily relevant in a cathartic way. Elements of my own personal "meaning and substance" ended up getting synthesized into what I would certainly consider poetry.
If you ask an LLM to "Write me a poem", expect the equivalent of what others are calling analogous to hotel art: generic and inoffensive. However, if you inject your personalized soul and suffering into the context window, there's no reason not to expect the transformation of that soul into indistinguishably human-like prose.
I won't share the full content, because it was personalized to me and my moment.
I am quite curious though, how art without an author will grow into society.