We've already talked, but let me try to address what seems like a lingering feeling that you haven't gotten real answers.
Truth is, this has been a learning process for us all with these tools, but it needs to be understood -- especially going in -- that these models excel at translation tasks and constrained problem spaces but can struggle with generating cohesive, large-scale code without specific hand-holding.
This is generally what I do:
1. Start with the "Whole" Picture: Models often work best when they know the final goal and the prompter has worked backwards from it. Think ontologically: define the problem as if you’re describing it to a junior dev colleague who only understands outcomes, not methods. Instead of just prompting with specs, explain the end-state you want (even simple features like error handling or specific libraries). If you have ANY known method or requirement for what the end-state should include, write it out clearly.
2. Break Down the Process: Models handle complexity better if it's broken down into micro-tasks. Instead of expecting it to design an entire feature, ask for components step-by-step, integrating each output with the rest manually. There is a very decent chance that you will have to do this across multiple new chats: after 3-5 iterations, the AI will most likely crash and burn. At that point, you open a new chat, paste in the whole working codebase, and pick up from where you left off in the last chat. You have to do this A LOT.
3. Iterative Refinement: When the model generates code, go over it closely. Check for errors, then use targeted prompts to fix specific issues rather than requesting whole rewrites. Point out exact issues and ask for specific fixes; this prevents the model from “looping” through similar incorrect solutions.
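To make steps 1 and 2 concrete, here's roughly how I structure a run of micro-task prompts. This is just a sketch: the goal, the component list, and every name in it are made-up examples, not a real project or a real tool's API. Each prompt restates the end-state (step 1) but asks for only one component (step 2):

```python
# Sketch of the "break it down" workflow: instead of one big prompt,
# build a sequence of small, self-contained prompts, each carrying the
# end-state goal plus only the component currently being asked for.
# GOAL and COMPONENTS are illustrative placeholders.

GOAL = "A CLI tool that syncs a local folder to S3, with retry logic."

COMPONENTS = [
    "argument parsing (argparse, flags: --bucket, --dry-run)",
    "file diffing (local mtime vs remote checksum)",
    "upload loop with exponential-backoff retries",
]

def micro_task_prompt(goal: str, component: str, done: list) -> str:
    """Build one small prompt that restates the goal (contextual
    repetition) and asks for exactly one component."""
    context = "\n".join(f"- already implemented: {d}" for d in done)
    return (
        f"End goal: {goal}\n"
        f"{context}\n"
        f"Now write ONLY this component, in full, no omissions:\n"
        f"- {component}\n"
    )

# Walk the component list, accumulating what is already done so each
# later prompt carries that context forward.
prompts = []
done = []
for comp in COMPONENTS:
    prompts.append(micro_task_prompt(GOAL, comp, done))
    done.append(comp)
```

You'd paste each generated prompt into the chat in turn, integrating each reply into your working tree before sending the next.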
Some Hacks I Use As Well:
1. Contextual Repetition: Reinforce key components (e.g., function structure, file organization) to avoid losing them in longer prompts.
2. Use “As if” Phrasing: Prompt the model to act “as if” it’s coding for a hypothetical person (e.g., a junior dev). It’s surprisingly effective at generating more thoughtful code with this type of frame.
3. Ask for Questions: Have the model ask you clarifying questions if it’s “unsure.” This can uncover key details you may not have thought to include.
4. Remind It What It Is Doing: Sounds counter-productive, but almost all of my code chats end with a description of exactly what I expect from the AI, refined over the various stunts and "shortcuts" it has pulled in the years I've used it. I generally say "Write the code in full with no omissions, comment-block placeholders, or 'GO-HERE' substitutions" (directly because the AI has pulled "/* rest of code goes here */" on me several times), and "write the code in multiple answers if you must, pausing at the generic character limit and resuming when I say 'continue' in the next message" (because I've had "errors" from code generation in the past when the chat reply timed out).
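The "continue" trick in hack #4 is really just manual pagination, and the stitching on your side amounts to something like this sketch. The `ask` function is a stand-in for whatever chat interface you use (it's faked here with canned chunks so the stitching logic is self-contained); nothing here is a real provider API:

```python
# Sketch of stitching a long reply delivered across several "continue"
# turns. `ask` is a placeholder for a chat call; here it is faked with
# canned chunks, where None stands in for "the model finished".

CHUNKS = ["def add(a, b):\n", "    return a + b\n", None]

def make_fake_ask(chunks):
    """Return an ask() that replays canned chunks, one per call."""
    it = iter(chunks)
    def ask(prompt: str):
        return next(it)
    return ask

def collect_full_code(ask, first_prompt: str, max_turns: int = 10) -> str:
    """Send the initial prompt, then keep sending 'continue' until the
    model signals it is done (None here), concatenating the parts."""
    parts = []
    reply = ask(first_prompt)
    for _ in range(max_turns):
        if reply is None:
            break
        parts.append(reply)
        reply = ask("continue")
    return "".join(parts)
```

With a real chat you do the same thing by hand: copy each partial reply out, say "continue", and concatenate until the code is whole.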
It's a labor of love and it's things you learn over time, and it won't happen if you don't put the work in.
-------------------------
I wrote all of this haphazardly in a Google Doc. GPT-4 organized it for me cleanly.
Beware of trying to get the LLM to output exactly the code you want. You get points for checking code into git and sending PRs, not for the tokens the LLM outputs. If it's being stupid and going in circles, or you know from experience that the particular LLM will be (they vary greatly in quality), you can just copy the code out (if you're not using some sort of AI IDE), fix it, then paste that in and/or commit it.
Some may ask: if you have to do that, then why use an LLM in the first place? It's good at taking small/medium conceptual tasks and breaking them down, and it's also a faster typist than I am. Even though I have to polish its output, I find it easier to get things done because I can focus more on the higher-level (customer) issues while the LLM gets started on the lower-level details of implementing/fixing things.
Thank you! This information is the kind of information for which I’ve been searching.
That said, I feel like there’s a mutual-exclusivity problem between ‘Start with the "Whole" Picture’ and ‘Break Down the Process’.
For example, how does this from your first suggestion:
> explain the end-state you want (even simple features like error handling or specific libraries). If you have ANY known method or requirement for what the end-state should include, write it out clearly.
not contradict this from your second suggestion:
> Instead of expecting it to design an entire feature, ask for components step-by-step
Additionally, you said:
> There is a very decent chance that you will have to do this across multiple new chats: after 3-5 iterations, the AI will most likely crash and burn. At that point, you open a new chat, paste in the whole working codebase, and pick up from where you left off in the last chat. You have to do this A LOT.
But IME, by the time the model chokes on one chat, the codebase is already large enough that pasting the whole thing into another chat typically means hitting context-window limits. Perhaps, for the kinds of projects I typically work on, a good RAG tool would offer better results?
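What I have in mind doesn't even need to be full embeddings-based RAG; even a crude keyword-overlap pass could pick which files to paste instead of the whole codebase. A toy sketch of the idea (the filenames and task are invented, and a real RAG tool would use embeddings rather than token counting):

```python
# A poor-man's retrieval step, as an alternative to pasting the whole
# codebase: score each file by token overlap with the task description
# and paste only the top few into the new chat. Purely illustrative.

import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Bag-of-words token counts: identifiers and words, lowercased."""
    return Counter(re.findall(r"[a-zA-Z_]\w+", text.lower()))

def top_files(task: str, files: dict, k: int = 3) -> list:
    """Return the k filenames whose contents share the most tokens
    with the task description."""
    query = tokenize(task)
    def score(item):
        name, body = item
        doc = tokenize(body)
        return sum(min(query[t], doc[t]) for t in query)
    ranked = sorted(files.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]
```

Something like this would at least keep the pasted context proportional to the task rather than to the whole project.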
To be clear, right now I’m only discussing my difficulties with the chatbots offered by the model providers—which, for me, is mostly Claude but also a bit of ChatGPT; my experience with Copilot is outdated so it probably deserves another look, and I’ve not yet tried some of the third-party, code-centric apps like aider or cursor that have previously been suggested, though I will soon.
As for your recommended hacks, these look to be helpful; thank you! The only part I find odd is your inclusion of “Write the code in full with no omissions or code comment-blocks or 'GO-HERE' substitutions”; I myself feel like I get far better results when I ask the model to 1) write full code for the methods that are likely to be the kinds of generic CS logic that a junior would know, 2) write stubs for the business logic, then 3) implement the more complex business logic myself, manually. IOW—and IME—they’re really good at writing boilerplate and generating or reasoning about junior-level CS logic. That’s indeed helpful to me, but it’s a far cry from the kinds of “ChatGPT can write entire apps with minimal effort” hype I keep seeing, and it’s only marginally better, IME at least, than what I’ve been able to do with the inline-completion and automatic-boilerplate features that have been included in the IDEs I’ve used for over a decade.
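In code, the split I mean looks something like this sketch (both functions are invented examples, not from a real project): the generic helper gets written in full, while the business-logic stub is deliberately left for a human:

```python
# Sketch of the full-code-vs-stub split: let the model fully write the
# generic, junior-level helpers, but have it emit stubs for the
# business logic so a human fills those in. Names are illustrative.

def chunked(seq, size):
    """Generic CS logic: the model can safely write this in full."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def apply_discount(order_total: float, customer_tier: str) -> float:
    """Business logic: ask the model for a stub only, then implement
    the real domain rules yourself."""
    raise NotImplementedError("domain pricing rules go here")
```

The stub still buys you the signature, docstring, and call sites wired up; only the part the model is likely to get wrong is left to me.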
> It's a labor of love and it's things you learn over time, and it won't happen if you don't put the work in.
Indeed. I do love playing with this stuff and learning more. Thank you again for sharing your knowledge!
> I wrote all of this haphazardly in a Google Doc. GPT-4 organized it for me cleanly.
I am regularly impressed at how well these models behave when asked to summarize a document or even when asked to expand a set of my notes into something more coherent; it’s truly remarkable!