The combination of Flutter + Claude Code makes cross-platform app development really, really fast. I've been impressed with how well Claude handles prompts like, "This list should expand on the web, but not on iOS." I then ask Claude to run both a web instance and an iOS simulator instance, so I can usability-test the two in tandem.
I recently (as in, last night) added WebSockets to my backend, push notifications to my iOS frontend, and a notification banner to the webapp. It all kinda just works. Biggest issues have been version-matching across Django/Gunicorn/Amazon Linux images.
You don't, and as long as you're comfortable with that, you just keep prompting to dig yourself out of holes.
The problem is that unless you're ready to waste hours prompting to get something exactly how you want it, instead of spending those few minutes doing it yourself, you start to get complacent about whatever the LLM generated for you.
IMO it feels like being handicapped: with the hundreds or thousands of lines of code that have already been generated, there's little you can do, and you run into the sunk cost fallacy really fast. No matter what people say about building "hundreds of versions", you're spending so much time prompting or spec-writing that it might not feel worth getting things exactly right if it means starting all over again.
It's not as if things are instantaneous with the LLM, either; it takes upwards of 20-30 minutes to "Ralph" through all of your requirements and build.
If you start some of it yourself first and have an idea about where things are supposed to go, it really helps your thinking process too; just letting it vibe fully in an empty directory leads to eventual sadness.
This. I find bringing in the LLM when there is a good structure already in place works better. I also use it sparingly, asking it for very specific things: write me tests for this, create me a function that does this or that, review this, extend that, etc.
They are pretty good at "scaffold this for me" and you adapt as a second step.
That is one of the three uses I give them.
The other two are: infra scripting, which tends to be isolated: "generate a Python script to deploy blabla with parameters for blabla...". That saves time.
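As a sketch of what those isolated infra scripts tend to look like (every name and flag here is a hypothetical placeholder, not a real deployment):

```python
import argparse
import subprocess

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parameters; a real script would mirror your actual infra.
    p = argparse.ArgumentParser(description="Deploy a service to a host")
    p.add_argument("service", help="name of the service to deploy")
    p.add_argument("--host", required=True, help="target host")
    p.add_argument("--version", default="latest", help="image tag to deploy")
    p.add_argument("--dry-run", action="store_true", help="print, don't run")
    return p

def deploy(args: argparse.Namespace) -> list[str]:
    # The kind of one-shot command an LLM-generated script typically wraps.
    cmd = ["ssh", args.host, "docker", "pull", f"{args.service}:{args.version}"]
    if args.dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd
```

The value is exactly that it's isolated: the script has one job, so a wrong assumption is cheap to spot and fix.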
The third use is exploring alternative solutions, high level, not with code generation, to stimulate my thinking faster and explore solutions. A "better" and more reasoned search engine. But sometimes it also outputs incorrect information, so be careful there and verify. At least it is successful at the "drop me ideas" part.
For big systems, it generates a lot of code and I have no idea what I end up with; when I get bugs, it is going to be more difficult to modify, understand, and track (I don't even know the code, because it outputs too much of it!).
Or for designing a system from zero: code-wise it is not good enough, IMHO.
Oh, a fourth thing it does well is code review, that one yes. As long as you are the expert and can quickly discard the BS feedback, there is always something valuable.
And maybe finding bugs in a piece of code.
Definitely, for designing from scratch it is not reliable.
Yes, I agree on all points. Also, I keep finding new use cases all the time. So, going all poetic; part of me laments the death of my craft, and the other rejoices at the superpowers of what rises from the ashes...
This reminds me of how some famous artists would paint via their studios, wherein assistants put most of the paint on the canvas, under the direction of (or modeled off an example by) the named artist, who added the signature and embellishments.
LLMs would not be popular if the "spending those few minutes doing it yourself" part was true. In actuality it can be hours, days, or weeks depending on the feature and your pickiness. Everyone acts as if they are the greatest developer and these tools are subpar; the truth is that most developers are just average, just like most drivers are average but think of themselves as above average. All of a sudden everyone that was piecing together code off of Stack Overflow with no idea how to build the damn thing is actually someone who can understand large codebases and find bugs and write flawless code? Give me a break.
To the degree that those same people are now writing 10-100x more code...that is scary, but the doom and gloom is pretty tiring.
The SO copy-pasting is actually quite accurate. The same folks are now just blindly generating code. That's why most software in the world is shit, and will continue to be in the future. There might just be more of it.
There will most definitely be much more of it; maybe the machines are doing this on purpose to increase dependency on them, haha. Ultimately, wagging a finger at someone will have no outcome; allowing someone to make real mistakes while vibe coding will be a much better learning experience. Someone that drops a prod database using Claude will have a very lasting memory of that (not saying that should be the goal; critical thinking obviously matters A LOT). Cars didn't used to have seatbelts, a lot of people died, then they got seatbelts, and now the world is a better place.
The problem is maintenance and understanding code in the presence of bugs.
It looks very productive at first sight but when you start to find problems it is going to be a lot of fun on a production system.
Because basically you cannot study every line of the output the LLM throws at you if what you want is speed.
Which leaves reliability compromised.
Also, sometimes LLMs throw in a lot of extra and unnecessary code, making things more baroque than if you had sat down and thought about the problem a bit.
Yes, maybe you can deliver code faster with LLMs. But is it going to be good enough for maintenance and bug fixing?
I never said anything against using LLMs. You're projecting.
Any engineer worth their salt will always try to avoid adding code. Any amount of code you add to a system, whether it's written by you or an all-knowing AI, is a liability. If you spend the majority of your work day writing code, it's understandable to want to rely heavily on LLMs.
Where I'd like people to draw a line is at not knowing at all what the X thousand lines of code are doing.
In my career, I have never been in a situation where my problems could be a solved by piecing together code from SO. When I say "spend those few minutes doing it yourself" I am specifically talking about UI, but it does apply to other situations too.
For instance, say you had to change your UI layout to something specific. You could try to collect screenshots and articulate what you need to see changed. If you weren't clear enough, that cycle of prompting with the AI would waste your time; you could've just made the change yourself.
There are many instances where the latter option is going to be faster and more accurate. This would only be possible if you had some idea of your code base.
When you've let an agent take full control of your codebase you will have to sink time into understanding it. Since clearly everyone is too busy for that you get stuck in a loop, the only way to make those "last 10%" changes is *only* via the agent.
I didn't say anything about your beliefs in AI, my statement was general. You're projecting.
It is still possible to write code with AI AND educate yourself on what the codebase architecture is. Even better, you can educate yourself on good software engineering and architecture and build that into making better specs. You can understand what the code is doing by having good tests, observability, and actually seeing it work. But if you're after peeping what every character is doing, I am not going to stop you!
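As one concrete way to "understand what the code is doing by having good tests": a characterization test pins down a generated helper's observable behaviour without reading every character of it. The `slugify` function below is a made-up stand-in for LLM output, not anyone's actual code:

```python
import re

def slugify(title: str) -> str:
    # Pretend this body came from the LLM; we verify behaviour, not style.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_slugify_behaviour():
    # The spec we actually care about, written down as executable assertions.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  Spaces  and  MORE  ") == "spaces-and-more"
```

If the agent later regenerates the function, the test still tells you whether the behaviour you depend on survived.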
LLMs are not reliable to fix bugs and when they do they often introduce new ones in my experience.
In fact, in my experience humans do targeted bug fixing reasonably well, and they are better than LLMs currently are at knowing they did not change the structure of the surrounding code.
I do not find them reliable enough TBH to leave such delicate tasks in a production system in their hands.
Yeah… I wonder how you write complex software without something that looks like a spec, other than slowly. It seems like the prep work is unavoidable, and this contrarian opinion you are offering is just that.
Writing the spec is becoming the default for pet projects. Which would be a good thing if the spec wasn't also partly written by an LLM.
You can already see people running into these issues: they have a spec in mind, they work on the spec with an LLM, and the spec has stuff added to it that wasn't what they were expecting.
And again, I am not against LLMs, but I can be against how they're being used. You write some stuff down, maybe have the LLM scaffold some skeleton for you. You could even discuss with the LLM what the classes should be named, what they should do, etc. Just be in the process so the entire codebase isn't 200% foreign to you by the time it's done.
Also, I am no one's mother; everyone has free will and can do whatever they'd like. If you don't think you have better things to do than produce 3-5 pieces of abandonware every weekend, then good for you.
Same as any other software team? You keep an eye on all PRs, dive deep on areas you know to be sensitive, and in general mostly trust till there's a bug or it's proven itself to need more thorough review.
I've only ever joined teams with large, old codebases where most of the code was written by people who haven't been at the company in years, and my coworkers commit giant changes that would take me a while to understand, so genAI feels pretty standard to me.
I love using AI and find it greatly increases my productivity, but the dirty little secret is that you have to actually read what it writes. Both because it often makes mistakes both large and small that need to be corrected (or things that even if not outright wrong, do not match the style/architecture of the project), and because you have to be able to understand it for future maintenance. One other thing I've noticed through the years is that a surprising number of developers are "write only". Reading someone else's code and working out what it's doing and why is its own skillset. I am definitely concerned that the conflux of these two things is going to create a junk code mountain in the very near future. Humans willing to untangle it might find themselves in high demand.
And, as well as noticing actual semantic issues, it's worth noting where they've mixed up abstractions or just allowed a file to grow to an unsustainable size and needs refactoring. You can ask the AI agent to do the refactoring, with some guidance (e.g. split up this file into three files named x, y, z; put this sort of thing in x, ...). This helps you as a human to understand their changes, and also helps the AI. It also makes you feel in control of the overall code design, even though you're no longer writing all the details.
They'll often need a little final tuning afterwards (either by hand or ask the AI again) e.g. move this flag from x to y. As is often the case, it's just like you have an enthusiastic and very fast but quite junior dev working for you.
I loved learning Computer Engineering in college because it de-mystified the black box that was the PC I used growing up. I learned how it worked holistically, from physics to logic gates to processing units to kernels/operating systems to networking/applications.
It's sad to think we may be going backwards and introducing more black boxes, our own apps.
I personally don't "hate" LLMs but I see the pattern of their usage as slightly alarming; but at the same time I see the appeal of it.
Offloading your thinking, typing all the garbled thoughts in your head with respect to a problem in a prompt and getting a coherent, tailored solution in almost an instant. A superpowered crutch that helps you coast through tiring work.
That crutch soon transforms into dependence and before you know it you start saying things like "Once you vibe code, you don't look at the code".
I think a lot of people, regardless of whether they vibe code or not, are going to be replaced by a cheaper solution. A lot of software that would've required programmers before can now be created by tech-savvy employees in their respective fields. Sure it'll suck, but it's not like that matters for a lot of software. Software Engineering and Computer Science aren't going away, but I suspect a lot of programming is.
I've been around for a while. The closest we ever got was probably RPA. This time it's different. In my organisation we have non-programmers writing software that brings them business value on quite a large scale. Right now it's mainly through the chat framework we provide them so that they aren't just spamming data into chatGPT or similar. A couple of them figured out how to work the API and set up their own agents though.
Most of it is rather terrible, but a lot of the times it really doesn't matter. At least most of it scales better than Excel, and for the most part they can debug/fix their issues with more prompts. The stuff that turns out to matter eventually makes it to my team, and then it usually gets rewritten from scratch.
I think you underestimate how easy it is to get something to work well enough with AI.
I assume he’s mostly joking but… how often do you look at the assembly of your code?
To the AI optimist, the idea of reading code line by line will seem as antiquated as perusing CPU registers line by line: something you do when needed, but typically you can just trust your tooling to do the right thing.
I wouldn’t say I am in that camp, but that’s one thought on the matter. That natural language becomes “the code” and the actual code becomes “machine language”.
And you could say that the difference is that high-level languages are deterministically transformed down, but in practice the compiler is so complex you'd have no idea what it's doing and most people don't look at the machine code anyway. You may as well take a look at the LLM's prompt and make an assumption of the high-level code that it spits out.
Honestly, I'm not as strongly opinionated now as I was a few weeks ago. I'm in a huge questioning phase about my work/craft/hobby.
I've worked places where junior made bad code that was accepted because the QA tests were ok.
I even had a situation in production where we had memory leaks because nobody tried to use it for more than 20 minutes when we knew that the app is used 24/7.
We aim for 99% quality when no-one wants it.
No-one wants to pay for it.
GitHub is down to one 9 and I haven't heard of them losing many clients; people just cope.
We've reached a level where we have so much ram that we find garbage collection and immutability normal, even desired.
We are wasting bandwidth by using JSON instead of binary because it's easier to read when you have to debug, and because it's easier to debug while running than to think before coding.
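The trade-off is easy to measure. A rough sketch comparing the same record serialized as JSON versus a packed binary layout (the field names and types here are arbitrary illustrations):

```python
import json
import struct

# One hypothetical telemetry sample: id, unix timestamp, reading.
record = {"id": 12345, "ts": 1700000000, "value": 21.5}

as_json = json.dumps(record).encode()  # human-readable, self-describing
# Little-endian: two unsigned 32-bit ints plus a 64-bit float = 16 bytes.
as_binary = struct.pack("<IId", record["id"], record["ts"], record["value"])

print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes packed")
```

The binary form is a fraction of the size, but you give up being able to eyeball it on the wire, which is exactly the debugging convenience being traded away.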
I built a system that can hold 40,000 concurrent users with hardly 2 GiB of RAM, a bit of bandwidth (with a 300 Mbps connection it works great) using Capnproto and that I am scaling horizontally.
A server with 6 cores can hold at least 3 of these services. Now think of customers. How much are you going to save on operations? Loooooots! For backends, efficiency and quality are still critical metrics, especially when operations become so cheap (tens of dollars per month).
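Back-of-envelope, taking those numbers at face value, the memory budget works out to roughly 50 KiB of state per connected user:

```python
users = 40_000
ram = 2 * 1024**3                 # 2 GiB in bytes
per_user_kib = ram / users / 1024
print(f"~{per_user_kib:.0f} KiB of state per concurrent user")
```

That's the kind of figure you only hit by thinking about the problem up front, which is the point being made.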
The trick is to separate your codebase into "code I care about that I give the AI a fixed API and rarely let the AI touch" and "throwaway code I don't give one iota of damn about and I let the AI regenerate--sometimes completely from scratch".
For me, GUI and Web code falls into "throwaway". I'm trying to do something else and the GUI code development is mostly in my way. GUI (especially phone) and Web programming knowledge has a half-life measured in months and, since I don't track them, my knowledge is always out-of-date. Any GUI framework is likely to have a paroxysm and completely rewrite itself in between points when I look at it, and an LLM will almost always beat me at that conversion. Generating my GUI by creating an English description and letting an AI convert that to "GUI Flavour of the Day(tm)" is my least friction path.
This should be completely unsurprising to everybody. GUI programming is such a pain in the ass that we have collectively adopted things like Electron and TUIs. The fact that most programmers hate GUI programming and will embrace anything to avoid that unwelcome task makes it a pretty obvious application for AI.
Man, people say this kind of thing, and I go… really? …because I use Claude code, and an iOS MCP server (1) and hot damn I would not describe the experience as “just works”.
What MCP and model are you using to automate the testing on your device and do automated QA with to, eg. verify your native device notifications are working?
My experience is that Claude is great at writing code, but really terrible at verifying that it works.
What are you using? Not prompts; like, servers or tools or whatever, since obviously Claude doesn’t support this at all out of the box.
Claude set up my whole backend on AWS. That includes a load balancer, web server, email server, three application servers, and a bastion server to connect to their VPN.
It configured everything by writing an AWS Terraform file. Stored all secrets in AWS as well.
Everything I do is on the command line with Claude running in Visual Studio Code. I have a lot of MacOS X / Ubuntu Linux command line experience. Watching Claude work is like watching myself working. It blew my mind the first time it connected through the bastion to individual AWS instances to run scripts and check their logs.
So yeah, the same Claude Code instance that configured the backend is running inside a terminal in VS Code where I’m developing the frontend. Backend is Django/Python. Frontend is Flutter/Dart. Claude set up the WebSocket in Django/Gunicorn and the WebSocket in Flutter.
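For context on what "setting up the WebSocket" involves on either end: the connection starts life as a plain HTTP upgrade, and per RFC 6455 the server proves it understood the upgrade by hashing the client's key with a fixed GUID. A minimal sketch of that handshake step:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the WebSocket handshake.
GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(client_key: str) -> str:
    # Sec-WebSocket-Accept = base64(SHA-1(Sec-WebSocket-Key + GUID))
    digest = hashlib.sha1((client_key + GUID).encode()).digest()
    return base64.b64encode(digest).decode()

# Worked example straight from RFC 6455:
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The framework layers on each side (Channels on Django, Flutter's WebSocket support on the client) handle this handshake and the framing after it, which is part of why the wiring can feel like it "just works".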
It also walked me through the generation of keys to configure push notifications on iOS. You have to know something about public/private key security, but that amounts to just generating the files in the right formats (PEM vs P12).
I'd like to know too! I feel like many people are playing a whole different AI game than me, and I don't think I've written a single line of code since December (team experiment to optimize the vibe coding process)