I wrote the other day about my first attempt at building a Flutter mobile app with the help of AI.
Now, granted, I asked it to show me how to build the app step by step instead of building it itself. But from a certain point on it amounts to the same thing: after it builds something, you'll likely have requests for updates, or things won't work out quite the way you wanted, so you'd still go through rounds of feedback and rebuilds until, if you're lucky, you get something you like.
Today I got back to this little pet project to see if I could move it forward from where we left off on Sunday.
I hadn't closed the tab with the ChatGPT conversation, so it was still there... When I restarted the conversation, the LLM picked up well from where we left off.
It gives a good overview at every stage of development, describing where we are, the steps still to be done, and what should work once all the steps are completed. Here's an example of the summary it makes at the end of each stage:
But where ChatGPT messes up is in the details. I noticed that on Sunday as well as today.
For example, in a block of code it recommended I add for one of the steps, it mixed up the logic of operations, placing a line inside a block instead of outside it. When I pointed out the error, it recognized and fixed it (but it said "you have an error in your code," or something like that, not that it had provided the wrong code snippet; so they've definitely learned well from us how to dodge blame and pass it to others 😄).
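To illustrate the kind of misplacement I mean, here's a made-up Dart sketch (not the actual snippet from our conversation): a single statement that belongs before a loop ends up inside it, which silently changes the result.

```dart
// Hypothetical example of the misplacement: the accumulator reset
// belongs OUTSIDE the loop, but the suggested code put it inside.
int sumScores(List<int> scores) {
  var total = 0;
  for (final score in scores) {
    // total = 0;  // ❌ misplaced here, the sum is wiped on every
    //             //    iteration and only the last score survives
    total += score; // ✅ with the reset outside, this accumulates correctly
  }
  return total;
}
```

The code still compiles and runs either way, which is exactly why this class of error is easy to miss until you test the output.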
I also observed that it tends to mix up variable names from one step to a later one, so it lacks the consistency to keep the logical thread of the context (variable names in particular, I noticed) from one prompt to another.
It's true, I did stress-test it. I mean, it was trying to follow a plan, a structured way to present the steps necessary to build the app, and I kept interrupting: asking it to fix bugs, changing priorities between what it proposed and what I wanted to build next, or reminding the AI that a feature it was talking about didn't exist yet (that's another thing: at some point it started talking about a feature we hadn't built yet as if it were already done).
The deeper we go into this (and this is a simple app), the less useful the AI becomes, in my opinion, for developing it. It can help a coder speed up their work in the IDE or with research, and maybe it can even write pieces of software, but at least on a step-by-step basis the LLM seems to have trouble staying focused on the details, even though it nails the general context quite well.
One thing to consider is that the LLM doesn't actually see the full code, while I can. It might be a good refresher for the AI if I provided the full code as context from time to time. Otherwise, it's a monumental job for it to keep parsing the conversation, spot the changes we made to the code, put them together logically, and remember them correctly.