Monday, November 3, 2025

Agile and the age of AI

Someone asked me yesterday if we can still use agile methods in the age of AI.


My answer was an unequivocal yes.


But then I thought about it a bit.  The reality is that I’ve never worked in a “pure” agile shop.  I’ve run into folks who are true experts in this area, and when I listen to them I realize that most places I’ve worked have bent the “pure” agile model to make it work for them.  It’s gotten to the point that I don’t really know what “agile” stands for anymore.


As a longtime product manager for a variety of SaaS applications, I know what a “good” SaaS product looks like.  I’ve worked with very strong engineering teams and I’ve worked with very weak engineering teams.  I’ll say this again for those who are new to my blog series: the only differences between good teams and bad teams are quality and velocity.  Good teams ship with high quality at high velocity.  Thus the real test here is, “Can AI ship high-quality code at a high velocity?”


We know that AI can ship with high velocity.  We’ve all seen the demo.  How do we help AI ship with high quality?


Perhaps not surprisingly, all the things we’ve learned about running high-quality SaaS sites are still true.  Shocker.  Do these apply to AI?  As I discussed in a previous blog post, you can and should set up your AI tooling to use best practices, and you should use the same “begin with the end in mind” strategy you would use with regular software development.


Most modern SaaS teams are running CI/CD (Continuous Integration/Continuous Delivery). While not strictly speaking part of agile methodology, it is something that fits into the larger agile mindset of moving quickly and shipping interim builds.  Interestingly, CI/CD is not something that most AI coding tools support.  If you use a tool like Lovable or V0, you will simply get running code.  This is interesting, but running code is not a product.  SaaS applications are living things.  They change regularly.  This means that you need some way to inject code into your site on a regular basis without breaking things.
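To make this concrete: “some way to inject code without breaking things” usually means a pipeline that builds and tests every change before it can merge.  Here is a minimal sketch using GitHub Actions (the CI/CD tool I use later in this post), assuming a Python project with a pytest suite; the file would live at .github/workflows/ci.yml, and the names and commands are illustrative, not prescriptive:

name: ci
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4            # fetch the repository
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: pytest                          # the gate: nothing merges unless the suite passes

The details will differ for your stack; the point is that the gate exists and runs on every change, whether a human or an AI wrote it.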


Which leads us to…


Testing.


A well-built site has very strong test suites that prevent “regressions,” which is what we call it when something that used to work stops working.  If you’ve never worked on a SaaS product, you might be surprised to learn that it is super common for things that worked perfectly days, weeks or months before to just magically break.  Thus, your test suite.  What really confuses me is when I read folks online complaining that their AI tool made some mistake.  That the tool created a new bug or did something else wrong.  Why would I be surprised that AI coding tools create bugs?  Real programmers do this all the time.  It’s the reason we have things like commit checks and automated testing: to catch these inevitable errors.


AI, if anything, is even worse about regressions: it has a limited context window, so it forgets from one session to the next.  An AI programmer will simply do what you tell it.  If you tell it to fix a bug, that doesn’t imply to its robot brain, “Fix this bug without introducing new bugs.”  No, it just fixes the bug in the most expeditious way possible.  Even if you say things like “do not introduce regressions” in your prompt, it will do so anyway, out of ignorance.  This means that you need a super strong test path that ensures your coding AI isn’t breaking things every time it makes a change.  Again, most AI coding tools do not work this way.
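In practice, the only mitigation I’ve found is to make the test run a non-negotiable part of the loop.  For example, I’ll end a working prompt with an instruction along these lines (my own phrasing; adjust to taste):

Run the full test suite before you report that you are done.  If any test fails, fix it before moving on.  Do not skip, disable or delete tests to make them pass.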


I would argue that if you don’t have a way to safely ship code that has been carefully and thoroughly tested, you’re in trouble.  This means that any AI product team is going to have issues here unless they address them right up front.


Thankfully, the industry has been working on this problem for years.  There are tons of tools out there expressly designed to help you solve it.  In the following discussion, I’ll show examples of what I am using in my personal development projects.  This in no way implies that this is the ONLY way or even the CORRECT way.  This is just the way I’ve done it, and it seems to be working for me so far.


Phase 1:  Begin with the end in mind.


When I begin a new project, I always start with a PRD (product requirements document).  This may seem odd because PRDs are often thought of as “anti-agile,” but I’m a big fan of long-form communications.  When I worked at HashiCorp we always used PRDs because we were a remote-first company (this was before COVID, when that was rare).  When I am working with Claude Code, for example, I usually start with an empty directory and a PRD.  That’s it.
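The PRD doesn’t have to be elaborate.  Mine are Markdown files, and a rough outline like this is plenty to anchor the AI (this is my own habit, not any official template):

# PRD: <product name>

## Problem: who has this problem and why it matters
## Users: the primary personas and what they are trying to do
## MVP features: the short list of things version one must do
## Out of scope: what we are explicitly NOT building yet
## Success metrics: how we will know it is working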


Phase 2: Start from the bottom.


One thing that I see “vibe coding” style tools do all the time is build the UI first.  I get why they do that; it’s the thing the user wants to see.  But the problem is that a properly built SaaS application sits on top of a complex platform.  If that platform isn’t right, you will have all kinds of trouble later.  So, start with the framework and then add features to that framework.  Try a prompt like this:


I would like an architectural framework for building my SaaS business as described in the PRD.  This SaaS application will run on AWS and should make use of appropriate AWS PaaS services.  It will be deployed via Terraform and use GitHub Actions for CI/CD.  Propose a full tech stack for our MVP.  Reference the PRD in the docs and put your recommended architecture in a file called architecture.md in the docs folder.  Keep it simple; this is just the MVP and we can add advanced features later.


This prompt caused Claude Code to build me an architecture plan that I could then review and edit.  I made some changes, but the plan was pretty decent.  Notice that I’m an opinionated consumer.  I know Terraform because I used to work at HashiCorp.  I’ve also been working with AWS for over 15 years, so I’ll want to host my application there.  I chose GitHub Actions because it’s the easiest way for me to get a full CI/CD platform.  You can make different choices, but the point is that you need to make choices.  These choices will dictate your ability to ship features later, so they do matter.


Phase 3:  Make the rules.


After I had those two documents (the PRD and the architecture plan), I ran Claude Code’s /init command in that directory.  Claude read those two documents, realized that I was very early in a software project, and populated the CLAUDE.md file appropriately.  Again, I had to review this file and update it.  There were several things I wanted the project to do and, most importantly, several things that I DIDN’T want the project to do.  So, you do want to read that file carefully.
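To give a flavor of what belongs in there, here are a few rules of the kind I mean (illustrative, not my literal file):

## Rules
- Run the full test suite before declaring any task complete.
- Never commit directly to main; every change goes through a pull request.
- Do NOT add new third-party dependencies without asking first.
- Do NOT run terraform apply or touch production; a human does all deployments.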


Phase 4: Testing framework.


Before you allow the system to write any code, you need to have a testing framework in place.  Because AI tools tend to just make stuff up as they go, you really have no idea what they’re going to do.  Thus, you need some sort of testing in place that keeps them on track.  Especially if you’re not a full-time developer, you won’t be able to just read the code and tell whether it’s OK or not.  In my case, I haven’t written code professionally in over twenty years, so I’m not really qualified to review the AI’s code.  Again, testing.
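What does “testing in place” look like in practice?  Here’s a toy regression test in pytest, assuming a Python backend with a hypothetical app.accounts module (all the names here are made up for illustration):

import pytest
from app.accounts import create_account, AccountError  # hypothetical module for this example


def test_new_accounts_start_active():
    # The behavior that "used to work": new accounts begin in the active state.
    account = create_account(email="test@example.com", password="s3cret-pass")
    assert account.is_active


def test_duplicate_email_is_rejected():
    # Regression guard: signing up twice with the same email must fail loudly.
    create_account(email="dup@example.com", password="s3cret-pass")
    with pytest.raises(AccountError):
        create_account(email="dup@example.com", password="s3cret-pass")

Once tests like these exist, the AI can flail all it wants during a session; the suite catches the breakage before it ships.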


Phase 5: Planning


One handy pattern that I use with Claude is to ask it to plan first, and then work.  This means that when you are about to do something major, like creating the basic framework for your application, you want Claude to carefully plan it out first.  As I did for the architecture plan, I asked Claude to create a comprehensive step-by-step plan for building the application framework.  Then I took that plan, reviewed it and broke it down still further.  I took each phase of the plan and again asked Claude to give me a detailed plan for that section.  This iterative planning process seems to produce a better result for me than a single one-shot request.  Remember that Claude and other tools have a limited context window.  That means that smaller tasks are more likely to be completed successfully.

For my convenience, I usually ask Claude to write the plan into a Markdown file, and I keep all those plans.  Then, if something goes wrong later, I can say, “Open planning document X.  Compare current state to that document.  What is wrong with the current implementation?”  This forces Claude to think about what was supposed to happen and then reflect on what the current state is.
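The planning prompts themselves can be simple.  Something like this (the file paths are just my own convention, not anything Claude requires):

Create a comprehensive step-by-step plan for building the application framework described in docs/architecture.md.  Break it into numbered phases.  Write the plan to docs/plans/framework-plan.md.  Do not write any code yet.

And then one more pass for each phase:

Read docs/plans/framework-plan.md.  Expand Phase 2 into a detailed, task-level plan and write it to docs/plans/phase-2-plan.md.  Again, do not write any code yet.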


Phase 6:  Best Practice


After Claude has built a working prototype, you need to figure out if this thing is any good.  If you’re not an expert in things like Terraform, it may be hard to tell whether the implementation is decent or not.  One trick is to find high-quality best-practices documents.  For example, AWS has a great Terraform best-practices document.  I took that document, downloaded the PDF and put it into my docs directory.  I then asked Claude to read it and compare our implementation against it.  It came back with some very concrete things we could be doing better.
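The prompt for this is about as simple as it sounds (the file name is whatever you saved the PDF as):

Read docs/terraform-best-practices.pdf.  Compare our current Terraform code against the recommendations in that document.  List every place where we deviate, ordered by severity, and propose a fix for each.  Do not change any code yet.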


Phase 7: Iterate


As you learn, you loop back into planning mode above.  You’ve probably made mistakes and the AI has certainly made mistakes.  Just like any project I’ve worked on, AI tools require me to iterate frequently to refine and repair what has been done.


At this point, you have a real software project.  It’s not ready for prime time yet, but you have a basic structure that allows you to create features, test them and push them into production.  Your AI software project is now in a better place than half the software teams I’ve worked with in the past.  Congratulations.  

