Where humans matter: Agentic Coding in Practice
What I Learned from AI-Assisted Software Development and How I Approach It Today
The discussion about agentic coding swings between two extremes: "AI will solve everything" on the one hand, "It's all just hype" on the other.
Both positions fall short. After months of intensive work with coding agents, I would like to share with you what I have learned in the process – and how my way of working has changed.
The current state of affairs: 7 theses
Before I get to my own specific experiences, here are 7 core theses written by Simon Wardley, which I have supplemented with my own perspective:
Development is not yet engineering. While testing has become a systematic discipline through practices such as TDD, development remains largely intuition-driven. There are patterns, but no consistent system. Agentic coding could be a catalyst for this transformation, if we approach it correctly.
Small, contextual tools beat monoliths with LLM on top. The prevailing approach of simply enriching existing systems with LLM capabilities does not exploit the potential. More effective are combinable tools with clear inputs, outputs, and specific application contexts.
LLMs are coherence machines, not truth machines. They optimize for plausibility, not correctness. This makes them valuable for drafting and exploration, but unreliable for final decisions without human validation.
Code is more than functionality – structure is the real decision. Architectural decisions manifest themselves in code. LLMs can generate functionality, but structural decisions require an understanding of the system.
The key question: Where do humans stand in the decision-making process? It’s not about whether AI is used, but where human judgment remains indispensable. This boundary must be drawn consciously.
Practices are still evolving. What is considered state of the art today may be obsolete tomorrow. Beware of hasty best practices.
Experimentation is fine, but with an awareness of the terrain. Speed without direction is just getting lost quickly.
My approach today
These theses align well with my experiences. However, theory is one thing and daily practice is another. Here's what works for me.
A deliberately modular setup
I don’t like working with fully integrated solutions. Not on principle, but because they don’t work optimally for my workflow.
My setup consists of three components:
An IDE, such as IntelliJ IDEA, which lets me keep track of the code. I can quickly check where everything is located. Git integration is extremely important here: it makes changes traceable and reversible. IntelliJ can do almost everything I need, including inspecting databases. Unfortunately, with power comes complexity. For smaller projects, I prefer the Zed editor because it's more streamlined and intuitive.
I use the terminal (preferably Ghostty) with my coding agent, which is currently mainly Claude Code. There, I give instructions, observe, and control.
I use an LLM chat window for conceptual work. At the beginning of a project, I use it to work through ideas and organize them in a document before writing code.
This three-way split is no coincidence. It corresponds to the principle of specialized tools: each component has its strengths, none tries to be everything.
I use other specialized tools here and there, such as the GitHub Desktop app. But at its core, these three tools are the ones I use.
Sub-agents as the key
Perhaps the most important lesson learned in recent months is that specialized sub-agents deliver significantly better results than general-purpose agents. The reason is simple—the tailored context makes all the difference.
Two examples from my experience:
Quality assurance: A sub-agent exclusively responsible for quality assurance checks against specified guidelines and documentation. It does not advise; it validates. This is essentially TDD thinking at the agent level — explicit standards instead of intuition.
UI design: I achieve significantly better results when designing user interfaces with a specialized design sub-agent. I specify the direction the design should take and which design principles apply. The agent generates designs within these guidelines instead of working in a vacuum.
In both cases, the lever is the specialized context and the focused system prompt of the sub-agent, not the general intelligence of the model.
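To make this concrete: in Claude Code, a sub-agent is defined as a Markdown file with YAML frontmatter under `.claude/agents/`. The file below is an illustrative sketch of a quality-assurance sub-agent; the name, description, and referenced guideline path are my own assumptions, not part of any shipped tooling.

```markdown
<!-- .claude/agents/qa-validator.md (hypothetical example) -->
---
name: qa-validator
description: Validates changed code against the project's coding guidelines. Use after implementation tasks, before committing.
tools: Read, Grep, Glob
---

You are a quality-assurance validator. You do not advise; you validate.

- Check every changed file against docs/coding-guidelines.md.
- Report each violation with file, line, and the guideline it breaks.
- Do not suggest features or refactorings beyond the guidelines.
- End with a clear verdict: PASS or FAIL.
```

The focused system prompt and the deliberately small tool set are the point: the agent receives exactly the context it needs for validation and nothing else.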
Validating coherence
Yes, LLM output has misled me before, and precisely because it sounded plausible. The coherence was there, but the truth was not.
My validation process is two-stage. First, I verify what I can myself. For everything else, I use specialized sub-agents with internet access that can verify facts. However, it’s crucial to note that ultimately, humans remain responsible. The sub-agents are tools, not decision-makers.
Hallucinations rarely come alone: where one thing is wrong, other things are often wrong too.
Keeping an eye on structure
When does generated code become problematic? Most obviously, when source files grow too large: too many lines, too much functionality crammed into individual functions.
My approach: I let almost everything be generated. If I want to make changes, I let the agent adapt and then check it. Experience shows that this is faster than writing it myself, unless the changes are minor restructuring or corrections. In that case, I intervene directly.
However, I am responsible for the structure. I decide when a file becomes too large, when functionality needs to be split up, and what the architecture and refactorings should look like. I usually define the architecture before coding begins and document it in Markdown files.
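Such an architecture document does not need to be elaborate. A minimal sketch of what I mean, with entirely hypothetical module names and limits:

```markdown
# Architecture

## Modules
- `api/` – HTTP layer only; no business logic.
- `core/` – domain logic; must not import from `api/` or `store/`.
- `store/` – persistence; accessed only through interfaces defined in `core/`.

## Rules for generated code
- No file longer than ~300 lines; split before it grows further.
- New dependencies require an entry in this document first.
```

Because the agent reads these files as part of its context, the structural decisions stay with me while the generation stays with the agent.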
The real problem is communication
Ultimately, humans must decide if what has been generated is good enough. Human judgment is indispensable because only humans can determine if they have received what they wanted.
Here lies an uncomfortable truth: Even with AI, the problem is often communication. The question is not “Can AI do that?” but “Can I articulate what I want?” This is not a new insight—anyone who has ever written requirements knows this. But with Agentic Coding, it becomes immediately apparent.
Not balance, but a pendulum
Is there a perfect balance between trying things out quickly and understanding what I’m doing? I don’t think so. It’s more like swinging back and forth.
I try out ideas to see whether they lead to reasonable results. I need to understand what I'm doing at the latest when I'm convinced of a direction and want to assess its long-term viability.
That’s more honest than any best practice. Practices are still evolving. Anyone who claims to have found the optimal workflow today will be working differently in six months.
The open question
The core architectural question of our time remains: Where do we place people in the decision-making process?
This is not a technical question. It is a question of organization, responsibility, and design. Every organization must answer it for itself—consciously, rather than implicitly through tool adoption.
As of today, my answer is: People decide on the structure, validate the results, and take responsibility. Agents generate, specialize, and accelerate. The boundary is not fixed; it shifts with every learning experience.
That is precisely what makes this such an interesting time.

