Seven Months In: How Coding with AI Has Shifted

You open Claude Code and type a prompt. It reads files, writes code, and runs tests on its own, without checking in with you in between. You glance at the result and send the next prompt, and it keeps going. Through this whole process, you make about 20% of the execution decisions. The other 80% it figures out itself.

This isn’t a feeling. Anthropic measured the number across 400,000 real sessions (report, PDF).

From Fixing Bugs to Shipping Whole Things

Over seven months, debugging’s share of sessions dropped from 33% to 19%, nearly cut in half.

The freed-up share flowed to operations and to writing and analysis. Ops rose from 14% to 21%, up by half, while writing and data analysis roughly doubled, from about 10% to around 20%.
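The arithmetic behind these shifts is easy to check. Below is a minimal sketch that uses only the percentages quoted above; the helper name `relative_change` is an illustrative choice, and the writing and analysis endpoints are approximate in the source.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before


# Session shares quoted in the text, in percent: (start, end) of the seven months.
shares = {
    "debugging": (33, 19),
    "ops": (14, 21),
    "writing and analysis": (10, 20),  # both endpoints are approximate
}

for name, (before, after) in shares.items():
    change = relative_change(before, after)
    print(f"{name}: {before}% -> {after}% ({change:+.0%})")
```

A share that moves from 14% to 21% has gained seven percentage points, which is a 50% relative increase. Keeping those two readings apart matters whenever shares are compared.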

Overall, about 56% of sessions involve directly writing code, 17% do ops, 14% plan and explore systems, and 13% do analysis and writing. Over 40% of sessions no longer have writing code as their core purpose.

Debugging didn’t disappear; it just stopped occupying its own session and got embedded into larger workflows. Write some code, throw in the tests while you’re at it, update the deployment config, refresh the changelog. Debugging is now scattered across these actions, a step rather than the whole thing.

Figure: From fixing bugs to shipping whole things

You no longer open the AI to fix one bug; you open it to run something end to end. Give it one prompt, and it finds the files itself, rewrites the logic, gets the tests passing, writes the changelog, then deploys. You went from operator to gatekeeper.

During the same period, the total value of these tasks went up 27%. Building new features went up 43%, ops 34%, and fixes 32%. Every category rose; none got diluted. A session used to fix one function. Now a session reads through an entire module, rewrites the logic, passes the tests, then deploys. The work blocks handed to AI are getting bigger.

Figure: The work blocks handed to AI are getting more valuable

The savings aren’t at the keyboard. You might think AI saves typing time, but what it actually saves is entire stretches of workflow: back-and-forth confirmations, context switching, moving from editor to test to deployment. The savings are in the process, not in the keystrokes.

Who’s Using It

The people coding with AI are changing too. About 70% of sessions can be linked to an occupation, and the fastest-growing groups are management, sales, and legal.

Lawyers use it to find missing clauses in contracts, sales directors use it to adjust the backend SQL of a data dashboard, product managers use it to modify CI config files. They don’t write code for a living, but they use AI to write code that solves problems in their own domain.

The validation success rate is 34% for software-related professionals and 29% for other occupations, and the largest gap stays under seven percentage points. A programming background opens some distance, but not a fundamental gap.

The barrier to programming shifted from whether you can write code to whether you can articulate the problem you’re solving. You understand your domain better than AI does. A lawyer knows which clauses a contract is missing, an accountant knows the reconciliation rules. Explain what you know clearly, and AI will run it right.

Say What You Want

Knowing what you want is one thing; getting AI to execute it right is another. The data can measure that distance.

You send one prompt, and the AI runs 12 steps in one go and produces 3,200 words. Another person sends one prompt, and the AI runs only 5 steps and produces 600 words. Same tool, more than five times the output. The gap isn’t in how long the prompt is; it’s in whether you know what you want. If you know what you want, the AI doesn’t need to confirm back and forth and pushes through in one go. If you don’t, the AI has to keep guessing, and every wrong guess is a wasted round-trip.

In a typical session, humans handle about 70% of the planning decisions: what to do, which path to take, what counts as done. The agent handles about 80% of the execution decisions. The division of labor between the question-setter and the problem-solver has already formed.

How precise your judgment is here doesn’t depend on your resume; it depends on how deeply you understand the current problem. A senior engineer asking a Rust question for the first time is a novice in that session. An accountant who has never written Python, but explains the reconciliation rules and edge cases clearly, is an expert in that session. It’s not about your title; it’s about the depth of your understanding of the task in front of you.

This report does more than tell you where the industry is heading. It also marks out a few clear directions for practice.

Practice this. Start with one thing: before each prompt, decide what you’ll use to judge whether the agent did the right thing once it finishes. Add acceptance criteria. This is the easiest step to skip and the one with the highest payoff. Once acceptance criteria are set, the agent can judge intermediate results itself, without waiting for your confirmation at every step. Once you’re comfortable, add constraints, but don’t aim to list everything the first time. After one run, the agent will hit boundaries, and you add those back. This runs on a separate track from programming experience: you don’t need to be a ten-year veteran before you start practicing.
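One way to make the habit stick is to refuse to send a prompt that has no acceptance criteria. The sketch below is only an illustration of that idea: `TaskPrompt`, its fields, and the ledger example are hypothetical and not part of Claude Code or any other tool.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPrompt:
    """Hypothetical container for one prompt; not part of any tool's API."""
    goal: str
    acceptance_criteria: list[str]
    constraints: list[str] = field(default_factory=list)

    def render(self) -> str:
        # Refuse to produce a prompt that has no definition of "done".
        if not self.acceptance_criteria:
            raise ValueError("add at least one acceptance criterion before sending")
        lines = [f"Goal: {self.goal}", "", "Done when:"]
        lines += [f"- {c}" for c in self.acceptance_criteria]
        if self.constraints:
            lines += ["", "Constraints:"]
            lines += [f"- {c}" for c in self.constraints]
        return "\n".join(lines)


prompt = TaskPrompt(
    goal="Reconcile the March ledger export against the bank statement",
    acceptance_criteria=[
        "every unmatched row is listed with a reason",
        "the existing test suite still passes",
    ],
)
print(prompt.render())

# After a first run shows a boundary the agent crossed, add it back as a constraint.
prompt.constraints.append("do not modify the original export files")
print(prompt.render())
```

Constraints start empty on purpose, which matches the advice above: set the criteria first, then add boundaries as runs reveal them.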

The data has a clear inflection point here. Novice-level sessions reach about 15% validation success; intermediate and above reach 28% to 33%. From 15% to 28% is a big jump: the payoff from can’t-do to can-do is the largest. From 28% to 33% is a gentle slope, and there’s not much more to gain above that.

Figure: From novice to intermediate is a big jump, then it mostly flattens out

When It Goes Off Track

You can’t see the gap on smooth roads. The success rates of novices and intermediate-plus users are about the same when things go well. But in difficult sessions, novices achieve validation success only 4% of the time and experts 15%, nearly a fourfold difference.

The abandonment rate tells the flip side of the same story. 19% of novice sessions end in abandonment, with not a single line of code written, while other users see only 5% to 7%. The gap doesn’t show when things go smoothly; it shows when things get stuck. Experts win at getting unstuck, not at volume production.

Getting unstuck is something you can practice.

Practice this. When stuck, don’t ask why it’s not working; ask at which step things started going wrong. Go back to the conversation log and find where the deviation first appeared. Write a correction prompt at that step. Don’t delete the whole session and start over; just continue from that node. Keep the correct steps that already ran and fix only the part that went wrong. The cost of starting over is bigger than you’d think: the agent follows the existing conversation logic forward, the context is already aligned, and if you delete and start over you throw all that alignment away.

Take sessions where you normally get stuck and practice this specifically. Go back to the beginning, trace through sentence by sentence to find the deviation. Once found, write a correction prompt, continue from there. Practicing deviation-finding ten times is more useful than running a hundred smooth sessions.
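The deviation hunt can be pictured as a scan for the first step that fails a check you define. The sketch below assumes a session log reduced to a list of step descriptions. That format, the function names, and the `legacy/` rule are all hypothetical; real session logs look different, and the check is whatever "on track" means for your task.

```python
from typing import Callable, Optional, Sequence


def first_deviation(steps: Sequence[str],
                    is_on_track: Callable[[str], bool]) -> Optional[int]:
    """Return the index of the first step that fails the check, or None."""
    for index, step in enumerate(steps):
        if not is_on_track(step):
            return index
    return None


def resume_point(steps: Sequence[str], is_on_track: Callable[[str], bool]):
    """Split a log into the steps worth keeping and the step to correct."""
    index = first_deviation(steps, is_on_track)
    if index is None:
        return list(steps), None
    return list(steps[:index]), steps[index]


# Hypothetical log: the agent was only supposed to touch src/billing/.
log = [
    "read src/billing/invoice.py",
    "edit src/billing/invoice.py",
    "edit legacy/report.py",  # first step outside the intended module
    "run tests",
]
kept, to_correct = resume_point(log, lambda step: "legacy/" not in step)
print("keep:", kept)
print("write the correction prompt at:", to_correct)
```

Everything before the returned index is the aligned context worth keeping. The correction prompt targets only the step where the check first fails.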

Running It End to End

As mentioned earlier, debugging sessions were cut nearly in half while ops and writing grew. People are moving from single-shot bug fixes to letting AI run something end to end. A single prompt isn’t enough anymore. You need to know how to break something into steps, check each one after it’s done, then move forward.

Practice this. First break the task into three to five steps, each with its own output and verification method. Stop and look after each step, and continue only if it passes. Once you’re comfortable, turn the checkpoints into rules the AI can verify itself, and let it auto-advance when they pass. Orchestration goes from manual to semi-automatic.
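As a rough sketch of that progression, each step below carries its own output and its own verification, and a single flag switches between stopping for a human and auto-advancing. The `Step` class, `run_with_checkpoints`, and the stand-in lambdas are hypothetical illustrations, not an API of any agent tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[], object]         # produces this step's output
    verify: Callable[[object], bool]  # decides whether that output passes


def run_with_checkpoints(steps: list[Step],
                         auto_advance: bool = False,
                         confirm: Callable[[str], bool] = lambda name: True) -> list[str]:
    """Run steps in order and return the names of the steps that passed."""
    completed = []
    for step in steps:
        output = step.run()
        if not step.verify(output):
            break  # failed checkpoint: stop here and fix before moving on
        completed.append(step.name)
        if not auto_advance and not confirm(step.name):
            break  # manual mode: the human chose to pause after this step
    return completed


# Stand-in steps; real ones would call your tests, linters, or deploy scripts.
steps = [
    Step("rewrite logic", run=lambda: "patched", verify=lambda out: out == "patched"),
    Step("run tests", run=lambda: {"failed": 0}, verify=lambda out: out["failed"] == 0),
    Step("update changelog", run=lambda: "", verify=lambda out: bool(out)),  # empty output fails
    Step("deploy", run=lambda: "deployed", verify=lambda out: out == "deployed"),
]
print(run_with_checkpoints(steps, auto_advance=True))
```

In manual mode, `confirm` stands in for you looking at the result. Setting `auto_advance=True` is the semi-automatic stage, where the verification rules alone decide whether to continue.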

When users keep execution decisions in their own hands, Claude does about 8 operations per turn. Let the agent also take over the judgment of what to do next, and operations per turn jump to about 16. Letting the agent do more planning is feasible, provided it fully grasps upstream what you actually want. Upstream precision determines how far you can delegate downstream.

Many people can write prompts, few can design multi-step flows. This isn’t some new skill, but it’s the most skipped one.

This data comes from Anthropic’s analysis of 400,000 Claude Code sessions, covering approximately 235,000 users and spanning October 2025 to April 2026. All are interactive sessions; automated pipeline calls are excluded. An average session has about 4 turns and produces about 2,400 words. Orchestration isn’t a variable the report measures directly; it’s an engineering inference drawn from the shift in usage. “Good enough” refers to success signals observable within the session: whether a git commit passes CI, whether tests go green, whether the user confirms. It excludes long-term maintenance costs and business outcomes. Seven months is a directional early signal. During that time model versions iterated, product forms changed, and user proficiency rose, so these changes are the result of multiple forces acting together.

Say it clearly, pull it back, chain it together. These three directions are enough to practice for a while.