The TypeScript vs Python Debate Is Asking the Wrong Question

The debate on Twitter and in Chinese tech groups has circled one question lately: which language should AI write code in? One camp says TypeScript, because type constraints keep AI from going rogue. Another says Python, because that’s where AI has the most training data. A third says Rust, because the compiler is so strict that “if it compiles, it’s basically bug-free,” effectively giving AI a built-in code reviewer.

These arguments share a common premise: language choice matters because it affects how well AI “knows” a language. Whichever language scores highest on benchmarks wins.

The premise is wrong. Or rather, it’s asking the wrong question.

Training Data Matters, But Can’t Explain Everything

Let’s start with the numbers. Python accounts for about 17% of GitHub pushes, roughly five percentage points ahead of second-place Java. AI performs best on Python too — in Multi-SWE-bench, a benchmark spanning eight languages and 2,132 real issues, Python resolution rates are 36-48%, while TypeScript sits at 2-7%, and Go and Rust hover at 3-7%. That’s a tenfold gap.

But you can’t pin all of this on training data alone. Java has 12% of GitHub pushes, so Python’s share is less than twice Java’s, yet AI resolves Java issues at only one-third to one-half the Python rate. Go has 1.4 times more pushes than TypeScript, yet the two perform similarly. Rust, with just 1.9% of GitHub activity, less than one-fifth of Go’s, still matches Go’s benchmark performance.

As a rough estimate, training data explains 60 to 70 percent of the variance. The remaining 30-plus percent comes from language design and ecosystem factors.

MultiPL-E, an early study from 2022 that translated Python’s HumanEval benchmark into 18 languages, reached a counterintuitive conclusion: the presence or absence of static typing has almost no effect on one-shot code generation accuracy. JavaScript (dynamically typed) scored slightly above Python in pass@1. Rust (the strictest static typing) scored significantly below. The paper’s own words: “static type-checking neither helps nor hinders code generation model performance.”

But this finding has an important limitation: it measures one-shot generation. Given a function signature and a docstring, how often does the model nail it on the first try? In real development, AI doesn’t hand in its answer and walk away. It writes, compiles, reads errors, fixes, recompiles, runs tests. It’s a loop.

And the speed of that loop is where languages actually diverge.

Feedback Loop Speed Matters More Than First-Shot Accuracy

Wes McKinney, creator of pandas, described a concrete shift in a February 2026 podcast. He used to build everything in Python because it reads comfortably and writes fast. After a year-plus of using Claude Code, he found the rules had changed:

“The bottleneck has shifted: test suite execution speed and compile times now matter more than how enjoyable a language is to write in. I have been building new projects in Go because the agentic loop — prompt, generate, test, iterate — runs faster in compiled languages.”

Agentic loop. That’s his term, and it’s the key to the whole question. The atomic unit of AI programming isn’t a single output. It’s a turn: you say what you want, the AI writes code, the AI runs compilation or tests to see if it works, reads the errors, fixes them, goes again. Every second you shave off that loop gives the AI one more attempt within its time and token budget. More attempts mean better results.

Under this model, some of a language’s traditional virtues become liabilities. Python test runs can be slow to start: importing dependencies, parsing code, and initializing fixtures can add up to several seconds. Go’s go test ./... often finishes in about a second on a small project, and it caches results for packages that have not changed. Rust’s compiler gets called “slow,” but an incremental type-check after a small edit is usually far quicker than a full build. TypeScript’s tsc --noEmit is fast too.

But speed is only one dimension of feedback. The other is signal quality.

Python’s runtime errors often land far from their root cause. An AttributeError might mean an upstream function returned the wrong type, but the stack trace points to the call site, not the contamination source. The AI agent has to walk back up the call chain layer by layer to find the problem. Every step risks guessing wrong, fixing wrong, introducing new bugs.

Rust’s compiler errors point straight at the root cause. You used a value after it was moved, and the compiler tells you: line 47 moved it, line 52 tried to use it again, and here’s how to fix it. For an AI agent, this is precision-guided error signaling. No guessing. Go fix what the compiler points at.

This is why some in Chinese tech groups have been saying “the AI-era programming language should be Rust.” The argument is straightforward: languages used to be designed for human friendliness — easy to learn, easy to read, hard to shoot yourself in the foot. With AI doing the writing, those things don’t matter anymore. You just tell AI what you want, it wrestles with the compiler for a few rounds, and whatever emerges that passes compilation is already at a higher quality baseline than Python can guarantee.

But Rust has a problem. “Precise” compiler errors don’t equal “easy to fix.” RustAssistant, a Microsoft Research tool presented at ICSE 2025, was built specifically to let LLMs fix Rust compilation errors. Its peak accuracy is 74%. Roughly one in four errors goes unfixed. If a task runs into five such errors and each fix succeeds independently, the chance that all five get fixed is 0.74 to the fifth power, roughly 22%. And that’s with a purpose-built tool.

Go: The Accidental AI-Optimal Language

Armin Ronacher, creator of Flask and a veteran Rust contributor at Sentry, chose Go for the backend of his AI startup in 2025.

He wrote on his blog: “I’ve evaluated agent performance across different languages, and if you can choose your language, I strongly recommend Go for new backend projects.”

His case isn’t benchmark-driven. It’s based on engineering properties he observed in actual agentic coding. First, Go’s test caching, which he calls “surprisingly crucial for efficient agentic loops.” The AI agent changes one file, reruns the tests, gets results quickly because unchanged packages are served from cache, and moves to the next round. Second, structural interfaces. A type satisfies an interface simply by implementing the required methods. The AI doesn’t need to deal with explicit inheritance or implementation declarations, reducing its chances of getting things wrong. Third, stability. The Go 1 compatibility promise has kept the language and standard library backward-compatible for over a decade, so most of what the AI learned from its training data is still correct today.
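Structural interfaces are easiest to see in code. The sketch below is illustrative only: Store, MemStore, and RoundTrip are hypothetical names, not taken from Ronacher's post.

```go
package main

import "fmt"

// Store is a hypothetical interface, named here only for illustration.
type Store interface {
	Put(key, value string)
	Get(key string) (string, bool)
}

// MemStore never mentions Store. It satisfies the interface simply by
// having methods with matching names and signatures.
type MemStore struct {
	data map[string]string
}

func NewMemStore() *MemStore {
	return &MemStore{data: make(map[string]string)}
}

func (m *MemStore) Put(key, value string) { m.data[key] = value }

func (m *MemStore) Get(key string) (string, bool) {
	v, ok := m.data[key]
	return v, ok
}

// Compile-time assertion: the build fails if *MemStore stops satisfying Store.
var _ Store = (*MemStore)(nil)

// RoundTrip accepts any Store, so it works with MemStore unchanged.
func RoundTrip(s Store, key, value string) (string, bool) {
	s.Put(key, value)
	return s.Get(key)
}

func main() {
	v, ok := RoundTrip(NewMemStore(), "lang", "go")
	fmt.Println(v, ok) // prints: go true
}
```

There is no "implements" clause for the agent to forget or misspell. If a method signature drifts, the compile-time assertion turns the mistake into an immediate build error.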

This contrasts sharply with Ronacher’s earlier attitude toward Go. He used to find it verbose and inexpressive, with if err != nil everywhere. But in agentic coding mode, those things stop being problems. The verbose parts are written by AI; you’re just reviewing. And precisely because Go is verbose, its output is more predictable: no implicit numeric conversions, no exceptions propagating invisibly across layers, no magic methods, and little reflection in idiomatic code. Go gives AI far fewer ways to be wrong.

Someone on Hacker News put it well: Go is LLM RISC. Small instruction set, every instruction’s meaning is clear, hard for AI to guess wrong.

Wes McKinney made the same pivot. Python is the language of pandas, which he created — he wrote Python for over a decade. Now he starts new projects in Go, for essentially the same reason as Ronacher: not because Go is better, but because in the agentic coding feedback loop, Go wastes less time for both human and AI on the question of “is this change actually correct.”

Rust’s Right Context

Rust’s performance in AI-assisted programming is strikingly bimodal.

The most compelling positive case comes from Steve Klabnik, the Rust community’s longtime advocate and co-author of The Rust Programming Language. He was a longtime AI skeptic, but in late 2025 he used Claude Code to write roughly 100,000 lines of Rust in 11 days, creating an experimental systems language called Rue from scratch. The case is dense with signal: someone who knows Rust better than almost anyone used AI to write Rust, and the output was something he could never have written by hand.

Another class of positive evidence comes from Joseph Glanville’s quantitative experiments. He measured interventions per session — how many times a human had to step in during an AI agent session — and ranked Rust first, “head and shoulders above the other options.” His reasoning: Rust’s tool output has extremely high token density. A single compilation error gives you the file, the line number, the error cause, and a suggested fix. The AI agent reads it and is pointed directly at the problem.

But the critics also come from deep experience. Armin Ronacher chose Go, not Rust. matklad, creator of rust-analyzer, tried having AI generate a complete AWS cluster management tool in one shot and concluded the code “lacked any character whatsoever.” It ran, but the architecture was bad, the code inhuman.

This split isn’t random. It tracks along one axis: the user’s Rust expertise.

Reviewing AI-generated Rust code requires the ability to judge whether a borrow checker error means the AI wrote something wrong, or the borrow checker is being overly conservative and the code needs a different approach. The first case is a simple bug fix. The second requires architectural adjustment. If the reviewer lacks this discernment, you get the scenario described by HN user antonvs: an ML team using AI to write Rust, making 40-file changes for a feature that needed changes to one or two files, with no one on the team able to tell which changes were necessary and which were the AI appeasing the compiler.

So Rust’s real positioning isn’t “the best language for AI.” It’s “the best language for experienced Rust developers to accelerate with AI.” If your team already knows Rust well, AI can give you an order-of-magnitude productivity jump. If your team doesn’t know Rust, AI-generated Rust may compile but be architecturally unsound — and you lack the ability to tell.

Three Languages, Three Loops

If you reframe the language debate around feedback loops, the three languages represent three distinct loop structures.

Python’s loop is: write → run → read logs → guess → fix → rerun. Every step has uncertainty. The error location may not be the bug location. Fixing one bug may introduce another somewhere else. But because Python has the most training data, AI’s first-shot accuracy is highest — so the loop may be slow, but it starts from a higher baseline.

TypeScript’s loop is: write → tsc check → fix type errors → run. The type-check step is fast, and the error signals are precise. The problem is that TypeScript’s type system is unsound by design: escape hatches such as any and type assertions let code pass the checker while still being wrong at runtime. Passing type checking doesn’t guarantee type correctness, much less logical correctness. This makes TypeScript the most paradoxical language: it gives AI just enough constraint to be useful, but not enough to be trustworthy, creating a false confidence in code that compiles but is logically wrong.

Go’s loop is: write → go build (1-2s) → go test (1-3s) → fix → rebuild and retest. The elegance of this loop is that every step is fast, and every step’s signal is definitive. Compilation passes = no syntax or type errors. Tests pass = behavior matches expectations — assuming you have tests. Go’s test infrastructure is built into the language, standardized, and incrementally cached. The AI agent doesn’t need to spend tokens understanding or configuring test tooling. It just runs go test and gets its signal.
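The "tests pass" signal usually comes from table-driven checks. In a real project the table would live in a _test.go file, use the standard testing package, and run under go test. The sketch below is a plain program so that it runs standalone, and Slugify is a hypothetical function under test.

```go
package main

import (
	"fmt"
	"strings"
)

// Slugify is a hypothetical function under test: it lowercases the input
// and joins its whitespace-separated words with hyphens.
func Slugify(title string) string {
	return strings.Join(strings.Fields(strings.ToLower(title)), "-")
}

// check runs a table of cases and returns one message per failure.
func check() []string {
	cases := []struct{ in, want string }{
		{"Hello World", "hello-world"},
		{"  Go   is  LLM RISC ", "go-is-llm-risc"},
		{"", ""},
	}
	var failures []string
	for _, c := range cases {
		if got := Slugify(c.in); got != c.want {
			failures = append(failures,
				fmt.Sprintf("Slugify(%q) = %q, want %q", c.in, got, c.want))
		}
	}
	return failures
}

func main() {
	failures := check()
	for _, f := range failures {
		fmt.Println("FAIL:", f)
	}
	if len(failures) == 0 {
		fmt.Println("ok")
	}
}
```

Each failure message names the input, the actual output, and the expected output, which is the kind of dense, unambiguous signal an agent can act on without further digging.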

Across these three loops, Go is the only one with no information loss between steps, no “probably right but not sure” intermediate states. This wasn’t Go’s design intent, but it happens to map perfectly onto agentic coding’s needs.

A Selection Framework

If you want a simple, actionable recommendation, here’s roughly how to think about it.

For AI/ML glue code, stick with Python. Not because it’s the best fit, but because PyTorch, JAX, Hugging Face, and the rest of the ML ecosystem live in Python. You can’t route around it. Training data dominance also gives AI the highest first-shot accuracy here.

For new backend projects, CLI tools, and data processing pipelines, seriously consider Go. Not because Go performs better, but because the agentic coding feedback loop runs fastest and consumes the fewest tokens in Go. The tokens and time you save can go toward more iterations or more tests — both of which improve final quality.

For low-level components with hard memory safety or concurrency correctness requirements, and only if your team already has Rust mental models, choose Rust. The compiler gives you guarantees no other language offers, and those guarantees serve as automated acceptance testing during AI’s iterative fix cycles. Pass compilation, and at least one entire class of bugs is eliminated.

As for TypeScript, its irreplaceability isn’t in language features; it’s in ecosystem. Frontend work has no alternative: you ship JavaScript or something that compiles to it, and TypeScript is the default typed option. But this “no choice” happens to be a moat. In agentic coding, TypeScript’s type system gives AI just enough constraint, while the browser and Node.js ecosystems provide abundant training data. TypeScript isn’t the best AI language, but it’s the practical default for the frontend.

The Deeper Shift

What’s most interesting about this debate isn’t which language wins. It’s the cognitive migration it exposes.

Until now, our default criterion for choosing a programming language was “human-friendly”: intuitive syntax, smooth toolchain, large community, hard to shoot yourself in the foot. Python is the perfect product of this framework — it was practically designed to make humans feel comfortable.

But with AI in the picture, the selection logic flips. You no longer care whether the language is friendly to you — you’re not the one writing. You care whether it’s friendly to the system that writes, fixes, and verifies code hundreds of times a day.

This means the highest values of the PL community over the past several decades — readability and writability — are being repriced. Compilation speed, the signal density of the type system, the standardization of test tooling, ecosystem stability and backward compatibility: these used to be second-tier metrics. In the agentic coding era, they’ve become more important than syntactic sugar.

Go is the primary beneficiary of this revaluation. It’s not the prettiest, not the most powerful, not the smartest. But it’s the most predictable, the fastest, the most stable: the three properties that matter most to an AI agent. Rob Pike once explained, in effect, that Go was designed for programmers who are not equipped to handle a complex language. Over a decade later, replace “programmers” with “AI,” and the statement actually becomes more accurate.

This isn’t to say Python will decline or Rust will take over. They won’t. Python’s AI/ML ecosystem is two decades of accumulated infrastructure with no near-term replacement. Rust’s memory safety guarantees are irreplaceable in certain domains. Go’s plainness is an asset in others. Different languages will evolve in different directions — some becoming more popular in the AI era, some fading to the margins. But the variables driving that evolution have already changed.


Thanks to 佐治亚小帅 for suggesting this topic.