OpenAI’s Codex CLI has been quietly writing data to a local SQLite log database on users’ machines, racking up 37 TB in 21 days (roughly 640 TB annualized), a pace that would pass the rated write lifespan of a 1TB consumer SSD in under a year. The issue was first reported on April 10, but it wasn’t until June 22, after it climbed the Hacker News front page, that The Register ran a follow-up whose headline bluntly described the needless writes as “costing millions.” On June 23 OpenAI merged a fix PR and closed the issue, but the fix is incomplete: another issue filed that day shows the Windows desktop package still reproducing the bug after the fix landed, and a third key fix remains unreleased inside 0.143.0.
First, what the bug looks like. Codex continuously writes runtime data to a local SQLite log database in the background, at a volume large enough to burn through a consumer SSD’s rated write lifespan in under a year. It won’t fill your disk (the logs are garbage-collected automatically), won’t throw any errors, and won’t trigger any system alerts. The only anomaly you might notice is a drive light that stays on and constant disk activity. The cost lands entirely on the SSD’s write-endurance metric — something that never appears in any routine disk check.
So who’s affected: anyone running Codex CLI and keeping it resident is burning hardware directly. Claude Code and OpenCode users are largely unaffected. If you’re a Codex user, read on for versions and self-checks.
Check your version first. Upgrading to 0.142.0 or above is the dividing line — that release merged the fix and, per reporter feedback, cuts about 85% of the log writes. But the fix isn’t all the way there yet: one piece is still unreleased inside 0.143.0, and the Windows desktop package still reproduces after the fix merge (issue #29556). Upgrading to 0.142.0 handles most of it, but it’s not a clean slate until a later version number.
To confirm how much has actually been written to your drive, check
the SMART counters, not disk space. du and Finder show file size; SMART
reads the actual physical write volume. On macOS, install smartmontools
(brew install smartmontools), run
sudo smartctl --all /dev/disk0, and look for
Percentage Used and Data Units Written. On
Linux NVMe, run sudo nvme smart-log /dev/nvme0. On Windows,
run Get-PhysicalDisk | Get-StorageReliabilityCounter, or
use the CrystalDiskInfo GUI.
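The raw counter is easier to judge once it is converted to terabytes. The sketch below assumes an NVMe drive and the JSON report that smartctl can emit with its --json flag; the key names are what smartctl 7.x appears to use and should be checked against your own output. NVMe counts Data Units Written in units of 1000 × 512 bytes. The sample report is made up for illustration.

```python
import json

# NVMe reports "Data Units Written" in units of 1000 * 512 bytes.
BYTES_PER_DATA_UNIT = 512_000


def nvme_written_tb(smart_report):
    """Return writes in decimal terabytes from a parsed smartctl JSON report.

    Assumes the NVMe health log sits under the key used below; verify this
    against the output of `smartctl --json --all <device>` on your machine.
    """
    log = smart_report["nvme_smart_health_information_log"]
    return log["data_units_written"] * BYTES_PER_DATA_UNIT / 1e12


def nvme_percentage_used(smart_report):
    """Return the drive's own estimate of rated endurance consumed, in percent."""
    return smart_report["nvme_smart_health_information_log"]["percentage_used"]


# Hypothetical sample shaped like smartctl's JSON output (values invented).
SAMPLE = json.loads(
    '{"nvme_smart_health_information_log":'
    ' {"percentage_used": 6, "data_units_written": 72265625}}'
)

if __name__ == "__main__":
    print(f"{nvme_written_tb(SAMPLE):.1f} TB written, "
          f"{nvme_percentage_used(SAMPLE)}% of rated life used")
```

Taking two readings a day apart and subtracting them gives the actual daily write volume, which is the number to compare against the figures in this article.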
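If you can’t upgrade in time, there are three stopgaps.
First, symlink ~/.codex/logs_2.sqlite to a path under /tmp so writes land
in memory instead of on disk. This only helps where /tmp is a RAM-backed
tmpfs, as on many Linux distributions; on macOS /tmp is ordinary disk
storage, so the symlink gains nothing there.
Second, add a trigger to the database that blocks all inserts:
CREATE TRIGGER block_log_inserts BEFORE INSERT ON logs BEGIN SELECT RAISE(IGNORE); END;
The cost is losing local diagnostic logs.
Third, run VACUUM periodically to reclaim space; one user shrank the
file from 27 GB back down to 73 MB. VACUUM rewrites the whole database,
so it frees space but does not reduce the ongoing writes.
Codex ships no official toggle for this log; there’s no
sqlite_logs_enabled-style option in
config.toml.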
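To see what the trigger stopgap does before touching the real database, it can be tried on a throwaway in-memory one. This is a minimal sketch using Python's built-in sqlite3 module; only the table name logs comes from the article, and the columns are hypothetical stand-ins for whatever schema Codex actually uses.

```python
import sqlite3


def demo_block_inserts():
    """Return the row count before and after an INSERT that the trigger swallows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the real logs table will have different columns.
    conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT)")
    conn.execute("INSERT INTO logs (message) VALUES ('written before the trigger')")

    # Same trigger as in the article: RAISE(IGNORE) silently skips the insert.
    conn.execute(
        "CREATE TRIGGER block_log_inserts BEFORE INSERT ON logs "
        "BEGIN SELECT RAISE(IGNORE); END"
    )

    before = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    # This statement succeeds without error but adds no row.
    conn.execute("INSERT INTO logs (message) VALUES ('dropped by the trigger')")
    after = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return before, after


if __name__ == "__main__":
    print(demo_block_inserts())
```

Because the insert is ignored rather than rejected, the writing application sees no error. Dropping the trigger (DROP TRIGGER block_log_inserts) restores normal logging.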
Codex writes its run logs to ~/.codex/logs_2.sqlite, a
local SQLite database. SQLite is single-file and zero-config; desktop
apps routinely use it for local storage. The problem is what Codex
decided to write into it, and at what frequency.
The log level defaults to TRACE. In Rust’s tokio-rs/tracing library,
TRACE is a level below DEBUG — it dumps every file open, every network
packet verbatim. Codex even logged things like opening
/etc/passwd and raw WebSocket frames. The configuration is
Targets::new().with_default(Level::TRACE), a per-layer
filter. Users set RUST_LOG=warn expecting it to apply
globally, but as long as any target isn’t explicitly overridden, Codex’s
default kicks in and writes anyway, bypassing the environment
variable.
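The interaction can be pictured with a toy model. The Python below is not the tracing crate's API; it only imitates the idea of per-layer filters, each with its own default level and optional per-target overrides. It assumes, based on the description above rather than on Codex's source, that RUST_LOG governs a different layer than the one feeding SQLite. All layer and target names are hypothetical.

```python
# Higher number = more verbose, mirroring ERROR < WARN < INFO < DEBUG < TRACE.
LEVELS = {"ERROR": 1, "WARN": 2, "INFO": 3, "DEBUG": 4, "TRACE": 5}


class TargetFilter:
    """Toy per-layer filter: a default level plus per-target-prefix overrides."""

    def __init__(self, default, overrides=None):
        self.default = default
        self.overrides = dict(overrides or {})

    def max_level(self, target):
        # The longest matching target prefix wins; otherwise the default applies.
        matches = [prefix for prefix in self.overrides if target.startswith(prefix)]
        if not matches:
            return self.default
        return self.overrides[max(matches, key=len)]

    def enabled(self, target, level):
        return LEVELS[level] <= LEVELS[self.max_level(target)]


def sinks_for(target, level, layers):
    """Return the names of the layers that would record this event."""
    return [name for name, flt in layers.items() if flt.enabled(target, level)]


# Hypothetical setup: RUST_LOG=warn shapes one layer, while the SQLite layer
# keeps its own TRACE default with a single explicit override.
LAYERS = {
    "stderr": TargetFilter("WARN"),
    "sqlite": TargetFilter("TRACE", {"hypothetical_noisy_crate": "WARN"}),
}

if __name__ == "__main__":
    print(sinks_for("some_module", "TRACE", LAYERS))
    print(sinks_for("hypothetical_noisy_crate::io", "DEBUG", LAYERS))
```

In this model a TRACE event from any target without an override still reaches the SQLite layer, whatever level the other layer is set to.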
Write frequency amplified the problem. In SQLite’s WAL (write-ahead log) mode, every write is appended to a log file first, then merged back into the main database once enough have accumulated. Codex keeps inserting new rows while deleting old ones, holding the total row count steady. In one observation window, 36,211 rows were inserted in 15 seconds while the database file size stayed perfectly flat, because SQLite reuses the pages freed by deletions. But that isn’t the same as no physical writes. Every INSERT becomes an actual write to NAND flash, and DELETE operations produce additional writes too. This gap is called write amplification: the app writes one row, and the physical layer may write three or five times as much data.
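The flat-file effect is easy to reproduce on a scratch database. The sketch below is an illustration with invented sizes, not a reconstruction of Codex's logging code: it inserts batches of rows, deletes the oldest so the table stays capped, and compares the bytes the application asked to write with the final file size. It uses SQLite's default journal mode for simplicity and measures logical bytes only, not physical NAND writes.

```python
import os
import sqlite3
import tempfile


def churn(batches=100, batch_size=100, keep_rows=500, row_bytes=200):
    """Insert and prune rows; return (row_count, bytes_inserted, file_size)."""
    payload = "x" * row_bytes
    bytes_inserted = 0
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "logs_demo.sqlite")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, body TEXT)")
        for _ in range(batches):
            conn.executemany(
                "INSERT INTO logs (body) VALUES (?)", [(payload,)] * batch_size
            )
            bytes_inserted += batch_size * row_bytes
            # Keep only the newest keep_rows rows, as a capped log would.
            conn.execute(
                "DELETE FROM logs WHERE id <= (SELECT MAX(id) FROM logs) - ?",
                (keep_rows,),
            )
            conn.commit()
        row_count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
        conn.close()
        file_size = os.path.getsize(path)
    return row_count, bytes_inserted, file_size


if __name__ == "__main__":
    rows, inserted, size = churn()
    print(f"rows kept: {rows}, bytes inserted: {inserted}, file size: {size}")
```

The row count ends at the cap and the file stays a small fraction of the total bytes inserted, which is exactly the picture a file-size check would report as healthy.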
Down at the hardware. Consumer 1TB NVMe SSDs like the Samsung 990 PRO, WD SN850X, and Crucial P5 Plus are typically rated at 600 TBW (terabytes written). Per measurements by reporter Rui Fan, an Apache Flink PMC member, his machine ran Codex 24/7 and wrote 37 TB in 21 days. That is about 640 TB annualized, enough to pass the warranty line in under a year. For a heavy user at 8 hours a day, that’s roughly 213 TB a year, which reaches the rated figure in about three years and is still far above a normal dev workload. After the fix, that 8-hour figure drops to about 32 TB a year, back to ordinary levels. 600 TBW is a warranty value, not a hard failure boundary; many drives keep working well past several times their TBW. As of publication, there’s no public case of a drive actually being killed by this bug.
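The arithmetic behind those figures can be checked in a few lines. The inputs are the article's own numbers (37 TB in 21 days, a 600 TBW rating, an 85% reduction after the fix); the assumption that writes scale linearly with hours of use is mine. Unrounded results come out slightly above the article's rounded 640 and 213.

```python
OBSERVED_TB = 37.0      # written during the measurement window
OBSERVED_DAYS = 21      # length of the window, running 24/7
RATED_TBW = 600.0       # typical rating for a 1TB consumer NVMe drive
FIX_REDUCTION = 0.85    # share of log writes removed by the fix, per the reporter


def annualized_tb(tb, days, duty_cycle=1.0):
    """Scale an observed write volume to a year, assuming linear scaling."""
    return tb / days * 365 * duty_cycle


def years_to_rating(tb_per_year, rated_tbw=RATED_TBW):
    """Years until cumulative writes reach the rated TBW figure."""
    return rated_tbw / tb_per_year


always_on = annualized_tb(OBSERVED_TB, OBSERVED_DAYS)              # about 643
eight_hours = annualized_tb(OBSERVED_TB, OBSERVED_DAYS, 8 / 24)    # about 214
after_fix = eight_hours * (1 - FIX_REDUCTION)                      # about 32

if __name__ == "__main__":
    print(f"24/7:      {always_on:.0f} TB/yr, {years_to_rating(always_on):.2f} years to 600 TBW")
    print(f"8 h/day:   {eight_hours:.0f} TB/yr, {years_to_rating(eight_hours):.2f} years to 600 TBW")
    print(f"after fix: {after_fix:.0f} TB/yr")
```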
This lived from April to June before blowing up, and the core reason is that file size and physical bytes written run on two different clocks — and the entire default toolkit for checking disks reads the wrong one.
du, df, Finder, disk-capacity alerts: all of these read the logical
size at the filesystem layer — how many blocks a file occupies, how many
are still free. An SSD’s lifespan runs on a different clock. NAND flash
has a finite number of write-erase cycles, and every physical write
consumes that budget; only the drive’s own SMART counter can read it.
Percentage Used records how much of the rated lifespan is
gone; Data Units Written records how much data the host has actually
sent to the drive to be written.
Codex’s bug sat squarely in the gap between those two clocks. SQLite writes furiously while reclaiming space, keeping the database file stable. du looks at it: the file isn’t big, isn’t growing — all clear. Meanwhile the physical-write counter spins fast in another dimension. Stare at du for three months and you see nothing wrong; switch to SMART and the burning is visible at a glance.
Issue #17320 laid out the complete technical evidence back on April 10: sustained writes of 5 to 16 MiB/s, with pwrite64 calls flooding the strace output. But everyone who read the report evaluated it with their default disk-checking habits (du, free space) and reached a conclusion that seemed reasonable at the time: file size is stable, the disk isn’t full, not an emergency. That conclusion held until June 14, when Rui Fan posted SMART data in issue #28224, putting the 640 TB annualized number officially in front of the public. On June 22 it hit the Hacker News front page, The Register followed up, and the first fix PR was created and merged that same day.
The filesystem cares whether there’s enough space; SMART cares how much longer the drive will last. Everyday ops habits only watch the former, and that’s the blind spot where this bug survived.
Next time your fans aren’t spinning but your drive light stays on, open SMART and take a look. The three commands above cover macOS, Linux, and Windows.