
The AI Coding Agent Arms Race — Why Model Portability Matters More Than Benchmarks
Last month, Claude Opus 4.7 hit 1567 Elo on WebDev Arena — the highest any model has ever scored on that benchmark. The same week, GPT-5.5 claimed the Terminal-Bench 2.0 crown at 82.7%. And in Jakarta, where I’ve been shipping software since before npm existed, nobody in my …












