GPT-5 Is Here: What Changed, What Didn't, and Why It Matters Less Than You Think

The Launch Nobody Expected

OpenAI released GPT-5 on a Tuesday afternoon with a blog post and a shrug. No keynote, no countdown timer, no Sam Altman tweet storm. After years of increasingly theatrical AI launches, the quiet release felt almost subversive.

But the model itself is anything but quiet. GPT-5 represents a genuine step change in reasoning, code generation, and multimodal understanding. The question is whether that matters as much as it would have two years ago.

What Actually Improved

The benchmarks are impressive. GPT-5 scores 94.2% on MMLU (up from 86.4% for GPT-4), handles a 200K-token context window natively, and produces code that passes unit tests on the first attempt 78% of the time (compared to 52% for GPT-4 Turbo).

More practically: it follows complex instructions more reliably, hallucinates less frequently, and can maintain coherent reasoning across document-length outputs. It's the first model that can reliably summarize a 300-page legal document without losing the thread.

What Didn't Change

GPT-5 still can't reliably do math beyond what it's seen in training data. It still sometimes confuses correlation with causation in its reasoning. It still generates plausible-sounding nonsense when pushed beyond its knowledge boundaries; it's just better at admitting when it doesn't know.

And critically, it still requires human oversight for any high-stakes application. "Better" is not "trustworthy." The error rate on complex reasoning tasks dropped from ~15% to ~6%, but the answers in that remaining 6% can still be catastrophically wrong.
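
To put those error rates in perspective, here is a rough back-of-the-envelope sketch in Python. The workload of 1,000 tasks is a hypothetical figure chosen for illustration, and the calculation assumes each task fails independently at the quoted rate, which real workloads may not.

```python
def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction of the old error rate that was eliminated."""
    return (old_rate - new_rate) / old_rate


def expected_failures(error_rate: float, tasks: int) -> float:
    """Expected number of wrong answers, assuming independent tasks."""
    return error_rate * tasks


# Approximate error rates quoted in the article.
old_rate, new_rate = 0.15, 0.06

# Hypothetical workload size, not a figure from the article.
tasks = 1_000

print("Relative reduction in errors:", round(relative_reduction(old_rate, new_rate), 2))
print("Expected wrong answers before:", round(expected_failures(old_rate, tasks)))
print("Expected wrong answers after:", round(expected_failures(new_rate, tasks)))
```

Under these assumptions, errors fall by about 60% in relative terms, yet roughly 60 of every 1,000 complex tasks would still come back wrong. That is why the improvement does not remove the need for review.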

Why It Matters Less Than You Think

Here's the uncomfortable truth: for most people, GPT-4 was already "good enough." The jump from GPT-3.5 to GPT-4 changed what was possible. The jump from GPT-4 to GPT-5 mostly changes what's convenient. The ceiling of AI capability went up, but the floor of user expectations went up faster.

Sarah Mitchell

Senior technology writer with 12 years covering AI, cybersecurity, and emerging tech. Former editor at Wired and The Verge.