The Opus 4.8 number nobody's posting


Not the coding score. Plus Open Design at 4 weeks, and 2 more reads.
WotAI


Hey Reader,

Anthropic shipped Claude Opus 4.8 this week, and the whole feed posted the same screenshot – the coding score.

That's the least interesting part.

The number that actually changes how you work is below, plus three more reads stacked behind it.

Claude Opus 4.8 – the number nobody's posting

[Image: Claude Opus 4.8 cover – dynamic workflows in Claude Code, one node fanning out into many parallel agents]

Everyone screenshotted the coding score (64.3% to 69.2% on SWE-Bench Pro). The change that matters more: Opus 4.8 is about 4x less likely than 4.7 to let a flaw in its own code pass without flagging it, and it tells you when it's unsure. When you've handed an agent your git loop, "catches its own mistakes" beats any benchmark point – that's the line between delegation you trust and delegation you re-check by hand.

And dynamic workflows landed the same day. You hand Claude a project-scale job, it plans once, runs hundreds of agents in a single session, and checks the results before they reach you. Jarred Sumner used it to port Bun from Zig to Rust – roughly 750,000 lines, 99.8% of the test suite passing, 11 days from first commit to merge.

The post is everything that actually shipped: the model, dynamic workflows, the latest Claude Code (2.1.156), and where this lands next to a persistent n8n workflow you can inspect.

Read the Opus 4.8 breakdown →

Open Design, 4 weeks in – should an SMB owner touch it yet?

[Image: Open Design 4 weeks in cover – the trajectory from launch to 52K stars]

Four weeks ago Open Design was a design-prototype tool with 19 skills and 71 design systems. Today it's at 132 skills, 150 design systems, 8 stable releases, 52,718 GitHub stars – and it generates video now. Anthropic's hosted Claude Design shipped zero public updates in the same 28 days.

That contrast changes the SMB verdict. The post is the honest go/no-go: install today if you're a technical solo founder, stay on hosted Claude Design if you're non-technical, and check the consultant row, where the margin math changed once video generation landed in the same surface as web mockups.

Read the 4-weeks-in verdict →

/code-review vs /ce-code-review – when each one wins

[Image: Claude Code review vs ce-code-review cover – native command vs compound-engineering plugin]

Claude Code ships /code-review and /simplify natively. The compound-engineering plugin adds /ce-code-review and /ce-simplify-code on top. They look like duplicates. They aren't.

The post is the side-by-side: the quick-review short-circuit most people miss (line 36 of the CE skill defers to native for fast passes), a decision matrix for which to reach for, and the Opus cost gotcha worth knowing – CE forces Sonnet on most personas, so if that override gets skipped, the review quietly runs at 3-4x the cost.

Read the comparison →

Agent washing – and why your n8n workflows are more honest than most "AI agents"

[Image: Agent washing cover – deterministic n8n workflows vs hyped AI agents]

With everyone shipping "agents" this week, here's the filter. Agent washing is rebranding basic automation as an AI agent – slapping the label on something that's really an if-this-then-that flow. A deterministic n8n workflow you can open, read, and rerun is often more reliable than the thing calling itself autonomous.

This one's a few weeks old and worth pulling back up while the agent hype is loud. It's how to tell the real thing from the washed version, and why "boring and auditable" wins for anything that has to run in production.

Read the agent-washing take →


Reply and tell me which one was the read for you. I read every reply.

– Alex


P.S. – The conversations behind posts like these happen in the WotAI Skool community. 760+ builders, three live calls a week, free to jump in.

Come build with us →



WotAI · AI Automation that Actually Ships

wotai.co

WotAI · 440 N. Barranca Ave., Unit 3777, Covina, CA 91723


