The Opus 4.8 number nobody's posting

Hey Reader,

Anthropic shipped Claude Opus 4.8 this week, and the whole feed posted the same screenshot – the coding score.

That's the least interesting part.

The number that actually changes how you work is below, plus three more reads stacked behind it.

Claude Opus 4.8 – the number nobody's posting

Everyone screenshotted the coding score (64.3% to 69.2% on SWE-Bench Pro). The change that matters more: Opus 4.8 is about 4x less likely than 4.7 to let a flaw in its own code pass without flagging it, and it tells you when it's unsure. When you've handed an agent your git loop, "catches its own mistakes" beats any benchmark point – that's the line between delegation you trust and delegation you re-check by hand.

And dynamic workflows landed the same day. You hand Claude a project-scale job, it plans once, runs hundreds of agents in a single session, and checks the results before they reach you. Jarred Sumner used it to port Bun from Zig to Rust – roughly 750,000 lines, 99.8% of the test suite passing, 11 days from first commit to merge.

The post is everything that actually shipped: the model, dynamic workflows, the latest Claude Code (2.1.156), and where this lands next to a persistent n8n workflow you can inspect.

Read the Opus 4.8 breakdown →

Open Design, 4 weeks in – should an SMB owner touch it yet?

Four weeks ago Open Design was a design-prototype tool with 19 skills and 71 design systems. Today it's at 132 skills, 150 design systems, 8 stable releases, 52,718 GitHub stars – and it generates video now. Anthropic's hosted Claude Design shipped zero public updates in the same 28 days.

That contrast changes the SMB verdict. The post is the honest go/no-go: install today if you're a technical solo founder, stay on hosted Claude Design if you're non-technical, plus a consultant row, where the margin math changed once video generation landed in the same surface as web mockups.

Read the 4-weeks-in verdict →

/code-review vs /ce-code-review – when each one wins

Claude Code ships /code-review and /simplify natively. The compound-engineering plugin adds /ce-code-review and /ce-simplify-code on top. They look like duplicates. They aren't.

The post is the side-by-side: the quick-review short-circuit most people miss (line 36 of the CE skill defers to native for fast passes), a decision matrix for which to reach for, and the Opus cost gotcha worth knowing – CE forces Sonnet on most personas, so skipping that override quietly runs the review on Opus at 3-4x the cost.

Read the comparison →

Agent washing – and why your n8n workflows are more honest than most "AI agents"

With everyone shipping "agents" this week, here's the filter. Agent washing is rebranding basic automation as an AI agent – slapping the label on something that's really an if-this-then-that flow. A deterministic n8n workflow you can open, read, and rerun is often more reliable than the thing calling itself autonomous.

This one's a few weeks old and worth pulling back up while the agent hype is loud. It's how to tell the real thing from the washed version, and why "boring and auditable" wins for anything that has to run in production.

Read the agent-washing take →

Reply and tell me which one was the read for you. I read every reply.

– Alex

P.S. – The conversations behind posts like these happen in the WotAI Skool community. 760+ builders, three live calls a week, free to jump in.

Come build with us →

Reply to this email anytime – I read every one.

WotAI · AI Automation that Actually Ships

wotai.co

WotAI · 440 N. Barranca Ave., Unit 3777, Covina, CA 91723