
The Model They Won't Ship

April 8, 2026 · 3 min read · Dhruv Jain

Yesterday, Anthropic previewed a new Claude model called Mythos. The company isn't releasing it to the public. Not yet, maybe not for a while.

The reason is worth paying attention to. Anthropic privately warned US government officials that Mythos makes large-scale cyberattacks "significantly more likely this year."

That is not a framing Anthropic uses lightly.

What Mythos Actually Does

Mythos is a general-purpose frontier model with very strong agentic coding and reasoning. It is not a security-specific model. It was not trained to find vulnerabilities.

In early testing, it found thousands of zero-day bugs in real-world code. One of them had been sitting in OpenBSD for 27 years. Nobody had caught it.

Let that sit for a second. A general-purpose model, not told to look for security issues, found a 27-year-old bug in a codebase maintained by some of the best engineers in the world.

This is what "agentic coding" means in 2026. It is not a model that autocompletes your functions. It is a model that can read a codebase, reason about invariants, and independently surface flaws experienced engineers have missed for decades.

Who Gets Access

Around 50 organizations total, through an initiative Anthropic is calling Project Glasswing. The named partners include Apple, Amazon, Google, Microsoft, Cisco, Broadcom, CrowdStrike, Palo Alto Networks, and the Linux Foundation.

Anthropic is committing up to $100M in usage credits across these efforts, plus $4M in direct donations to open-source security organizations.

Everyone else, including almost every founder reading this, will wait.

Why This Matters If You Don't Run a Security Team

This is the uncomfortable part.

While Anthropic is deliberately throttling a model because it is too capable, most founders I see are still debating whether to let Claude touch their codebase at all. The gap between where the frontier is and where production teams actually operate has gotten cartoonish.

Four signals matter here:

  • Anthropic is releasing less, not more.

  • Capability is accelerating faster than deployment.

  • Dual-use (defense and offense) is now a first-class design concern.

  • Agentic coding is becoming the default operating mode.

Every one of those should change how you're thinking about AI in your stack, even if you are nowhere near cybersecurity work.

What To Do This Week

Three specific things worth doing if you run or work on a product:

  1. Move off old models. If you are still running an older Claude or GPT because you have not "had time to migrate," this is your sign. The gap between frontier and legacy compounds every quarter. The teams that migrate quickly are pulling away.

  2. Run your critical services through a real code review. Not a full rewrite. Pick one service you actually care about and put it through a thorough review with your strongest available model. Look at what it surfaces. The next generation will surface more, and faster.
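A minimal sketch of what that review could look like in practice, using the Anthropic Python SDK. The model name, prompt wording, and directory path are illustrative assumptions, not a recommended setup; the point is that the mechanics are a short script, not a project.

```python
# Sketch: gather one service's source files into a single security-review
# prompt for a frontier model. Prompt text and limits are assumptions.
from pathlib import Path

REVIEW_PROMPT = (
    "You are reviewing this code for security flaws. "
    "List each suspected vulnerability with file, line, and reasoning:\n\n"
)

def build_review_request(service_dir: str, max_chars: int = 100_000) -> str:
    """Concatenate a service's Python files into one review prompt,
    truncated to a rough context budget."""
    parts = []
    for path in sorted(Path(service_dir).rglob("*.py")):
        parts.append(f"# --- {path} ---\n{path.read_text()}")
    body = "\n\n".join(parts)[:max_chars]
    return REVIEW_PROMPT + body

# To actually run the review (requires ANTHROPIC_API_KEY):
# import anthropic
# client = anthropic.Anthropic()
# msg = client.messages.create(
#     model="claude-sonnet-4-20250514",  # use your strongest available model
#     max_tokens=4096,
#     messages=[{"role": "user",
#                "content": build_review_request("src/billing")}],
# )
# print(msg.content[0].text)
```

Run it against one service you care about, read what comes back, and keep the findings file. That baseline is what you will compare the next model generation against.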

  3. Build the supervision workflow, not the replacement fantasy. The question is no longer whether AI can write code. It can. The real question is whether your team has the workflow to supervise it, verify it, and ship its output without breaking things. Teams without that workflow are about to ship broken things very fast.

The Direction Of Travel

Anthropic holding back a model because it is too powerful is not a one-off headline. It is a story about where things are going.

The capability curve is not slowing down. Access is getting tighter at the top, and it is getting stranger at the bottom. The founders who notice early get to position.

The rest get to react.


Sources: TechCrunch, Fortune, Bloomberg
