Anthropic's Claude Mythos: What It Means for Cybersecurity's Future

Last updated: 2026-05-01
A New Capability Emerges

Two weeks ago, Anthropic dropped a bombshell on the cybersecurity world. Their latest AI model, Claude Mythos Preview, demonstrated the ability to autonomously discover and exploit software vulnerabilities—turning them into working attacks without any human guidance. This wasn't limited to obscure bugs; the model found flaws in critical systems like operating systems and internet infrastructure—vulnerabilities that teams of professional developers had missed for years. The implications are staggering: such capabilities could compromise the devices and services we rely on daily. As a result, Anthropic has opted not to release the model to the general public, instead granting access to a select group of companies.

Source: www.schneier.com

The Controversy and Speculation

The announcement sent shockwaves through the internet security community. However, Anthropic provided few technical details, leaving many experts frustrated and skeptical. Some speculate that the company simply lacks the GPU resources to run the model at scale, and that the cybersecurity rationale was a convenient excuse to limit distribution. Others argue that Anthropic is staying true to its AI safety mission, prioritizing responsible deployment. Whatever the truth, the conversation is now a mix of hype and counterhype, reality and marketing. Sorting through the noise is no small task, even for seasoned professionals.

Incremental Steps, Shifting Baselines

We see Mythos as a real but incremental step—one in a long line of many. Yet even incremental steps can be profoundly important when viewed from a distance. This is where the concept of shifting baseline syndrome comes into play.

The Shifting Baseline Syndrome

Shifting baseline syndrome describes how people—both the public and experts—tend to discount massive, long-term changes that occur gradually. It has happened with online privacy, and it is now happening with artificial intelligence. While the vulnerabilities discovered by Mythos might have been findable by AI models from last month or last year, they were certainly beyond the reach of models from just five years ago. The announcement is a reminder that the baseline has truly shifted. Finding flaws in source code is exactly the kind of task that modern large language models excel at. Whether it happened last year or will happen next year, the arrival of such autonomous hacking capabilities has been foreseeable. The real question is how we adapt.
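To make the point concrete, here is a contrived sketch of the kind of flaw that pattern-matching over source code catches reliably: a SQL query built by string interpolation, next to its parameterized fix. The example is ours, not from the announcement; all names are illustrative.

```python
import sqlite3

def find_user_unsafe(conn, username):
    # Vulnerable: attacker-controlled input is spliced into the SQL string,
    # the textbook pattern that automated scanners and LLMs flag immediately.
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, username):
    # Fixed: a parameterized query keeps the data out of the SQL grammar.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # injection matches every row: 2
print(len(find_user_safe(conn, payload)))    # parameterized query matches none: 0
```

A human reviewer can miss this in a large codebase; a model scanning every query-building site will not. That asymmetry of attention, more than any single clever exploit, is what the shifted baseline looks like.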

How AI Is Reshaping Cybersecurity

We do not believe that an AI capable of autonomous hacking will create a permanent asymmetry between offense and defense. The reality is more nuanced. Consider the range of scenarios:

  • Easy to find, verify, and patch: Some vulnerabilities can be automatically discovered, validated, and fixed—especially in cloud-hosted web applications built on standard software stacks, where updates can be deployed quickly.
  • Hard to find, easy to verify and patch: For generic enterprise applications, an AI might locate a subtle bug, but once found, the fix is straightforward and can be rolled out rapidly.
  • Easy to find (even without powerful AI), but hard or impossible to patch: This category includes IoT devices, industrial control systems, and legacy equipment that are rarely updated or cannot be modified easily. Here, discovery is simple, but remediation is a nightmare.

There is also a fourth category: systems whose vulnerabilities are easy to spot in code but extremely difficult to verify in practice. Think of complex distributed systems and cloud platforms composed of thousands of interacting components. A potential exploit may look real in the source, but confirming it works across the entire environment requires immense engineering effort.
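One way to see why the landscape is nuanced rather than one-sided is to sketch the scenarios above as a rough triage function. This is a toy model; the axes, thresholds, and outcome labels are our own shorthand, not anything Anthropic published.

```python
def triage(find: str, verify: str, patch: str) -> str:
    """Each axis is 'easy' or 'hard'; returns a rough outlook for defenders."""
    if patch == "hard":
        # IoT, industrial control, legacy gear: discovery is cheap,
        # remediation is the bottleneck.
        return "defender nightmare"
    if verify == "hard":
        # Large distributed systems: an exploit may look real in source
        # but confirming it works end-to-end takes immense effort.
        return "costly to weaponize"
    if find == "hard":
        # Generic enterprise apps: subtle bug, but the fix ships fast.
        return "race, slight defender edge"
    # Standard cloud web stacks: find, verify, and patch all automatable.
    return "race, fast patch cycle"

print(triage("easy", "easy", "hard"))   # → defender nightmare
print(triage("hard", "easy", "easy"))   # → race, slight defender edge
```

The point of the sketch is that "AI can hack autonomously" is not one threat but four, and only one of the four categories clearly favors offense.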


Offense vs. Defense: A Nuanced Landscape

The Mythos announcement forces us to confront a more balanced future. On one side, attackers gain a powerful tool to find and weaponize bugs faster than ever. On the other, defenders can leverage similar AI to detect and patch vulnerabilities before they are exploited. The key is that both sides have access to the same core technology: it is a race, not a one-sided advantage. Moreover, some technologies inherently favor defense, such as memory-safe languages and formal verification, where AI can help write provably secure code. The long-term outlook suggests that while the pace of discovery will accelerate, the asymmetry may be less pronounced than fearmongers claim.

Preparing for a New Reality

Whether Mythos is a revolution or an evolution, the cybersecurity community cannot afford to ignore the trend. AI models that can hack autonomously are coming—if not from Anthropic, then from someone else. The path forward involves several key adaptations:

  • Invest in defensive AI: Automate vulnerability discovery, patch generation, and incident response using similar models.
  • Reevaluate risk models: Assume that any software flaw could be rapidly weaponized, so prioritize patching and secure coding practices.
  • Build nimble infrastructure: Design systems that can be updated and patched quickly, especially in critical sectors like healthcare, energy, and finance.
  • Encourage transparency: When companies like Anthropic make such announcements, providing more technical context helps the community prepare and reduces harmful speculation.

In the end, Claude Mythos is a cautionary tale and a call to action. The baseline has shifted, and cybersecurity must evolve alongside the AI that is reshaping it.