US Cracks Down on Anthropic AI Models Amid Abuse Concerns

June 15, 2026

Nation-state threat actors and cybercriminals are growing more sophisticated in how they use foundational AI models in their offensive campaigns, reportedly worrying the US government enough to ban foreign nationals from accessing the latest models from Anthropic.

On Friday, Anthropic announced that it had stopped offering access to its latest model, Fable 5, which had launched three days earlier, after the US government issued a national-security order for the company to prevent access to the model by foreign nationals, even those who work for the company. The order also mandates limiting access to Anthropic’s Mythos 5 models, to which hundreds of companies formerly had access.

The bans came less than two weeks after Anthropic published research showing that adversaries are increasingly using its AI services to create malicious code, find vulnerabilities, and automate the cyberattack chain.

“While Claude Mythos Preview demonstrates where frontier AI cyber capabilities are heading — models able to find and exploit vulnerabilities at a level approaching the most skilled human researchers — [our research shows] us how threat actors are misusing generally available models today,” the trio of researchers stated in the research report.

Worries over the potential impact of the latest AI models on vulnerability research and exploit development have cast a shadow over the latest AI model releases. In late April, the AI Security Institute (AISI), a research arm of the UK government’s Department for Science, Innovation and Technology, confirmed that Anthropic’s Mythos could conduct an end-to-end attack chain.

The research also confirmed that Anthropic was not alone: OpenAI’s latest model, GPT-5.5, surpassed Mythos on tasks in both practitioner- and expert-level attack chains. In addition, GPT-5.5 became the second model, after Mythos, to complete a 32-step corporate network attack simulation with a 100-million token budget: Mythos succeeded in 3 out of 10 attempts, while GPT-5.5 succeeded in 2 out of 10 attempts.

Chart showing AI models’ advanced cyberattack capabilities

Anthropic’s Mythos is not alone: OpenAI’s GPT-5.5 scores better on complicated attacks. Source: AISI (highlighted colors by author)

AI Usage Spans the Kill Chain

Cyberattack groups are trying to turn these benchmarked tests into reality, and nearly every major company developing foundational AI models has released reports on the status of attacker usage of their models. In February, OpenAI reported on a bevy of cyber operations that the company detected and blocked, including cyber-espionage and nation-state campaigns linked to China and Russia and romance scams linked to a Cambodian cybercriminal network.

In May, Google reported evidence of autonomous malware operations using AI, a zero-day exploit that its Google Threat Intelligence Group (GTIG) believed was developed with AI, and attempts to use AI to evade defenses.

Attackers are exploring better ways of using AI beyond creating pitch-perfect phishing emails, says Luke McNamara, deputy chief analyst at GTIG.

“The most common use cases that we are still seeing involves things like research, troubleshooting code, things of that nature, but we are starting to see some of these more advanced usages of AI — whether it’s AI for vulnerability discovery [or] exploit generation,” he says. “What we have seen is threat actors using AI at virtually every stage of the attack lifecycle.”

While the models are important, the scaffolding or “harness” around the model — which Anthropic describes as the code, architecture, and tooling around the AI models — is what will truly set expert-level automated attack chains apart from ineffectual attack automation, says Vinnie Liu, CEO and co-founder of Bishop Fox, an offensive cybersecurity services firm. The best harnesses include a variety of tests, quality checks, and additional agents to enforce rules, he says.

Liu pointed to the curl project’s elimination of its bug bounty in the face of a surge of AI slop as an example of what happens when AI models do not have a proper harness. Six months earlier, a properly set up AI-based code analyzer found more than 40 issues that were “[m]ostly smaller bugs, but still bugs and there could be one or two actual security flaws in there. Actually truly awesome findings,” the maintainer of curl, Daniel Stenberg, posted on social media platform Mastodon.

“What separates a dangerous operator from a noisy one is the scaffolding … and the operator who keeps it honest and in bounds, especially with the more powerful models,” Liu says. “A model is an engine without a steering wheel; the scaffolding and the operator are what steer it.”

Fully automated attacks — so-called “AI worms” — perhaps pose the greatest risks.

Fitting AI Into Attackers’ TTPs

Expanding abuse of AI models also underscores the challenges in determining how AI-powered attacks should fit into existing frameworks for classifying offensive methodology — the tactics, techniques, and procedures (TTPs) commonly used by cyberattackers and researchers.

Anthropic used telemetry from 832 accounts banned for malicious activity, mapped that activity to the MITRE ATT&CK framework, and assigned a risk score to the threat actor using what the company calls the AI Risk Enablement Score or ARiES — “a composite score built from three signals: the actor’s threat profile, the model’s contribution to the requested harm, and the observed or potential impact,” the firm stated in its report. The findings show that attackers mainly used AI for malware creation and obfuscation, but also revealed that there is no real way to measure the quality of attackers’ AI scaffolding.

The Anthropic report detailed a campaign, tracked as GTG-1002, in which a threat actor breached government and critical infrastructure organizations in several countries. The threat actor achieved a maximum risk score of 100 by developing scaffolding that used Claude as “an autonomous operator” rather than just an advisor, the report stated.

However, the threat actor’s MITRE profile, which included 30 techniques across 13 tactics, suggested only a medium-risk actor.

“The most dangerous actors are now using AI to orchestrate attacks rather than simply build tools that enable such attacks, and the framework threat investigators use to track threats has yet to catch up,” the researchers stated.

It’s a challenge that MITRE is already working on, engaging with AI model developers and the cybersecurity community to determine how to improve the ATT&CK framework in the AI era, says Adam Pennington, who leads the framework at MITRE.

“We are always looking to evolve ATT&CK to meet what is going on and the latest behavior from adversaries, and AI is no exception to that,” he says. “We are looking for the right way to both track the behaviors we are seeing adversaries do, and try to do as best by defenders as we can.”

True Concerns, or Targeted Campaigns

A cyberattacker using a well-harnessed AI will be able to conduct research on vulnerabilities, search for opportunities to move laterally in a network, and exfiltrate data all in parallel, making speed one of the most significant benefits, says Gunter Ollmann, chief technology officer at Cobalt, a penetration testing firm.

“The time — from uncovering what may be a bug to actually being able to exploit it — gets highly compressed, while the skillset of the adversary has been lowered, so that increases the volume of people that have that capability to actually go and do this,” he says.

With threat actors’ proven use of AI services to power their attack chains and the potential risks of attackers outpacing defenders, the government has taken action. In early June, President Trump issued an executive order calling for voluntary testing of frontier AI models in the 30 days prior to release. Anthropic also looks ready to give the European Union access to its latest model to satisfy the bloc’s security concerns.

Yet the Trump administration appears focused mainly on Anthropic. In February, for example, the US Department of Defense labeled the company “a supply-chain risk” after Anthropic sought to limit military applications of the technology. In its statement notifying users that it would comply with the government’s latest legal directive by blocking all access to Fable 5 and Mythos 5, Anthropic voiced its disagreement with the administration’s singling out of its technology.

“If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers,” Anthropic stated. “As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.”
