Anthropic Teams Up With Amazon, Microsoft, and Google on AI Jailbreak Framework

Anthropic

The return of Anthropic’s Claude Fable 5 is currently under the spotlight. After nearly three weeks of export restrictions, the company has emerged with a proposal that could shape how the AI industry responds to security vulnerabilities. Along with Amazon, Microsoft, Google, and other partners, Anthropic is pushing for a common way to measure AI jailbreaks.

Also Read: Ripple, Visa, BlackRock Join 140+ Firms to Back Open USD Stablecoin

Anthropic Restores Claude Fable 5 After US Lifts Export Restrictions

Claude Fable 5
Source: Bloomberg

Anthropic confirmed that Claude Fable 5 will return globally on July 1 across Claude.ai, Claude Code, Claude Cowork and its developer platform after the US Commerce Department lifted export restrictions imposed on June 12.

The restrictions followed a report from Amazon researchers describing an AI jailbreak that prompted Fable 5 to identify software vulnerabilities. In one instance, it was asked to generate exploit code. The Commerce Department temporarily required Anthropic to restrict access to both Fable 5 and its cybersecurity-focused counterpart, Mythos 5. Meanwhile, the issue was reviewed.

According to Anthropic, engineers developed a new safety classifier that now blocks the reported jailbreak technique in more than 99% of test cases. Requests flagged by the system are automatically redirected to Claude Opus 4.8. This is a less capable model. Amidst this, the company continues refining the filters to reduce false positives during everyday coding tasks.

Source: Anthropic

The company also noted that testing found similar defensive cybersecurity capabilities in other advanced AI models. This includes OpenAI’s GPT-5.5 and Kimi K2.7, arguing that the reported behavior was not unique to Fable 5.

Also Read: India, Japan Eye Yen-Rupee Trade Settlement in Fresh De-Dollarization Push

Amazon, Microsoft, and Google Join AI Jailbreak Initiative

Anthropic is now trying to turn the episode into something bigger than a product update. Working with Amazon, Microsoft, Google, and other Glasswing partners, it is drafting a shared framework for evaluating AI jailbreak severity across the industry.

The proposal measures jailbreaks using four factors. This includes capability gain, breadth of impact, ease of weaponization, and discoverability. Anthropic says the framework could help developers prioritize the most dangerous vulnerabilities while giving governments a more consistent basis for evaluating frontier AI systems.

The company is also expanding cooperation with US agencies through pre-release model testing, threat intelligence sharing, and joint AI security research. The announcement comes as regulators worldwide increase scrutiny of advanced AI models, with governments seeking clearer standards around safety testing and deployment.

Also Read: XRPL Lending Protocol Enters Final Voting Stage as XRP Targets $1.20 in July

Sahana Kiran

Written by Sahana Kiran

Sahana Kiran has been covering financial markets since 2019, with a focus on cryptocurrencies, fintech, and the geopolitical events shaping them. She previously reported for AmbCrypto and Watcher Guru, and now writes for BlockNow.