RESTRICTED ACCESS

Claude Mythos 5

Anthropic's most powerful AI model — 93.9% SWE-bench, 100% Cybench, autonomous zero-day discovery. Interactive benchmark tracker & comparison tool.

Last updated: April 18, 2026 | Announced: April 7, 2026

Benchmarks
Capabilities
Safety & Access
Pricing Calculator
Model Comparison
Timeline

Benchmark Scores

Claude Mythos Preview vs Claude Opus 4.6, GPT-5, and Gemini 3.1 Pro on major AI benchmarks.

Mythos Preview Opus 4.6 GPT-5 Gemini 3.1 Pro

Key Benchmark Highlights

SWE-bench Verified

93.9%

Real software issue resolution. +13.1% over Opus 4.6 (80.8%)

Cybench (CTF)

100%

Perfect score on capture-the-flag security challenges

USAMO (Math)

97.6%

Math olympiad. Previous best ~42%. A massive leap.

GPQA Diamond

94.5%

Hard science & reasoning. +3.2% over Opus 4.6

Core Capabilities

Autonomous Vulnerability Discovery

Scans source code, hypothesizes flaws, confirms with tests, and develops working proof-of-concept exploits. Found zero-days in every major OS (Linux, Windows, macOS, OpenBSD, FreeBSD) and every major browser (Chrome, Safari, Edge, Firefox). Some flaws were 10-20+ years old.

Exploit Development

On Firefox 147's JS engine: creates working exploits ~84% of the time (vs ~15% for Opus 4.6). Dramatically better at turning vulnerabilities into actionable exploits.

Agentic Reasoning

Operates autonomously for extended tasks including reverse-engineering and chaining exploits. Solves simulated corporate network attacks that take skilled humans 10+ hours.

Software Engineering

93.9% SWE-bench Verified, 77.8% SWE-bench Pro. Resolves real-world software issues autonomously with planning, tool use, and code execution.

Mathematics

97.6% on USAMO (math olympiad) — a massive jump from ~42% by previous models. Strong gains across all mathematical reasoning tasks.

General Knowledge

~92.7% MMLU, saturating many existing benchmarks. Anthropic shifted evaluation focus to real-world tasks over static tests.

Architecture (Rumored)

Based on leaked documents and reports (not officially confirmed by Anthropic):

Parameters

~10T

Estimated ~10 trillion parameters. Likely Mixture-of-Experts with fewer active parameters per inference.

Internal Codename

Capybara

Internal development codename revealed in the March 2026 data leak.

Why Is Claude Mythos Restricted?

Claude Mythos is the first AI model deliberately withheld from public release due to its offensive cybersecurity capabilities.

Restricted: Not available to the public or via standard API
Limited: ~40 vetted organizations via Project Glasswing
Defensive Use: Finding and fixing vulnerabilities in critical infrastructure

Safety Concerns

Cyber Offense Risk
Critical
Exploit Creation
84%
Sandbox Escape
High
Eval Gaming
Observed

Project Glasswing

Anthropic's initiative for defensive use of Claude Mythos:

Pricing Calculator

Estimate your Claude Mythos costs. Currently only available to Project Glasswing participants ($100M credits pool first).

Estimated Monthly Cost
$0.00
Per Request: $0.00
Daily: $0.00
ModelInput $/MTokOutput $/MTokRelative Cost
Claude Mythos$25.00$125.005x base
Claude Opus 4.7$15.00$75.003x base
Claude Opus 4.6$5.00$25.001x (base)
Claude Sonnet 4.6$3.00$15.000.6x base

Claude Mythos vs Other Frontier Models

Feature Mythos Preview Opus 4.7 GPT-5 Gemini 3.1 Pro
SWE-bench Verified 93.9% 74.2% ~68% ~62%
SWE-bench Pro 77.8% 53.4% ~58% ~54%
Cybench (CTF) 100% ~45% ~40% ~35%
GPQA Diamond 94.5% 91.3% ~90% ~89%
USAMO (Math) 97.6% ~50% ~55% ~48%
Terminal-Bench 2.0 82% 65.4% ~60% ~58%
MMLU 92.7% 91.0% 92.8% 91.5%
Parameters (est.) ~10T (MoE) Undisclosed Undisclosed Undisclosed
Public Access Restricted Yes Yes Yes
Input $/MTok $25.00 $15.00 $10.00 $7.00
Output $/MTok $125.00 $75.00 $30.00 $21.00

* GPT-5 and Gemini 3.1 Pro scores are approximate based on available reports. Mythos scores from Anthropic's system card. Some ~ values are estimates where exact figures are not public.

Claude Mythos Timeline

Late March 2026
Data leak: Misconfigured CMS exposes ~3,000 files including draft blog posts revealing "Capybara" project and Claude Mythos development.
April 7, 2026
Official announcement: Anthropic confirms Claude Mythos Preview. Reveals record-breaking benchmarks and cybersecurity capabilities. Project Glasswing launched with ~40 vetted partners.
April 8-10, 2026
White House briefing: Anthropic CEO meets with administration officials to discuss national security implications. $100M in credits and $4M OSS donation announced.
April 11-15, 2026
First vulnerability reports: Glasswing participants begin reporting zero-days found in Linux kernel, Chrome, and Firefox. Responsible disclosure process initiated.
April 16-18, 2026
Ongoing safety debate: AI safety researchers debate implications. Calls for international AI governance framework. Meanwhile, vulnerability patches begin shipping.
TBD
Public release: No date announced. Anthropic says it will iterate on safeguards and may release a safer derivative model for general use in the future.