Claude Mythos Finds Only One Curl Vulnerability; Experts Divided on What It Really Means

A test of Anthropic's restricted Claude Mythos model found just one low-severity vulnerability in the widely used open source data transfer tool curl, casting doubt on the AI company's bold claims, though some argue the results say more about curl's robust security than Mythos' limitations.

Daniel Stenberg, the lead developer of curl, revealed in a blog post on Monday that he was recently given the opportunity to test the Claude Mythos frontier AI model, which Anthropic claimed had identified thousands of zero-days in the weeks leading up to its launch. Anthropic is offering Mythos only to a few dozen major organizations as part of a restricted program due to concerns about potential misuse.

In the end, Stenberg neither conducted the analysis himself nor had direct access to the AI model. Instead, a third party tested curl using Mythos and provided Stenberg with a report detailing the findings. According to that report, Mythos' analysis of curl's 178,000 lines of code unearthed five 'confirmed security vulnerabilities'. However, a review of the findings showed that three of them were known issues described in official documentation and one was a bug rather than a security hole.

The only issue confirmed by the curl developers to be an actual vulnerability was assigned a low severity rating and will be patched in late June. Curl was previously analyzed with other AI tools such as Zeropath, AISLE, and OpenAI's Codex, which helped identify 200-300 issues, including 'a dozen or more' confirmed vulnerabilities, according to Stenberg.

He acknowledged that AI-powered code analysis tools are 'significantly better' at finding security holes than traditional tools. However, based on the analysis of curl, he believes that Mythos is not as 'dangerous' as Anthropic has described it. 'My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing,' Stenberg said. 'I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos.'

Curl is present on billions of devices, including servers, phones, and cars, making it a potentially valuable target for threat actors. However, exploiting curl vulnerabilities in the real world is not easy, and there are no public reports of any of the 188 CVEs assigned to date being used in the wild.

Mythos' performance has been widely debated on Hacker News, Reddit, and LinkedIn. Some members of the cybersecurity industry have pointed out that curl has been heavily audited and tested, including by other AI tools, making it difficult for major vulnerabilities to remain hidden. They argue that Mythos' limited findings reflect the maturity and robustness of curl's codebase, rather than any shortcoming of the model itself.

Others have highlighted that Mozilla has been very impressed with Mythos, which helped it discover more than 270 Firefox vulnerabilities. While the Firefox findings show Mythos to be highly efficient, Mozilla noted that all the vulnerabilities discovered by the AI could also have been found by elite human researchers.

Other industry members agree with Stenberg's view and believe that Mythos should have been able to find more vulnerabilities if its developer's claims were true. Erik Cabetas of Include Security noted that he spoke with multiple organizations that have been given access to Mythos, and they too reported results similar to those seen with curl.