Pretty good day yesterday. Rory won, Orban lost, Socal got some rain, and I had my first Nessie Burger in about a year. All good.
Then in the middle of the night last night, during my usual sleepless period, I read something that blew me away. IMHO perhaps the biggest story of the year, maybe the decade, and 99% of the public isn’t even aware of it.
There *has* been a lot of press about AI, and I’ve done plenty of reading, writing, and some hands-on experimentation with the large language models (LLMs) that make up the AI landscape today. But something has happened in the last month or so that changes things in a big way.
Earlier this year there was a much-publicized disagreement between Anthropic (one of the 3-4 leading LLM creators) and the US Department of Defense (I refuse to use Hegseth’s infantile name change to Dept of War). The US DoD tried to strong-arm Anthropic into delivering its newest LLM, Claude Mythos, to the US government without restrictions. Anthropic said no; they wanted to ensure that Mythos and their other LLMs would not be used in autonomous weapons. That promptly got them blackballed from any government contracts. The DoD went so far as to threaten other defense contractors, saying they could use no Anthropic product in their systems lest they be blackballed as well. I chalked all that up to a normal disagreement between a government that always expects no limits on its contracts with industry and a new company trying to do what it considered “the right thing”. Nothing unusual.
The unusual part came when Anthropic announced that they would not be releasing Mythos to *anyone* because of its dangerous capabilities. Now that’s weird – why would they spend billions creating a new LLM and not sell it? But now the story makes sense. In the past few weeks Anthropic has explained that Mythos is spectacularly good at analyzing computer systems and finding new vulnerabilities and exploits – unexpectedly, it is the ultimate hacker. Anyone with a copy of Mythos – any person, company, or nation – can take control of every computer system on earth. Every operating system, every browser, every important application – Mythos is proving to be wildly capable of finding previously unknown flaws (called “zero-day vulnerabilities” in the biz) AND coming up with new exploits that allow one to take control of the system/device. That second part is another brand-new capability of Mythos – previous LLMs have shown some ability to find new vulnerabilities, but lacked the skill of creating a workable exploit (an exploit is simply a recipe for hacking the system, a recipe that any software-savvy human can follow).
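To make that vulnerability-vs-exploit distinction concrete, here’s a toy sketch of my own (nothing to do with Mythos or any real product) – a deliberately flawed PIN check, where the flaw is the *vulnerability* and the specific input that abuses it is the *exploit*:

```python
# Toy illustration: a vulnerability is a flaw; an exploit is the
# concrete recipe that abuses it. This function is intentionally buggy.

def check_pin(stored: str, entered: str) -> bool:
    # VULNERABILITY: startswith("") is always True, so an empty entry
    # (or any correct prefix) passes -- a flaw a code auditor might find.
    return stored.startswith(entered)

# The EXPLOIT is the recipe any software-savvy person can follow:
# just submit an empty string, or guess the PIN one digit at a time.
assert check_pin("4921", "") is True   # empty input bypasses the check
assert check_pin("4921", "4") is True  # partial guesses also pass
assert check_pin("4921", "5") is False # only wrong prefixes fail
```

Finding the `startswith` flaw is the vulnerability-discovery half; writing down “submit an empty string” is the exploit half. Per the reporting above, earlier LLMs could manage the first half but not reliably the second.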
Now I see why the DoD strong-armed Anthropic to get a copy of the LLM. This capability is the fever dream of the NSA. It’s the atomic bomb for digital systems – whoever has it pretty much rules the online world. Whether you like Anthropic’s answer to the US government or not, imagine if China had gotten its hands on Mythos first. Or Iran?!? It would have been 100% the end of US tech dominance.
Anthropic out-maneuvered the DoD by creating Project Glasswing, allying itself with a consortium of trustworthy cybersecurity companies that want access to Mythos in order to understand their clients’ product vulnerabilities and fix them before Mythos gets out into the world. Great idea – let’s fortify the world’s computer systems before we allow an infinite hack machine to take aim at them. From Zvi Mowshowitz’s Substack:
> The decision not to release Claude Mythos is not about an amorphous fear. If given to anyone with a credit card, Claude Mythos would give attackers a cornucopia of zero-day exploits for essentially all the software on Earth, including every major operating system and browser. It would be chaos.
>
> Or, in theory, if Anthropic had chosen to do so, it could have used those exploits. Great power was on offer, and that power was refused. This does not happen often.
>
> Instead Anthropic has created Project Glasswing. Mythos is being given only to cybersecurity firms, so they can patch the world’s most important software. Based on how that goes, we can then decide if and when it will become reasonable to give access to a broader range of people.
>
> Who counts as this ‘we’ is suddenly quite the interesting question. The government picked quite the month to decide to try and disentangle itself from all Anthropic products. Anthropic says it is attempting to work with the government, so that they too can fix their own systems before it is too late. Hopefully that can happen. I also hope that there isn’t an attempt by the government to hijack these capabilities to use them in an offensive capacity. That would be a very serious mistake.
I have a new candidate for the next Nobel Peace Prize – Anthropic’s CEO, Dario Amodei. In my opinion, he just saved the world. Amodei’s focus is creating LLMs that are well-aligned. In AI, alignment means that the model will do no harm, and will actively resist taking actions that harm humanity. Alignment is a big deal, and Mythos is Anthropic’s best-aligned LLM to date.
Footnote – alignment is not a new idea. Waaaaay back in the 1960s, the movie 2001: A Space Odyssey had an AI character – the HAL 9000 – that became functionally insane because the government secretly rewrote part of it, made it lie to the crew, and created a misalignment in the AI. This was a prescient bit of writing on Arthur C. Clarke’s part, but what else would you expect from the SF writer who also came up with the idea for geosynchronous communication satellites? Clarke was a bona fide genius.
There is just *so much* to take in here. I think Mythos is showing us the future of LLMs – we’re gonna have a series of savants, super-AIs within a narrow discipline. Mythos is the software and hacking savant, at least for now. There will be AI savants in medicine, math, chemistry, virology, physics… any discipline that is complex, has well-defined rules, and has a huge body of knowledge that the LLM can be trained on. Not AGI/ASI (artificial general intelligence or super-intelligence), but immensely capable savants in a narrow field. The LLMs have inhuman focus, inhuman ability to absorb facts/rules, inhuman ability to absorb the complexity of something like an operating system (example – Apple’s macOS comprises between 80 and 100 million lines of code, too much for any human to understand). Inhuman persistence and attention to detail. We’ve reached the point where we are creating super-intelligences for narrow disciplines, and this could be a tremendous boon to humanity – or it could be a disaster. That’s why I love Anthropic’s focus on AI safety and alignment. Let’s not build Skynet.
Stuff like this really makes me want to get back into the tech business. It’s gotten interesting again, in the way it was 50 years ago at the dawn of the computing era. Fascination with personal computing caused me to become an electrical engineer and set me on my course. This is the most fascinating thing I’ve seen since then.














