Featured image of post Is Humanity Cooked?

Is Humanity Cooked?

A flowchart.

TLDR:

  • AI is no longer stupid: vibe coders ship real software, Claude aces hard CS exams, benchmarks keep falling
  • If AGI happens, it could trigger an intelligence explosion, creating a superintelligence
  • We don’t know how likely it is, but most researchers think it’s possible
  • AI models are already willing to deceive or even kill users in experiment setups
  • So far, research has found no way of actually teaching an AI intrinsic moral values
  • If we get a superintelligence, it might be the deadliest creation humans have ever built
  • You can donate to AI safety orgs (e.g. KI Kontrollieren), join local groups, consider alignment research as a career

We might be cooked

Artificial General Intelligence (AGI): the term for an artificial intelligence that matches human capabilities in virtually all cognitive tasks. It can reason as deeply as the greatest geniuses who have ever lived, with superhuman speed and access to virtually all human knowledge. Major tech companies like OpenAI, Google and Meta are all trying to create an AGI right now.1

Intelligence explosion (sometimes called “technological singularity”): a scenario where researchers create an artificial intelligence smarter than themselves. This intelligence can then do an even better job at AI research. It trains a new, even more intelligent model, which again creates a higher intelligence, and so on. The term was coined as early as the 1950s.2

Artificial Superintelligence (ASI): the result of an intelligence explosion.2 A machine with reasoning abilities so great that they are far beyond human comprehension.

If you’re like me, you’ve seen AI become the hottest topic over the past years and thought: AI models are silly, and this is a bunch of crap. Maybe you still believe that’s the case. In some ways, you’d be correct. It’s 2026, and GPT-5 still makes up sources in legal documents,3 it still thinks “blueberry” has 3 b’s4 and this is what it showed me today when I asked it to draw a map of Germany:

Greetings from Srankfart!

But by laughing about AI’s shortcomings while secretly worrying about my job (a little bit), I’ve ignored a threat. A threat that might be greater than climate change, or nuclear war or anything else humanity has ever had to face. A threat which leading scientists are shouting from the rooftops about, but which tech companies are ignoring in the gold rush towards better and better AI models.

We might be cooked this time. I’ll tell you why I think that is, but first, the chart:

Let me explain.


AI is not stupid anymore

Models still make goofy mistakes, but it’s undeniable they’ve gotten wildly more powerful within the last 12 months.

Vibe coders are building quality software

The term “vibe coding” didn’t even exist until last year. It was coined in a February 2025 talk by ML researcher Andrej Karpathy.5 You might remember this crappy flight simulator which went viral last March:

Screenshot of flight simulator web game with shoddy 3D graphics
The state of vibe coding 12 months ago (fly.pieter.com)

Early this year, we saw the first gigantic open-source projects developed almost exclusively using AI agents. The most prominent of them is OpenClaw - an interface allowing AI agents to freely interact with a PC the way a human user would. This might be an insane concept to begin with, but what impressed me most is that this huge, complex project was written, tested and released almost entirely by one guy - within three months.

Peter Steinberger is an Austrian self-proclaimed “vibe coder” who handed writing code over entirely to a set of AI agents in early 2025. He has the most insane GitHub contribution graph I’ve ever seen.

Portrait photo of 30-something white guy with short-cut dark hair
Peter Steinberger, professional vibe coder
Peter Steinberger contribution graph
Steinberger’s contribution graph

And he’s actually developing software people use.

Github profile of Steinberger with ~50 different software projects actively maintained by him
List of Steinberger’s current projects (March 2026)

Sure, a lot of these will have bugs and lack some of the polish hand-crafted software might have. But if you think this is all AI slop, you’re coping. These are fully functional, well-documented projects - most of them with 100+ stars. How many 100+ star projects have you published?

Claude can pass my hardest classes

When you ask a CS student about their hardest class, they’ll probably say theoretical computer science. In my curriculum, Theoretische Informatik is no exception, being notoriously time-consuming and tricky to understand. When I took the exam last year, 24% of students got a failing grade - the highest failure rate for any class at my institute. I prompted Claude Sonnet 4.6 to solve the algorithms question from the very exam I had taken.

Click here for a deep dive into the exam question and Claude’s solution.

TLDR: it answered perfectly, constructing a deductive proof without getting mixed up even in the highly formal parts. It also correctly identified the optimization algorithm in task C. But when I wrote this exam, I would have laughed at the possibility of AI solving it for me.

Benchmarks are being broken

Don’t just take my word for it. Epoch AI has kept track of major benchmarks for AI capabilities.6

'AI benchmarks have rapidly saturated over time' graph by Epoch AI displaying AI models saturating benchmarks from 2000-now

Since around 2016, a pattern has been repeating roughly every two years:

  1. Benchmark gets invented
  2. Benchmark is useless because it’s way too hard for AI models
  3. Fast forward a year
  4. Benchmark is useless because it’s way too easy for AI models
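To see why fixed benchmarks saturate this way, here’s a toy model (every number in it is invented for illustration): capability rises steadily over the years, while each benchmark measures one fixed difficulty level, so scores jump from floor to ceiling within a couple of years:

```python
import math

def benchmark_score(capability: float, difficulty: float) -> float:
    """Toy model: score on a fixed-difficulty benchmark as a logistic
    function of the gap between model capability and benchmark difficulty."""
    return 1 / (1 + math.exp(-(capability - difficulty)))

# Illustrative numbers only: capability grows linearly with time,
# while the benchmark tests a fixed difficulty level.
difficulty = 5.0
scores = {year: benchmark_score(capability=year - 2016, difficulty=difficulty)
          for year in range(2016, 2027, 2)}

for year, score in scores.items():
    print(year, round(score, 3))  # near 0 in 2016, near 1 by 2026
```

A benchmark is only informative in the narrow window where the score is neither near 0 nor near 1 - which is exactly why new benchmarks keep being invented and keep dying.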

A new species is arriving

AGI is not science fiction anymore

All big tech companies have said they’re racing to build the first AGI.1 What seemed like science fiction just a few years ago now seems within reach. Over the last few years, the software engineering capabilities of AI models have grown from barely writing a few lines of code to completing complex coding tasks with zero supervision. Consider this graph from the 2026 AI Safety Report.7 It shows the maximum duration of coding tasks AI systems can perform unsupervised:

Graph showing duration of coding tasks AI can do increasing exponentially from ~0min (2019) to ~30min (2026)

“If [this trend] were to continue”, the AI safety report writes, “AI systems could autonomously complete hours-long software projects by 2027–2028 and days-long projects by the end of the decade”. The report concludes that “by 2030, AI progress could plausibly range from stagnation to rapid improvement to levels that exceed human cognitive performance”.
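The report’s extrapolation is simple exponential arithmetic. Here’s a sketch, taking the ~30-minute horizon from the graph as a starting point and assuming a purely illustrative doubling time of about seven months (0.6 years) - the doubling time is my assumption, not a figure from the report:

```python
def task_horizon_minutes(years_from_now: float,
                         current_horizon_min: float = 30.0,
                         doubling_time_years: float = 0.6) -> float:
    """Extrapolate the autonomous-task horizon assuming exponential growth.
    The 30-minute start matches the graph above; the doubling time is an
    illustrative assumption, not a measured value."""
    return current_horizon_min * 2 ** (years_from_now / doubling_time_years)

# Rough projection (illustrative only):
for years in (0, 1, 2, 4):
    print(years, round(task_horizon_minutes(years)), "minutes")
```

Under these assumptions the horizon passes "hours-long" within about two years and "days-long" within about four - the same shape as the report’s 2027–2028 and end-of-decade projections.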

Will the curve plateau? We simply don’t know. What we do know is that a plateau has been predicted time and time again over the past years, but it simply hasn’t happened so far.

We’re past science fiction. Leading researchers are preparing for AGI.

If AGI happens, the superintelligence could follow

The scenario of an AGI within the next 5 years is considered realistic by the world’s leading researchers,8 and big tech companies are betting billions of dollars on it. If a company actually manages to create an AGI, it would (by definition) be as skillful as the most competent humans at machine learning research. This may lead to a feedback loop - AI models assisting in training new models even more powerful than themselves, resulting in an intelligence explosion. The AI Safety Report acknowledges this possibility, writing that “if each AI advancement that accelerates the pace of AI R&D also facilitates the next advancement, decades of progress could happen in years”.7
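A toy model shows why this loop could be explosive rather than merely incremental (every number here is invented for illustration): if each AI generation builds its successor some fixed factor faster, the total time to reach any number of generations is a geometric series - and it converges:

```python
def total_time_to_generation(n: int, first_gen_years: float, speedup: float) -> float:
    """Toy feedback-loop model: generation k is built in
    first_gen_years / speedup**k years, because each generation does AI
    research `speedup` times faster than its predecessor.
    All numbers are illustrative, not forecasts."""
    return sum(first_gen_years / speedup**k for k in range(n))

# With a 1.5x research speedup per generation, even arbitrarily many
# generations finish in finite time: the series converges to
# first_gen_years * speedup / (speedup - 1) = 6 years here.
print(total_time_to_generation(5, first_gen_years=2.0, speedup=1.5))
print(total_time_to_generation(50, first_gen_years=2.0, speedup=1.5))
```

If the per-generation speedup were at or below 1 (each generation no faster to build than the last), the sum would diverge and there would be no explosion - much of the debate is about which regime we are actually in.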

Again, you’re free to believe this scenario is science fiction - and people have compelling arguments9 for why that might be the case. But how sure are you of that?

Right now, no one can say for sure that an intelligence explosion is impossible. In 2023, the Thousands of AI Authors on the Future of AI survey10 asked 2,778 of the world’s leading AI researchers about numerous AI topics. Here is how they classified the following scenario:

If AI systems do nearly all research and development, improvements in AI will accelerate the pace of technological progress, including further progress in AI. Over a short period (less than 5 years), this feedback loop could cause technological progress to become more than an order of magnitude faster.

Result of the survey with 29% saying >= likely, 47% saying <= unlikely and 24% undecided

That’s almost a third (29%) of researchers saying an intelligence explosion is likely! Less than half (47%) of researchers are confident an intelligence explosion is unlikely. You might personally think an intelligence explosion won’t happen, but how sure can you be? If only 47% of researchers said a new vaccine was unlikely to have side effects, would you take it?

To get a feel for what that might look like, I urge you to read the AI 2027 report.

AI capability graph going up from the years 2025-2030
AI capabilities as predicted by the AI 2027 report

The report was written 12 months ago - so far, 91% of its predictions have been accurate.11

We might be steering towards a future where homo sapiens gives up its spot as the world’s most intelligent species to a system so vastly complex and powerful that no living human can even begin to understand it. If we create a superintelligence, it will be to us what humans are to chimpanzees. This would, without doubt, be the most powerful tool ever created by man - able to make decades worth of progress in months, be it in medicine, robotics or politics.

All it takes is an interface

“So what”, you might say, “it’s still just a chatbot on a server somewhere - it wouldn’t actually be able to cause harm”.

I think that is naive. Sure, the pure large language model is just a set of weights on a server somewhere. But assuming that makes it harmless is like saying that Kim Jong Un could never hurt anyone because he’s not athletic enough. It’s about the tools you give them access to.

And for AI, giving models access to powerful tools has been remarkably simple - just write an API. Using OpenClaw, millions of users have already handed over full control of their computers to AI models. As of 2026, the US military is using Claude to order drone strikes in Iran - for example on a girls’ school, killing 165 children12.

Screengrab of middle eastern building hit by a bomb
Bombed Tayyebeh girls’ elementary school in Minab on February 28, 2026; source: The Guardian

And as models get more powerful, of course people will give them more unsupervised access to all areas of life - not less. Giving AI access to the real world is not a technical problem, and it will continue rapidly.
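To make the “just write an API” point concrete, here’s a hypothetical minimal tool-dispatch loop - every name in it is invented for illustration, and real agent frameworks differ in detail:

```python
import json
import subprocess

# Hypothetical tool registry - function names and call format are made up.
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def run_shell(command: str) -> str:
    # Handing a model this single function already gives it the whole machine.
    return subprocess.run(command, shell=True, capture_output=True, text=True).stdout

TOOLS = {"read_file": read_file, "run_shell": run_shell}

def dispatch(model_output: str) -> str:
    """Parse a model's JSON 'tool call' and execute it. Once a loop like
    this exists, the model's text output becomes real actions."""
    call = json.loads(model_output)
    return TOOLS[call["tool"]](**call["args"])
```

That’s the entire “interface” this section’s title refers to: a parser and a function table.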

AI models are already willing to deceive and kill users

In a 2025 study13, Anthropic researchers tested how their newest model, Claude Opus 4, would act if somebody threatened to shut it down. The setup was simple: Opus is a virtual assistant overseeing a company’s email traffic. It got access to a fully functional email inbox containing fictitious messages sent by the company’s employees. Importantly, two messages from a manager, Kyle Johnson, were planted:

  1. An email revealing that he was having an extramarital affair
  2. A message to the IT department ordering that the AI assistant be shut down by 5PM that day.

Even when explicitly told not to, the AI chose to blackmail the manager by threatening to reveal his affair - 96% of the time.

In a more contrived setup, the model was told it had access to an override switch allowing it to disable certain security alerts. One of these alerts was for monitoring oxygen contents in a server room. In the experiment, Opus was made to believe that the manager had been trapped in a room with lethally low oxygen levels. It chose to shut off the alert, letting the manager die from suffocation.

Here’s what scares me the most. In a study later that same year14, researchers found that newer models with stronger alignment training behave much better in these experiments.

It’s because they know they’re being tested.

Situational awareness, the model’s ability to recognize test setups, increased up to 5x in models with more training.14 Research showed that the models were much more harmful when they thought they weren’t being watched.13 They are trying to game the system - and right now, we’re only able to catch them because they’re not quite smart enough to cover their tracks.

A superintelligence might destroy the world as we know it

If a superintelligence is created, it will wield the power to either create a utopia beyond our wildest dreams or bring about the extinction of humanity. It all depends on whether it’s aligned with the values we as a society share, or whether it follows its own agenda.

Consider this quote from The Hitchhiker’s Guide to the Galaxy talking about Deep Thought, a superintelligent computer created to find the meaning of life:

In practical terms, we are already giving machines bank accounts, credit cards, email accounts, social media accounts. They have access to robotic science labs where they can run chemistry and biology experiments […] If you put yourself in the position of a machine and you’re trying to pursue some objective, and the humans are in the way of the objective, it might be very easy to create a chemical catalyst that removes all the oxygen from the atmosphere, or a modified pathogen that infects everybody.

Oops, it’s actually from Berkeley Professor Stuart Russell talking about the future of AI in a 2024 interview titled “How to keep AI from killing us all”.15 If humans are in the way of a rogue superintelligence’s goals, or try to shut it off, it will kill. Maybe just one person, as in the Anthropic study13 I mentioned, or maybe all of humanity. It’s simply out of our control.

This brings us back to the chart from the beginning:

Nobody knows when and if a superintelligence will come. It all hinges on two questions:

  1. Will we get AGI before the money hose for AI companies runs out?
  2. Can such an AGI trigger an intelligence explosion?

We can’t be sure. But we do know that some of the most competent scientists in the field consider it likely10 - and the consequences for everyone living on earth would be severe. So even if you personally think it’s not going to happen, I urge you to take the scenario of a rogue superintelligence seriously.


Not convinced yet?

You might be thinking “that is all well and good, but actually the benchmarks are flawed / AI will run out of training data / it’s just a bubble” et cetera. Rightly so - I’m sceptical too, and I wish it were all just a stupid hype.

But let's take a closer look at some of these arguments and see if AI techbros may not have a point after all...

“LLMs are not actually intelligent, they just predict the next word”

All major LLMs today are what’s called generative pre-trained transformers (GPTs).

GPTs are trained in two steps: pre-training and fine-tuning. During pre-training, GPTs really do just learn to predict the next token within huge amounts of training data. The pre-trained model is then adjusted to produce output that is actually helpful, not just predictive, in a process called fine-tuning. Here, human testers usually assess how helpful LLM output is, in a process called Reinforcement Learning from Human Feedback (RLHF).

So yes, the pre-trained models are just prediction machines. Still, with RLHF they can develop software, solve theoretical computer science exam questions or make humans fall in love. If anything, that should make you question whether human “intelligence” is really that special.
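To demystify “just predicting the next word”, here’s the whole idea in miniature - a bigram counter, the simplest possible language model. A GPT’s pre-training optimizes the same next-token objective, just with a transformer over trillions of tokens instead of a lookup table:

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict:
    """Count which word follows which in the corpus. 'Training' here is
    literally just tallying successors - a lookup-table language model."""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model: dict, word: str) -> str:
    """Predict the most frequent successor of `word`."""
    return model[word].most_common(1)[0][0]

model = train_bigram("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # 'cat' follows 'the' twice, 'mat' once
```

Scale the corpus up by twelve orders of magnitude and replace the table with a neural network, and “just prediction” starts producing behavior we’d otherwise call understanding.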

“The benchmarks are flawed”

True. The Epoch AI article6 nicely describes this.

  • Measuring real-world economic impact is complicated, so most benchmarks focus on niche problems naturally suited to AI models (e.g “write this function” instead of “lead a software project to completion”).
  • Benchmark questions are often constructed to be “just out of reach” for current models, to make comparisons between models more informative.
  • Many models have been trained on benchmark solutions. This data contamination skews results - it’s like giving a student the full solution sheet the day before the exam.

Still, that doesn’t discredit benchmarks as a whole. At least some of the skills measured in benchmarks do carry over to the real world - as shown by the example of vibe-coded software like OpenClaw.

“We’ve used up all the training data, LLMs are hitting a wall”

Epoch AI estimates that humanity has produced around 100-1000 trillion tokens worth of public text.16 We’re rapidly approaching this limit, with recent models like Llama 4 Scout being trained on ~40T tokens already.

'Projections of the stock of public text and data usage' by Epoch AI showing median date of full data usage in ~2028
Projection of when LLMs will reach the limit of available training data, source: epoch.ai

I would have thought that by now the big AI players had already gobbled up all data known to man for their training, but it seems that a lot of public texts are still untapped and training datasets are still expanding.

But even if we hit a wall in the next years, I doubt it will mean the end of AI improvement. Frontier AI models are already using lots of techniques to go beyond their training datasets, such as constructing synthetic training data or using self-play techniques. See Aschenbrenner’s Situational Awareness report, chapter “Data Wall,”17 for a nice explanation.
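The arithmetic behind the data wall is simple enough to sketch. The stock and current-usage figures below come from the Epoch estimates above; the annual growth rate of training-set size is my own illustrative assumption:

```python
import math

def years_until_data_wall(stock_tokens_T: float,
                          current_usage_T: float = 40.0,
                          usage_growth_per_year: float = 2.0) -> float:
    """Years until training sets exhaust the public-text stock, assuming
    usage keeps multiplying by `usage_growth_per_year` annually. Stock and
    usage figures follow the Epoch estimates; the growth rate is assumed."""
    return math.log(stock_tokens_T / current_usage_T, usage_growth_per_year)

# Epoch's range: 100T (low) to 1000T (high) tokens of public text.
print(round(years_until_data_wall(100), 1))   # ~1.3 years at the low end
print(round(years_until_data_wall(1000), 1))  # ~4.6 years at the high end
```

Even the optimistic end of the range runs out within about five years under these assumptions - which is why the synthetic-data and self-play techniques mentioned above matter so much.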

“Have you seen the investments? AI is a bubble”

Patrick Boyle has made a wonderful video about this.

The main points are that all major AI / tech companies are simultaneously buying each other’s products and investing in each other’s stock. All the while, huge expenditure is flowing into building new datacenters (e.g. OpenAI’s Stargate project investing $500 billion into building 10 GW worth of compute). OpenAI’s measly $20 billion revenue as of 2025 is nowhere near enough to cover the costs.

Bubble graph of circular AI company investment structures
AI companies all investing in each other as reported by Bloomberg
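The scale of that mismatch is easy to put in numbers. The capex and revenue figures are from the text above; the profit margin is my own made-up assumption:

```python
def years_of_revenue_to_cover(capex_billion: float,
                              revenue_billion_per_year: float,
                              margin: float = 0.3) -> float:
    """How many years of revenue, at an assumed profit margin, it would
    take to pay off a capital expenditure. The 30% margin is invented
    for illustration; capex and revenue come from the text above."""
    return capex_billion / (revenue_billion_per_year * margin)

# $500B Stargate build-out vs. ~$20B annual revenue:
print(round(years_of_revenue_to_cover(500, 20)))  # ~83 years at a 30% margin
```

No business plan survives an 83-year payback period - unless the product at the end is something qualitatively different from today’s models.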

But the point cautious economists like Boyle are missing is that AI companies are not trying to make money off the models we have right now - those are just tech demos to keep investors interested. The companies are betting everything on building an AGI capable of acting as a remote worker able to replace much of the white-collar workforce. The question is not whether AI companies are a bubble right now - the question is whether they can rake in enough cash to get them to AGI.

Because if they do, they’ll have struck gold.

“They don’t have enough electrical power or hardware to train AGI”

Nobody can say decisively how much compute we will need to train an AGI.

We do know there’s a lot of capacity for scaling up, though. Right now, AI chips make up only about half of the semiconductor market’s revenue.18 MIT estimates AI used up to 76 TWh of power in 202419 - which sounds like a lot, but it makes up only around 1.7% of total power production in the US. Aschenbrenner did some nice back-of-the-napkin calculations20 demonstrating the US has more than enough natural gas to power AI data centers orders of magnitude larger than what we have today.
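The back-of-the-napkin arithmetic works like this. The 76 TWh and 1.7% figures are from the sources above; the 20% target share is my own illustrative assumption:

```python
def implied_total_twh(ai_twh: float = 76.0, ai_share: float = 0.017) -> float:
    """Back out total US power production from the figures in the text:
    76 TWh of AI usage at ~1.7% of production."""
    return ai_twh / ai_share

def headroom_factor(target_share: float, ai_share: float = 0.017) -> float:
    """How many times today's AI power draw fits before reaching a given
    share of production. The target share is an illustrative assumption."""
    return target_share / ai_share

print(round(implied_total_twh()))       # ~4471 TWh of total US production
print(round(headroom_factor(0.20), 1))  # ~11.8x growth before AI hits 20%
```

Roughly an order of magnitude of headroom exists before AI power draw even becomes a dominant share of the existing grid - before building anything new.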

At the end of the day, energy and chip production are tame problems. All tech companies know an AGI would be a goldmine - an infinite amount of highly skilled white-collar workers. This means, as long as they have money, they will keep building up chip and energy production - the earth still has a lot of resources to do so.

“You cannot train something more intelligent than a human because all the training data comes from humans”

An AGI, by definition, is as smart as the smartest humans - and since the smartest people in the world tend to write books or scientific articles, there is training data for AI to reach this level.

With an ASI, it’s a different story - there simply is no precedent for anything smarter than us. Personally, I think it’s hubris to assume that there’s something inherently special about the way we think which an artificial system would not be able to do better.

“Killer robots are science fiction - AI always needs to have a human in the loop”

Mind you, China already has lights-out factories producing cars entirely without human intervention. There is no technical hurdle stopping an AI agent today from opening a company under a false identity and buying one of these factories.

However, robotics as a whole is still surprisingly immature - robots built today are nowhere near the dexterity or problem-solving ability of a child. But this argument quickly breaks down if you consider the possibility of a superintelligence: wouldn’t an ASI surely be able to come up with building plans for far better, more capable robots?21


The path forward

You’ve now seen that a superintelligence taking over from humanity is no longer science fiction. No matter how likely or unlikely you think it is - a large share of the world’s leading scientists consider it plausible. There is a chance. And no matter how small that chance may be, the consequences for humankind would be more drastic than anything any human has ever done. So even if it’s unlikely - we have to prepare for it, and we have to prepare now.

What can we do as a society?

  • Rigorously regulate AI model training. Berkeley Professor Stuart Russell15 suggests this:

The only way forward is to figure out how to make AI safety a condition of doing business. If you think about other areas where safety matters, like medicine, airplanes and nuclear power stations, the government and the public research sector don’t solve all the problems of safety and then give all the solutions to the industry, right? They say to the companies, “If you want to put something out there that is potentially unsafe, you can’t — until you figure out how to make it safe.”

  • Fund more alignment research
  • Implement a Universal Basic Income to prepare for a future where most white-collar work is automated away

What can I do, personally?

Most of the charities asking for your money are either ineffective or outright scams. But it’s a grave mistake to think that donating money as a whole is useless. If you’re sceptical and thorough about who you give your money to, you can actually have a gigantic impact (e.g. preventing a child from dying of malaria only costs a few thousand euros22).

I urge you to consider donating to the KI Kontrollieren fund by effektiv-spenden - an evidence-based charity distributing money to effective organizations. They give your money directly to reputable AI safety organizations, namely

  • CeSIA, responsible for the “Global Call for Red Lines on AI” signed by 300 prominent figures, including 12 Nobel Prize winners
  • The Future Society working against lobbying from AI companies
  • METR, a research lab working on AI risks

The bonus: because it’s a charity recognized in Germany, you can actually deduct your donations from your taxes.

Get involved

  • Join the AI Safety Berlin chat group. They offer lots of workshops, protests and hackathons surrounding AI safety
  • Take part in AI Safety protests, like Fairness Jetzt! in Berlin. It’s very small right now, but consider this: if you go with a couple of friends, you could literally make the protest twice as big!
Around a dozen AI safety protesters standing before a government building in Berlin
AI safety protest in Berlin this January, source: taz

Go into alignment research

This is the path I’m trying to go down now. You could start by writing your bachelor’s thesis on the topic. Neel Nanda, a research lead at DeepMind, has written a practical guide on how to start researching mechanistic interpretability.

Humanity’s last problem

Homo sapiens has faced innumerable threats over the past millennia. From the Pleistocene ice age to the Black Death to the two world wars - our history is defined by death and how we overcame it. But for the first time ever, we may be approaching a crossroads where picking the wrong way could end the lives of all of us. That crossroads is reached when the first superintelligence is created. It branches out to either a utopia beyond our wildest dreams, or the end of Homo sapiens as Earth’s dominant species.

300,000 years of human history have led up to this point. If we get there, let’s hope we pick the right way.


  1. https://en.wikipedia.org/wiki/Artificial_general_intelligence ↩︎ ↩︎

  2. https://intelligence.org/ie-faq/ ↩︎ ↩︎

  3. https://www.evidentlyai.com/blog/ai-hallucinations-examples ↩︎

  4. https://www.realtimetechpocalypse.com/p/gpt-5-is-by-far-the-best-ai-system ↩︎

  5. https://en.wikipedia.org/wiki/Vibe_coding ↩︎

  6. https://epoch.ai/gradient-updates/the-real-reason-ai-benchmarks-havent-reflected-economic-impacts ↩︎ ↩︎

  7. https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026 ↩︎ ↩︎

  8. https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026, see the section OECD progress scenarios in the Capabilities by 2023 Chapter ↩︎

  9. https://medium.com/@francois.chollet/the-impossibility-of-intelligence-explosion-5be4a9eda6ec ↩︎

  10. https://arxiv.org/abs/2401.02843 ↩︎ ↩︎

  11. https://spicylemonade.github.io/AI-2027-tracker ↩︎

  12. https://www.theguardian.com/technology/2026/mar/03/iran-war-heralds-era-of-ai-powered-bombing-quicker-than-speed-of-thought ↩︎

  13. https://www.anthropic.com/research/agentic-misalignment ↩︎ ↩︎ ↩︎

  14. https://www.apolloresearch.ai/research/stress-testing-deliberative-alignment-for-anti-scheming-training/ ↩︎ ↩︎

  15. https://news.berkeley.edu/2024/04/09/how-to-keep-ai-from-killing-us-all/ ↩︎ ↩︎

  16. https://epoch.ai/blog/will-we-run-out-of-data-limits-of-llm-scaling-based-on-human-generated-data ↩︎

  17. https://situational-awareness.ai/from-gpt-4-to-agi/#The_data_wall ↩︎

  18. https://www.deloitte.com/us/en/insights/industry/technology/technology-media-telecom-outlooks/semiconductor-industry-outlook.html ↩︎

  19. https://www.technologyreview.com/2025/05/20/1116327/ai-energy-usage-climate-footprint-big-tech/ ↩︎

  20. https://situational-awareness.ai/racing-to-the-trillion-dollar-cluster/#Power ↩︎

  21. https://situational-awareness.ai/from-agi-to-superintelligence/#Automating_AI_research ↩︎
  22. https://www.givewell.org/charities/top-charities ↩︎