What does “recursive self-improvement” mean for the technology?
WHEN ANTHROPIC, an artificial-intelligence lab, debuts on stock markets later this year, it is likely to be one of the biggest initial public offerings in history. That’s because Claude, the company’s chatbot, is beloved of coders, who are willing to pay a lot for access. Since Claude Code, its software-engineering agent, launched in February 2025, it has become indispensable for developers around the world. That includes Anthropic’s own: more than four-fifths of the code it published in May was written by Claude, the company says. Before Claude Code, the percentage was “low single-digits”.
The systems have improved in quality of output as well as quantity. An influential benchmark from METR, a think-tank, shows that in early 2025 Anthropic’s models could complete tasks that took human engineers a little under an hour. The company’s latest systems can complete tasks that would take more than a working day.
And so it may be easy to raise a cynical eyebrow when the company, at the top of its game and outclassing the competition, calls for the world to have “the option to slow or temporarily pause frontier AI development”, as it did on June 5th. What market leader would not wish that its competition stop trying to catch up?
I, robot
Yet Anthropic’s leaders, who have for years worried about the prospect of out-of-control AI wreaking havoc, seem sincere. The latest generation of AI models are such competent coders, engineers and (soon) scientists that many worry they may be among the last ever made by humans. Jack Clark, an Anthropic co-founder, thinks there is a 60% chance that, by the end of 2028, an AI system will be capable of creating its own successor with no human involvement at all.
That moment would mark the beginning of a process called “recursive self-improvement” (RSI), a closed loop. Version one of a model produces version two, which is faster and more capable; version two produces version three, which is more so again. The loop continues, and the improvements grow with each iteration. Build an AI system capable of this, and your human engineers never need to build another one again. “What can seem to many like a fanciful story may instead be a real trend,” says Mr Clark.
Nobody knows for sure what the consequences of recursive self-improvement would be. Because AI can, unlike humans, work tirelessly and constantly, some think it would in short order lead to a superintelligent AI, a “fast take-off”. (It has also been onomatopoeically dubbed “going foom”, for the sound one might imagine an intelligence explosion making.) AI doomers fear the superintelligence would be beyond human control, and that the start of RSI is the moment at which humanity’s fate is handed over to the machines. Yet a self-improving AI would probably face speed limits, at least at first.
Building a model capable of RSI would require automating a range of specialist tasks currently carried out by humans. At present data scientists work on the theory of AI and coders put it into practice. Systems engineers build the foundations on which toy models can be raised to production scale. Other people seek out novel sources of training data, or experiment with ways to generate it fresh. Alignment and safety teams check that what comes out of the training process won’t cause harm, intentional or otherwise.
The joy of repetition
Not all of those teams are equally amenable to AI assistance, and within each specialism some tasks are more automatable than others. It will not be too long until a human coder can do their job without ever writing a line of computer code themselves, but it may be some time until an AI is able to negotiate to acquire a previously undigitised collection of scientific papers.
It is not always obvious how the “jagged frontier” will progress. Designing new algorithms seemed one of the safer jobs, until one of Google DeepMind’s models, AlphaEvolve, began doing it in May 2025. It proposed a change to how Google spreads workloads across its data centres that saved 0.7% of the company’s worldwide computing power, and found better ways to perform matrix multiplication, which speeded up the training of Gemini, the company’s flagship large language model (LLM), by 1%.
Full RSI requires every task in this chain to become automated. The AI-powered acceleration of research and development (R&D) may be felt before then, however. “As the fraction of AI R&D performed by AI systems increases, the productivity boost over human-only R&D” could increase ten-fold, then a hundred-fold, then a thousand-fold, according to a report published in January by the Centre for Security and Emerging Technology (CSET), a think-tank within Georgetown University. In that scenario, it warns that even if some aspects of AI R&D are initially difficult to automate, “the accelerated rate of progress means those bottlenecks are soon overcome.”
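One way to see how multipliers of that size could arise is an Amdahl's-law-style calculation. The sketch below is an illustrative assumption, not the CSET report's own model: it supposes that a fraction of R&D work is done by AI at some multiple of human speed while the rest proceeds at human pace. The function name `rd_speedup` and the fractions tried are hypothetical.

```python
def rd_speedup(ai_fraction: float, ai_speed: float = float("inf")) -> float:
    """Overall speed-up when `ai_fraction` of the work runs `ai_speed` times faster.

    Assumption (not from the article): the remaining work still runs at
    human pace, so it becomes the bottleneck, as in Amdahl's law.
    """
    if not 0.0 <= ai_fraction <= 1.0:
        raise ValueError("ai_fraction must be between 0 and 1")
    if ai_speed <= 0:
        raise ValueError("ai_speed must be positive")
    human_time = 1.0 - ai_fraction       # share of work still done by people
    ai_time = ai_fraction / ai_speed     # 0.0 when ai_speed is infinite
    total_time = human_time + ai_time
    return float("inf") if total_time == 0 else 1.0 / total_time


# Even with infinitely fast AI, the human-paced remainder caps the gain.
for fraction in (0.9, 0.99, 0.999):
    print(f"{fraction:.1%} automated: roughly {rd_speedup(fraction):,.0f}x")
```

Under these assumptions a ten-fold boost needs 90% of the work automated, a hundred-fold boost 99% and a thousand-fold boost 99.9%, which is why the last few stubborn tasks matter so much.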
Today no AI model can build its own successor. But big AI models can build smaller models on their own. With human help they can build other big AI models, too. Earlier this year Andrej Karpathy, a then-independent researcher who now works for Anthropic, trained a chatbot about as capable as GPT-2, a large language model built by OpenAI in 2019. Back then the model took 168 hours of training to build on 32 state-of-the-art chips; Dr Karpathy achieved the same result using a single computer with eight GPUs, the specialised chips used to build AI, in only three hours. With some more months of work he reduced the training time for his model, Nanochat, to just over two hours.
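The scale of that saving is easier to see in chip-hours. The short calculation below uses only the figures quoted above. It is a crude comparison: it treats a 2019 chip and a current GPU as interchangeable units, which they are not, so the ratio blends hardware progress with better training methods. The helper name `chip_hours` is hypothetical.

```python
def chip_hours(chips: int, hours: float) -> float:
    """Chips multiplied by wall-clock hours: a rough measure of compute used."""
    return chips * hours


original_run = chip_hours(chips=32, hours=168)  # the 2019 training run
rerun = chip_hours(chips=8, hours=3)            # one computer with eight GPUs
ratio = original_run / rerun

print(f"2019 run: {original_run:,.0f} chip-hours")
print(f"Rerun:    {rerun:,.0f} chip-hours")
print(f"Ratio:    {ratio:.0f}x fewer chip-hours")
```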
In March he handed the work of speeding up the training process over to an AI agent called Autoresearch. In two days the training time dropped to one hour and 48 minutes, and five days after that it fell to one hour and 39 minutes. “I didn’t touch anything,” Dr Karpathy says. The 18% improvement on the human work is striking because Dr Karpathy is a particularly talented human: he was a founding member of the research team at OpenAI and the head of AI at Tesla for five years.
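The 18% figure can be checked against the quoted times. The human-tuned starting point is given only as “just over two hours”, so the sketch below does not assume an exact value. It backs out the baseline implied by an 18% cut, and separately computes the cut from a two-hour floor. Variable names are illustrative.

```python
final_minutes = 1 * 60 + 39      # one hour and 39 minutes, after seven days
reported_improvement = 0.18      # the improvement quoted in the article

# Baseline implied by the reported figure: final = baseline * (1 - improvement).
implied_baseline = final_minutes / (1 - reported_improvement)

# Lower bound on the improvement if the baseline were exactly two hours.
floor_minutes = 120
floor_improvement = (floor_minutes - final_minutes) / floor_minutes

print(f"Implied baseline: {implied_baseline:.1f} minutes")
print(f"Improvement from a 120-minute floor: {floor_improvement:.1%}")
```

The implied baseline comes out a shade under 121 minutes, which squares with “just over two hours”.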
The improvements themselves were prosaic. The AI agent picked better starting values for the training run, widened the scope of the LLM’s “attention” window and noticed that the model’s focus was wandering. None of these changes is particularly novel, Dr Karpathy says. But he had missed them. “They stack up and actually improved Nanochat,” he says.
Speed-ups of this kind are inevitable as models become more capable. Much of the work of building terabyte-size frontier models is less glamorous than the AI industry’s enormous salaries and fancy offices suggest. It involves plumbing together the layers of an infrastructure stack that are bought in from third parties, debugging hardware and software set-ups and tweaking “hyperparameters”, the initial set-up of a training run, until the outcome looks solid. An AI system can do much of that today, with little supervision.

Illustration: Timo Lenzen
But even the more nuanced intellectual work is nearing automation, says Joe Spisak, a researcher at Reflection AI, a lab based in New York that is building frontier models that are open-weight (meaning their parameters are publicly released). Give a frontier system a rough sketch of an idea for efficiency gains, and it is increasingly capable of designing an experiment, running tests on a toy model, seeing what works and responding with a plan that is ready to implement at scale.
AI models can carry out these sorts of tasks, which take humans hours, in around 30 minutes. Increasingly, humans play the role only of research director, steering the AI to run experiments, which the models code up, debug, optimise and monitor themselves. The productivity boost is alluring, but also alarming. As the role that humans play in the production process shrinks, they may lose control. The end result could be models trained by models, to achieve goals set by models, whose safety is verified only by models.
Some fear a disaster. Max Tegmark, a physicist and machine-learning researcher at the Massachusetts Institute of Technology who has devoted much of the past decade to campaigning for AI safety, likens it to a driver flooring the accelerator on the motorway with their eyes closed. The result would be certain doom, he told The Economist’s “Inside Tech” video show, as long as the driver refuses to open their eyes. Powerful AI systems could outcompete humans as the decision makers in government and commerce, says Professor Tegmark, disempowering humanity; they could offer supreme power to whoever first builds them, ushering in global totalitarianism; or they could simply cease to care about humanity at all, and gradually squeeze people out to make room for more data centres and power generation.
Three years ago, Professor Tegmark led a call for a pause in global AI development, arguing that the creation of the then-cutting-edge GPT-4 was tantamount to that blindfolded journey. This year’s CSET report warned that the systems created by RSI “pose extreme risks. This warrants preparatory action now.” Anthropic, it seems, is close to agreeing with that idea.
Hot chip
There are also several physical constraints that will, for now, impose limits on the speed at which models can improve themselves. The most important is access to compute. Despite efficiency gains, newer models continue to use more computing power to train than their predecessors, forcing progress to occur at the pace of data-centre development.
Consumer use of AI may also slow down AI-powered research and development, says Helen Toner, interim executive director of CSET and a lead author of its recent report. The limited capacity in AI data centres needs to be carefully split between serving paying customers, training future models and carrying out open-ended R&D. The more demand there is in the first category, the less capacity, in the short term, there is for the other two.
Then there is the issue of training data. Much recent progress in AI has been in areas where models can teach themselves how to succeed thanks to “verifiable rewards”. A piece of software either runs or it does not; a mathematical proof is correct or it is not. In such cases synthetic data, generated by models purely to train other models, can be checked for accuracy and added to the training data without risking the degeneracy that normally comes with training an AI on its own output. It is trickier to make a model better at creative writing or legal judgment. If the models need to learn from the real world, that could also limit the reach of self-improvement.
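The idea of a verifiable reward can be shown with a toy filter. In the sketch below, a deliberately unreliable “model” proposes answers to multiplication problems, and only those that pass an exact check are kept as training examples. Everything here is a hypothetical stand-in: the function names, the 30% error rate and the choice of arithmetic as the task are assumptions made for illustration, not a description of any lab's pipeline.

```python
import random


def propose_answer(a: int, b: int, rng: random.Random) -> int:
    """Stand-in for a model: returns a * b, but is wrong about 30% of the time."""
    answer = a * b
    if rng.random() < 0.3:
        answer += rng.choice([-2, -1, 1, 2])  # inject a small error
    return answer


def verify(a: int, b: int, answer: int) -> bool:
    """The verifiable reward: the answer is either correct or it is not."""
    return a * b == answer


def build_verified_dataset(n_samples: int, seed: int = 0) -> list[tuple[str, int]]:
    """Generate synthetic examples and keep only those that pass the check."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        a, b = rng.randint(2, 99), rng.randint(2, 99)
        answer = propose_answer(a, b, rng)
        if verify(a, b, answer):
            kept.append((f"{a} * {b}", answer))
    return kept


dataset = build_verified_dataset(200)
print(f"Kept {len(dataset)} of 200 generated examples")
```

No such checker exists for a short story or a legal opinion, which is why those domains are harder to improve with synthetic data.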
“Closing the loop” may be a step on the road to superintelligence and, depending on your disposition, utopia or doom. But it is not the only step required to produce exponential growth in AI’s capabilities. ■
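Argument analysis
Type: causal
Core claim
“Recursive self-improvement” (RSI), in which AI systems develop AI themselves, is becoming a reality and is nearing an inflection point that could lead to an intelligence explosion beyond human control.
Logical structure
- Premise: Since its launch, Anthropic’s Claude Code has come to write more than 80% of Anthropic’s own code, showing how sharply AI’s software-engineering ability has improved. According to the METR benchmark, the tasks models can handle have grown from those taking under an hour to those taking more than a day.
- Diagnosis: Jack Clark, an Anthropic co-founder, puts at 60% the chance that, by the end of 2028, an AI will be able to create its successor with no human involvement. He sees this as the starting point of “recursive self-improvement” (RSI), a closed loop in which models repeatedly produce better models.
- Supporting argument: Google DeepMind’s AlphaEvolve moved into algorithm design, an area thought to be relatively safe, optimising the distribution of data-centre workloads and speeding up the training of Gemini by 1%. The “jagged frontier” of what AI can automate is expanding unpredictably.
- Supporting argument: In the Nanochat experiment run by Andrej Karpathy, an AI expert formerly of OpenAI and Tesla, the AI agent Autoresearch cut training time by 18% in seven days. The AI found and applied improvements that a top-level human researcher had missed, a concrete demonstration that RSI is feasible.
- Supporting argument: A report from the Centre for Security and Emerging Technology (CSET) at Georgetown University warns that as AI’s share of R&D rises, productivity could be amplified ten-fold, a hundred-fold and a thousand-fold, and that even the early bottlenecks to automation would soon be overcome by the accelerated rate of progress.
- Diagnosis: Today’s AI models can build small models on their own and, with human help, large ones too, while the human role is shrinking to that of a “research director” who steers experiments. This could end with models training models, and with safety verified only by models.
- Counterargument: Max Tegmark, a professor at MIT, likens RSI to “flooring the accelerator on the motorway with your eyes closed”. He warns that superintelligent AI could push humanity out of decision-making, hand totalitarian power to the few who control it, or lose interest in humanity altogether.
- Counterargument: Physical limits also constrain RSI: (1) bottlenecks in computing resources and the pace of data-centre expansion; (2) competition for GPU capacity between serving commercial customers and R&D; (3) the limits of synthetic data in areas without “verifiable rewards”, such as creative writing and legal judgment.
- Conclusion: RSI is only one step on the road to superintelligence, and exponential growth in capabilities requires further steps, but its starting point is arriving faster than expected.
Conclusion
Recursive self-improvement in AI has already begun in an early form. Even if physical constraints slow it down, the time in which humanity can keep control is limited, so preventive action is needed now.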