From MIPS to exaflops in mere decades: compute power is exploding, and it will transform AI




At the latest NVIDIA GTC conference, the company unveiled what it described as the first single-rack system of servers capable of one exaflop, that is, a quintillion (a billion billion) floating-point operations (FLOPS) per second. The breakthrough is based on the latest GB200 NVL72 system, which packs NVIDIA's newest Blackwell graphics processing units (GPUs). A standard computer rack is about 6 feet tall, a little more than 3 feet deep and less than 2 feet wide.

Achieving an exaflop: from Frontier to Blackwell

A few things about the announcement struck me. First, the world's first exascale computer was installed at Oak Ridge National Laboratory only a few years ago, in 2022. For comparison, that "Frontier" supercomputer, built by HPE and powered by AMD GPUs and CPUs, originally consisted of 74 server racks. The new NVIDIA system achieves roughly 73x greater performance density in just three years, equivalent to more than quadrupling performance density every year. This progress reflects remarkable advances in computing density, energy efficiency and architectural design.
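As a back-of-the-envelope check (my own arithmetic, not NVIDIA's), the annualized rate implied by a 73x density gain over three years can be computed directly:

```python
# Annualized growth implied by a 73x performance-density gain over
# three years (figures from the article; timeline approximate:
# Frontier in 2022 vs. the single-rack GB200 NVL72 three years later).
density_gain = 73  # Frontier's 74 racks vs. NVIDIA's single rack
years = 3

# Compound annual growth factor: gain^(1/years)
annual_factor = density_gain ** (1 / years)
print(f"Implied growth: {annual_factor:.2f}x per year")  # ~4.18x per year
```

The compounding works out to slightly more than quadrupling each year, which is why a 73x gain can accumulate in only three of them.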

Second, while both systems hit the exascale milestone, they are built for different challenges: one is optimized for speed, the other for precision. NVIDIA's exaflop specification is based on lower-precision math, specifically 4-bit and 8-bit floating-point operations, which are considered optimal for AI workloads, including tasks such as training and running large language models (LLMs). These calculations prioritize speed over accuracy. In contrast, the exaflop rating for Frontier was achieved using 64-bit double-precision math, the gold standard for scientific simulations in which accuracy is critical.
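The trade-off is easy to see in code. The sketch below is my own illustration: Python's standard library can round-trip a value through IEEE 754 half precision (16-bit), and the FP8/FP4 formats Blackwell uses for its exaflop rating are coarser still:

```python
import struct

# Lower-precision floating point trades accuracy for speed and memory.
# Round-trip pi through 16-bit (half-precision) float using struct's
# 'e' format code and compare it to the 64-bit original.
value = 3.14159265358979  # pi, held in Python's 64-bit double

half = struct.unpack('<e', struct.pack('<e', value))[0]

print(f"float64: {value}")
print(f"float16: {half}")                 # 3.140625: only ~3 good digits
print(f"error:   {abs(value - half):.6f}")
```

Three significant digits are plenty for many neural-network weights and activations, which is why AI-rated FLOPS can be so much cheaper than the double-precision FLOPS a climate or fusion simulation demands.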

We've come a long way (very quickly)

This pace of progress seems almost unbelievable, especially when I recall the state of the art at the start of my career in the computing industry. My first professional job was as a programmer on the DEC KL 1090. That machine, part of DEC's PDP-10 series of timesharing mainframes, offered 1.8 million instructions per second (MIPS). Beyond its CPU performance, the machine connected to cathode-ray tube (CRT) displays via hardwired cables. There were no graphics capabilities, just light text on a dark background. And, of course, no internet. Remote users connected over telephone lines using modems running at speeds up to 1,200 bits per second.

DEC System 10. Source: Joe Mabel, CC BY-SA 3.0.

500 billion times more compute

Comparing MIPS to FLOPS gives a general sense of progress, but it is important to remember that these metrics measure different computing workloads. MIPS reflects integer processing speed, which is useful for general-purpose computing, especially in business applications. FLOPS measures floating-point performance, which is crucial for scientific workloads and for the heavy number-crunching behind modern AI, such as the matrix math and linear algebra used to train and run machine learning (ML) models.

While not a direct comparison, the sheer scale of the difference between MIPS then and FLOPS now offers a striking illustration of the rapid growth in computing power. Used as a rough heuristic to measure work performed, the new NVIDIA system is about 500 billion times more powerful than the DEC machine. A leap like that illustrates the exponential growth of computing power over a single professional career and raises the question: If this much progress is possible in 40 years, what might the next 5 bring?
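The 500-billion figure follows from a one-line division (again a rough heuristic, since instructions and floating-point operations are not the same unit of work):

```python
# Comparing the DEC KL 1090's integer throughput to NVIDIA's exaflop
# rack. Different metrics (instructions vs. floating-point operations),
# so this is illustrative only, as the article notes.
dec_kl1090_ips = 1.8e6    # 1.8 MIPS
nvidia_rack_flops = 1e18  # one exaflop

ratio = nvidia_rack_flops / dec_kl1090_ips
print(f"Ratio: {ratio:.2e}")  # ~5.56e+11, i.e. roughly 500 billion times
```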

For its part, NVIDIA has offered some clues. At GTC, the company announced a roadmap projecting that its next-generation system will deliver 14 to 15 exaflops of AI-optimized performance in the next year or two, up from the Blackwell Ultra rack shipping this year.

The efficiency gain is just as remarkable. Reaching this level of performance in a single rack means less physical space per unit of work, fewer materials and potentially lower energy use per operation, although the absolute power demands of these systems remain immense.

Does AI really need all that compute power?

While such performance gains are indeed impressive, the AI industry now faces a fundamental question: How much computing power is truly necessary, and at what cost? The race to build massive new AI data centers is being driven by the growing demands of exascale computing and ever-more-capable AI models.

The most ambitious effort is the $500 billion Project Stargate, which envisions 20 data centers across the United States, each spanning half a million square feet. A wave of other hyperscale projects is either underway or in planning stages around the world, as companies and countries race to ensure they have the infrastructure to support the AI workloads of tomorrow.

Some analysts now worry that we may be overbuilding AI data center capacity. Concern intensified after the release of R1, a reasoning model from China's DeepSeek that requires significantly less compute than many of its peers. Microsoft later canceled leases with several data center providers, sparking speculation that it was recalibrating its expectations for future AI infrastructure demand.

However, The Register suggested that this pullback may have more to do with some of the planned AI data centers lacking sufficiently robust capacity to support the power and cooling needs of next-generation AI systems. Already, AI models are pushing the limits of present infrastructure. MIT Technology Review reported that this may be why many data centers in China are struggling and failing, having been built to specifications that are not optimal for present needs, let alone those of the next few years.

AI inference requires more FLOPS

Reasoning models perform most of their work at runtime through a process known as inference. These models power some of the most advanced and resource-intensive applications today, including deep research assistants and the emerging wave of agentic AI systems.

While DeepSeek-R1 initially spooked the industry into thinking that future AI might require less computing power, NVIDIA CEO Jensen Huang pushed back hard. Speaking to CNBC, he countered this perception: "It was the exact opposite conclusion that everybody had." He added that reasoning AI consumes 100x more compute than non-reasoning AI.

If AI continues to evolve from reasoning models to autonomous agents and beyond, demand for compute will likely surge again. The next breakthroughs may come not just in language or vision, but in coordinating AI agents, in fusion simulations or in large-scale digital twins, each enabled by the kind of computing capacity we have just seen demonstrated.

Seemingly on cue, OpenAI just announced $40 billion in new funding, the largest private tech funding round on record. The company said in a blog post that the funding "enables us to push the frontiers of AI research even further, scale our compute infrastructure, and deliver increasingly powerful tools for the 500 million people who use ChatGPT every week."

Why is so much capital flowing into AI? The reasons range from competitiveness to national security. But one factor in particular stands out, as illustrated by a McKinsey headline: "AI could increase corporate profits by $4.4 trillion a year."

What's next? It is anyone's guess

At their core, information systems are about abstracting complexity, whether through an emergency vehicle routing system I once wrote in Fortran, a student achievement reporting tool or a modern AI system accelerating drug discovery. The goal has always been the same: to make more sense of the world.

Now, with powerful AI, we are crossing a threshold. For the first time, we may have the computing power and the intelligence to tackle problems that were once considered out of human reach.

New York Times columnist Kevin Roose recently captured this moment well: "Every week, I meet engineers and entrepreneurs working on AI who tell me that change, big change, world-shaking change, the kind of transformation we've never seen before, is just around the corner." And that doesn't even count the breakthroughs arriving weekly.

In just the last few days, we have seen OpenAI's GPT-4o generate nearly flawless images from text, Google release what may be the most advanced reasoning model yet in Gemini 2.5 Pro, and Runway unveil a video model with shot-to-shot character and scene consistency, something VentureBeat notes most AI video generators have struggled with until now.

What comes next is truly anyone's guess. We do not know whether powerful AI will bring breakthrough or breakdown, whether it will help solve fusion energy or unleash new biological risks. But with ever more FLOPS coming online over the next five years, one thing seems certain: innovation will come fast, and with force. It is also clear that as FLOPS scale, so must our conversations about responsibility, regulation and restraint.

Gary Grossman is EVP of technology practice at Edelman and global lead of the Edelman AI Center of Excellence.


