Nvidia CEO Jensen Huang speaks during a press conference at The MGM during CES 2018 in Las Vegas on January 7, 2018.
Mandel Ngan | AFP | Getty Images
Software that can write passages of text or draw images that look like they were created by a human has sparked a gold rush in the tech industry.
Companies like Microsoft and Google are fighting to integrate cutting-edge AI into their search engines, while billion-dollar competitors like OpenAI and Stable Diffusion are racing to release their software to the public.
Many of these applications are powered by a roughly $10,000 chip that has become one of the most important tools in the artificial intelligence industry: the Nvidia A100.
The A100 has become the “workhorse” for artificial intelligence professionals right now, said Nathan Benaich, an investor who publishes a newsletter and report on the AI industry, including a partial list of supercomputers using the A100. Nvidia takes 95% of the market for GPUs that can be used for machine learning, according to New Street Research.
The A100 is ideal for the kind of machine learning models that support tools like ChatGPT, Bing AI or Stable Diffusion. It is able to perform many simple calculations simultaneously, which is important for training and using neural network models.
The technology behind the A100 was originally used to render demanding 3D graphics in games. It’s often referred to as a graphics processor or GPU, but these days Nvidia’s A100 is configured and geared for machine learning tasks and running in data centers, not glowing gaming PCs.
Large companies or startups working on software like chatbots and image generators need hundreds or thousands of Nvidia chips and either buy them themselves or secure access to the computers from a cloud provider.
Hundreds of GPUs are required to train artificial intelligence models, such as large language models. The chips must be powerful enough to process terabytes of data quickly in order to recognize patterns. After that, GPUs like the A100 are also needed for “inference,” or using the model to generate text, make predictions, or identify objects in photos.
That means AI companies need access to many A100s. Some entrepreneurs in the industry even see the number of A100s they have access to as a sign of progress.
“A year ago we had 32 A100s,” Emad Mostaque, CEO of Stability AI, wrote on Twitter in January. “Dream big and stack more GPUs, kids. Brrr.” Stability AI, which is reportedly worth over $1 billion, is the company that helped create Stable Diffusion, an image generator that garnered attention last fall.
Now Stability AI has access to over 5,400 A100 GPUs, according to an estimate from the State of AI report, which tracks which companies and universities have the largest collections of A100 GPUs. The tally does not include cloud providers, which do not publish their numbers publicly.
Nvidia is driving the AI train
Nvidia stands to benefit from the AI hype cycle. Although overall sales fell 21%, investors pushed the stock about 14% higher on Thursday, largely because the company’s AI chip business, which it reports as data centers, rose 11% to more than $3.6 billion in sales during the quarter, showing continued growth.
Nvidia stock is up 65% so far in 2023, outperforming the S&P 500 and other semiconductor stocks alike.
Nvidia CEO Jensen Huang couldn’t stop talking about AI during a call with analysts on Wednesday, suggesting that the recent boom in artificial intelligence is at the heart of the company’s strategy.
“The activity around the AI infrastructure that we’ve built and the activity around inferring with Hopper and Ampere to affect big language models has just gone through the roof in the last 60 days,” Huang said. “There is no question that our views of this year at the beginning of the year have changed quite dramatically as a result of the last 60, 90 days.”
Ampere is Nvidia’s codename for the A100 generation of chips. Hopper is the codename for the new generation, including the H100, which recently began shipping.
More computers needed
Nvidia A100 processor
Compared to other kinds of software, such as serving a web page, which uses processing power occasionally in microsecond bursts, machine learning tasks can take up all of a computer’s processing power, sometimes for hours or days.
This means that companies that find themselves with a successful AI product often need to purchase more GPUs to handle peak times or to improve their models.
These GPUs don’t come cheap. In addition to a single A100 on a card that plugs into an existing server, many data centers use a system that includes eight A100 GPUs working together.
This system, Nvidia’s DGX A100, has a suggested retail price of almost $200,000, although it comes with the necessary chips. On Wednesday, Nvidia said it will sell cloud access directly to DGX systems, which will likely lower entry costs for tinkerers and researchers.
It’s easy to see how the cost of the A100 can add up.
For example, an estimate by New Street Research found that the OpenAI-based ChatGPT model in Bing Search could require 8 GPUs to provide an answer to a question in under a second.
At that rate, Microsoft would need over 20,000 8-GPU servers just to make the model available to everyone on Bing, suggesting the Microsoft feature could cost $4 billion in infrastructure spending.
“If you come from Microsoft and you want to scale that at the scale of Bing, that’s maybe $4 billion. If you want to scale at the scale of Google, which serves 8 or 9 billion queries every day, you actually need to spend $80 billion on DGXs,” said Antoine Chkaiban, technology analyst at New Street Research. “The numbers we came up with are huge. But they simply reflect the fact that every single user that moves to such a large language model requires a huge supercomputer while using it.”
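The arithmetic behind those figures can be reproduced in a few lines. The sketch below is only a back-of-the-envelope check, not New Street Research's actual model: it assumes each 8-GPU server costs a flat $200,000 (the article says "almost $200,000" for a DGX A100), and the function and variable names are illustrative.

```python
# Back-of-the-envelope check of the infrastructure estimates quoted above.
# Assumption: every 8-GPU server costs a flat $200,000 (the DGX A100's
# suggested price is "almost $200,000", so this rounds up slightly).

SERVERS_FOR_BING = 20_000       # 8-GPU servers, per the New Street estimate
GPUS_PER_SERVER = 8
PRICE_PER_SERVER_USD = 200_000  # assumed flat price per server


def infrastructure_cost(servers: int, price_per_server: int = PRICE_PER_SERVER_USD) -> int:
    """Total hardware cost in dollars for a given number of servers."""
    return servers * price_per_server


def servers_for_budget(budget_usd: int, price_per_server: int = PRICE_PER_SERVER_USD) -> int:
    """Number of whole servers a given budget buys at the assumed price."""
    return budget_usd // price_per_server


bing_cost = infrastructure_cost(SERVERS_FOR_BING)
bing_gpus = SERVERS_FOR_BING * GPUS_PER_SERVER
google_servers = servers_for_budget(80_000_000_000)

print(f"Bing-scale cost: ${bing_cost:,} for {bing_gpus:,} GPUs")
print(f"Servers implied by an $80 billion budget: {google_servers:,}")
```

Under that pricing assumption, 20,000 servers works out to $4 billion and 160,000 GPUs, and the $80 billion figure for Google-scale traffic implies roughly 400,000 such servers, or 20 times the Bing estimate.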
The latest version of Stable Diffusion, an image generator, was trained on 256 A100 GPUs, or 32 machines each with 8 A100s, according to information published online by Stability AI, totaling 200,000 compute hours.
At market prices, the model cost $600,000 just to train, Stability AI CEO Mostaque said on Twitter, suggesting in a tweet exchange that the price was unusually cheap compared with competitors. That figure does not include the cost of “inference,” or deploying the model.
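Those training numbers imply a per-hour price and a rough training duration. The sketch below derives both, assuming that "compute hours" means GPU-hours spread evenly across all 256 chips; neither derived figure comes from Stability AI, and the variable names are illustrative.

```python
# Derive the implied hourly price and wall-clock time from the figures above.
# Assumption: "200,000 compute hours" means GPU-hours, split evenly across
# all 256 GPUs running in parallel for the whole training run.

GPUS = 256
MACHINES = 32
GPUS_PER_MACHINE = 8
GPU_HOURS = 200_000
TRAINING_COST_USD = 600_000

# Sanity check: 32 machines with 8 GPUs each is the 256 GPUs quoted.
assert MACHINES * GPUS_PER_MACHINE == GPUS

cost_per_gpu_hour = TRAINING_COST_USD / GPU_HOURS  # dollars per GPU-hour
wall_clock_hours = GPU_HOURS / GPUS                # hours if all GPUs run at once
wall_clock_days = wall_clock_hours / 24

print(f"Implied price: ${cost_per_gpu_hour:.2f} per GPU-hour")
print(f"Implied duration: {wall_clock_hours:.2f} hours (~{wall_clock_days:.1f} days)")
```

On those assumptions, the quoted cost works out to about $3 per A100-hour, and the run would have occupied the full cluster for roughly a month.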
Nvidia CEO Huang said in an interview with CNBC’s Katie Tarasov that the company’s products are actually inexpensive for the amount of computation these types of models require.
“We took a $1 billion data center running CPUs and we shrank it down to a $100 million data center,” Huang said. “Well, $100 million if you put that in the cloud and shared by 100 companies is almost nothing.”
Huang said Nvidia’s GPUs allow startups to train models at a much lower cost than using a traditional computer processor.
“Now you could build something like a big language model like a GPT for about $10-20 million,” Huang said. “It’s really, really affordable.”
Nvidia isn’t the only company making GPUs for artificial intelligence applications. AMD and Intel have competing GPUs, and big cloud companies like Google and Amazon are developing and deploying their own chips designed specifically for AI workloads.
Still, “AI hardware remains highly consolidated at NVIDIA,” according to the State of AI Compute report. As of December, more than 21,000 open-source AI papers claimed to use Nvidia chips.
Most researchers included in the State of AI Compute Index used the V100, an Nvidia chip released in 2017, but the A100 grew rapidly in 2022 to become the third most widely used Nvidia chip, right behind a consumer graphics chip costing $1,500 or less that was originally intended for gaming.
The A100 is also notable for being one of the few chips to have export controls imposed for national defense reasons. Last fall, Nvidia said in an SEC filing that the US government imposed a licensing requirement that would ban the A100 and H100 from being exported to China, Hong Kong and Russia.
“The USG indicated that the new license requirement will address the risk that the covered products may be used in or diverted to a ‘military end-use’ or ‘military end-user’ in China and Russia,” Nvidia said in its filing. Nvidia previously said it modified some of its chips for the Chinese market to comply with US export restrictions.
The toughest competition for the A100 could be its successor. The A100 was first introduced in 2020, ages ago in chip cycles. The H100, introduced in 2022, is starting to be produced in volume. In fact, Nvidia recorded more in H100 chip sales than in A100 sales in the quarter ended in January, it said on Wednesday, even though the H100 is more expensive per unit.
The H100, Nvidia says, is the first of its data center GPUs to be optimized for transformers, an increasingly important technique that many of the latest and greatest AI applications use. Nvidia said on Wednesday that it wants to make AI training more than 1 million percent faster. That could mean that at some point, AI companies wouldn’t need as many Nvidia chips.