AI processor wars burning hot and bright
Scott Foster
Apple is the latest in a growing list of challengers to Nvidia, the world’s leading designer of artificial intelligence (AI) processors, but China is the only nation that competes with the US in the technology. With market forces short-circuited by US-led technology bans and sanctions, China competes by necessity.
Apple’s share price jumped more than 7% to a new all-time high on June 11, following CEO Tim Cook’s presentation of the company’s AI strategy at its Worldwide Developers Conference the day before.
Nvidia’s share price was down 0.7% on June 11, reminding investors and others that while the company’s sales and profits will probably continue to grow, its extraordinarily high market share and stock market valuation are both likely to slip in the future.
Nvidia faces a growing list of global competitors who seek a share of the market and customers who would rather not deal with a monopoly supplier. The situation is different but less favorable in China, where US government sanctions have undercut Nvidia’s ability to compete with Huawei and other local AI chip designers.
Apple is starting by integrating OpenAI’s ChatGPT with an upgraded version of its Siri virtual assistant. After that, it will enable users to create their own emoji-style digital icons, called Genmoji, to suit their “vibe,” as San Jose’s Mercury News puts it.
“Users will also be able to create personalized photos,” the article continues, “such as taking a picture of your mom and making it into a stylized, cartoon-y version, adding a superhero cape.” Other “Apple Intelligence” services will follow. “It is the next big step for Apple,” said Cook.
This should make the new iPhones, iPads and Macs more competitive, but it is a far cry from Nvidia’s current top-of-the-line Hopper, recently announced Blackwell and next-generation Rubin AI processors, which are or will be used to create large language models and digital twins of complicated industrial machinery and workflows.
Nvidia currently holds 80% or more of the AI processor market, according to analyst estimates. AMD (another American integrated circuit design company), Intel and many other competitors, including Google, Amazon Web Services, IBM and AI ventures SambaNova, Cerebras and Groq, are also positioning for a share of the market.
Barron’s reports that Microsoft, Meta and Oracle purchase 15% to 25% of their AI processors from AMD, and most of the rest from Nvidia. AMD’s Instinct MI300 AI accelerator offers a viable alternative to Nvidia’s H100 GPU. Upgrades to both devices are on the way.
In April, Intel released its Gaudi 3 AI accelerator, which it claims delivers “50% on average better inference and 40% on average better power efficiency than Nvidia H100 – at a fraction of the cost.”
Targeting the enterprise market for generative AI, Gaudi 3 has been made available to computer makers Dell, HP, Supermicro, Lenovo and other customers including Bosch, IBM and Indian telecom services company Bharti Airtel.
Intel has also announced it will create an open platform for enterprise AI with SAP, RedHat, VMware and other software companies to accelerate the deployment of secure generative AI systems.
More seriously for Nvidia, Intel, Qualcomm, Google Cloud, Arm, Samsung and other companies have formed the Unified Acceleration Foundation (UXL) to develop an open-source, open-standard AI accelerator software ecosystem as an alternative to Nvidia’s currently dominant proprietary Compute Unified Device Architecture (CUDA) computing platform.
UXL states that “anyone can join” and China’s Xiangdixian Computing Technology is also a member. This puts it in the same category as the RISC-V open-standard IC design architecture: an opportunity for China but a potential target of US politicians.
Nvidia customers Apple, Meta and Microsoft Azure are also getting into the act: Apple with its M4 SoC (System-on-Chip) which powers the new iPad Pro; Meta with its MTIA (Meta Training and Inference Accelerator) which is now in its second iteration; and Microsoft Azure with its Maia 100 AI Accelerator. Google and Amazon are also among the biggest users of Nvidia processors.
In China, AI processors are designed by tech giants Alibaba, Baidu, Huawei and Tencent, and smaller specialists including Bitmain, Cambricon, Enflame, Inspur, MetaX and Xiangdixian Computing Technology. Aside from a relative lack of experience, their main problem is that their advanced designs cannot be turned into chips by TSMC or other non-Chinese foundries because of US sanctions.
There are more than 40 semiconductor foundries in China but even SMIC, the largest and most technologically sophisticated, does not have access to EUV lithography equipment and therefore cannot produce large volumes of chips at process nodes smaller than 7nm.
This is also true for Huawei, which is developing its own internal semiconductor production capability. Outside China, TSMC, Samsung and Intel are moving from 5nm to 3nm and soon 2nm.
But sanctions cut both ways. The US government has banned the sale of Nvidia’s H100 and other advanced AI processors to Chinese customers, forcing them to depend on the dumbed-down H20.
The restrictions imposed by the US Commerce Department are so severe that Huawei’s Ascend 910B AI processor has been taking market share away from Nvidia based on a combination of performance, price and fears that sanctions might be tightened further.
Those fears are now being realized as the Biden administration is reportedly planning to restrict the provision of gate-all-around transistor architecture and high-bandwidth memory to China.
Both technologies are key to the fabrication of the most advanced AI processors. Alibaba, Baidu and Tencent used Nvidia processors before sanctions were imposed; now they are customers of Huawei. Last February, Nvidia named Huawei as one of its top competitors.
In an ironic twist, Enflame and MetaX have reportedly created dumbed-down versions of their own processors that meet US requirements so that they can be made by TSMC. But the Chinese are putting most of their effort into making the best use of the foreign equipment to which they do have access and into developing their own equipment industry.
For now, to compensate for their lack of EUV lithography equipment, Huawei and SMIC are using what they call self-aligned quadruple patterning to produce 5nm and perhaps even 3nm chips.
Huawei has also developed its own AI computing platform. Known as Da Vinci, it is not as advanced and has a much smaller user base than Nvidia’s CUDA, but five or six years ago it was only a concept. The same is true of China’s entire AI industry.
On the large language model front, Dylan Patel of SemiAnalysis wrote in May that China’s open-source DeepSeek generative AI model is not only much cheaper than Meta’s newest Llama 3 series model but also better. “Even more interesting,” he added, “is the novel architecture DeepSeek has brought to market. They did not copy what Western firms did. There are brand new innovations.”
Andrew Carr, chief scientist at US generative animation venture Cartwheel, told the Financial Times that DeepSeek comes close to Llama 3 and costs a fraction of OpenAI’s GPT-4.
The Text and Image GEnerative Research (TIGER) lab at the University of Waterloo in Ontario ranks DeepSeek-V2 seventh out of ten large language models with an overall score of 54.8%. OpenAI’s GPT-4o ranks first at 72.6%. Yi-Large from China’s 01.AI scores 57.5% and Alibaba’s Qwen1.5-72B 52.6%. TIGER Lab’s own MAmmoTH2 ranks ninth at 50.4%.
Kai-Fu Lee, the CEO of 01.AI, is an AI specialist who earned his PhD at Carnegie Mellon in the United States. Born in Taiwan, he worked at Apple and Silicon Graphics before moving to Beijing to lead Microsoft Research Asia and Google China between 1998 and 2009. After that, he established the Sinovation venture capital firm.
Lee founded 01.AI in 2023 to build large language models in both Chinese and English. The Large Model Systems Organization “Chinese Ranking” dated May 21, 2024, shows Yi-Large running a close second to the most recent version of OpenAI’s GPT-4o.
The “Overall Ranking” places it seventh out of 15 models, behind three versions of GPT-4o, Google’s Gemini 1.5 Pro, Anthropic’s Claude 3 Opus and the top version of GPT-4.
So far, the training of Chinese large language models has been heavily dependent on Nvidia AI accelerators. But the use of locally made processors and supercomputers is increasing as the quality of the Chinese models improves.
https://asiatimes.com/2024/06/ai-processor-wars-burning-hot-and-bright/