Large language models to increasingly become a platform

Advanced hardware customers are increasingly using GPUs and network cards to build "modern AI factories with data as the raw material input and intelligence as the output" -- that's according to executives at Nvidia this week, with CEO Jensen Huang suggesting that "in the future, you're going to see large language models essentially becoming a platform themselves that [will] be running 24/7, hosting a whole bunch of applications."

The semiconductor firm reported record earnings late Wednesday. Nvidia has long been a stalwart of the gaming industry, in which high-powered graphics hardware is critical for obvious reasons, but the data centre has now "become our largest market platform", CFO Colette Kress said, as company revenues hit $8.3 billion for the quarter.

One of the biggest workloads driving this data centre and cloud demand is natural language processing, which has been revolutionized by "transformer based models", Kress added. She pointed to recent industry breakthroughs traced to transformers, like GPT-3 (OpenAI's model which, given any text prompt like a phrase or a sentence, returns a text completion in natural language), Nvidia's own Megatron BERT for drug discovery, and DeepMind's AlphaFold, an AI system that predicts a protein’s 3D structure from its amino acid sequence.

Transformers are AI models that learn context and "meaning" by tracking relationships in sequential data like the words in a sentence; this allows for self-supervised learning without the need for human-labelled data. The architecture was pioneered at Google, whose influential 2017 paper on it notes pithily that transformers are capable of directly modelling "relationships between all words in a sentence, regardless of their respective position. [e.g. in the sentence] 'I arrived at the bank after crossing the river', to determine that the word 'bank' refers to the shore of a river and not a financial institution, the Transformer can learn to immediately attend to the word 'river'..."
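For readers who want to see the idea in miniature, the toy Python sketch below shows how "attending" to every word regardless of position can work. It is a simplified illustration, not Google's or Nvidia's implementation: the two-number word vectors are hypothetical, hand-picked values, and the learned query, key and value projections and multiple attention heads that real transformers use are left out.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Scaled dot-product self-attention with no learned projections.

    Each word's vector is compared with every word's vector, so how far
    apart two words sit in the sentence plays no role in the weights.
    """
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted blend of all the word vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs

# Hypothetical 2-d embeddings, chosen by hand for illustration only.
tokens = ["bank", "after", "crossing", "river"]
embeddings = [
    [1.0, 0.2],  # bank
    [0.0, 0.1],  # after
    [0.3, 0.0],  # crossing
    [0.9, 0.4],  # river
]

weights, outputs = self_attention(embeddings)
bank_weights = dict(zip(tokens, weights[0]))
for word, weight in bank_weights.items():
    print(f"bank -> {word}: {weight:.2f}")
```

With these made-up vectors, "bank" puts more weight on "river" than on "after" or "crossing", even though "river" is the furthest word away. In a trained model the vectors and projections are learned from data rather than set by hand.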

Large language models need serious computational firepower...

This allows for huge potency around text generation, translation, summarisation and question answering. To Nvidia CEO Huang, customers are going to see more and more large language models offered as a service, much as GPT-3 now has tens of thousands of developers around the globe building applications on it directly via an API. These kinds of colossal models need a tonne of compute behind them, and that's a major play for firms like Nvidia. Huang was keen on the earnings call to point to the company's "next-generation" data centre GPU, the "H100", which features a monstrous 80 billion transistors: "H100 is ideal for advancing large language models and deep recommender systems, the two largest scale AI workloads today.

"We are working with leading server makers and hyperscale customers to qualify and ramp H100," he added, with the company's first ever data centre CPU, "Grace", also due to launch early next year (2023).

As The Stack recently noted, chip firms are all making large pushes into supportive SaaS offerings that enhance what their hardware can do, and Nvidia is no exception. Referring back to the power of transformers, Huang waxed lyrical about the potential, noting: "One of my favorites is using Transformers to understand the language of chemistry or using transformers and using AI models to understand the language of proteins, amino acids, which is genomics. To apply AI to understand -- to recognize the patterns, to understand the sequence and essentially understand the language of chemistry and biology is a really, really important breakthrough... all of these different models need an engine to run on. And that engine is called NVIDIA AI." [Other engines are available!]

He added: "In the case of hyperscalers, they can cobble together a lot of open source and we provide a lot of our source to them and a lot of our engines to them for them to operate their AI. But for enterprises, they need someone to package it together and be able to support it and refresh it, update it for new architecture, support old architectures in their installed base, etc. and all the different use cases that they have..."

Markets were unimpressed. Shares fell nearly 10% after the chipmaker said it expects second-quarter sales to be $8.1 billion, below the $8.44 billion analysts were expecting. Covid lockdowns in China and Russia's invasion of Ukraine were among the obvious macroeconomic and geopolitical headwinds most companies are facing, but Nvidia emphasised that its networking business was, partly as a result of this turmoil, "highly supply constrained."

Huang told analysts: "Our demand is really, really high.

"And it requires a lot of components aside from just our chips. Components and transceivers and connectors and cables. And just -- it's a really -- it's a complicated system, the network, and there are many physical components. And so the supply chain has been problematic. We're doing our best and our supply has been increasing... We're really grateful for the support from the component industry around us, and we'll be able to increase that."
