Nvidia, known for chips that accelerate AI workloads, is providing access to large language models.
What’s new: Nvidia announced early access to NeMo LLM and BioNeMo, cloud-computing services that enable developers to generate text and biological sequences respectively. Both include prompt-learning methods that tune a model’s inputs, rather than the model itself, so models trained on web data can work well with a particular user’s data and task without fine-tuning. Users can run a variety of models in the cloud, on-premises, or via an API.
How it works: The new services are based on Nvidia’s pre-existing NeMo toolkit for speech recognition, text-to-speech, and natural language processing.
- NeMo LLM provides access to large language models including Megatron 530B, T5, and GPT-3. Users can apply two methods of so-called prompt learning to improve performance.
- The prompt-learning method called p-tuning enlists an LSTM to map input tokens to representations that elicit better performance from a given model. The LSTM learns this mapping via supervised training on a small number of user-supplied examples (see the first sketch after this list).
- A second prompt-learning approach, prompt tuning, appends a learned representation of a task to the end of the tokens before feeding them to the model. The representation is learned via supervised training on a small number of user-supplied examples (see the second sketch after this list).
- BioNeMo enables users to harness large language models for drug discovery. BioNeMo includes pretrained models such as the molecular-structure model MegaMolBART, the protein language model ESM-1, and the protein-folding model OpenFold.
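To make p-tuning concrete, here is a minimal PyTorch sketch. It assumes a frozen pretrained model that accepts input embeddings directly (as many transformer implementations do); the class and variable names, the use of trainable seed vectors as LSTM input, and the toy dimensions are illustrative assumptions, not Nvidia’s implementation.

```python
# Minimal p-tuning sketch in PyTorch (illustrative, not Nvidia's code).
# A small LSTM "prompt encoder" produces embeddings for a few virtual
# tokens; only the encoder trains, while the language model stays frozen.
import torch
import torch.nn as nn

class PromptEncoder(nn.Module):
    """Maps trainable seed vectors to virtual-token embeddings via an LSTM."""
    def __init__(self, num_virtual_tokens: int, embed_dim: int):
        super().__init__()
        # Learned seed inputs to the LSTM, one per virtual token (assumption:
        # variants instead run the LSTM over embeddings of real prompt tokens).
        self.seeds = nn.Parameter(torch.randn(num_virtual_tokens, embed_dim))
        self.lstm = nn.LSTM(embed_dim, embed_dim,
                            batch_first=True, bidirectional=True)
        self.mlp = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, batch_size: int) -> torch.Tensor:
        seeds = self.seeds.unsqueeze(0).expand(batch_size, -1, -1).contiguous()
        hidden, _ = self.lstm(seeds)   # (batch, virtual_tokens, 2 * embed_dim)
        return self.mlp(hidden)        # (batch, virtual_tokens, embed_dim)

# Stand-ins for the frozen pretrained model: an embedding table plus a model
# body that accepts embeddings directly. Toy sizes throughout.
vocab_size, embed_dim, num_virtual = 1000, 64, 8
frozen_embeddings = nn.Embedding(vocab_size, embed_dim)
frozen_embeddings.weight.requires_grad = False

encoder = PromptEncoder(num_virtual, embed_dim)
token_ids = torch.randint(0, vocab_size, (4, 16))    # a toy batch of prompts
virtual_tokens = encoder(batch_size=4)               # (4, 8, 64)
inputs_embeds = torch.cat(
    [virtual_tokens, frozen_embeddings(token_ids)], dim=1)
# `inputs_embeds` feeds the frozen model; during supervised training on the
# user's examples, gradients flow only into the prompt encoder.
```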
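Prompt tuning is simpler still: the only trainable parameters are the task-representation embeddings themselves. The sketch below appends them to the token embeddings, following the description above; again, every name and dimension is a hypothetical stand-in rather than the service’s actual API.

```python
# Minimal prompt-tuning sketch in PyTorch (illustrative). A task is captured
# by a small matrix of trainable embeddings appended to the input, per the
# description above; the pretrained model itself stays frozen.
import torch
import torch.nn as nn

vocab_size, embed_dim, num_task_tokens = 1000, 64, 8

# Frozen embedding table standing in for the pretrained model's embeddings.
frozen_embeddings = nn.Embedding(vocab_size, embed_dim)
frozen_embeddings.weight.requires_grad = False

# The only trainable parameters: one embedding per task token.
task_prompt = nn.Parameter(torch.randn(num_task_tokens, embed_dim) * 0.02)

token_ids = torch.randint(0, vocab_size, (4, 16))         # a toy batch
token_embeds = frozen_embeddings(token_ids)               # (4, 16, 64)
prompt = task_prompt.unsqueeze(0).expand(4, -1, -1)       # (4, 8, 64)
inputs_embeds = torch.cat([token_embeds, prompt], dim=1)  # task repr. appended

# An optimizer over `task_prompt` alone learns the task representation from
# a handful of user-supplied labeled examples.
optimizer = torch.optim.Adam([task_prompt], lr=1e-3)
```

In both sketches, the frozen model’s weights receive no gradient updates, which is what lets a single pretrained model serve many users’ tasks cheaply.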
Behind the news: Nvidia’s focus on prompt learning and biological applications differentiates it from other companies that provide large language models as a service.
- Hugging Face’s Accelerated Inference API allows users to run more than 20,000 transformer-based models.
- NLP Cloud allows users to fine-tune and deploy open-source language models including EleutherAI’s GPT-J and GPT-NeoX 20B.
- In December 2021, OpenAI enabled customers to fine-tune its large language model, GPT-3.
Why it matters: Until recently, large language models were the province of organizations with the vast computational resources required to train and deploy them. Cloud services make these models available to a wide range of startups and researchers, dramatically increasing their potential to drive new developments and discoveries.
We’re thinking: These services will take advantage of Nvidia’s H100 GPUs, developed specifically to process transformer models. Nvidia CEO Jensen Huang recently said the public should no longer expect chip prices to fall over time. If that’s true, AI as a service could become the only option for many individuals and organizations that aim to use cutting-edge AI.