
MT-NLG

MT-NLG stands for Megatron-Turing Natural Language Generation. At the time of its announcement, it was the largest and most powerful monolithic transformer English language model, with 530 billion parameters. It was developed by NVIDIA in collaboration with Microsoft and is based on the Megatron-LM framework. MT-NLG improved upon the prior state-of-the-art models in zero-, one-, and few-shot settings and demonstrated unmatched accuracy across a broad set of natural language tasks, including completion prediction, reading comprehension, commonsense reasoning, natural language inference and word sense disambiguation. MT-NLG can also generate fluent and coherent text on a variety of topics and domains.

MT-NLG was trained on the NVIDIA DGX SuperPOD-based Selene supercomputer using novel parallelism techniques that enabled scaling to such a large model size. NVIDIA has announced an Early Access program for its managed API service to the MT-NLG model, which allows organizations to experiment with, employ and apply this large language model to downstream language tasks. NVIDIA also intends to collaborate with researchers on how to apply this technology in a responsible manner and how to detect, prevent and manage problems such as toxicity, bias and inappropriate responses, which are often inherent in such large language models.
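To give a rough sense of why a model of this size has to be split across many GPUs, the following back-of-the-envelope sketch estimates the memory needed just to hold 530 billion weights. The 2 bytes per parameter (half precision) and the 80 GB of memory per GPU are illustrative assumptions, not figures from this entry, and the estimate ignores optimizer state, gradients and activations, which add considerably more during training.

```python
import math

PARAMS = 530e9          # parameter count stated in this entry
BYTES_PER_PARAM = 2     # assumption: weights stored in 16-bit precision
GPU_MEMORY_GB = 80      # assumption: memory available on one accelerator


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus_for_weights(num_params: float, bytes_per_param: int,
                         gpu_memory_gb: float) -> int:
    """Smallest number of GPUs whose combined memory can hold the weights."""
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    total_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM)
    gpus = min_gpus_for_weights(PARAMS, BYTES_PER_PARAM, GPU_MEMORY_GB)
    print(f"Weights alone: {total_gb:.0f} GB, at least {gpus} GPUs")
```

Under these assumptions the weights alone occupy about 1,060 GB, more than a dozen times the memory of a single such GPU, which is why the model must be partitioned across devices rather than replicated on each one.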
