NVIDIA AI Introduces the Latest Version of its NeMo Megatron Large Language Model Framework

With the rising popularity of NLP models, particularly large language models (LLMs), NVIDIA announced an update to its NeMo Megatron framework that cuts training time by 30%. The update introduces two new techniques and a hyperparameter tool that optimize and scale training across any number of GPUs, making models easier and significantly faster to train and deploy. LLMs such as BLOOM and MT-NLG are already trained and deployed on NVIDIA’s AI platform. LLMs have become one of the world’s most important technologies, demanding vast investments of time and money, along with a well-planned architecture and deep technical expertise.

With significant contributions to real-time content generation, text generation, and customer-service chatbots, NVIDIA’s recent advancements are a step toward making LLMs more powerful and applicable in more scenarios. The latest improvements to the framework reduce training time by 30%, shortening training by roughly ten days. With NeMo Megatron, training a 175-billion-parameter model on 1,024 NVIDIA A100 GPUs now takes about 24 days instead of 34, saving roughly 250,000 hours of GPU compute.
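The quoted savings follow directly from the ten-day reduction; a quick back-of-the-envelope check using the GPU count and day figures from the announcement:

```python
# Back-of-the-envelope check of the GPU-compute savings.
# Figures from the announcement: 1,024 A100 GPUs, 34 -> 24 days.
gpus = 1024
days_before = 34
days_after = 24
hours_per_day = 24

saved_gpu_hours = gpus * (days_before - days_after) * hours_per_day
print(saved_gpu_hours)  # 245760 GPU-hours, i.e. roughly 250,000
```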

The two new techniques that speed up the NeMo Megatron framework are sequence parallelism (SP) and selective activation recomputation (SAR). Sequence parallelism extends tensor-level model parallelism by noticing that the transformer layers previously left unparallelized can be split along the sequence dimension. Selective activation recomputation improves on cases where memory constraints force some activations to be recomputed: instead of checkpointing and recomputing entire transformer layers, it checkpoints and recomputes only the parts of each layer that occupy considerable memory but are cheap to recompute. NeMo Megatron also introduces a hyperparameter tool that automatically finds optimal training and inference configurations, eliminating the time spent searching for an optimal design. The tool can select configurations for the highest model throughput or the lowest latency during inference.
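The hyperparameter tool’s search can be pictured as a sweep over parallelism configurations, scoring each candidate by its measured (or estimated) throughput. A minimal sketch in plain Python, where the configuration fields, the memory constraint, and the cost model are invented for illustration and are not NeMo Megatron’s actual API:

```python
from itertools import product

# Hypothetical cost model: estimated throughput (samples/sec) for a given
# (tensor-parallel, pipeline-parallel, micro-batch) configuration.
# A real tool would benchmark each candidate on actual hardware.
def estimated_throughput(tp, pp, micro_batch):
    gpus = 1024
    if gpus % (tp * pp) != 0:     # configuration must tile the GPUs evenly
        return 0.0
    if tp * pp < 8:               # made-up memory constraint: shards must fit
        return 0.0
    comm_overhead = 1.0 + 0.05 * tp + 0.02 * pp  # made-up penalty terms
    return gpus * micro_batch / comm_overhead

# Sweep a (small, illustrative) search space and keep the best candidate.
candidates = product([1, 2, 4, 8], [1, 2, 4], [1, 2, 4])
best = max(candidates, key=lambda c: estimated_throughput(*c))
tp, pp, mb = best
print(f"best config: tensor-parallel={tp}, pipeline-parallel={pp}, micro-batch={mb}")
```

Swapping the objective to an estimated latency (and taking `min` instead of `max`) would give the latency-optimized variant the article mentions.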

In conclusion, the new updates to NeMo Megatron will significantly speed up future training runs, delivering much faster results and opening avenues that were not available before. With the introduction of SP, SAR, and the hyperparameter tool, it will be fascinating to see what other applications the framework enables and how these techniques can be applied beyond NeMo Megatron.


  • https://developer.nvidia.com/blog/nvidia-ai-platform-delivers-big-gains-for-large-language-models/