Machine learning, artificial intelligence, and robots have all contributed to the development of today’s healthcare system. Though, researchers around the world agree that improved computational tools are required throughout the healthcare ecosystem to facilitate discoveries in disease understanding, diagnostics, and treatment. In the middle of the ongoing data revolution, researchers in the life sciences urgently want a novel strategy for incorporating machine learning into biomedicine.
Recently, NVIDIA and the Broad Institute of MIT and Harvard announced a partnership to equip the Terra cloud platform with artificial intelligence and acceleration tools for rapidly analyzing massive amounts of healthcare data used by biomedical researchers in academia, startups, and large pharma companies.
The team believes this collaboration can bridge the gap between scientific insights and practical advantages for patients, leveraging the power of big language models to develop and implement joint solutions.
The work focuses on the following key areas:
- Adding support for NVIDIA ClaraTM Parabricks® to the Terra platform. Six new Terra procedures use Parabricks, a GPU-accelerated software suite for secondary sequencing data processing. With Clara Parabricks, users can now analyze a full genome in just over one hour instead of 24 hours in a CPU-based environment while reducing the computation cost by more than half.
- NVIDIA BioNeMo is artificial intelligence (AI) application framework that can build foundational models for DNA and RNA to better understand human biology.
- According to their recently published article, more than 100000 researchers used Genome Analysis Toolkit (GATK), a new deep learning model to better detect genetic variants related to diseases. This will aid those working in the field of drug discovery in their quest to create new therapeutics.
Previously used in speech recognition to prevent the prediction of meaningless (i.e., low-probability) word sequences, language models have come a far way and found widespread application in machine translation, natural language generation, Optical Character Recognition, handwriting recognition, information retrieval, and other areas of computational linguistics.
To facilitate training, inference, and scaling, NVIDIA’s BioNeMo architecture features pre-trained LLMs for proteins and chemistry. BioNeMo is a specialized version of the NVIDIA NeMo Megatron framework designed to analyze chemical, protein, and DNA/RNA sequences.
With BioNeMo, programmers can efficiently train and deploy biology LLMs with billions of parameters. In the future, the teams plan to extend this collaboration to develop new models for incorporation into the BioNeMo library and the Terra platform.
Broad Institute scientists will also have access to NVIDIA RAPIDSTM, a GPU-accelerated data science toolbox for speedier data preparation that can be utilized for genomic single-cell analysis, and MONAI, an open-source deep learning framework for medical imaging AI.
Researchers can now perform a variety of genomic data studies in less time and at a lesser cost using NVIDIA Parabricks GPU-accelerated workflows. Parabricks on GPUs can perform analysis for Broad’s GATK best practices germline workflow up to 24x faster and at a fraction of the cost.
Please Don't Forget To Join Our ML Subreddit
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology(IIT), Bhubaneswar. She is a Data Science enthusiast and has a keen interest in the scope of application of artificial intelligence in various fields. She is passionate about exploring the new advancements in technologies and their real-life application.