Google Health’s New AI Study Advances the Identification of Genetic Diseases Using Machine Learning

Genetic diseases can cause critical illness, and in many cases early detection of the underlying problem allows for life-saving treatment. Recent research shows that nearly 6% of all infants are affected by genetic or congenital disorders, yet clinical sequencing tests can take days or weeks to diagnose these diseases.

Genome sequencing has the potential to improve the understanding, diagnosis, and treatment of disease. To that end, Google Health has made significant breakthroughs in rapidly identifying genetic diseases and in making genomic tests more equitable across ancestries.

To begin, the Google Health team collaborated with the University of California Santa Cruz Genomics Institute to develop PEPPER-Margin-DeepVariant, a method for analyzing data from Oxford Nanopore sequencers, one of the fastest commercial sequencing technologies available today.

The findings show that, in the quickest cases, this method detected a likely disease-causing variant less than 8 hours after sequencing began, compared with 13.5 hours for the previous fastest time.

The team believes that new sequencing instruments can lead to dramatic breakthroughs in the field, and that machine learning (ML) can help these devices reach their full potential. Their recent study, conducted in collaboration with Pacific Biosciences (PacBio), a developer of genomic sequencing platforms, demonstrates that researchers can now use Google’s machine learning and algorithm development tools to extract more information from sequencing data.

PacBio’s long-read HiFi sequencing offers a broad view of genomes, transcriptomes, and epigenomes. Using PacBio’s technology combined with DeepVariant, researchers have reliably identified diseases that are difficult to diagnose with alternative approaches.

They have also open-sourced DeepConsensus, an approach that produces more accurate reads when used with PacBio’s sequencing systems. This increase in accuracy will enable researchers to use PacBio’s technology to solve a broader range of problems, including the completion of the human genome and the assembly of the genomes of all vertebrate species.

The genomics sector, like other disciplines of health and medicine, is wrestling with health equity challenges that, if not addressed, can lead to the exclusion of some communities. For example, the genomics resources that scientists and clinicians use to detect and filter genetic variants, and to evaluate the importance of those variants, are not equally effective for people of different ancestries.

The researchers teamed up with 23andMe to build an enhanced resource for people of African heritage, and collaborated with the UCSC Genomics Institute to develop pangenome methods, with the aim of improving genomics resources for under-represented populations.

Furthermore, they have introduced two open-source tools that improve genetic discovery by identifying disease labels more precisely and by making better use of health metrics in genetic association research. The team hopes their technology will benefit everyone’s health and understanding of biology.


Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advancements in technology and their real-life applications.