Researchers at Michigan State University Developed ‘DANCE,’ a Python Library to Support Deep Learning Models for Analyzing Single-Cell Gene Expression at Scale

From single-modality profiling (RNA, protein, and open chromatin) to multimodal profiling and spatial transcriptomics, the technology for analyzing single cells has advanced rapidly in recent years. This rapid expansion has prompted a proliferation of computational approaches, especially those based on machine learning.

The researchers state that the diversity and complexity of current approaches make it challenging to replicate the results shown in the original articles. Hyperparameter tuning, incompatibilities between programming languages, and the lack of a publicly available codebase all pose significant obstacles. Because most existing works report performance on only a few datasets and compare against too few competing methods, a systematic benchmarking procedure is needed to evaluate methods comprehensively.

As part of a recent study, researchers from Michigan State University, University of Washington, Zhejiang University of Technology, Stanford University, and Johnson & Johnson introduced DANCE, a deep learning library and benchmark designed to accelerate advancements in single-cell analysis.

DANCE offers a comprehensive set of tools for analyzing single-cell data at scale, allowing developers to create their own deep learning models with greater ease and efficiency. It can also be used as a benchmark for comparing the performance of different computational models for single-cell analysis. DANCE presently supports 3 modules, 8 tasks, 32 models, and 21 datasets.

Currently, DANCE offers:

  1. Single-modality analysis
  2. Multimodality analysis
  3. Spatial transcriptomics analysis

Autoencoders and graph neural networks (GNNs), two widely used deep learning architectures, are supported throughout the library. According to the paper, DANCE is the first comprehensive benchmark platform for single-cell analysis.

The work has several notable components. The researchers first compiled task-specific standard benchmark datasets and made them accessible by changing a single parameter. They implemented classical and deep learning baselines for every task, then tuned those baselines on all of the collected benchmark datasets until they matched or exceeded the results reported in the original studies. To reproduce the stated performance of the tuned models, end users only need to run a single command line in which all hyperparameters have been set in advance.
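
The idea of a single command line with preset hyperparameters can be illustrated with a small sketch. This is not DANCE's actual command-line interface: the dataset names, flag names, and values below are hypothetical placeholders. It only shows how tuned settings might be stored as per-dataset defaults that a user can still override.

```python
import argparse

# Hypothetical tuned defaults per dataset. The names and values are
# placeholders for illustration, not settings taken from DANCE.
TUNED_DEFAULTS = {
    "dataset_a": {"lr": 0.001, "hidden_dim": 128, "epochs": 50},
    "dataset_b": {"lr": 0.01, "hidden_dim": 64, "epochs": 100},
}


def build_parser():
    parser = argparse.ArgumentParser(
        description="Run a baseline with stored hyperparameters.")
    parser.add_argument("--dataset", choices=sorted(TUNED_DEFAULTS),
                        default="dataset_a")
    # A default of None means "use the tuned value for the chosen dataset".
    parser.add_argument("--lr", type=float, default=None)
    parser.add_argument("--hidden_dim", type=int, default=None)
    parser.add_argument("--epochs", type=int, default=None)
    return parser


def resolve_config(argv):
    """Merge the tuned defaults with any explicit command-line overrides."""
    args = build_parser().parse_args(argv)
    config = dict(TUNED_DEFAULTS[args.dataset])
    for key in list(config):
        override = getattr(args, key)
        if override is not None:
            config[key] = override
    config["dataset"] = args.dataset
    return config


# Switching dataset is a single parameter change; everything else is preset.
config = resolve_config(["--dataset", "dataset_b"])
print(config)
```

In this sketch, choosing another dataset changes one argument, and the stored values for that dataset are picked up automatically unless a flag such as `--lr` is passed explicitly.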

The team used the PyTorch Geometric (PyG) framework as the backbone. They also standardized their baselines by converting them to a fit-predict-score interface. For each task, every implemented algorithm is tuned on all of the gathered standard benchmarks via grid search to obtain the best model. The resulting hyperparameters are stored in a single command line so that users can reproduce the results.
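
A fit-predict-score interface combined with grid search can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the class names, the toy k-nearest-neighbor model, and the toy data are all hypothetical, and a real baseline would be a deep learning model trained on a benchmark dataset.

```python
from itertools import product


class BaseModel:
    """Minimal fit-predict-score interface (hypothetical, not DANCE's class)."""

    def fit(self, x, y):
        raise NotImplementedError

    def predict(self, x):
        raise NotImplementedError

    def score(self, x, y):
        # Accuracy: fraction of predictions that match the true labels.
        preds = self.predict(x)
        return sum(p == t for p, t in zip(preds, y)) / len(y)


class KNNClassifier(BaseModel):
    """Toy k-nearest-neighbor classifier standing in for a real baseline."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, x, y):
        self.x_, self.y_ = list(x), list(y)
        return self

    def predict(self, x):
        preds = []
        for query in x:
            # Rank training points by squared Euclidean distance to the query.
            order = sorted(
                range(len(self.x_)),
                key=lambda i: sum((a - b) ** 2
                                  for a, b in zip(self.x_[i], query)))
            votes = [self.y_[i] for i in order[:self.k]]
            # Majority vote; ties go to the smallest label.
            preds.append(max(sorted(set(votes)), key=votes.count))
        return preds


def grid_search(model_cls, grid, train, valid):
    """Try every hyperparameter combination and keep the best-scoring one."""
    best_params, best_score = None, -1.0
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        model = model_cls(**params).fit(*train)
        current = model.score(*valid)
        if current > best_score:  # strict: the first best combination wins
            best_params, best_score = params, current
    return best_params, best_score


# Toy data: two well-separated clusters with labels 0 and 1.
train = ([(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)],
         [0, 0, 0, 1, 1, 1])
valid = ([(0.05, 0.1), (1.0, 0.95)], [0, 1])

best_params, best_score = grid_search(KNNClassifier, {"k": [1, 3, 5]},
                                      train, valid)
print(best_params, best_score)
```

Because every model exposes the same three methods, the search loop does not need to know anything about the model it is tuning, which is the practical benefit of standardizing baselines this way.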

The team believes the DANCE platform will benefit the entire single-cell community. End users do not have to spend much time and effort on model implementation and tuning. Instead, all they need to do to replicate the reported results is run the command line. The researchers also provide support for graphics processing units (GPUs) to speed up the training of deep learning models.

At present, DANCE lacks a unified set of tools for preprocessing and graph creation. The team plans to work on this in the future. They also stated that DANCE would be offered as software as a service (SaaS) so that users would not have to rely solely on their own device's processing power and storage capacity.

This article is written as a research summary by Marktechpost Staff based on the research paper 'DANCE: A Deep Learning Library and Benchmark for Single-Cell Analysis'. All credit for this research goes to the researchers on this project.

Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence in various fields. She is passionate about exploring new advancements in technology and their real-life applications.