CMU Researchers Open-Source ‘auton-survival’: A Comprehensive Python Code Repository of User-Friendly, Machine Learning Tools for Working with Censored Time-to-Event Data

Machine learning is now used in almost every industry, including healthcare. However, healthcare data are intrinsically complex, and classical machine learning struggles with them. One reason is that healthcare outcomes such as mortality, stroke, cancer onset, and readmission are time-to-event quantities measured on a continuous time scale. Such data frequently include individuals whose outcomes are missing, or censored, because of loss to follow-up, which makes them much harder to work with. The researchers note that traditional classification and regression methods do not offer a straightforward way to handle such clinical data.
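To make the censoring problem concrete, here is a minimal sketch of the classical Kaplan-Meier estimator in plain Python. It is not taken from auton-survival; the function name and the toy data are illustrative assumptions. The point it shows is that censored individuals still count toward the risk set up to the time they drop out, rather than being discarded or treated as events.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- follow-up time for each individual
    events -- 1 if the outcome was observed, 0 if the individual was censored
    """
    observed = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d > 0:
            # Only observed events reduce the survival estimate.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Censored and observed individuals both leave the risk set here.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort: two of the six individuals are censored.
times = [2, 3, 3, 5, 8, 9]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

A plain classifier or regressor has no equivalent of the risk set, which is why censored records cannot simply be fed to it as ordinary labels.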

Many researchers have been interested in applying deep neural networks, which may be used to create nonlinear representations of complex clin

A new study by Auton Lab at Carnegie Mellon University introduced the auton-survival package, a comprehensive Python library of user-friendly tools for machine learning applications in the presence of censored time-to-event data.

Auton-survival offers a suite of workflows that support a variety of experiments, from data pre-processing and regression modeling to model evaluation. It also follows a scikit-learn-like API, making it easy to adopt for users who are already familiar with Python's machine learning ecosystem.

To support quick prototyping, auton-survival includes extensive documentation for its utilities and sample code notebooks. Intended to support reproducible machine learning for healthcare research, auton-survival is open source and hosted on GitHub.

Complex multimodal data, frequently seen in healthcare and other applications, pose several difficulties for traditional machine learning. The team believed deep neural networks and representation learning could model such complex data, and auton-survival makes these methods available through a user-friendly interface.

The package includes deep representation learning-based extensions to the Cox Proportional Hazards (CPH) model, along with latent variable survival regression models such as Deep Cox Mixtures (DCM) and Deep Survival Machines (DSM), which model the time-to-event distribution as a fixed-size mixture and thereby relax the strict proportional hazards assumption. The team also worked on relaxing the restrictive assumptions of the proportional hazards model through discrete-time, adversarial, and parametric techniques.
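The idea of a fixed-size mixture can be sketched in a few lines of plain Python. This is a simplified illustration, not the DSM implementation: it assumes two Weibull components shared by everyone and individual-specific mixture weights, and all parameter values and names are made up. Under proportional hazards, the ratio of two individuals' cumulative hazards is constant over time; the sketch shows that a mixture does not impose that constraint.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival function of one Weibull component: exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def mixture_survival(t, weights, shapes, scales):
    """Survival function of a fixed-size mixture: sum_k w_k * S_k(t)."""
    return sum(
        w * weibull_survival(t, k, s)
        for w, k, s in zip(weights, shapes, scales)
    )


def cumulative_hazard(t, weights, shapes, scales):
    """Cumulative hazard H(t) = -log S(t) of the mixture."""
    return -math.log(mixture_survival(t, weights, shapes, scales))


# Assumed component parameters (shared) and per-individual mixture weights.
shapes, scales = [0.8, 2.5], [4.0, 10.0]
patient_a = [0.9, 0.1]
patient_b = [0.2, 0.8]

ratios = []
for t in (1.0, 5.0, 10.0):
    h_a = cumulative_hazard(t, patient_a, shapes, scales)
    h_b = cumulative_hazard(t, patient_b, shapes, scales)
    ratios.append(h_a / h_b)
    print(f"t={t}: cumulative hazard ratio A/B = {h_a / h_b:.2f}")
```

The printed ratio changes with time, which a Cox model with fixed covariates could not represent.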

The package includes a handy SurvivalModel class that facilitates quick experimentation through a unified API wrapping several alternative regression estimators. Along with the models mentioned above, the SurvivalModel class also supports Random Survival Forests (RSF), a well-known non-parametric survival model.

Software design decisions in auton-survival were informed by existing Python tools for machine learning and survival analysis.

The package supports rapid experimentation with several classes of survival regression models, along with metrics that assess a model's discriminative power and calibration. In addition, auton-survival is presented as the only package offering simple-to-use APIs for estimating treatment and counterfactual effects and for subgroup discovery, addressing the following real-world problems involving censored time-to-event data:

  1. Estimation of Counterfactual and Treatment Effects
  2. Using Time-Varying Covariates in Survival Regression
  3. Evaluation of Subgroups and Phenotypes
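As a rough illustration of the treatment effect estimation mentioned above, one common summary is the difference in restricted mean survival time (RMST) between two arms, where RMST is the area under the survival curve up to a chosen horizon. The sketch below uses plain Python and Kaplan-Meier curves. It assumes randomized treatment assignment so that the arms can be compared directly, the data are invented, and the function names are hypothetical rather than part of the auton-survival API.

```python
def km_steps(times, events):
    """Kaplan-Meier curve as [(event_time, survival_after_that_time)]."""
    at_risk = len(times)
    survival = 1.0
    steps = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            survival *= 1.0 - observed / at_risk
            steps.append((t, survival))
        at_risk -= leaving  # censored individuals leave the risk set too
    return steps


def restricted_mean_survival_time(times, events, horizon):
    """Area under the Kaplan-Meier step function from 0 to `horizon`."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in km_steps(times, events):
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (horizon - prev_t)


# Invented toy data: 1 = event observed, 0 = censored.
treated_times = [4, 6, 7, 9, 10, 12]
treated_events = [1, 0, 1, 0, 1, 0]
control_times = [1, 2, 4, 5, 6, 8]
control_events = [1, 1, 1, 0, 1, 1]
horizon = 8

rmst_treated = restricted_mean_survival_time(treated_times, treated_events, horizon)
rmst_control = restricted_mean_survival_time(control_times, control_events, horizon)
effect = rmst_treated - rmst_control
print(f"RMST treated={rmst_treated:.3f}, control={rmst_control:.3f}, difference={effect:.3f}")
```

With observational data, the arms would first need adjustment for confounding, for example by fitting outcome models per arm, which this sketch deliberately leaves out.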

The researchers hope their work will allow machine learning and healthcare communities to further enhance the collection of open-source survival regression methodologies that can support the reproducible analysis of censored time-to-event data.

This article is written as a research summary by Marktechpost Staff based on the research paper 'auton-survival: an Open-Source Package for Regression, Counterfactual Estimation, Evaluation and Phenotyping with Censored Time-to-Event Data'. All credit for this research goes to the researchers on this project.

Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields, and she is passionate about exploring new advancements in technology and their real-life applications.