Researchers at MIT DAI Lab Have Recently Built Cardea: A Machine Learning Framework That Turns Health Care Data Into Insights

Hospitals and other healthcare organizations have invested significant time and effort in implementing electronic health records, transforming hastily scribbled physicians’ notes into durable databases. However, gathering this information is only half the battle. Turning these archives into real insights that inform future decisions requires much more time and effort.

Cardea, a software framework created by researchers and software engineers at MIT’s Data to AI Lab (DAI Lab), is designed to help with this. By shepherding patient data through an ever-growing range of machine learning models, the framework could help hospitals prepare for events as big as global pandemics and as small as no-show appointments. According to research scientists at the DAI Lab, part of MIT’s Laboratory for Information and Decision Systems (LIDS), hospitals could solve “hundreds of different forms of machine learning problems” with Cardea.

The software is open source and uses generalizable approaches, which promotes transparency and encourages collaboration. Cardea belongs to the field of AutoML (automated machine learning), which aims to democratize predictive tools by making it easy for people, including non-experts, to create, use, and understand them. Rather than asking people to design and code an entire machine learning model, AutoML frameworks like Cardea surface existing models, along with explanations of what they do and how they work. Users can then combine and coordinate modules to achieve their objectives, much like going to a buffet instead of cooking a meal from scratch. Data scientists have created various machine learning tools for health care, but most of them are difficult to use, even for experts.

Cardea guides users along a pipeline that transforms reams of data into useful predictions, with choices and safeguards at each stage. Users first meet a data assembler, which ingests the information they provide. Cardea is built to work with Fast Healthcare Interoperability Resources (FHIR), the current industry standard for electronic health records. Since hospitals use FHIR in different ways, Cardea was designed to adapt to different conditions and datasets. Its data inspector flags inconsistencies in the data so that they can be corrected or dismissed.
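To make the data inspection step concrete, here is a minimal sketch of the kind of consistency check such a step might run. It is not Cardea's code or API: the function `inspect_encounters`, the sample records, and the two checks are hypothetical, and the records are heavily simplified dictionaries only loosely modeled on FHIR Encounter resources.

```python
from datetime import datetime

# Simplified, made-up records loosely modeled on FHIR Encounter resources.
# Real FHIR resources are much richer; these fields are an assumption.
ENCOUNTERS = [
    {"id": "e1", "subject": "Patient/1",
     "period": {"start": "2021-01-03", "end": "2021-01-07"}},
    {"id": "e2", "subject": None,
     "period": {"start": "2021-01-05", "end": "2021-01-06"}},
    {"id": "e3", "subject": "Patient/3",
     "period": {"start": "2021-01-09", "end": "2021-01-08"}},
]


def inspect_encounters(encounters):
    """Return (record id, problem) pairs for records that look inconsistent."""
    issues = []
    for enc in encounters:
        # Every encounter should point at a patient.
        if not enc.get("subject"):
            issues.append((enc["id"], "missing patient reference"))

        period = enc.get("period") or {}
        start, end = period.get("start"), period.get("end")
        if not start or not end:
            issues.append((enc["id"], "incomplete period"))
            continue

        # A stay cannot end before it begins.
        if datetime.fromisoformat(end) < datetime.fromisoformat(start):
            issues.append((enc["id"], "discharge precedes admission"))
    return issues


ISSUES = inspect_encounters(ENCOUNTERS)
for record_id, problem in ISSUES:
    print(f"{record_id}: {problem}")
```

A user could then decide, record by record, whether to fix the flagged entry or drop it before modeling.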

Cardea then asks the user what they want to know. Even seemingly small questions, such as how long a patient will stay in the hospital, are critical to day-to-day hospital operations, particularly during the COVID-19 pandemic. Users can choose from a variety of models. The framework then uses the dataset and the chosen model to learn patterns from past patients and predict what could happen in the current case, helping stakeholders make informed decisions.
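As a rough illustration of learning patterns from past patients, the sketch below predicts length of stay from the average stay of earlier patients with the same admission type. It is a deliberately simple stand-in, not one of Cardea's models: the data are invented, and `fit_mean_stay` and `predict_stay` are hypothetical names.

```python
from collections import defaultdict
from statistics import mean

# Invented history: (admission type, length of stay in days).
PAST_STAYS = [
    ("emergency", 6),
    ("emergency", 4),
    ("elective", 2),
    ("elective", 3),
    ("elective", 1),
]


def fit_mean_stay(stays):
    """Learn the average stay per admission type, plus an overall fallback."""
    by_type = defaultdict(list)
    for admission_type, days in stays:
        by_type[admission_type].append(days)
    overall = mean(days for _, days in stays)
    per_type = {t: mean(values) for t, values in by_type.items()}
    return per_type, overall


def predict_stay(model, admission_type):
    """Predict days in hospital; unseen admission types use the overall mean."""
    per_type, overall = model
    return per_type.get(admission_type, overall)


MODEL = fit_mean_stay(PAST_STAYS)
print(predict_stay(MODEL, "emergency"))
print(predict_stay(MODEL, "urgent"))
```

A real framework would offer far richer features and models, but the shape is the same: fit on historical records, then query the fitted model for a new patient.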

Cardea is currently set up to help with four types of resource-allocation problems. However, because the pipeline incorporates many different models, it can be adapted quickly to new situations. The goal is for stakeholders to eventually use Cardea to solve any prediction problem within the healthcare domain.

The team presented a paper outlining the method at the IEEE International Conference on Data Science and Advanced Analytics. The researchers compared the system’s accuracy to that of users of a popular data science platform and found that it outperformed 90% of them. They also put it to the test by asking data analysts to make predictions on a mock healthcare dataset using Cardea. The analysts’ productivity increased significantly: for example, feature engineering, which usually takes them two hours, took just five minutes.

Hospital staff are often tasked with making high-stakes, life-or-death decisions. They therefore need to understand what is going on and have confidence in their tools, including Cardea. Cardea’s next step is a model audit, which would provide much more transparency. Because the framework is open source, users can integrate their own tools. The team also intends to add more data visualizers and explanations to offer a more comprehensive view and make the framework more accessible to non-experts.


