Apple ML Researchers Develop ‘Neo’: A Visual Analytics System That Enables Machine Learning Practitioners To Generalize Confusion Matrix Visualization to Hierarchical and Multi-Output Labels

This Article Is Based On The Research Paper 'Neo: Generalizing Confusion Matrix Visualization to Hierarchical and Multi-Output Labels'. All Credit For This Research Goes To The Researchers Of This Paper 👏👏👏

In machine learning (ML), model evaluation is one of the more challenging steps. The confusion matrix is among the most widely used tools for evaluating classification models, and it is also a visualization that many ML courses and researchers rely on. It is a table with two dimensions: the actual class label and the predicted class label. Each row represents an actual class, and each column represents a predicted class.

In practice, the confusion matrix mostly serves as a visual proxy for accuracy, which is insufficient for many evaluations. The diagonal cells (correct predictions) usually contain far more examples than the off-diagonal cells, so the off-diagonal entries (the confusions) tend to be visually hidden. As researchers improve a model, this problem gets worse, because even more examples land on the diagonal. The visualization also scales poorly when the dataset has strong class imbalance, multiple outputs, or a hierarchical label structure.

To address these challenges, Apple researchers conducted a formative research study with ML practitioners. They found that the conventional confusion matrix is hard to apply to complex label structures and does not support multi-output labels. The source paper defines an algebra for confusion matrices that models them as probability distributions, which provides a basis for addressing the limitations of traditional confusion matrices. Building on it, the Apple researchers propose NEO, a visual analytics system that supports diverse configurations and complex data structures. The source paper includes a figure showing the NEO system.
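To make the row and column convention described above concrete, here is a minimal sketch in plain Python. The class names and the toy label lists are hypothetical and are not taken from the paper; the helper names `confusion_matrix` and `accuracy` are likewise chosen only for illustration.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def accuracy(matrix):
    """Share of examples that fall on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Hypothetical toy data for illustration only.
classes = ["cat", "dog", "bird"]
actual = ["cat", "cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "cat", "dog", "dog", "dog", "bird", "cat", "bird"]

matrix = confusion_matrix(actual, predicted, classes)
for name, row in zip(classes, matrix):
    print(f"{name:>5}: {row}")
print("accuracy:", accuracy(matrix))
```

Even in this small case, most of the counts sit on the diagonal, and the two confusions (a cat predicted as a dog, a bird predicted as a cat) are easy to overlook when only the accuracy figure is reported.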

The challenges of the conventional confusion matrix are as follows:

C1) Hidden performance metrics: Important performance measures such as accuracy, F1 score, precision, and recall are not explicitly shown in the confusion matrix.

C2) Hierarchical labels: The traditional confusion matrix assumes a flat, one-dimensional label structure. However, many current datasets have hierarchical label structures.

C3) Multi-output labels: The traditional confusion matrix does not support multi-output labels, where each example carries several labels at once.

C4) Communicating confusions: A confusion matrix should be easy to export without loss of quality and to share together with the relevant project context.
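As an illustration of challenge C1, the metrics that a raw confusion matrix leaves implicit can be derived from its row and column sums. The following is a minimal sketch using the standard definitions of precision, recall, and F1; the matrix values and the function name `per_class_metrics` are hypothetical.

```python
def per_class_metrics(matrix, classes):
    """Derive precision, recall, and F1 per class.

    Assumes rows are actual classes and columns are predicted classes.
    """
    metrics = {}
    for i, name in enumerate(classes):
        true_positives = matrix[i][i]
        actual_total = sum(matrix[i])                    # row sum
        predicted_total = sum(row[i] for row in matrix)  # column sum
        precision = true_positives / predicted_total if predicted_total else 0.0
        recall = true_positives / actual_total if actual_total else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Hypothetical counts for illustration only.
classes = ["cat", "dog", "bird"]
matrix = [
    [2, 1, 0],  # actual cat
    [0, 2, 0],  # actual dog
    [1, 0, 2],  # actual bird
]

metrics = per_class_metrics(matrix, classes)
for name, values in metrics.items():
    print(name, {key: round(value, 3) for key, value in values.items()})
```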

The tasks that the proposed system is designed to support are as follows:

T1) Flexible data analysis, including scaling and normalization.

T2) Visualizing hierarchical labels and traversing the hierarchy.

T3) Transforming multi-output labels so that they can be visualized.

T4) Sharing confusion matrix analyses and configurations.
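One common form of the normalization mentioned in T1 is to divide each row by its total, so that every cell shows the share of an actual class rather than a raw count. The sketch below assumes row normalization; the source does not state which schemes NEO offers, and the counts and the function name `normalize_rows` are hypothetical.

```python
def normalize_rows(matrix):
    """Convert each row of counts into fractions of that row's total."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized

# Hypothetical imbalanced dataset: 100 examples of class A, 10 of class B.
matrix = [
    [90, 10],  # actual A
    [4, 6],    # actual B
]

normalized = normalize_rows(matrix)
for row in normalized:
    print([round(value, 2) for value in row])
```

With raw counts, the 4 misclassified examples of class B look negligible next to the 90 correct predictions for class A. After row normalization, it becomes visible that they make up 40% of class B.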

Key Contributions:
  1. Formative survey: The researchers surveyed practitioners working in the machine learning domain at Apple. The questionnaire covered i) stages of machine learning, ii) data classes, iii) when the confusion matrix is used in ML, iv) insights gained or missed from the confusion matrix, v) hierarchical confusions, vi) multiple labels, and vii) any other approaches used.

The responses indicate that practitioners are interested in visualization.

  2. Algebra for confusion matrices: Generalizes the confusion matrix and models it as a probability distribution.

In this algebra, conditioning, marginalization, and nesting are applied to transform high-dimensional multi-output labels. Conditioning extracts subviews of a large confusion matrix. Marginalization sums out label dimensions of the multivariate distribution that are not required. Nesting is used to examine multiple labels concurrently.
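The three operations can be sketched on a table of joint counts. This is only an illustration under stated assumptions and not the paper's formalism or NEO's implementation: the label dimensions (`object`, `color`), the toy examples, and the function names `marginalize`, `condition`, and `nest` are all hypothetical, and conditioning is assumed here to filter on the actual label.

```python
from collections import Counter

# Hypothetical multi-output examples: (actual labels, predicted labels),
# where each label tuple is (object, color).
DIMS = ("object", "color")
examples = [
    (("apple", "red"), ("apple", "red")),
    (("apple", "red"), ("apple", "green")),
    (("apple", "green"), ("pear", "green")),
    (("pear", "green"), ("pear", "green")),
    (("pear", "green"), ("apple", "green")),
]
joint = Counter(examples)  # joint counts over all label dimensions

def marginalize(counts, keep):
    """Sum out every label dimension that is not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = Counter()
    for (actual, predicted), n in counts.items():
        key = (tuple(actual[i] for i in idx), tuple(predicted[i] for i in idx))
        out[key] += n
    return out

def condition(counts, dim, value):
    """Keep only examples whose actual label in `dim` equals `value`."""
    i = DIMS.index(dim)
    return Counter({k: n for k, n in counts.items() if k[0][i] == value})

def nest(counts, order):
    """Combine several dimensions into one path-like label, outermost first."""
    idx = [DIMS.index(d) for d in order]
    out = Counter()
    for (actual, predicted), n in counts.items():
        key = ("/".join(actual[i] for i in idx), "/".join(predicted[i] for i in idx))
        out[key] += n
    return out

# These helpers expect the full joint table, so condition before marginalizing.
object_view = marginalize(joint, ["object"])
apple_colors = marginalize(condition(joint, "object", "apple"), ["color"])
nested_view = nest(joint, ["color", "object"])
```

Each result is itself a table of (actual, predicted) counts, so it can be laid out as an ordinary confusion matrix: `object_view` ignores color, `apple_colors` shows color confusions among actual apples only, and `nested_view` shows both dimensions at once with labels such as `green/pear`.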

  3. A visual analytics system, NEO: A system that supports hierarchical and multi-output labels. It is reactive: the visualization updates while the user authors a specification (spec), and the spec updates while the user interacts with the visualization. NEO is a modern web-based system built with Svelte, TypeScript, and D3. It provides efficient interactions for configuring a confusion matrix to assess related classes.
  4. Evaluation scenarios: The paper also demonstrates the use of NEO for assessing machine learning models. The scenarios include:
  • Object detection: Here, the confusion matrix is analyzed further with NEO to discover hidden confusions.
  • Large-scale hierarchical image classification: Here, to deal with a large hierarchical confusion matrix, NEO starts from the root with all sub-hierarchies collapsed. The model's performance measures are recomputed for each sub-hierarchy and class.
  • Detecting online multi-output toxicity: Here, NEO is applied to a multi-output toxicity model that covers mildly toxic and severely toxic comments, where the model produces some false negatives.
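The idea of collapsing sub-hierarchies in the hierarchical scenario above can be sketched by mapping every leaf class to its parent and re-aggregating the counts. This is a minimal sketch under stated assumptions, not NEO's implementation: the two-level hierarchy, the counts, and the names `collapse` and `accuracy_from_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical two-level hierarchy (child class -> parent class).
PARENT = {
    "apple": "fruit",
    "pear": "fruit",
    "carrot": "vegetable",
    "leek": "vegetable",
}

# Hypothetical leaf-level counts keyed by (actual, predicted).
leaf_counts = Counter({
    ("apple", "apple"): 8, ("apple", "pear"): 3,
    ("pear", "pear"): 7, ("pear", "apple"): 2, ("pear", "carrot"): 1,
    ("carrot", "carrot"): 9, ("carrot", "leek"): 1,
    ("leek", "leek"): 6, ("leek", "apple"): 1,
})

def collapse(counts, parent):
    """Aggregate counts so that each class is replaced by its parent."""
    out = Counter()
    for (actual, predicted), n in counts.items():
        out[(parent.get(actual, actual), parent.get(predicted, predicted))] += n
    return out

def accuracy_from_counts(counts):
    """Share of examples whose actual and predicted labels match."""
    total = sum(counts.values())
    correct = sum(n for (actual, predicted), n in counts.items() if actual == predicted)
    return correct / total if total else 0.0

collapsed = collapse(leaf_counts, PARENT)
print(dict(collapsed))
print("leaf accuracy:", round(accuracy_from_counts(leaf_counts), 3))
print("collapsed accuracy:", round(accuracy_from_counts(collapsed), 3))
```

In this toy example, most errors are confusions between siblings (apple versus pear), so accuracy rises once the hierarchy is collapsed. Recomputing metrics per level in this way shows whether a model fails across sub-hierarchies or only within them.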

Conclusion: This work generalizes the capabilities of the confusion matrix. Drawing on formative research, the authors create an algebra that supports more variations of the confusion matrix. They also develop NEO, a visual analytics tool that lets researchers create, interact with, and share confusion matrices flexibly. Finally, the system's usefulness is demonstrated in three evaluation scenarios, which show how it can help practitioners understand model performance and uncover hidden confusions. Directions for future work include scaling confusion matrix visualization, automatically discovering submatrices, interactive analysis using metadata, comparing model confusions over time, and creating datasets from confusions.



Priyanka Israni is currently pursuing a PhD at Gujarat Technological University, Ahmedabad, India. Her areas of interest are medical image processing, machine learning, deep learning, data analysis, and computer vision. She has 8 years of experience teaching engineering graduates and postgraduates.