Researchers at the University of Genoa and AWS Analyze Techniques to Make Machine Learning (ML) Models Fairer for Underrepresented Groups

What is fairness?

Fairness is impartial treatment or behavior, without favoritism or discrimination. There are several formal notions of fairness, such as equalized odds, equal opportunity, and demographic parity.

Demographic parity means that every subgroup should receive positive outcomes at the same rate. It is the most common notion of fairness and implies that the outcome should be independent of the sensitive attribute (e.g., whether a customer is eligible for a loan should be independent of their gender).
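As an illustration, demographic parity can be checked empirically by comparing positive-outcome rates across subgroups. The following is a minimal Python sketch; the data and the variable names (`y_pred`, `gender`) are hypothetical.

```python
import numpy as np

# Hypothetical binary predictions (1 = loan approved) and a sensitive attribute.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
gender = np.array(["F", "M", "F", "M", "F", "M", "F", "M"])

# Positive-outcome rate within each subgroup.
rates = {g: y_pred[gender == g].mean() for g in np.unique(gender)}

# Demographic parity gap: 0 means the outcome is independent of gender.
dp_gap = max(rates.values()) - min(rates.values())
print(rates, dp_gap)
```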


This research on algorithmic fairness considers three main approaches: pre-processing the data, post-processing an already learned ML model, and in-processing, which enforces fairness notions by imposing statistical constraints during the learning phase of the model.

This research uses Empirical Risk Minimization (ERM) to make machine learning models fairer. The core idea behind ERM is that a model's performance on the sample data may not reflect its performance on real-world data, since the latter may follow a different probability distribution. ERM provides a way to estimate the true risk of a model from its empirical risk, which can be computed from the available data. The same idea extends to the true and empirical fairness risk of ML models.
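For intuition, the empirical risk is simply the average loss over the available sample, used as a stand-in for the (unobservable) true risk under the real-world distribution. Below is a minimal sketch with squared loss and hypothetical data; it illustrates the quantity being minimized, not any specific method from the papers.

```python
import numpy as np

def empirical_risk(model, X, y):
    """Average loss on the sample: an estimate of the (unobservable) true risk."""
    preds = model(X)
    return np.mean((preds - y) ** 2)  # squared loss as an example

# Hypothetical data and a trivial linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=100)
model = lambda X: X @ w_true

print(empirical_risk(model, X, y))
```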


Empirical Risk Minimization under Fairness Constraints

Paper link: https://arxiv.org/pdf/1802.08626.pdf

This paper’s approach is a new in-processing method based on ERM that incorporates a fairness constraint into the learning problem. It encourages the conditional error of a learned classifier to be approximately constant with respect to demographic groups (e.g., gender). The authors derive both risk and fairness bounds that support the statistical consistency of their approach. They instantiate the approach for kernel methods (e.g., SVMs) and observe that the fairness requirement translates into an orthogonality constraint between the weight vector describing the model and a vector representing the discrimination between the different subgroups.

They further observe that, for linear models, the constraint reduces to a simple pre-processing step on the data, as sketched below.
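For a linear model, the constraint can be read as requiring the weight vector w to be orthogonal to a vector u that captures the difference between the subgroups’ conditional feature means. The sketch below shows one hypothetical way to impose this as a pre-processing step, by projecting the features onto the subspace orthogonal to u; the variable names and the projection choice are assumptions for illustration, not the paper’s exact procedure.

```python
import numpy as np

def orthogonal_preprocess(X, y, group):
    """Project features so that any linear model trained on them has weights
    (approximately) orthogonal to the between-group mean-difference vector u."""
    # Hypothetical choice of u: difference of group-conditional means on the positive class.
    u = X[(group == 0) & (y == 1)].mean(axis=0) - X[(group == 1) & (y == 1)].mean(axis=0)
    u = u / np.linalg.norm(u)
    # Remove the component of each sample along u.
    return X - np.outer(X @ u, u)

# Hypothetical data: 200 samples, 5 features, binary labels and groups.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)
group = rng.integers(0, 2, size=200)

X_fair = orthogonal_preprocess(X, y, group)
```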

Fair Regression with Wasserstein Barycenters

Paper link: https://arxiv.org/pdf/2006.07286.pdf

In this paper, the authors propose a post-processing method that transforms the real-valued regression function of a machine learning model so that the distribution of outcomes is approximately the same across subgroups. Under an unfair regression function, different populations receive different output distributions; the function’s outputs are skewed by the demographic attribute. The difference between subgroup distributions is measured with the Wasserstein distance, and the paper shows that the distribution of the optimal fair predictor is the barycenter (the Wasserstein mean) of the subgroups’ distributions.
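To make the post-processing idea concrete: for one-dimensional predictions, the Wasserstein barycenter’s quantile function is the weighted average of the groups’ quantile functions, so each prediction can be mapped through its group’s empirical CDF and then onto the barycenter’s quantile at that level. The sketch below is a rough illustration of that idea with hypothetical variable names; it is not the paper’s exact estimator.

```python
import numpy as np

def barycenter_postprocess(preds, group):
    """Map each group's predictions onto a common (barycentric) distribution."""
    groups, counts = np.unique(group, return_counts=True)
    weights = counts / counts.sum()
    fair = np.empty_like(preds, dtype=float)
    for g in groups:
        mask = group == g
        p = preds[mask]
        # Empirical CDF level of each prediction within its own group.
        levels = (np.argsort(np.argsort(p)) + 0.5) / p.size
        # Barycenter value: weighted average of every group's quantile at that level.
        fair[mask] = sum(
            w * np.quantile(preds[group == h], levels)
            for h, w in zip(groups, weights)
        )
    return fair

# Hypothetical regression outputs that are systematically higher for one group.
rng = np.random.default_rng(2)
group = rng.integers(0, 2, size=500)
preds = rng.normal(loc=group * 0.8, scale=1.0, size=500)

fair_preds = barycenter_postprocess(preds, group)
```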

The paper’s numerical experiments indicate that post-processing is effective in making the model fairer.


Exploiting MMD and Sinkhorn divergences for fair and transferable representation learning

Paper link: https://assets.amazon.science/36/8c/b7e3a27d4998be3cd4a3118cdd6f/exploiting-mmd-and-sinkhorn-divergences-for-fair-and-transferable-representation-learning.pdf

This paper focuses on deep learning. The approach modifies the learned data representation so that it meets fairness constraints, and it establishes that the representation continues to reduce bias even when transferred to a novel task. The paper considers two ways of measuring the distance between probability distributions, namely the maximum mean discrepancy (MMD) and the Sinkhorn divergence. Keeping this distance small ensures that inputs that differ only in the demographic attribute are represented in a similar way.
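As a rough illustration of using MMD as a fairness measure on learned representations, the sketch below computes a (biased) RBF-kernel estimate of the squared MMD between the representations of two demographic groups; a small value indicates that the groups are represented similarly. This is a generic MMD estimator for illustration, not the paper’s exact training objective.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * sigma ** 2))

def mmd2(Z0, Z1, sigma=1.0):
    """Biased estimate of the squared maximum mean discrepancy between two samples."""
    return (rbf_kernel(Z0, Z0, sigma).mean()
            + rbf_kernel(Z1, Z1, sigma).mean()
            - 2 * rbf_kernel(Z0, Z1, sigma).mean())

# Hypothetical learned representations for two demographic groups.
rng = np.random.default_rng(3)
Z0 = rng.normal(size=(64, 16))           # group 0
Z1 = rng.normal(loc=0.3, size=(64, 16))  # group 1 (slightly shifted)

fairness_penalty = mmd2(Z0, Z1)  # could be added to the task loss during training
print(fairness_penalty)
```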

The paper presents experiments on three different real-world datasets, showing that the proposed method outperforms state-of-the-art approaches by a significant margin.

Source: https://www.amazon.science/research-awards/success-stories/algorithmic-bias-and-fairness-in-machine-learning