When you use a machine learning model to make predictions, it is essential to know how reliable those predictions are. It is hard to understand what happens inside a model, and complex learning algorithms are often used as “black boxes.” Selective regression is a technique in which the model can either predict the target variable or abstain from predicting, depending on its confidence. Abstaining on low-confidence inputs improves the overall performance of the model at the cost of reduced coverage (the fraction of cases on which it predicts), but performance may get worse for subgroups that are underrepresented in the data, which introduces bias. This happens because the training data may overrepresent some subgroups, and that skews the confidence measure.

Fairness work attempts to reduce the bias of ML models with respect to sensitive variables (such as gender or race), since these often define subgroups with underrepresented data. In short, it has been observed that improving the overall performance of a model can reduce its fairness. MIT researchers propose a method to mitigate such disparities among minority subgroups in machine learning models.
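To make the trade-off between coverage and error concrete, here is a minimal sketch of selective regression in plain Python. It assumes the model already provides a prediction and an uncertainty score for every input; the function names (accept_mask, selective_mse) and the toy numbers are hypothetical and only illustrate the idea, they are not taken from the paper.

```python
def accept_mask(uncertainty, coverage):
    """Mark the `coverage` fraction of samples with the lowest uncertainty
    as accepted (True); the model abstains on the rest (False)."""
    n = len(uncertainty)
    k = int(round(coverage * n))  # number of samples to predict on
    order = sorted(range(n), key=lambda i: uncertainty[i])
    accepted = set(order[:k])
    return [i in accepted for i in range(n)]


def selective_mse(y_true, y_pred, mask):
    """Mean squared error computed only on the accepted samples."""
    errors = [(t - p) ** 2 for t, p, m in zip(y_true, y_pred, mask) if m]
    return sum(errors) / len(errors) if errors else float("nan")


# Toy data (made up for illustration): the two most uncertain
# predictions are also the least accurate ones.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 2.0, 2.0, 6.0]
uncertainty = [0.1, 0.2, 0.5, 0.9]

full_mask = accept_mask(uncertainty, coverage=1.0)
half_mask = accept_mask(uncertainty, coverage=0.5)

full_mse = selective_mse(y_true, y_pred, full_mask)
half_mse = selective_mse(y_true, y_pred, half_mask)
print("MSE at 100% coverage:", full_mse)
print("MSE at 50% coverage:", half_mse)
```

In this toy case the error drops when the model abstains on the two least confident inputs, which is the behaviour selective regression relies on. It only works this well when the uncertainty score actually tracks the error, and that is the assumption that can fail for underrepresented subgroups.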
Fair regression: This approach is designed to remain fair with respect to sensitive features such as gender, i.e., its predictions should not discriminate based on that feature.
Fair selective regression
Fair selective regression examines the disparities that arise in selective regression and tries to reduce them. The MIT team wanted to ensure that the performance of every subgroup improves as the overall model improves under selective regression. The fairness criterion they use is called monotonic selective risk. It requires that, for every subgroup, the mean squared error decreases monotonically as coverage decreases, so that no subgroup is discriminated against in selective regression. The researchers developed two neural network algorithms that impose this criterion: one ensures that the features used for prediction contain all the information about the sensitive attributes, and the second employs a calibration technique to ensure that the prediction stays the same irrespective of the sensitive attributes. When these were applied, the disparities seen in ordinary selective regression were reduced, with lower errors for underrepresented subgroups. Notably, the overall error rate was not significantly affected.
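The monotonic selective risk criterion described above can be checked on held-out data with a short diagnostic. The sketch below is an assumption-laden illustration, not the researchers' training algorithms: the helper names (subgroup_risk_curve, is_monotonic), the group labels "A" and "B", and the toy numbers are all hypothetical. It computes the selective mean squared error of one subgroup at several coverage levels and tests whether that error ever rises as coverage falls.

```python
def subgroup_risk_curve(y_true, y_pred, uncertainty, group, target, coverages):
    """Selective MSE of subgroup `target` at each coverage level.
    Samples are accepted globally, lowest uncertainty first."""
    n = len(y_true)
    order = sorted(range(n), key=lambda i: uncertainty[i])
    curve = []
    for c in coverages:
        accepted = order[: int(round(c * n))]
        errs = [(y_true[i] - y_pred[i]) ** 2
                for i in accepted if group[i] == target]
        # None marks coverage levels where no sample of the subgroup is accepted.
        curve.append(sum(errs) / len(errs) if errs else None)
    return curve


def is_monotonic(curve, tol=1e-12):
    """True if the risk never increases along the curve.
    `curve` must be ordered from highest to lowest coverage."""
    vals = [v for v in curve if v is not None]
    return all(b <= a + tol for a, b in zip(vals, vals[1:]))


# Toy data (made up): group "B" is the minority, and the model is
# confidently wrong on one of its samples (index 2).
y_true = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
y_pred = [0.1, 0.2, 1.0, 1.0, 0.5, 1.5]
uncertainty = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
group = ["A", "A", "B", "A", "B", "A"]
coverages = [1.0, 0.5]  # ordered from high to low coverage

curve_a = subgroup_risk_curve(y_true, y_pred, uncertainty, group, "A", coverages)
curve_b = subgroup_risk_curve(y_true, y_pred, uncertainty, group, "B", coverages)
print("group A:", curve_a, "monotonic:", is_monotonic(curve_a))
print("group B:", curve_b, "monotonic:", is_monotonic(curve_b))
```

In this toy example the majority group's error falls when coverage is halved, while the minority group's error rises, so the criterion is violated for group "B" even though the error over all accepted samples goes down. This is the kind of disparity the two algorithms are meant to remove.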
In summary, selective regression can improve the performance of a machine learning model by reducing its coverage, but it can also make the model less fair to underrepresented subgroups. The researchers propose a technique that improves fairness without compromising overall accuracy.
This article is a research summary written by Marktechpost Research Staff based on the research paper 'Selective Regression Under Fairness Criteria'. All credit for this research goes to the researchers on this project.
Prathvik is an ML/AI research content intern at MarktechPost and a third-year undergraduate at IIT Kharagpur. He has a keen interest in machine learning and data science, and is enthusiastic about learning how machine learning is applied in different fields of study.