Addressing AI’s Generalization Gap: Researchers From University College London Propose Spawrious – An Image Classification Benchmark Suite Containing Spurious Correlations Between Classes And Backgrounds

With the increasing popularity of Artificial Intelligence, new models are getting released almost every day with brand-new features and problem-solving capabilities. Researchers in recent times have been focusing on coming up with approaches to strengthen AI models’ resistance to unknown test distributions and lessen their reliance on spurious features. Considering the examples of self-driving cars and autonomous kitchen robots, they have not been widely deployed yet because of the challenges posed by their behavior in out-of-distribution (OOD) settings, which refer to the scenarios that differ significantly from the training data the models were exposed to.

Numerous studies have looked into the issue of spurious correlations (SCs) and suggested methods to lessen their negative effects on model performance. It has been demonstrated that classifiers trained on well-known datasets like ImageNet rely on background data, which is spuriously linked with class labels but not necessarily predictive of them. Though progress has been made in developing methods to address the SC problem, there is still a need to address the limitations of existing benchmarks. Current benchmarks like Waterbirds and CelebA hair color benchmarks have limitations, one of which is their focus on simplistic one-to-one (O2O) spurious correlations, when in reality, many-to-many (M2M) spurious correlations are more common, involving groups of classes and backgrounds.

Recently, a team of researchers from University College London has introduced an image classification benchmark suite called the Spawrious dataset which contains spurious correlations between classes and backgrounds. It includes both one-to-one (O2O) and many-to-many (M2M) spurious correlations, which have been categorized into three difficulty levels: Easy, Medium, and Hard. The dataset consists of approximately 152,000 high-quality, photo-realistic images generated using a text-to-image model, and an image captioning model has been employed to filter out unsuitable images, ensuring the dataset’s quality and relevance.

Upon evaluation, the Spawrious dataset has demonstrated incredible performance as the dataset imposed challenges for the current state-of-the-art (SOTA) group robustness approaches, such as Hard-splits, which presented a significant challenge, with none of the tested methods achieving over 70% accuracy using a ResNet50 model pretrained on ImageNet. The team has mentioned how the models’ performance shortcomings have been caused by their reliance on fictitious backgrounds by looking at the classifications they made incorrectly. This shows how the Spawrious dataset was able to successfully tests classifiers and reveal their weaknesses to erroneous correlations.

To illustrate the difference between the O2O and M2M benchmarks, the team has used an example of collecting training data during the summer, consisting of two groups of animal species from two distinct locations, with each animal group being associated with a specific background group. However, as the seasons change and animals migrate, the groups exchange locations, causing the spurious correlations between animal groups and backgrounds to reverse in a way that cannot be matched on a one-to-one basis. This highlights the need to capture the intricate relationships and interdependencies in M2M spurious correlations.

Spawrious seems like a promising benchmark suite for OOD, domain generalization algorithms, and for evaluating and improving the robustness of models in the presence of spurious features.

Check Out Theย Paper and Github.ย Donโ€™t forget to joinย our 25k+ ML SubReddit,ย Discord Channel,ย andย Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us atย

๐Ÿš€ Check Out 100โ€™s AI Tools in AI Tools Club

Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.

๐Ÿ Join the Fastest Growing AI Research Newsletter Read by Researchers from Google + NVIDIA + Meta + Stanford + MIT + Microsoft and many others...