IBM Researchers Showcase an Image Classifier Pretrained on Synthetic Data and Fine-Tuned on Real Data That Performs as Well as One Trained Exclusively on ImageNet’s Database of Real-World Photos

In a recent research article, an IBM research team examined how machine learning and computer vision tasks can be performed with safe, synthetic data. Today’s models are typically trained on real-world data, such as video scraped from sites like YouTube. Deep learning models learn to make predictions and decisions from patterns extracted from billions of real-world examples. But health, financial, and consumer information, as well as online material, are all subject to copyright, ethical, and privacy rules. Real data also brings other problems, including high curation costs, biases, and built-in vulnerabilities, which have led to increasingly common incidents such as chatbots ranting about gender and ethnicity and resume screeners excluding competent job applicants.

IBM has long been a pioneer in advancing AI technologies and has created AI solutions for enterprise problems. This time, its researchers propose a low-cost alternative to real data: synthetic data that is safe to use and has proven capable of training models about as well as real data. According to the researchers, by 2024, 60% of the data used to train AI models will be synthetically generated. In this case, the synthetic data are computer-generated images that look real, come with fewer usage restrictions, and are impervious to malicious attacks.

There are two main ways to generate synthetic images:

  • Generative models: AI systems that learn from data and can drastically shorten the time it takes to find new ideas to test.
  • Graphics engines: image-generating systems that produce data for task-specific training. This technology is widely used to train self-driving cars and warehouse robots.

The researchers developed Task2Sim, an AI model that learns to generate synthetic, task-specific data for pretraining image-classification models. They used ThreeDWorld, a simulation environment built on the Unity graphics engine, to create images with realistic objects and scenes.

According to the researchers, some advantages of using synthetic data are:

  • In a virtual environment, creating images from scratch presents fewer challenges; for example, it avoids the tiresome task of labeling what is in each picture.
  • You can control the parameters of synthetic images: the background, the lighting, and the way objects are posed.
  • You can generate unlimited training data, and you get labels for free.
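
To make the "labels for free" point concrete, here is a minimal Python sketch of a scene sampler. It is not Task2Sim or the ThreeDWorld API; the names `SceneSpec`, `sample_scene`, and `generate_dataset`, along with the specific backgrounds, lighting settings, and object classes, are hypothetical and chosen only for illustration. The idea it shows is that when the generator picks the object class itself, the label is known before any image is rendered.

```python
import random
from dataclasses import dataclass

# Hypothetical parameter values. The article names background, lighting,
# and object pose as controllable; these specific choices are invented.
BACKGROUNDS = ["warehouse", "street", "kitchen"]
LIGHTING = ["dim", "daylight", "spotlight"]
CLASSES = ["chair", "mug", "bicycle"]


@dataclass(frozen=True)
class SceneSpec:
    """Describes one synthetic image before it is rendered."""
    label: str          # object class, known by construction
    background: str
    lighting: str
    yaw_degrees: float  # object rotation about the vertical axis


def sample_scene(rng: random.Random) -> SceneSpec:
    """Draw one scene description from the controllable parameters."""
    return SceneSpec(
        label=rng.choice(CLASSES),
        background=rng.choice(BACKGROUNDS),
        lighting=rng.choice(LIGHTING),
        yaw_degrees=rng.uniform(0.0, 360.0),
    )


def generate_dataset(n: int, seed: int = 0) -> list:
    """Produce n labeled scene specs; a real pipeline would render each one."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]


if __name__ == "__main__":
    for spec in generate_dataset(3):
        print(spec)
```

Because the dataset size is just the argument `n`, the sketch also reflects the claim that training data can be produced in whatever quantity is needed; an actual renderer would take each `SceneSpec` and return pixels alongside the same label.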

Not only did a classifier pretrained on Task2Sim’s synthetic images perform as well as a model trained on real ImageNet photos, but it also outperformed a rival trained on images generated with random simulation parameters.
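
The pretrain-then-fine-tune workflow behind this result can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the researchers' setup: it uses a tiny logistic-regression model on 2-D points instead of an image classifier, and the "synthetic" and "real" datasets are simply two slightly different Gaussian blobs. All function names (`make_blobs`, `train`, `accuracy`, `run_demo`) and all numeric settings are hypothetical.

```python
import math
import random


def make_blobs(n, centers, rng):
    """Sample n labeled 2-D points, alternating between two class centers."""
    data = []
    for i in range(n):
        label = i % 2
        cx, cy = centers[label]
        data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(weights, data, epochs, lr, rng):
    """Run SGD on the logistic loss, starting from the given weights."""
    w = list(weights)   # [w_x, w_y, bias]; copied so the input is untouched
    order = list(data)  # copied so shuffling does not reorder the caller's data
    for _ in range(epochs):
        rng.shuffle(order)
        for (x0, x1), label in order:
            err = sigmoid(w[0] * x0 + w[1] * x1 + w[2]) - label
            w[0] -= lr * err * x0
            w[1] -= lr * err * x1
            w[2] -= lr * err
    return w


def accuracy(w, data):
    """Fraction of points whose predicted class matches the label."""
    correct = sum(
        (w[0] * x0 + w[1] * x1 + w[2] > 0) == (label == 1)
        for (x0, x1), label in data
    )
    return correct / len(data)


def run_demo(seed=0):
    """Pretrain on plentiful 'synthetic' data, then fine-tune on scarce 'real' data."""
    rng = random.Random(seed)
    synthetic_centers = [(-2.0, -2.0), (2.0, 2.0)]  # stand-in for simulated data
    real_centers = [(-1.5, -2.5), (2.5, 1.5)]       # slightly shifted "real" data
    synthetic = make_blobs(400, synthetic_centers, rng)
    real_train = make_blobs(40, real_centers, rng)
    real_test = make_blobs(200, real_centers, rng)

    pretrained = train([0.0, 0.0, 0.0], synthetic, epochs=5, lr=0.1, rng=rng)
    finetuned = train(pretrained, real_train, epochs=5, lr=0.05, rng=rng)
    return accuracy(pretrained, real_test), accuracy(finetuned, real_test)


if __name__ == "__main__":
    before, after = run_demo()
    print(f"real-data accuracy before fine-tuning: {before:.3f}")
    print(f"real-data accuracy after fine-tuning:  {after:.3f}")
```

The point of the sketch is only the order of operations: the model's weights are first fitted on the large synthetic set and then used as the starting point for a short training run on the small real set. It says nothing about how Task2Sim chooses its simulation parameters.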

Learning to generate images for task-specific learning

According to the researchers, building images from scratch presents fewer challenges, not least because it avoids the tedious job of labeling what is in each picture. The parameters of synthetic images, including the backdrop, lighting, and object positioning, can be adjusted freely. Unlimited training data can be produced, and the labels come at no extra cost.

The next stage is for IBM researchers to investigate whether they can train their classifier to surpass those trained on actual data.

Additionally, they plan to use synthetic data for more difficult vision tasks, including detecting people and animals in scenes and segmenting images into their parts.

A safer method of learning about the physical world

AI models may better understand the complexity and unpredictability of our world if they can be taught how objects and animals behave in a virtual environment. According to the researchers, you don’t need real data to understand that structure.
