Google AI Open-Sources ‘Google Scanned Objects’: A Dataset of 3D-Scanned Common Household Items for Deep Learning Research

Deep learning underlies many recent advances in computer vision and robotics; however, training deep learning models requires a great deal of data to generalize to new circumstances. Deep learning for computer vision has traditionally depended on datasets of millions of items scraped from the web. But building these datasets can be time-consuming, and labeling mistakes can skew perceptions of progress. Furthermore, this approach is difficult to extend to arbitrary three-dimensional shapes or real-world robotic data.

Many of the inherent restrictions of these datasets can be mitigated by using tools like Gazebo, MuJoCo, and Unity to simulate robots and their surroundings. However, simulation is only a rough approximation of reality: handmade models built from polygons and primitives typically do not match real objects well. Even if a scene is created directly from a 3D scan of a real-world environment, the movable items in that scan behave as fixed backdrop scenery and do not react the way real-world objects would. Because of these difficulties, there are few large libraries of high-quality 3D object models that can be incorporated into physical and visual simulations to provide the variation deep learning requires.

In 2011, Google robotics researchers began scanning items to create high-fidelity 3D representations of everyday household goods, to help robots recognize and grasp objects in their surroundings. It soon became apparent, however, that 3D models could serve more than object recognition and robotic grasping, including scene construction for physical simulations and 3D object display in end-user applications. As a result, the Scanned Objects project was expanded to deliver 3D experiences at Google scale, capturing a large number of 3D scans of household objects with a technique more efficient and economical than standard commercial-grade product photography.

Scanned Objects was a full-fledged project that included curating objects at scale for 3D scanning, building novel 3D scanning hardware, efficient 3D scanning software, fast 3D rendering software for quality verification, and specialized frontends for web and mobile users. The team also conducted human-computer interaction studies to develop effective interactions with 3D objects.

The team of researchers constructed a scanning apparatus to gather photos of an object from many directions under controlled and precisely calibrated conditions to build high-quality models. The scanning rig implemented a structured-light approach, inferring 3D geometry from camera images of light patterns projected onto an object. Two machine vision cameras were used to detect shape, a DSLR camera captured high-quality HDR color frames, and a computer-controlled projector displayed the patterns.
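To make the structured-light idea concrete, here is a minimal sketch of one core step such a scanner performs: decoding a stack of binary Gray-code patterns into a per-pixel projector column index, which can then be triangulated against the camera geometry. This is an illustrative simplification, not Google's actual pipeline; the pattern count, thresholding, and calibration steps are all assumptions.

```python
def gray_to_binary(bits):
    """Convert a reflected-Gray-code bit sequence (MSB first) to an integer."""
    b = bits[0]
    value = b
    for g in bits[1:]:
        b ^= g                      # binary bit = previous binary bit XOR gray bit
        value = (value << 1) | b
    return value

def decode_pixel(intensities, threshold=0.5):
    """Given the brightness a single camera pixel showed under each projected
    pattern, recover which projector column illuminated that pixel."""
    bits = [1 if v > threshold else 0 for v in intensities]
    return gray_to_binary(bits)
```

Gray codes are the usual choice here because adjacent projector columns differ by a single bit, so a mis-thresholded pixel is off by at most one column rather than jumping across the pattern.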


The early internal scanned models used protocol buffer metadata, high-resolution visuals, and formats unsuitable for simulation. Physical attributes such as mass were captured for specific items by weighing them during scanning, but surface properties such as friction or deformation were not.

As a result, after gathering the data, they created an automated workflow to address these difficulties and enable the use of scanned models in simulation systems. The automated pipeline filters out invalid or redundant objects, eliminates object mesh scans that do not meet virtual-environment requirements, and automatically assigns object names based on text descriptions of the objects.
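The filtering stage of such a pipeline might look like the following sketch: drop scans with no usable geometry and collapse exact duplicates. The file layout, validity rules, and OBJ-text representation here are hypothetical assumptions for illustration, not the released pipeline.

```python
import hashlib

def filter_scans(scans):
    """scans: dict mapping object name -> OBJ file text.
    Returns the names that pass basic validity and de-duplication checks."""
    seen = set()
    kept = []
    for name, obj_text in scans.items():
        lines = obj_text.splitlines()
        verts = [l for l in lines if l.startswith("v ")]
        faces = [l for l in lines if l.startswith("f ")]
        if not verts or not faces:
            continue                      # invalid: scan has no real geometry
        digest = hashlib.sha256(obj_text.encode()).hexdigest()
        if digest in seen:
            continue                      # redundant: exact duplicate of an earlier scan
        seen.add(digest)
        kept.append(name)
    return kept
```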

The Scanned Objects dataset comprises 1,030 scanned objects and their accompanying metadata, totaling 13 GB. Because these models are scanned rather than hand-modeled, they accurately reflect real-world properties rather than idealized recreations, making it easier to transfer knowledge from simulation to the real world.
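The models are packaged for Gazebo-style simulators in SDF form, an XML format describing a model's links, inertial properties, and mesh files. A minimal sketch of consuming one such model is below; the sample SDF content and the `model_summary` helper are illustrative assumptions, and the exact file layout in the released archives may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical example of the kind of model.sdf a simulator would load.
SAMPLE_SDF = """<sdf version="1.6">
  <model name="Mug">
    <link name="link">
      <inertial><mass>0.25</mass></inertial>
      <visual name="visual">
        <geometry><mesh><uri>meshes/model.obj</uri></mesh></geometry>
      </visual>
    </link>
  </model>
</sdf>"""

def model_summary(sdf_text):
    """Pull out the fields a simulation loader typically needs first."""
    model = ET.fromstring(sdf_text).find("model")
    mass = model.find(".//mass")
    mesh = model.find(".//mesh/uri")
    return {
        "name": model.get("name"),
        "mass_kg": float(mass.text) if mass is not None else None,
        "mesh_uri": mesh.text if mesh is not None else None,
    }
```

Because the mass was weighed during scanning rather than guessed, the `<mass>` field gives simulators a physically grounded value for contact and grasping dynamics.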

Over 25 papers and projects have already used the Scanned Objects dataset, spanning computer vision, computer graphics, robot manipulation, robot navigation, and 3D shape processing. In most of these applications, the dataset was used to generate synthetic training data for learning algorithms.

This article is written as a summary article by Marktechpost Staff based on the paper 'Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items'. All credit for this research goes to the researchers on this project. Check out the paper, article, try it here.
