To drive the next generation of web applications, Tensorflow released its first depth estimation API, called Depth API, and ARPortraitDepth, a model for estimating a depth map for portraiture. They also published 3D photo, a computational photography application that uses the anticipated depth and creates a 3D parallax effect on the given portrait image, further persuading people of the enormous possibilities of depth information. Tensorflow has also launched a live demo for people to try and convert their photographs into 3D versions.
Portrait Depth API is a deep learning model that outputs a depth map from a single color portrait photograph. To improve computing performance, the team used a lightweight U-Net architecture model. The design consists of an encoder that downscales the feature map resolution by a factor of two and a decoder that enhances the feature map resolution to the same level as the input. The encoder’s deep learning features are concatenated to the relevant layers with the exact spatial resolution as the decoders to obtain high-resolution signals for depth estimation. The training procedure entails forcing the decoder to make depth predictions with increasing resolutions at each layer and applying a loss to each one. This allows the decoder to calculate the appropriate depth by gradually increasing complexities.
Feeding an ML model with a high number of varying data points is critical to ensure accuracy and durability. The researchers manually created pairs of color and depth images using a high-quality performance capture system with camera setups to construct a superior training dataset. Actual data were also acquired from mobile phones having a front-facing depth sensor to ensure that the depth quality was not on par with the synthetically generated data, but the color images produced were beneficial in boosting the model’s performance. Before passing the image into the neural network of depth estimation, a segmentation model was run with MediaPipe and TensorFlow.js to improve the model’s robustness against background variation. The researchers believe that this portrait depth model could open the door to a plethora of new creative applications centered on the human body. The detailed instructions to install the portrait depth API, which is one of the variants of the new depth API, can be accessed here.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology(IIT), Goa. She is passionate about the fields of Machine Learning, Natural Language Processing and Web Development. She enjoys learning more about the technical field by participating in several challenges.