Researchers From Oxford Introduce ‘DOVE’, An AI Method That Learns Deformable 3D Objects By Just Watching Videos On YouTube

Learning deformable 3D objects from 2D images is an extremely difficult problem. Traditional approaches rely on explicit supervision, such as keypoints and templates, which restricts their applicability once the object is no longer in a controlled environment like a lab.

Researchers from Oxford propose a method called ‘DOVE’ (Deformable Objects from Videos) that learns deformable 3D objects without explicit keypoints or template shapes. The method relies on monocular videos, which naturally provide correspondences across time, and can be applied in the “wild”. DOVE predicts the 3D canonical shape, deformation, viewpoint and texture of a bird using only 2D images. This makes it much easier to animate the bird’s motion or change the perspective from which it is viewed.
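To make the roles of these predicted factors more concrete, here is a minimal sketch of how a canonical shape, a deformation and a viewpoint might be combined into 2D image points. It is an illustration only, not the authors' code (which has not been released): the function names, the single-axis rotation and the orthographic projection are all assumptions, and texture is left out.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def pose_and_project(canonical, deformation, yaw):
    """Hypothetical composition (an assumption, not DOVE's actual pipeline):
    canonical shape + per-vertex offsets, then a viewpoint rotation,
    then an orthographic projection to 2D."""
    projected = []
    for vertex, offset in zip(canonical, deformation):
        # Deformation: move each canonical vertex by its predicted offset.
        deformed = tuple(v + d for v, d in zip(vertex, offset))
        # Viewpoint: rotate the deformed vertex into the camera frame.
        x, y, _ = rotate_y(deformed, yaw)
        # Orthographic projection: drop the depth coordinate.
        projected.append((x, y))
    return projected

# Toy "shape" with three vertices; only the first vertex is deformed.
canonical = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
deformation = [(0.0, 0.5, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
image_points = pose_and_project(canonical, deformation, math.pi / 2)
```

Keeping the canonical shape separate from the deformation and the viewpoint is what makes animation and view changes straightforward in a setup like this: changing `yaw` moves the camera, while changing `deformation` moves the object.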

Dynamic 3D reconstruction of objects has long been a goal for scientists and engineers. DOVE reconstructs the shape of an object automatically from ordinary video clips, using the correspondences between frames, which show the object from slightly varying angles over time. Consider a few minutes of footage showing birds sitting on a tree: clips like this can serve as training data, and the trained model can then reconstruct the bird frame by frame without any additional annotations or instructions.

Unlike existing approaches, DOVE does not require explicit supervision such as keypoints, viewpoints or template shapes. It relies solely on the temporal information inherent in videos to learn the geometry of an object.

The result is a powerful way to create and animate 3D representations of objects. DOVE can even learn from YouTube videos without explicit geometric supervision such as keypoints or template shapes. The videos only need to be preprocessed with existing models for object detection and optical flow before training.
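One way optical flow can stand in for manual annotations is as a consistency check between frames. The sketch below is an assumption about how such a signal could be computed, not a description of DOVE's actual loss: it compares the 2D motion implied by a model's projected points in two consecutive frames with the flow measured by an off-the-shelf estimator. The function name and the toy numbers are hypothetical.

```python
def flow_consistency_loss(points_t, points_t1, observed_flow):
    """Mean squared error between the 2D motion implied by the model
    (points_t1 - points_t) and the flow measured at those points."""
    total = 0.0
    for (x0, y0), (x1, y1), (fx, fy) in zip(points_t, points_t1, observed_flow):
        total += (x1 - x0 - fx) ** 2 + (y1 - y0 - fy) ** 2
    return total / len(points_t)

# Toy data: two projected vertices in frame t and frame t+1.
points_t = [(0.0, 0.0), (1.0, 1.0)]
points_t1 = [(0.5, 0.0), (1.0, 2.0)]
# Flow an estimator might report at those two locations (made-up values).
observed_flow = [(0.5, 0.0), (0.0, 0.5)]

loss = flow_consistency_loss(points_t, points_t1, observed_flow)
print(loss)  # 0.125
```

The first vertex moves exactly as the flow says, so it contributes nothing; the second moves 0.5 units further than the flow, which accounts for the whole loss. A training procedure could reduce a quantity like this to make the predicted 3D motion agree with what the video shows.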


Github: (Code coming soon!)


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
