A New Study by NVIDIA, University of Toronto, McGill, and Vector Institute Proposes Neural Representation that Enables the Real-Time Rendering of Neural SDFs

Researchers from NVIDIA, the University of Toronto, McGill University, and the Vector Institute have proposed an efficient neural representation that enables real-time rendering of high-fidelity neural SDFs while delivering state-of-the-art (SOTA) geometric reconstruction quality.

Earlier studies have shown that neural networks can encode accurate 3D geometry without topology or resolution restrictions by learning a signed distance function (SDF). The researchers note that neural approximations of SDFs have become a leading choice for many SOTA computer vision and graphics applications. Such systems generally encode the SDF with a large, fixed-size multi-layer perceptron (MLP) that approximates complex shapes as implicit surfaces.

Neural SDFs have become very popular as a 3D geometry representation for graphics. An SDF is a function of position that returns the signed distance to the nearest point on the surface. Sphere tracing, a root-finding algorithm, is commonly used to render SDFs, and it requires many SDF evaluations per pixel. Although these functions are smooth and differentiable, neural SDFs are slow to render: they are generally built from large MLPs, which makes every sphere tracing step expensive.
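To make the cost argument concrete, here is a minimal sketch of sphere tracing in plain Python. It uses an analytic sphere SDF as a stand-in for a neural network; the function names, step limit, and tolerances are illustrative assumptions and are not taken from the paper. Each loop iteration is one SDF evaluation, which for a neural SDF would be one forward pass through the network.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-4, max_dist=100.0):
    """March along a ray, advancing by the SDF value at each step.

    Returns (hit, t, steps): whether a surface was found, the distance
    travelled along the ray, and the number of SDF evaluations used.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    direction = tuple(d / norm for d in direction)
    t = 0.0
    for step in range(1, max_steps + 1):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)          # for a neural SDF this is a full forward pass
        if dist < eps:         # close enough to the surface: report a hit
            return True, t, step
        t += dist              # safe step: no surface is closer than dist
        if t > max_dist:       # the ray left the scene without hitting anything
            return False, t, step
    return False, t, max_steps

# One ray aimed at the sphere and one that passes beside it (hypothetical camera setup).
hit, t, steps = sphere_trace(sphere_sdf, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0))
miss, _, miss_steps = sphere_trace(sphere_sdf, (0.0, 2.0, -3.0), (0.0, 0.0, 1.0))
```

The ray that misses still spends several evaluations before it exits, which hints at why per-pixel cost adds up quickly when each evaluation is a large MLP.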

Neural SDF encoding methods often attain SOTA geometry reconstruction quality. However, they are too computationally expensive for real-time graphics, because every pixel requires many forward passes through a large network.

The researchers instead encode geometry using a sparse voxel octree that stores feature vectors at the voxel corners, with the octree levels corresponding to levels of detail. A small MLP decodes these vectors without compromising reconstruction quality.
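The idea of corner features decoded by a small network can be sketched for a single voxel as follows. This is an illustration under stated assumptions, not the authors' implementation: the trilinear blending, the one-hidden-layer ReLU decoder, the layer sizes, and the choice to append the query position to the features are all assumptions made for the example, and the weights are random rather than trained.

```python
import random

def trilinear_features(corner_feats, u, v, w):
    """Blend the 8 corner feature vectors of one voxel.

    corner_feats[i][j][k] is the feature vector at corner (i, j, k), and
    (u, v, w) in [0, 1]^3 is the query position inside the voxel.
    """
    dim = len(corner_feats[0][0][0])
    out = [0.0] * dim
    for i, wu in ((0, 1.0 - u), (1, u)):
        for j, wv in ((0, 1.0 - v), (1, v)):
            for k, ww in ((0, 1.0 - w), (1, w)):
                weight = wu * wv * ww
                for c in range(dim):
                    out[c] += weight * corner_feats[i][j][k][c]
    return out

def tiny_mlp(x, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a scalar output (the predicted distance)."""
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(wi * hi for wi, hi in zip(w2, hidden)) + b2

# Hypothetical voxel whose corner features are [i, j, k, 1.0], so the blended
# feature at (u, v, w) should come out as approximately [u, v, w, 1.0].
corner_feats = [[[[float(i), float(j), float(k), 1.0] for k in (0, 1)]
                 for j in (0, 1)] for i in (0, 1)]
query = (0.25, 0.5, 0.75)
feat = trilinear_features(corner_feats, *query)

# Untrained random weights, for shape-checking only; a real model would learn these.
rng = random.Random(0)
hidden_dim, in_dim = 8, len(feat) + 3
w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
b1 = [0.0] * hidden_dim
w2 = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
distance = tiny_mlp(feat + list(query), w1, b1, w2, 0.0)
```

Because most of the shape information lives in the stored features, the decoder can stay small, so each distance query costs a handful of multiply-adds instead of a pass through a large network.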

The proposed architecture combines a sparse-octree data structure with a small surface extraction neural network. The sparse octree encodes the geometry and enables a geometric neural level of detail (LOD). Here, LOD refers to 3D shapes that are filtered so that feature variations are limited to approximately twice the pixel size in image space, which mitigates flickering and accelerates rendering by reducing model complexity. Combined with a tailored sphere tracing algorithm, the approach is both highly expressive and computationally efficient.
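As a rough illustration of how "about twice the pixel size" could translate into an octree level, the sketch below picks a level from the distance to the camera. It is a hypothetical heuristic, not the selection rule from the paper: it assumes the scene fits in a unit cube (so a voxel at level L has edge length 1 / 2^L) and a pinhole camera whose focal length is given in pixels. The function name and parameters are made up for the example.

```python
import math

def select_lod(depth, focal_px, max_lod, min_lod=0, target_px=2.0):
    """Pick the finest octree level whose voxels still span about target_px pixels.

    Assumes a unit-cube scene, so a voxel at level L has edge 1 / 2**L, and a
    pinhole camera, so an edge of size s at distance depth spans
    focal_px * s / depth pixels on screen.
    """
    if depth <= 0:
        return max_lod
    level = math.floor(math.log2(focal_px / (target_px * depth)))
    return max(min_lod, min(max_lod, level))

# Hypothetical camera with a 512-pixel focal length and an octree of depth 6.
near = select_lod(depth=1.0, focal_px=512.0, max_lod=6)     # clamped to the finest level
mid = select_lod(depth=16.0, focal_px=512.0, max_lod=6)     # level-4 voxels span 2 pixels here
far = select_lod(depth=1000.0, focal_px=512.0, max_lod=6)   # clamped to the coarsest level
```

Distant geometry is then decoded from coarse levels only, which is where the reduction in model complexity comes from.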

Source: https://arxiv.org/pdf/2101.10994.pdf

The proposed approach attained a rendering speedup of 2-3 orders of magnitude over baseline networks, along with SOTA reconstruction quality for complex shapes on both 3D geometric and 2D image-space metrics.

This approach depends heavily on the point samples used during training. It is therefore difficult to scale the representation to very large scenes or to very thin, volume-less geometry. Nonetheless, the researchers believe that this study is a major step forward in neural implicit function-based geometry. They hope it will be a useful component for potential applications such as scene reconstruction, ultra-precise robotics path planning, interactive content creation, and many more.

Paper: https://arxiv.org/pdf/2101.10994.pdf

Github: https://nv-tlabs.github.io/nglod/

Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advancements in technology and their real-life applications.
