Researchers from SJTU China Introduce TransLO: A Window-Based Masked Point Transformer Framework for Large-Scale LiDAR Odometry

Researchers from Shanghai Jiao Tong University and China University of Mining and Technology have developed TransLO, a LiDAR odometry network that integrates a window-based masked point transformer with self-attention and masked cross-frame attention. To handle sparse point clouds, TransLO applies a binary mask that filters out invalid and dynamic points.

The study reviews common LiDAR odometry methods, including Iterative Closest Point (ICP) variants and the widely used LOAM, which extracts features for motion estimation. It also highlights LOAM variants that incorporate ground segmentation for improved performance. Presented as the first transformer-based LiDAR odometry network, TransLO combines CNNs and transformers to produce global feature embeddings, enhancing outlier rejection and 3D scene understanding. Components such as the projection-aware mask, Window-based Masked Self Attention (WMSA), and Masked Cross Frame Attention (MCFA) are evaluated through ablation studies to demonstrate TransLO’s effectiveness.

LiDAR odometry is crucial for applications like SLAM, robot navigation, and autonomous driving, and has traditionally relied on ICP or feature-based approaches. Learning-based methods, particularly those built on CNNs, struggle to capture long-range dependencies and global features in point clouds. TransLO uses a window-based masked point transformer with self-attention and masked cross-frame attention to process point clouds and estimate pose efficiently.
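To make the idea of window-based masked self-attention more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a single attention head, identity query/key/value projections instead of learned ones, non-overlapping windows, and a mask value of 1 for valid pixels. The function names `masked_self_attention` and `windowed_masked_attention` are hypothetical.

```python
import math


def masked_self_attention(tokens, valid):
    """Single-head self-attention inside one window.

    Assumption: identity Q/K/V projections (a real model would learn them).
    Invalid keys receive a score of -inf, so their softmax weight is 0.
    Invalid queries produce an all-zero output vector.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for i, query in enumerate(tokens):
        if not valid[i]:
            outputs.append([0.0] * dim)
            continue
        scores = [
            scale * sum(a * b for a, b in zip(query, key)) if valid[j] else -math.inf
            for j, key in enumerate(tokens)
        ]
        top = max(scores)  # finite, because the query itself is a valid key
        weights = [math.exp(s - top) for s in scores]  # exp(-inf) == 0.0
        total = sum(weights)
        weights = [w / total for w in weights]
        outputs.append(
            [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs


def windowed_masked_attention(feature_map, mask_map, win):
    """Apply masked self-attention independently in each win x win window."""
    height, width = len(feature_map), len(feature_map[0])
    dim = len(feature_map[0][0])
    out = [[[0.0] * dim for _ in range(width)] for _ in range(height)]
    for r0 in range(0, height, win):
        for c0 in range(0, width, win):
            cells = [
                (r, c)
                for r in range(r0, min(r0 + win, height))
                for c in range(c0, min(c0 + win, width))
            ]
            attended = masked_self_attention(
                [feature_map[r][c] for r, c in cells],
                [mask_map[r][c] for r, c in cells],
            )
            for (r, c), vec in zip(cells, attended):
                out[r][c] = vec
    return out
```

Restricting attention to a window keeps the cost per pixel bounded by the window size rather than the full image, while the mask keeps empty or rejected pixels from contributing to any valid pixel's output.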

TransLO’s window-based masked point transformer processes point clouds in three stages: a 2D projection, a local transformer that captures long-range dependencies, and an MCFA module that predicts the pose. Point clouds are first projected onto a cylindrical surface, and stride-based sampling layers with WMSA then encode features. CNNs enlarge the receptive field, and a projection-aware mask addresses point cloud sparsity. A pose-warping operation supports iterative refinement of the estimate. Ablation studies confirm the contribution of each component, and TransLO outperforms existing methods on the KITTI odometry dataset.
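The cylindrical projection and its accompanying validity mask can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's code: the grid size and the vertical field of view (+2.0° to -24.8°, typical of a 64-beam sensor like the one used for KITTI) are assumed defaults, the function name `cylindrical_projection` is hypothetical, and keeping the closest point per pixel is one common convention among several.

```python
import math


def cylindrical_projection(points, height=64, width=1800,
                           fov_up_deg=2.0, fov_down_deg=-24.8):
    """Project (x, y, z) points onto an H x W cylindrical grid.

    Returns (grid, mask): grid[r][c] holds the closest point that landed in
    the pixel (or None), and mask[r][c] is 1 for filled pixels, 0 for empty.
    All default parameters are assumptions for illustration.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    grid = [[None] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    best_range = [[math.inf] * width for _ in range(height)]
    for x, y, z in points:
        planar = math.hypot(x, y)
        if planar == 0.0:
            continue  # azimuth is undefined directly above or below the sensor
        elevation = math.atan2(z, planar)
        if elevation < fov_down or elevation > fov_up:
            continue  # outside the assumed vertical field of view
        azimuth = math.atan2(y, x)  # in [-pi, pi]
        col = int(0.5 * (1.0 - azimuth / math.pi) * width) % width
        row = min(height - 1, int((fov_up - elevation) / fov * height))
        rng = math.sqrt(x * x + y * y + z * z)
        if rng < best_range[row][col]:  # keep the closest point per pixel
            best_range[row][col] = rng
            grid[row][col] = (x, y, z)
            mask[row][col] = 1
    return grid, mask


# Usage: a tiny grid makes the sparsity visible; most pixels stay empty.
grid, mask = cylindrical_projection(
    [(1.0, 0.0, 0.0), (0.0, 3.0, 0.2)],
    height=4, width=8, fov_up_deg=10.0, fov_down_deg=-10.0,
)
filled = sum(map(sum, mask))
```

The returned `mask` is the kind of binary map a projection-aware mask could start from: empty pixels are marked 0 so that later attention layers can ignore them.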

Experimental results on the KITTI odometry dataset show an average rotational RMSE of 0.500°/100 m and a translational RMSE of 0.993%. TransLO outperforms recent learning-based methods and even surpasses LOAM on most evaluation sequences. Ablation studies highlight the significance of WMSA and of the binary mask, which filters outliers. The MCFA module reduces both translation and rotation errors by establishing soft correspondences between frames, underscoring its role in the model’s success.
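The phrase "soft correspondences" can be illustrated with a small cross-attention sketch in plain Python. It is only a conceptual approximation of what a masked cross-frame attention module might compute: the article does not describe MCFA's internals, so the identity projections, the single head, and the hypothetical function names `soft_correspondences` and `soft_match` are all assumptions.

```python
import math


def soft_correspondences(query_feats, key_feats, key_mask):
    """Attention weights from frame-1 features (queries) to frame-2 features (keys).

    Each row is a probability distribution over frame-2 points: a "soft"
    match rather than a single hard nearest neighbour. Masked keys get
    weight 0. Assumption: identity projections, single head.
    """
    scale = 1.0 / math.sqrt(len(query_feats[0]))
    rows = []
    for query in query_feats:
        scores = [
            scale * sum(a * b for a, b in zip(query, key)) if keep else -math.inf
            for key, keep in zip(key_feats, key_mask)
        ]
        top = max(scores)
        if top == -math.inf:  # every key is masked out
            rows.append([0.0] * len(key_feats))
            continue
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def soft_match(weights, key_points):
    """Weighted average of frame-2 points for each frame-1 query."""
    return [
        tuple(sum(w * p[d] for w, p in zip(row, key_points)) for d in range(3))
        for row in weights
    ]


# Usage: the third key is masked (e.g. flagged as invalid or dynamic).
weights = soft_correspondences(
    [[4.0, 0.0]],
    [[4.0, 0.0], [0.0, 4.0], [4.0, 0.0]],
    [1, 1, 0],
)
matched = soft_match(weights, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (9.0, 9.0, 9.0)])
```

Because the masked key receives zero weight, a point rejected by the mask cannot pull the soft match toward itself, which is one plausible reading of why the mask and MCFA work well together in the ablations.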

The TransLO framework relies on a projection step that may result in information loss, potentially affecting odometry accuracy. The study lacks a detailed analysis of TransLO’s computational complexity, which hinders a thorough understanding of its efficiency compared to other methods. Evaluation is confined to the KITTI odometry dataset, raising questions about the method’s generalizability to diverse scenarios. Comparisons with non-transformer methods are limited, which restricts understanding of TransLO’s relative strengths and weaknesses.

The proposed TransLO network, an end-to-end window-based masked point transformer for LiDAR odometry, integrates CNNs and transformers to enhance global feature embeddings and outlier rejection, achieving state-of-the-art performance on the KITTI odometry dataset. Key components include WMSA for long-range dependencies and MCFA for frame association and pose prediction. Ablation studies confirm the importance of WMSA, the binary mask for outlier filtering, and the crucial role of MCFA in establishing soft correspondences. TransLO demonstrates superior accuracy, efficiency, and global feature focus for large-scale localization and navigation.

Check out the Paper and Github. All credit for this research goes to the researchers of this project.

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.