Robot Information Intelligent Perception and Navigation Based on Multi-Sensor Fusion

Xibin Li; Yiyang Luo; Mingsong Bao; Haoming Sun

Authors

Xibin Li, School of Safety Engineering, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China
Yiyang Luo, Purchasing Department, Sinopec International Business Tianjin Co., Ltd., Tianjin 300042, Tianjin, China
Mingsong Bao, School of Safety Engineering, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China
Haoming Sun, Technical Research and Development Department, Shandong Tesla Robot Co., Ltd., Yantai 264006, Shandong, China

Abstract

Sensors differ in time synchronization, data dimensionality, and sampling frequency, which makes deep fusion of heterogeneous data difficult. In addition, existing high-precision fusion algorithms rely heavily on computational resources and cannot meet the needs of lightweight robots with limited computing power. To address these issues, this work applies a fusion method based on a multimodal convolutional neural network (2D-ResNet-50 + 3D-CNN) and a cross-modal attention mechanism. Data preprocessing and time alignment first bring the multi-sensor data into a synchronized, unified format. Convolutional neural networks then extract features from visual image data, LiDAR point cloud data, and inertial measurement unit (IMU) data, and a cross-modal attention mechanism fuses the information from the different sensors. A modular architecture is used to optimize the computational efficiency of the system: the system is divided into multiple independent modules, each responsible for a specific task, and an event trigger mechanism dynamically activates and schedules the relevant modules to enhance the system’s intelligence. The method is deployed on NVIDIA’s Jetson Xavier NX platform, and the experiments are conducted under the Robot Operating System (ROS) framework. Experiments show that the robot’s control error does not exceed 0.25 when performing path tracking tasks, and that the path planning time in various environments does not exceed 150 milliseconds. The method can improve perception precision while maintaining high real-time performance and efficiency with limited computational resources, significantly improving the robot’s navigation performance in complex dynamic environments.

Keywords: multi-sensor fusion, robot navigation, deep learning, cross-modal attention mechanism, modular architecture
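
The abstract names a cross-modal attention mechanism but does not give its formulation. As a minimal sketch, assuming standard scaled dot-product attention in which image features act as queries and LiDAR and IMU features act as keys and values, the fusion step might look like the following. The function name `cross_modal_attention`, the feature sizes, and the random projection matrices are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Let one modality (queries) attend to features of other modalities.

    query_feats:   (n_q, d) features of the querying modality
    context_feats: (n_c, d) features of the attended modalities
    w_q, w_k, w_v: (d, d_k) projection matrices (hypothetical, untrained here)
    Returns the fused features (n_q, d_k) and the attention weights (n_q, n_c).
    """
    q = query_feats @ w_q
    k = context_feats @ w_k
    v = context_feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product similarity
    weights = softmax(scores, axis=-1)       # each query row sums to 1
    return weights @ v, weights


# Toy features standing in for CNN outputs (sizes are assumptions).
rng = np.random.default_rng(0)
d = 16
image_feats = rng.normal(size=(4, d))  # e.g. 4 image feature tokens
lidar_feats = rng.normal(size=(6, d))  # e.g. 6 point cloud feature tokens
imu_feats = rng.normal(size=(2, d))    # e.g. 2 IMU feature tokens

w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

# Image tokens attend over the concatenated LiDAR and IMU tokens.
context = np.concatenate([lidar_feats, imu_feats], axis=0)
fused, weights = cross_modal_attention(image_feats, context, w_q, w_k, w_v)
```

In this sketch each image token receives a weighted mixture of the LiDAR and IMU features, with the weights indicating how much each context token contributes. A trained system would learn the projection matrices rather than sample them at random.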

Cite As

X. Li, Y. Luo, M. Bao, H. Sun, "Robot Information Intelligent Perception and Navigation Based on Multi-Sensor Fusion", Engineering Intelligent Systems, vol. 34, no. 2, pp. 273-284, 2026.
