In this paper, the authors proposed a novel approach called FusionPillars to fuse multi-sensor data, lidar and camera, for unmanned system applications. The approach consists of three branches: a point-based branch, a voxel-based branch, and an image-based branch. Two modules were designed to enhance the voxel-wise features in the pseudo-image: the set abstraction self (SAS) fusion module and the pseudo-view cross (PVC) fusion module. By exploiting the relationship between point-wise and voxel-wise features, the SAS fusion module self-fused the point-based and voxel-based branches to enrich the spatial information of the pseudo-image. Additionally, the PVC fusion module introduced RGB information as auxiliary input and cross-fused the pseudo-image with RGB images at different scales to supplement the color information of the pseudo-image. Experiments showed that FusionPillars outperformed other state-of-the-art approaches, achieving better detection accuracy for small objects. This article was authored by Jing Zhang, Da Su, Yu Zongli, and others. We are article.tv, links in the description below.
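To make the self-fusion idea more concrete, here is a minimal, hypothetical sketch of combining point-wise and voxel-wise summaries per pillar. It is not the paper's actual SAS module (which operates on learned, multi-channel features); it uses scalar per-point features, a max-pool as the voxel-wise summary, a mean as the point-wise summary, and an assumed `point_weight` blending parameter, purely for illustration.

```python
from collections import defaultdict

def voxelize(points, voxel_size):
    """Group (x, y, feature) points into pillar cells keyed by grid index."""
    pillars = defaultdict(list)
    for x, y, feat in points:
        key = (int(x // voxel_size), int(y // voxel_size))
        pillars[key].append(feat)
    return pillars

def sas_fuse(pillars, point_weight=0.5):
    """Toy self-fusion: blend the max-pooled (voxel-wise) summary of each
    pillar with the mean (point-wise) summary of the same pillar."""
    fused = {}
    for key, feats in pillars.items():
        voxel_feat = max(feats)               # voxel-wise summary (max pool)
        point_feat = sum(feats) / len(feats)  # point-wise summary (mean)
        fused[key] = (1 - point_weight) * voxel_feat + point_weight * point_feat
    return fused

# Three points: two fall in pillar (0, 0), one in pillar (1, 0).
points = [(0.2, 0.3, 1.0), (0.4, 0.1, 3.0), (1.2, 0.2, 2.0)]
fused = sas_fuse(voxelize(points, voxel_size=1.0))
```

In the real module, the blend would be replaced by learned layers, but the structure is the same: two summaries of the same spatial cell are combined so the pseudo-image keeps fine-grained point information.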