The performance of object detection on single images has improved significantly with recent progress in artificial intelligence. However, existing techniques focus on detecting objects in 2D images, which makes them difficult to use in real-world robot applications that carry out varied tasks such as detecting objects and then manipulating or avoiding them. For such robot applications, 3D object detection is more suitable than 2D detection. Research on 3D detection has of course been conducted in the academic community, but its performance does not yet match that of 2D object detection. This is because the 3D points produced by a 3D scanner are too sparse to capture finely structured and small objects such as bicycles, people, and road signs.
In this dissertation, we propose a camera-LiDAR sensor fusion method for enhancing 3D object detection and pose estimation in robotic applications, presented in two parts.
The first part of this dissertation is a 3D object proposal method that reduces the search region for an object. By proposing regions likely to contain an object, rather than searching the entire scene, we improve both the time efficiency and the accuracy of detection. We propose a 3D object proposal method that applies an object proposal technique designed for 2D images to 3D data. By exploiting depth discontinuities in 3D, the proposed method achieves higher recall with fewer proposals than its 2D counterpart.
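To illustrate how depth discontinuities can bound candidate object regions, the following is a minimal sketch (not the dissertation's actual algorithm): pixels are grouped into 4-connected components whose neighboring depths differ by less than a threshold, so large depth jumps act as region borders, and each sufficiently large component yields a proposal box. The function name and thresholds are hypothetical.

```python
import numpy as np
from collections import deque

def depth_proposals(depth, disc_thresh=0.5, min_size=4):
    """Group pixels into candidate regions: a depth jump larger than
    disc_thresh between 4-neighbors is treated as a discontinuity
    that separates regions. Returns (y0, x0, y1, x1) boxes for
    components with at least min_size pixels."""
    h, w = depth.shape
    labels = -np.ones((h, w), dtype=int)
    boxes = []
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            # BFS flood fill bounded by depth discontinuities
            q = deque([(sy, sx)])
            labels[sy, sx] = next_label
            pix = []
            while q:
                y, x = q.popleft()
                pix.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny, nx] == -1
                            and abs(depth[ny, nx] - depth[y, x]) < disc_thresh):
                        labels[ny, nx] = next_label
                        q.append((ny, nx))
            if len(pix) >= min_size:
                ys, xs = zip(*pix)
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
            next_label += 1
    return boxes
```

On a toy depth map with a near object in front of a far background, the object's pixels form one component and the background another, so the object's extent is recovered without scanning every window.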
The second part is a depth completion method that produces dense depth maps from the sparse 3D point measurements of LiDAR data. A major bottleneck in 3D object detection and pose estimation comes from the sparsity of the LiDAR sensor itself. The proposed method propagates the initial sparse depth points across the corresponding image under a geometric consistency constraint: the vector from a 3D point to its neighbor is assumed perpendicular to the neighbor's surface normal, i.e., the surface is locally planar. In this step, we additionally propose an accurate surface normal estimation method to handle over-smoothing artifacts at depth boundaries. We demonstrate that the estimated dense depth maps benefit robotic applications in real-world environments. However, the computational complexity of our depth completion remains a problem.
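The locally planar constraint can be sketched as follows. If a neighboring pixel q has 3D point X_q and surface normal n_q, the tangent plane n_q · X = n_q · X_q intersected with the unknown pixel's viewing ray r_p = K⁻¹[u, v, 1]ᵀ gives d_p = (n_q · X_q) / (n_q · r_p). The code below is a simplified, assumption-laden sketch of such propagation (naive per-pixel loops, a single neighbor per update, hypothetical function name), not the dissertation's optimized implementation.

```python
import numpy as np

def propagate_depth(sparse, normals, K, iters=50):
    """Iteratively fill unknown pixels (depth == 0) with the depth
    implied by an adjacent known pixel's tangent plane:
        d_p = (n_q . X_q) / (n_q . r_p),
    where X_q = d_q * r_q and r = K^-1 [u, v, 1]^T is the viewing ray."""
    h, w = sparse.shape
    depth = sparse.copy()
    Kinv = np.linalg.inv(K)
    # Precompute each pixel's viewing ray K^-1 [u, v, 1]^T
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    rays = np.stack([us, vs, np.ones_like(us)], axis=-1) @ Kinv.T  # (h, w, 3)
    for _ in range(iters):
        known = depth > 0
        if known.all():
            break
        new_depth = depth.copy()
        for y in range(h):
            for x in range(w):
                if known[y, x]:
                    continue
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and known[ny, nx]:
                        Xq = depth[ny, nx] * rays[ny, nx]   # neighbor 3D point
                        n = normals[ny, nx]                 # neighbor normal
                        denom = n @ rays[y, x]
                        if abs(denom) > 1e-6:
                            # Intersect viewing ray with neighbor's tangent plane
                            new_depth[y, x] = (n @ Xq) / denom
                            break
        depth = new_depth
    return depth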
Finally, we propose a selective depth propagation method to resolve the computational complexity. Using our object proposals, we generate selected regions for depth completion and then propagate the sparse 3D depth into those regions only. As a result, our unified method reduces computation time by a factor of 10 compared to depth completion over the whole image.
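The region selection step amounts to restricting completion to the union of proposal boxes. A minimal sketch, with hypothetical box coordinates on a KITTI-sized image, shows why this saves work: only the pixels inside the mask are processed.

```python
import numpy as np

def selective_mask(h, w, boxes):
    """Boolean mask covering the union of (y0, x0, y1, x1) proposal
    boxes; depth completion is then run only where mask is True."""
    mask = np.zeros((h, w), dtype=bool)
    for y0, x0, y1, x1 in boxes:
        mask[y0:y1 + 1, x0:x1 + 1] = True
    return mask

# Hypothetical proposals on a 375x1242 image: the boxes cover only a
# small fraction of the pixels, which is where the reported speedup
# over whole-image completion comes from.
h, w = 375, 1242
boxes = [(100, 200, 220, 330), (150, 700, 300, 900)]
mask = selective_mask(h, w, boxes)
covered = mask.sum() / (h * w)
```

Here `covered` is roughly 0.1, i.e., about a tenth of the image is processed, consistent with an order-of-magnitude reduction in completion work when the proposals are this sparse.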