LiDAR sensors provide accurate depth measurements, but the resulting data are sparse due to the sensors' inherent characteristics, which is insufficient for high-level applications such as autonomous driving. Accordingly, many studies have investigated generating dense depth information. Monocular depth estimation is a technique that estimates a depth map using only color images and can be applied in many devices and applications. However, since a color image is a 2D projection of a 3D scene, it does not contain sufficient information for estimating depth. In this paper, we propose a deep learning network that extracts features from a segmentation map to estimate a more accurate depth map. Depth completion, which generates a dense depth map from a sparse one, is the most accurate depth estimation technique. In this task, the method of fusing the two input modalities and the refinement method are important. We therefore also propose a two-stage network consisting of a shallow feature fusion module, multi-perspective layers, and a confidence guidance layer. The proposed monocular depth estimation model, which contains an efficient feature extraction structure for the segmentation map, infers a more accurate depth map than the base model on the KITTI depth prediction validation dataset. In addition, the proposed depth completion model runs significantly faster than the top-ranked models on the KITTI depth completion online leaderboard while providing a high-accuracy depth map.