Depth sensing is a rapidly growing market as demand for depth sensing technology in hand-held devices increases. Porting depth sensing cameras to mobile devices is challenging because of the power, size, cost, and speed constraints of hand-held devices. Stereo vision, which has its roots in human binocular vision, is arguably the most thoroughly investigated depth extraction scheme. However, stereo vision has high computational complexity, which results in high power consumption and long computation times. To achieve small area and real-time operation, cost-efficient hardware acceleration of depth extraction is needed. In this dissertation, a fast depth-extraction system for a disparity-based depth sensing camera is proposed. The trade-off between hardware cost and accuracy is optimized through mathematical modeling and analysis of depth accuracy. The proposed accelerator is fully pipelined, achieves high frame rates, estimates depth online as each input pixel arrives, and does not require a frame buffer. It provides depth at 30 frames per second at a resolution of 1920 x 1080. Furthermore, the proposed system has low power consumption: at this frame rate and resolution it requires only 290.76 mW. The proposed system is then extended to a scalable depth-extraction hardware accelerator for multi-baseline stereo camera systems, in which multiple stereo pairs observe depth over different depth ranges. For each stereo pair, the baseline and the depth range are chosen to obtain the best compromise between hardware cost and depth accuracy. This scalability makes the proposed hardware accelerator an ideal choice for depth extraction systems in constrained environments.
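
As a rough illustration of the quantities discussed above, the following Python sketch uses the standard triangulation relation for a rectified stereo pair, Z = f * B / d, and its first-order depth step, dZ ≈ Z^2 / (f * B) * dd, to show why the choice of baseline interacts with the depth range. It also computes the pixel throughput implied by 1920 x 1080 at 30 frames per second. The focal length and baselines in the usage section are hypothetical values chosen for illustration, not parameters of the proposed system, and the energy-per-pixel figure assumes that 290.76 mW is the average power at that operating point.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulated depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_step(focal_px, baseline_m, depth_m, disparity_step_px=1.0):
    """First-order depth change for one disparity step: Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_step_px


def pixel_rate(width, height, fps):
    """Pixels per second that a streaming accelerator must process."""
    return width * height * fps


if __name__ == "__main__":
    # Throughput implied by the figures quoted in the abstract.
    rate = pixel_rate(1920, 1080, 30)
    print(f"pixel rate: {rate} pixels/s")

    # Assumption: 290.76 mW is the average power at this operating point.
    energy_nj = 0.29076 / rate * 1e9
    print(f"energy per pixel: {energy_nj:.2f} nJ")

    # Hypothetical camera: 1000 px focal length, two candidate baselines.
    focal_px = 1000.0
    for baseline_m in (0.05, 0.20):
        for depth_m in (1.0, 4.0):
            disparity = focal_px * baseline_m / depth_m
            step = depth_step(focal_px, baseline_m, depth_m)
            print(f"B={baseline_m:.2f} m, Z={depth_m:.1f} m: "
                  f"disparity={disparity:.1f} px, depth step={step:.3f} m")
```

Under these assumed values, the wider baseline gives a finer depth step at a given distance but produces larger disparities for near objects, which in a hardware matcher typically means a wider search range. This is one way to read the compromise between hardware cost and depth accuracy that motivates assigning each stereo pair its own baseline and depth range.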