Representing human-made objects as a collection of base primitives has a long history in computer vision and reverse engineering. For high-resolution point cloud scans, the challenge is to detect both large primitives and those explaining the detailed parts. While the classical RANSAC approach requires case-specific parameter tuning, state-of-the-art networks are limited by the memory consumption of their backbone modules, such as PointNet++ [27], and hence fail to detect the fine-scale primitives. We present Cascaded Primitive Fitting Networks (CPFN), which rely on an adaptive patch sampling network to assemble detection results of global and local primitive detection networks. As a key enabler, we present a merging formulation that dynamically aggregates
the primitives across global and local scales. Our evaluation demonstrates that CPFN improves the state-of-the-art SPFN performance by 13–14% on high-resolution point cloud datasets and specifically improves the detection of fine-scale primitives by 20–22%. Our code is available at: https://github.com/erictuanle/CPFN