3D graphics hardware for mobile multimedia devices must be implemented within limited memory bandwidth, area, and power budgets. Among the various bandwidth-saving techniques, tessellation reduces the amount of geometry data transferred by generating highly detailed geometry from coarse meshes inside the 3D graphics hardware. Despite its effectiveness, only a few high-performance gaming systems have integrated dedicated tessellators, which require an additional floating-point datapath and complex control logic.
In this thesis, we propose the architecture of a shader-based tessellator for mobile 3D graphics. The proposed tessellator incurs a negligible hardware penalty because the floating-point computations of tessellation are accelerated by the existing GPU pipeline, and only the tessellation-specific control logic is handled by an additional hardware unit. Tightly coupled with a vertex shader, the additional unit dynamically produces the topological configurations and parametric coordinates of refinement patterns in the form of indexed triangle strips for object-level adaptive tessellation. The crack-free topological configurations improve the efficiency of the vertex cache, thereby avoiding redundant shader operations.
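As a rough illustration of what a refinement pattern contains, the following Python sketch generates the barycentric parametric coordinates and one indexed triangle strip per row for a triangular patch refined uniformly at level n. It is a software model under stated assumptions, not the hardware unit described in this thesis: the function names `row_start` and `refinement_pattern`, the row-major vertex numbering, and the strip ordering are all hypothetical choices made for the example.

```python
def row_start(i, n):
    """Index of the first vertex in row i of a level-n triangular pattern.

    Row k holds (n - k + 1) vertices, so this is the sum of the sizes
    of rows 0 .. i-1 (hypothetical row-major numbering).
    """
    return i * (n + 1) - i * (i - 1) // 2


def refinement_pattern(n):
    """Return (coords, strips) for a triangle split into n segments per edge.

    coords: list of barycentric (u, v, w) tuples, one per unique vertex.
    strips: list of index lists, one triangle strip per pair of adjacent rows.
    """
    if n < 1:
        raise ValueError("tessellation level must be at least 1")

    # Parametric coordinates: row i fixes v = i/n, column j fixes u = j/n.
    coords = []
    for i in range(n + 1):
        for j in range(n - i + 1):
            coords.append((j / n, i / n, (n - i - j) / n))

    # Topology: zig-zag between row i (lower) and row i+1 (upper).
    strips = []
    for i in range(n):
        lower = row_start(i, n)
        upper = row_start(i + 1, n)
        m = n - i + 1  # number of vertices in the lower row
        strip = []
        for j in range(m - 1):
            strip.append(lower + j)
            strip.append(upper + j)
        strip.append(lower + m - 1)  # closing vertex of the lower row
        strips.append(strip)
    return coords, strips


if __name__ == "__main__":
    coords, strips = refinement_pattern(2)
    print(len(coords), "unique vertices")
    for s in strips:
        print(s)
```

In this model a level-n pattern has (n + 1)(n + 2)/2 unique vertices but n² triangles, so most vertices are referenced by several triangles. Because the strips refer to vertices by index, a vertex that has already been shaded can be recognized and reused from a vertex cache instead of being shaded again.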
In addition to the tessellation functionality, the shader architecture is enhanced for area and energy efficiency as well as higher performance. The latency of the floating-point datapath is reduced by adopting fast DP4 units. The floating-point computations of the special function unit are also performed by the DP4 units to improve area efficiency. Clock gating, applied through both tool-based automatic insertion and manual insertion of clock-gating cells, reduces unnecessary power dissipation in idle modules. We additionally reduce redundant on-chip memory accesses by exploiting the operational characteristics of the multi-threaded shader architecture and by reducing the size of the frequently accessed general-purpose registers.
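To make the idea of sharing a DP4 unit with special-function evaluation concrete, the sketch below shows one way it could work in principle: a cubic polynomial approximation written as a single four-component dot product. This is an assumption made for illustration only. The abstract does not state which approximation scheme the special function unit uses, and the names `dp4` and `poly3_via_dp4` and the choice of a truncated Taylor series for the sine function are hypothetical.

```python
import math


def dp4(a, b):
    """Four-component dot product, the operation a DP4 unit computes."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def poly3_via_dp4(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + c3*x^3 as one DP4 operation.

    The coefficient vector is dotted with the power vector (1, x, x^2, x^3).
    """
    x2 = x * x
    return dp4(coeffs, (1.0, x, x2, x2 * x))


# Assumed example coefficients: Taylor series of sin(x) truncated at x^3.
SIN_COEFFS = (0.0, 1.0, 0.0, -1.0 / 6.0)

if __name__ == "__main__":
    x = 0.5
    approx = poly3_via_dp4(SIN_COEFFS, x)
    print(approx, math.sin(x), abs(approx - math.sin(x)))
```

The point of the sketch is only that a polynomial evaluation maps onto the same multiply-accumulate structure as a dot product, which is one reason a separate floating-point datapath for special functions might be avoidable.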
The proposed geometry processor is fabricated on three chips...