A network-on-chip (NoC) is applied to achieve extensive communication bandwidth required for parallel computing. A 125 GOPS NoC-based parallel processor with a bio-inspired visual attention engine (VAE) exploits both data and object-level parallelism while dissipating 583 mW by packet-based power management. The use of more PEs, VAE, and low latency NoC enables higher performance and power efficiency over the previous design. NoC-based parallel processor consisting of 12 IPs: a main processor, 8 PE clusters (PECs), VAE, a matching accelerator (MA), and an external interface. The ARMlO-compatible 32b main processor controls the overall system operations. The VAE detects the feature points on the entire image by neural network algorithms like contour extraction. The 8 PECs perform data-intensive image processing applications such as filtering and histogram calculations. The MA accelerates nearest neighbor search to obtain a final recognition result in real-time. The DMA-like external interface distributes automatically the corresponding image data to each PEC to reduce system overhead. Each core is connected to the NoC via a network interface.