Recently proposed HBM-PIM architectures aim to accelerate bandwidth-sensitive operations by placing PIM units next to the memory banks. Because of this architectural constraint, operand locality must be guaranteed: the operands of a PIM operation must be local to the PIM unit that executes it. However, state-of-the-art GPUs use a hash function to map physical addresses to memory locations, which scatters the operands of a single operation across different banks, so the PIM operation cannot be performed. In this thesis, we propose a software and hardware architecture that lets GPUs with an HBM-PIM system use the PIM units by rearranging the PIM operands in the HBM logic die so that operand locality is guaranteed.
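The locality problem can be illustrated with a minimal sketch. The XOR-based bank hash, the 32-byte access granularity, the 16-bank count, and the operand base addresses below are all assumptions chosen for illustration; they are not the mapping of any real GPU or of the system proposed here. The sketch only compares how many operand pairs of an element-wise operation fall into the same bank under a plain linear mapping and under a hashed mapping.

```python
# Illustrative only: the hash function and every constant here are assumptions,
# not the address mapping of an actual GPU or HBM-PIM device.
LINE_BYTES = 32   # assumed access granularity in bytes
NUM_BANKS = 16    # assumed number of banks
BANK_BITS = 4     # log2(NUM_BANKS)


def linear_bank(addr: int) -> int:
    """Bank index taken directly from the low bits of the line address."""
    return (addr // LINE_BYTES) % NUM_BANKS


def hashed_bank(addr: int) -> int:
    """Hypothetical hash: XOR the low bank bits with the next-higher bits."""
    line = addr // LINE_BYTES
    low = line % NUM_BANKS
    high = (line >> BANK_BITS) % NUM_BANKS
    return low ^ high


def colocated_pairs(bank_of, base_a: int, base_b: int, n_lines: int) -> int:
    """Count lines i for which A[i] and B[i] land in the same bank."""
    return sum(
        bank_of(base_a + i * LINE_BYTES) == bank_of(base_b + i * LINE_BYTES)
        for i in range(n_lines)
    )


# Two hypothetical operand arrays, each 16 lines long, placed back to back.
BASE_A = 0x0000
BASE_B = 0x0200
N_LINES = 16

if __name__ == "__main__":
    print("linear mapping, co-located pairs:",
          colocated_pairs(linear_bank, BASE_A, BASE_B, N_LINES))
    print("hashed mapping, co-located pairs:",
          colocated_pairs(hashed_bank, BASE_A, BASE_B, N_LINES))
```

Under these assumptions, all 16 operand pairs share a bank with the linear mapping and none do with the hashed mapping, which is the situation that rearranging operands in the logic die is meant to resolve.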