In this work, we propose Pin-opt, a deep reinforcement learning (DRL) framework for building a reusable solver that optimizes pin assignment to minimize signal integrity (SI) and power integrity (PI) degradation in microbump packages. The increasing data rates of high-bandwidth systems have made SI/PI issues critical to the reliability of these systems. Previous research has used meta-heuristic methods to optimize pin assignment and reduce SI/PI degradation in similar vertical interconnections, but these approaches are inflexible: they produce problem-specific solutions suitable only for square-shaped pin arrangements. Pin-opt instead leverages a learning-based method to provide a practical solution applicable to pin maps of any shape and with very large pin counts. Because pins are represented as graphs during training, Pin-opt adapts to any pin arrangement and demonstrates significant performance improvements on large-scale pin assignment problems. We evaluate the performance, computational cost, reusability, and scalability of Pin-opt against a genetic algorithm (GA), a conventional meta-heuristic method for optimization tasks. To demonstrate its practicality, we also apply Pin-opt to a pin map of high-bandwidth memory (HBM).
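
To make the graph representation concrete, the following is a minimal sketch of how a non-square pin map could be encoded as a graph. The abstract does not specify how edges or node features are constructed, so everything here is an illustrative assumption rather than the method used by Pin-opt: the helper `build_pin_graph`, the distance threshold `max_dist`, and the L-shaped example layout are all hypothetical.

```python
from itertools import combinations
from math import hypot


def build_pin_graph(positions, max_dist):
    """Build an adjacency list for a pin map (hypothetical helper).

    positions: list of (x, y) pin coordinates, in units of pin pitch.
    max_dist:  pins no farther apart than this are connected (assumed rule).
    Returns a dict mapping each pin index to its list of neighbor indices.
    """
    adjacency = {i: [] for i in range(len(positions))}
    # Compare every unordered pair of pins exactly once.
    for (i, (xi, yi)), (j, (xj, yj)) in combinations(enumerate(positions), 2):
        if hypot(xi - xj, yi - yj) <= max_dist:
            adjacency[i].append(j)
            adjacency[j].append(i)
    return adjacency


# An L-shaped pin map: three pins in a column plus two extending to the right.
l_shape = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)]
graph = build_pin_graph(l_shape, max_dist=1.0)

for pin, neighbors in graph.items():
    print(pin, neighbors)
```

Because the graph is built only from pin coordinates, the same routine applies to any pin arrangement, which is the property that a grid-based encoding tied to square layouts would lack.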