Global adaptive routing is a critical component of high-radix networks in large-scale systems and is necessary to fully exploit their path diversity. However, global adaptive routing requires choosing between minimal and non-minimal paths based on ``approximate'' information, which is often only local. As a result, while a routing algorithm may achieve high performance in simulation for a given configuration, it is not necessarily robust as network parameters change or the network size scales. Different heuristic-based adaptive routing algorithms have been proposed; in this work, we identify the limitations of previously proposed algorithms and their inability to properly route packets across different networks. To address these issues, we propose an adaptive routing algorithm based on reinforcement learning, namely the $k$-armed bandit, that leverages local channel utilization information. We also propose using packet queuing latency as feedback so that the algorithm is aware of the global condition of the network. We show that using either local or global information alone has its own limitations, and that by combining both, high routing performance can be achieved across all traffic patterns in various high-radix networks.
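To make the idea of a $k$-armed bandit choosing between minimal and non-minimal paths concrete, the following is a minimal sketch and not the algorithm proposed in this work. It assumes an epsilon-greedy policy, an exponentially weighted latency estimate per path, and a simple additive cost that combines that estimate (standing in for the global feedback) with a local channel utilization value. The class name `BanditRouter`, its parameters, and the arm labels are all hypothetical and chosen only for illustration.

```python
import random


class BanditRouter:
    """Illustrative epsilon-greedy k-armed bandit over candidate paths.

    Assumption: each arm is a path class (e.g. minimal vs. non-minimal),
    and lower cost is better. This is a sketch, not the paper's algorithm.
    """

    def __init__(self, arms, epsilon=0.1, step=0.2, local_weight=1.0, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon            # probability of exploring a random arm
        self.step = step                  # step size of the running latency estimate
        self.local_weight = local_weight  # hypothetical weight on local utilization
        self.latency_est = {arm: 0.0 for arm in self.arms}
        self.rng = random.Random(seed)

    def cost(self, arm, local_util):
        # Combine the learned latency estimate (global feedback) with the
        # locally observed channel utilization for this arm.
        return self.latency_est[arm] + self.local_weight * local_util[arm]

    def choose(self, local_util):
        # Explore with probability epsilon, otherwise pick the cheapest arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return min(self.arms, key=lambda arm: self.cost(arm, local_util))

    def update(self, arm, observed_latency):
        # Move the estimate a fraction `step` toward the observed queuing latency.
        self.latency_est[arm] += self.step * (observed_latency - self.latency_est[arm])


# Usage: feed back made-up latencies, then route greedily (epsilon = 0).
router = BanditRouter(["minimal", "non_minimal"], epsilon=0.0)
for _ in range(50):
    router.update("minimal", 40.0)      # congested minimal path (synthetic value)
    router.update("non_minimal", 10.0)  # lightly loaded non-minimal path
choice = router.choose({"minimal": 0.0, "non_minimal": 0.0})
```

In this sketch, latency feedback alone steers the choice toward the non-minimal path, while a large local utilization on that path can override it. That mirrors, in a simplified way, the motivation for combining local and global information; the additive form of the cost is an assumption made here for brevity.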