As network architectures grow more complex and user requirements more diverse, efficient network resource management becomes increasingly important. However, existing network scheduling algorithms such as the max-weight algorithm suffer from poor delay performance. In this paper, we present a reinforcement learning-based network scheduling algorithm that achieves both optimal throughput and low delay. To this end, we first formulate the network optimization problem as a Markov decision process (MDP). We then introduce a new state-action value function, called the W-function, and develop a reinforcement learning algorithm, called W-Learning, that guarantees little performance loss during the learning process. Finally, through simulation, we verify that our algorithm reduces delay by up to 40.8% compared to the max-weight algorithm across various scenarios.