The unsteady vortex method is modified to estimate the contribution of leading-edge vortices and is used to simulate the unsteady aerodynamics of a flapping-wing model. A reinforcement learning environment for training flapping-wing kinematics is established based on a deep neural network. The optimal hovering wing kinematics that lead to maximum lift and lift/drag ratio are found.
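
To make the optimization target concrete, the sketch below shows one way the two objectives, lift and lift/drag ratio, could be combined into a single scalar reward for a reinforcement learning agent. It is an illustration only, not the reward used in this work: the function name `hover_reward`, the trade-off parameter `weight`, and the use of cycle-averaged force coefficients are all assumptions. The force histories are taken to come from an aerodynamic solver sampled at uniform time steps over one flapping cycle.

```python
from statistics import fmean


def hover_reward(lift, drag, weight=0.5, eps=1e-9):
    """Hypothetical scalar reward for one flapping cycle.

    lift, drag : sequences of instantaneous force coefficients,
                 sampled at uniform time steps (assumed input format).
    weight     : assumed trade-off between mean lift (weight) and
                 lift/drag ratio (1 - weight).
    eps        : small constant that avoids division by zero.
    """
    if len(lift) == 0 or len(lift) != len(drag):
        raise ValueError("lift and drag must be non-empty and equally long")

    # Cycle-averaged lift (signed) and cycle-averaged drag magnitude.
    mean_lift = fmean(lift)
    mean_drag = fmean([abs(d) for d in drag])

    # Ratio of the cycle averages, not the average of instantaneous ratios.
    ratio = mean_lift / (mean_drag + eps)

    # Weighted sum of the two objectives.
    return weight * mean_lift + (1.0 - weight) * ratio


if __name__ == "__main__":
    # Toy force histories, invented purely for demonstration.
    cl = [0.8, 1.2, 1.0, 1.0]
    cd = [0.5, -0.5, 0.5, -0.5]
    print(hover_reward(cl, cd))
```

The drag magnitude is averaged because in hover the drag direction reverses between half-strokes, so a signed average would nearly cancel. A weighted sum is only one choice; treating the two objectives separately would be equally reasonable.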