This paper proposes a planning framework, based on imitation learning of a policy network, for multitarget unmanned aerial vehicle (UAV) reconnaissance missions under target position uncertainty. The problem of planning the flight path and task sequence for the reconnaissance of given targets is formulated as a Markov decision process (MDP). A deep neural classifier is introduced as a policy network to predict the probability distribution over admissible immediate actions at each mission state. A set of MDPs with specified target locations (the training set) is constructed and solved offline to generate segments of the underlying optimal policy, which are used to train the policy network. The trained policy networks are used for online (runtime) planning of a UAV conducting reconnaissance for arbitrary target locations by providing the near-optimal action corresponding to the specified target locations and the current state. A numerical case study demonstrates the effectiveness of the proposed approach in terms of its classification performance and expected return, in comparison with a baseline online planning algorithm.
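The offline-solve, then imitate, then plan-online pipeline summarized above can be illustrated on a toy problem. The following is a minimal sketch under strong simplifying assumptions and is not the paper's implementation: a 5x5 grid with a single target stands in for the reconnaissance MDP, value iteration stands in for the offline solver, and a multiclass perceptron on relative-position features stands in for the deep neural classifier. All names (`solve_mdp`, `build_dataset`, `train`, `plan_online`) and all sizes are hypothetical.

```python
# Toy illustration only: grid size, actions, rewards, and the linear
# classifier are assumptions made for this sketch, not taken from the paper.
N = 5
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # right, left, up, down


def step(state, action):
    """Deterministic move, clamped to the grid boundary."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), N - 1), min(max(y + dy, 0), N - 1))


def solve_mdp(target):
    """Offline solve: value iteration with reward -1 per step, terminal target."""
    states = [(x, y) for x in range(N) for y in range(N)]
    value = {s: 0 for s in states}
    for _ in range(2 * N):  # the longest shortest path has 2 * (N - 1) steps
        for s in states:
            if s != target:
                value[s] = max(-1 + value[step(s, a)] for a in ACTIONS)
    policy = {}
    for s in states:
        if s != target:
            q = [-1 + value[step(s, a)] for a in ACTIONS]
            policy[s] = q.index(max(q))  # ties go to the first action
    return policy


def features(state, target):
    """Network input: target position relative to the current state."""
    return (target[0] - state[0], target[1] - state[1])


def build_dataset(targets):
    """Collect (features, optimal action) pairs from every solved training MDP."""
    return [(features(s, t), a) for t in targets for s, a in solve_mdp(t).items()]


def predict(weights, x):
    """Pick the action with the highest linear score."""
    scores = [w[0] * x[0] + w[1] * x[1] for w in weights]
    return scores.index(max(scores))


def train(data, max_epochs=5000):
    """Imitation learning: multiclass perceptron fitted to the optimal actions."""
    weights = [[0, 0] for _ in ACTIONS]
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            p = predict(weights, x)
            if p != y:
                mistakes += 1
                for i in range(2):
                    weights[y][i] += x[i]
                    weights[p][i] -= x[i]
        if mistakes == 0:  # every training label is reproduced
            break
    return weights


def plan_online(weights, start, target, max_steps=4 * N):
    """Online planning: query the trained policy at each state, no MDP solve."""
    path, state = [start], start
    while state != target and len(path) <= max_steps:
        state = step(state, ACTIONS[predict(weights, features(state, target))])
        path.append(state)
    return path


train_targets = [(0, 0), (0, N - 1), (N - 1, 0), (N - 1, N - 1)]
weights = train(build_dataset(train_targets))
path = plan_online(weights, start=(0, 0), target=(2, 3))  # target unseen in training
print(path)
```

In this sketch the held-out target `(2, 3)` never appears in the training set, yet the learned policy can be queried directly at runtime because its input is expressed relative to the target. That mirrors, in a very reduced form, the idea of training on MDPs with specified target locations and planning online for arbitrary ones.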