Speech-imagery-based brain-computer interfaces (BCIs) have been researched as a potentially powerful means of providing communication and control without explicit bodily movements. Compared to other BCI systems, speech-imagery-based BCIs can employ mental tasks directly related to the control tasks, making them more intuitive for daily-life use. However, collecting training data of sufficient size is difficult in terms of both time and user comfort: the preparation time for wearing electroencephalogram (EEG) measurement devices is lengthy, and the devices themselves are uncomfortable, making it difficult for subjects to undergo long training sessions without fatigue. To overcome this problem, we propose a novel data augmentation method using actual speech data. Based on the similarity between the imagined speech and actual speech tasks, we augment the EEG training data of imagined speech with EEG data of actual speech and show significant improvements for two out of three subjects.
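The augmentation idea can be sketched as follows. This is a minimal illustrative example, not the authors' exact pipeline: the array names, epoch shapes, label values, and the simple concatenation strategy are all assumptions made for the sketch.

```python
import numpy as np

# Hypothetical EEG epochs with shape (trials, channels, samples) and
# integer task labels; random data stands in for recorded signals.
rng = np.random.default_rng(0)
X_imagined = rng.standard_normal((30, 64, 256))   # imagined-speech training epochs
y_imagined = rng.integers(0, 2, size=30)          # task labels (e.g. two words)
X_actual = rng.standard_normal((90, 64, 256))     # actual-speech epochs, same tasks
y_actual = rng.integers(0, 2, size=90)

# Augmentation: pool actual-speech epochs with the (small) imagined-speech
# training set, relying on the similarity between the two tasks, so a
# classifier can be trained on the enlarged set.
X_train = np.concatenate([X_imagined, X_actual], axis=0)
y_train = np.concatenate([y_imagined, y_actual], axis=0)

print(X_train.shape)  # (120, 64, 256)
print(y_train.shape)  # (120,)
```

A classifier trained on `X_train`/`y_train` would then be evaluated only on held-out imagined-speech epochs, since imagined speech is the target task.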