Acoustic emission (AE) monitoring has attracted significant interest as a promising method for tracking changes in structural integrity and durability. Long-term AE monitoring must detect crack signals and distinguish them from ambient noise (or dummy) signals; however, this remains a daunting task that currently limits field implementation of the AE method. Herein, we explore the feasibility of using convolutional neural network (CNN) models to distinguish AE crack signals from ambient signals. The trained models are validated both with noise-embedded synthesized signals and with upscaled physical model experiments in which earthquake loading is applied to a scaled model foundation on a large-scale shaking table. The 2D CNN model trained on the laboratory-synthesized signal sets effectively identified crack and crack-free signals in all cases, including the upscaled physical model experiments. This study presents a simple but robust CNN model for pre-filtering crack signals and a novel training method for enhanced accuracy, both of which can be applied to real-time structural health monitoring of concrete-based structures.