Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification

Respiratory sound contains crucial information for the early diagnosis of fatal lung diseases. Since the COVID-19 pandemic, there has been growing interest in contact-free medical care based on electronic stethoscopes. To this end, cutting-edge deep learning models have been developed to diagnose lung diseases; however, the task remains challenging due to the scarcity of medical data. In this study, we demonstrate that models pretrained on large-scale visual and audio datasets can be generalized to the respiratory sound classification task. In addition, we introduce a straightforward Patch-Mix augmentation, which randomly mixes patches between different samples, with the Audio Spectrogram Transformer (AST). We further propose a novel and effective Patch-Mix Contrastive Learning to distinguish the mixed representations in the latent space. Our method achieves state-of-the-art performance on the ICBHI dataset, outperforming the prior leading score by 4.08%.
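The abstract describes Patch-Mix only as randomly mixing patches between different samples. The following is a minimal sketch of that idea, not the authors' implementation: the function name `patch_mix`, the `mix_ratio` parameter, the way partners are paired, and the way positions are sampled are all assumptions made for illustration. Patches are treated as opaque objects so the example runs with the standard library alone.

```python
import random


def patch_mix(batch, mix_ratio, rng=None):
    """Illustrative patch mixing between samples (hypothetical helper).

    batch: list of samples, each a list of patches of equal length.
    mix_ratio: assumed fraction of patch positions taken from a partner sample.
    Returns (mixed_batch, partner_indices, masks).
    """
    if rng is None:
        rng = random.Random()
    n_samples = len(batch)
    n_patches = len(batch[0])
    n_swap = round(mix_ratio * n_patches)

    # Assumption: partners come from a random shuffle of the sample indices.
    partner = list(range(n_samples))
    rng.shuffle(partner)

    mixed, masks = [], []
    for i, sample in enumerate(batch):
        # Pick which patch positions are replaced by the partner's patches.
        swap_positions = set(rng.sample(range(n_patches), n_swap))
        mask = [pos in swap_positions for pos in range(n_patches)]
        donor = batch[partner[i]]
        mixed.append([donor[p] if mask[p] else sample[p] for p in range(n_patches)])
        masks.append(mask)
    return mixed, partner, masks


# Usage: four toy "spectrograms", each split into eight labelled patches.
toy_batch = [[f"s{i}p{j}" for j in range(8)] for i in range(4)]
toy_mixed, toy_partner, toy_masks = patch_mix(toy_batch, 0.25, random.Random(0))
```

In this sketch, the returned masks and partner indices record which patches came from which sample. A contrastive objective over mixed representations, as the abstract mentions, would presumably need that information. How the paper actually weights or uses it is not stated here.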
Publisher
International Speech Communication Association
Issue Date
2023-08-22
Language
English
Citation

24th International Speech Communication Association, Interspeech 2023, pp. 5436-5440

DOI
10.21437/Interspeech.2023-1426
URI
http://hdl.handle.net/10203/316028
Appears in Collection
AI-Conference Papers (Academic Conference Papers)
