A study on effective knowledge distillation methods for compressing large-scale speech self-supervised learning models
The success of self-supervised learning in the speech domain has led to the development of large-scale self-supervised models. However, deploying such large models in practice can be costly, which limits their use especially in resource-constrained settings. To address this, we propose FitHuBERT, a model compression method that uses a thinner and deeper architecture across almost all of its components compared to prior work. We further propose knowledge distillation with hints to improve performance, and time-reduction layers to increase efficiency. Evaluation on the SUPERB benchmark shows that our model outperforms previous work, especially on content-related tasks, while having fewer parameters and faster inference time.
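The two efficiency ideas named above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the paper's implementation): a time-reduction layer realized as concatenation of adjacent frames, and a FitNets-style hint loss that projects a thinner student layer to the teacher's width before taking a mean-squared error. All shapes and the projection matrix `W` are assumptions for illustration.

```python
import numpy as np

def time_reduction(x, k=2):
    """Concatenate every k adjacent frames, shortening the sequence by a
    factor of k while widening the feature dimension. One hypothetical way
    to realize a time-reduction layer."""
    T, D = x.shape
    T = T - (T % k)                 # drop trailing frames that don't fill a group
    return x[:T].reshape(T // k, k * D)

def hint_loss(student_h, teacher_h, W):
    """FitNets-style hint loss: project the thinner student features with a
    regressor W so their width matches the teacher layer, then take the
    mean-squared error against the teacher's hidden states."""
    proj = student_h @ W
    return float(np.mean((proj - teacher_h) ** 2))

# Toy shapes: 10 frames, student width 4, teacher width 8 (all hypothetical).
rng = np.random.default_rng(0)
student = rng.normal(size=(10, 4))
teacher = rng.normal(size=(5, 8))

reduced = time_reduction(student, k=2)   # (5, 8): half the frames, wider features
loss = hint_loss(reduced, teacher, np.eye(8))
print(reduced.shape, loss >= 0.0)        # (5, 8) True
```

Halving the frame rate this way shrinks the quadratic cost of self-attention in the student's Transformer layers, which is where the inference-time savings come from.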