DC Field | Value | Language |
---|---|---|
dc.contributor.author | 정현재 | ko |
dc.contributor.author | 구자현 | ko |
dc.contributor.author | 김회린 | ko |
dc.date.accessioned | 2021-02-04T04:30:08Z | - |
dc.date.available | 2021-02-04T04:30:08Z | - |
dc.date.created | 2020-11-28 | - |
dc.date.issued | 2020-06 | - |
dc.identifier.citation | 말소리와 음성과학, v.12, no.2, pp.29 - 37 | - |
dc.identifier.issn | 2005-8063 | - |
dc.identifier.uri | http://hdl.handle.net/10203/280563 | - |
dc.description.abstract | Recently, neural network-based deep learning algorithms have dramatically improved performance over the classical Gaussian mixture model-based hidden Markov model (GMM-HMM) automatic speech recognition (ASR) systems. In addition, research on end-to-end (E2E) speech recognition systems that integrate the language modeling and decoding processes has been actively conducted to better exploit the advantages of deep learning techniques. In general, E2E ASR systems consist of a multi-layer encoder-decoder structure with attention, and therefore require a large amount of paired speech-text data to achieve good performance. Obtaining such paired data demands considerable human labor and time, which is a high barrier to building an E2E ASR system. Previous studies have therefore tried to improve E2E ASR performance with a relatively small amount of paired speech-text data, but most of them used either speech-only data or text-only data alone. In this study, we propose a semi-supervised training method that enables an E2E ASR system to perform well on corpora from different domains by using both speech-only and text-only data. The proposed method adapts effectively to a new domain, showing good performance in the target domain without degrading much in the source domain. | - |
dc.language | Korean | - |
dc.publisher | 한국음성학회 | - |
dc.title | 라벨이 없는 데이터를 사용한 종단간 음성인식기의 준교사 방식 도메인 적응 | - |
dc.title.alternative | Semi-supervised domain adaptation using unlabeled data for end-to-end speech recognition | - |
dc.type | Article | - |
dc.type.rims | ART | - |
dc.citation.volume | 12 | - |
dc.citation.issue | 2 | - |
dc.citation.beginningpage | 29 | - |
dc.citation.endingpage | 37 | - |
dc.citation.publicationname | 말소리와 음성과학 | - |
dc.identifier.doi | 10.13064/KSSS.2020.12.2.029 | - |
dc.identifier.kciid | ART002602983 | - |
dc.contributor.localauthor | 김회린 | - |
dc.description.isOpenAccess | N | - |
dc.subject.keywordAuthor | automatic speech recognition | - |
dc.subject.keywordAuthor | end-to-end | - |
dc.subject.keywordAuthor | semi-supervised | - |
dc.subject.keywordAuthor | domain adaptation | - |
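The abstract above describes an attention-based, multi-layer encoder-decoder E2E ASR architecture and a semi-supervised adaptation scheme that exploits unlabeled (speech-only or text-only) data. The paper's actual method is not reproduced in this record, so the following is only a minimal sketch, assuming PyTorch, of the general pattern: a toy attention encoder-decoder and one pseudo-label-based adaptation step. All class, function, and parameter names (`AttentionEncoderDecoder`, `adapt_with_pseudo_labels`, `feat_dim`, `hidden`, `vocab`, `sos`) are hypothetical and do not come from the paper.

```python
# Illustrative sketch only; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionEncoderDecoder(nn.Module):
    """Toy multi-layer encoder-decoder ASR model with dot-product attention."""
    def __init__(self, feat_dim=80, hidden=256, vocab=1000):
        super().__init__()
        self.hidden = hidden
        # Encoder: stacked bidirectional LSTM over acoustic feature frames.
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2,
                               bidirectional=True, batch_first=True)
        # Decoder: token embedding + LSTM cell conditioned on the attention context.
        self.embed = nn.Embedding(vocab, hidden)
        self.decoder = nn.LSTMCell(hidden + 2 * hidden, hidden)
        self.attn_query = nn.Linear(hidden, 2 * hidden)
        self.out = nn.Linear(hidden + 2 * hidden, vocab)

    def forward(self, feats, tokens):
        # feats: (B, T, feat_dim) acoustic features; tokens: (B, U) token ids.
        enc, _ = self.encoder(feats)                        # (B, T, 2H)
        B, U = tokens.shape
        h = feats.new_zeros(B, self.hidden)
        c = feats.new_zeros(B, self.hidden)
        ctx = enc.new_zeros(B, enc.size(-1))
        logits = []
        for u in range(U):
            emb = self.embed(tokens[:, u])                  # (B, H)
            h, c = self.decoder(torch.cat([emb, ctx], -1), (h, c))
            # Dot-product attention over the encoder states.
            q = self.attn_query(h).unsqueeze(1)             # (B, 1, 2H)
            weights = F.softmax(torch.bmm(q, enc.transpose(1, 2)), dim=-1)
            ctx = torch.bmm(weights, enc).squeeze(1)        # (B, 2H)
            logits.append(self.out(torch.cat([h, ctx], -1)))
        return torch.stack(logits, dim=1)                   # (B, U, vocab)

def adapt_with_pseudo_labels(model, target_speech, optimizer, sos=1, max_len=50):
    """One illustrative adaptation step: greedily decode unlabeled
    target-domain speech with the current model, then train on the
    resulting (speech, pseudo-label) pairs as if they were supervised."""
    model.eval()
    with torch.no_grad():
        B = target_speech.size(0)
        tokens = torch.full((B, 1), sos, dtype=torch.long,
                            device=target_speech.device)
        for _ in range(max_len - 1):
            step_logits = model(target_speech, tokens)
            next_tok = step_logits[:, -1].argmax(-1, keepdim=True)
            tokens = torch.cat([tokens, next_tok], dim=1)
    model.train()
    # Teacher forcing on the pseudo-labels: predict tokens[1:] from tokens[:-1].
    logits = model(target_speech, tokens[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A typical usage under these assumptions would be to pre-train the model on source-domain paired data and then call `adapt_with_pseudo_labels` repeatedly over batches of unlabeled target-domain speech; incorporating text-only data (e.g., through an external language model) is part of the paper's setting but is not covered by this sketch.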