DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kang, Minsu | ko |
dc.contributor.author | Lee, Jihyun | ko |
dc.contributor.author | Kim, Simin | ko |
dc.contributor.author | Kim, Injung | ko |
dc.date.accessioned | 2023-09-06T03:01:16Z | - |
dc.date.available | 2023-09-06T03:01:16Z | - |
dc.date.created | 2023-09-06 | - |
dc.date.issued | 2021-06-06 | - |
dc.identifier.citation | ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.7043 - 7047 | - |
dc.identifier.uri | http://hdl.handle.net/10203/312254 | - |
dc.description.abstract | We propose an end-to-end speech synthesizer, Fast DCTTS, that synthesizes speech in real time on a single CPU thread. The proposed model is a carefully tuned lightweight network designed by applying multiple network reduction and fidelity improvement techniques. In addition, we propose a novel group highway activation that trades off computational efficiency against the regularization effect of the gating mechanism. We also introduce a new metric, elastic mel-cepstral distortion (EMCD), to measure the fidelity of the output mel-spectrogram. In experiments, we analyze the effect of the acceleration techniques on speed and speech quality. Compared with the baseline model, the proposed model improves MOS from 2.62 to 2.74 while using only 1.76% of the computation and 2.75% of the parameters. Speed on a single CPU thread improves by a factor of 7.45, which is fast enough to produce mel-spectrograms in real time without a GPU. | - |
dc.language | English | - |
dc.publisher | IEEE | - |
dc.title | Fast DCTTS: Efficient Deep Convolutional Text-to-Speech | - |
dc.type | Conference | - |
dc.identifier.wosid | 000704288407064 | - |
dc.identifier.scopusid | 2-s2.0-85114800098 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 7043 | - |
dc.citation.endingpage | 7047 | - |
dc.citation.publicationname | ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | - |
dc.identifier.conferencecountry | CA | - |
dc.identifier.conferencelocation | Toronto, ON | - |
dc.identifier.doi | 10.1109/icassp39728.2021.9413373 | - |
dc.contributor.nonIdAuthor | Kang, Minsu | - |
dc.contributor.nonIdAuthor | Kim, Simin | - |
dc.contributor.nonIdAuthor | Kim, Injung | - |