GATING RECURRENT MIXTURE DENSITY NETWORKS FOR ACOUSTIC MODELING IN STATISTICAL PARAMETRIC SPEECH SYNTHESIS | |
Wang, Wenfu; Xu, Shuang; Xu, Bo | |
2016-03 | |
Conference date | 2016-3-21 |
Conference location | Shanghai, China |
Keywords | Statistical Parametric Speech Synthesis; Gating Units; GRU; Gating Recurrent Mixture Density Network |
Pages | 5520-5524 |
Abstract | Although recurrent neural networks (RNNs) with long short-term memory (LSTM) units can capture long-span dependencies across linguistic inputs and have achieved state-of-the-art performance in statistical parametric speech synthesis (SPSS), a second limitation remains: the mean square error (MSE) objective function is intrinsically uni-Gaussian. This paper proposes a gating recurrent mixture density network (GRMDN) architecture that addresses both problems jointly in neural-network-based SPSS. In addition, the gated recurrent unit (GRU), which is much simpler than the LSTM and has a more intelligible working mechanism, is investigated as an alternative gating unit for RNN-based acoustic modeling. Experimental results show that the proposed GRMDN architecture synthesizes more natural speech than its MSE-trained counterpart, and that the two gating units (LSTM and GRU) perform comparably. |
Language | English |
Content type | Conference paper |
Source URL | [http://ir.ia.ac.cn/handle/173211/19654] |
Collection | Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing |
Author affiliation | Institute of Automation, Chinese Academy of Sciences, Beijing, China |
Recommended citation (GB/T 7714) | Wang, Wenfu,Xu, Shuang,Xu, Bo. GATING RECURRENT MIXTURE DENSITY NETWORKS FOR ACOUSTIC MODELING IN STATISTICAL PARAMETRIC SPEECH SYNTHESIS[C]. In: . Shanghai, China. 2016-3-21. |