GATING RECURRENT MIXTURE DENSITY NETWORKS FOR ACOUSTIC MODELING IN STATISTICAL PARAMETRIC SPEECH SYNTHESIS
Wang, Wenfu; Xu, Shuang; Xu, Bo
2016-03
Conference date: 2016-3-21
Conference location: Shanghai, China
Keywords: Statistical Parametric Speech Synthesis; Gating Units; GRU; Gating Recurrent Mixture Density Network
Pages: 5520-5524
Abstract: Although recurrent neural networks (RNNs) with long short-term memory (LSTM) units can model long-span dependencies across the linguistic inputs and have achieved state-of-the-art performance in statistical parametric speech synthesis (SPSS), a second limitation remains: the mean square error (MSE) objective function is intrinsically uni-Gaussian. This paper proposes a gating recurrent mixture density network (GRMDN) architecture to address both problems jointly in neural-network-based SPSS. In addition, the gated recurrent unit (GRU), which is much simpler than LSTM and has a more intelligible working mechanism, is investigated as an alternative gating unit in RNN-based acoustic modeling. Experimental results show that the proposed GRMDN architecture synthesizes more natural speech than its MSE-trained counterpart, and that the two gating units (LSTM and GRU) perform comparably.
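The contrast the abstract draws between an MSE objective and a mixture density output can be illustrated with a small sketch. This is not the paper's implementation: the function name `mdn_nll`, the array shapes, and the use of NumPy are assumptions made for illustration, and the recurrent network (LSTM or GRU) that would predict the mixture parameters is omitted. The sketch computes the negative log-likelihood of target frames under a diagonal-covariance Gaussian mixture, then checks two things: with one unit-variance component the loss equals a scaled MSE plus a constant, and on two-mode targets a two-component mixture fits better than a single Gaussian.

```python
import numpy as np

def mdn_nll(y, weights, means, sigmas):
    """Mean negative log-likelihood under a diagonal Gaussian mixture.

    Hypothetical helper for illustration only.
    y:       (T, D) target acoustic feature frames
    weights: (T, M) mixture weights, each row summing to 1
    means:   (T, M, D) component means
    sigmas:  (T, M, D) component standard deviations (positive)
    """
    y = y[:, None, :]  # broadcast targets against the M components
    # Log-density of each diagonal Gaussian component, shape (T, M).
    log_comp = -0.5 * np.sum(
        ((y - means) / sigmas) ** 2 + 2.0 * np.log(sigmas) + np.log(2.0 * np.pi),
        axis=-1,
    )
    log_joint = np.log(weights) + log_comp
    # Numerically stable log-sum-exp over the components.
    peak = log_joint.max(axis=1, keepdims=True)
    log_lik = peak[:, 0] + np.log(np.exp(log_joint - peak).sum(axis=1))
    return -log_lik.mean()

# Case 1: one component with unit variance reduces to MSE (up to scale and offset).
rng = np.random.default_rng(0)
T, D = 4, 3
y = rng.normal(size=(T, D))
mu = rng.normal(size=(T, 1, D))
nll_single_unit = mdn_nll(y, np.ones((T, 1)), mu, np.ones((T, 1, D)))
mse = np.mean((y - mu[:, 0, :]) ** 2)
mse_equivalent = 0.5 * D * mse + 0.5 * D * np.log(2.0 * np.pi)

# Case 2: two-mode targets at -1 and +1 (one feature dimension).
y2 = np.array([[1.0], [-1.0], [1.0], [-1.0]])
n = len(y2)
# Best single Gaussian for this data: mean 0, standard deviation 1.
nll_one = mdn_nll(y2, np.ones((n, 1)), np.zeros((n, 1, 1)), np.ones((n, 1, 1)))
# Two-component mixture with one narrow component per mode.
mix_means = np.tile(np.array([[[1.0], [-1.0]]]), (n, 1, 1))
nll_two = mdn_nll(y2, np.full((n, 2), 0.5), mix_means, np.full((n, 2, 1), 0.1))
```

In this toy setting `nll_two` is lower than `nll_one`, which is the kind of multimodality a single-Gaussian (MSE-style) objective cannot capture.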
Language: English
Content type: Conference paper
Source URL: http://ir.ia.ac.cn/handle/173211/19654
Collection: Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Author affiliation: Institute of Automation, Chinese Academy of Sciences, Beijing, China
Recommended citation
GB/T 7714
Wang, Wenfu, Xu, Shuang, Xu, Bo. GATING RECURRENT MIXTURE DENSITY NETWORKS FOR ACOUSTIC MODELING IN STATISTICAL PARAMETRIC SPEECH SYNTHESIS[C]. In: . Shanghai, China. 2016-3-21.