End-to-end Language Identification using Attention-based Recurrent Neural Networks
Wang Geng; Wenfu Wang; Yuanyuan Zhao; Xinyuan Cai; Bo Xu
2016-09
Conference Dates | 2016.9.8-2016.9.12 |
Conference Venue | San Francisco, USA |
Keywords | Language Identification; End-to-end Training; Attention |
Abstract | This paper proposes a novel attention-based recurrent neural network (RNN) to build an end-to-end automatic language identification (LID) system. Inspired by the success of the attention mechanism on a range of sequence-to-sequence tasks, this work introduces the attention mechanism with a long short-term memory (LSTM) encoder to the sequence-to-tag LID task. This unified architecture extends the end-to-end training method to the LID system and dramatically boosts system performance. Firstly, a language category embedding module provides the attentional vector that guides the derivation of the utterance-level representation. Secondly, two attention approaches are explored: soft attention, which attends to all source frames, and hard attention, which focuses on a subset of the sequential input. Thirdly, a hybrid test method that traverses all gold labels is adopted in the inference phase. Experimental results show that the soft approach obtains an 8.2% relative equal error rate (EER) reduction compared with the LSTM-based frame-level system, and a 34.33% relative improvement is observed over the conventional i-Vector system. |
Proceedings | InterSpeech 2016 |
Content Type | Conference Paper |
Source URL | [http://ir.ia.ac.cn/handle/173211/41097] |
Collection | Research Center for Digital Content Technology and Services _ Auditory Model and Cognitive Computing |
Corresponding Author | Cai Xinyuan |
Recommended Citation (GB/T 7714) | Wang Geng, Wenfu Wang, Yuanyuan Zhao, et al. End-to-end Language Identification using Attention-based Recurrent Neural Networks[C]. In: InterSpeech 2016, San Francisco, USA, 2016.9.8-2016.9.12. |
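The soft-attention pooling described in the abstract, where a language-embedding query weights all encoder frames to form an utterance-level representation, can be sketched as follows. This is a minimal numpy illustration under assumed names and dimensions, not the authors' implementation (the paper's scoring function and LSTM encoder are not reproduced here):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_attention_pool(frames, query):
    """Soft attention over all source frames.

    frames: (T, d) frame-level encoder outputs (e.g. LSTM hidden states)
    query:  (d,)   attentional vector from a language category embedding
    returns (d,)   utterance-level representation (weighted sum of frames)
    """
    scores = frames @ query      # (T,) alignment score per frame (dot-product scoring assumed)
    weights = softmax(scores)    # attention distribution over all T frames
    return weights @ frames      # weighted sum collapses the sequence to one vector

# Toy usage with random "encoder outputs" and a random "language embedding"
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 8))   # 50 frames, 8-dim features
query = rng.normal(size=8)
utt = soft_attention_pool(frames, query)
```

The hard-attention variant mentioned in the abstract would instead restrict the weighted sum to a selected subset of frames; the sequence-to-tag setup means this single pooled vector, rather than a per-frame output, feeds the language classifier.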