Neural Machine Translation (NMT) has drawn much attention in recent years due to its promising translation performance. However, the under-translation problem remains a major challenge. In this paper, we focus on the under-translation problem and attempt to identify what kinds of words are more likely to be ignored. Through analysis, we observe that a source word with larger translation entropy is more inclined to be dropped. To address this problem, we propose a coarse-to-fine framework, in which we first introduce a simple strategy to reduce the entropy of high-entropy words by constructing pseudo target sentences. We then propose three methods, including a pre-training method, a multi-task method, and a two-pass method, to encourage the neural model to focus on these high-entropy words. Experimental results on various translation tasks show that our method significantly improves translation quality and substantially reduces the under-translation of high-entropy words.
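
As a concrete illustration of the key quantity above (a minimal sketch, not the paper's exact estimation procedure), the translation entropy of a source word can be computed as the Shannon entropy of its translation distribution, with p(y|x) estimated from word-aligned parallel data; the function name and toy data below are hypothetical.

```python
import math
from collections import Counter, defaultdict

def translation_entropy(aligned_pairs):
    """Estimate H(x) = -sum_y p(y|x) * log p(y|x) for each source word x.

    aligned_pairs: iterable of (source_word, target_word) tuples, e.g.
    extracted from a word-aligned parallel corpus (hypothetical input).
    """
    # Count how often each source word aligns to each target word.
    counts = defaultdict(Counter)
    for src, tgt in aligned_pairs:
        counts[src][tgt] += 1

    # Turn counts into a distribution p(y|x) and take its entropy.
    entropy = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        entropy[src] = -sum(
            (c / total) * math.log(c / total) for c in tgt_counts.values()
        )
    return entropy

# Toy example: a source word with several plausible translations has
# higher entropy than one that always maps to a single target word.
pairs = [("bank", "银行"), ("bank", "河岸"), ("water", "水"), ("water", "水")]
print(translation_entropy(pairs))  # {'bank': 0.693..., 'water': 0.0}
```

Under this estimate, "bank" (two equally likely translations) gets entropy log 2 ≈ 0.693, while "water" (a single translation) gets 0; the paper's observation is that words like the former are more prone to under-translation.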
Zhao, Yang; Zhang, Jiajun; Zong, Chengqing; et al. Addressing the Under-translation Problem from the Entropy Perspective [C]. In: Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu, Hawaii, USA, 2019.