Adjusting Word Embeddings by Deep Neural Networks
Gao, Xiaoyang ; Ichise, Ryutaro
2017
Keywords: NLP; Word Embeddings; Deep Learning; Neural Network
Abstract: Continuous-representation language models have gained popularity in many NLP tasks. To measure the similarity of two words, we calculate the cosine distance between their vectors. However, the quality of word embeddings depends on the corpus selected. For word2vec, we observe that the vectors are far apart from each other. Furthermore, synonyms with low frequency or multiple meanings are even farther apart. In these cases, cosine similarity is no longer an appropriate measure of how similar the words are. Moreover, most language models are not as deep as one might suppose; "deep" here means using more layers in the neural network. Based on these observations, we implement a mixed system with two kinds of architectures. We show that word embeddings can be adjusted in both unsupervised and supervised ways. Remarkably, this approach handles the cases mentioned above by substantially increasing the similarities of most synonym pairs. It is also easy to train and to adapt to specific tasks by changing the training target and dataset.
Funding: NEDO (New Energy and Industrial Technology Development Organization)
Indexed in: CPCI-S (ISTP)
Pages: 398-406
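The cosine-similarity measure the abstract refers to can be sketched as follows. This is a minimal illustration using NumPy; the helper function and the example vectors are invented for demonstration and are not taken from the paper.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical 4-dimensional embeddings for two synonyms.
a = np.array([0.9, 0.1, 0.3, 0.2])
b = np.array([0.8, 0.2, 0.4, 0.1])
print(cosine_similarity(a, b))
```

A value near 1 indicates the vectors point in nearly the same direction; the paper's observation is that for low-frequency or polysemous synonyms this value is often misleadingly low, motivating the embedding adjustment.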
Language: English
Source: 9th International Conference on Agents and Artificial Intelligence (ICAART)
DOI: 10.5220/0006120003980406
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/480760
Collection: School of Electronics Engineering and Computer Science
Recommended citation (GB/T 7714):
Gao, Xiaoyang,Ichise, Ryutaro. Adjusting Word Embeddings by Deep Neural Networks. 2017-01-01.