Adjusting Word Embeddings by Deep Neural Networks
Gao, Xiaoyang ; Ichise, Ryutaro
2017
Keywords | NLP Word Embeddings Deep Learning Neural Network
Abstract | Continuous-representation language models have gained popularity in many NLP tasks. To measure the similarity of two words, we calculate the cosine distance between their vectors. However, the quality of word embeddings depends on the corpus selected. For word2vec, we observe that the vectors are far apart from each other; furthermore, synonyms with low occurrence counts or with multiple meanings lie even further apart. In these cases, cosine similarity is no longer an appropriate measure of how similar the words are. Moreover, most language models are not as deep as one might suppose, where "deep" refers to setting more layers in the neural network. Based on these observations, we implement a mixed system with two kinds of architectures. We show that word embeddings can be adjusted in both unsupervised and supervised ways. Remarkably, this approach successfully handles the cases mentioned above by largely increasing the similarities of most synonyms. It is also easy to train and to adapt to specific tasks by changing the training target and dataset.; NEDO (New Energy and Industrial Technology Development Organization); CPCI-S(ISTP); 398-406
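As a brief illustration of the cosine-similarity measure the abstract refers to, a minimal sketch follows; the toy three-dimensional vectors and the word names are assumptions for demonstration only, not embeddings from the paper's trained model:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings (illustrative values, not from the paper).
king = [0.8, 0.3, 0.1]
monarch = [0.7, 0.4, 0.2]
banana = [0.1, 0.9, 0.8]

# Similar words should score near 1; unrelated words score lower.
print(cosine_similarity(king, monarch))
print(cosine_similarity(king, banana))
```

The paper's observation is that this measure breaks down for low-frequency or polysemous synonyms, whose raw word2vec vectors end up far apart despite their related meanings.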
Language | English
Source | 9th International Conference on Agents and Artificial Intelligence (ICAART)
DOI | 10.5220/0006120003980406
Content type | Other
Source URL | [http://ir.pku.edu.cn/handle/20.500.11897/480760]
Collection | School of Electronics Engineering and Computer Science
Recommended citation (GB/T 7714) | Gao, Xiaoyang, Ichise, Ryutaro. Adjusting Word Embeddings by Deep Neural Networks. 2017-01-01.