Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion | |
Wang, Fangxin1,2; Liu, Jie1; Zhang, Shuwu1,3; Zhang, Guixuan1; Zheng, Yang1; Li, Xiaoqian1,2; Liang, Wei1; Li, Yuejun1,2 | |
Journal | KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS |
2019-09-30 | |
Volume | 13; Issue: 9; Pages: 4665-4683 |
Keywords | image annotation; multiple dependencies; self-attention; prediction path; Triplet Margin loss |
ISSN | 1976-7277 |
DOI | 10.3837/tiis.2019.09.019 |
Corresponding author | Liu, Jie (jie.liu@ia.ac.cn) |
Abstract | Previous methods build image annotation models by leveraging three basic dependencies: relations between image and label (image/label), between images (image/image), and between labels (label/label). Although many studies show that multiple dependencies can work jointly to improve annotation performance, the dependencies do not actually "work jointly" in these frameworks, whose performance depends largely on the result predicted by the image/label component. To address this problem, we propose the adaptive attention annotation model (AAAM), which associates these dependencies with the prediction path, a series of labels (tags) in the order they are detected. In particular, we optimize the prediction path by detecting the relevant labels from easy-to-detect to hard-to-detect, which are found using Binary Cross-Entropy (BCE) and Triplet Margin (TM) losses, respectively. In addition, to capture the information of each label, instead of explicitly extracting regional features, we propose a self-attention mechanism that implicitly enhances the relevant regions and suppresses the irrelevant ones. To validate the effectiveness of the model, we conduct experiments on three well-known public datasets, COCO 2014, IAPR TC-12, and NUS-WIDE, and achieve better performance than state-of-the-art methods. |
Funding project | National Key R&D Program of China[2017YFB1401000] ; Key Laboratory of Digital Rights Services, one of the National Science and Standardization Key Labs for Press and Publication Industry |
WOS keywords | AUTOMATIC IMAGE ANNOTATION |
WOS research areas | Computer Science ; Telecommunications |
Language | English |
Publisher | KSII-KOR SOC INTERNET INFORMATION |
WOS accession number | WOS:000488294100019 |
Funding organization | National Key R&D Program of China ; Key Laboratory of Digital Rights Services, one of the National Science and Standardization Key Labs for Press and Publication Industry |
Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/26114] |
Collection | Research Center for Digital Content Technology and Services_New Media Services and Management Technology |
Corresponding author | Liu, Jie |
Author affiliations | 1. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 3. Beijing Film Acad, AICFVE, Beijing 100088, Peoples R China |
Recommended citation GB/T 7714 | Wang, Fangxin,Liu, Jie,Zhang, Shuwu,et al. Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion[J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS,2019,13(9):4665-4683. |
APA | Wang, Fangxin.,Liu, Jie.,Zhang, Shuwu.,Zhang, Guixuan.,Zheng, Yang.,...&Li, Yuejun.(2019).Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion.KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS,13(9),4665-4683. |
MLA | Wang, Fangxin,et al."Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion".KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS 13.9(2019):4665-4683. |
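
The abstract names two losses, Binary Cross-Entropy (BCE) and Triplet Margin (TM). The sketch below shows only their standard textbook definitions; it is not the paper's implementation. The function names, the choice of Euclidean distance, and the default margin of 1.0 are assumptions made for illustration, and the record does not say how the two losses are combined.

```python
import math


def bce_loss(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over labels.

    probs: predicted probabilities in (0, 1), one per label.
    targets: ground-truth values in {0, 1}, one per label.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(d(a, p) - d(a, n) + margin, 0)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions.
    print(bce_loss([0.9, 0.2, 0.7], [1, 0, 1]))
    print(triplet_margin_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0]))
```

In this form, the triplet loss is zero once the negative is farther from the anchor than the positive by at least the margin, while BCE scores each label independently.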