VQAPT: A New visual question answering model for personality traits in social media images

doi:10.1016/j.patrec.2023.10.016


Title	VQAPT: A New visual question answering model for personality traits in social media images
Authors	Biswas, Kunal 1; Shivakumara, Palaiahnakote 2; Pal, Umapada 1; Liu, Cheng-Lin 3,4; Lu, Yue 5
Journal	PATTERN RECOGNITION LETTERS
Publication date	2023-11-01
Volume	175, Pages: 66-73
Keywords	Personality trait images; Multimodal concept; Text recognition; Social media images; Natural language processing; Visual question answering
ISSN	0167-8655
DOI	10.1016/j.patrec.2023.10.016
Corresponding author	Shivakumara, Palaiahnakote (shiva@um.edu.my)
Abstract	Visual Question Answering (VQA) for personality trait images on social media is challenging because of multiple emotions and actions with complex backgrounds in social media images. This work aims at developing a new VQA model for different personality traits (VQAPT) identification in a single image. This work considers the Big Five Factors (BFF) for personality traits namely, Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism. VQA is proposed based on the observation that multiple personality traits can be seen in a single image. We propose a model integrating text recognition and person/face recognition to derive the unique relationship between the text and the person's action in the image. Furthermore, a dynamic text-object graph for personality traits identification is constructed according to the query. For understanding a query, we explore the Contrastive Language-Image Pre-trained (CLIP) transformer encoder in this work. Since it is the first work of its kind, we have created a new dataset under this work for evaluation and the dataset is available publicly as mentioned in Section 4. The effectiveness of the proposed method is also evaluated on two benchmark datasets, namely TextVQA for VQA and PTI for personality traits identification.
Funding project	Ministry of Higher Education Malaysia [FRGS/1/2020/ICT02/UM/02/4]; University Grants Commission (UGC), India
WOS research area	Computer Science
Language	English
Publisher	ELSEVIER
WOS accession number	WOS:001102930500001
Funding organization	Ministry of Higher Education Malaysia; University Grants Commission (UGC), India
Content type	Journal article
Source URL	[http://ir.ia.ac.cn/handle/173211/55234]
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems
Corresponding author	Shivakumara, Palaiahnakote
Author affiliations	1. Indian Stat Inst, Comp Vis & Pattern Recognit Unit, Kolkata, India; 2. Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur, Malaysia; 3. Univ Chinese Acad Sci, Inst Automat, Chinese Acad Sci, Beijing, Peoples R China; 4. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China; 5. East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai, Peoples R China
Recommended citation (GB/T 7714)	Biswas, Kunal,Shivakumara, Palaiahnakote,Pal, Umapada,et al. VQAPT: A New visual question answering model for personality traits in social media images[J]. PATTERN RECOGNITION LETTERS,2023,175:66-73.
APA	Biswas, Kunal, Shivakumara, Palaiahnakote, Pal, Umapada, Liu, Cheng-Lin, & Lu, Yue. (2023). VQAPT: A New visual question answering model for personality traits in social media images. PATTERN RECOGNITION LETTERS, 175, 66-73.
MLA	Biswas, Kunal, et al. "VQAPT: A New visual question answering model for personality traits in social media images". PATTERN RECOGNITION LETTERS 175 (2023): 66-73.