Purpose: This paper presents a method for improving the accuracy of automatic indexing of Chinese-English mixed documents. Design/methodology/approach: Based on the inherent characteristics of Chinese-English mixed texts and on cybernetics theory, we propose an integrated control method for indexing documents. It consists of "feed-forward control", "in-progress control" and "feed-back control", which together aim to improve indexing accuracy. An experiment was conducted to investigate the effect of the proposed method. Findings: The method distinguishes between Chinese and English documents in terms of grammatical structures and word formation rules. When the method was applied in the three phases of automatic indexing of Chinese-English mixed documents, the results were encouraging: precision increased from 88.54% to 97.10% and recall improved from 97.37% to 99.47%. Research limitations: The indexing method is relatively complicated, and the whole indexing process requires substantial human intervention. Because pattern matching is based on a brute-force (BF) approach, indexing efficiency is reduced to some extent. Practical implications: The research has both theoretical significance and practical value for improving the accuracy of automatic indexing of multilingual documents (not confined to Chinese-English mixed documents). The proposed method will benefit not only the indexing of life science documents but also the indexing of documents in other subject areas. Originality/value: So far, few studies have been published on methods for increasing the accuracy of multilingual automatic indexing. This study provides insights into the automatic indexing of multilingual documents, especially Chinese-English mixed documents.
ZHAO Yan, SHI Hui. A method for improving the accuracy of automatic indexing of Chinese-English mixed documents[J]. Chinese Journal of Library and Information Science, 2012, 5(4): 77-92.
APA
ZHAO Yan, & SHI Hui. (2012). A method for improving the accuracy of automatic indexing of Chinese-English mixed documents. Chinese Journal of Library and Information Science, 5(4), 77-92.
MLA
ZHAO Yan, et al. "A method for improving the accuracy of automatic indexing of Chinese-English mixed documents." Chinese Journal of Library and Information Science 5.4 (2012): 77-92.