オオヤマ ワタル OHYAMA Wataru
大山 航
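Affiliation Tokyo Denki University, School of System Design and Technology, Department of Information Systems and Multimedia Design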
Position Professor
Language English
Date of publication/presentation 2011
Publication type International conference paper
Peer review Peer-reviewed
Title A Study on Automatic Chinese Text Classification
Authorship Co-authored
Published in 11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011)
Publisher IEEE COMPUTER SOC
Volume, issue, pages 920-924
Authors Xi Luo, Wataru Ohyama, Tetsushi Wakabayashi, Fumitaka Kimura
Abstract In this paper, we perform Chinese text classification using N-gram (uni-gram, bi-gram, and mixed uni-gram/bi-gram) frequency features instead of word frequency features to represent documents, and propose the use of mixed uni-gram/bi-gram features after feature transformation. We further propose a serial approach based on feature transformation and dimension reduction techniques to improve performance. Experimental results show that the proposed approach is efficient and effective for improving the performance of Chinese text classification. Furthermore, we present several experiments evaluating feature selection based on part-of-speech analysis; the results show that a suitable combination of parts of speech can lead to better classification performance.
DOI 10.1109/ICDAR.2011.187
ISSN 1520-5363
DBLP ID conf/icdar/LuoOWK11
PermalinkURL http://dblp.uni-trier.de/db/conf/icdar/icdar2011.html#conf/icdar/LuoOWK11