FIT2012: The 11th Forum on Information Technology

Abstract

E-034
Automatic Tweaking of Confounding Training Tweets for Improving Classification Accuracy

○Muhammad Asif Hossain Khan, Masayuki Iwai, Kaoru Sezaki (The University of Tokyo)

Twitter has become a valuable source of early signals for predicting changes in economic and social indicators. However, misclassifying relevant tweets can easily lead to a "cry wolf" situation. We present a framework that automatically identifies noisy tweets in the training set that may confound a classifier's judgment. We also modify the conventional likelihood-based collocation feature selection method. Even with a relatively small training set, our method achieves better classification accuracy.
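For readers unfamiliar with likelihood-based collocation scoring, the following is a minimal sketch of one common conventional baseline, the log-likelihood ratio (G^2) computed over a 2x2 contingency table of adjacent word pairs. It is an assumption that this is the kind of baseline the abstract refers to; the sketch does not reproduce the authors' modification or their noisy-tweet framework, neither of which the abstract specifies. The function names `llr` and `score_bigrams` and the toy tweets are hypothetical.

```python
import math
from collections import Counter


def _xlogx_sum(*counts):
    """Sum of k * ln(k) over the given counts, treating 0 * ln(0) as 0."""
    return sum(k * math.log(k) for k in counts if k > 0)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: count of (a, b); k12: count of (a, not b);
    k21: count of (not a, b); k22: count of (not a, not b).
    """
    n = k11 + k12 + k21 + k22
    cells = _xlogx_sum(k11, k12, k21, k22)
    rows = _xlogx_sum(k11 + k12, k21 + k22)
    cols = _xlogx_sum(k11 + k21, k12 + k22)
    return 2.0 * (cells - rows - cols + _xlogx_sum(n))


def score_bigrams(tweets):
    """Score every adjacent token pair in pre-tokenized tweets.

    `tweets` is a list of token lists. Returns {(a, b): G^2 score}.
    """
    bigrams, first, second = Counter(), Counter(), Counter()
    for tokens in tweets:
        for a, b in zip(tokens, tokens[1:]):
            bigrams[(a, b)] += 1
            first[a] += 1    # times a appears as the first word of a pair
            second[b] += 1   # times b appears as the second word of a pair
    n = sum(bigrams.values())
    scores = {}
    for (a, b), k11 in bigrams.items():
        k12 = first[a] - k11
        k21 = second[b] - k11
        k22 = n - k11 - k12 - k21
        scores[(a, b)] = llr(k11, k12, k21, k22)
    return scores


# Hypothetical toy data, for illustration only.
toy_tweets = [
    ["swine", "flu", "outbreak"],
    ["swine", "flu", "cases"],
    ["flu", "season"],
    ["bad", "season"],
]
toy_scores = score_bigrams(toy_tweets)
top_pairs = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

Under this baseline, the highest-scoring pairs would be kept as collocation features for the classifier. In practice a score threshold or a top-k cutoff is chosen on held-out data; the abstract does not state which selection rule the authors use.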