Computer Science – Learning
Scientific paper
2010-12-13
The paper is withdrawn
Unsupervised term weighting schemes borrowed from the information retrieval (IR) field have been widely used for text categorization (TC), the best known being tf.idf. However, the intuition behind idf seems less reasonable for the TC task than for the IR task. In this paper, we introduce inverse category frequency (icf) into supervised term weighting and propose a novel icf-based method that combines icf with relevance frequency (rf) to weight terms in the training dataset. Our experiments show that the icf-based supervised term weighting scheme outperforms tf.rf, the prob-based supervised term weighting scheme, and tf.idf on two widely used datasets: the unbalanced Reuters-21578 corpus and the balanced 20 Newsgroups corpus. We also present detailed per-category evaluations of the four term weighting schemes on both datasets in terms of precision, recall, and F1 measure.
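To make the quantities named in the abstract concrete, the following is a minimal Python sketch of idf, icf, and rf on a toy corpus. The abstract does not give formulas, so the definitions here are assumptions: idf and icf use the usual log-ratio form, rf uses the commonly cited form log2(2 + a / max(1, c)), and `combined_weight` is one hypothetical way of merging icf and rf that should not be read as the authors' exact method. The toy documents and all function names are illustrative only.

```python
import math
from collections import defaultdict

# Toy labelled corpus: (tokens, category). Purely illustrative data.
docs = [
    (["wheat", "price", "export"], "grain"),
    (["wheat", "harvest"], "grain"),
    (["oil", "price", "barrel"], "crude"),
    (["oil", "export"], "crude"),
    (["bank", "rate", "price"], "money"),
    (["bank", "loan"], "money"),
]

# Index which documents and which categories each term occurs in.
doc_ids = defaultdict(set)      # term -> indices of documents containing it
categories = defaultdict(set)   # term -> categories containing it
for i, (tokens, cat) in enumerate(docs):
    for t in set(tokens):
        doc_ids[t].add(i)
        categories[t].add(cat)

n_docs = len(docs)
n_categories = len({cat for _, cat in docs})


def idf(term):
    """Inverse document frequency: ignores category labels (unsupervised)."""
    return math.log2(n_docs / len(doc_ids[term]))


def icf(term):
    """Inverse category frequency: high when the term occurs in few categories."""
    return math.log2(n_categories / len(categories[term]))


def rf(term, positive_cat):
    """Relevance frequency, assumed form log2(2 + a / max(1, c)).

    a = documents of the positive category containing the term,
    c = documents of all other categories containing the term.
    """
    a = sum(1 for i in doc_ids[term] if docs[i][1] == positive_cat)
    c = len(doc_ids[term]) - a
    return math.log2(2 + a / max(1, c))


def combined_weight(tf, term, positive_cat):
    """Hypothetical icf-rf combination (an assumption, not taken from the abstract):
    the a/c ratio is scaled by the category ratio inside the logarithm."""
    a = sum(1 for i in doc_ids[term] if docs[i][1] == positive_cat)
    c = len(doc_ids[term]) - a
    ratio = (a / max(1, c)) * (n_categories / len(categories[term]))
    return tf * math.log2(2 + ratio)


for term in ("wheat", "price"):
    print(term, idf(term), icf(term), rf(term, "grain"), combined_weight(1, term, "grain"))
```

In this toy corpus, "wheat" occurs only in the "grain" category, so its icf is positive, while "price" occurs in all three categories and gets an icf of zero. That contrast is the intuition the abstract appeals to: a term spread across every category carries little information for categorization even when its idf is nonzero.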
Lin Mengxiang
Wang Deqing
Wu Wenjun
Zhang Hui
Inverse Category Frequency based supervised term weighting scheme for text categorization