A probabilistic methodology for multilabel classification

Computer Science – Artificial Intelligence

Scientific paper

Details

14 pages, 1 figure, under review

Multilabel classification is a relatively recent subfield of machine learning. Unlike the classical approach, where each pattern is labeled with exactly one category, multilabel classification assigns an arbitrary number of categories to a pattern. Because the problem is complex (the solution is one among an exponential number of alternatives), a simple solution, the binary method, is frequently used: a binary classifier is learned for every category, and their outputs are combined afterwards. The independence assumption behind this solution is not realistic. In this work we give examples where the decisions for the labels are not taken independently, so a supervised approach should learn the existing relationships among categories to classify better. We therefore present a generic methodology that can improve the results obtained by a set of independent probabilistic binary classifiers by combining them with a classifier trained on the co-occurrences of the labels. We report exhaustive experiments on three standard corpora of labeled documents (Reuters-21578, Ohsumed-23 and RCV1), with three probabilistic base classifiers, and observe noticeable improvements on all three corpora when using our methodology.
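
To make the binary method mentioned in the abstract concrete, the following is a minimal sketch of that baseline only, not of the paper's combination procedure, which the abstract does not specify. It assumes documents are represented as sets of tokens and uses a small Laplace-smoothed Bernoulli naive Bayes as the probabilistic base classifier. The class `TinyBernoulliNB`, the helper functions, and the toy corpus are all hypothetical and chosen for illustration.

```python
import math


class TinyBernoulliNB:
    """Hypothetical base classifier: Bernoulli naive Bayes over token sets."""

    def fit(self, docs, targets, vocab):
        self.vocab = vocab
        n_pos = sum(targets)
        n_neg = len(targets) - n_pos
        # Laplace-smoothed prior P(label = 1).
        self.prior_pos = (n_pos + 1) / (len(targets) + 2)
        # Per word: (P(word | label = 1), P(word | label = 0)), smoothed.
        self.p_word = {}
        for w in vocab:
            pos = sum(1 for d, t in zip(docs, targets) if t and w in d)
            neg = sum(1 for d, t in zip(docs, targets) if not t and w in d)
            self.p_word[w] = ((pos + 1) / (n_pos + 2), (neg + 1) / (n_neg + 2))
        return self

    def predict_proba(self, doc):
        """Return P(label = 1 | doc)."""
        log_pos = math.log(self.prior_pos)
        log_neg = math.log(1.0 - self.prior_pos)
        for w in self.vocab:
            p_pos, p_neg = self.p_word[w]
            if w in doc:
                log_pos += math.log(p_pos)
                log_neg += math.log(p_neg)
            else:
                log_pos += math.log(1.0 - p_pos)
                log_neg += math.log(1.0 - p_neg)
        # Normalize in log space to avoid underflow.
        m = max(log_pos, log_neg)
        e_pos = math.exp(log_pos - m)
        e_neg = math.exp(log_neg - m)
        return e_pos / (e_pos + e_neg)


def train_binary_method(docs, label_sets, labels):
    """Binary method: one independent binary classifier per category."""
    vocab = sorted(set().union(*docs))
    return {
        lab: TinyBernoulliNB().fit(docs, [lab in ls for ls in label_sets], vocab)
        for lab in labels
    }


def predict_binary_method(models, doc, threshold=0.5):
    """Score each label on its own, then keep labels above the threshold."""
    probs = {lab: model.predict_proba(doc) for lab, model in models.items()}
    return probs, {lab for lab, p in probs.items() if p >= threshold}


def joint_probability(probs, label_set):
    """Probability of an exact label set under the independence assumption."""
    result = 1.0
    for lab, p in probs.items():
        result *= p if lab in label_set else (1.0 - p)
    return result


# Toy corpus (invented for illustration, not taken from the paper's datasets).
docs = [
    {"wheat", "grain"}, {"wheat", "trade"}, {"oil", "trade"},
    {"oil", "crude"}, {"grain", "corn"}, {"trade", "tariff"},
]
label_sets = [
    {"grain"}, {"grain", "trade"}, {"oil", "trade"},
    {"oil"}, {"grain"}, {"trade"},
]
labels = ["grain", "oil", "trade"]

models = train_binary_method(docs, label_sets, labels)
probs, predicted = predict_binary_method(models, {"wheat", "grain"})
print(probs, predicted, joint_probability(probs, predicted))
```

Because `joint_probability` is a plain product of per-label terms, this baseline cannot represent label pairs that tend to occur together or exclude each other. That is the limitation the abstract argues a classifier trained on label co-occurrences should address.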
