SMOTE: Synthetic Minority Over-sampling Technique

Computer Science – Artificial Intelligence

Scientific paper

DOI: 10.1613/jair.953

An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world datasets are predominantly composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper, and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
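The abstract states only that synthetic minority class examples are created and combined with under-sampling of the majority class; it does not spell out the procedure. The sketch below illustrates one common reading of the idea: each synthetic point is placed at a random position on the line segment between a minority example and one of its k nearest minority-class neighbours. The function names (`k_nearest`, `smote`, `undersample`), the parameters, the Euclidean distance, and the random choice of the base example are all assumptions made for illustration, not the authors' code, and the sketch handles numeric features only.

```python
import math
import random


def k_nearest(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda q: math.dist(point, q))[:k]


def smote(minority, n_synthetic, k=5, rng=None):
    """Create n_synthetic points by interpolating between minority examples.

    Simplification (assumption): the base example is drawn at random,
    rather than looping over every minority example a fixed number of times.
    """
    rng = rng or random.Random()
    if len(minority) < 2:
        raise ValueError("need at least two minority examples")
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]  # exclude the base itself
        neighbour = rng.choice(k_nearest(base, others, k))
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def undersample(majority, n_keep, rng=None):
    """Keep a random subset of the majority class, without replacement."""
    rng = rng or random.Random()
    return rng.sample(majority, min(n_keep, len(majority)))


# Toy data (hypothetical): 3 minority points, 25 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
majority = [(float(x), float(y)) for x in range(5, 10) for y in range(5, 10)]

rng = random.Random(0)
balanced_minority = minority + smote(minority, n_synthetic=6, k=2, rng=rng)
balanced_majority = undersample(majority, n_keep=9, rng=rng)
```

In this toy run both classes end up with nine examples each. In practice the amounts of over-sampling and under-sampling would be varied to trace out different operating points in ROC space, as the abstract describes.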
