Document Classification Using Expectation Maximization with Semi Supervised Learning

Computer Science – Information Retrieval

Scientific paper

As the number of online documents grows, so does the demand for document classification to aid their analysis and management. Text is cheap, but information, in the form of knowing which classes a document belongs to, is expensive. The main purpose of this paper is to explain how the expectation maximization technique from data mining can be used to classify documents, and to show how a semi-supervised approach can improve accuracy. The expectation maximization algorithm is applied with both a supervised and a semi-supervised approach. The semi-supervised approach is found to be more accurate and effective. Its main advantage is the "Dynamic Generation of New Class". The algorithm first trains a classifier using the labeled documents and then probabilistically classifies the unlabeled documents. The car dataset used for evaluation was collected from the UCI repository and modified by the authors.
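The train-then-probabilistically-classify loop described in the abstract can be illustrated with a minimal sketch. It assumes a multinomial naive Bayes classifier with Laplace smoothing as the underlying model, which the abstract does not state; the function names (`train_nb`, `posterior`, `semi_supervised_em`), the fixed iteration count, and the toy documents are all hypothetical and are not taken from the paper. The dynamic generation of new classes is not covered here.

```python
import math
from collections import Counter

def train_nb(docs, weights, classes, vocab, alpha=1.0):
    """M-step: estimate class priors and word likelihoods from soft class weights."""
    total = sum(sum(w.values()) for w in weights)
    prior, word_prob = {}, {}
    for c in classes:
        class_weight = sum(w[c] for w in weights)
        prior[c] = (class_weight + alpha) / (total + alpha * len(classes))
        counts = Counter()
        for doc, w in zip(docs, weights):
            for token in doc:
                counts[token] += w[c]  # fractional count for class c
        denom = sum(counts.values()) + alpha * len(vocab)
        word_prob[c] = {t: (counts[t] + alpha) / denom for t in vocab}
    return prior, word_prob

def posterior(doc, prior, word_prob, classes):
    """E-step for one document: P(class | doc) under the naive Bayes assumption."""
    log_scores = {}
    for c in classes:
        score = math.log(prior[c])
        for token in doc:
            if token in word_prob[c]:  # tokens outside the vocabulary are ignored
                score += math.log(word_prob[c][token])
        log_scores[c] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp.values())
    return {c: v / norm for c, v in exp.items()}

def semi_supervised_em(labeled, unlabeled, iterations=10):
    """Train on labeled docs, then alternate E- and M-steps over all docs."""
    classes = sorted({y for _, y in labeled})
    labeled_docs = [d for d, _ in labeled]
    vocab = {t for d in labeled_docs + unlabeled for t in d}
    # Labeled documents keep hard (0/1) class weights throughout.
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    prior, word_prob = train_nb(labeled_docs, hard, classes, vocab)
    for _ in range(iterations):
        # E-step: probabilistically classify the unlabeled documents.
        soft = [posterior(d, prior, word_prob, classes) for d in unlabeled]
        # M-step: re-estimate the model from labeled and unlabeled documents.
        prior, word_prob = train_nb(labeled_docs + unlabeled, hard + soft,
                                    classes, vocab)
    return lambda doc: posterior(doc, prior, word_prob, classes)

# Hypothetical toy data, not the paper's modified UCI car dataset.
labeled = [("engine wheel brake".split(), "car"),
           ("flour sugar oven".split(), "food")]
unlabeled = ["engine tyre wheel".split(), "tyre brake clutch".split(),
             "sugar butter oven".split(), "butter flour cream".split()]

classify = semi_supervised_em(labeled, unlabeled)
probs = classify("tyre clutch".split())
print(probs)
```

In this toy setup the words "tyre" and "clutch" never appear in a labeled document, so a classifier trained on the labeled data alone would have no evidence for either class. The unlabeled documents link those words to "engine", "wheel", and "brake", which is the kind of gain from unlabeled text that the abstract attributes to the semi-supervised approach.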

