Bayesian Treatment of Incomplete Discrete Data applied to Mutual Information and Feature Selection

Computer Science – Learning

Scientific paper

Details

11 pages, 1 figure

Given the joint chances of a pair of random variables, one can compute quantities of interest, such as the mutual information. The Bayesian treatment of unknown chances involves computing, from a second-order prior distribution and the data likelihood, a posterior distribution of the chances. A common treatment of incomplete data is to assume ignorability and determine the chances by the expectation maximization (EM) algorithm. These two methods are well established but are typically treated separately. This paper joins the two approaches in the case of Dirichlet priors, and derives efficient approximations for the mean, mode and (co)variance of the chances and of the mutual information. Furthermore, we prove the unimodality of the posterior distribution, which implies the important property that EM converges to the global maximum in the chosen framework. These results are applied to the problem of selecting features for incremental learning and naive Bayes classification. A fast filter based on the distribution of mutual information is shown to outperform the traditional filter based on empirical mutual information on a number of incomplete real data sets.
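As a rough illustration of the ingredients named in the abstract, the sketch below computes the mutual information of a joint probability table and runs a textbook EM iteration for a two-way table in which some samples are missing the column variable. It is not the paper's derivation or its approximations: the function names `mutual_information` and `em_chances`, the symmetric pseudo-count `prior`, and the specific missingness pattern (row observed, column missing) are all assumptions made for the example. Adding `prior` to every cell corresponds to the posterior mode under a Dirichlet with parameters `prior + 1`.

```python
import numpy as np

def mutual_information(pi):
    """Mutual information (in nats) of a joint probability table pi[i, j]."""
    pi = np.asarray(pi, dtype=float)
    row = pi.sum(axis=1, keepdims=True)   # marginal of the row variable, shape (r, 1)
    col = pi.sum(axis=0, keepdims=True)   # marginal of the column variable, shape (1, c)
    indep = row @ col                     # product of marginals, shape (r, c)
    mask = pi > 0                         # cells with zero chance contribute nothing
    return float(np.sum(pi[mask] * np.log(pi[mask] / indep[mask])))

def em_chances(n_full, n_row_only, prior=1.0, iters=500, tol=1e-12):
    """EM estimate of joint chances when some samples lack the column variable.

    n_full[i, j]  : counts with both variables observed
    n_row_only[i] : counts with only the row variable observed
    prior         : hypothetical symmetric pseudo-count per cell (assumed > 0)
    """
    n_full = np.asarray(n_full, dtype=float)
    n_row_only = np.asarray(n_row_only, dtype=float)
    pi = np.full(n_full.shape, 1.0 / n_full.size)  # start from uniform chances
    for _ in range(iters):
        # E-step: spread each incomplete sample over columns by pi(j | i).
        cond = pi / pi.sum(axis=1, keepdims=True)
        m = n_full + n_row_only[:, None] * cond
        # M-step: renormalize expected counts plus pseudo-counts.
        new_pi = (m + prior) / (m + prior).sum()
        converged = np.max(np.abs(new_pi - pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi

# Usage: 20 complete samples plus 4 samples where only row 0 was observed.
pi_hat = em_chances([[10, 0], [0, 10]], [4, 0])
print(mutual_information(pi_hat))
```

Plugging the estimated chances into `mutual_information` gives only a point estimate; the abstract's filter is based on the distribution of mutual information, which this sketch does not attempt to reproduce.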
