Robust Feature Selection by Mutual Information Distributions

Computer Science – Artificial Intelligence

Scientific paper


Details

8 two-column pages

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. To address questions such as the reliability of its empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported, and asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification with the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows the above methods to be extended to incomplete samples in a simple and effective way.
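To illustrate the general idea described in the abstract, treating mutual information as a random quantity under a Dirichlet posterior and keeping only features whose mutual information with the class is credibly above a small threshold, the following is a minimal Python sketch. It is not the paper's algorithm: where the paper reports a closed-form posterior mean and an analytical variance approximation, this sketch simply samples joint distributions from the Dirichlet posterior and computes mutual information for each sample. The function names, the threshold eps, and the credibility level are illustrative assumptions.

import numpy as np

def mi_posterior_samples(counts, alpha=1.0, n_samples=2000, rng=None):
    """Sample the posterior distribution of the mutual information I(X; Y).

    counts : 2-D array of co-occurrence counts n_ij for a feature X and the class Y.
    alpha  : symmetric Dirichlet prior parameter (pseudo-count per cell).
    Returns an array of MI values, one per posterior sample.
    """
    rng = np.random.default_rng(rng)
    counts = np.asarray(counts, dtype=float)
    # The posterior over the joint distribution is Dirichlet(counts + alpha).
    samples = rng.dirichlet(counts.ravel() + alpha, size=n_samples)
    samples = samples.reshape(n_samples, *counts.shape)   # (n_samples, r, s)

    px = samples.sum(axis=2, keepdims=True)               # marginal of X per sample
    py = samples.sum(axis=1, keepdims=True)               # marginal of Y per sample
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = samples * np.log(samples / (px * py))
    # Cells with (numerically) zero probability contribute nothing to MI.
    return np.nan_to_num(terms).sum(axis=(1, 2))

def select_features(contingency_tables, eps=0.02, credibility=0.95):
    """Keep features whose MI with the class exceeds eps with high posterior
    probability (an illustrative 'robust' filter, not the paper's exact rule)."""
    keep = []
    for j, table in enumerate(contingency_tables):
        mi = mi_posterior_samples(table)
        if np.mean(mi > eps) >= credibility:
            keep.append(j)
    return keep

if __name__ == "__main__":
    # Toy example: one informative and one uninformative binary feature.
    informative = np.array([[40, 10], [10, 40]])
    noise = np.array([[25, 25], [25, 25]])
    mi = mi_posterior_samples(informative)
    print(f"posterior mean {mi.mean():.4f}, std {mi.std():.4f}")
    print("selected features:", select_features([informative, noise]))

Monte Carlo sampling is used here purely for brevity; the paper's contribution is precisely that the mean can be obtained exactly and the variance approximated analytically, so that the filter can be applied quickly and incrementally without sampling.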

