Physics – Data Analysis – Statistics and Probability
Scientific paper
2006-05-23
Histograms are convenient non-parametric density estimators, which continue to be used ubiquitously. Summary quantities estimated from histogram-based probability density models depend on the choice of the number of bins. In this paper we introduce a straightforward data-based method of determining the optimal number of bins in a uniform bin-width histogram. Using the Bayesian framework, we derive the posterior probability for the number of bins in a piecewise-constant density model given the data. The most probable solution is determined naturally by a balance between the likelihood function, which increases with increasing number of bins, and the prior probability of the model, which decreases with increasing number of bins. We demonstrate how these results outperform several well-accepted rules for choosing bin sizes. In addition, we examine the effects of small sample sizes and digitized data. Last, we demonstrate that these results can be applied directly to multi-dimensional histograms.
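The balance between likelihood and prior described above can be sketched in a few lines of Python. The abstract does not state the posterior itself. The expression used here is the relative log posterior from the full paper, log p(M|d) = N log M + log Γ(M/2) − M log Γ(1/2) − log Γ(N + M/2) + Σ_k log Γ(n_k + 1/2) + const, where N is the sample size, M the number of bins, and n_k the count in bin k. The function names `log_posterior` and `optimal_bins`, the `max_bins` search limit, and the choice to span the bins from the sample minimum to the sample maximum are assumptions made for this illustration, not details taken from the source.

```python
import math
import random


def log_posterior(data, m):
    """Relative log posterior for m equal-width bins (up to an additive constant)."""
    n = len(data)
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / m

    # Count samples per bin; clamp so the maximum value lands in the last bin.
    counts = [0] * m
    for x in data:
        k = min(int((x - lo) / width), m - 1)
        counts[k] += 1

    return (
        n * math.log(m)
        + math.lgamma(m / 2)
        - m * math.lgamma(0.5)
        - math.lgamma(n + m / 2)
        + sum(math.lgamma(c + 0.5) for c in counts)
    )


def optimal_bins(data, max_bins=100):
    """Brute-force search for the bin count with the highest log posterior."""
    return max(range(1, max_bins + 1), key=lambda m: log_posterior(data, m))


if __name__ == "__main__":
    random.seed(0)
    # Illustrative synthetic sample, not data from the paper.
    sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
    print("most probable number of bins:", optimal_bins(sample))
```

The brute-force scan over M is the simplest option and is usually cheap enough for one-dimensional data. The upper limit `max_bins` should be raised if the maximum is found at the edge of the search range.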
Optimal Data-Based Binning for Histograms