Computer Science – Data Structures and Algorithms
Scientific paper
2011-07-13
Learning Poisson Binomial Distributions
We consider a basic problem in unsupervised learning: learning an unknown \emph{Poisson Binomial Distribution} over $\{0,1,\ldots,n\}$. A Poisson Binomial Distribution (PBD) is a sum $X = X_1 + \cdots + X_n$ of $n$ independent Bernoulli random variables, which may have arbitrary expectations. We work in a framework where the learner is given access to independent draws from the distribution and must (with high probability) output a hypothesis distribution that has total variation distance at most $\epsilon$ from the unknown target PBD. As our main result, we give a highly efficient algorithm that learns to $\epsilon$-accuracy using $\tilde{O}(1/\epsilon^3)$ samples, independent of $n$. The running time of the algorithm is \emph{quasilinear} in the size of its input data, i.e., $\tilde{O}(\log(n)/\epsilon^3)$ bit operations (observe that each draw from the distribution is a $\log(n)$-bit string). The sample complexity is nearly optimal, since any algorithm must use $\Omega(1/\epsilon^2)$ samples. We also give positive and negative results for some extensions of this learning problem.
Daskalakis Constantinos
Diakonikolas Ilias
Servedio Rocco A.
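To make the definitions in the abstract concrete, the following Python sketch builds the exact probability mass function of a small PBD, draws independent samples from it, and measures the total variation distance between the empirical histogram and the true distribution. It does not attempt to reproduce the algorithm from the paper: the empirical histogram is only a naive baseline, and all function names and the example parameters are illustrative assumptions.

```python
import random


def pbd_pmf(ps):
    """Exact pmf of X = X_1 + ... + X_n, where X_i ~ Bernoulli(ps[i]).

    Computed by dynamic programming: fold in one Bernoulli at a time.
    """
    pmf = [1.0]  # distribution of the empty sum: P[X = 0] = 1
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # the new variable is 0
            nxt[k + 1] += mass * p      # the new variable is 1
        pmf = nxt
    return pmf


def sample_pbd(ps, rng):
    """Draw one sample from the PBD with parameters ps."""
    return sum(1 for p in ps if rng.random() < p)


def empirical_pmf(samples, n):
    """Histogram of samples over {0, 1, ..., n}, normalized to sum to 1."""
    counts = [0] * (n + 1)
    for s in samples:
        counts[s] += 1
    m = len(samples)
    return [c / m for c in counts]


def total_variation(p, q):
    """Total variation distance between two pmfs on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, chosen arbitrarily
    ps = [0.1, 0.5, 0.9, 0.3]       # hypothetical Bernoulli expectations
    target = pbd_pmf(ps)
    draws = [sample_pbd(ps, rng) for _ in range(20000)]
    hypothesis = empirical_pmf(draws, len(ps))
    print("TV distance:", total_variation(hypothesis, target))
```

The naive histogram treats the target as an arbitrary distribution on $n+1$ points, so the number of samples it needs generally grows with $n$. The result stated in the abstract is stronger precisely because its sample bound does not depend on $n$.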