Convergence Rates for Mixture-of-Experts

Mathematics – Statistics Theory

Scientific paper

In the mixture-of-experts (ME) model, where a number of submodels (experts) are combined, there are two longstanding problems: (i) how many experts should be chosen, given the size of the training data? (ii) given the total number of parameters, is it better to use a few very complex experts or to combine many simple experts? In this paper, we try to provide some insight into these problems through a theoretical study of an ME structure in which $m$ experts are mixed, with each expert being related to a polynomial regression model of order $k$. We study the convergence rate of the maximum likelihood estimator (MLE), measured by how fast the Kullback-Leibler divergence between the estimated density and the true density converges to zero as the sample size $n$ increases. The convergence rate is found to depend on both $m$ and $k$, and certain choices of $m$ and $k$ are found to produce optimal convergence rates. These results therefore shed light on the two aforementioned problems: how to choose $m$, and how $m$ and $k$ should be traded off to achieve good convergence rates.
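
To make the roles of $m$ and $k$ concrete, the following is a minimal sketch of the conditional density of one possible ME model. The abstract only says that each expert is "related to a polynomial regression model of order $k$", so everything else here is an assumption made for illustration: a scalar input, a softmax gate that is linear in the input, Gaussian experts, and a shared noise scale. The function name `me_conditional_density` and its parameters are hypothetical and are not taken from the paper.

```python
import math


def me_conditional_density(y, x, gate_params, expert_coefs, sigma=1.0):
    """Illustrative density p(y | x) for a mixture of m polynomial experts.

    gate_params  : m pairs (slope, intercept) for an assumed linear softmax gate
    expert_coefs : m lists of k + 1 polynomial coefficients, lowest order first
    sigma        : assumed shared Gaussian noise scale
    """
    # Softmax gating weights; subtracting the max keeps exp() from overflowing.
    logits = [slope * x + intercept for slope, intercept in gate_params]
    top = max(logits)
    unnormalised = [math.exp(value - top) for value in logits]
    total = sum(unnormalised)
    gates = [value / total for value in unnormalised]

    density = 0.0
    for gate, coefs in zip(gates, expert_coefs):
        # Each expert's mean is a polynomial of order k in x.
        mean = sum(c * x ** j for j, c in enumerate(coefs))
        z = (y - mean) / sigma
        normal_pdf = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        density += gate * normal_pdf
    return density


# Example with m = 2 experts of order k = 1 (hypothetical parameter values).
gate_params = [(1.0, 0.0), (-1.0, 0.5)]
expert_coefs = [[0.0, 2.0], [1.0, -1.0]]
p = me_conditional_density(0.3, 0.7, gate_params, expert_coefs)
```

Under these assumptions, increasing $m$ adds rows to `gate_params` and `expert_coefs`, while increasing $k$ lengthens each row of `expert_coefs`, which is the trade-off between many simple experts and a few complex ones that the abstract describes.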
