Feature selection for high-dimensional integrated data

Statistics – Applications

Scientific paper

Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of \emph{feature selection} in which only a subset of the predictors, $X_t$, depends on the multidimensional variate $Y$, while the remaining predictors constitute a "noise set" $X_u$ that is independent of $Y$. Using Monte Carlo simulations, we investigate the relative performance of two methods, thresholding and singular-value decomposition (SVD), in combination with stochastic optimization to determine "empirical bounds" on the small-sample accuracy of an asymptotic approximation. We demonstrate the utility of the thresholding and SVD feature selection methods on a recent infant intestinal gene expression and metagenomics dataset.
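The setup described in the abstract can be illustrated with a minimal sketch. This is not the paper's procedure: the abstract does not specify how thresholding or the SVD step is carried out, so everything below is an assumption made for illustration. The hypothetical `simulate` function builds toy data in which each signal predictor is the sum of the coordinates of $Y$ plus noise; `select_by_threshold` ranks predictors by their largest absolute sample correlation with any coordinate of $Y$; `select_by_svd` ranks them by their loading on the leading left singular vector of the sample cross-correlation matrix. The stochastic optimization step mentioned in the abstract is not covered.

```python
import numpy as np

def simulate(n=200, p_signal=5, p_noise=45, q=3, seed=0):
    """Toy data (assumed design): the first p_signal columns of X depend on Y,
    the remaining p_noise columns are independent noise."""
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((n, q))
    # Hypothetical link: each signal predictor is the sum of Y's coordinates plus noise.
    X_t = Y.sum(axis=1, keepdims=True) + 0.5 * rng.standard_normal((n, p_signal))
    X_u = rng.standard_normal((n, p_noise))  # noise set, independent of Y
    return np.hstack([X_t, X_u]), Y

def cross_correlation(X, Y):
    """Sample cross-correlation matrix between columns of X and columns of Y (p x q)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    return Xs.T @ Ys / X.shape[0]

def select_by_threshold(X, Y, k):
    """Keep the k predictors with the largest absolute correlation with any coordinate of Y."""
    score = np.abs(cross_correlation(X, Y)).max(axis=1)
    return np.sort(np.argsort(score)[-k:])

def select_by_svd(X, Y, k):
    """Keep the k predictors with the largest absolute loading on the
    leading left singular vector of the cross-correlation matrix."""
    U, s, Vt = np.linalg.svd(cross_correlation(X, Y), full_matrices=False)
    score = np.abs(U[:, 0])
    return np.sort(np.argsort(score)[-k:])

if __name__ == "__main__":
    X, Y = simulate()
    print("thresholding:", select_by_threshold(X, Y, k=5))
    print("SVD:         ", select_by_svd(X, Y, k=5))
```

On toy data this strongly separated, both selectors should recover the first five columns. Using only the leading singular vector is a simplification that suits this rank-one toy design; signal spread over several directions would need more than one singular vector.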
