Estimating and Sampling Graphs with Multidimensional Random Walks

Computer Science – Data Structures and Algorithms

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Scientific paper

Estimating characteristics of large graphs via sampling is a vital part of the study of complex networks. Current sampling methods such as (independent) random vertex and random walks are useful but have drawbacks. Random vertex sampling may require too many resources (time, bandwidth, or money). Random walks, which normally require fewer resources per sample, can suffer from large estimation errors in the presence of disconnected or loosely connected graphs. In this work we propose a new $m$-dimensional random walk that uses $m$ dependent random walkers. We show that the proposed sampling method, which we call Frontier sampling, exhibits all of the nice sampling properties of a regular random walk. At the same time, our simulations over large real world graphs show that, in the presence of disconnected or loosely connected components, Frontier sampling exhibits lower estimation errors than regular random walks. We also show that Frontier sampling is more suitable than random vertex sampling to sample the tail of the degree distribution of the graph.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Estimating and Sampling Graphs with Multidimensional Random Walks does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with Estimating and Sampling Graphs with Multidimensional Random Walks, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Estimating and Sampling Graphs with Multidimensional Random Walks will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-124110

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.