Providing Diversity in K-Nearest Neighbor Query Results

Computer Science – Databases

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

20 pages, 11 figures

Scientific paper

Given a point query Q in multi-dimensional space, K-Nearest Neighbor (KNN) queries return the K closest answers according to given distance metric in the database with respect to Q. In this scenario, it is possible that a majority of the answers may be very similar to some other, especially when the data has clusters. For a variety of applications, such homogeneous result sets may not add value to the user. In this paper, we consider the problem of providing diversity in the results of KNN queries, that is, to produce the closest result set such that each answer is sufficiently different from the rest. We first propose a user-tunable definition of diversity, and then present an algorithm, called MOTLEY, for producing a diverse result set as per this definition. Through a detailed experimental evaluation on real and synthetic data, we show that MOTLEY can produce diverse result sets by reading only a small fraction of the tuples in the database. Further, it imposes no additional overhead on the evaluation of traditional KNN queries, thereby providing a seamless interface between diversity and distance.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Providing Diversity in K-Nearest Neighbor Query Results does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with Providing Diversity in K-Nearest Neighbor Query Results, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Providing Diversity in K-Nearest Neighbor Query Results will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-527504

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.