Mining Top-k Approximate Frequent Patterns

Computer Science – Databases

Scientific paper

Details

13 pages

Frequent pattern (itemset) mining in transactional databases is one of the most well-studied problems in data mining. One obstacle that limits its practical use is the extremely large number of patterns generated: an output collection of that size is difficult for users to understand and apply. Even restricting the output to the border of the frequent itemset collection does little to alleviate the problem. In this paper we address the issue of overwhelmingly large output by introducing and studying the following problem: mining top-k approximate frequent patterns, that is, selecting k itemsets to represent the output. The union of the power sets of these k sets should satisfy two conditions: (1) it includes as many itemsets with larger support as possible, and (2) it includes as few itemsets with smaller support as possible. An integrated objective function is designed to combine these two objectives. We then derive upper bounds on the objective function and present an approximate branch-and-bound method for finding a feasible solution. We give empirical evidence that our formulation and approximation methods work well in practice.
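
To make the two conditions concrete, here is a minimal Python sketch on a toy database. The abstract does not state the paper's objective function, so everything below is an assumption made for illustration: the names `support`, `covered_itemsets`, and `toy_score`, the `min_support` threshold, and the linear `penalty` term are all hypothetical. The sketch only shows how the union of the power sets of k chosen itemsets can be enumerated and scored by counting covered high-support itemsets against covered low-support ones.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def covered_itemsets(patterns):
    """Union of the power sets of the given patterns (non-empty subsets only)."""
    covered = set()
    for pattern in patterns:
        items = sorted(pattern)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                covered.add(frozenset(combo))
    return covered


def toy_score(patterns, transactions, min_support=0.5, penalty=1.0):
    """Hypothetical objective (an assumption, not the paper's formula):
    reward covered itemsets with support >= min_support and
    penalise covered itemsets below that threshold."""
    covered = covered_itemsets(patterns)
    frequent = sum(1 for s in covered if support(s, transactions) >= min_support)
    infrequent = len(covered) - frequent
    return frequent - penalty * infrequent


# Toy transactional database with four transactions.
transactions = [
    frozenset(t)
    for t in ({"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"})
]

# Two small patterns cover five itemsets, all with support >= 0.5.
print(toy_score([{"a", "b"}, {"a", "c"}], transactions))  # 5.0

# One larger pattern also covers {b, c} and {a, b, c}, whose support is 0.25.
print(toy_score([{"a", "b", "c"}], transactions))  # 3.0
```

In this toy case the pair of two-item patterns scores higher than the single three-item pattern, because the larger pattern pulls low-support subsets into the covered collection. Exhaustively enumerating power sets is exponential in pattern length, which is consistent with the abstract's choice of bounds and a branch-and-bound search rather than brute force.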
