Computer Science – Databases
Scientific paper
2012-03-19
Journal of Computing, Volume 4, Issue 2, February 2012, 48-55
ISSN 2151-9617
Clustering is one of the main tasks in exploratory data analysis and descriptive statistics, where the main objective is to partition observations into groups. Clustering has a broad range of applications in varied domains such as climate, business, information retrieval, biology, and psychology, to name a few. A variety of methods and algorithms have been developed for clustering tasks in the last few decades. We observe that most of these algorithms define a cluster in terms of attribute values, density, distance, etc. However, these definitions fail to attach clear meaning or semantics to the generated clusters. We argue that clusters with understandable and distinct semantics, defined in terms of quartiles or halves, are more appealing to business analysts than clusters defined by data boundaries or prototypes. On the same premise, we propose a new algorithm, the quartile clustering technique. Through a series of experiments we establish the efficacy of this algorithm. We demonstrate that, compared to K-means, the quartile clustering technique adds clear meaning to each cluster. We use the DB Index to measure the goodness of the clusters and show that our method is comparable to EM (Expectation Maximization), PAM (Partitioning Around Medoids) and K-means. We have explored its capability in detecting outliers and the benefit of the added semantics. We discuss some limitations of the technique in its present form and provide a rough direction for addressing the issue of merging the generated clusters.
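The abstract does not spell out the algorithm, so the following is only a minimal sketch of what "clusters defined in terms of quartiles" could look like, not the authors' method. It assumes each attribute is cut at its own quartiles and that a cluster is the tuple of per-attribute quartile labels. The function names `quartile_labels` and `quartile_clusters` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def quartile_labels(values):
    """Label each value Q1..Q4 using the column's own quartile cut points."""
    # Assumption: inclusive quartiles; the paper may use a different convention.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v <= q1:
            labels.append("Q1")
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels


def quartile_clusters(rows):
    """Group row indices by their tuple of per-attribute quartile labels."""
    columns = list(zip(*rows))
    per_column = [quartile_labels(col) for col in columns]
    clusters = defaultdict(list)
    for index, key in enumerate(zip(*per_column)):
        clusters[key].append(index)
    return dict(clusters)


# Hypothetical toy data: (monthly_spend, visits) for eight customers.
rows = [(10, 1), (20, 2), (30, 3), (40, 4), (50, 5), (60, 6), (70, 7), (80, 8)]
clusters = quartile_clusters(rows)

for key, members in sorted(clusters.items()):
    print(key, members)
```

Under these assumptions each cluster key reads directly as a description, for example ("Q4", "Q4") for customers in the top quartile on both attributes, which is the kind of semantics the abstract contrasts with centroid-based labels.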
Chakrabarti Amlan
Goswami Saptarsi
Quartile Clustering: A quartile based technique for Generating Meaningful Clusters