Computer Science – Data Structures and Algorithms
Scientific paper
2012-03-22
13 pages
Measurement data, snapshots of a system, and traffic or activity logs are typically collected repeatedly. Difference queries, which identify and measure change, are central to anomaly detection, monitoring, and planning. When the data is sampled, as is often necessary to meet resource constraints, queries need to be processed over the sampled data. Surprisingly, however, we are not aware of satisfactory pre-existing estimators even for Euclidean distances. We derive estimators for $L_p$ ($p$-norm) distances that are nonnegative and variance optimal in a Pareto sense. Our estimators are suitable for independent or coordinated samples of the data and have provably strong properties. For coordinated sampling we present two estimators that trade off variance according to the similarity of the data. Moreover, one of the estimators has the property that, for all data, its variance is close to the minimum possible for that data. We study the performance of our estimators for Manhattan and Euclidean distances ($p=1,2$) on diverse datasets, demonstrating scalability and accuracy.
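To make the terminology concrete, the following is a minimal sketch of an exact $L_p$ difference query over two data instances, together with one common way to draw coordinated or independent samples. It is not the estimators derived in the paper. The function names (`lp_distance`, `seed`, `pps_sample`), the hash-based seed, the threshold `tau`, and the example data are all assumptions made for illustration.

```python
import hashlib

def lp_distance(a, b, p):
    """Exact L_p distance between two instances (dicts: key -> nonnegative value)."""
    keys = set(a) | set(b)
    total = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) ** p for k in keys)
    return total ** (1.0 / p)

def seed(key, salt=""):
    """Hypothetical seed function: maps a key to a reproducible number in [0, 1)."""
    digest = hashlib.sha256((salt + str(key)).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2.0 ** 64

def pps_sample(instance, tau, salt=""):
    """Keep key k when value >= seed(k) * tau, i.e. with probability min(1, value / tau)."""
    return {k: v for k, v in instance.items()
            if v > 0 and v >= seed(k, salt) * tau}

# Illustrative data only: two snapshots of the same system.
monday = {"a": 5.0, "b": 3.0, "c": 0.0}
tuesday = {"a": 2.0, "b": 3.0, "d": 4.0}

manhattan = lp_distance(monday, tuesday, 1)  # 3 + 0 + 0 + 4 = 7.0
euclidean = lp_distance(monday, tuesday, 2)  # sqrt(9 + 16) = 5.0

# Coordinated samples: both instances use the same seed for each key.
coord_mon = pps_sample(monday, tau=6.0, salt="shared")
coord_tue = pps_sample(tuesday, tau=6.0, salt="shared")

# Independent samples: each instance uses its own seeds.
indep_mon = pps_sample(monday, tau=6.0, salt="monday")
indep_tue = pps_sample(tuesday, tau=6.0, salt="tuesday")
```

With shared seeds, a key that has the same value in both instances is either sampled in both or in neither, so similar data yields similar samples. With independent seeds, the two inclusion decisions are unrelated. Estimating `manhattan` or `euclidean` from such samples alone is the problem the abstract addresses, and this sketch does not attempt it.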
Cohen Edith
Kaplan Haim
How to Estimate Change from Samples
Profile ID: LFWR-SCP-O-381400