The Boundary Between Privacy and Utility in Data Anonymization

Computer Science – Databases

Scientific paper


We consider the privacy problem in data publishing: given a relation I containing sensitive information, 'anonymize' it to obtain a view V such that, on the one hand, attackers cannot learn any sensitive information from V and, on the other hand, legitimate users can use V to compute useful statistics on I. These goals conflict. We use a definition of privacy derived from existing ones in the literature, which relates the a priori probability of a given tuple t, Pr(t), to the a posteriori probability, Pr(t | V), and we propose a novel and quite practical definition of utility. Our main result is the following. Let n denote the size of I and m the size of the domain from which I was drawn (so n < m). When the a priori probability is Pr(t) = Omega(n/sqrt(m)) for some t, there exists no useful anonymization algorithm; when Pr(t) = O(n/m) for all tuples t, we give a concrete anonymization algorithm that is both private and useful. Our algorithm is quite different from the k-anonymization algorithm studied intensively in the literature, and is based on random deletions and insertions to I.
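
The abstract says only that the algorithm relies on random deletions and insertions to I; it does not give the probabilities or the query estimator. The following Python sketch is therefore an illustration under stated assumptions, not the paper's algorithm: the function names `randomize_relation` and `estimate_count` and the parameters `keep_prob` and `insert_prob` are hypothetical. It assumes each tuple of I is retained independently with probability `keep_prob` and each domain tuple outside I is inserted independently with probability `insert_prob`, and it shows how a counting query might then be estimated from V alone.

```python
import random


def randomize_relation(instance, domain, keep_prob, insert_prob, rng=None):
    """Return a view V built from I by random deletions and insertions.

    Assumption (not from the abstract): tuples in I survive with
    probability keep_prob; tuples in the domain but outside I are
    added with probability insert_prob, all independently.
    """
    rng = rng or random.Random()
    instance = set(instance)
    view = set()
    for t in domain:  # one pass over the whole domain, O(m)
        p = keep_prob if t in instance else insert_prob
        if rng.random() < p:
            view.add(t)
    return view


def estimate_count(view, domain, predicate, keep_prob, insert_prob):
    """Estimate how many tuples of I satisfy predicate, using only V.

    With q tuples of the domain matching the predicate and c of them in I,
    E[matches in V] = keep_prob * c + insert_prob * (q - c),
    so solving for c gives an unbiased estimate.
    """
    if keep_prob == insert_prob:
        raise ValueError("keep_prob must differ from insert_prob")
    q = sum(1 for t in domain if predicate(t))
    observed = sum(1 for t in view if predicate(t))
    return (observed - insert_prob * q) / (keep_prob - insert_prob)


if __name__ == "__main__":
    domain = set(range(1000))        # m = 1000
    instance = set(range(0, 100))    # n = 100
    view = randomize_relation(instance, domain, 0.6, 0.05, random.Random(7))
    print(estimate_count(view, domain, lambda t: t < 50, 0.6, 0.05))
```

Iterating over the full domain keeps the sketch simple but costs time linear in m; a practical implementation would need to sample the inserted tuples without enumerating the domain.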
