Cooperative Game Theory Based Privacy Preserving Techniques for Data Publishing

L. S Chakravarty

Awarded 2015

In recent years, the nature of the adversary’s behaviour is a witness for the drastic increase of the need for new mechanisms demanding Privacy Preserving Data Publishing (PPDP). Meeting this demand is quite challenging since privacy is subject to several constraints such as adversary background knowledge, information loss, and utility. For instance, in record linkage problem of privacy the adversary can easily identify the individual’s sensitive information like age, disease, by linking
the published data with other data sources.

Enormous data are being collected on different aspects of people’s lives such as their credit card transaction records, phone call and email lists, personal health information and web browsing habits. Most of this data is to be scanned for important information for other purposes like evolution of credit risk. Such analysis of private information often raises concerns regarding the privacy rights of individuals’ and organizations. Disobedience to privacy protocols and adversary’s background information may lead to privacy attacks. In this regard, privacy community people have proposed different noteworthy principles like k-Anonymity, l-Diversity and -Differential privacy.

Cooperation among objects like people, organization etc., has recently emerged as a novel paradigm that can yield tremendous performance gains and enables efficient and new models for privacy preserving data publishing. This thesis focuses on how Cooperative Game Theory (CGT) helps in the study of how an agent behaves, in cooperative manner, in formation of groups in data privacy context.

To overcome some of the shortcomings in k-Anonymity, this work presents new approaches to achieve k-Anonymity using Concept Hierarchy Trees (CHT) based on CGT. The observed shortcomings in existing works are: i) The anonymization is done based on an assumed value of k, ii) the information loss can be found only after anonymization is done, iii) if the information loss is found to be more than the affordable loss then another value of k is to be considered and the whole process of anonymity has to be repeated. To achieve anonymity, coalitions are established between the tuples based on their payoffs which are assigned using (CHT) of
Quasi Identifiers (QID). The new system allows the user to control the required information loss as anonymization metric in data publishing. It also presents a new approach to reduce information loss and in turn reduces anonymization level while achieving k-Anonymity using CHT with weighted levels. The experimental results showing the practicality and scalability are presented.

Co-Operative Privacy Game (CoPG) is proposed in this work to achieve data
privacy. The main objective of the CoPG is to obtain individual’s (player) privacy as a goal that is rationally interested in other individuals (players) privacy. CoPG is formally defined in terms of Nash equilibria, i.e., all the players are in their best coalition, to achieve k-anonymity. The Cooperative Value (CoV ) of each tuple are measured using the characteristic function of the CoPG to identify the coalitions. As the underlying game is convex; the algorithm is efficient and yields high quality coalition formation with respect to intensity and disperse. The variations of the information loss with the parameters β (weight factor of ’nearness’) and γ (multiplicity) are analyzed and the obtained results are discussed.