Privacy Inclined Sequential Publishing of dynamic Social Network data

Jyothi Vadisala, Awarded

Social networks are being applied to a wide variety of applications. Social networks model the social activities between individuals, that change as time goes by. Data privacy is a major problem that has to be considered for dynamic networks before releasing network data to the public or third parties like researchers and analyzers that would compute statistics or make a deep analysis of these data. As the demand for dynamic social network analysis increases, the privacy issue in sequential publishing of social network needs to be considered. In this context, many different anonymization techniques have been proposed in the literature.

This thesis aims to address the privacy risks in dynamic social networks and to contribute a few techniques for data anonymization while optimising the privacy- utility tradeoff. Specifically, the proposed models deal with emerging privacy issues in sequential publishing of social network data. These type of data are derived from activities of our everyday life, thus the privacy of such data is impor- tant. The main objective of this thesis work is divided into two research problems.

First research problem, addresses the privacy risks of identity disclosures in sequential releases of a dynamic network. The aim is to propose a privacy model that provides adequate privacy protection and retains enough graph data utility in social network data publication. Specifically, this proposed kw-NMF anonymity model, protects identity disclosure in the context of number of common friends as a background knowledge of an adversary. To achieve this kw-NMF anonymity Number of Mutual Friend Anonymization (NMFA) greedy algorithm is proposed for large scale graph data by adding edges in anonymization process.

Performance evaluation is done on both real and synthetic data sets. The results show that the dynamic kw-NMF anonymity model can retain much of the characteristics of the network while confirming the privacy protection. The metrics used for evaluation are Graph Modification, Average Shortest Path Lengths and Average Clustering Coefficients from the aspects of protections against different information hysteresis (w) and different protection level requirements (k). Experimental results shows that the proposed algorithm can retain much of the characteristics of a dynamic network while confirming the privacy protection.

Second research problem, focuses on how the anonymization methods preserves in the existing communities of the original social networks. The k-degree and k-NMF anonmyization methods are used for anonymizing original social network and de- tect the communities for both original and anymomized networks by heuristic algorithm modular optimization Louvain method. The preservation of community is measured by Percentage of Naive Community Preservation (%NCP) and Percentage of Community Preservation at Node Level (%CPNL).

The performances of the two anonymized social networks are compared with the original social network. The experimental results show that the k-NMF anonymization method preserves the most of the communities of the original social network than k-degree anonymization method.