Preserving Individual Privacy in Big Data Analytics
We live in the age of big data. With a growing number of people, devices, and sensors connected to digital networks, individual data can now be collected at scale and analyzed by various applications for social good as well as for commercial interests. However, the data generated by individual users contain unique behavioral patterns and sensitive information, and therefore must be transformed prior to release for analysis. The AOL search log release in 2006 is an example of a privacy catastrophe, in which the searches of an innocent citizen were quickly re-identified by a newspaper journalist. In this talk, I will present a novel framework for continually releasing aggregate statistics over private data, targeting an important class of real-time data analytics such as disease outbreak detection and traffic monitoring. The key innovation is that the framework captures the underlying dynamics of the continual aggregate statistics with time series state-space models and simultaneously adopts filtering techniques to correct the observed, noisy data. I will show that the framework provides a rigorous, provable privacy guarantee to individual data contributors without compromising the output analysis results. Towards the end, I will also share my vision for privacy-preserving big data analytics and various research opportunities.
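To make the perturb-then-filter idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the framework presented in the talk: the abstract does not name the privacy mechanism, the state-space model, or the filter, so the sketch assumes the Laplace mechanism with the privacy budget split evenly across time steps, a random-walk state model, and a scalar Kalman filter that treats the Laplace noise as if it were Gaussian with the same variance. All function names and parameter values are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def kalman_filter(observations, process_var, noise_var):
    """Scalar Kalman filter for a random-walk state observed with additive noise.

    Assumed model (chosen for illustration, not taken from the talk):
        x[k] = x[k-1] + w,  Var(w) = process_var
        z[k] = x[k]   + v,  Var(v) = noise_var
    """
    if not observations:
        return []
    # Start from the first noisy observation, with its noise variance as uncertainty.
    estimate, error_var = observations[0], noise_var
    estimates = [estimate]
    for z in observations[1:]:
        # Predict: a random walk keeps the mean and inflates the uncertainty.
        prior_var = error_var + process_var
        # Correct: blend the prediction and the noisy observation via the Kalman gain.
        gain = prior_var / (prior_var + noise_var)
        estimate = estimate + gain * (z - estimate)
        error_var = (1.0 - gain) * prior_var
        estimates.append(estimate)
    return estimates


def release_private_series(true_counts, epsilon, process_var, rng):
    """Perturb each count with Laplace noise, then filter the noisy series."""
    # Assumption: one individual changes any single count by at most 1, and the
    # total budget epsilon is split evenly over all time steps, so each step
    # uses epsilon / T and the Laplace scale is T / epsilon.
    scale = len(true_counts) / epsilon
    noisy = [c + laplace_noise(scale, rng) for c in true_counts]
    # A Laplace(0, b) variable has variance 2 * b**2; the filter treats the
    # noise as Gaussian with that variance, which is an approximation.
    filtered = kalman_filter(noisy, process_var, 2.0 * scale ** 2)
    return noisy, filtered


def mean_abs_error(estimates, truth):
    """Average absolute deviation between two equal-length series."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


# Illustrative run on a synthetic, constant series (values are arbitrary).
demo_rng = random.Random(0)
demo_truth = [100.0] * 300
demo_noisy, demo_filtered = release_private_series(
    demo_truth, epsilon=30.0, process_var=1.0, rng=demo_rng
)
print("noisy MAE:   ", mean_abs_error(demo_noisy, demo_truth))
print("filtered MAE:", mean_abs_error(demo_filtered, demo_truth))
```

The filtering step reads only the already-perturbed series, so it is post-processing and does not weaken the differential privacy guarantee of the noisy release. The even budget split and the random-walk model are the simplest choices available; a real deployment would need to justify the sensitivity bound and tune the process variance to the data.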
Liyue Fan is a postdoctoral research associate in the Integrated Media Systems Center at the University of Southern California. She obtained her PhD in Computer Science & Informatics from Emory University and her BSc in Mathematics from Zhejiang University in China. Her primary research interests are databases, data analytics, and data privacy. She received the CCC Blue Sky Ideas award for her vision paper on Privacy-Preserving Social Relationship Inference. She was named a “Rising Star in EECS” by MIT in 2015.