Private Mouse and Keyboard Behaviour Dataset


Description: Mouse and keyboard datasets can include sensitive personal data (e.g. login credentials, banking information, or text messages). Differential privacy [1] allows data scientists to train behaviour models without collecting the raw inputs from users.

Goal: Use differential privacy to let data scientists train models on mouse and keyboard data that they cannot see. Steps:
1- Deploy a domain node using HAGrid [2].
2- Deploy a network node that collects data from different domain nodes and handles the network requests using PySyft and PyGrid [3].
3- Data owners can upload datasets to domain nodes. Once a dataset is uploaded, noise is added to it via differential privacy.
4- Data scientists can log into the network, get a privacy budget and run machine learning models.
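
Steps 3 and 4 rely on two ideas: adding calibrated noise and charging each query against a privacy budget. The following is a minimal sketch of both using the Laplace mechanism from [1], written in plain Python. It does not use the PySyft or PyGrid API; the names PrivacyBudget, laplace_noise and private_mean are hypothetical helpers chosen for illustration, and the sketch assumes the number of records is public and that every value can be clipped to a known range.

```python
import random


class PrivacyBudget:
    """Hypothetical helper: tracks the epsilon a data scientist has left."""

    def __init__(self, epsilon: float) -> None:
        self.remaining = epsilon

    def spend(self, epsilon: float) -> None:
        # Refuse the query if it would exceed the remaining budget.
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, budget, rng=None):
    """Return an epsilon-differentially-private mean of `values`.

    Assumes the record count is public and values are clipped to [lower, upper].
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = rng or random.Random()
    budget.spend(epsilon)  # charge the query before releasing anything

    # Clipping bounds the influence of any single record.
    clipped = [min(max(v, lower), upper) for v in values]

    # Replacing one record changes the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Made-up inter-keystroke intervals in milliseconds, for illustration only.
    intervals_ms = [120.0, 95.0, 300.0, 80.0]
    budget = PrivacyBudget(epsilon=1.0)
    noisy = private_mean(intervals_ms, 0.0, 500.0, epsilon=0.5, budget=budget)
    print(f"noisy mean: {noisy:.1f} ms, budget left: {budget.remaining}")
```

A smaller epsilon gives a larger noise scale and stronger privacy, and each query reduces the remaining budget until further queries are refused.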

Supervisors: Mayar Elfares and Guanhua Zhang

Distribution: 20% Literature, 60% Implementation, 20% Analysis

Requirements: Good Python skills, good knowledge of operating systems and databases

Literature:

[1] Dwork, Cynthia and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science. 9(3-4), pp. 211-407.

[2] HAGrid: https://pypi.org/project/hagrid/

[3] PySyft: https://github.com/OpenMined/PySyft