Privacy-prEserving Data-sharing framework (PriED)

THE CHALLENGE

The Industrial Internet provides a communication and computation platform that enables the collection of large-scale data for machine learning tasks. Despite the promise of accelerating model training and deployment, and of optimization through shared data sets, adoption of such technologies is hindered by growing concerns over information privacy, which limit interoperability. While prior work has largely explored privacy-preserving mechanisms, designing data selection mechanisms that model the similarity of data owners, and thereby better facilitate partnerships, remains non-trivial.

OUR SOLUTION

Motivated by the lack of effective data-sharing mechanisms for heterogeneous machine learning tasks in the Manufacturing Industrial Internet, the Lourentzou and Jin labs at VT propose PriED, a novel task-driven data-sharing framework that combines shared and local data from participating enterprises to improve the performance of supervised learning methods. The framework combines (1) privacy-preserving generative models that facilitate proxy-data exchange, and (2) an attention-based dynamic data selection strategy trained with reinforcement learning. We have demonstrated performance improvements on a real semiconductor manufacturing case.
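As a rough illustration of component (1), each participant can release synthetic proxy records drawn from a generative model fit to its private data, rather than the raw records themselves. The sketch below uses a per-feature Gaussian fit purely as a stand-in; PriED's actual generative models and any formal privacy guarantees (e.g., differentially private training) are not reproduced here, and all names and shapes are illustrative assumptions.

```python
# Minimal sketch, NOT the authors' method: a plain Gaussian fit stands in
# for a privacy-preserving generative model used for proxy-data exchange.
import numpy as np

rng = np.random.default_rng(0)

# Private local data held by one participant (illustrative values).
X_private = rng.normal(loc=2.0, scale=0.5, size=(500, 4))

# Fit a per-feature Gaussian and sample proxy records from it; only the
# fitted parameters and the synthetic samples ever leave the participant.
mu, sigma = X_private.mean(axis=0), X_private.std(axis=0)
X_proxy = rng.normal(loc=mu, scale=sigma, size=(200, 4))
```

The proxy set preserves coarse statistics of the private data, enough for a data receiver to judge usefulness, without exposing any raw rows.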

Certain industrial processes, such as the Czochralski crystal growth process, could directly benefit from this method of data sharing because of the high proportion of defective samples relative to normal samples, which complicates modeling efforts.

Data from multiple stakeholders (data owners) are distilled into low-dimensional vector representations while preserving data privacy. An attention-based dynamic data selection mechanism progressively learns to retrieve data from participants according to how much their data contributes to the downstream task performance of the respective data receiver. PriED is trained end-to-end with reinforcement learning, which allows flexible reward mechanisms, e.g., incentivizing correct predictions for the minority class in an anomaly detection task.
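The selection loop described above can be sketched as a simple policy-gradient bandit: attention logits over participants are updated with REINFORCE using a scalar reward. This is a minimal stand-in under stated assumptions, not PriED's implementation; the rewards are simulated surrogates for downstream gains such as minority-class F1 improvement.

```python
# Illustrative sketch only: an attention-style selection policy over
# participants, updated with REINFORCE. Rewards simulate downstream task
# improvement for the data receiver; they are NOT taken from PriED.
import numpy as np

rng = np.random.default_rng(42)
n_participants = 4

# Hidden usefulness of each participant's data to this receiver; the
# policy must discover it from reward feedback alone (assumed values).
true_value = np.array([0.1, 0.8, 0.3, 0.5])

logits = np.zeros(n_participants)  # learned attention scores
lr, baseline = 0.1, 0.0

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for step in range(5000):
    probs = softmax(logits)
    choice = rng.choice(n_participants, p=probs)
    # Noisy reward standing in for the receiver's task improvement.
    reward = true_value[choice] + rng.normal(scale=0.1)
    baseline += 0.05 * (reward - baseline)  # running-average baseline
    # REINFORCE / gradient-bandit update on the attention logits.
    grad = -probs
    grad[choice] += 1.0
    logits += lr * (reward - baseline) * grad

# After training, attention concentrates on the most useful participant.
selected = int(np.argmax(softmax(logits)))
```

In a full system the reward would come from evaluating the receiver's supervised model after incorporating the selected proxy data, which is how a reward shaped toward the minority class can steer selection in an anomaly detection setting.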

Patent Information: