Privacy Friendly Federated Missing Data Processing for Improved ML and Downstream Analysis

Cafe outperforms not only local and standard federated models but also centralized models that have full access to all the data, a surprising result.

Invention Summary:

Decentralized machine learning (ML) approaches, such as Federated Learning (FL), often face limitations due to incomplete data in distributed datasets. This issue is further complicated by the heterogeneity of missing data, where the distribution of missing data varies across datasets. These discrepancies, caused by privacy concerns or systematic errors, undermine data preprocessing efforts and hinder machine learning performance, preventing organizations from fully leveraging their data’s potential.

To overcome this challenge, Dr. Jaideep Vaidya and his team have developed Complementarity Adjusted Federated Learning (Cafe), a novel approach to federated imputation of missing data that remains effective even for data that is Missing Not At Random (MNAR), the most difficult form of missing data to address. Cafe locally learns each client’s missing data mechanism, quantifies heterogeneity across clients, and uses pairwise complementarity and sample size scores to create federated averages of local imputation models. This method has consistently outperformed baseline approaches in both imputation and federated prediction tasks.
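
The weighting idea described above can be illustrated with a minimal sketch. This is not the Cafe implementation: the listing does not give the scoring formulas, so everything here is an assumption made for illustration. In particular, the complementarity score (one minus cosine similarity of per-client missingness-mechanism vectors, rescaled to [0, 1]), the `alpha` parameter, and the function names `complementarity_matrix`, `aggregation_weights`, and `federated_average` are all hypothetical. The sketch only shows how pairwise complementarity and sample size could be combined into per-client weights for averaging local imputation model parameters.

```python
import numpy as np


def complementarity_matrix(mechanisms: np.ndarray) -> np.ndarray:
    """Hypothetical pairwise score in [0, 1].

    Each row of `mechanisms` summarizes one client's locally learned
    missing data mechanism (assumed here to be a plain vector).
    Clients with dissimilar mechanisms get a higher score.
    """
    norms = np.linalg.norm(mechanisms, axis=1, keepdims=True)
    unit = mechanisms / np.clip(norms, 1e-12, None)
    cosine = np.clip(unit @ unit.T, -1.0, 1.0)
    return (1.0 - cosine) / 2.0


def aggregation_weights(mechanisms, sample_sizes, alpha=1.0):
    """Combine sample size and complementarity into row-normalized weights.

    Row i holds the weights client i places on every client's local model.
    `alpha` (an assumed knob) controls how strongly complementarity counts.
    """
    comp = complementarity_matrix(np.asarray(mechanisms, dtype=float))
    size = np.asarray(sample_sizes, dtype=float)
    size = size / size.sum()
    raw = size[None, :] * (1.0 + alpha * comp)
    return raw / raw.sum(axis=1, keepdims=True)


def federated_average(local_params, weights):
    """Weighted average of local imputation model parameters, one row per client."""
    return weights @ np.asarray(local_params, dtype=float)


# Toy example: clients 0 and 1 share a mechanism, client 2 differs.
mechanisms = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
sample_sizes = [100, 100, 100]
local_params = np.array([[0.2, 1.0], [0.4, 1.2], [0.9, 0.1]])

weights = aggregation_weights(mechanisms, sample_sizes)
personalized = federated_average(local_params, weights)
print(weights.round(3))
print(personalized.round(3))
```

In this toy setup all clients have equal sample sizes, so client 0 places more weight on client 2 (whose assumed mechanism differs) than on client 1 (whose mechanism is identical). Whether Cafe rewards dissimilarity in exactly this way is not stated in this listing; the publication named below is the authoritative description.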

Market Applications:

Clinical research data preparation tools
Financial fraud detection systems
Personalized medicine services

Advantages:

First imputation model that can handle MNAR-type data
Leverages heterogeneity in missing data to improve data processing and quality control
Outperforms baseline models on MNAR data
Outperforms local, global, and federated imputation models

Publication: Cafe: Improved Federated Data Imputation by Leveraging Missing Data Heterogeneity

Intellectual Property & Development Status: Patent pending. Available for licensing and/or research collaboration. For any business development and other collaborative partnerships, contact: marketingbd@research.rutgers.edu

Direct Link:

https://canberra-ip.technologypublisher.com/tech/Privacy_Friendly_Federated_Missing_Data_Processing_for_Improved_ML_and_Downstream_Analysis

For Information, Contact:

Wenjuan Zhu

Licensing Manager

Rutgers, The State University of New Jersey

848-932-4058

wz284@research.rutgers.edu