A Divide and Conquer Framework for Breaking Unlearnable Data

Advantages

  • Bypasses unlearnable image data protections through label modifications
  • Achieves over 85% accuracy on five protected CIFAR-10 datasets
  • Works across multiple types of unlearnable data
  • Uses both manual and automated (genetic algorithm) approaches
  • Reveals hidden weaknesses in current image protection methods
  • Helps researchers test and improve future unlearnable defenses

Summary

Unlearnable data is designed to protect images from being used in unauthorized AI training by adding hidden noise that disrupts learning. However, deep learning models can still find ways to learn from these protected datasets, which raises concerns about how effective current protections really are.

This invention introduces a new divide-and-conquer framework that bypasses unlearnable data by changing the AI’s classification task. It breaks a complex classification problem into simpler ones, then recombines the results to improve learning. Using manual rules and automated genetic algorithms, this method achieved over 85% accuracy on five protected CIFAR-10 datasets, exposing a key vulnerability in today’s image protection techniques.

This diagram illustrates the divide-and-conquer framework used to break unlearnable data. By finding an optimal classification tree and expanding data using nonlinear transformations, the system enables accurate model training even on protected datasets.
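
The divide-and-conquer idea in this summary can be illustrated with a toy sketch. It is not the inventors' implementation: the label tree (TREE), the nearest-centroid stand-in for a neural network, and the synthetic 2-D data are all assumptions made for illustration. The genetic search for a good tree and the nonlinear data expansion are not shown, and the toy data is not unlearnable. The sketch only shows how one multi-class task can be split into simpler relabeled sub-tasks and how their predictions can be recombined.

```python
import random

# Hypothetical label tree: nested tuples whose leaves are original class ids.
# Each internal node defines a simpler sub-task: "which child group is it in?"
TREE = ((0, 1), (2, 3))

def leaves(node):
    """Return all original class ids under a tree node."""
    if isinstance(node, int):
        return [node]
    return [c for child in node for c in leaves(child)]

def relabel(node, samples):
    """Replace each label with the index of the child subtree that contains it."""
    groups = [set(leaves(child)) for child in node]
    return [(x, i) for x, y in samples for i, g in enumerate(groups) if y in g]

class CentroidClassifier:
    """Stand-in for a deep model (an assumption): nearest class centroid."""
    def fit(self, samples):
        sums, counts = {}, {}
        for x, y in samples:
            acc = sums.setdefault(y, [0.0] * len(x))
            for j, v in enumerate(x):
                acc[j] += v
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
        return self

    def predict(self, x):
        def sq_dist(y):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[y]))
        return min(self.centroids, key=sq_dist)

def train_tree(node, samples):
    """Divide: train one classifier per internal node on its relabeled sub-task."""
    if isinstance(node, int):
        return node
    clf = CentroidClassifier().fit(relabel(node, samples))
    children = []
    for child in node:
        keep = set(leaves(child))
        children.append(train_tree(child, [(x, y) for x, y in samples if y in keep]))
    return (clf, children)

def predict_tree(model, x):
    """Conquer: follow sub-task predictions down the tree to an original class."""
    while not isinstance(model, int):
        clf, children = model
        model = children[clf.predict(x)]
    return model

# Synthetic, well-separated 2-D clusters (illustrative only).
random.seed(0)
CENTERS = {0: (0.0, 0.0), 1: (0.0, 5.0), 2: (5.0, 0.0), 3: (5.0, 5.0)}
data = [([random.gauss(cx, 0.3), random.gauss(cy, 0.3)], y)
        for y, (cx, cy) in CENTERS.items() for _ in range(50)]

model = train_tree(TREE, data)
accuracy = sum(predict_tree(model, x) == y for x, y in data) / len(data)
print(f"accuracy on toy data: {accuracy:.2f}")
```

In this sketch the tree is written by hand, which loosely corresponds to the manual rules mentioned above. An automated variant would search over candidate trees and keep the one whose recombined predictions score best.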

Desired Partnerships

  • License
  • Sponsored Research
  • Co-Development

Patent Information: