Search Results - raha+moraffah

2 Results
Exploiting Class Probabilities for Black-Box Sentence-Level Attacks
Background: Text classification models have become increasingly prevalent in cybersecurity applications but remain susceptible to adversarial examples (e.g., carefully crafted sentences, altered in ways humans do not notice, that are misclassified). Adversarial attacks provide profound insights into the classifiers’ vulnerabilities,...
Published: 9/15/2025   |   Inventor(s): Raha Moraffah, Huan Liu
Keyword(s): Machine Learning, Natural Language Processing, Red teaming, Security
Category(s): Physical Science, Applied Technologies, Artificial Intelligence/Machine Learning, Cybersecurity
Adversarial Text Purification: Large Language Model Approach for Defense
Background: Adversarial purification is a defense mechanism for safeguarding classifiers against adversarial attacks without knowing the type of attacks or training of the classifier. These techniques analyze and eliminate adversarial perturbations from the attacked inputs and help to restore purified samples that retain similarity to the attacked...
Published: 6/27/2025   |   Inventor(s): Raha Moraffah, Shubh Khandelwal, Amrita Bhattacharjee, Huan Liu
Keyword(s): Artificial Intelligence, Defense Applications, Machine Learning, Natural Language Processing, Security, Text Mining
Category(s): Physical Science, Artificial Intelligence/Machine Learning, Applied Technologies, Cybersecurity