OFFENSIVE AI
LECTURE 4: ADVERSARIAL MACHINE LEARNING 3
Dr. Yisroel Mirsky
[email protected]

Today's Agenda
- Confidentiality Attacks
- Causative Attacks

Attacks on Confidentiality
Rigaki M. et al. A Survey of Privacy Attacks in Machine Learning. 2021

Attacks on Confidentiality: Motivating Scenario
- "Gonna make the world a better place"
- "Free cancer detection model for all!"
- Uses Bob's CT scans to train model f
- Users infer from f that Bob has cancer
Rigaki M. et al. A Survey of Privacy Attacks in Machine Learning. 2021

Attacks on Confidentiality: Threat Model
[Diagram. Labels recovered from the slide:]
- Adversary
- Wants information about...
- Has access to...
- x ∈ D
- θ
- Model f_θ
- D (train)
- Queries on f_θ
- Data Owner ≠ Model Owner ≠ Model Consumer (each "≠" marked "not always")
- MLaaS: Amazon, Google, Microsoft, BigML, Algorithmia, ...
Rigaki M. et al. A Survey of Privacy Attacks in Machine Learning. 2021

Inferring the Model's Parameters: Attack Goals
- The adversary A wants to steal θ from f_θ
- Model Extraction: obtain θ or θ̃
  - Has access to the f(x; θ) API
  - Generally black box
- Exact copy (e.g., SVM, LR)
- Approximation (e.g., DNN)
Correia-Silva J. R. et al. Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data. 2018

Model Extraction
To steal a model, an attacker only needs to learn the decision boundary (hyperplane).
Copycat Approach (Method 1)
1. Obtain x ~ D
2. Use f for labelling
3. Train f′ on the labeled data
Jagielski M. et al. High Accuracy and High Fidelity Extraction of Neural Networks. USENIX'20

Model Extraction: Accuracy depends on the ultimate goal
Get x ~ D -> Get labels from f -> Train f′
High Accuracy: f′ should replicate the performance of f
- For stealing the model for a task
High Fidelity: f′ should replicate the decision boundary of f
- For going from a black-box to a white-box attack (adversarial ML, membership inference, ...)
[Diagram. Labels recovered from the slide: Original D and f_θ; Get x where min_x f(x); [0.1, 0.1, 0.1, … ]]
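
The three Copycat steps and the accuracy/fidelity distinction from the transcript above can be illustrated with a small sketch. This is not the method from either cited paper: it is a toy, NumPy-only setup in which the victim f is a hypothetical linear classifier exposed through a made-up `victim_api` function, the query data is synthetic Gaussian noise standing in for x ~ D, and the substitute f′ is a logistic regression trained by plain gradient descent. All names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES = 5

# Ground-truth concept, and a victim model that only approximates it.
# Both are hidden from the attacker (assumed setup, not from the lecture).
_true_w = rng.normal(size=N_FEATURES)
_victim_w = _true_w + 0.3 * rng.normal(size=N_FEATURES)
_victim_b = 0.3

def true_label(x):
    """Ground-truth labels, used only to measure accuracy."""
    return (x @ _true_w > 0).astype(int)

def victim_api(x):
    """Hypothetical black-box API for f: returns hard labels only."""
    return (x @ _victim_w + _victim_b > 0).astype(int)

def train_substitute(x, y, lr=0.5, epochs=500):
    """Fit the substitute f' (logistic regression) by gradient descent."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # sigmoid probabilities
        err = p - y                              # gradient of the log loss w.r.t. logits
        w -= lr * (x.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def substitute_predict(params, x):
    w, b = params
    return (x @ w + b > 0).astype(int)

# 1. Obtain x ~ D (synthetic stand-in for the data distribution)
x_query = rng.normal(size=(2000, N_FEATURES))
# 2. Use f for labelling
y_query = victim_api(x_query)
# 3. Train f' on the labeled data
params = train_substitute(x_query, y_query)

# Evaluate on fresh samples
x_test = rng.normal(size=(1000, N_FEATURES))
pred = substitute_predict(params, x_test)
accuracy = np.mean(pred == true_label(x_test))   # agreement with ground truth
fidelity = np.mean(pred == victim_api(x_test))   # agreement with the victim f

print(f"accuracy={accuracy:.3f} fidelity={fidelity:.3f}")
```

In this sketch, accuracy compares f′ against the ground-truth labels, while fidelity compares f′ against the victim's own outputs, including the victim's mistakes. A high-fidelity copy is the one that matters when the attacker wants a white-box stand-in for later attacks on f.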
