OFFENSIVE AI
LECTURE 9: DEFENCES AGAINST DEEPFAKES
Dr. Yisroel Mirsky, [email protected]

Today's Agenda
- Detection: conventional forensics; deep learning-based (visual and audio)
- Prevention

Overview: Prevention vs Detection
- Prevention: the defence stops deepfakes from being created, deployed, or consumed.
- Detection: the defence identifies an ongoing attack.

Overview: Active vs Passive
- Active: the defence performs actions which prevent attackers from achieving their goal (e.g., modifications, probing, encryption, ...).
- Passive: the defence searches for the attack after the fact (e.g., detection).
https://link.springer.com/article/10.1007/s11042-021-11733-y

Detection - Visual: Datasets
Dataset (type): images / videos / year
- Flickr-Faces-HQ (FFHQ) [link] (images): 70,000 / - / 2019
- CelebA [link] (images): 30,000 (1024x1024) / - / 2017
- UADFV [link] (videos): - / 98 (49 real + 49 fake) / 2018
- WildDeepfake [link] (videos): - / 707 / 2020
- Ding et al. [link] (images): 1,500,000 / - / 2019
- FaceForensics [link] (images, videos): 420,053 / 1,004 / 2019
- FaceForensics++ [link] (images, videos): 1,800,000 / 3,000 / 2019
- DeepfakeTIMIT [link] (videos): - / 320 / 2018
- Celeb-DF [link] (videos): - / 1,203 (408 real + 795 fake) / 2020
- MFC Datasets [link] (images, videos): 35,000,000 (100,000 manipulated) / 300,000 (4,000 manipulated) / 2019
- FFW [link] (images, videos): 53,000 / 150 / 2018
- VidTIMIT [link] (videos): - / 620 / 2019
- facebook DFDC Preview [link] (videos): - / 5,214 / 2019
- DFDC [link] (videos): - / 4,113 / 2020
- DeepfakeDetection [link] (videos): - / 3,363 / 2019
2022: FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset

Detection by Modality
1. Visual (images/video) <- let's start here
2. Audio
3. Both

Detection - Visual: Techniques
1. Classical forensics: analytical (signal extraction)
2. Directed approaches (artifact-specific): ML focused on specific features
3. Undirected approaches (classification, anomaly detection): ML given all features (learns its own features)
Zheng L. A Survey on Image Tampering and Its Detection in Real-World Photos. 2019

Detection - Visual: 1. Classical Forensics
Without prior knowledge (comparison images), it is hard for people to detect tampering (Nightingale S.J., Wade K.A., Watson D.G. Can People Identify Original and Manipulated Photos of Real-World Scenes? 2017). Which part of this image is fake? Tampering is detected by observing (1) edge anomalies and/or (2) region anomalies.

Edge Anomalies (example: cut-and-paste splicing)
Detected using:
- Edge detection algorithms (e.g., the Canny edge detector): the first derivative of the signal
- Spectral analysis (e.g., the 1D Hilbert-Huang transform): frequencies (FFT) and their locality
- Local patterns (descriptors)

Region Anomalies
- JPEG compression inconsistency: detected using DCT coefficients (a block-wise 2D frequency transform)
- Lighting consistency anomalies: detected by comparing image patches
- Blur artifacts from diffusion: detected using image Laplacians (filters) and statistical features (see the sketch below)
Zheng L. A Survey on Image Tampering and Its Detection in Real-World Photos. 2019
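To make the blur-artifact idea concrete, here is a minimal sketch (my illustration, not from the lecture) that scores image patches by the variance of their Laplacian; locally smooth patches are candidates for diffusion or splice regions. It assumes OpenCV and NumPy are available, and the patch size and threshold are illustrative, not tuned.

```python
# Minimal sketch of blur-region analysis via patch-wise Laplacian variance.
# Assumes OpenCV and NumPy; patch size and threshold are illustrative.
import cv2
import numpy as np

def blur_anomaly_map(image_path, patch=32):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE).astype(np.float64)
    lap = cv2.Laplacian(img, cv2.CV_64F)          # second-derivative filter
    rows, cols = lap.shape[0] // patch, lap.shape[1] // patch
    scores = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            block = lap[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
            scores[i, j] = block.var()            # low variance = locally smooth
    # Patches far below the global median are candidate diffusion/splice regions
    return scores < 0.25 * np.median(scores)
```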
Detection - Visual: 1. Classical Forensics (cont.)
Region Anomalies: Inconsistent Camera Traces
Cameras leave unique latent patterns in each image:
- CRF (Camera Response Function): how the sensor interprets colour
- CFA (Colour Filter Array): how the camera interpolates neighbouring sensor values to obtain full colours
- PRNU (Photo Response Non-Uniformity): the noise influencing the imaging process, a physical defect in the sensor, unique to each camera/model
Added content will have different patterns; generated content will have anomalous/inconsistent patterns.

Detection - Visual: 2. Directed Approaches (using ML on specific artifacts)
Seven types of artifacts (the concepts overlap with classical forensics: edge + region):
- Spatial: 1. blending, 2. environment, 3. forensics
- Temporal: 4. behaviour, 5. physiology, 6. synchronization, 7. coherence

2.1 Spatial - Blending
Train models that specialize in detecting edge artifacts. The model can:
- be a generic image classifier
- be trained on edge/frequency features
- have built-in specialized filters

Face X-ray (Li L. et al. Face X-ray for More General Face Forgery Detection. 2020)
Dataset creation: 1. build a face-replacement dataset; 2. splice by similarity.
No deepfakes are needed, so the model focuses on the blending boundary, which it learns to predict. Self-supervised learning: no manual labelling!

2.2 Spatial - Environment
Context can highlight abnormalities. Examples: residuals from face warping, lighting, variable fidelity. Some works directly contrast the foreground with the background. This only works when the head is spliced in, and it can be evaded if the fake is passed through a refiner!
- Method 1, compare entire contexts (Nirkin Y. et al. DeepFake Detection Based on Discrepancies Between Faces and Their Context. 2020): the two contexts are embedded and then presented separately to encourage their contrast. Issue: some areas in a context may be weighed less.
- Method 2, compare patches, Patch&Pair CNN (Li X. et al. Fighting Against Deepfake: Patch&Pair Convolutional Neural Networks (PPCNN). 2020): two branches guide a ResNet to contrast face and background over arbitrary regions. Multiple input passes are made to create the input to the final convolution.

2.3 Spatial - Forensics
Focus: PRNU-based CNNs, and perspective (inconsistent head poses).
Compare face pose vectors, inner face landmarks vs outer face landmarks, between DF creation and DF detection (Yang X. et al. Exposing Deep Fakes Using Inconsistent Head Poses. 2018).

GANs have fingerprints! Pass a generated image x_g through a denoiser (wavelet denoising, high-pass filter, Wiener filter) to obtain a cleaned image x̄_g (no noise); the residual σ_g = x_g − x̄_g is a noise pattern that fingerprints the generator (e.g., CycleGAN vs ProGAN), sketched below.
Marra F. et al. Do GANs Leave Artificial Fingerprints? 2018
Yu N. et al. Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints. 2019
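The fingerprint pipeline can be sketched in a few lines. This is a simplified stand-in, not the papers' exact method: a median filter replaces the wavelet/Wiener denoisers, and residuals are averaged over many images from one source so that content cancels out and the latent pattern remains.

```python
# Simplified residual-fingerprint sketch; a median filter stands in for the
# wavelet/Wiener denoisers named on the slide. Assumes NumPy and SciPy.
import numpy as np
from scipy.ndimage import median_filter

def noise_residual(img):
    """sigma_g = x_g - denoise(x_g): the image's noise pattern."""
    img = img.astype(np.float64)
    return img - median_filter(img, size=3)

def estimate_fingerprint(images):
    """Average residuals of same-size images from one camera/GAN so that
    content averages out and the latent pattern remains."""
    return np.mean([noise_residual(im) for im in images], axis=0)

def correlate(residual, fingerprint):
    """Attribute a test image by normalized correlation with a fingerprint."""
    r = (residual - residual.mean()) / residual.std()
    f = (fingerprint - fingerprint.mean()) / fingerprint.std()
    return float((r * f).mean())
```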
2.4 Temporal - Behaviour
Compare the mannerisms of an identity to past footage (Agarwal S. et al. Protecting World Leaders Against Deep Fakes. 2019). This usually requires lots of training data of the identity: politicians, celebrities, newscasters, ... Action units correlate differently between people; for example, when Obama tilts his head up, he raises his eyebrows. Contradictions can be detected.

2.5 Temporal - Physiology
Deepfakes tend to omit biological signals which aren't important to the visualization. Identifying anomalies in, or the absence of, these signals can indicate a DF:
- Blink detection (Li Y. et al. In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking. 2018)
- FakeCatcher measures blood flow and pulse (Ciftci U. et al. FakeCatcher: Detection of Synthetic Portrait Videos Using Biological Signals. 2020)

2.6 Temporal - Synchronization
Audio-video mismatches (dub detection):
1. mouth landmarks don't match the audio; or
2. mouth shapes (visemes) do not match utterances (phonemes).
In one line of work, visemes were broken down into units and compared to the audio, focusing on closed-mouth phonemes (B, P, M) where DFs tend to fail (Agarwal S. et al. Detecting Deep-Fake Videos from Phoneme-Viseme Mismatches. 2020; Bolles R. et al. Spotting Audio-Visual Inconsistencies (SAVI) in Manipulated Video. 2018). In another, mouth landmarks are compared to audio-based landmark predictions (Korshunov P. et al. Speaker Inconsistency Detection in Tampered Video. 2018).
General method: 1. predict the next frame in modality A using previous frames from modality B; 2. measure the delta (a large difference = anomaly!).

2.7 Temporal - Coherence
Inter-frame inconsistencies: jitter, flicker, distortions. Here, an LSTM uses the last frame x^(i-1) to predict the current frame x̄^(i); comparing x̄^(i) to x^(i) reveals errors (Amerini I. et al. Exploiting Prediction Error Inconsistencies through LSTM-based Classifiers to Detect Deepfake Videos. 2020).

Detection - Visual: 3. Undirected Approaches
- Classification: regular DNN classifiers, ensembles of DF CNNs, 3D CNNs over video, Siamese networks (real vs fake).
- Anomaly detection: autoencoder, VAE, ...

Classification - Neural Activation
FakeSpotter utilizes the experience of another network, a face recognition model, and looks at its activations to detect anomalies. This overcomes noise and distortions (like content loss). (Wang R. et al. FakeSpotter: A Simple yet Robust Baseline for Spotting AI-Synthesized Fake Faces. 2020)

Classification - Localization
Train on masks (requires ground truth) or use XAI (e.g., SHAP). Subtracting real from fake creates an attention map, which is used to focus the loss during training and for localization at deployment. (Dang H. et al. On the Detection of Digital Face Manipulation. 2020; Li J. et al. Zooming into Face Forensics: A Pixel-level Analysis. 2019)

Detection - Audio
Detecting fake audio: generic approaches
- Supervised: a classifier C outputs real/fake.
- Unsupervised: a reconstruction model C flags anomalous inputs (a minimal autoencoder sketch follows).
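As a concrete picture of the unsupervised route, here is a minimal PyTorch autoencoder sketch (my assumption of a typical setup, not a specific lecture system): it is trained only on features of real audio, and a high reconstruction error at test time flags a clip as anomalous. The 128-dim feature frames (e.g., mel-spectrogram columns) and the threshold policy are illustrative.

```python
# Minimal reconstruction-based anomaly detector for audio features.
# Assumes PyTorch; feature extraction (e.g., 128-dim mel frames) is external.
import torch
import torch.nn as nn

class AudioAE(nn.Module):
    def __init__(self, dim=128, bottleneck=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(),
                                 nn.Linear(64, bottleneck))
        self.dec = nn.Sequential(nn.Linear(bottleneck, 64), nn.ReLU(),
                                 nn.Linear(64, dim))

    def forward(self, x):
        return self.dec(self.enc(x))

def anomaly_score(model, frames):
    """Mean squared reconstruction error over a clip's feature frames."""
    with torch.no_grad():
        return ((model(frames) - frames) ** 2).mean().item()

# Train on real clips only by minimizing MSE(model(x), x); at deployment,
# a score above a threshold fit on held-out real clips => anomalous.
```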
Detection - Audio: Example Approach, DeepSonar
Similar to FakeSpotter, we can detect fakes by monitoring the activations of a speech recognition (SR) system (Wang R. et al. DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices. 2020):
1. Make a set of real and fake audio.
2. Apply degradations to both (this makes the model robust).
3. Monitor the speech recognition model for abnormalities.
Monitoring the feature maps is better than monitoring the content itself: it is robust to background noise and to distortions.

Detection Evasion
Recall, defence is an active game: the attacker can make a retaliatory step. Always consider a defence's flaws.
Recall the arms race: the adversary (1) analyses the defence and (2) devises an evasion; the defender (3) analyses the evasion and (4) develops a new defence. Example: if you use deep learning to detect, the attacker can use adversarial ML to evade...

Meet Andrew Waltz, made with StyleGAN2. Meet Andrew's nemesis: a CNN-based classifier (Wang S. et al. CNN-Generated Images Are Surprisingly Easy to Spot... For Now. 2020). The classifier is very good at detecting GAN-generated images, since they all exhibit fingerprints, even new GAN-generated content like Mr. Waltz that was not in the training set.

Hold on, I've heard this story before... An adversarial perturbation flips the verdict (Carlini N. et al. Evading Deepfake-Image Detectors with White- and Black-Box Attacks. 2020): PGD evasion, patch attacks, real-to-fake and fake-to-real flips, universal perturbations.

What about detecting semantic artifacts, such as a face-splice anomaly? The attacker can use a refiner, etc., and quality is getting better every year... What about monitoring activations (FakeSpotter, DeepSonar)? The attacker can use them in the adversarial loss!

In summary: if a DNN is used in a defence, then it is vulnerable to adversarial attacks. Always consider the attacker's abilities and limitations. These defences are not worthless: they raise the difficulty bar and prevent easy attacks. The main challenge for the attacker is that it is hard to cover all anomalies; the main challenge for the defender is that you can't run all possible detection methods (it is expensive and raises the FPR).

Defences Against RT-DF (real-time deepfakes)
Deepfake CAPTCHA: force the adversary into the spotlight by exploiting DF limitations (Yasur et al. Deepfake CAPTCHA: A Method for Preventing Fake Calls. 2022). Evasions must be considered too: replay, turning off the DF, ...
Example challenges:
- Video: drop an object, bounce an object, fold shirt, stroke hair, interact with background scenery, spill water, pick up a requested object, hand expressions, tongue motion, fold ear, face occlusions, remove glasses
- Audio: mimic a phrase, hum a tune, sing part of a song, repeat an accent, change tone, clear throat, whistle

Prevention
Techniques: 1. data provenance, 2. counter attacks, 3. cyber security, 4. awareness.

Prevention: 1. Data Provenance
Main idea: track the source of data and track changes (who edited, what was edited). This guarantees authenticity (a toy hash-chain sketch follows).
Fraga P. et al. Fake News, Disinformation, and Deepfakes: Leveraging Distributed Ledger Technologies and Blockchain to Combat Digital Deception. 2019
Hasan H. et al. Combating Deepfake Videos Using Blockchain and Smart Contracts. 2019
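To illustrate why a tamper-evident log supports provenance, here is a toy hash chain in Python (my own illustration; the cited works use full distributed ledgers and smart contracts). Each entry commits to the hash of the previous one, so any retroactive edit breaks verification.

```python
# Toy tamper-evident provenance log (illustrative only; not a real blockchain).
import hashlib
import json
import time

def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

class ProvenanceLog:
    def __init__(self):
        self.chain = [{"prev": "0" * 64, "ts": 0, "event": "genesis"}]

    def record(self, editor, media_hash, change):
        """Log who edited what; the entry commits to the previous entry."""
        self.chain.append({"prev": block_hash(self.chain[-1]),
                           "ts": time.time(),
                           "editor": editor,
                           "media": media_hash,
                           "change": change})

    def verify(self):
        """Any retroactive modification breaks the chain of hashes."""
        return all(self.chain[i]["prev"] == block_hash(self.chain[i - 1])
                   for i in range(1, len(self.chain)))
```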
Prevention: 1. Data Provenance (cont.)
Blockchains: a distributed database that is robust to tampering.
- Public: everybody can see the info on the media.
- Robust: it cannot be changed by malicious actors.
- Auditable: the history of logs is transparent.
Problem: this only works for news sources; it does not protect individuals from attacks.

Prevention & Mitigation: 2. Counter Attacks
Using adversarial ML (a taste of their own medicine...):
- Causative: poison the attacker's data collection by 'cloaking' faces as another identity with perturbations, so the wrong faces are collected by the web crawl that feeds the new DNN (Shan S. et al. Fawkes: Protecting Privacy against Unauthorized Deep Learning Models. 2020).
- Exploratory: disrupt the synthesis itself with adversarial perturbations (Li Y. et al. Hiding Faces in Plain Sight: Disrupting AI Face Synthesis with Adversarial Perturbations. 2019).

Prevention: 3. Cyber Security, Digital Signatures
- The creator can sign content with a private key.
- Everyone can verify the content with the public key.
- No one can change the content without the private key (or it will break the signature!).
Scheme (m: media, h: hash, e: encrypt, d: decrypt, s: signature): Alice computes s = e_{k_priv}(h(m)) and sends (m, s); Bob checks d_{k_pub}(s) == h(m); Eve cannot forge s without k_priv. (A minimal sign/verify sketch appears at the end of these notes.) No more DF problems in healthcare!

Prevention: 4. Awareness
Posted media: look for artifacts; check the original source; verify news using multiple sources.
Calls: be suspicious of urgent pretexts, especially if the audio is not perfect; listen for monotonic speech; challenge the caller to perform inflections and motions.
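As promised above, a minimal sign/verify sketch of the digital-signature flow, assuming the third-party 'cryptography' package with Ed25519 keys (the key type and filename are my illustrative choices; Ed25519 hashes the message internally, playing the role of e(h(m)) on the slide).

```python
# Minimal sketch of Alice signing media and Bob verifying it.
# Assumes the 'cryptography' package; key type and filename are illustrative.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Alice (creator): sign the media bytes with her private key
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()
media = open("clip.mp4", "rb").read()
signature = private_key.sign(media)      # plays the role of s = e_kpriv(h(m))

# Bob (consumer): verify with Alice's public key. If Eve altered the media,
# its hash no longer matches and verification fails.
try:
    public_key.verify(signature, media)  # checks d_kpub(s) == h(m)
    print("authentic")
except InvalidSignature:
    print("media was modified")
```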