Fake Image Detection Project Report PDF
Jaypee University of Information Technology
2024
Jasmeen Kaur, Ritika, Jeetesh Saini
Summary
This project report details the Fake Image Detection project completed by Jasmeen Kaur, Ritika, and Jeetesh Saini at Jaypee University of Information Technology in India during 2024. The report discusses the methodology used for detecting deepfakes in images and the challenges faced in the project.
Full Transcript
FAKE IMAGE DETECTION

A major project report submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science & Engineering

Submitted by
Jasmeen Kaur (211429), Ritika (211432), Jeetesh Saini (211436)

Under the guidance & supervision of
Dr. Ekta Gandotra

Department of Computer Science & Engineering and Information Technology
Jaypee University of Information Technology, Waknaghat, Solan - 173234 (India)
December 2024

SUPERVISOR’S CERTIFICATE

This is to certify that the major project report entitled ‘Fake Image Detection’, submitted in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science & Engineering, in the Department of Computer Science & Engineering and Information Technology, Jaypee University of Information Technology, Waknaghat, is a bonafide project work carried out under my supervision during the period from July 2024 to December 2024.

I have personally supervised the research work and confirm that it meets the standards required for submission. The project work has been conducted in accordance with ethical guidelines, and the matter embodied in the report has not been submitted elsewhere for the award of any other degree or diploma.

Supervisor Name: Dr. Ekta Gandotra
Date: 30 November 2024
Designation: Associate Professor
Place: JUIT, Solan
Department: Dept. of CSE & IT

CANDIDATE’S DECLARATION

We hereby declare that the work presented in this report entitled ‘Fake Image Detection’ in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science & Engineering submitted in the Department of Computer Science & Engineering and Information Technology, Jaypee University of Information Technology, Waknaghat is an authentic record of our own work carried out over a period from July 2024 to December 2024 under the supervision of Dr. Ekta Gandotra.
We further declare that the matter embodied in this report has not been submitted for the award of any other degree or diploma at any other university or institution.

Name: Jasmeen Kaur, Roll No.: 211429, Date: 01/12/24
Name: Ritika, Roll No.: 211432, Date: 01/12/24
Name: Jeetesh Saini, Roll No.: 211436, Date: 01/12/24

This is to certify that the above statement made by the candidates is true to the best of my knowledge.

Supervisor Name: Dr. Ekta Gandotra
Date: 30 November 2024
Designation: Associate Professor
Place: JUIT, Solan
Department: Dept. of CSE & IT

ACKNOWLEDGEMENT

We would like to express our deepest gratitude to everyone who contributed to the successful completion of our major project, “Fake Image Detection”.

First and foremost, we are thankful to our esteemed institution and Dr. Ekta Gandotra for providing us with the guidance, resources, and encouragement needed to pursue this innovative endeavour. We are particularly indebted to our project guide, whose expertise, mentorship, and continuous support were invaluable throughout the project.

We extend our sincere thanks to our classmates, friends, and family for their unwavering support, patience, and encouragement during the development of this project. Their belief in our vision motivated us to overcome challenges and deliver a meaningful solution.

Lastly, we are grateful to all the researchers and developers in the field of artificial intelligence, biometric systems, and software development whose work inspired and guided our project.

This project is a testament to the collective efforts and shared vision of improving the lives of missing individuals and their families through technology. We hope our project contributes positively to this cause and inspires further advancements in this field.

Jasmeen Kaur (211429)
Ritika (211432)
Jeetesh Saini (211436)

TABLE OF CONTENTS

CERTIFICATE
CANDIDATE’S DECLARATION
ACKNOWLEDGEMENT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT
CHAPTER 1: INTRODUCTION
    1.1 INTRODUCTION
    1.2 PROBLEM STATEMENT
    1.3 OBJECTIVES
    1.4 MOTIVATION
    1.5 ORGANISATION OF PROJECT REPORT
CHAPTER 2: LITERATURE REVIEW
    2.1 OVERVIEW OF RELEVANT LITERATURE
    2.2 KEY GAPS
CHAPTER 3: SYSTEM DEVELOPMENT
    3.1 REQUIREMENTS AND ANALYSIS
    3.2 PROJECT DESIGN AND ARCHITECTURE
    3.3 DATA PREPARATION
    3.4 IMPLEMENTATION
    3.5 KEY CHALLENGES
CHAPTER 4: TESTING
    4.1 TESTING STRATEGY
    4.2 TEST CASES AND OUTCOMES
CHAPTER 5: RESULTS AND EVALUATION
    5.1 RESULTS
CHAPTER 6: CONCLUSION AND FUTURE SCOPE
    6.1 CONCLUSION
    6.2 FUTURE SCOPE
REFERENCES

LIST OF TABLES

S. No.  Title
1  Overview of relevant literature
2  Models Performance comparison on Yonsei Dataset
3  Models Performance comparison on NVIDIA Flickr Dataset

LIST OF FIGURES

S. No.  Title of Figures
1  Workflow Diagram
2  Project Architecture Diagram
3  Resnet50 Model
4  XceptionNet Model
5  DenseNet121 Model
6  VGG16 Model
7  Models Performance Comparison on Yonsei Dataset
8  Models Performance Comparison on NVIDIA Flickr Dataset
9  Confusion Matrix of DenseNet
10  Confusion Matrix of Resnet101
11  Confusion Matrix of XceptionNet

LIST OF ABBREVIATIONS, SYMBOLS OR NOMENCLATURE

Abbreviation: Full Form
AI: Artificial Intelligence
API: Application Programming Interface
AUC: Area Under Curve
CNN: Convolutional Neural Network
CUDA: Compute Unified Device Architecture
GAN: Generative Adversarial Network
GPU: Graphics Processing Unit
LIME: Local Interpretable Model-agnostic Explanations
LR: Learning Rate
ML: Machine Learning
ReLU: Rectified Linear Unit
ROC: Receiver Operating Characteristic
SGD: Stochastic Gradient Descent
SHAP: SHapley Additive exPlanations

ABSTRACT

The proliferation of deepfake technology has raised significant concerns across various sectors, including media, politics, and cybersecurity. Deepfakes, created using Generative Adversarial Networks (GANs) and other machine learning techniques, are highly realistic fake images or videos that manipulate real-world content to depict events or statements that never occurred. While this technology has legitimate applications in entertainment and creative fields, its misuse has been alarming. Deepfakes have been used to spread misinformation, impersonate public figures, and commit fraud, making it difficult for individuals and institutions to trust the authenticity of digital media. High-profile cases, such as fabricated videos involving Rashmika Mandanna, Prime Minister Narendra Modi, and Mark Zuckerberg, have demonstrated the societal and political dangers of deepfakes.

Models trained and evaluated on the NVIDIA Flickr dataset performed better than those trained and evaluated on the Yonsei dataset, particularly in terms of accuracy.
The top two models, VGG16 and DenseNet121, achieved accuracy of up to 95%, significantly outperforming the other models tested on the same dataset. These results highlight the robustness of these models in detecting deepfake images, even in challenging conditions. However, achieving such high accuracy came at the cost of increased training times, which ranged from minutes to hours depending on the model. Despite the longer training durations, the models demonstrated a clear advantage in terms of accuracy, making them more reliable for real-world applications where precision is critical. This further underscores the importance of selecting the right dataset and model for deepfake detection tasks.

The primary objective is to create a solution that can be integrated into real-world applications, such as cybersecurity, social media monitoring, and media forensics, where image authenticity is critical. Our detection system aims to not only prevent the misuse of deepfakes but also enhance public trust in digital content by providing a tool to verify the authenticity of images. This project will contribute to ongoing efforts to combat disinformation and protect against cybercrime in the digital age.

CHAPTER 1: INTRODUCTION

1.1 INTRODUCTION

In recent years, the proliferation of fake images, particularly those generated using Generative Adversarial Networks (GANs), has become a significant concern in both the digital and physical worlds. These AI-generated images, often referred to as deepfakes, are becoming increasingly realistic, making it difficult to distinguish between real and manipulated media. While deepfake technology has opened new possibilities in creative industries, it has also been weaponized to tarnish reputations, spread misinformation, and conduct cybercrimes.

Several real-world cases demonstrate the destructive power of deepfakes. For instance, a deepfake video of Indian actress Rashmika Mandanna was circulated, showing her in a compromising situation, which harmed her reputation and caused distress.
Similarly, Prime Minister Narendra Modi has been a target of deepfakes, with videos falsely attributing harmful speeches or actions to him, which could have significant political and social consequences. In another case, Mark Zuckerberg, the CEO of Facebook, was featured in a deepfake video where he appeared to make controversial statements about controlling people's data, sparking concerns over how easily footage of powerful figures can be manipulated.

The rise of social media and image-sharing platforms has accelerated the spread of such fake content, raising questions about the authenticity of visual information online. In this project, we aim to develop a reliable system for fake image detection that can effectively identify deepfake content. By leveraging cutting-edge machine learning models and image analysis techniques, our goal is to create a tool that helps individuals and organizations differentiate between real and manipulated images. This will help mitigate the societal and ethical consequences posed by the widespread use of deepfake technology, ensuring a more secure digital environment.

1.2 PROBLEM STATEMENT

The increasing sophistication of deepfake technology, especially through GANs, has made it difficult to distinguish real images from manipulated ones. Deepfakes are being used for malicious purposes, such as spreading disinformation, causing reputational harm, and enabling cybercrime. Current detection systems struggle to keep up with advancements in fake image generation, creating a gap in reliable identification of fraudulent content. This project aims to develop a machine learning system using GAN models to detect fake images, safeguarding digital media integrity and addressing social, ethical, and security risks.
1.3 OBJECTIVES

The primary objective of our project is to develop an advanced fake image detection system that can distinguish real images from deepfakes with high accuracy. Our focus is on creating a system that is not only effective but also adaptable to various deepfake generation techniques. To achieve this, we aim to incorporate state-of-the-art machine learning algorithms, particularly Convolutional Neural Networks (CNNs), which are known for their ability to extract deep features from images. By applying this technology, we hope to build a robust model capable of analyzing images and identifying manipulations introduced by GAN-based deepfake models.

Another key objective is to make the detection system user-friendly and applicable to real-world scenarios such as media forensics, social media platforms, and cybersecurity. With the increasing use of deepfake technology in online disinformation campaigns, our project seeks to provide a practical solution that can be integrated into various platforms to ensure media authenticity.

1.4 MOTIVATION OF THE PROJECT

The motivation behind this project stems from the growing threat posed by deepfake technology, which is increasingly being used for malicious purposes. Deepfakes have been weaponized to tarnish reputations, particularly those of public figures, by creating fake videos and images that depict them in compromising situations. These manipulated media have far-reaching implications, from damaging personal reputations to influencing political outcomes.

Additionally, cybersecurity crimes involving deepfakes have seen a rise, with criminals using fake identities for fraud, impersonation, and data theft. The danger extends beyond social media into sectors such as finance, national security, and journalism, where misinformation can have serious consequences.

Our project is driven by the need to address this growing concern by providing a reliable, efficient, and easy-to-use detection system that can be employed by individuals and institutions alike.
By developing tools that can effectively combat deepfakes, we hope to contribute to a safer and more secure digital environment, ensuring that people can trust the images they see online.

1.5 ORGANIZATION OF PROJECT REPORT

This project report is systematically organized into six chapters to provide a structured and detailed account of the work undertaken, from inception to conclusion. The report is organized as follows:

Chapter 1: Introduction
This chapter lays the foundation of the project by presenting the background, problem statement, objectives, and the significance and motivation for undertaking this work. It concludes with an outline of the project report’s organization.

Chapter 2: Literature Survey
This chapter provides an in-depth review of the existing literature, focusing on recent advancements over the past five years. It identifies key gaps and limitations in the current state of knowledge that the project aims to address.

Chapter 3: System Development
This chapter discusses the complete development process of the system, starting from requirement analysis to implementation. It includes technical details such as project design, data preparation, implementation techniques, and challenges faced during development.

Chapter 4: Testing
This chapter outlines the testing strategy employed to ensure system reliability, followed by test cases and their respective outcomes.

Chapter 5: Results and Evaluation
This chapter presents the results obtained from the project and evaluates their significance. It includes a comparative analysis with existing solutions (if applicable).

Chapter 6: Conclusions and Future Scope
The concluding chapter summarizes the project findings, highlights its contributions, and identifies its limitations. It also outlines the potential directions for future research and development.

CHAPTER 2: LITERATURE SURVEY

2.1 OVERVIEW OF RELEVANT LITERATURE

CNNs were utilized to detect fake images, demonstrating good accuracy but highlighting the need for scalable methods to handle larger datasets effectively.
Similarly, Error Level Analysis (ELA) combined with deep learning models like ResNet18 and GoogLeNet achieved an accuracy of 89.5% in deepfake detection, although it struggled with low-quality or compressed images. GANs and deep convolutional models proved effective for detecting deepfakes on social media platforms, but issues like mode collapse and limited datasets posed challenges. An improved Dense CNN architecture attained 98.33%-99.33% accuracy but faced limitations when applied to cross-domain datasets.

Hybrid approaches, such as combining VGG16 and CNN, achieved 94% accuracy and 95% precision in fake image detection but encountered computational complexity as a bottleneck. GANs were leveraged for high-quality facial image generation, highlighting their efficiency but exposing gaps in face realism and dataset size.

Using GANs and the CelebA dataset, researchers generated realistic faces, but the lack of diversity and dependency on dataset quality were major drawbacks. Comparative studies with CNN models, such as VGGFace, reached 99% accuracy in detecting manipulated images but noted limitations in adapting to varying deepfake generation techniques.

A GAN-based model coupled with Random Forest addressed imbalanced intrusion detection datasets, showing improved rare attack detection but facing overfitting risks and scalability concerns. Discrete Cosine Transform (DCT) anomaly detection in GAN-generated images achieved 99.9% accuracy but lacked robustness in noisy environments.

Generalizable properties of fake images were studied using patch-level classification, emphasizing the need for standardized preprocessing techniques to enhance detection accuracy. Surveys on deepfake detection methods provided comprehensive overviews of techniques but highlighted gaps in real-time detection and handling new manipulation techniques.

Pairwise learning methods improved accuracy in detecting manipulated images but were limited to static image analysis, excluding videos. Histogram-based techniques effectively detected fake colorized images but struggled against advanced manipulation methods.
GANs facilitated high-fidelity image generation and enhanced deepfake detection capabilities but revealed issues such as dependency on training data and risks of misuse. CNN architecture studies highlighted their foundational role in image recognition but lacked coverage of advanced models and computational complexities.

Table 1: Overview of relevant literature

Columns: S. No. / Author & Paper Title [Citation] / Journal/Conference (Year) / Tools/Techniques/Dataset / Key Findings/Results / Limitations/Gaps Identified

1. Madde Kumar - Identifying Fake Images Using CNN
   Journal/Conference (Year): National Conference on Advanced Trends in Computer Science and Information Technology (2024)
   Tools/Techniques/Dataset: Convolutional Neural Networks (CNNs), Deep Learning.
   Key Findings/Results: Identifies fake images using CNNs, explores the accuracy of CNN models in detecting manipulated media
   Limitations/Gaps Identified: Lacks exploration of alternative detection methods and scalability for large datasets

2. R. Rafique et al., "Deep Fake Detection and Classification Using Error-Level Analysis and Deep Learning"
   Journal/Conference (Year): Article published on Scientific Reports (2023)
   Tools/Techniques/Dataset: ResNet18, GoogLeNet, SqueezeNet, ELA, KNN and SVM. Dataset: Publicly available deepfake detection dataset by Yonsei University
   Key Findings/Results: 89.5% accuracy
   Limitations/Gaps Identified: Sensitive to low-quality and compressed images

3. P. Preeti, M. Kumar, and H. K. Sharma, "A GAN-Based Model of Deepfake Detection in Social Media"
   Journal/Conference (Year): International Conference on Machine Learning and Data Engineering (2023)
   Tools/Techniques/Dataset: GANs with Deep Convolutional Models. Dataset: CelebA-HQ and FFHQ dataset.
   Key Findings/Results: Achieved Inception Score IS = 1.074 and Fréchet Inception Distance FID = 49.3
   Limitations/Gaps Identified: Mode collapse and convergence issues with GAN; small datasets pose challenges.

4. Y. Patel et al., "An Improved Dense CNN Architecture for Deepfake Image Detection"
   Journal/Conference (Year): IEEE Access (2023)
   Tools/Techniques/Dataset: D-CNN. Dataset: Utilises images from multiple sources for training.
   Key Findings/Results: Achieved accuracy in the range of 98.33%-99.33%
   Limitations/Gaps Identified: Limited performance on cross-domain datasets.

5. K. Munir et al., "A Novel Deep Learning Approach for Deepfake Image Detection"
   Journal/Conference (Year): Applied Sciences (2022)
   Tools/Techniques/Dataset: Deep Learning (Hybrid of VGG16 and CNN). Dataset: Photoshopped real and fake faces dataset
   Key Findings/Results: Achieved 95% precision and 94% accuracy in deepfake detection
   Limitations/Gaps Identified: Computational complexity

6. D. Koli et al., "Exploring Generative Adversarial Networks for Face Generation"
   Journal/Conference (Year): International Journal For Multidisciplinary Research (2022)
   Tools/Techniques/Dataset: GANs, Deep Learning. Dataset: N/A
   Key Findings/Results: Efficiently generated high-quality facial images using GANs.
   Limitations/Gaps Identified: Limited dataset usage and improvement needed in face realism

7. Fake Face Generator: Generating Fake Human Faces using GAN.
   Journal/Conference (Year): International Journal of Advanced Computer Science and Applications (IJACSA) (2022)
   Tools/Techniques/Dataset: GAN. Dataset: CelebA.
   Key Findings/Results: Generated realistic human faces with high quality
   Limitations/Gaps Identified: Limited diversity in generated faces; dependency on the quality of the dataset.

8. H. S. Shad et al., "Comparative Analysis of Deepfake Image Detection Method Using Convolutional Neural Network"
   Journal/Conference (Year): Computational Intelligence and Neuroscience (2021)
   Tools/Techniques/Dataset: CNNs, specifically the VGGFace model. Dataset: Kaggle dataset (70,000 images from Flickr and 70,000 images produced by StyleGAN)
   Key Findings/Results: Achieved 99% accuracy
   Limitations/Gaps Identified: May not address all variations in deepfake techniques; reliant on the selected datasets.

9. J. Lee and K. Park, "GAN-based Imbalanced Data Intrusion Detection System".
   Journal/Conference (Year): Personal and Ubiquitous Computing (2021)
   Tools/Techniques/Dataset: GAN, Random Forest. Dataset: CICIDS2017 dataset
   Key Findings/Results: Achieved improved classification performance of rare attacks.
   Limitations/Gaps Identified: Overfitting risk in GAN, needs further optimization for larger datasets

10. O. Giudice et al., "Fighting Deepfakes by Detecting GAN DCT Anomalies"
   Journal/Conference (Year): arXiv (2021)
   Tools/Techniques/Dataset: GAN Specific Frequencies (GSF), Discrete Cosine Transform (DCT). Dataset: CelebA, FFHQ, Deepfake datasets
   Key Findings/Results: Achieved 99.9% accuracy.
   Limitations/Gaps Identified: Requires additional robustness in noisy scenarios

11. L. Chai et al., "What Makes Fake Images Detectable? Understanding Properties That Generalize"
   Journal/Conference (Year): European Conference on Computer Vision (ECCV) (2020)
   Tools/Techniques/Dataset: GAN models (ProGAN, StyleGAN, Glow, etc.), CNNs. Dataset: CelebA-HQ, FFHQ, and others.
   Key Findings/Results: Effective detection of fake images through patch-level classification.
   Limitations/Gaps Identified: Differences in preprocessing pipelines can affect accuracy if not properly mitigated.

12. Ruben Tolosana, Ruben Vera-Rodriguez, Julian Fierrez, Javier Ortega-Garcia - DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection.
   Journal/Conference (Year): arXiv (2020)
   Tools/Techniques/Dataset: Deep Learning, GANs, Face Manipulation Detection
   Key Findings/Results: Comprehensive survey of deepfake techniques and detection methods, covering state-of-the-art detection models
   Limitations/Gaps Identified: Limited focus on real-time detection and emerging techniques for improved fake generation

13. Chih-Chung Hsu, Yi-Xiu Zhuang, Chia-Yen Lee - Deep Fake Image Detection Based on Pairwise Learning
   Journal/Conference (Year): Applied Sciences (2020)
   Tools/Techniques/Dataset: Pairwise Learning, Deep Learning, Image Manipulation
   Key Findings/Results: Proposes a pairwise learning method to improve the detection accuracy of deepfake images
   Limitations/Gaps Identified: Focuses on image-based deepfakes, lacks exploration of video deepfake detection techniques

14. Y. Guo et al., "Fake Colorized Image Detection"
   Journal/Conference (Year): IEEE Transactions on Image Processing (2018)
   Tools/Techniques/Dataset: FCID-HIST (Histogram-based) & FCID-FE (Feature Extraction in LAB space) detection methods. Dataset: Images generated by state-of-the-art colorization techniques
   Key Findings/Results: High accuracy in detecting fake colourized images
   Limitations/Gaps Identified: Reduced accuracy with more advanced colorization methods

15. Smith, J. (2018). “Deep Fakes” using Generative Adversarial Networks (GAN). Unpublished conference presentation, University of California San Diego.
   Journal/Conference (Year): Nature Scientific Reports (2018)
   Tools/Techniques/Dataset: GAN. Dataset: CelebA, FFHQ, and other datasets for face generation
   Key Findings/Results: Achieved high fidelity in image generation and improved detection methods
   Limitations/Gaps Identified: Sensitivity to training data quality; potential for misuse.

16. Keiron Teilo O'Shea - An Introduction to Convolutional Neural Networks
   Journal/Conference (Year): Unpublished but available online (2015)
   Tools/Techniques/Dataset: Convolutional Neural Networks (CNNs), Filters, Image Recognition Tasks
   Key Findings/Results: Detailed explanation of CNNs, layers (convolutional, pooling, and fully connected), and applications in image processing and object detection
   Limitations/Gaps Identified: Limited coverage of advanced CNN architectures (e.g., ResNet, Inception). Does not address computational complexities or alternative techniques like RNNs.

2.2 KEY GAPS IN LITERATURE

1. Existing models for fake image detection face challenges with high computational resource demands, which hinder their efficiency and real-time application. Training and inference times are often long, reducing their practicality in dynamic scenarios, and models struggle to adapt to evolving deepfake techniques, leading to decreased accuracy.
2. Models trained on specific datasets have limited effectiveness when applied to diverse or unseen images, highlighting the need for better generalization to improve cross-dataset and real-world applicability.
3. Current systems often ignore multimodal cues such as audio or text, but incorporating these features could enhance detection robustness by providing a richer context.
4. Many models operate as "black boxes," offering little transparency into their decision-making, and improving explainability would increase trust, especially in sensitive applications.
5. Ethical concerns, including the reinforcement of biases in training data and predictions, as well as the need to minimize false positives and negatives, are crucial for ensuring fairness and reliability in areas like law enforcement and journalism.

CHAPTER 3: SYSTEM DEVELOPMENT

3.1 REQUIREMENTS AND ANALYSIS

Effective system development begins with identifying and analyzing key requirements. This section outlines the tools, technologies, and processes utilized to support the project, ensuring alignment with the objectives of creating a robust deepfake detection system.

3.1.1 SYSTEM REQUIREMENTS

Hardware Requirements
○ NVIDIA GPU with CUDA Toolkit: Crucial for accelerating the training of convolutional neural networks (CNNs) used in deepfake detection.

Software Requirements
○ Python Environment: Managed via Anaconda for simplified package management and seamless dependency resolution.
○ Jupyter Notebook: Facilitates model experimentation and visualizing training results interactively.
○ Google Colab: Provides additional GPU support and enables collaborative development.

Libraries and Frameworks
○ TensorFlow/Keras: For implementing and training CNN models.
○ OpenCV: Handles image processing and preprocessing tasks.
○ Pandas and NumPy: Essential for efficient data manipulation and numerical computations.

3.1.2 KEY FUNCTIONAL REQUIREMENTS

○ The system must preprocess datasets that include both real and deepfake images to prepare them for model training and evaluation.
○ The system should train and compare different convolutional neural network (CNN) architectures, such as EfficientNet B0, B2, and B4, to identify the model that achieves optimal accuracy.
○ The system must provide detailed performance metrics, including training time, accuracy, and loss, for each model during the evaluation phase.

3.1.3 KEY NON-FUNCTIONAL REQUIREMENTS

○ The system should be scalable, capable of handling large datasets and adapting to future advancements in deepfake generation technologies without compromising performance.
○ The system should ensure robustness, maintaining high detection accuracy even in the presence of low-quality, noisy, or compressed images.
○ It should be efficient in terms of computational resource usage, minimizing the time required for training and inference while maintaining accuracy.
○ The system should offer ease of integration with other tools and platforms for seamless development, experimentation, and deployment.
○ The system must be secure, protecting sensitive data during the data collection, preprocessing, and model evaluation phases.

3.2 PROJECT DESIGN AND ARCHITECTURE

The project architecture and design are important for ensuring the scalability, efficiency, and robustness of the project. This section outlines the main components of the project’s architecture, its design considerations, and the workflows that show its functionality.

3.2.1 OVERVIEW OF PROJECT ARCHITECTURE

This project makes efficient use of modern tools and technologies in order to build a system that can detect fake images effectively and efficiently. The architecture also includes the components for data preprocessing, model training, evaluation and deployment.

Key elements of the architecture include:

Data Pipeline:
○ Integration with and collection of datasets containing both real and fake images.
○ Use of preprocessing tools like Python libraries (e.g., OpenCV, NumPy) to standardize and augment data.
Model Training Environment:
○ TensorFlow framework used for developing Convolutional Neural Network (CNN) models.
○ NVIDIA GPUs with CUDA Toolkit, for accelerated training.
○ Various platforms like Anaconda Navigator, Jupyter Notebook and Google Colab for experimentation.

Evaluation Metrics:
○ Metrics such as accuracy, precision, recall, and F1-score to validate model performance.

3.2.2 WORKFLOW DIAGRAM

The workflow illustrates the end-to-end process of the system, from data acquisition to comparative analysis of the trained models. Key steps include:

1. Data Collection: Gather datasets of real and fake images.
2. Data Preprocessing: Clean, augment, and split data into training, validation, and test sets.
3. Model Training: Train CNN models, such as ResNet101 and EfficientNet, using optimized hyperparameters.
4. Evaluation: Test model accuracy and analyze performance metrics.

Figure 1 explains the workflow of the system and outlines the complete process from data acquisition to model evaluation. It begins with data collection, where datasets containing both real and fake images are gathered. This data is then preprocessed, involving steps like cleaning, augmenting, and splitting the data into training, validation, and test sets to ensure proper model training and generalization. Once the data is prepared, the system proceeds to model training, where the CNN-based model in the GAN architecture is trained using optimized hyperparameters to achieve the best performance. Finally, the model undergoes evaluation, where its accuracy is tested and various performance metrics, including precision, recall, and F1-score, are analyzed to assess the model's ability to detect deepfake images effectively.

Figure 1: Workflow Diagram

3.2.3 DESIGN CONSIDERATIONS

To ensure an efficient and effective system, the following design considerations were prioritized:

○ Modular Design: The architecture is divided into modular components (e.g., preprocessing, training, evaluation) to allow independent updates and scalability as we move ahead with its implementation.
Performance Optimization: Use of GPUs and parallel processing to reduce training time and improve inference speed.
User-Friendly Interface: Integration with tools like Jupyter Notebook for easy interaction and visualization of results.
Error Handling: Mechanisms to handle corrupted data, failed training runs, and other potential issues.

3.2.4 PROJECT ARCHITECTURE DIAGRAM
The project architecture diagram provides a high-level view of the system components and their interactions:
Data Input and Splitting Layer: Handles data ingestion, preprocessing, and splitting into training, validation, and test data.
Training Layer: Includes the GAN architecture, which consists of a generator using StyleGAN or ProGAN and a CNN-based discriminator, together with the necessary computational environment (e.g., NVIDIA GPU, Intel GPU, CUDA Toolkit).
Evaluation Layer: Provides metrics and insights to validate model performance.
Figure 2 illustrates the project architecture and workflow for deepfake image detection. It begins with Data Collection, where datasets of real and fake images are gathered. The collected data then undergoes Data Preprocessing, including steps like resizing, formatting, and image enhancement to ensure consistency and improve the quality of the data. Next, the data is Split into training and testing sets, with 70% allocated for training and 30% for testing. The GAN Architecture plays a crucial role in this system, where Resampling techniques are used to handle imbalanced data. The data is Categorized into two primary classes, Rare Class and Other Classes, allowing for targeted training strategies. The GAN Generator (e.g., StyleGAN or ProGAN) generates synthetic data, which is then Resampled for training purposes. In parallel, Model Training takes place using CNN-based architectures to train the GAN Discriminator for effective fake image detection.
Finally, the system undergoes Testing Using Evaluation Metrics, where the model's performance is evaluated, and Result Analysis helps determine the success and efficiency of the system.
Figure 2: Project Architecture Diagram

3.2.5 KEY TECHNOLOGIES USED
The following tools and technologies were essential in designing and implementing the system:
Hardware:
○ NVIDIA GPUs with the CUDA Toolkit and an Intel GPU for accelerated model training.
Software:
○ Kaggle for exploring datasets and existing models.
○ Python-based libraries for data processing (NumPy, Pandas, OpenCV).
○ Machine learning frameworks like TensorFlow and Keras.
○ Jupyter Notebook and Google Colab for development and experimentation.

3.3 DATA PREPARATION
3.3.1 DATA PIPELINE
The data pipeline ensures a streamlined process for preparing input data for model training and evaluation.
Data Collection
Datasets Used:
○ Yonsei Fake and Real Image Dataset: Contains 2041 images (960 fake and 1081 real).
○ NVIDIA Flickr Dataset subset: Comprises 140k images (70k real and 70k fake generated by StyleGAN).
Dataset Split:
○ Yonsei Dataset: Split in code into 80% training and 20% testing.
○ NVIDIA Flickr Dataset: Pre-split on Kaggle into 50k images for training (real and fake each), 10k for validation, and 10k for testing.
Data Preprocessing
Preprocessing ensures consistent and high-quality input to the CNN models:
Resizing: Images were resized to 150x150 or 224x224 pixels, depending on the model requirements.
Normalization: Pixel values were normalized to the range [0, 1].
Data Augmentation: Techniques like horizontal flipping, zoom, shear, and rotation were applied to increase dataset diversity.
Libraries Used: OpenCV, NumPy, TensorFlow/Keras utilities.

3.4 IMPLEMENTATION
The current implementation phase of the project involved translating the architectural blueprint into a functional system. The system was built to detect fake images using GANs.
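The data preparation described in Section 3.3.1 (the 80%/20% split of the Yonsei dataset and the normalization of pixel values to [0, 1]) can be illustrated with a minimal sketch in plain Python. This is not the project's actual code, which relies on OpenCV and TensorFlow/Keras utilities. The function names, the file names, the fixed seed, and the choice to split each class separately (a stratified split) are assumptions made for illustration only.

```python
import random


def stratified_split(paths_by_class, train_fraction=0.8, seed=42):
    """Shuffle each class separately, then split it by train_fraction.

    Splitting per class is an assumption; the report only states 80%/20%.
    """
    rng = random.Random(seed)  # fixed seed (assumed) for a repeatable split
    train, test = [], []
    for label, paths in sorted(paths_by_class.items()):
        shuffled = list(paths)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train += [(path, label) for path in shuffled[:cut]]
        test += [(path, label) for path in shuffled[cut:]]
    return train, test


def normalize(pixels):
    """Scale 8-bit pixel values (0 to 255) to floats in the range [0, 1]."""
    return [value / 255.0 for value in pixels]


# Hypothetical file names standing in for the Yonsei images
# (960 fake and 1081 real, as listed in Section 3.3.1).
paths_by_class = {
    "fake": [f"fake_{i}.jpg" for i in range(960)],
    "real": [f"real_{i}.jpg" for i in range(1081)],
}
train_set, test_set = stratified_split(paths_by_class)
print(len(train_set), len(test_set))  # prints: 1632 409
```

Splitting each class separately keeps the real-to-fake ratio roughly the same in both subsets, which matters for a dataset as small as Yonsei. Resizing and augmentation are left out here because they need an image library.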
So far, we have explored various CNN models and their behaviour in order to develop an efficient hybrid of those models that will serve as the GAN discriminator later in the project. This chapter provides an in-depth description of the implementation done so far.

3.4.1 MODEL IMPLEMENTATION
3.4.1.1 Implementation Algorithm
Step 1: Data Preparation
Load datasets, such as the Yonsei dataset and NVIDIA Flickr dataset, ensuring proper organization into directories (e.g., train/real, train/fake).
Preprocess images using libraries like OpenCV or TensorFlow utilities:
○ Resize images to required dimensions (e.g., 150x150 or 224x224 pixels).
○ Normalize pixel values to the range [0, 1].
○ Augment data with techniques like flipping, zooming, rotation, and cropping to enhance diversity.
Step 2: Dataset Splitting
Split datasets into training, validation, and test sets. For example:
○ Yonsei Dataset: 80% training, 20% testing.
○ NVIDIA Flickr Dataset: 50k images for training, 10k for validation, and 10k for testing.
Step 3: Model Initialization
Load pre-trained CNN architectures.
○ Use weights="imagenet" to leverage pre-trained weights.
○ Exclude the top classification layer (include_top=False) to allow customization.
Step 4: Custom Layer Design
Add custom layers to the base model for binary classification:
○ Apply GlobalAveragePooling2D() to reduce spatial dimensions.
○ Add fully connected dense layers with ReLU activation.
○ Incorporate dropout layers (e.g., 0.3) to prevent overfitting.
○ Use a final dense layer with sigmoid activation for binary classification.
Step 5: Model Compilation
Compile the model with appropriate loss functions and optimizers:
○ Use binary_crossentropy for binary classification tasks.
○ Choose optimizers like Adam or SGD with learning rate scheduling.
○ Define metrics such as accuracy for evaluation.
Step 6: Training Configuration
Configure training parameters:
○ Set batch sizes (e.g., 16 or 32) and epoch count (e.g., 10-25).
○ Incorporate callbacks like EarlyStopping, ModelCheckpoint, and ReduceLROnPlateau to improve training efficiency, save the model state or weights during training, and limit overfitting.
Step 7: Model Training
Train the model using the fit method:
○ Provide training and validation datasets.
○ Monitor training and validation accuracy/loss at each epoch.
○ Use GPU acceleration (e.g., NVIDIA CUDA Toolkit) to reduce training time.
Step 8: Model Evaluation
Evaluate the trained model on the test set:
○ Calculate metrics such as accuracy, precision, recall, and F1-score.
○ Generate confusion matrices and ROC-AUC curves for detailed analysis.
This algorithm provides a structured approach, ensuring that every stage of implementation is well-documented and efficient. The project explored various CNN architectures to achieve high performance in detecting fake images.

3.4.1.2 Convolutional Neural Networks (CNNs) Used
1. ResNet50 and ResNet101:
○ Architecture: Deep residual networks with 50 and 101 layers, respectively.
○ Key Features: Mitigates vanishing gradient issues using skip connections.
○ Implementation Details:
   Optimizer: SGD with learning rate decay, as shown in Fig. 3.
   Loss Function: Binary Cross-Entropy.
   Results: Achieved 64.11% accuracy on the Yonsei dataset and over 62% on the NVIDIA Flickr dataset.
Figure 3: ResNet50 model
2. EfficientNetB2 to B4:
○ Architecture: A lightweight and scalable CNN optimized for computational efficiency.
○ Key Features: Uses compound scaling (width scaling + depth scaling + resolution scaling).
○ Implementation Details:
   Optimizer: Adam with early stopping and ReduceLROnPlateau callbacks.
   Loss Function: Binary Cross-Entropy.
   Results: Lower performance (~50%) on initial trials, later improved to 50%, with plans to test EfficientNetB4 for improvement.
3. XceptionNet:
○ Architecture: A deep learning model leveraging depthwise separable convolutions for efficient computation.
○ Key Features: Improves upon the Inception architecture by using fewer parameters and achieving better performance on complex tasks.
○ Implementation Details:
   Optimizer: Adam.
   Loss Function: Categorical Cross-Entropy, as shown in Fig. 4.
   Results: Achieved 73% accuracy, highlighting room for optimization in training.
Figure 4: Xception model
4. DenseNet121:
○ Architecture: A densely connected neural network promoting feature reuse, as shown in Fig. 5.
○ Key Features: Uses dense connections between layers to improve feature reuse.
○ Implementation Details:
   Techniques: Early stopping, ReduceLROnPlateau, and batch normalization.
   Results: Achieved an accuracy of 92% on the NVIDIA dataset with robust generalization.
Figure 5: DenseNet121 model
5. VGG16:
○ Architecture: A classic deep learning architecture with 16 layers, pre-trained on ImageNet.
○ Key Features: Known for its deep stack of convolutional layers; uses 3x3 filters and max pooling.
○ Implementation Details:
   Added fully connected layers with ReLU activation and batch normalization, as shown in Fig. 6.
   Optimizer: Adam with a learning rate of 0.0001.
   Loss Function: Categorical Cross-Entropy.
   Results: Achieved 95% ROC-AUC on the NVIDIA dataset.
Figure 6: VGG16 model

3.4.2 EVALUATION METRICS USED
Accuracy: Correctly classified images divided by the total number of images.
Precision: Fraction of true positives among predicted positives.
Recall: Fraction of true positives among actual positives.
F1-Score: Harmonic mean of precision and recall.
Training Time: Time required to train models on each dataset.

3.5 KEY CHALLENGES
Training Time: Training on larger datasets such as the NVIDIA Flickr dataset required significant computational resources, with models like DenseNet121 taking over 4 hours to train. This necessitated the use of NVIDIA GPUs with the CUDA Toolkit for acceleration.
Efficiency on Smaller Datasets: The Yonsei dataset, being smaller in size, led to less efficient model performance due to overfitting and limited generalizability.
Techniques like data augmentation and regularization were employed to improve outcomes.
High GPU System Requirements: Training deep learning models effectively