CSIT375 L1 Introduction.pdf


Introduction
CSIT375 AI for Cybersecurity
Dr Manoj Kumar, UOWD
Dr Chen Chen, UOWA
SOCS, University of Wollongong in Dubai

Disclaimer: The presentation materials come from various sources. For further information, check the references section.

Outline
- What is AI
- Brief introduction to cyber security
- How AI helps cyber security
- Limitations of AI in security

What is AI
Artificial Intelligence (AI)
- Artificial intelligence is a popular term for algorithmic solutions to complex problems typically solved by humans.
- AI systems have been loosely defined as machine-driven decision engines that can achieve near-human-level intelligence.
- Artificial intelligence is the ability of machines/computers to perform tasks that are generally associated with human-driven intellectual processes such as reasoning.

AI has been moving extremely quickly in the last few years, demonstrating a potential to revolutionize every aspect of our lives.
- Areas: work, economy, security, mobility
- Examples: stock price prediction, ensuring compliance in the workplace, online meeting transcriptions, anomaly detection, risk assessment, self-driving cars

AI can be broadly defined as technology that can learn and produce intelligent behavior. An AI process maps an input to an output:
- Cybersecurity: input: malware; output: "ransomware"
- Computer vision: input: pixels; output: "Four kids are playing with a ball" (more than just a category about the image!)
Further examples of an AI process mapping an input to an output:
- Speech recognition: input: audio clip; output: "set an alarm for 7:00 a.m."
- Machine translation: input: text "Hello, how are you?"; output: "Bonjour, comment allez-vous"

Think of this as incoming impulses (input) passed from one neuron (AI process) to the next, if any, finally generating an output. The AI process is essentially a mathematical function (more on this next week).

Connect as many of these neurons as needed, resulting in what is called a neural network (a branch of AI).
- A neural network is a method in artificial intelligence that teaches computers to process data in a way that is inspired by the human brain.
- The more layers you add, the deeper it becomes. Deep networks are referred to as deep neural networks or deep learning (DL) models.

Subsequently, train the DL model on known examples (labels: botnets, ransomware):
- Example 1: show it a known example.
- Example 2: again, with a different known example.
- Example 3: yet again, with another different known example.
- Repeat until it learns the patterns of ransomware and botnets in the input data.

A botnet is a group of Internet-connected devices, each of which runs one or more bots.

After training the DL model, use it to infer what an unknown malware sample is (botnet or ransomware?).
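
The idea of a neuron as a mathematical function that is trained on known examples and then used for inference can be sketched in a few lines. This is only an illustration under stated assumptions, not the model used in the lecture: it trains a single neuron (logistic regression) rather than a deep network, and the two features, the training values, and the function names are all hypothetical.

```python
import math

def neuron(weights, bias, features):
    """One neuron: weighted sum of the inputs, squashed by a sigmoid into (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=2000, lr=0.5):
    """Fit the neuron by stochastic gradient descent on the log loss."""
    n_features = len(examples[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for features, label in examples:
            # Gradient of the log loss with respect to z is (prediction - label).
            error = neuron(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def classify(weights, bias, features):
    """Map the neuron's output to one of the two known labels."""
    return "ransomware" if neuron(weights, bias, features) >= 0.5 else "botnet"

# Hypothetical features per sample (made up for illustration):
# (share of files rewritten, share of activity that is outbound network traffic)
# Label 1 = ransomware, label 0 = botnet.
TRAINING = [
    ((0.90, 0.10), 1), ((0.80, 0.20), 1), ((0.95, 0.05), 1),
    ((0.10, 0.90), 0), ((0.20, 0.70), 0), ((0.05, 0.95), 0),
]

weights, bias = train(TRAINING)

# Inference on a sample the model has not seen during training.
print(classify(weights, bias, (0.85, 0.15)))
```

With this toy data the unseen sample resembles the ransomware examples, so the script prints `ransomware`. A deep learning model follows the same train-then-infer loop, with many such neurons stacked in layers.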
Definitions
- AI: the concept of creating intelligent machines
- ML: a branch of AI that helps to build AI-driven applications
- DL: a branch of machine learning that uses large amounts of data and deep neural networks to train a model

Cyber Security
Cybersecurity is the body of technologies, processes and practices designed to protect networks, computers, programs and data from attack, damage or unauthorized access.

Goals: the C-I-A triad
- Confidentiality: preventing unauthorized disclosure of information
- Integrity: preventing unauthorized modification of information
- Availability: preventing unauthorized withholding of information
- Others: non-repudiation, accountability, etc.

What is cybersecurity all about?
- Protecting information and systems from threats and attacks, ensuring they remain secure and functional
- In an organization, the people, processes, and technology need to complement one another to create an effective defense against cyber attacks
  - Users: understand and comply with basic data security principles
  - Processes: a framework for how to deal with cyber attacks
  - Technology: the tools needed to protect against cyber attacks
    - Entities that need to be protected: endpoints, networks, the cloud
    - Tools: firewalls, malware protection, antivirus software, email security solutions, ...

Common types of cybersecurity threats
- Phishing: sending fraudulent emails that resemble emails from reputable sources in order to steal sensitive data, e.g.
  credit card numbers, login information
- Ransomware: extorts money by blocking access to files or computer systems until the ransom is paid
- Malware: software designed to gain unauthorized access or to cause damage to a computer
- Social engineering: tricking people into revealing sensitive information

How AI helps security
AI + ML + Cyber Security
- Machine learning has been quickly adopted in cybersecurity for its potential to automate the detection and prevention of attacks, particularly for next-generation antivirus.
- AI systems have a lot of potential for cybersecurity applications, such as learning from existing cyber incidents, predicting attacker behavior, and taking proactive defensive measures to protect critical infrastructures.

AI lets computers learn without being explicitly programmed.
- Challenges in managing threat information
  - Massive amounts of data must be tracked and correlated
  - This is not feasible with people alone, so the analysis must be automated
- In security, AI continuously learns by analyzing data to find patterns and predict threats in massive data sets
  - detect malware
  - find insider threats
  - keep people safe when browsing
  - uncover suspicious user behavior
  - ...

Find threats on a network
- Detect threats by constantly monitoring the behavior of the network for anomalies
- AI engines process massive amounts of data in near real time to discover critical incidents
- Detection of insider threats, unknown malware, and policy violations

Keep people safe when browsing
- Predict "bad neighborhoods" online to help prevent people from connecting to malicious websites
- Analyze Internet activity to automatically identify attack infrastructures staged for current and emergent threats

Provide endpoint malware protection
- Detect unknown malware that is trying to run on endpoints
- Identify new malicious files and activity based on the attributes and behaviors of known malware

Protect data in the cloud
- analyzing suspicious cloud app login activity
- detecting location-based
  anomalies
- conducting IP reputation analysis to identify threats and risks in cloud apps and platforms

Applications of AI in Security
- Monitoring
  - Automated and continuous; feeds data to analysts
  - Things to look for: intrusion, privacy violation, exfiltration, ...
- Analysis
  - Not necessarily continuous
  - Often initiated by humans
  - Applications: threat intelligence, incident investigation, vulnerability assessment, ...

High-level view of a monitoring pipeline
system logs, data access logs, network traces → feature extraction → anomaly detection → security analyst

Example: detecting anomalous actor behavior
- Some reasons to care
  - Has an employee account been compromised by malware?
  - Is there intentional malicious activity?
- Goal: model actor behavior, find anomalies
- What we need to do
  - Identify useful features
  - Model normalcy
  - Find outliers

Feature extraction: modeling actors
- Partition logs by actor and time
- Represent (actor, time) pairs as vectors of binary variables

Modeling normalcy and finding outliers
- Need to find low-probability features or combinations of features
- Many possible approaches
- Nearest neighbors ⇒ similarity metric between actors
  - Intuition: find users that are not very similar to any other users
  - Variant: compare a user to her past
    - "Neighbors" are feature vectors in the user's past
    - Identify changes in behavior
- "Strange pairs" ⇒ features that rarely appear together
  - Intuition: identify users with pairs of features that occur frequently individually but rarely together
  - E.g., "accessed source code" and "works in HR"

AI for security analysis
- Skilled analysts are a valuable resource: give them the tools to use their time effectively
- [Figure: data (disk images, network signals, logs, ...) and the analyst's questions about the data both feed a data analysis system.]

Questions an analyst might ask
- Broad spectrum of tools
- Looking for causes and effects: graph traversal
- Triaging: malware classification
- Statistical (as well as graph-based) approaches are effective in this problem domain

Example 1: graph traversal for incident investigation
Some questions to answer:
- Were any machines affected by a watering hole attack?
- User A downloaded malware. What should be cleaned up?
Given a graph representation of all relevant logs, these can be framed as a large-scale graph search problem.

Graph creation
- Log lines induce graph components
- Edges are annotated with times and semantics
- Many different log sources are combined in one huge graph

Sample graph query
Given watering hole hostname X...
→ IPs that it resolved to
→ internal IPs that talked to them
→ machines (assets) those internal IPs belonged to
→ users who used those machines
→ other machines those users have logged into
Hours of manual research are replaced by a ~10-second query.

Example 2: malware classification
- Given a binary, is it malware? If so, what kind?
- Each sample is an executable. It has indicators (features) from static and dynamic analysis (e.g., basic block structure, registry changes, ...)
- Malware in the training corpus also has one or more labels (from manual labeling, A/V signatures, etc.) denoting its families
- Why is this useful? Incident triage: is this malware we should care about?

Modeling the data
- Each sample X is a sparse N-dimensional vector (N ≈ millions)
- Each label is an integer in [1, k] (k ≈ thousands)
- The model learns a projection into a low-dimensional embedding space
  - Makes the problem computationally feasible
  - Provides a meaningful metric inside the embedding space

Using the model
Once the projection matrices are learned, we can do useful things:
- Compare two samples? Project into the embedding space, measure the distance
- Closest family to a sample? Project the sample and all families, find the smallest distance
- Approximate nearest sample? Filter samples by closest family

Limitations of AI in Security
"Security is a process"
- Technology (AI, ML, etc.) is only a tool, not a complete solution
- User education (social engineering is surprisingly successful)
- System hardening (auth, secure engineering, timely patches, ...)
- Operational procedures
- Adapting to growth (new hires / platforms)
- Maintaining alertness (in the absence of major incidents)
- Gathering intelligence
- Escalation and response playbooks

Limitations of AI in Security (continued)
- False negatives are very expensive
  - Could cause arbitrary damage to our users
- False positives are expensive too
  - Analyst time is valuable
  - Alerts should make sense to a human
- The analyst (security expert) is key
  - False positives + inexplicable results → signal fatigue

Summary
- Subject overview
- What is AI
- Brief introduction to cyber security
- AI applications in cybersecurity
- Limitations of AI in security

Some Important Tools
1. CrowdStrike Falcon is a security solution that uses an AI-based detection system called User and Entity Behavior Analytics (UEBA).
2. StringSifter is a machine-learning program that ranks strings based on their relevance to malware analysis.
3. BioHAIFCS is an acronym for Konstantinos Demertzis and Lazaros Iliadis's Bio-inspired Hybrid Artificial Intelligence Framework for Cyber Security.
4. IBM's QRadar Advisor is a cybersecurity product that uses AI to quickly mitigate problems while keeping the company's bottom line intact (QRadar SIEM).
5. The TAA tool is a cloud-based AI technology used in the enterprise-focused advanced threat prevention platform of Symantec (Broadcom).
6. Cognito by Vectra is an AI-powered technology that identifies and responds to cyberattacks in the cloud in minutes.
7. Sophos' Intercept X is a cybersecurity technology that includes a deep learning neural network, moving endpoint security from a reactive to a predictive strategy for defending against potential threats and cyber attacks.
8. Trend Micro's DefPloreX is a machine learning toolbox for large-scale cybercrime forensics.
9. Intraspexion is an AI software solution that applies deep learning algorithms to identify danger and deliver early alerts from threat detection.
10.
    Malwarebytes

References
- Machine Learning, Analytics & Cyber Security: the Next Level Threat Analytics, Manjunath N V
- Data mining for security at Google, Max Poletto, Google security team
- WSABIE: scaling up to large vocabulary image annotation, Jason Weston, Samy Bengio, and Nicolas Usunier
- Machine learning foundational course, Google Developers
- AI for Medicine, Mohammad Hammoud, CMU Qatar
- Practical Machine Learning in Infosecurity, Clarence Chio and Anto Joseph
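
Worked example: nearest-neighbor outlier detection
The nearest-neighbor approach from "Modeling normalcy and finding outliers" can be sketched as follows: represent each actor as a vector of binary variables, then score each actor by how dissimilar it is to its most similar peer. This is a minimal sketch under stated assumptions. The slides do not name a similarity metric, so Jaccard similarity is assumed here, and the actor names and feature columns are hypothetical.

```python
def jaccard(a, b):
    """Similarity of two binary vectors: shared 1s divided by positions where either is 1."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0

def outlier_scores(vectors):
    """Score each actor as 1 minus the similarity to its nearest neighbor."""
    scores = {}
    for actor, vec in vectors.items():
        nearest = max(
            jaccard(vec, other)
            for name, other in vectors.items()
            if name != actor
        )
        scores[actor] = 1.0 - nearest
    return scores

# Hypothetical binary features per actor (made up for illustration):
# [accessed_source_code, used_hr_system, used_vpn, ran_build]
ACTORS = {
    "dev_1": [1, 0, 1, 1],
    "dev_2": [1, 0, 1, 1],
    "dev_3": [1, 0, 0, 1],
    "hr_1":  [0, 1, 1, 0],
    "hr_2":  [0, 1, 1, 0],
    "hr_3":  [1, 1, 0, 0],  # HR user who also accessed source code
}

scores = outlier_scores(ACTORS)
most_unusual = max(scores, key=scores.get)
print(most_unusual, round(scores[most_unusual], 3))
```

In this toy data `hr_3` has the highest score because no other actor looks much like it. The "compare a user to her past" variant would use the same scoring, with the neighbors taken from that user's earlier (actor, time) vectors instead of from other users.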

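
Worked example: the sample graph query
The watering hole query from "Example 1: graph traversal for incident investigation" can be sketched as a chain of hops over typed edges. This is an illustrative sketch only: the edge list, the relation names, and the host and user names are hypothetical, and the time annotations on edges mentioned in the slides are left out for brevity.

```python
from collections import defaultdict

# Hypothetical (source, relation, target) edges of the kind log lines might induce.
EDGES = [
    ("evil.example", "resolved_to", "203.0.113.7"),
    ("10.0.0.5", "talked_to", "203.0.113.7"),
    ("10.0.0.9", "talked_to", "198.51.100.2"),
    ("10.0.0.5", "belongs_to", "laptop-17"),
    ("alice", "used", "laptop-17"),
    ("alice", "logged_into", "build-server"),
    ("bob", "logged_into", "mail-server"),
]

def build_index(edges):
    """Index edges in both directions so each hop is a dictionary lookup."""
    forward, backward = defaultdict(set), defaultdict(set)
    for src, rel, dst in edges:
        forward[(src, rel)].add(dst)
        backward[(dst, rel)].add(src)
    return forward, backward

def step(nodes, index, relation):
    """Follow one relation from every node in the current frontier."""
    result = set()
    for node in nodes:
        result |= index.get((node, relation), set())
    return result

def watering_hole_query(hostname, edges):
    """Walk hostname → IPs → internal IPs → machines → users → other machines."""
    forward, backward = build_index(edges)
    ips = step({hostname}, forward, "resolved_to")
    internal_ips = step(ips, backward, "talked_to")
    machines = step(internal_ips, forward, "belongs_to")
    users = step(machines, backward, "used")
    other_machines = step(users, forward, "logged_into") - machines
    return {
        "ips": ips,
        "internal_ips": internal_ips,
        "machines": machines,
        "users": users,
        "other_machines": other_machines,
    }

result = watering_hole_query("evil.example", EDGES)
for stage, nodes in result.items():
    print(stage, sorted(nodes))
```

Each hop only touches the edges of the current frontier, which is why a query like this can stay fast even when many log sources are merged into one large graph.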