NTU Cyber Threat Intelligence Lifecycle Collection PDF

Summary

This document discusses the NTU Cyber Threat Intelligence Lifecycle Collection module. It covers topics related to collecting information and data for cyber threat intelligence. It also introduces the intelligence lifecycle, types of intelligence, and collection management.

Full Transcript

Welcome to the NTU Cyber Threat Intelligence Lifecycle course collection module. My name is Chris Carson and I'll be your instructor for this module. This module will cover a number of topics related to the collection of information and data used for cyber threat intelligence purposes. We'll begin with a brief introduction and recap of the intelligence lifecycle, and then we'll talk about some of the types of intelligence that are often collected by CTI teams. We'll spend some time discussing each of these types in greater detail. For the purposes of this module, we've broken them into three broad categories: clear, deep, and dark web intelligence; external telemetry and forensics-based intelligence; and internal telemetry and forensics-based intelligence. Finally, we'll discuss how understanding our collection sources fits into the overall collection management process and broader intelligence framework.
To recap, the intelligence lifecycle describes five distinct phases of the end-to-end intelligence process, which we use as an overarching framework for organizing the core workflows of a cyber threat intelligence team. You should have already taken the requirements and planning module, which details the intelligence requirements process and how it sets the foundation for the subsequent phases of the intelligence cycle. If you've not done so, please pause this module and review the requirements and planning module first. The collection phase, which is the focus of this module, relates to the collection of information and data that will be used to conduct analysis and produce intelligence products. Next, the processing and ingestion phase relates to the processes undertaken to ensure that the information and data we have collected is in a format that is appropriate for analysis and can be ingested into analytic tools and workbenches.
The analysis and production phase focuses on the analysis of this data and information to derive analytic assessments and judgments, which are then developed into intelligence products to enable stakeholder action and decision. Lastly, the dissemination and feedback phase includes all the processes and workflows associated with delivering intelligence products to stakeholders and soliciting feedback to refine intelligence requirements and products.
The collection phase includes all the workflows, tools, and processes used to collect information and data to meet a team's intelligence needs, which, as you learned in the previous module, are formalized into collection requirements. The collection process includes identifying and leveraging existing intelligence sources that a team may have access to, as well as identifying intelligence gaps and seeking out new sources of intelligence that can fill those gaps. Collectively, these processes are often referred to as collection management, and in many organizations, particularly government and military intelligence organizations, collection management is treated as a unique discipline with its own tradecraft, tools, and skill sets. As you'll recall from the requirements and planning module, intelligence needs are generally broken down into three cascading elements: intelligence requirements, or IRs; essential elements of information, or EEIs; and collection requirements, or CRs. Essentially, we can think of IRs as analytic questions we're trying to answer, EEIs as the factual pieces of information we would need to answer those questions, and CRs as the information and data we would need to collect to fulfill our EEIs. To put this into non-cyber terms, an IR might be, will country X invade country Y? An EEI for that question might be, has country X significantly increased the number of troops stationed near country Y's border?
And finally, a related CR might be, collect information related to the number of troops stationed within 50 miles of the border with country Y over the last six months. One additional concept that relates to the collection phase is what we call tasking, which is the formal assignment of a CR to a collection source in a format that is appropriate for that source. Continuing with the example above, if we wanted to task our CR to an aerial photography asset, we may provide specific details or specifications appropriate for the capabilities of that asset, such as, please capture photos of country X military barracks, vehicles, and field hospitals within 50 miles of the border of country Y, which would then be used to estimate the number of troops. To bring this back to the cyber threat intelligence space, some CRs and their associated taskings will only be appropriate for some sources based on the types of information and data that those sources specialize in collecting. As a result, a primary skill related to collection management is a good understanding of the types of sources appropriate for your intelligence needs, the strengths and limitations of each, and the applications of that intelligence to your mission space.
When thinking about collection and how to appropriately task collection requirements, we need to think about the types of data and information that can be leveraged for cyber threat intelligence purposes. To explore this concept, we'll be discussing three overarching categories of cyber threat intelligence: clear, deep, and dark web intelligence; external telemetry and forensics-based intelligence; and internal telemetry and forensics-based intelligence. We'll also discuss common sources and vendors that would fall within each of these categories and the applications and limitations of the intelligence associated with each category.
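The cascade from IR to EEI to CR to tasking described above can be sketched as a small data model. This is only an illustrative sketch in Python: the class names, field names, and the `untasked_requirements` helper are assumptions made for this example and do not come from the module or from any standard CTI tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tasking:
    """A CR phrased in a format appropriate for one collection source."""
    source: str
    instructions: str


@dataclass
class CollectionRequirement:
    """The information or data to collect in order to fulfill an EEI."""
    text: str
    taskings: List[Tasking] = field(default_factory=list)


@dataclass
class EssentialElement:
    """A factual piece of information needed to answer an IR."""
    text: str
    collection_requirements: List[CollectionRequirement] = field(default_factory=list)


@dataclass
class IntelligenceRequirement:
    """An analytic question the team is trying to answer."""
    question: str
    eeis: List[EssentialElement] = field(default_factory=list)


def untasked_requirements(ir):
    """Return every CR under an IR that has not yet been tasked to a source."""
    return [
        cr
        for eei in ir.eeis
        for cr in eei.collection_requirements
        if not cr.taskings
    ]


# The non-cyber example from the transcript, expressed in this structure.
cr = CollectionRequirement(
    "Collect information related to the number of troops stationed within "
    "50 miles of the border with country Y over the last six months."
)
eei = EssentialElement(
    "Has country X significantly increased the number of troops stationed "
    "near country Y's border?",
    [cr],
)
ir = IntelligenceRequirement("Will country X invade country Y?", [eei])

gaps_before = untasked_requirements(ir)  # the CR has no tasking yet

cr.taskings.append(
    Tasking(
        source="aerial photography asset",
        instructions=(
            "Capture photos of country X military barracks, vehicles, and "
            "field hospitals within 50 miles of the border of country Y."
        ),
    )
)

gaps_after = untasked_requirements(ir)  # empty once the CR is tasked
```

Keeping taskings attached to the CR they serve makes it straightforward to list requirements that no source is currently collecting against, which is one simple way to surface the intelligence gaps mentioned earlier.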
Clear, deep, and dark web intelligence refers to intelligence derived from data and information collected through manual or automated means from clear, deep, and dark web resources, usually in the form of text-based content. Clear web intelligence generally includes all news articles, blogs, tweets, videos, images, and other content that is publicly available, indexed by conventional search engines, and can be accessed without special permissions or processes. For example, if you've ever gathered cyber threat related intelligence through Google searches, you've conducted clear web collection. Deep web intelligence includes content that resides behind paywalls or on non-publicly available resources, which are not indexed by conventional search engines, but can be accessed through a conventional internet browser as long as the user has an account or the appropriate permissions to access that resource. For example, non-public content on a social media network that would require the user to be connected to the publisher of that content would fall within deep web collection. Dark web intelligence includes all content that resides on overlay networks that use the internet but require specific software, configurations, or authorization to access, such as the Tor network. These resources are, by their nature, unable to be indexed by conventional search engines and, in some cases, include content that is illegal or generated from illegal activity. When we think of deep and dark web resources as it relates to cyber threat intelligence, we're usually talking about content found on cyber criminal forums, illicit marketplaces, and social media networks that are commonly used by threat actors to communicate with one another. In addition to conducting their own open source research, most CTI teams use one or more vendors that specialize in the collection and dissemination of clear, deep, and dark web intelligence. 
Most of these vendors do this by using specialized tools that collect large amounts of data and content from a variety of sources, allowing them to gather relevant content at scale. Vendors in this space also collect and translate foreign language content and, in some cases, develop threat actor personas that are used to access restricted or invite-only resources, such as cyber criminal forums and marketplaces. Vendors will also use these personas to directly interact with the threat actors and, in some cases, purchase stolen credentials, data, and personally identifiable information being sold on those marketplaces. Vendors in this space generally maintain intelligence repositories that allow CTI analysts to create custom queries and alerts for specific types of information or data within their holdings. Creating custom queries in these platforms is a common way to task these types of vendors with your appropriate collection requirements. For example, if you have a CR to collect reportedly compromised credentials associated with your organization, you might develop an alerting rule that notifies you if the vendor collects any email addresses matching your domain from a cyber criminal marketplace. Due to the human intelligence capabilities of some of the vendors in this space, you may also want to task them to actively collect intelligence from specific resources or threat actors of interest to your organization.
So now that we've learned a little bit about what collectors in this space do and how they do it, let's explore some of the common applications of this type of intelligence to the CTI space. One of the most commonly leveraged applications of this type of intelligence is its use for notification of potentially compromised credentials or data. This often occurs when a threat actor attempts to sell compromised credentials or data to other threat actors, which are then used to conduct downstream cyber attacks, fraud, or identity theft.
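The alerting rule mentioned above, which notifies you when collected content contains email addresses matching your domain, could look roughly like the following Python sketch. Vendor platforms have their own query languages, so none of this reflects a real product. The domain `example.com`, the record layout, and the function names are all hypothetical, and the email pattern is deliberately simple.

```python
import re

# Hypothetical domain; substitute the domain(s) your organization owns.
MONITORED_DOMAIN = "example.com"

# Simple illustrative pattern, not a complete email address parser.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def matching_addresses(text, domain=MONITORED_DOMAIN):
    """Return unique addresses in text at the domain or one of its subdomains."""
    domain = domain.lower()
    hits = set()
    for address in EMAIL_PATTERN.findall(text):
        host = address.rsplit("@", 1)[1].lower()
        # Require an exact match or a dot boundary so that lookalike
        # domains such as notexample.com are not reported.
        if host == domain or host.endswith("." + domain):
            hits.add(address.lower())
    return sorted(hits)


def alerts_for(records, domain=MONITORED_DOMAIN):
    """Yield one alert per collected record that mentions a monitored address."""
    for record in records:
        hits = matching_addresses(record["content"], domain)
        if hits:
            yield {"source": record["source"], "addresses": hits}


# Made-up records standing in for content collected from a marketplace.
sample_records = [
    {"source": "marketplace-listing-1", "content": "combo: Alice@example.com:hunter2"},
    {"source": "marketplace-listing-2", "content": "bob@notexample.com:letmein"},
    {"source": "forum-post-7", "content": "fresh logs carol@mail.example.com"},
]

alerts = list(alerts_for(sample_records))
```

With these sample records, the first and third entries raise alerts and the lookalike domain in the second does not. A hit from a rule like this is a lead to investigate, not proof of compromise.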
Sometimes we also see claims of initial compromise of an organization, application, or other resource. Some threat actors, often referred to as initial access brokers, specialize in developing and selling footholds in compromised systems. They then advertise and sell this access to other threat actors who use this access to conduct follow-on cyber attacks, such as data exfiltration or ransomware deployment. CTI teams can also use this type of intelligence to discover when critical vendors or supply chain organizations have been potentially compromised, allowing their organizations to initiate third-party risk mitigation activities. One key aspect to note about these applications is that threat actors often provide as little information as possible about the organization, domain, or application they've compromised. Remember, their goal is to entice potentially interested buyers, but they're also aware that intelligence vendors and law enforcement organizations regularly monitor cyber criminal forums and marketplaces looking for this kind of information too. This is when vendors will leverage their threat actor personas to engage directly with an actor selling something of probable interest to their customers in an attempt to gather more details about the compromised resource or data, and in some cases, purchase it on behalf of the victim organization. Lastly, this type of intelligence is often used to understand emerging threats. This may come in the form of threat actors selling new types of commodity malware and tools, or discussions around tactics, techniques, and procedures. Threat actors also use deep and dark web forums to recruit other actors for specific purposes or operations. This may reveal plans and intentions related to emerging operations or targeting of specific industry verticals.
With web intelligence, there are a few caveats to consider as it relates to the ultimate sources of this intelligence.
Specifically, you need to consider if the source is reliable and authoritative and approach this type of intelligence with a level of healthy skepticism. One of the most common issues with this type of intelligence is that much of it ultimately derives from threat actors themselves. It's important to remember that threat actor claims may be false, exaggerated, or contain inaccuracies. Some threat actors will claim to have compromised an organization or application for notoriety, or will attempt to resell compromised credentials or data from historical breaches in an attempt to make a quick profit. With this in mind, it's important to remember that this type of intelligence is often the starting point for an investigation rather than a definitive source of truth. For example, if a threat actor claims to have compromised an organization, their CTI team should communicate that to appropriate stakeholders and work with incident responders to investigate it as a potential incident. But a CTI team should never consider a threat actor claim to be definitive proof of compromise. Due to the nature of this intelligence, which often derives from claims made after an incident has already occurred, it is generally more valuable for initiating incident response and risk mitigation activities than it is for conducting proactive defensive measures, such as threat hunting or detections engineering. In other words, intelligence from clear, deep, and dark web sources tends to lead to reactionary incident response workflows rather than strategic planning. Lastly, it's important to remember that engaging with threat actors either directly or through a vendor can create several risks for an organization. For example, a threat actor selling stolen credentials may be a sanctioned entity, and therefore purchasing those credentials could result in legal risks to your organization.
As a result, CTI teams should always ensure they are operating within their organization's risk appetite and legal frameworks when engaging with threat actors, even if they're doing so through a vendor.
Now that we've discussed clear, deep, and dark web intelligence, let's discuss some of the considerations and thoughts related to external telemetry and forensics-based intelligence. External telemetry and forensics-based intelligence refers to intelligence derived from the sensors, logs, security appliances, networks, and applications of external organizations. This data may be collected in an automated fashion from sensors and security solutions or manually during incident response engagements. Because this data is most often collected from security solutions, it is, by its nature, intended to be used for incident response and intelligence purposes. As a result, the data elements collected are often rich, fit for purpose, and include both host and network-based indicators. Most vendors in this space aggregate data collected from multiple customer environments, which allows them to derive unique technical insights on threats affecting multiple customers or targeting specific systems, applications, or industry verticals. CTI teams may collect some external telemetry and forensics-based intelligence from industry or government intelligence partners, but generally this type of intelligence comes primarily from vendors that develop, deploy, and manage technical cybersecurity solutions or applications. For most of these vendors, their cybersecurity solutions and applications are their core product lines and are used to directly support incident response, network defense, and other threat management activities. Most organizations with developing or mature cybersecurity programs will very likely already use one or more of these vendors to support their cybersecurity operations, even if they don't use them for intelligence purposes.
As a result, these types of vendors often have the greatest visibility on the threats impacting the types of systems and applications their solutions were designed to detect and mitigate. They are also able to aggregate data from multiple customers to derive unique insights on the threat landscape that can be leveraged to create a variety of intelligence products, including threat feeds, finished intelligence, and threat hunting. So now that we've learned a little bit about what external telemetry and forensic-based intelligence is, let's consider its applications to cyber threat intelligence. As mentioned, these types of intelligence sources provide high-confidence technical insights on threats to systems and applications that are detailed enough to be leveraged for proactive network defense purposes. As a result, these types of vendors are a primary source of high-fidelity observables and indicators of compromise, such as IPs and URLs, that can be directly leveraged to support network defense, threat hunting, and attribution. For example, a URL identified as a credential harvesting site associated with a ransomware operation could be leveraged to create alerts or blocks for users attempting to navigate to that website, used to conduct retroactive hunts to see if any users in your environment connected to it in the past, and in the event of positive hits for either, establish probable attribution to the ransomware operation, which may inform additional incident response and risk mitigation activities. Similarly, these vendors can often provide technical analysis and detection rules for observed malware samples that can be leveraged to identify malware in your environment and support incident response processes. But perhaps more importantly, these vendors often provide detailed analyses on the specific ways in which threat actors conduct their attacks, most often referred to as tactics, techniques, and procedures, or TTPs. 
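The URL example above, in which one indicator supports both blocking and a retroactive hunt, can be sketched in Python as follows. The indicator, the proxy log layout, and the function names are hypothetical and exist only for illustration. Real deployments would normally do this inside a proxy, SIEM, or threat intelligence platform rather than in a standalone script.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to lowercase host plus path so trivial variations match."""
    # urlsplit only recognizes the host when a scheme or leading // is present.
    parts = urlsplit(url if "://" in url else "//" + url)
    host = (parts.hostname or "").lower()
    return host + parts.path.rstrip("/")


# Hypothetical indicator as it might arrive from a vendor, with its context.
INDICATORS = {
    normalize("https://login-portal.example.net/signin"):
        "credential harvesting site linked to a ransomware operation (made up)",
}


def should_block(url, indicators=INDICATORS):
    """Forward-looking use: decide whether a requested URL matches an indicator."""
    return normalize(url) in indicators


def retro_hunt(proxy_log, indicators=INDICATORS):
    """Backward-looking use: find past requests that match a known indicator."""
    hits = []
    for entry in proxy_log:
        context = indicators.get(normalize(entry["url"]))
        if context:
            hits.append({**entry, "indicator_context": context})
    return hits


# Made-up historical proxy entries.
proxy_log = [
    {"time": "2024-01-03T09:15:00", "user": "user-a",
     "url": "http://LOGIN-PORTAL.example.net/signin/"},
    {"time": "2024-01-03T09:20:00", "user": "user-b",
     "url": "https://intranet.example.org/home"},
]

hits = retro_hunt(proxy_log)
```

Here only the first log entry matches. Carrying the vendor's context along with each hit is what lets a positive match point toward probable attribution and inform the follow-on response.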
While threat actors can easily change the email addresses or websites they use to support their operations, changing their strategies, behaviors, and tools is far more difficult. By understanding threat actor TTPs, organizations can develop threat-informed defensive measures and prioritize investment in engineering and security solutions that are most likely to mitigate their primary threats. In a similar vein, the aggregated nature of these insights provides an understanding of the types of threats affecting different types of organizations, geographic regions, and technology solutions, allowing network defenders to prioritize the mitigation of threats that are the most likely to target their organizations. We've explored the application of external telemetry and forensics-based intelligence. Let's briefly touch upon a few issues to consider when using these types of sources. As mentioned, the key takeaway from a stakeholder perspective is that this type of intelligence can be used to support several proactive network defense activities and inform prioritization, investment, and security engineering. However, those types of applications generally require a high degree of cybersecurity acumen and organizational maturity. To put this into perspective, if you don't have any stakeholders capable of conducting a behavior-based threat hunt, then developing behavior-based cyber threat intelligence products probably won't be of much value to those stakeholders. Lastly, when considering which vendors to onboard or task for collection, it's important to consider that vendors in this space have different types of visibility depending upon the types of cybersecurity products they manage and their geographic footprint. For example, if you were working with a vendor based in Canada that sells an email security solution, their intelligence offerings are likely to focus on email-based threats affecting organizations in North America. 
This means that intelligence vendors and the ways in which they are tasked should be approached based on alignment with your organization's security needs, technology stack, and geographic location.
Now that we've discussed clear, deep, and dark web intelligence and external telemetry and forensics-based intelligence, let's take a look at the last category, internal telemetry and forensics-based intelligence. We've learned about some of the threat intelligence sources that will help us understand the threat landscape. Let's talk about threat intelligence collected from our own environment. Essentially, internal telemetry and forensics-based intelligence includes any data or information collected from internal sensors, security appliances, applications, logs, and post-incident forensics that can be used to answer our intelligence requirements. This intelligence can be analyzed by itself or can be correlated and enriched with external intelligence sources to derive unique, actionable, and highly relevant insights. However, this type of intelligence is often the most difficult for CTI teams to fully leverage, even in mature organizations. As a result, fully integrating internal telemetry and forensics-based intelligence into a CTI program is often considered to be the holy grail of many CTI teams. Let's explore why. At its core, internal telemetry and forensics-based intelligence is any threat intelligence that is derived from data collected from your organization's environment. Nearly every component of a network generates log data that, when appropriately parsed, cleaned, aggregated, and analyzed, can provide highly valuable insights on the threat activity targeting or impacting the environment in which it was collected. Many security solutions and applications generate contextualized security alerts, which are primarily intended to support incident responders but can also be leveraged by intelligence analysts to understand their threat environment.
However, effectively leveraging this data is harder than it sounds. First, this data is coming from applications and components developed by a variety of technology vendors or, in some cases, in-house engineering teams. As a result, the logs and alerts from these systems often contain different data elements and data components, which require custom parsing and processing prior to human analysis. Second, most of these applications are simply logging events, such as logins, so the vast majority of data is not threat-related. In other words, once the data is in a format in which it can be examined, an immediate challenge is isolating the data that is actually associated with malicious activity. Third, once relevant threat data is isolated, it needs to be correlated into an incident. To put this into perspective, let's say you have an incident in which a threat actor sends a phishing email, a user clicks on that email and is taken to a credential harvesting website, the user inputs their credentials, and then the threat actor takes those credentials and attempts to log in remotely but is blocked. Each step of this attack chain is going to be captured by a different sensor or log source and across a time frame of several hours or days. All of these data components then need to be stitched together to understand what the threat actor did and how they did it before it can be leveraged for broader intelligence applications. In mature organizations, much of this work is carried out by incident responders and investigators using specialized tools and platforms, such as a security information and event management, or SIEM, solution. If this is the case, a CTI team may be able to integrate the findings from its incident response team as a collection resource itself.
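The parsing and correlation challenges described above can be illustrated with a small Python sketch of the phishing example. Everything here is made up for illustration: the three log sources, their field names, the user `jdoe`, and the two-day window are assumptions, and real correlation in a SIEM would rely on far richer keys than a username alone.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Made-up events that have already been isolated as threat-related.
# Each source uses different field names, which is why parsing is needed.
EMAIL_GATEWAY = [
    {"ts": "2024-03-01T08:02:00", "recipient": "jdoe", "verdict": "phishing email delivered"},
]
WEB_PROXY = [
    {"timestamp": "2024-03-01T08:40:00", "username": "jdoe", "action": "visited credential harvesting site"},
]
AUTH_LOG = [
    {"when": "2024-03-01T13:05:00", "account": "jdoe", "result": "remote login blocked"},
    {"when": "2024-03-09T10:00:00", "account": "jdoe", "result": "remote login blocked"},
]


def normalize(events, source, time_key, user_key, detail_key):
    """Map one source's field names onto a shared schema."""
    return [
        {
            "time": datetime.fromisoformat(e[time_key]),
            "user": e[user_key],
            "source": source,
            "detail": e[detail_key],
        }
        for e in events
    ]


def correlate(events, window=timedelta(days=2)):
    """Group each user's events into incidents.

    A gap longer than `window` between consecutive events for the same
    user starts a new incident.
    """
    by_user = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        by_user[event["user"]].append(event)

    incidents = []
    for user_events in by_user.values():
        current = [user_events[0]]
        for event in user_events[1:]:
            if event["time"] - current[-1]["time"] <= window:
                current.append(event)
            else:
                incidents.append(current)
                current = [event]
        incidents.append(current)
    return incidents


all_events = (
    normalize(EMAIL_GATEWAY, "email_gateway", "ts", "recipient", "verdict")
    + normalize(WEB_PROXY, "web_proxy", "timestamp", "username", "action")
    + normalize(AUTH_LOG, "auth_log", "when", "account", "result")
)
incidents = correlate(all_events)
```

With this sample data, the email, proxy, and first authentication events are stitched into one three-step incident, while the blocked login eight days later is treated as a separate one. The window is a judgment call: too short and a slow attack chain is split apart, too long and unrelated activity is merged.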
Now that we've learned what internal telemetry and forensics-based intelligence is and why it can be a challenge to collect, let's discuss why it's often considered one of the best, if not the best, source of intelligence for a CTI program. To understand this, let's think back to our discussion on external telemetry and forensics-based intelligence. Essentially, all external intelligence sources are used as a proxy for understanding what you're likely to encounter in your own environment, but internally derived intelligence is different because internal intelligence provides you with direct insight on threats that have actually targeted or impacted your environment, and more importantly, how the threat actors conducted those attacks. The ability to understand threat actor tactics, techniques, and procedures used against your own organization provides invaluable, actionable insights for nearly all of your stakeholders. In a similar vein, internal threat intelligence provides unique insights on your organization's attack surface, such as the applications or systems targeted by threat actors and the primary means by which threat actors gain initial access to your environment. You can also compare internal intelligence with your external intelligence holdings to determine if there are any possible overlaps with known threat actors or activity clusters. And if you do observe overlaps, you can now leverage that external intelligence to directly inform incident responders about additional behaviors they may need to investigate, as well as directly inform strategic investment and engineering undertaken to prevent that threat actor from compromising your organization again in the future.
Now that we've explored what internal telemetry and forensics-based intelligence is and why it's so valuable, we should conclude this section with a few considerations for thinking about internal data as a potential collection source.
This may seem obvious, but it cannot be stressed enough that the robustness and accessibility of this type of intelligence is going to vary significantly from organization to organization, depending on its technology stack, organizational maturity, and resource constraints. And, as a result, the strategies used to incorporate this type of intelligence will be different for every organization. On a similar note, unlike the other two categories of intelligence we covered, which are most often purchased from vendors or acquired from intelligence-sharing partners, this type of intelligence must be developed in-house. As a result, some organizations may be hesitant to invest the engineering and personnel resources required to develop such a capability, particularly if the value proposition is not clearly understood by decision makers. Lastly, it's important to consider that much of the work that is required to collect and process this intelligence is more closely aligned to the skill sets and mission space of an incident response team rather than a CTI team. This suggests that a CTI team may need to either rely on incident responders to support these workflows or recruit individuals with backgrounds in incident response to do so themselves. Ultimately, this type of intelligence is extremely valuable from a CTI perspective and allows a CTI team to provide unique and highly relevant insights to stakeholders. But developing this type of capability is difficult, and few organizations are able to effectively leverage their internal threat data for intelligence purposes.
This brings us to the end of this module, but before we conclude, let's briefly recap a few key ideas about collection. As mentioned, the collection phase of the intelligence lifecycle includes all the workflows, tools, and processes used to collect information and data to meet a CTI team's intelligence needs, which, as you learned earlier, are formalized into collection requirements.
To action our collection requirements, we'll need to task them to appropriate collectors that generally fall within one of three categories. Clear, deep, and dark web intelligence includes all data, information, and intelligence derived from content found in clear, deep, and dark web resources, usually in the form of text-based content. Due to the nature of this intelligence, which is often derived from claims made after an incident has already occurred, it is generally more valuable for initiating incident response and risk mitigation activities than it is for conducting proactive defensive measures such as threat hunting or detections engineering. External telemetry and forensics-based intelligence refers to intelligence derived from the sensors, logs, security appliances, networks, and applications of external organizations. This data is primarily collected by vendors that manage sensor or security solution deployments or who collect it manually during incident response engagements. These types of intelligence sources provide high-confidence technical insights on threats to systems and applications, and are detailed enough to leverage for proactive network defense purposes. However, they are essentially a proxy for understanding what you're likely to encounter in your own environment. Lastly, internal telemetry and forensics-based intelligence includes any data or information collected from internal sensors, security appliances, applications, logs, and post-incident forensics. This intelligence can be analyzed by itself, or it can be correlated and enriched with external intelligence sources to derive unique, actionable, and highly relevant insights.
To briefly tie this all back to the collection management process, consider how we might develop collection taskings from sample collection requirements and task them to an appropriate source.
For example, if we consider CR1, which reads, reports of incidents impacting banks or financial institutions, we might develop a tasking for a source that falls within the external telemetry and forensics-based intelligence category. Whereas, if we look at CR4, which relates to reports of threat actors claiming to have targeted or shown interest in targeting banks or financial institutions, this requirement would likely be more appropriate for a clear, deep, and dark web collection source. So, now that you've learned about the different categories of cyber threat intelligence and how we appropriately task them to collect intelligence, we'll learn about how that intelligence is analyzed and turned into intelligence products in future modules.
Thank you for completing the collection module of the Cyber Threat Intelligence Lifecycle course. If you have any questions regarding any of the content in this module, I can be contacted at christopher.carston at mastercard.com. I hope you've enjoyed this module and I look forward to seeing you during future modules.
