Alice & Bob Learn Application Security PDF
Document Details
2020
Tanya Janca
Summary
Alice and Bob Learn Application Security is a textbook that provides a foundation in application security, covering topics such as security fundamentals, secure design, coding practices, and appsec programs. Written for software developers and information security professionals, the book emphasizes the importance of security throughout the software development life cycle.
Full Transcript
Table of Contents

Cover
Introduction: Pushing Left; About This Book; Out-of-Scope Topics; The Answer Key
Part I: What You Must Know to Write Code Safe Enough to Put on the Internet
- Chapter 1: Security Fundamentals — The Security Mandate: CIA; Assume Breach; Insider Threats; Defense in Depth; Least Privilege; Supply Chain Security; Security by Obscurity; Attack Surface Reduction; Hard Coding; Never Trust, Always Verify; Usable Security; Factors of Authentication; Exercises
- Chapter 2: Security Requirements — Requirements; Requirements Checklist; Exercises
- Chapter 3: Secure Design — Design Flaw vs. Security Bug; Secure Design Concepts; Segregation of Production Data; Threat Modeling; Exercises
- Chapter 4: Secure Code — Selecting Your Framework and Programming Language; Untrusted Data; HTTP Verbs; Identity; Session Management; Bounds Checking; Authentication (AuthN); Authorization (AuthZ); Error Handling, Logging, and Monitoring; Exercises
- Chapter 5: Common Pitfalls — OWASP; Defenses and Vulnerabilities Not Previously Covered; Race Conditions; Closing Comments; Exercises
Part II: What You Should Do to Create Very Good Code
- Chapter 6: Testing and Deployment — Testing Your Code; Testing Your Application; Testing Your Infrastructure; Testing Your Database; Testing Your APIs and Web Services; Testing Your Integrations; Testing Your Network; Deployment; Exercises
- Chapter 7: An AppSec Program — Application Security Program Goals; Application Security Activities; Application Security Tools
- Chapter 8: Securing Modern Applications and Systems — APIs and Microservices; Online Storage; Containers and Orchestration; Serverless; Infrastructure as Code (IaC); Security as Code (SaC); Platform as a Service (PaaS); Infrastructure as a Service (IaaS); Continuous Integration/Delivery/Deployment; Dev(Sec)Ops; The Cloud; Cloud Workflows; Modern Tooling; Modern Tactics; Summary; Exercises
Part III: Helpful Information on How to Continue to Create Very Good Code
- Chapter 9: Good Habits — Password Management; Multi-Factor Authentication; Incident Response; Fire Drills; Continuous Scanning; Technical Debt; Inventory; Other Good Habits; Summary; Exercises
- Chapter 10: Continuous Learning — What to Learn; Take Action; Exercises; Learning Plan
- Chapter 11: Closing Thoughts — Lingering Questions; Conclusion
APPENDIX A: Resources (listed by chapter, Introduction and Chapters 1–10)
APPENDIX B: Answer Key (listed by chapter, Chapters 1–10)
Index
End User License Agreement

List of Illustrations

Introduction
- Figure I-1: System Development Life Cycle (SDLC)
- Figure I-2: Shifting/Pushing Left
Chapter 1
- Figure 1-1: The CIA Triad is the reason IT Security teams exist.
- Figure 1-2: Confidentiality: keeping things safe
- Figure 1-3: Integrity means accuracy.
- Figure 1-4: Resilience improves availability.
- Figure 1-5: Three layers of security for an application; an example of defense in depth
- Figure 1-6: A possible supply chain for Bob's doll house
- Figure 1-7: Example of an application calling APIs and when to authenticate
Chapter 2
- Figure 2-1: The System Development Life Cycle (SDLC)
- Figure 2-2: Data classifications Bob uses at work
- Figure 2-3: Forgotten password flowchart
- Figure 2-4: Illustration of a web proxy intercepting web traffic
Chapter 3
- Figure 3-1: The System Development Life Cycle (SDLC)
- Figure 3-2: Flaws versus bugs
- Figure 3-3: Approximate cost to fix security bugs and flaws during the SDLC
- Figure 3-4: Pushing left
- Figure 3-5: Using a web proxy to circumvent JavaScript validation
- Figure 3-6: Example of very basic attack tree for a run-tracking mobile app
Chapter 4
- Figure 4-1: Input validation flowchart for untrusted data
- Figure 4-2: Session management flow example
Chapter 5
- Figure 5-1: CSRF flowchart
- Figure 5-2: SSRF flowchart
Chapter 6
- Figure 6-1: Continuous Integration/Continuous Delivery (CI/CD)
Chapter 7
- Figure 7-1: Security activities added to the SDLC
Chapter 8
- Figure 8-1: Simplified microservice architecture
- Figure 8-2: Microservice architecture with API gateway
- Figure 8-3: Infrastructure as Code workflow
- Figure 8-4: File integrity monitoring and application control tooling at work…

Alice & Bob Learn Application Security
Tanya Janca

Introduction

Why application security? Why should you read this book? Why is security important? Why is it so hard? If you have picked up this book, you likely already know the answer to this question. You have seen the headlines of companies that have been "hacked," data breached, identities stolen, businesses and lives ruined. However, you may not be aware that the number-one reason for data breaches is insecure software, causing between 26% and 40% of leaked and stolen records (Verizon Breach Report, 2019).1 Yet when we look at the budgets of most companies, the amount allocated toward ensuring their software is secure is usually much, much lower than that.

Most organizations at this point are highly skilled at protecting their network perimeter (with firewalls), enterprise security (blocking malware and not allowing admin rights for most users), and physical security (badging in and out of secure areas). That said, reliably creating secure software is still an elusive goal for most organizations today. Why?

Right now, universities and colleges are teaching students how to code, but not teaching them how to ensure the code they write is secure, nor are they teaching them even the basics of information security. Most post-secondary programs that do cover security just barely touch upon application security, concentrating instead on identity, network security, and infrastructure. Imagine if someone went to school to become an electrician but they never learned about safety. Houses would catch fire from time to time because the electricians wouldn't know how to ensure the work that they did was safe. Allowing engineering and computer science students to graduate with inadequate security training is equally dangerous, as they create banking software, software that runs pacemakers, software that safeguards government secrets, and so much more that our society depends on. This is one part of the problem. Another part of the problem is that (English-language) training is generally extremely expensive, making it unobtainable for many.
There is also no clear career path or training program that a person can take to become a secure coder, security architect, incident responder, or application security engineer. Most people end up with on-the-job training, which means that each of us has a completely different idea of how to do things, with varying results. Adding to this problem is how profitable it is to commit crimes on the internet, and with attribution (figuring out who did the crime) being so difficult, there are many, many threats facing any application hosted on the internet. The more valuable the system or the data within it, the more threats it will face.

The last part of this equation is that application security is quite difficult. Unlike infrastructure security, where each version of Microsoft Windows Server 2008 R2 SP2 is exactly the same, each piece of custom software is a snowflake: unique by design. When you build a deck out of wood in your backyard and you go to the hardware store to buy a 2x4 that is 8 feet long, it will be the same in every store you go to, meaning you can make safe assumptions and calculations. With software this is almost never the case; you must never make any assumptions and you must verify every fact. This means brute-force memorization, automated tools, and other one-size-fits-all solutions rarely work. And that makes application security, as a field, very challenging.

Pushing Left

If you look at the System Development Life Cycle (SDLC) in Figure I-1, you see the various phases moving toward the right of the page. Requirements come before Design, which comes before Coding. Whether you are doing Agile, Waterfall, DevOps, or any other software development methodology, you always need to know what you are building (requirements), make a plan (design), build it (coding), verify it does all that it should do, and nothing more (testing), then release and maintain it (deployment).

Figure I-1: System Development Life Cycle (SDLC)

Often security activities start in the release or testing phases, far to the right, and quite late in the project. The problem with this is that the later in the process that you fix a flaw (design problem) or a bug (implementation problem), the more it costs and the harder it is to do. Let me explain this a different way. Imagine Alice and Bob are building a house. They have saved for this project for years, and the contractors are putting the finishing touches on it by putting up wallpaper and adding handles on the cupboards. Alice turns to Bob and says, "Honey, we have 2 children but only one bathroom! How is this going to work?" If they tell the contractors to stop working, the house won't be finished on time. If they ask them to add a second bathroom, where will it go? How much will it cost? Finding out this late in their project would be disastrous. However, if they had figured this out in the requirements phase or during the design phase it would have been easy to add more bathrooms, for very little cost. The same is true for solving security problems. This is where "shifting left" comes into play: the earlier you can start doing security activities during a software development project, the better the results will be. The arrows in Figure I-2 show a progression of starting security earlier and earlier in your projects. We will discuss later on what these activities are.

Figure I-2: Shifting/Pushing Left

About This Book

This book will teach you the foundations of application security (AppSec for short); that is, how to create secure software.
This book is for software developers, information security professionals wanting to know more about the security of software, and anyone who wants to work in the field of application security (which includes penetration testing, aka "ethical hacking"). If you are a software developer, it is your job to make the most secure software that you know how to make. Your responsibility here cannot be overstated; there are hundreds of programmers for every AppSec engineer in the field, and we cannot do it without you. Reading this book is the first step on the right path. After you've read it, you should know enough to make secure software and know where to find answers if you are stuck.

Notes on format: There will be examples of how security issues can potentially affect real users, with the characters Alice and Bob making several appearances throughout the book. You may recall the characters of Alice and Bob from other security examples; they have been used to simplify complex topics in our industry since the advent of cryptography and encryption.

Out-of-Scope Topics

A brief note on topics that are out of scope for this book: incident response (IR), network monitoring and alerting, cloud security, infrastructure security, network security, security operations, identity and access management (IAM), enterprise security, support, anti-phishing, reverse engineering, code obfuscation, and other advanced defense techniques, as well as every other type of security not listed here. Some of these topics will be touched upon but are in no way covered exhaustively in this book. Please consume additional resources to learn more about these important topics.

The Answer Key

At the end of each chapter are exercises to help you learn and to test your knowledge. There is an answer key at the end of the book; however, it will be incomplete. Many of the questions could be an essay, research paper, or online discussion in themselves, while others are personal in nature (only you can answer what roadblocks you may be facing in your workplace). With this in mind, the answer key is made up of answers (when possible), examples (when appropriate), and some skipped questions, left for online discussion. In the months following the publication of this book, you will be able to stream recorded discussions answering all of the exercise questions online at youtube.com/shehackspurple under the playlist "Alice and Bob Learn Application Security." You can subscribe to learn about new videos, watch the previous videos, and explore other free content. You can participate live in the discussions by subscribing to the SheHacksPurple newsletter to receive invitations to the streams (plus a lot of other free content) at newsletter.shehackspurple.ca. It doesn't cost anything to attend the discussions or watch them afterward, and you can learn a lot by hearing others' opinions, ideas, successes, and failures. Please join us.

Part I: What You Must Know to Write Code Safe Enough to Put on the Internet

In This Part
- Chapter 1: Security Fundamentals
- Chapter 2: Security Requirements
- Chapter 3: Secure Design
- Chapter 4: Secure Code
- Chapter 5: Common Pitfalls

CHAPTER 1: Security Fundamentals

Before learning how to create secure software, you need to understand several key security concepts. There is no point in memorizing how to implement a concept if you don't understand when or why you need it. Learning these principles will ensure you make secure project decisions and are able to argue for better security when you face opposition.
Also, knowing the reason behind security rules makes them a lot easier to live with.

The Security Mandate: CIA

The mandate and purpose of every IT security team is to protect the confidentiality, integrity, and availability of the systems and data of the company, government, or organization that they work for. That is why the security team hassles you about having unnecessary administrator rights on your work machine, won't let you plug unknown devices into the network, and wants you to do all the other things that feel inconvenient; they want to protect these three things. We call it the "CIA Triad" for short (Figure 1-1).

Let's examine this with our friends Alice and Bob. Alice has type 1 diabetes and uses a tiny device implanted in her arm to check her insulin several times a day, while Bob has a "smart" pacemaker that regulates his heart, which he accesses via a mobile app on his phone. Both of these devices are referred to as IoT medical device implants in our industry.

Figure 1-1: The CIA Triad is the reason IT Security teams exist.

NOTE: IoT stands for Internet of Things, physical products that are internet connected. A smart toaster or a fridge that talks to the internet is an IoT device.

Confidentiality

Alice is the CEO of a large Fortune 500 company, and although she is not ashamed that she is a type 1 diabetic, she does not want this information to become public. She is often interviewed by the media and does public speaking, serving as a role model for many other women in her industry. Alice works hard to keep her personal life private, and this includes her health condition. She believes that some people within her organization are after her job and would do anything to try to portray her as "weak" in an effort to undermine her authority. If her device were to accidentally leak her information, showing itself on public networks, or if her account information became part of a breach, this would be highly embarrassing for her and potentially damaging to her career. Keeping her personal life private is important to Alice.

Bob, on the other hand, is open about his heart condition and happy to tell anyone that he has a pacemaker. He has a great insurance plan with the federal government and is grateful that when he retires he can continue with his plan, despite his pre-existing condition. Confidentiality is not a priority for Bob in this respect (Figure 1-2).

Figure 1-2: Confidentiality: keeping things safe

NOTE: Confidentiality is often undervalued in our personal lives. Many people tell me they "have nothing to hide." Then I ask, "Do you have curtains on your windows at home? Why? I thought that you had nothing to hide?" I'm a blast at parties.

Integrity

Integrity in data (Figure 1-3) means that the data is current, correct, and accurate. Integrity also means that your data has not been altered during transmission; the correct value must be maintained during transit. Integrity in a computer system means that the results it gives are precise and factual. For Bob and Alice, this may be the most crucial of the CIA factors: if either of their systems gives them incorrect treatment, it could result in death. For a human being (as opposed to a company or nation-state), there does not exist a more serious repercussion than end of life. The integrity of their health systems is crucial to ensuring they both remain in good health.

Figure 1-3: Integrity means accuracy.

CIA is the very core of our entire industry.
Without understanding this from the beginning, and how it affects your teammates, your software, and most significantly, your users, you cannot build secure software.

Availability

If Alice's insulin measuring device was unavailable due to malfunction, tampering, or dead batteries, her device would not be "available." Alice usually checks her insulin levels several times a day, but she is able to do manual testing of her insulin (by pricking her finger and using a medical kit designed for this purpose) if she needs to, so it is somewhat important to her that this service is available. A lack of availability of this system would be quite inconvenient for her, but not life-threatening.

Bob, on the other hand, has irregular heartbeats from time to time and never knows when his arrhythmia will strike. If Bob's pacemaker was not available when his heart was behaving erratically, this could be a life-or-death situation if enough time elapsed. It is vital that his pacemaker is available and that it reacts in real time (immediately) when an emergency happens. Bob works for the federal government as a clerk managing secret and top-secret documents, and has for many years. He is a proud grandfather and has been trying hard to live a healthy life since his pacemaker was installed.

NOTE: Medical devices are generally "real-time" software systems. Real-time means the system must respond to changes in the fastest amount of time possible, generally in milliseconds. It cannot have delays; the responses must be as close as possible to instantaneous or immediate. When Bob's arrhythmia starts, his pacemaker must act immediately; there cannot be a delay. Most applications are not real-time. If there is a 10-millisecond delay in the purchase of new running shoes, or in predicting traffic changes, it is not truly critical.

Figure 1-4: Resilience improves availability.

NOTE: Many customers move to "the cloud" for the sole reason that it is extremely reliable (almost always available) when compared to more traditional in-house data center service levels. As you can see in Figure 1-4, resilience improves availability, making public cloud an attractive option from a security perspective.

The following are security concepts that are well known within the information security industry. It is essential to have a good grasp of these foundational ideas in order to understand how the rest of the topics in this book apply to them. If you are already a security practitioner, you may not need to read this chapter.

Assume Breach

"There are two types of companies: those that have been breached and those that don't know they've been breached yet."2 It's such a famous saying in the information security industry that we don't even know who to attribute it to anymore. It may sound pessimistic, but for those of us who work in incident response, forensics, or other areas of investigation, we know this is all too true. The concept of assume breach means preparation and design considerations to ensure that if someone were to gain unapproved access to your network, application, data, or other systems, it would prove difficult, time-consuming, expensive, and risky, and you would be able to detect and respond to the situation quickly. It also means monitoring and logging your systems to ensure that if you don't notice until after a breach occurs, at least you can find out what did happen. Many systems also monitor for behavioral changes or anomalies to detect potential breaches.
It means preparing for the worst, in advance, to minimize damage, time to detect, and remediation efforts. Let’s look at two examples of how we can apply this principle: a consumer example and a professional example. As a consumer, Alice opens an online document-sharing account. If she were to “assume breach,” she wouldn’t upload anything sensitive or valuable there (for instance, unregistered intellectual property, photos of a personal nature that could damage her professional or personal life, business secrets, government secrets, etc.). She would also set up monitoring of the account as well as have a plan if the documents were stolen, changed, removed, shared publicly, or otherwise accessed in an unapproved manner. Lastly, she would monitor the entire internet in case they were leaked somewhere. This would be an unrealistic amount of responsibility to expect from a regular consumer; this book does not advise average consumers to “assume breach” in their lives, although doing occasional online searches on yourself is a good idea and not uploading sensitive documents online is definitely advisable. As a professional, Bob manages secret and top-secret documents. The department Bob works at would never consider the idea of using an online file-sharing service to share their documents; they control every aspect of this valuable information. When they were creating the network and the software systems that manage these documents, they designed them, and their processes, assuming breach. They hunt for threats on their network, designed their network using zero trust, monitor the internet for signs of data leakage, authenticate to APIs before connecting, verify data from the database and from internal APIs, perform red team exercises (security testing in production), and monitor their network and applications closely for anomalies or other signs of breach. They’ve written automated responses to common attack patterns, have processes in place and ready for uncommon attacks, and they analyze behavioral patterns for signs of breach. They operate on the idea that data may have been breached already or could be at any time. Another example of this would be initiating your incident response process when a serious bug has been disclosed via your responsible disclosure or bug bounty program, assuming that someone else has potentially already found and exploited this bug in your systems. According to Wikipedia, coordinated disclosure is a vulnerability disclosure model in which a vulnerability or an issue is disclosed only after a period of time that allows for the vulnerability or issue to be patched or mended. Bug bounty programs are run by many organizations. They provide recognition and compensation for security researchers who report bugs, especially those pertaining to vulnerabilities. Insider Threats An insider threat means that someone who has approved access to your systems, network, and data (usually an employee or consultant) negatively affects one or more of the CIA aspects of your systems, data, and/or network. This can be malicious (on purpose) or accidental. 
Here are some examples of malicious threats and the parts of the CIA Triad they affect:
- An employee downloading intellectual property onto a portable drive, leaving the building, and then selling the information to your competitors (confidentiality)
- An employee deleting a database and its backup on their last day of work because they are angry that they were dismissed (availability)
- An employee programming a back door into a system so they can steal from your company (integrity and confidentiality)
- An employee downloading sensitive files from another employee's computer and using them for blackmail (confidentiality)
- An employee accidentally deleting files, then changing the logs to cover their mistake (integrity and availability)
- An employee not reporting a vulnerability to management in order to avoid the work of fixing it (potentially all three, depending upon the type of vulnerability)

Here are some examples of accidental threats and the parts of the CIA Triad they affect:
- Employees using software improperly, causing it to fall into an unknown state (potentially all three)
- An employee accidentally deleting valuable data, files, or even entire systems (availability)
- An employee accidentally misconfiguring software, the network, or other software in a way that introduces security vulnerabilities (potentially all three)
- An inexperienced employee pointing a web proxy/dynamic application security testing (DAST) tool at one of your internal applications, crashing the application (availability) or polluting your database (integrity)

We will cover how to avoid this in later chapters to ensure all of your security testing is performed safely.

WARNING: Web proxy software and/or DAST tools are generally forbidden on professional work networks. Also known as "web app scanners," web proxies are hacker tools and can cause great damage. Never point a web app scanner at a website or application and perform active scanning or other interactive testing without permission. It must be written permission, from someone with the authority to give the permission. Using a DAST tool to interact with a site on the internet (without permission) is a criminal act in many countries. Be careful, and when in doubt, always ask before you start.

Defense in Depth

Defense in depth is the idea of having multiple layers of security in case one is not enough (Figure 1-5). Although this may seem obvious when explained so simply, deciding how many layers and which layers to have can be difficult (especially if your security budget is limited). "Layers" of security can be processes (checking someone's ID before giving them their mail, having to pass security testing before software can be released), physical, software, or hardware systems (a lock on a door, a network firewall, hardware encryption), built-in design choices (writing separate functions for code that handles more sensitive tasks in an application, ensuring everyone in a building must enter through one door), and so on.

Figure 1-5: Three layers of security for an application; an example of defense in depth

Here are some examples of using multiple layers:
- When creating software: Having security requirements, performing threat modeling, ensuring you use secure design concepts, ensuring you use secure coding tactics, security testing, testing in multiple ways with multiple tools, etc. Each one presents another form of defense, making your application more secure.
- Network security: Turning on monitoring, having a SIEM (security information and event management, a dashboard for viewing potential security events in real time), having an IPS/IDS (intrusion prevention/detection system, tools to find and stop intruders on your network), firewalls, and so much more. Each new item adds to your defenses.
- Physical security: Locks, barbed wire, fences, gates, video cameras, security guards, guard dogs, motion sensors, alarms, etc.

Quite often the most difficult thing when advocating for security is convincing someone that one defense is not enough. Use the value of what you are protecting (reputation, monetary value, national security, etc.) when making these decisions. While it makes little business sense to spend one million dollars protecting something with a value of one thousand dollars, the examples our industry sees the most often are usually reversed.

NOTE:
- Threat modeling: Identifying threats to your applications and creating plans for mitigation. More on this in Chapter 3.
- SIEM system: Monitoring for your network and applications, a dashboard of potential problems.
- Intrusion prevention/detection system (IPS/IDS): Software installed on a network with the intention of detecting and/or preventing network attacks.

Least Privilege

Giving users exactly how much access and control they need to do their jobs, but nothing more, is the concept of least privilege. The reasoning behind least privilege is that if someone were able to take over your account(s), they wouldn't get very far. If you are a software developer with access to your code and read/write access to the single database that you are working on, then if someone were able to take over your account they would only be able to access that one database, your code, your email, and whatever else you have access to. However, if you were the database owner on all of the databases, the intruder could potentially wipe out everything. Although it may be unpleasant to give up your superpowers on your desktop, network, or other systems, you are reducing the risk to those systems significantly by doing so.

Examples of least privilege:
- Needing extra security approvals to have access to a lab or area of your building with a higher level of security.
- Not having administrator privileges on your desktop at work.
- Having read-only access to all of your team's code and write access to your projects, but not having access to other teams' code repositories.
- Creating a service account for your application to access its database and only giving it read/write access, not database owner (DBO). If the application only requires read access, only give it what it requires to function properly. A service account with only read access to a database cannot be used to alter or delete any of the data, even if it could be used to steal a copy of the data. This reduces the risk greatly.

NOTE: Software developers and system administrators are attractive targets for most malicious actors, as they have the most privileges. By giving up some of your privileges you will be protecting your organization more than you may realize, and you will earn the respect of the security team at the same time.

Supply Chain Security

Every item you use to create a product is considered to be part of your "supply chain," with the chain including the entity (supplier) of each item (manufacturer, store, farm, a person, etc.). It's called a "chain" because each part of it depends on the previous part in order to create the final product.
It can include people, companies, natural or manufactured resources, information, licenses, or anything else required to make your end product (which does not need to be physical in nature). Let’s explain this a bit more clearly with an example. If Bob was building a dollhouse for his grandchildren, he might buy a kit that was made in a factory. That factory would require wood, paper, ink to create color, glue, machines for cutting, workers to run and maintain the machines, and energy to power the machines. To get the wood, the factory would order it from a lumber company, which would cut it down from a forest that it owns or has licensed access to. The paper, ink, and glue are likely all made in different factories. The workers could work directly for the factory or they could be casual contractors. The energy would most likely come from an energy company but could potentially come from solar or wind power, or a generator in case of emergency. Figure 1-6 shows the (hypothetical) supply chain for the kit that Bob has purchased in order to build a doll house for his children for Christmas this year. Figure 1-6: A possible supply chain for Bob’s doll house What are the potential security (safety) issues with this situation? The glue provided in this kit could be poisonous, or the ink used to decorate the pieces could be toxic. The dollhouse could be manufactured in a facility that also processes nuts, which could cross-contaminate the boxes, which could cause allergic reactions in some children. Incorrect parts could be included, such as a sharp component, which would not be appropriate for a young child. All of these situations are likely to be unintentional on the part of the factory. When creating software, we also use a supply chain: the frameworks we use to write our code, the libraries we call in order to write to the screen, do advanced math calculations, or draw a button, the application programming interfaces (APIs) we call to perform actions on behalf of our applications, etc. Worse still, each one of these pieces usually depends on other pieces of software, and all of them are potentially maintained by different groups, companies, and/or people. Modern applications are typically made up of 20–40 percent original code3 (what you and your teammates wrote), with the rest being made up of these third-party components, often referred to as “dependencies.” When you plug dependencies into your applications, you are accepting the risks of the code they contain that your application uses. For instance, if you add something to process images into your application rather than writing your own, but it has a serious security flaw in it, your application now has a serious security flaw in it, too. This is not to suggest that you could write every single line of code yourself; that would not only be extremely inefficient, but you may still make errors that result in security problems. One way to reduce the risk, though, is to use fewer dependencies and to vet carefully the ones that you do decide to include in your software. Many tools on the market (some are even free) can verify if there are any known security issues with your dependencies. These tools should be used not only every time you push new code to production, but your code repository should also be scanned regularly as well. 
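As a minimal sketch of what such a dependency check can look like in practice, assuming a Python project with the PyPA pip-audit tool installed (any equivalent scanner for your ecosystem plays the same role), a small script can fail the build whenever a known-vulnerable package is found:

```python
# Minimal sketch: fail a build step if any dependency has a known vulnerability.
# Assumes the PyPA "pip-audit" tool is installed (pip install pip-audit); a
# similar scanner for your ecosystem could be swapped in the same way.
import subprocess
import sys

def audit_dependencies() -> int:
    """Run pip-audit against the current environment and return its exit code."""
    result = subprocess.run(["pip-audit"], capture_output=True, text=True)
    print(result.stdout)
    if result.returncode != 0:
        # pip-audit exits non-zero when it finds known-vulnerable packages
        # (or cannot complete the scan), so the pipeline should stop here.
        print("Dependency audit failed; review the findings above.", file=sys.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(audit_dependencies())
```

Running the same check on a schedule against the repository, not only on each push to production, covers the case where a vulnerability is published after your last deployment.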
SUPPLY CHAIN ATTACK EXAMPLE The open source Node.js module called event-stream was passed on to a new maintainer in 2018 who added malicious code into it, waited until millions of people had downloaded it via NPM (the package manager for Node.JS), and then used this vulnerability to steal bitcoins from Copay wallets, which used the event-stream library in their wallet software.4 Another defense tactic against using an insecure software supply chain is using frameworks and other third-party components made by known companies or recognized and well- respected open source groups, just as a chef would only use the finest ingredients in the food they make. You can (and should) take care when choosing which components make it into the final cut of your products. There have been a handful of publicly exposed supply chain attacks in recent years, where malicious actors injected vulnerabilities into software libraries, firmware (low-level software that is a part of hardware), and even into hardware itself. This threat is real and taking precautions against it will serve any developer well. Security by Obscurity The concept of security by obscurity means that if something is hidden it will be “more secure,” as potential attackers will not notice it. The most common implementation of this is software companies that hide their source code, rather than putting it open on the internet (this is used as a means to protect their intellectual property and as a security measure). Some go as far as obfuscating their code, changing it such that it is much more difficult or impossible to understand if someone is attempting to reverse engineer your product. NOTE Obfuscation is making something hard to understand or read. A common tactic is encoding all of the source code in ASCII, Base64, or Hex, but that’s quite easy to see for professional reverse engineers. Some companies will double or triple encode their code. Another tactic is to XOR the code (an assembler command) or create their own encoding schema and add it programmatically. There are also products on the market that can perform more advanced obfuscation. Another example of “security by obscurity” is having a wireless router suppress the SSID/Wi-Fi name (meaning if you want to connect you need to know the name) or deploying a web server without a domain name hoping no one will find it. There are tools to get around this, but it reduces your risk of people attacking your wireless router or web server. The other side of this is “security by being open,” the argument that if you write open source software there will be more eyes on it and therefore those eyes will find security vulnerabilities and report them. In practice this is rarely true; security researchers rarely review open source code and report bugs for free. When security flaws are reported to open source projects they don’t necessarily fix them, and if vulnerabilities are found, the finder may decide to sell them on the black market (to criminals, to their own government, to foreign governments, etc.) instead of reporting them to the owner of the code repository in a secure manner. Although security by obscurity is hardly an excellent defense when used on its own, it is certainly helpful as one layer of a “defense in depth” security strategy. Attack Surface Reduction Every part of your software can be attacked; each feature, input, page, or button. The smaller your application, the smaller the attack surface. 
If you have four pages with 10 functions versus 20 pages and 100 functions, that’s a much smaller attack surface. Every part of your app that could be potentially exposed to an attacker is considered attack surface. Attack surface reduction means removing anything from your application that is unrequired. For instance, a feature that is not fully implemented but you have the button grayed out, would be an ideal place to start for a malicious actor because it’s not fully tested or hardened yet. Instead, you should remove this code before publishing it to production and wait until it’s finished to publish it. Even if it’s hidden, that’s not enough; reduce your attack surface by removing that part of your code. TIP Legacy software often has very large amounts of functionality that is not used. Removing features that are not in use is an excellent way to reduce your attack surface. If you recall from earlier in the chapter, Alice and Bob both have medical implants, a device to measure insulin for Alice and a pacemaker for Bob. Both of their devices are “smart,” meaning they can connect to them via their smart phones. Alice’s device works over Bluetooth and Bob’s works over Wi-Fi. One way for them to reduce the attack surface of their medical devices would have been to not have gotten smart devices in the first place. However, it’s too late for that in this example. Instead, Alice could disable her insulin measuring device’s Bluetooth “discoverable” setting, and Bob could hide the SSID of his pacemaker, rather than broadcasting it. Hard Coding Hard coding means programming values into the code, rather than getting the values organically (from the user, database, an API, etc.). For example, if you have created a calculator application and the user enters 4 + 4, presses Enter, and then the screen shows 8, you will likely assume the calculator works. However, if you enter 5 + 5 and press Enter but the screen still shows 8, you may have a situation of hard coding. Why is hard coding a potential security issue? The reason is twofold: you cannot trust the output of the application, and the values that have been hard coded are often of a sensitive nature (passwords, API keys, hashes, etc.) and anyone with access to the source code would therefore have access to those hard-coded values. We always want to keep our secrets safe, and hard coding them into our source code is far from safe. Hard coding is generally considered a symptom of poor software development (there are some exceptions to this). If you encounter it, you should search the entire application for hard coding, as it is unlikely the one instance you have found is unique. Never Trust, Always Verify If you take away only one lesson from this book, it should be this: never trust anything outside of your own application. If your application talks to an API, verify it is the correct API, and that it has authority to do whatever it’s trying to do. If your application accepts data, from any source, perform validation on the data (ensure it is what you are expecting and that it is appropriate; if it is not, reject it). Even data from your own database could have malicious input or other contamination. If a user is attempting to access an area of your application that requires special permissions, reverify they have the permission to every single page or feature they use. 
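The idea of re-checking permissions on every request can be illustrated with a short, framework-free sketch; the User class, session store, and permission names below are invented for illustration and are not from the book. The point is that the check runs on every call, not just once at login:

```python
# Minimal sketch of re-verifying authorization on every request, rather than
# only once at login. The User class and in-memory "session" lookup stand in
# for your framework's real session management and permission model.
from dataclasses import dataclass, field
from functools import wraps

@dataclass
class User:
    name: str
    permissions: set = field(default_factory=set)

# Stand-in for a verified session store keyed by session token.
SESSIONS = {"token-123": User("alice", {"reports:read"})}

def require_permission(permission: str):
    """Decorator that re-checks the caller's permission on every invocation."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(session_token, *args, **kwargs):
            user = SESSIONS.get(session_token)  # re-resolve the user on every call
            if user is None or permission not in user.permissions:
                raise PermissionError("Not authorized for this action.")
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator

@require_permission("reports:read")
def view_report(user: User, report_id: int) -> str:
    # The permission check above ran again for this specific call.
    return f"report {report_id} for {user.name}"

print(view_report("token-123", 42))   # allowed
# view_report("token-999", 42)        # would raise PermissionError
```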
If a user has authenticated to your application (proven they are who they say they are), ensure you continue to validate it's the same user that you are dealing with as they move from page to page (this is called session management). Never assume because you checked one time that everything is fine from then on; you must always verify and reverify.

NOTE: We verify data from our own database because it may contain stored cross-site scripting (XSS), or other values that may damage our program. Stored XSS happens when a program does not perform proper input validation and saves an XSS attack into its database by accident. When users perform an action in your application that calls that data, it is returned to the user and launches the attack against them in their browser. It is an attack that a user is unable to protect themselves against, and it is generally considered a critical risk if found during security testing.

Quite often developers forget this lesson and assume trust due to context. For instance, you have a public-facing internet application, and you have extremely tight security on that web app. That web app calls an API (#1) within your network (behind the firewall) all the time, which then calls another API (#2) that changes data in a related database. Often developers don't bother authenticating (proving identity) to the first API or have the API (#1) verify the app has authorization to call whatever part of the API it's calling. If they do, however, in this situation, they often perform security measures only on API #1 and then skip doing it on API #2. This results in anyone inside your network being able to call API #2, including malicious actors who shouldn't be there, insider threats, or even accidental users (Figure 1-7).

Figure 1-7: Example of an application calling APIs and when to authenticate

Here are some examples:
- A website is vulnerable to stored cross-site scripting, and an attacker uses this to store an attack in the database. If the web application validates the data from the database, the stored attack would be unsuccessful when triggered.
- A website charges for access to certain data, which it gets from an API. If a user knows the API is exposed to the internet, and the API does not validate that whoever is calling it is allowed to use it (authentication and authorization), the user can call the API directly and get the data without paying, which would be malicious use of the website; it's theft.
- A regular user of your application is frustrated and pounds on the keyboard repeatedly, accidentally entering much more data than they should have into your application. If your application is validating the input properly, it would reject it if there is too much. However, if the application does not validate the data, perhaps it would overload your variables or be submitted to your database and cause it to crash.

When we don't verify that the data we are getting is what we are expecting (a number in a number field, a date in a date field, an appropriate amount of text, etc.), our application can fall into an unknown state, which is where we find many security bugs. We never want an application to fall into an unknown state.

Usable Security

If security features make your application difficult to use, users will find a way around them or go to your competitor. There are countless examples online of users creatively circumventing inconvenient security features; humans are very good at solving problems, and we don't want security to be the problem.
The answer to this is creating usable security features. While it is obvious that if we just turned the internet off, all our applications would be safer, that is an unproductive solution to protecting anyone from threats on the internet. We need to be creative ourselves and find a way to make the easiest way to do something also be the most secure way to do something.

Examples of usable security include:
- Allowing a fingerprint, facial recognition, or pattern to unlock your personal device instead of a long and complicated password.
- Teaching users to create passphrases (a sentence or phrase that is easy to remember and type) rather than having complexity rules (ensuring a special character, number, and lower- and uppercase letters are used, etc.). This would increase entropy, making it more difficult for malicious actors to break the password, but would also make it easier for users to use.
- Teaching users to use password managers, rather than expecting them to create and remember 100+ unique passwords for all of their accounts.

Examples of users getting around security measures include:
- Users tailgating at secure building entrances (following closely while someone enters a building so that they do not need to swipe to get in).
- Users turning off their phones, entering through a scanner meant to detect transmitting devices, then turning the phone back on once in the secure area where cell phones are banned.
- Using a proxy service to visit websites that are blocked by your workplace network.
- Taking a photo of your screen to bring a copyrighted image or sensitive data home.
- Using the same password over and over but incrementing the last number of it for easy memory. If your company forces users to reset their password every 90 days, there's a good chance there are quite a few passwords in your org that follow the format currentSeason_currentYear.

Factors of Authentication

Authentication is proving that you are indeed the real, authentic, you, to a computer. A "factor" of authentication is a method of proving who you are to a computer. Currently there are only three different factors: something you have, something you are, and something you know:
- Something you have could be a phone, computer, token, or your badge for work. Something that should only ever be in your possession.
- Something you are could be your fingerprint, an iris scan, your gait (the way you walk), or your DNA. Something that is physically unique to you.
- Something you know could be a password, a passphrase, a pattern, or a combination of several pieces of information (often referred to as security questions) such as your mother's maiden name, your date of birth, and your social insurance number. The idea is that it is something that only you would know.

When we log in to accounts online with only a username and password, we are only using one "factor" of authentication, and it is significantly less secure than using two or more factors. When accounts are broken into or data is stolen, it is often due to someone using only one factor of authentication to protect the account. Using more than one factor of authentication is usually referred to as multi-factor authentication (MFA), two-factor authentication (2FA), or two-step login. We will refer to this as MFA from now on in this book.

TIP: Security questions are passé. It is simple to look up the answers to most security questions on the internet by performing Open Source Intelligence Gathering (OSINT).
Do not use security questions as a factor of authentication in your software; they are too easily circumvented by attackers.

When credentials (usernames with corresponding passwords) are stolen and used maliciously to break into accounts, users that have a second factor of authentication are protected; the attacker will not have the second factor of authentication and therefore will be unable to get in. When someone tries to brute force (using a script to automatically try every possible option, very quickly) a system or account that has MFA enabled, even if they eventually get the password, they won't have the second factor in order to get in. Using a second factor makes your online accounts significantly more difficult to break into.

Examples of MFA include:
- Multi-factor: Entering your username and password, then having to use a second device or physical token to receive a code to authenticate. The username and password are one factor (something you know) and using a second device is the second factor (something you have).
- Not multi-factor: A username and a password. These are two examples of the same factor; they are both something that you know. Multi-factor authentication means that you have more than one of the different types of factors of authentication, not one or more of the same factor.
- Not multi-factor: Using a username and password, and then answering security questions. These are two of the same factor, something you know.
- Multi-factor: Username and password, then using your thumb print.

NOTE: Many in the information security industry are in disagreement as to whether or not using your phone to receive an SMS (text message) with a PIN code is a "good" implementation of MFA, as there are known security flaws within the SMS protocol and some implementations of it. It is my opinion that having a "pretty-darn-good second factor," rather than having only one factor, is better. Whenever possible, however, ask users to use an authentication application instead of SMS text messages as the second factor.

Exercises

These exercises are meant to help you understand the concepts in this chapter. Write out the answers and see which ones you get stuck on. If you have trouble answering some of the questions, you may want to reread the chapter. Every chapter will have exercises like these at the end. If there is a term you are unfamiliar with, look it up in the glossary at the end of the book; that may help with your understanding. If you have a colleague or professional mentor who you can discuss the answers with, that would be the best way to find out if you are right or wrong, and why. Some of the answers are not Boolean (true/false) and are just meant to make you contemplate the problem.

1. Bob sets the Wi-Fi setting on his pacemaker to not broadcast the name of his Wi-Fi. What is this defensive strategy called?
2. Name an example of a value that could be hard coded and why. (What would be the motivation for the programmer to do that?)
3. Is a captcha usable security? Why or why not?
4. Give one example of a good implementation of usable security.
5. When using information from the URL parameters, do you need to validate that data? Why or why not?
6. If an employee learns a trade secret at work and then sells it to a competitor, this breaks which part(s) of CIA?
7. If you buy a "smart" refrigerator and connect it to your home network, then have a malicious actor connect to it and change the settings so that it's slightly warmer and your milk goes bad, which part(s) of CIA did they break?
8. If someone hacks your smart thermostat and turns off your heat, which part(s) of CIA did they break?
9. If a programmer adds an Easter egg (extra code that does undocumented functionality, as a "surprise" for users, which is unknown to management and the security team), does this qualify as an insider threat? If so, why? If not, why not?
10. When connecting to a public Wi-Fi, what are some of the precautions that you could take to ensure you are doing "defense in depth"?
11. If you live in an apartment with several roommates and you all have a key to the door, is one of the keys considered to be a "factor of authentication"?

CHAPTER 2: Security Requirements

When you create any application or embark on any project, you must have requirements for what you are going to build. This is true no matter what development methodology you use (Waterfall, Agile, DevOps), language or framework you write it in, or type of audience you serve; without a plan you cannot build something of substance. If you have studied computer science or computer engineering, the image shown in Figure 2-1 is likely burned into your brain. It is commonly known as the System Development Life Cycle (SDLC), and it consists of five phases: Requirements, Design, Code, Testing, and Release. As this book progresses, we will refer back to this image in order to explain when each activity we talk about can and/or should occur. This chapter will revolve around the Requirements phase.

Figure 2-1: The System Development Life Cycle (SDLC)

TIP: SDLC is sometimes defined as the software development life cycle, focusing on software rather than system. The two definitions are used interchangeably.

When you have your very first project meeting (often called a "project kickoff meeting"), there should be a person from the security team present, to take part in the project from its very inception. Even though this person will not be working full time on the project, they should be part of the team and make themselves available regularly to ensure that all security questions and concerns are addressed in a timely manner. Assigning a security person to a project team is sometimes called the partnership model, or the person is referred to as being "matrixed into the team." No matter what you call it, this person is there to ensure that security's interests (CIA) are represented throughout the entire project.

NOTE: The exact origin of the partnership model is unknown. I first learned of it from the Netflix AppSec Team. The expression of being "matrixed into a project" was first introduced to me at the Treasury Board Secretariat of Canada and is, again, of unknown origin.

This chapter will assume that you have a basic understanding of how IT projects and software development processes work.

TIP: Create a Service Level Agreement (SLA) between the security team and the other teams in order to define what a reasonable amount of time is to wait for a response to requests to the security team. Often security teams become the bottleneck of projects, and when an SLA is in place, this is less likely to happen. For best results, set conservative goals at first, and aim to improve over time.

Requirements

Software project requirements should always include security questions. Following are the types of questions that security professionals should be asking when assisting with requirements gathering and analysis:
- Does the system contain, or come in contact with, confidential, sensitive, or Personally Identifiable Information (PII) data?
- Where and how is the data being stored?
- Will this application be available to the public (internet) or internally only (intranet)?
- Does this application perform sensitive or essential tasks (such as transferring money, unlocking doors, or delivering medicine)?
- Does this application perform any risky software activities (such as allowing users to upload files)?
- What level of availability do you need? Does it require 99.999% uptime? (Note: almost no systems actually require that level of uptime.)

Ideally, when creating a list of requirements, the security representative should ask questions and then add the appropriate security requirements to the list of requirements for the project. For instance: "Will your application allow users to upload files? Yes? Okay then, let's add the following security requirements to that project requirements document to ensure they design it securely from the start." As someone creating software, it is part of your responsibility to protect the security, safety, and privacy of your users. These requirements will help ensure that you do. The following sections detail security requirement definitions and explanations. At the end of this chapter you'll find a checklist of requirements that could be added to any web application project requirements document.

Encryption

Cryptography is a type of math that can be applied to information in order to make its value no longer understandable; it is used to hide secrets and ensure the privacy of communications. Encryption is two-way, in that you can jumble up the information into an unreadable mess and then "decrypt" it back into its original form. Hashing is one-way; the original value can never be recovered. Encryption is used quite often to protect secrets or to transmit data, because the system needs that data back later. The value is in the data. Hashing is most often used to prove identity, to authenticate to a system, to verify the integrity of your data, or for some sort of challenge or verification; for example, no one cares what your password actually is, they just want to know if they should let you into the system or not. The data is not the value; proving your identity is (you know the original value, the password, which proves your identity). Hashing a value also means that if the value is ever leaked, it is unusable; leaked passwords (that are in an altered and hashed format) don't do the thief any good, because if the hashed values were entered into the system they would not be recognized as your password.

NOTE: Although there are currently concerns about quantum computing making our current forms of cryptography and encryption obsolete, this book will presume that has not happened yet. There is no known date as to when this will occur, and no one, including me, currently has a working quantum-resistant encryption algorithm or strategy that is proven as of this writing. As such, this topic is out of scope for this book.

In order to ensure the confidentiality of your data, it should be encrypted in transit (on its way to and from the user, the database, an API, etc.) and at rest (while in the database). It should be noted that this ensures no one will learn your secrets; if someone were to gain unauthorized access to your data or intercept your traffic with a sniffing tool, they would not be able to understand what they have found. This does not, however, protect the availability of your data, or its integrity.
Someone could still delete or change the data in your database (it would be obvious it was changed or removed, but also quite inconvenient if your backups and rollbacks are not perfectly seamless). A malicious actor could intercept your traffic and change or block your messages, which again would cause problems. That said, protecting your secrets (the “C” in CIA) is vital, and thus no matter what system you are creating, you will want to ensure the data is encrypted (not hashed) in transit and at rest. Some may argue that data should even be encrypted while in use (in memory), but unless you are dealing with extremely sensitive data, this is generally not expected as a project requirement. To protect highly sensitive data, it is recommended that you flush the memory when your program exits, logs out, or is otherwise ended. Never Trust System Input Any input to your system could potentially be tampered with or otherwise cause your application to malfunction or fail. Whether this input is intentionally malicious or not, if it causes your application to go into an unknown state (a state you have not planned for or programmed to handle), this is a very dangerous place to be. When your application falls into an unknown state, this is where malicious actors are able to force your application to do things you never dreamed of, including breaking one or more of the CIA factors. Your program must be able to handle every type of input gracefully, even bad input. Input to your application means literally anything and everything that is not a part of your application or that could have been manipulated outside of your application. NOTE One of the main risks to computer software is when data (values in variables, from an API or from a database) is executed as though it were part of the code of your application. Having code run that is not supposed to be a part of your application is generally characterized as an “injection” vulnerability and has been widely recognized by security professionals as the #1 threat to secure software5 since the start of our industry. This risk is the motivation for many of the project requirements included in this chapter, but most especially this one. Following are examples of input to your application: User input on the screen (for instance, entering search phrases into a field) Information from a database (even the database you designed for your app) Information from an API (even one you wrote) Information from another application that your application integrates with or otherwise accepts input from (this includes serverless apps and scripts) Values in the URL parameters, cookie values, configuration files Data or commands from cloud workflows Images that you’ve included from other sites (with or without permission) Values used from online storage This is not an exhaustive list. Please be aware that anything from outside your program could potentially be damaging. NOTE Cloud workflows are triggers that are usually used to call serverless apps, but may be used to trigger an action within your application. Serverless applications are applications or scripts that run in the cloud, without the need of a server running all the time. This means they are not using infrastructure resources unless they are running. When a serverless application is called, it launches a container, the app or script runs to completion on that container, and then it self-destructs, releasing the infrastructure resources. 
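As a concrete illustration of the injection risk described in the note earlier in this section, here is a minimal C# sketch (assuming the Microsoft.Data.SqlClient package and a hypothetical Users table; the names are placeholders). The first query treats untrusted data as part of the command; the second keeps data as data by passing it as a parameter.

using Microsoft.Data.SqlClient; // assumed dependency for this sketch

static class LookupExamples
{
    // DANGEROUS: the untrusted value is concatenated into the SQL statement itself,
    // so input such as  ' OR '1'='1  changes what the query does (an injection attack).
    public static SqlCommand UnsafeLookup(SqlConnection conn, string userName)
    {
        return new SqlCommand(
            "SELECT Id, UserName FROM Users WHERE UserName = '" + userName + "'", conn);
    }

    // SAFER: the value is passed as a parameter, so the database always treats it as data,
    // never as code. (You should still validate the value before using it.)
    public static SqlCommand SaferLookup(SqlConnection conn, string userName)
    {
        var cmd = new SqlCommand(
            "SELECT Id, UserName FROM Users WHERE UserName = @userName", conn);
        cmd.Parameters.AddWithValue("@userName", userName);
        return cmd;
    }
}

The same principle applies to every other type of input listed in this section: validate it, and keep it separated from anything your program will execute or interpret.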
Examples of parts of your application that could be manipulated outside of your program include:
URL parameters (a user could change them)
Information in a cookie that does not have the "Secure" and "HttpOnly" flags set
Hidden fields (they are not safe from attackers)
HTTP request headers
Values entered on the screen that can be manipulated after they have passed your JavaScript validation, if the user employs a web proxy (more on this later in the chapter)
Front-end frameworks that are not included as part of your project but instead hosted elsewhere on the internet and called from your application in real time
Third-party code that you include in your application when it is compiled (libraries, includes, frameworks, etc.)
Images that you include in your application that are hosted elsewhere on the internet
Configuration files that are not managed by you
APIs or any other service that your application calls
Scripts that you do not control

Sometimes developers forget that even frameworks and online services that are well trusted, respected, and supported are still possible attack vectors. In order to effectively apply the concept "never trust system input," you must always validate all input (every single time) before you use it. Input is considered untrustworthy until after validation. By "validate" I mean you perform tests to ensure the input is appropriate and what you are expecting, and if it's not, reject it. There are special cases where you should sanitize input (remove anything that is potentially bad), and that will be discussed later. For this section we are discussing validating input to your application. Input validation examples include:

You are expecting a date of birth, so you verify that the value you receive is indeed in date format and/or convert it into date format, and that it is within the previous 100 years (for example, year of birth > current year – 100 && year of birth <= current year). If it is not in proper date format ("aaaaaaaa," for example), you reject it. If it states the person is 5,000 years old, you reject it. If the person's age is such that they have not been born yet, reject it. Issue a proper error message stating the format is incorrect, and what the format should be. Or, if it is not within your age range of 100 years, issue an error message stating that the age is not correct.

The field is a person's first name, and you have dedicated 80 characters for this field. Verify that the input you receive is 80 characters or less, and that the characters are appropriate for a name. For instance, if it contains characters that do not belong in a name (%, [, or {, for example), reject it.
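Here is a minimal C# sketch of what such checks can look like in practice. The method names and the exact rules (the date format, the 100-year window, the 80-character limit, and the allowed characters) are assumptions taken from the examples above and would need to be adjusted for your own application.

using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class InputValidation
{
    // Validate a date of birth: it must parse as a date and fall within the last 100 years.
    public static bool TryValidateDateOfBirth(string input, out DateTime dateOfBirth)
    {
        bool parsed = DateTime.TryParseExact(
            input, "yyyy-MM-dd", CultureInfo.InvariantCulture, DateTimeStyles.None, out dateOfBirth);

        if (!parsed) return false;                            // "aaaaaaaa" -> reject
        var today = DateTime.UtcNow.Date;
        if (dateOfBirth > today) return false;                // not born yet -> reject
        if (dateOfBirth < today.AddYears(-100)) return false; // 5,000 years old -> reject
        return true;
    }

    // Validate a first name: maximum 80 characters, only characters we expect in a name.
    // (The allowed set here is deliberately simple; real applications need to think
    // carefully about international names.)
    private static readonly Regex NamePattern = new Regex(@"^[\p{L}\p{M}' \-]{1,80}$");

    public static bool IsValidFirstName(string input) =>
        !string.IsNullOrWhiteSpace(input) && NamePattern.IsMatch(input);
}

If either check fails, reject the input and return a clear error message; do not try to quietly "fix" the value for the user.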
X-XSS-Protection

This header has been deprecated. Not only is it no longer supported by modern browsers, but several industry experts recommend that it no longer be used at all, due to the vulnerabilities it can create. Although it can help in some cases with some very old browsers, it causes more harm than good, and therefore should no longer be used as of this writing.

Content-Security-Policy (CSP)

The first thing a malicious actor does when they realize a website is vulnerable to XSS is call out to their own script, located somewhere else on the internet, because the full attack script is usually significantly longer than the vulnerable field would allow, or would otherwise be detected and blocked. Most applications only allow 20–100 characters for most fields, meaning an attacker wouldn't be able to install a complete piece of malware or do as much damage as they would like to, so they call out to somewhere else on the internet where their harmful code is ready and waiting. Content Security Policy makes you list all of the sources of content (scripts, images, frames, fonts, etc.) that your site will use from outside of your domain, which stops a vulnerable web application from calling out and running the secondary part of the attack. It reduces the risk and potential damage of this type of attack drastically. That said, developers tend to find it time-consuming to keep track of their sources, and so this is not a popular security header among developers. It is my viewpoint that it would be used much more often if developers understood the risks and the protection it offers against them. If you are having trouble convincing a developer to apply this header, consider lending them this book; hopefully they'll thank you later.

NOTE Never turn CSP on without the developer's consent and assistance. You shouldn't enable security features without coordinating with affected teams in general, but especially for this security header. It would almost certainly break the website's look and feel and some of its features, but more importantly, it will break trust with the developer team. Don't rush into this one; test it thoroughly before you deploy for the first time.

The absolute easiest setting is to just block everything, which is a good choice if your site is static and/or boring (doesn't call any content from anywhere). The setting for this would be as follows:

Content-Security-Policy: default-src 'self'; block-all-mixed-content;

But let's be honest; few modern websites are so simple. That's okay, we got this. This is a list, courtesy of OWASP (the Open Web Application Security Project), of the various sources you can define in your policy:

default-src: As you would expect, this is the default setting. It's sort of a "catch-all." If anything is trying to load and it's not clearly defined in the rest of the policy, this will apply. It is often set to 'self', to say that if we don't explicitly allow it somewhere else in the policy, the answer is "don't load it." Always set this to 'self' if you are unsure.

NOTE Frame Ancestors and Form Action are exceptions to this rule; they do not fall back to default-src.

script-src: A list of domains (where scripts are located), or the exact URLs of the scripts, that are allowed to run as part of your site. Every other script from any other place on the internet, except those in your domain and what you list here, will not run. This protects against XSS attacks.

WARNING The "unsafe-inline" keyword can be used as part of your configuration to undo the locking down we just did in our Content Security Policy; it allows inline scripts to run, which means an attacker's injected script no longer has to come from an approved source. Unsafe-inline should only ever be temporary, used for testing while you work your way up to a mature and complete CSP implementation; it should never remain as a permanent solution. Also, make sure to audit for this setting in production during security assessments.

Each of the following items follows this pattern: if you list it, then that type of resource can be used or loaded as part of your web application, plus those from within your own domain. Every other type of resource not listed here, when using the CSP header, will be blocked.

object-src = plugins (valid sources for <object>, <embed>, and <applet> elements)
style-src = styles (Cascading Style Sheets, or CSS)
img-src = images
media-src = video and audio
frame-src = frames
font-src = fonts
plugin-types = limits the types of plugins that can be run

script-nonce: This is a complicated one, but noteworthy. A nonce is a string of characters created to be used one single time in order to prove that a specific script is the one you mean to call. This is an extra level of security within the CSP security header. Using this setting means you require the nonce in order to run the script.

TIP Nonces are a complex topic, and the implementation of the nonce feature within CSP has changed over time. With this in mind I direct you to the OWASP Cheat Sheet on this topic for updates and a significantly longer explanation: cheatsheetseries.owasp.org/cheatsheets/Content_Security_Policy_Cheat_Sheet.html.

report-uri: CSP will make a report for you about what it has blocked and other helpful information. This URI tells it where to send the report. As of this writing there are only four security headers that provide a reporting feature, and quite frankly, it's really cool. Having metrics and information about the types of attacks your site is receiving is a gift.

WARNING The report URL is public, meaning an attacker could potentially view your reports, and could also perform a denial-of-service attack (DoS or DDoS) against it in order to hide their misdeeds.

This is by far the most complex of all the security headers. For more information on this topic visit csp-evaluator.withgoogle.com (from Google) and scotthelme.co.uk. For example:

Content-Security-Policy: default-src 'self'; img-src https://*.wehackpurple.com; media-src https://*.wehackpurple.com;

allows the browser to load images, videos, and audio from *.wehackpurple.com.

Content-Security-Policy: default-src 'self'; style-src https://*.jquery.com; script-src https://*.google.com;

allows the browser to load styles from jquery.com and scripts from google.com.

TIP Security headers that provide reports: CSP, Expect-CT, Public-Key-Pins, and XSS-Protection. Very handy!

NOTE DoS or DDoS stands for denial of service, with the extra "D" meaning "distributed." The purpose of a DoS attack is to overload the resources of the victim such that no one can use them. This could cause a website to crash, online stores to lose sales, and other forms of blocking access to online resources. When DoS attacks first started, they often came from only one source, meaning that IP address could be blocked and the attack stopped. Over time attackers learned that having many (hundreds or even thousands of) different IPs launching an attack was significantly more damaging and difficult to defend against. Most DoS attacks as of this writing are distributed (DDoS), often using compromised IoT devices as part of the attack.

X-Frame-Options

This header helps to protect against clickjacking attacks, in which a malicious website frames your legitimate website and is able to steal information or "clicks." Although you may have a specific situation where you want your site to be framed by one or more specific sites, allowing any site to frame yours is likely a bad idea, as it can lead to situations where a user thinks they are on your site, when in fact they are clicking on an invisible site that is most likely malicious in nature (which is called "clickjacking").
This can lead to keylogging and stealing user’s credentials. In order for this type of vulnerability to be exploited the user has to participate by clicking a phishing link or a link on a malicious site, meaning it doesn’t happen in the wild as often as some other attacks, but the potential damage is high when it does. WARNING X-Frame-Options is deprecated and Content-Security-Policy (CSP) is used in its place for modern browsers. X-Frame-Options is used for backward compatibility of older browsers and will hopefully be phased out slowly from active use. To allow the site to use frames from within your own domain, set X-Frame-Options to “sameorigin”: X-Frame-Options: SAMEORIGIN To allow no frames, from anywhere, set to X-Frame-Options to “deny”: X-Frame-Options: DENY X-Content-Type-Options Part of the beauty of writing software is being creative and taking poetic license in the way you use a programming language; finding new and imaginative ways to use your language and framework. However, sometimes this leads to ambiguity, which can lead to security vulnerabilities if an application is unsure of its next instruction. This is called “falling into an unknown state,” and we never want our applications to fall into an unknown state. Unless you are a penetration tester or security researcher that is; in that case it is your favorite place in the world as you will most certainly find software vulnerabilities there. This security header instructs a browser not to “sniff” (infer/guess) the content type of media used in a web app, and instead to rely solely on the type specified by the application. Browsers like to think that they can anticipate the type of the content that they are serving in attempts to be helpful, but unfortunately this has turned into a known vulnerability that can be exploited, to the detriment of your website. This security header only has one possible setting: X-Content-Type-Options: nosniff Referrer-Policy When you surf from site to site on the internet, each site sends the next site a value called the “referrer,” which means a link to the page you came from. This is incredibly helpful for people analyzing their traffic, so they know where people are coming from. However, if you are visiting a site of a sensitive nature (a mortgage application, or a site detailing a specific medical condition, for example), you may not want the specifics sent to the next page you visit. In order to protect your user’s privacy, as the website creator you can set the referrer value to only pass the domain and not which page the user was on (wehackpurple.com versus wehackpurple.com/embarrassing-blog-post-title), or to not pass any value at all. You can also change the value of the referrer based on whether you are “downgrading” from HTTPS to HTTP. In order to only pass on the protocol and domain information, set the referrer to ”origin.” No other context can change this setting. Referrer-Policy: origin For example, a document at https://wehackpurple.com/page.html will send the referrer https://wehackpurple.com/. This setting will send only the protocol and domain if the user is leaving your domain. Within your domain it passes the entire path. 
This will serve most situations for non-sensitive websites:

Referrer-Policy: strict-origin-when-cross-origin

No value in the referrer field, no matter the context:

Referrer-Policy: no-referrer

This setting can get quite complex if you need it to, but generally only sending the origin domain when leaving your domain will fit most business situations and protect your users' privacy adequately. If in doubt, you can always send nothing to ensure your users' privacy is respected. For more information, Mozilla is a leader in this area and always offers stellar advice and technical guidance: developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy

Bonus Resource: Scott Helme, a security researcher who shares a lot of information and tools on security headers, is a great resource to learn more about this topic: scotthelme.co.uk

Strict-Transport-Security (HSTS)

This security header forces the connection to be HTTPS (encrypted), even if the user attempted to connect to the website via HTTP. This means the data going back and forth will be encrypted by force. Users, including attackers, cannot downgrade to HTTP (unencrypted); the browser will force the switch to HTTPS before loading any data.

DEFINITIONS
Platform as a Service (PaaS): A cloud computing service that hosts software in the cloud (usually web applications), which is maintained by the cloud provider. There is no need to patch or upgrade a PaaS; your cloud provider performs these tasks for you.
Certificate Authority (often known as a CA): A trusted company or organization that verifies the identity of whoever is purchasing a certificate.
Electronic Frontier Foundation (EFF): An international non-profit organization that works to protect privacy and other rights on the internet.
"Let's Encrypt": Launched with the support of the Electronic Frontier Foundation (EFF) and run by the Internet Security Research Group (ISRG), Let's Encrypt offers encryption certificates for free. As of this writing it is the only consumer Certificate Authority that offers certificates to the public for free.
Wild Card Certificate: A "wild card" certificate covers all of your subdomains, not only your main domain. You will want to get one of these if you have subdomains. Subdomains are everything where the '*' is in this formula: *.yourdomain.whatever. For example, *.wehackpurple.com would include newsletter.wehackpurple.com, store.wehackpurple.com, and www.wehackpurple.com.

In order for this security header to work correctly you need to have a certificate that is issued by a certificate authority (CA), installed on your web server, PaaS, container, or wherever else you are hosting your app. This certificate will be used as part of the encryption process, and you cannot turn on HSTS without one. You will want to get one called a "wildcard" certificate if you have subdomains. You will also want a certificate that lasts as long as possible (one year as opposed to 3 months), so that you don't have to waste time "rotating your certs" very often. Then you will need to decide how long browsers should remember to force HTTPS for your site; this is the max-age, and it is measured in seconds. No, I'm not kidding, they decided to time it in seconds. Hint: one year is 31,536,000 seconds:

Strict-Transport-Security: max-age=31536000; includeSubDomains

TIP You can submit your domain to hstspreload.org and add the suffix "preload" to your declaration. Google will preload your site to ensure that no one is ever able to connect to your site unencrypted. Although major browsers have announced their intentions to adopt this functionality, it is not part of the official HSTS specification. The adjusted syntax of the preceding example would become Strict-Transport-Security: max-age=31536000; includeSubDomains; preload.
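If your application runs on ASP.NET Core, the framework's built-in middleware can set this header for you. Here is a minimal sketch, assuming the .NET 6 (or later) minimal hosting model; the values are illustrative, and this is separate from the NWebsec package example shown later in this chapter.

// Program.cs (ASP.NET Core, minimal hosting model); values are illustrative
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHsts(options =>
{
    options.MaxAge = TimeSpan.FromSeconds(31536000); // one year, in seconds
    options.IncludeSubDomains = true;
    options.Preload = true; // only if you have submitted your domain to hstspreload.org
});

var app = builder.Build();

if (!app.Environment.IsDevelopment())
{
    app.UseHsts();             // adds the Strict-Transport-Security header to responses
}
app.UseHttpsRedirection();     // redirects HTTP requests to HTTPS
app.Run();

Note that HSTS is deliberately skipped in development, so that localhost testing over HTTP keeps working.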
Feature-Policy

As of this writing, this is the newest security header supported by modern browsers. With the advent of HTML5 and many cool new features in more modern browsers, this security header allows or disallows these new types of features for your web application. The setting choices are:

none: Not allowed at all
self: Allowed, but only your own domain can use/call this feature
src (iframes only): The document loaded into the iframe must come from the same origin as the URL in the iframe's src attribute
*: Any domain can use/call this feature
specific origin(s): This feature is allowed only for the specific URLs/origins you list

Here is an example that allows the speaker only from your own site, and full screen from any site:

Feature-Policy: camera 'none'; microphone 'none'; speaker 'self'; vibrate 'none'; geolocation 'none'; accelerometer 'none'; ambient-light-sensor 'none'; autoplay 'none'; encrypted-media 'none'; gyroscope 'none'; magnetometer 'none'; midi 'none'; payment 'none'; picture-in-picture 'none'; usb 'none'; vr 'none'; fullscreen *

These settings were used for the OWASP DevSlop Project website. We forbade almost all features; we only allowed the use of the speaker when called from our own site. We also allowed any domain to set the browser to full screen. When in doubt, be more restrictive, not less. Your users will thank you.

X-Permitted-Cross-Domain-Policies

This security header specifically applies only to Adobe products (Reader and Flash) being a part of your application. Adobe Flash is incredibly insecure and is no longer supported by Adobe, and thus should not be used in modern websites or applications. The purpose of this header is to ensure that files from your domain are allowed (or not) to be accessed by the Adobe products from other sites. If you intend to allow Adobe Reader hosted in places other than your domain to access the documents on your site, you would want to name the domains here. Otherwise, set it to "none" to ensure that no other domains are allowed to use Adobe products to access your documents/resources:

X-Permitted-Cross-Domain-Policies: none

Expect-CT

"CT" refers to Certificate Transparency, an open framework that provides oversight of Certificate Authorities (CAs). From time to time CAs accidentally issue certificates to less-than-ideal sites, and sometimes even outright malicious sites. The entire Certificate Authority system was designed to create trust: to verify that whoever holds a certificate is a site that can be trusted by browsers and users. If CAs are providing certificates to malicious sites, whether through error, negligence, or by being complicit, that is not acceptable. The Certificate Transparency framework logs details about various transactions to track when CAs have issued certificates inappropriately. If a CA has made several "bad calls" when issuing certificates, a browser or organization may choose to no longer "trust" the certificates that it issues. You may wonder what this has to do with you as a software developer, application support professional, or security professional. In order to help maintain the integrity of the entire Certificate Authority system, we must log our certificates into the online CT registry. If a site's certificate is not in the registry, modern web browsers will issue warnings to users that your website is untrustworthy. No one in your business wants browsers telling users their website is unsafe. When you turn on this security header:
1. Your user's browser will check the CT logs to see if your certificate is there, for better or worse.
2. If set to "enforce," your user's browser will enforce certificate transparency, which means if your certificate is not in the registry or is otherwise "CT unqualified," it will terminate the connection between your site and the user. If not in enforce mode, a report is sent to the report URL.
It is advised that you deploy in "reporting only" mode at first; then, once you are certain your certificates are acceptable and correctly registered, upgrade to "enforce" mode. The "max-age" field (measured in seconds) is the length of time for the browser to apply this setting (it is cached in the browser for this length of time).

WARNING Just like the CSP reports, the Expect-CT report URLs are not private. Any information in them could be accessed by anyone, including those with less-than-honest intentions.

Below are two examples of implementation of the Expect-CT header:

Reporting only:
Expect-CT: max-age=86400, report-uri="https://wehackpurple.com/report"

Reporting and blocking:
Expect-CT: max-age=86400, enforce, report-uri="https://wehackpurple.com/report"

The following is another example from the OWASP DevSlop project, this time for .NET Core applications. This example requires that you add an additional NuGet package ("NWebsec.AspNetCore.Middleware") to your project:

// Security headers, .NET Core (not the same as ASP.NET)
app.UseHsts(hsts => hsts.MaxAge(365).IncludeSubdomains());
app.UseXContentTypeOptions();
app.UseReferrerPolicy(opts => opts.NoReferrer());
app.UseXXssProtection(options => options.EnabledWithBlockMode());
app.UseXfo(options => options.Deny());
app.UseCsp(opts => opts
    .BlockAllMixedContent()
    .StyleSources(s => s.Self())
    .StyleSources(s => s.UnsafeInline())
    .FontSources(s => s.Self())
    .FormActions(s => s.Self())
    .FrameAncestors(s => s.Self())
    .ImageSources(s => s.Self())
    .ScriptSources(s => s.Self())
);
// End security headers

Public Key Pinning Extension for HTTP (HPKP)

Public Key Pinning is a system that was created to protect against fraudulent certificates. The idea of this system was that only one or two Certificate Authorities would be trusted for a particular URL/site. At first it was implemented using static pins (built into the browser directly and manually, specifically in Chrome and Firefox), but eventually other website owners wanted to "pin" certificates as well, and this was done dynamically by the website owners themselves. The certificate would be pinned for a period of time (usually a year) and to a specific cryptographic identity (keys). However, if you lose your keys, you have lost control of your website for up to a year, as it won't work without them. In the meantime, you have essentially "bricked" your website URL (made it unusable), and thus your security feature has caused catastrophic business damage. This situation is called "HPKP Suicide." Although the potential benefits of this security header are great, the risks make it inadvisable to use this security feature unless you require an extremely high level of assurance and have a team that is extremely knowledgeable on this topic and ready to accept this business risk.
WARNING This security header is considered deprecated/no longer supported. Securing Your Cookies The HTTP protocol was never designed to handle user sessions (keeping track of someone being logged in, or having items in a shopping cart; keeping track of anything as a user moves from page to page within your site is referred to as “state”). Cookies are used to pass information about a user session back and forth from the browser to your web server and can be saved for future use to give the user a more personalized experience (for instance, it can remember their language preference). Although many websites use cookies to store information for marketing and tracking purposes, and even selling the information to other companies, the ethical aspects in regard to privacy of the data kept within cookies is not a topic we will delve into here. When you are setting cookies in your application, there are settings that you will need to use in order to ensure the data within your cookie remains safe, and they are discussed in the following sections. Please note: Sometimes developers decide to use local storage instead of cookies. There is an entirely different set of precautions to take in order to protect local storage, and that will not be covered in this section. Also, your session should always be passed in a session cookie (not a persistent cookie, and never saved into local storage). Always use a session cookie for your session. Also, just in case you forgot, always validate the input that you put into your cookies. If the data you are getting doesn’t make sense, reject it and try again. Your cookie data is very valuable, and you must take care to always protect your session (more on that in the following chapters). The Secure Flag The secure flag ensures that your cookie will only be sent over encrypted (HTTPS) channels. If an attacker attempts to downgrade your session to HTTP, your web application will refuse to send the cookie. You should always turn this setting on: Set-Cookie: Secure; (plus all of your other settings) The HttpOnly Flag The naming of this flag is not intuitive, as it has nothing to do with forcing the connection to be unencrypted (HTTP), and thus makes it confusing for programmers. When this flag is set on a cookie it means that the cookie cannot be accessed via JavaScript; it can only be changed on the server side. The reason for using this setting on your cookie is to protect against XSS attacks attempting to access the values in your cookie. You always want to set this value, in all cookies, as another layer of defense against XSS attacks against the confidentiality of the data in your cookie: Set-Cookie: HttpOnly; (plus all of your other settings) Persistence If you are collecting sensitive user data or managing your session, your cookie should not be persistent; it should self-destruct at the end of the session to protect that data. A cookie that self-destructs at the end of your session is a called a “session cookie.” If the cookie does not self-destruct at the end of the session, it is called a “persistent cookie” or a “tracking cookie.” Talk with the privacy team and business analysts to determine whether you want the cookie to be persistent or not. For persistent cookies, you can set the expiry via the “expires” attribute setting, and/or set a specific maximum age that the cookie can reach via “max-age” setting. 
Expires: Jan 1, 2021
Set-Cookie: Expires=Fri, 01 Jan 2021 00:00:00 GMT; (plus all of your other settings)

Max-age of 1 hour
Set-Cookie: Max-Age=3600; (plus all of your other settings)

Domain

If you want your cookie to be able to be accessed by domains outside of your own, you must explicitly list them as trusted domains using the "domain" attribute. Otherwise browsers will assume your cookies are "host-only," meaning only your domain can access them, and all other access will be blocked. This type of built-in protection is considered "secure by default"; if only all software settings worked this way!

Set-Cookie: Domain=app.NOTwehackpurple.com; (plus all of your other settings)

WARNING If you do not set the subdomain (NOTwehackpurple.com instead of app.NOTwehackpurple.com), every application and page hosted within that domain will be able to access your cookie. You probably don't want that.

Path

Many URLs on the web are actually many separate applications all listed under different paths and subdomains. To the user they appear as one giant web page, but in reality, they could be thousands of different applications. If you are working in such a situation, it is likely that you should limit access to your cookie to only the specific location where your application resides (its "path"). You set the path attribute to limit your cookie's scope:

Set-Cookie: path=/YourApplicationsPath; (plus all of your other settings)

Same-Site

The same-site attribute was created by Google to combat the cross-site request forgery (CSRF) vulnerability. CSRF, affectionately known as "sea surf," is an attack against users who are logged in to a legitimate site, where an attacker attempts to take actions on the user's behalf without their consent or knowledge. It's usually done via a phishing email. Imagine Alice is going to be on a TV show to talk about the company she works for. Alice wants to look great on the show, and she has decided to buy a brand new outfit online. Alice logs in to her favorite clothing site and is looking through all the new items they have when she notices she has received an email. The email is a phishing email, with a link to the site that she is currently logged in to, and the link includes code instructions for the site to purchase a product and send it to the attacker, not Alice. If Alice clicks this link, and the site does not have proper defenses, the entire attack could happen without her even realizing it. The attack only works if Alice is currently logged in to the site in the phishing email. You might think this sounds far-fetched, but think back to logging in to any of the websites that you usually use. Do you always press the Log Out button? Do you sometimes leave a site open for a few days? No one is perfect, and it is our job to protect the users of our sites, even if they make a mistake such as clicking a phishing link. With this in mind, this cookie attribute lets you restrict when your cookies will be sent: depending on the value you choose, they will not be attached to requests that originate from another site (cross-site requests). The options are "None" (cookies can be sent from anywhere), "Strict" (only from your domain), and "Lax" (cookies are also sent when the user arrives by following a link from another page; only send non-sensitive information in these cookies). If you don't set this value in your cookies, the default on modern browsers is "Lax" and on older browsers it's "None." Lax means users can remain logged in and surf to other sites and then return, and your cookies will still work. Lax is usually used for navigation and other features your application offers before a user logs in. It will block the most obvious CSRF attacks (cross-site POST requests). It's not foolproof, but it's a good compromise if you can't talk your team into using "Strict."

Set-Cookie: SameSite=Strict; (plus all of your other settings)
Set-Cookie: SameSite=Lax; // only blocks cross-site POST requests, allows links
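Most web frameworks let you set all of these attributes when you create a cookie, rather than building the Set-Cookie string by hand. Here is a minimal sketch, assuming ASP.NET Core; the cookie name, the path, and the helper method are illustrative placeholders.

using Microsoft.AspNetCore.Http;

static class CookieSettings
{
    // Apply the attributes discussed in this section when issuing a session cookie.
    // ("app_session" and the path are placeholders for this sketch.)
    public static void AppendSessionCookie(HttpResponse response, string sessionValue)
    {
        response.Cookies.Append("app_session", sessionValue, new CookieOptions
        {
            Secure = true,                  // only sent over HTTPS
            HttpOnly = true,                // not accessible to JavaScript
            SameSite = SameSiteMode.Strict, // not sent on cross-site requests
            Path = "/YourApplicationsPath", // limit scope to your application's path
            // No Expires or MaxAge: the cookie is a session cookie and ends with the session.
        });
    }
}

Setting these values in one place, in code, also makes it much easier to audit them later than hunting for hand-built header strings.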
Cookie Prefixes

Prefixes for cookies are new and not accepted by all browsers; however, they are a defense-in-depth measure (an additional layer) for cases where cookies are accidentally mishandled in your application. For instance, if a subdomain has been compromised, and your cookie is scoped to your entire domain via the "Domain" setting, the compromised subdomain could attempt to access your cookie. Prefixes are part of the cookie's name, and therefore the server will always see them, encrypted or not. Prefixes can be used to ensure your cookie is only accessed within a specific subdomain, using the host prefix ("__Host-"). More information on this topic is available on the Mozilla and Chrome websites.

Data Privacy

Most websites now have a privacy policy stating which data they collect and how they use it. If your workplace has such a policy, ensure you follow it. If your workplace doesn't, perhaps you need one. Food for thought.

Data Classification

All data collected, used, or created by your application must be classified and labeled to ensure that anyone doing work on your project knows how to handle the data. Depending upon which country you live in and where you work (private company or governmental agency), the classification system you use may be different. Ask your privacy or security team if you need guidance. Let's look at some examples of why we need to classify data and how. Bob works for the federal government, and they have their own data classification system (Figure 2-2). In his current role he deals with data that is Classified, Secret, and Top Secret, and there is a strict system to identify these files and data types, which Bob always follows. Classified, Secret, and Top Secret, in Bob's country of Canada, mean that if exposed the data would cause harm to Canada as a nation, as opposed to a specific person. In his previous role Bob dealt with data that was much less sensitive, and those data classifications were Public (anyone can see it), Protected A (could cause harm or embarrassment to an individual), Protected B (will cause harm to a person or persons), and Protected C (could result in death or otherwise irreparable damage to one or more persons).

Figure 2-2: Data classifications Bob uses at work

By always classifying (deciding the level of sensitivity of) and labeling the data that he collects, Bob ensures that everyone on the team knows how to handle the data correctly. If it's unlabeled, people could make a mistake, and unknowingly leak or leave unprotected very valuable and/or sensitive data. Many database systems allow you to add classification levels to data fields or tables (Microsoft SQL Server, for example). If the one you are using doesn't provide this functionality natively, you can add an extra field to your table to label it yourself (or one for each field, if the sensitivity level varies). Even if data is "unclassified" or "public," it should be labeled. Always follow the rules and regulations according to your data's classification. If you don't know what the rules or regulations are, ask.
If there are no rules where you work, you can either adopt and follow the standard put forth by your country's government, or NIST's "Guide for Mapping Types of Information and Information Systems to Security Categories."
NOTE Alth