Software Security Principles, Policies, and Protection
Mathias Payer
July 2021, v0.37

Contents

1 Introduction
2 Software and System Security Principles
   2.1 Authentication
   2.2 Access Rights
   2.3 Confidentiality, Integrity, and Availability
   2.4 Isolation
   2.5 Least Privilege
   2.6 Compartmentalization
   2.7 Threat Model
   2.8 Bug versus Vulnerability
   2.9 Summary
3 Secure Software Life Cycle
   3.1 Software Design
   3.2 Software Implementation
   3.3 Software Testing
   3.4 Continuous Updates and Patches
   3.5 Modern Software Engineering
   3.6 Summary
4 Memory and Type Safety
   4.1 Pointer Capabilities
   4.2 Memory Safety
      4.2.1 Spatial Memory Safety
      4.2.2 Temporal Memory Safety
      4.2.3 A Definition of Memory Safety
      4.2.4 Practical Memory Safety
   4.3 Type Safety
   4.4 Summary
5 Attack Vectors
   5.1 Denial of Service (DoS)
   5.2 Information Leakage
   5.3 Confused Deputy
   5.4 Privilege Escalation
      5.4.1 Control-Flow Hijacking
      5.4.2 Code Injection
      5.4.3 Code Reuse
   5.5 Summary
6 Defense Strategies
   6.1 Software Verification
   6.2 Language-based Security
   6.3 Testing
      6.3.1 Manual Testing
      6.3.2 Sanitizers
      6.3.3 Fuzzing
      6.3.4 Symbolic Execution
   6.4 Mitigations
      6.4.1 Data Execution Prevention (DEP) / W^X
      6.4.2 Address Space Layout Randomization (ASLR)
      6.4.3 Stack Integrity
      6.4.4 Safe Exception Handling (SEH)
      6.4.5 Fortify Source
      6.4.6 Control-Flow Integrity
      6.4.7 Code Pointer Integrity
      6.4.8 Sandboxing and Software-based Fault Isolation
   6.5 Summary
7 Case Studies
   7.1 Web security
      7.1.1 Protecting long running services
      7.1.2 Browser security
      7.1.3 Command injection
      7.1.4 SQL injection
      7.1.5 Cross Site Scripting (XSS)
      7.1.6 Cross Site Request Forgery (XSRF)
   7.2 Mobile security
      7.2.1 Android system security
      7.2.2 Android market
      7.2.3 Permission model
8 Appendix
   8.1 Shellcode
   8.2 ROP Chains
      8.2.1 Going past ROP: Control-Flow Bending
      8.2.2 Format String Vulnerabilities
9 Acknowledgements
References

1 Introduction

Browsing through any daily news feed, it is likely that the reader comes across several software security-related stories. Software security is a broad field and stories may range from malware infestations that trick a gullible user into installing malicious software to widespread worm outbreaks that leverage software vulnerabilities to automatically spread from one system to another. Software security (or the lack thereof) is at the basis of all these attacks.

But why is software security so difficult? The answer is complicated. Both protection against and exploitation of security vulnerabilities cross-cut through all layers of abstraction and involve human factors, usability, performance, system abstractions, and economic concerns. While adversaries may target the weakest link, defenders have to address all possible attack vectors.
An attack vector is an entry point for an attacker to carry out an attack, i.e., an opening in the defenses that allows the attacker to gain unintended access. A single flaw is enough for an attacker to compromise a system, while the defender must prohibit any feasible attack according to a given threat model. As an example, an attacker requires a single exploitable bug to compromise a system while the defender must prohibit the exploitation of all flaws that are present in the system. If you compare this to logic formulas, it is much simpler to satisfy a “there exists” condition than a “for all” condition.

Security, and especially system and software security, concerns permeate all areas of our life. We interact with complex interconnected software systems on a regular basis. Bugs or defects in these systems allow attackers unauthorized access to our data or enable them to escalate their privileges, e.g., by installing malware. Security impacts everyone’s life and it is crucial for users to make safe decisions.

To manage an information system, people across several layers of abstraction have to work together: managers, administrators, developers, and security researchers. A manager decides how much money to invest into a security solution or what security product to buy. Administrators must carefully reason about who gets which privileges. Developers must design and build secure systems to protect the integrity, confidentiality, and availability of data given a specific access policy. Security researchers identify flaws and propose mitigations against weaknesses, vulnerabilities, or systematic attack vectors.

Security is the application and enforcement of policies through defense mechanisms over data and resources. Security policies specify what we want to enforce. Defense mechanisms specify how we enforce the policy, i.e., a mechanism is an implementation/instance of a policy.
For example, “Data Execution Prevention” is a mechanism that enforces a code integrity policy by guaranteeing that each page of physical memory in the address space of a process is either writable or executable but never both.

Software security is the area of security that focuses on (i) testing, (ii) evaluating, (iii) improving, (iv) enforcing, and (v) proving security properties of software.

To form a common basis of understanding and to set the scene for software security, this book first introduces and defines basic security principles. These principles cover confidentiality, integrity, and availability to define what needs to be protected; compartmentalization to understand approaches to defense; threat models to reason about abstract attackers and behavior; and the difference between bugs and vulnerabilities.

During the discussion of the secure software life cycle, we will evaluate the design phase with a clear definition of requirement specification and functional design of software before going into best implementation practices, continuous integration, and software testing along with secure software updates. The focus is explicitly not software engineering but the security aspects of these processes.

Memory and type safety are the core security policies that enable semantic reasoning about software. Only if memory and type safety are guaranteed can we reason about the correctness of software according to its implementation. Any violation of these policies results in exceptional behavior that allows an attacker to compromise the internal state of the application and execute a so-called weird machine. Weird machines no longer follow the expected state transitions of an application as defined in source code because the application state is compromised. An execution trace of a weird machine always entails some violation of a core security policy to break out of the constraints of the application’s control flow (or data flow).
The CPU only executes the underlying instructions. When the data is compromised, the high-level properties that the compiler or runtime system assumed no longer hold, and therefore any checks that were removed based on these assumptions are no longer valid.

The section on attack vectors discusses different types of attacks. It starts with the confused deputy, which abuses a given API to trick a compartment into leaking information or escalating privileges, and moves on to control-flow hijacking attacks that leverage memory and type safety issues to redirect an application’s control flow to attacker-chosen locations. Code injection and code reuse rewire the program, either by introducing attacker-controlled code into the address space of a program or by repurposing code that is already present.

The defense strategies section lists different approaches to protect applications against software security violations. As new source code is written faster than it can be tested, some exploitable bugs will always remain despite the best development and testing efforts. As a last line of defense, mitigations help a system ensure its integrity by sacrificing availability. Mitigations stop an unknown or unpatched flaw by detecting a policy violation through additional instrumentation in the program. During development, defense strategies focus on different approaches to verify software for functional correctness and to test for specific software flaws with different testing strategies. Sanitizers can help expose software flaws during testing by terminating applications whenever a violation is detected.

A set of case studies rounds off the book by discussing browser security, web security, and mobile security from a software and system security perspective.

This book is intended for readers interested in understanding the status quo of software security and for developers who want to design secure software, write safe code, and continuously guarantee the security of an underlying system.
While we discuss several research topics and give references to deeper research questions, this book is not intended to be research-focused but rather a source of information for diving into the area of software security.

Disclaimer: this book is definitely neither perfect nor flawless. If you find spelling errors, language issues, textual mistakes, factual inconsistencies, or other flaws, please follow a responsible disclosure strategy and let us know by dropping us an email. We will happily update the text and make the book better for everyone. Enjoy the read and hack the planet!

2 Software and System Security Principles

The goal of software security is to allow any intended use of software but prevent any unintended use. Any unintended use may cause harm, such as the use of compute resources outside of a defined allowed use. In this chapter, we discuss several system principles that are essential when building secure software systems. Confidentiality, integrity, and availability enable reasoning about different properties of a secure system. Isolation enforces the separation between components such that interaction is only possible along a well-defined interface that allows reasoning about access primitives. Least privilege ensures that each software component executes with the minimum amount of privileges. During compartmentalization, a complex piece of software is broken into smaller components. Isolation and compartmentalization play together: a large complex system is compartmentalized into small pieces that are then isolated from each other. The threat model specifies the environment of the software system, outlining the capabilities of an attacker. Distinguishing between software bugs and vulnerabilities helps us decide about the risks of a given flaw.

2.1 Authentication

Authentication is the process of verifying whether someone is who they claim to be.
Through authentication, a system learns who you are based on what you know, have, or are. During authentication, a user verifies that they have the correct credentials. For example, to log in to a system, a user has to authenticate with their username and password. The username may either have to be entered or selected from a drop-down list, and the password has to be typed. Authenticating through username and password is the most common form of verifying someone’s credentials, but alternate forms are possible too. Common classes of authentication factors are passwords (what you know), biometrics (what you are), and demonstration of property (what you have).

Passwords are the classic approach to authentication. During authentication, the user has to type their password or PIN (personal identification number) to log in. The password is generally kept secret. Most systems provide username and password as the default authentication method. One limitation is the lack of replay resistance: an attacker that steals a password can replay it to authenticate as the user. This risk can be mitigated by a reasonable password update policy where, after a breach, users are urged to update their passwords.

Another risk is that passwords can be brute-forced. During such a brute-force attack, the attacker tries every single possible password combination. Over the years, password policies have become highly restrictive: on some systems, users have to create new passwords every few months, are not allowed to repeat passwords, and must use passwords that contain a set of unique character types (e.g., upper case letters, lower case letters, numbers, and special characters) to ensure sufficient entropy. Current best practice is to allow users the freedom to provide sufficiently long passwords. It is easier to achieve good entropy with longer passwords than by having users forget their complex short passwords.
Biometric logins may target fingerprints, iris scans, or behavioral patterns (e.g., how you swipe across your screen). Using biometric factors for authentication is convenient as users (generally) can neither lose nor forget them. Their key limitation is the lack of replay resistance: an attacker that steals the raw biometric data can replay that data to authenticate as the user. Unlike passwords, biometrics cannot be changed, so a loss of data means that this authentication factor loses its utility. For example, if someone’s fingerprints are known to the attacker, they can no longer be used for authentication.

Property can be anything the user owns that can be presented to the authentication system, such as smartcards, smartphones, or USB keys. These devices have some internal key generation mechanism that can be verified. An advantage is that they are easily replaceable. The key disadvantage is that they should not be used by themselves as, e.g., a smartphone may be stolen. Instead of just using a single username and password pair, many authentication systems nowadays therefore rely on two or more factors. For example, a user may have to log in with username, password, and a code that is sent to their phone via text message.

2.2 Access Rights

Access rights encode which entities a user has access to. For example, a user may be allowed to execute certain programs but not others. They may have access to their own files but not to files owned by another user. The Unix philosophy introduced a simple access-rights matrix consisting of user, group, and other rights. Each file has an associated user who may have read, write, or execute rights. In addition to the user, who is the primary owner, there may be a group with corresponding read, write, or execute rights, and all others that are not part of the group have their own set of these rights. A user may be a member of an arbitrary number of groups. The system administrator organizes group membership and may create new users.
Through privileged services, users may update their password and other sensitive data. More information about access rights and access control (both mandatory and discretionary), along with role-based access control, can be found in many books on Unix system design or system security in general.

2.3 Confidentiality, Integrity, and Availability

Information security can be summarized through three key concepts: confidentiality, integrity, and availability. The three concepts are often called the CIA triad. These concepts are sometimes also called security mechanisms, fundamental concepts, properties, or security attributes. While the CIA triad is somewhat dated and incomplete, it is an accepted basis when evaluating the security of a system or program, and it serves as a good foundation for refinement as it covers the core principles.

Secrecy as a generic property ensures that data is kept hidden (secret) from an unintended receiver. Confidentiality of a service limits access to information to privileged entities. In other words, confidentiality guarantees that an attacker cannot recover protected data. The confidentiality property requires authentication and access rights according to a policy. Entities must be both named and identified, and an access policy determines the access rights for entities. Privacy and confidentiality are not equivalent: confidentiality is a component of privacy that prevents an entity from viewing privileged information. For example, a software flaw that allows unprivileged users access to privileged files is a violation of the confidentiality property. Conversely, encryption, when implemented correctly, provides confidentiality. Note that confidentiality ensures that someone else’s data is kept secret. For example, the OS ensures confidentiality of a process’ address space by hiding it from other processes.
Integrity of a service limits the modification of information to privileged entities. In other words, integrity guarantees that an attacker cannot modify protected data. Similar to confidentiality, the integrity property requires authentication and access rights according to a policy. For example, a software flaw that allows unauthenticated users to modify a privileged file is a violation of the integrity policy, while a checksum that is protected against adversarial changes can detect tampering with data. Another aspect of integrity is replay protection: an adversary could record a benign interaction and replay the same interaction with the service. Integrity protection detects replayed transactions. In software security, the integrity property is often applied to data or code in a process. For example, the OS ensures integrity of a process’ address space by prohibiting other processes from writing to it.

Availability of a service guarantees that the service remains accessible. In other words, availability prohibits an attacker from hindering computation. The availability property guarantees that legitimate uses of the service remain possible. For example, allowing an attacker to shut down the file server is a violation of the availability policy. The OS ensures availability by scheduling each process a “fair” amount of time, alternating between processes that are ready to run.

The three concepts build on each other and heavily interact. For example, confidentiality and integrity can be guaranteed by sacrificing availability: a file server that is not running cannot be compromised or leak information to an attacker. For the CIA triad, all properties must be guaranteed to allow progress in the system. Several newer approaches extend these three basic concepts by introducing orthogonal ideas.
The two most common extensions are accountability and non-repudiation. Accountability means that a service can be held responsible for its actions; non-repudiation means that a service cannot later deny a granted access right or completed request. For example, a service that has given a user access to a file cannot claim after the fact that access was not granted. Non-repudiation is, at its core, a concept of law: it allows both a service to prove to an external party that it completed a request and the external party to prove that the service completed the request. Orthogonally, privacy ensures confidentiality properties for the data of a person, while anonymity protects the identity of an entity participating in a protocol.

Each property covers one separate aspect of information security. Policies provide concrete instantiations of these properties, and mechanisms further refine a policy into an actual implementation. In practice, we will be working with policies that provide certain guarantees, following the core properties defined here. Policies themselves define the high-level goals and the concrete mechanisms then enforce a given policy.

2.4 Isolation

Isolation separates two components from each other and confines their interactions to a well-defined API. There are many different ways to enforce isolation between components, all of which require some form of abstraction and a security monitor. The security monitor runs at higher privileges than the isolated components and ensures that they adhere to the isolation. Any violation of the isolation is stopped by the security monitor and, e.g., results in the termination of the violating component. Examples of isolation mechanisms include the process abstraction, containers, and SFI [33,34].

The process abstraction is the most well-known form of isolation: individual processes are separated from each other by the operating system.
Each process has its own virtual memory address space and can interact with other processes only through the operating system, which plays the role of a security monitor in this case. An efficient implementation of the process abstraction requires support from the underlying hardware for virtual memory and privileged execution. Virtual memory is an abstraction of physical memory that allows each process to use the full virtual address space. Virtual memory relies on a hardware-backed mechanism that translates virtual addresses to physical addresses and an operating system component that manages physical memory allocation. The process runs purely in the virtual address space and cannot interact with physical memory directly. The code in the process executes in non-privileged mode, often called user mode. This prohibits process code from interacting with the memory manager or side-stepping the operating system to interact with other processes. The CPU acts as a security monitor that enforces this separation and guarantees that privileged instructions trap into supervisor mode. Together, privileged execution and virtual memory enable isolation. Note that, similarly, a hypervisor isolates itself from the operating system by executing at an even higher privilege level and mapping guest physical memory to host physical memory, often backed by a hardware mechanism to provide reasonable performance.

Containers are a lightweight isolation mechanism that builds on the process abstraction and introduces namespaces for kernel data structures to allow the isolation of groups of processes. Normally, all processes on a system can interact with each other through the operating system. The container isolation mechanism separates groups of processes by virtualizing operating system mechanisms such as process identifiers (pids), networking, inter-process communication, the file system, and namespaces.
Software-based Fault Isolation (SFI) [33,34] is a software technique to isolate different components in the same address space. The security monitor relies on static verification of the executed code and ensures that two components do not interact with each other. Each memory read or write of a component is restricted to the memory area of that component. To enforce this property, each instruction that accesses memory is instrumented so that the pointer is constrained to the memory area of the component. To prohibit the isolated code from modifying its own code, control-flow transfers are carefully vetted and all indirect control-flow transfers must target well-known locations. The standard way to enforce SFI is to mask pointers before they are dereferenced (e.g., and-ing them with a mask: and %reg, 0x00ffffff) and to align control-flow targets and enforce that alignment.

Generally, higher levels of abstraction trust the isolation guarantees provided by the lower levels. For example, a process trusts the operating system that another process cannot suddenly read its memory. This trust may be broken through side channels, which provide an indirect way to recover (partial) information through an unintended channel. Threat models and side channels will be discussed in detail later. For now, it is safe to assume that if a given abstraction provides isolation, this isolation holds. For example, the process trusts the operating system (and the underlying hardware, which provides privilege levels) that it is isolated from other processes.

2.5 Least Privilege

The principle of least privilege guarantees that a component has the least amount of privileges needed to function. Different components need privileges (or permissions) to function. For example, an editor needs read permission to open a particular file and write permission to modify it. Least privilege requires isolation to restrict access of the component to other parts of the system.
If a component follows least privilege, then removing any further privilege from the component also removes some functionality, while any functionality that remains available can be executed with the given privileges. This property constrains an attacker to the privileges of the component. In other words, each component should only be given the privileges it requires to perform its duty and no more.

Note that privileges have a temporal component as well. For example, a web server needs access to its configuration file, the files that are served, and permission to open the corresponding TCP/IP port. The required privileges are therefore dependent on the configuration file, which will specify, e.g., the port, network interface, and root directory for web files. If the web server is required to run on a privileged port (e.g., the default web ports 80 and 443), then the server must start with the necessary privileges to open a port below 1024. After opening the privileged port, the server can drop privileges and restrict itself to only accessing the root web directory and its subdirectories.

2.6 Compartmentalization

The idea behind compartmentalization is to break a complex system into small components that follow a well-defined communication protocol to request services from each other. Under this model, faults can be constrained to a given compartment. After compromising a single compartment, an attacker is restricted to the protocol to request services from other compartments. To compromise a remote target compartment, the attacker must compromise all compartments on the path from the initially compromised compartment to the target compartment. Compartmentalization allows abstraction of a service into small components. Under compartmentalization, a system can check permissions and protocol conformity across compartment boundaries. Note that this property builds on least privilege and isolation.
Both properties are most effective in combination: many small components that run and interact with each other with least privilege. A good example of compartmentalization is the Chromium web browser. Web browsers consist of multiple different components that interact with each other, such as a network component, a cache, a rendering engine that parses documents, and a JavaScript compiler. Chromium first separates individual tabs into different processes to restrict interaction between them. Additionally, the rendering engine runs in a highly restricted sandbox to confine any bugs in the parsing process to an unprivileged process.

2.7 Threat Model

A threat model is used to explicitly list all threats that jeopardize the security of a system. Threat modeling is the process of enumerating and prioritizing all potential threats to a system. The explicit process of identifying all weaknesses of a system allows individual threats to be ranked according to their impact and probability. During the threat modeling process, the system is evaluated from an attacker’s point of view. Each possible entry vector is evaluated, assessed, and ranked according to the threat modeling system. Threat modeling evaluates questions such as: What are the high-value assets in a system? Which components of a system are most vulnerable? What are the most relevant threats?

As systems are generally large and complex, the first step usually consists of identifying individual components. The interaction between components is best visualized by making any data flow between components explicit, i.e., drawing the flow and the type of information between components. This first step results in a detailed model of all components and their interactions with the environment. Each component is then evaluated based on its exposure, capabilities, threats, and attack surface.
The analyst iterates through all components and identifies, on a per-component basis, all possible inputs, defining valid actions and possible threats. For each identified threat, the necessary preconditions are mapped along with the associated risk and impact.

A threat model defines the environment of the system and the capabilities of an attacker. The threat model specifies the clear bounds of what an attacker can do to a system and is a precondition to reason about attacks or defenses. Each identified threat in the model can be handled through a defined mitigation or by accepting the risk if the cost of the mitigation outweighs the risk times impact.

Let us assume we construct the threat model for the Unix “login” service, namely a password-based authentication service. Our application serves three use cases: (i) the system can authenticate a user based on a username and password through a trusted communication channel, (ii) regular users can change their own password, and (iii) super users can create new users and change any password. According to the use cases above, we identify the following components: data storage, authentication service, password changing service, and user administration service. The service must be privileged as arbitrary users are allowed to use some aspects of the service depending on their privilege level. Our service therefore must distinguish between different types of users (administrators and regular users). To allow this distinction, the service must be isolated from unauthenticated access. User authentication services are therefore an integral part of the operating system and privileged, i.e., they run with administrator capabilities. The data storage component is the central database where all user accounts and passwords are stored.
The database must be protected from unprivileged modification; therefore, only the administrator is allowed to change arbitrary entries, while individual users are only allowed to change their own entry. The data storage component relies on the authentication component to identify who is allowed to make modifications. To protect against information leaks, passwords are hashed using a salt and a one-way hash function. Comparing the hashed input with the stored hash allows checking the equivalence of a password without having to store the plaintext (or an encrypted version) of the password.

The authentication service takes as input a username and password pair and queries the storage component for the corresponding entry. The input (login request) must come from the operating system that tries to authenticate a user. After carefully checking whether the username and password match, the service returns the result to the operating system. To protect against brute-force attacks, the authentication service rate-limits the number of allowed login attempts.

The password changing service allows authenticated users to change their password, interfacing with the data storage component. This component requires prior successful authentication and must ensure that users can only change their own password but not the passwords of other users. The administrator is also allowed to add, modify, or delete arbitrary user accounts.

Such an authentication system faces threats from several directions and providing an exhaustive list would go beyond the scope of this book. Instead, we provide an incomplete list of possible threats:

- An implementation flaw in the authentication service allowing either a user (authenticated or unauthenticated) to
Implementation flaw in privileged user management which allows an unauthenticated or unprivileged user to modify arbitrary data entries in the data storage. Information leakage of the password from the data stor- age, allowing an offline password cracker to probe a large amount of passwords1 A brute force attack against the login service can probe different passwords in the bounds of the rate limit. The underlying data storage can be compromised through another privileged program overwriting the file, data corruption, or external privileged modification. 2.8 Bug versus Vulnerability A “bug” is a flaw in a computer program or system that results in an unexpected outcome. A program or system executes computation according to a specification. The term “bug” comes from a moth that deterred computation of a Harvard Mark II computer in 1947. Grace Hopper noted the system crash in the operation log as “first actual case of bug being found”, see 2.1,. The bug led to an unexpected termination 1 Originally, the /etc/passwd file stored all user names, ids, and hashed passwords. This world readable file was used during authentication and to check user ids. Attackers brute forced the hashed passwords to escalate privileges. As a mitigation, Unix systems moved to a split system where the hashed password is stored in /etc/shadow (along with an id) and all other information remains in the publicly readable /etc/passwd. 20 2 Software and System Security Principles of the current computation. Since then the term bug was used for any unexpected computation or failure that was outside of the specification of a system or program. Figure 2.1: “First actual case of bug being found”, note by Grace Hopper, 1947, public domain. 
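To make this concrete, a small, hypothetical Python helper with a missing-sanitization flaw (the function name and paths are illustrative) shows how a bug triggered by specific input becomes exploitable once an adversary controls that input:

```python
import os

def resolve_public_path(base, name):
    # Bug: 'name' is joined onto 'base' without sanitization, so "../"
    # sequences escape the public directory. With benign input the flaw
    # stays latent and the function behaves as specified.
    return os.path.normpath(os.path.join(base, name))

# Intended use (benign input):
#   resolve_public_path("/srv/public", "readme.txt") -> "/srv/public/readme.txt"
# The bug becomes a vulnerability once (i) the system is susceptible,
# (ii) an adversary reaches the flaw (e.g., 'name' arrives over the
# network), and (iii) the adversary can exploit it, here as an
# arbitrary file read primitive:
#   resolve_public_path("/srv/public", "../../etc/passwd") -> "/etc/passwd"
```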
As a side note, while the term bug was coined by Grace Hopper, the notion that computer programs can go wrong goes back to Ada Lovelace’s notes on Charles Babbage’s analytical engine, where she noted that “an analysing process must equally have been performed in order to furnish the Analytical Engine with the necessary operative data; and that herein may also lie a possible source of error. Granted that the actual mechanism is unerring in its processes, the cards may give it wrong orders.”

A software bug is therefore a flaw in a computer program that causes it to misbehave in an unintended way, while a hardware bug is a flaw in a computer system. Software bugs are due to human mistakes in the source code, compiler, or runtime system. Bugs result in crashes and unintended program state. Software bugs are triggered through specific input (e.g., console input, file input, network input, or environmental input).

If the bug can be controlled by an adversary to escalate privileges, e.g., gaining code execution, changing the system state, or leaking system information, then it is called a vulnerability. A vulnerability is a software weakness that allows an attacker to exploit a software bug. A vulnerability requires three key components: (i) the system is susceptible to the flaw, (ii) the adversary has access to the flaw (e.g., through information flow), and (iii) the adversary has the capability to exploit the flaw.

Vulnerabilities can be classified according to the flaw in the source code (e.g., buffer overflow, use-after-free, time-of-check-to-time-of-use flaw, format string bug, type confusion, or missing sanitization). Alternatively, bugs can be classified according to the computational primitives they enable (e.g., arbitrary read, arbitrary write, or code execution).

2.9 Summary

Software security ensures that software is used for its intended purpose and prevents unintended use that may cause harm.
Security is evaluated based on three core principles: confidentiality, integrity, and availability. These principles are evaluated based on a threat model that formally defines all threats against the system and the attacker’s capabilities. Isolation and least privilege allow fine-grained compartmentalization that breaks a large, complex system into individual components where security policies can be enforced at the boundary between components, based on a limited interface. Security relies on abstractions to reduce complexity and to protect systems.

3 Secure Software Life Cycle

Secure software development is an ongoing process that starts with the initial design and implementation of the software. The secure software life cycle only finishes when the software is retired and no longer used anywhere. Until this happens, software is continuously extended, updated, and adjusted to changing requirements from the environment. This setting results in the need for ongoing software testing and continuous software updates and patches whenever new vulnerabilities or bugs are discovered and fixed.

The environment, such as operating system platforms (which can be considered software as well, following the same life cycle), co-evolves with the software running on the platform. An example is the evolution of security features available in the Ubuntu Linux distribution. Initially, few to no mitigations were present, but with each new release of the distribution new hardening features are released, further increasing the resilience of the environment against unknown or unpatched bugs in the software. Ubuntu focuses on safe default configuration, secure subsystems, mandatory access control, filesystem encryption, trusted platform modules, userspace hardening, and kernel hardening. Together, these settings and changes make it harder for attackers to compromise a system.
Software engineering is different from secure software development. Software engineering is concerned with developing and maintaining software systems that behave reliably and efficiently, are affordable to develop and maintain, and satisfy all the requirements that customers have defined for them. It is important because of the impact of large, expensive software systems and the role of software in safety-critical applications. It integrates significant mathematics, computer science, and practices whose origins are in engineering.

Why do we need a secure software development life cycle? Secure software development focuses not only on the functional requirements but additionally defines security requirements (e.g., access policies, privileges, or security guidelines) and a testing/update regime describing how to react if new flaws are discovered.

Note that this is not a book on software engineering. We will not focus on waterfall, incremental, extreme, spiral, agile, or continuous integration/continuous delivery. The discussion here follows the traditional software engineering approach, leaving it up to you to generalize to your favorite approach. We discuss aspects of some modern software engineering concepts in a short section towards the end of this chapter.

3.1 Software Design

The design phase of a software project is split into two subphases: coming up with a requirement specification and the concrete design following that specification. The requirement specification defines tangible functionality for the project, individual features, data formats, as well as interactions with the environment. From a security perspective, the software engineering requirement specification is extended with a security specification, an asset identification, an environmental assessment, and use/abuse cases. The security specification involves a threat model and risk assessment.
The asset specification defines what kind of data the software system operates on and who the entities with access to the data are (e.g., including privilege levels, administrator access, and backup procedures). Aspects of the environment are included in this assessment as they influence the threats; e.g., a public terminal is at higher physical risk than a terminal operating in a secured facility.

When transitioning from the requirement specification phase to the software design phase, security aspects must be included as integral parts of the design. This transition involves additional threat modeling based on the concrete design and architecture of the software system. The design of the software then extends the regular design documents with a concrete security design that ties into the concrete threat model. The actors and their abilities and permissions are clearly defined, both for benign users and for attackers. During this phase, the design is reviewed from a functional but also from a security perspective to probe different aspects and to iteratively improve the security guarantees. The final design document contains full specifications of requirements, security constraints, and a formal design in prose.

3.2 Software Implementation

The implementation of the software project mostly follows regular software engineering best practices in robust programming. Special care should be taken to ensure that source code is always checked into a source code repository using version control such as git or svn. Source version control platforms such as GitHub allow the organization of source code in a git repository as well as corresponding documentation in a wiki. Individual flaws can be catalogued in a bug tracker and then handled through branches and pull requests. Each project should follow a strict coding standard that defines what “flavor” of a programming language is used, e.g., how code is indented and what features are available.
For C++, it is worthwhile to define how exceptions will be used, what modern features are available, and how memory management should be handled. Alongside the feature definition, the coding standard should define how comments in the code are handled and how the design document is updated whenever aspects change. The Google C++ style guide or Java style guide are great examples of such specification documents. They define the naming structure in projects, the file structure, code formatting, naming and class interfaces, program practices, and documentation in an accessible document. Whenever a programmer starts on a project, they can read the style guide and documentation to get a quick overview before starting on their component.

Similarly, newly added or modified source code should be reviewed in a formal code review process. When committing code to a repository, before the new code is merged into the branch, it must be checked by another person on the project to test for violations of code guidelines, security, and performance. The code review process must be integrated into the development process, working naturally alongside development. There is a myriad of tools that enable source review, such as GitHub, Gerrit, and many others. For a new project it is important to evaluate the features of the different systems and to choose the one that best integrates into the development process.

3.3 Software Testing

Software testing is an integral component of software development. Each new release, each new commit, must be thoroughly tested for functionality and security. Testing in software engineering focuses primarily on functionality and regression. Continuous integration systems, such as Jenkins or Travis, allow functional tests and performance tests based on individual components, unit tests, or the overall program. These tests can run for each commit or at regular intervals to detect deviations quickly.
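A minimal example of the kind of unit and regression tests such a continuous integration system runs on every commit (the policy function and its rules are purely illustrative):

```python
# Hypothetical component under test: a password policy checker of the kind
# the login service from Chapter 2 might use. The rules are illustrative.
def password_acceptable(password):
    return len(password) >= 8 and not password.isalpha()

# Unit tests run by the CI system on every commit (pytest-style):
def test_accepts_strong_password():
    assert password_acceptable("correct-horse-42")

def test_rejects_short_password():
    assert not password_acceptable("abc1")

def test_rejects_letters_only_regression():
    # Regression test: pins down a previously fixed flaw so that it
    # cannot silently reappear in a later commit.
    assert not password_acceptable("abcdefgh")
```

Because each test encodes one expected behavior, a failing commit points directly at the functionality it broke.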
While measuring functional completeness and detecting regressions early is important, this somewhat neglects security aspects. Security testing is different from functional testing. Functional testing measures whether software meets certain performance or functional criteria. Security, as an abstract property, is not inherently testable. Crashing test cases indicate some bugs, but there is no guarantee that a bug will cause a crash. Automatic security testing based on fuzz testing, symbolic execution, or formal verification probes security aspects of the project, increasing the probability of exposing a bug as a crash during testing. See Section 6.3 for more details on testing. Additionally, a red team evaluates the system from an adversary’s perspective and tries to find exploitable flaws in the design or implementation.

3.4 Continuous Updates and Patches

Software needs a dedicated security response team to respond to any threats and discovered vulnerabilities. This team is the primary contact for any flaw or vulnerability and will triage the available resources to prioritize how to respond to issues. Software evolves and, in response to changes in the environment, will continuously expand with new features, potentially resulting in security issues.

An update and patching strategy defines how to react to such flaws, how to develop patches, and how to distribute new versions of the software to the users. Developing a secure update infrastructure is challenging. The update component must be designed to frequently check for new updates while considering the load on the update servers. Updates must be verified and checked for correctness before they are installed. Existing software marketplaces such as the Microsoft Store, Google Play, or the Apple App Store provide integrated solutions to update software components and allow developers to upload new software into the store, which then handles updates automatically.
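The verification step before installation can be sketched as follows. This is a deliberately simplified illustration: the key, helper names, and the use of a keyed HMAC are assumptions made for the sketch; real update systems rely on public-key signatures so that clients hold no signing secret:

```python
import hashlib
import hmac

# Deliberately simplified update-verification sketch. Real update systems
# use public-key signatures so that clients hold no signing secret; a keyed
# HMAC with a shared key stands in here to keep the sketch self-contained.

TRUSTED_KEY = b"provisioned-at-install-time"  # illustrative assumption

def sign_update(payload):
    # Performed by the (hypothetical) update server.
    return hmac.new(TRUSTED_KEY, payload, hashlib.sha256).hexdigest()

def verify_and_install(payload, signature):
    expected = hmac.new(TRUSTED_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False  # reject a tampered or corrupted update
    # ... unpack and install 'payload' only after the check succeeds ...
    return True
```

The essential property is that installation happens strictly after verification, so a compromised mirror or a corrupted download cannot inject code into the client.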
Google Chrome leverages a partial hot update system that quickly pushes binary updates to all Google Chrome instances to protect them against attacks. Linux distributions such as Debian, Red Hat, or Ubuntu also leverage a market-style system with an automatic software update mechanism that continuously polls the server for new updates and informs the user of new updates (e.g., through a pop-up) or, if enabled, even automatically installs the security updates.

3.5 Modern Software Engineering

Software engineering processes have undergone several improvements, and many different management schemes exist. Under agile software development, one of those modern extensions, both requirements and solutions co-evolve as part of a collaborative team. The teams self-organize and restructure themselves depending on the changing requirements that emerge from the interaction with the customer. Under an agile system, an early release is constantly evaluated and further improved. The focus of agile development is on functionality and evolutionary planning.

This core focus on functionality, and the lack of a written specification or documentation, makes reasoning about security challenging. Individual team leads must be aware of security constraints and explicitly push those constraints despite them never being encoded. Explicitly assigning a member of the team a security role (i.e., a person who keeps track of security constraints) allows agile teams to keep track of security constraints and to quickly react to security-relevant design changes. Every release under agile software development must be vetted for security, and this incremental vetting must consider security implications as well (e.g., a feature may increase the threat surface or enable new attack vectors).

3.6 Summary

Software lives and evolves. The software development life cycle continues throughout the lifetime of software.
Security must be a first-class citizen during this whole process. Initially, programmers must evaluate the security aspects of the requirement specification and develop a security-aware design with an explicit notion of threats and actors. During the implementation phase, programmers must follow strict coding guidelines and review any modified code. Whenever the code or the requirements change, the system must be tested for functionality, performance, and security using automated testing and targeted security probing. Last but not least, secure software development is an ongoing process and involves continuous software patching and updates, including the secure distribution of said updates.