CSC 223 Study Guide PDF

Summary

This document is a study guide for CSC 223, covering fundamental concepts in version control, shell scripting (SH), and Git. It details version control systems, file systems, terminal commands, and SSH.

Full Transcript

CSC 223 STUDY GUIDE VERSION CONTROL o Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. o Version control systems (VCS) are tools used to track changes to source code (or other collections of files and folders)....

CSC 223 STUDY GUIDE VERSION CONTROL o Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. o Version control systems (VCS) are tools used to track changes to source code (or other collections of files and folders). These tools help maintain a history of changes. They facilitate collaboration. Vcss track changes to a folder and its contents in a series of snapshot. Vcss also maintain metadata like who created each snapshot, messages associated with each snapshot. o Git is a version control system. SHELL (SH) o Shell is a program that takes commands from the keyboard and gives them to the operating system to perform. o File System: File structure is a way of organizing files and directories in a computer. Itis a hierarchical tree-like structure of directories and files The root directory (outermost directory) is the top-level directory in the file system The root directory (outermost directory) is represented by / in Unix-based systems The root directory (outermost directory) is represented by C:\ in Windows systemsfile path is the location of a file or folder in a directory structure. § An example of a file is C:\Users\Documents\file.txt. § The file path is made up of the drive name - e.g. C\\, the folder - e.g. Users\, and the file name - e.g. Documents\file.txt. o File Paths: A filepath is a string that specifies the location of a file or directory in the file system A filepath generally consists of directories and subdirectories separated by a delimiter (e.g., / or \) A filepath can be read left to right to understand its location in the file system relative to the root directory o Terminal commands: Pwd command is used to print the current working directory. Ls command is used to list the files and directories in the current working directory. Ls -lt command is used to list the files and directories in the current working directory in long listing format and sorted by time modified. Nano opens a text editor to create or edit a file. Cd can be used to jump to locations. Cd. Some common shortcuts: § ~ : Home Directory §. : Current Directory §.. : Parent Directory § / : Root Directory Mkdir is used to create a new folder. SHELL (SH) o SSH is a protocol for encrypted remote login and other secure network services. o Using the SSH protocol, you can connect and authenticate to remote servers and services. o You can use SSH to access another computer over a network and execute commands on the other computer. o SSH keys are a way to identify trusted computers and communicate with them without involving passwords. o SSH public keys authenticate a user to a remote server without username and password. o SSH keys are two files that are generated together: a public key and a private key. The private key is kept on the computer you log in from. (Never share your private key). The public key is shared with all the computers you want to communicate with. The filename ending with.pub is your public key. Ls -al ~/.ssh is used to check if you have an SSH key. SSH keys are generated using a command line tool called ssh-keygen o Ssh-keygen -t ed25519 -C "[email protected]" § -t: the type of encryption to use § Ed25519: the encryption type § -C: comment to help you identify the key o To paste your public ssh key on the website you want to use it on: pbcopy < ~/.ssh/id_rsa.pub. GIT (INDIVIDUAL) o You can configure your Git username and email using the following commands, replacing John Doe’s info with your own. $ git config --global user.name "John Doe" $ git config --global user.email [email protected] o Git innit Mark the start of a git workflow. Git init is a one-time command you use during the initial setup of a new local repo. Executing this command will create a new Git repository in the current directory. This will create a new subdirectory named.git that contains all your necessary repository files — a Git repository skeleton. At this point, nothing is tracked yet. The local repository must be linked to a remote repository. The argument for git innit can be the path to a local folder or nothing. o Git remote add origin Used to manage the set of remotes associated with a repository. A remote in Git is a common repository that all team members use to exchange their changes, In most cases, such a remote repository is stored on a code hosting service like github or on an internal server. In contrast to a local repository, a remote typically does not provide a file tree of the project’s current state. Instead, it only consists of the.git versioning data o Git clone Mark the start of a git workflow. Git clone is a command for downloading existing source code from a remote repository (like Github, for example). Cloning a repository downloads a remote repository to a local machine. Github has at that point in time, including all versions of every file and folder for the project. You can push your changes to the remote repository on github, or pull other people’s changes from github. You don’t need to run git init or git remote add origin if you are cloning an existing repository. o Workflow( Individual): In git, there are four main states that your files can reside in. They are: untracked: means that Git sees the file as a new file that has not been committed to the database yet. staged: means that you have marked a modified file in its current version to go into your next commit snapshot. committed: means that the data is safely stored in your local database. pushed: means that your data has been transferred to a remote repository. o Git status Git status is a command that displays the state of the working directory and the staging area. It lets you see which changes have been staged, which haven’t, and which files aren’t being tracked by Git. Status output does not show you any information regarding the committed project history. o Git diff Used along with git status and git log to analyze the current state of a Git repo in comparison to the past. o Git add Git add is a command used to add files to the staging area. Starts tracking changes to a new local file. It tells Git that you want to include updates to a particular file in the next commit. Changes are not recorded until you run git commit. Path or name of a file or directory is the expected argument. o Git commit Git commit is a command used to save the changes to the local repository. Note that git commit alone doesn’t send changes to the remote repository. It only records changes to the local repository. String message describing the changes are the expected arguments for the git commit command. Git commit -m You should run git commit to save your changes to the local repository at the completion of every task. o Git push -u origin main Name of the local branch and the remote repository are the expected arguments for the git push command./ Git push -u. Moves changes from the local repository to the remote repository. o Git log Used to display the commit history of a repository. By default, git log displays the commit hash, the author and the commit message. Git log –oneline gives a more concise view of the commit history. o Git reset Used to undo local changes to the state of a Git repo Reset is the command we use when we want to go back to a previous everything that happened after it. Similar to revert, we first need to find the commit we want to go back to using git log. Next, we need to use the reset command. To reset to earlier commits, use git reset followed by the first seven characters of the of the commit you want to go back to. o Git restore Used to undo changes in the working directory. Used to undo changes in the staging area o Git revert revert The command we use when we want to take a previous commit, and add it as a new commit keeping the log intact. We need to find the point we want to return to. To do that, we need to go through the log. To avoid the very long log list, we are going to use the --oneline per commit showing i) the first seven characters of the commit hash and ii) the option, which gives just one line per commit and the commit message Use it to make a new commit: Git revert HEAD ~ x (x being a number. 1 going back one more, etc)i o The following use internet connectivity: git clone, git push and git pull. o The terminal is a text-based interface to interact with the compute. It can be used to navigate the file system It can be used to create, edit, and delete files and directories It can be used to run programs and scripts It can be used to do all the tasks that can be done using a graphical interface It is less user-friendly than a graphical interface Changes made using the terminal are permanent and cannot be undone Changes made using the terminal impact what is displayed in the graphical interface Commands in the terminal are case-insensitive but arguments are case-sensitive number. 1 going back one more, GIT (TEAM) o Git fetch The git fetch command is used to download changes from a remote repository. It will download any new branches or changes that have been made since the last time you fetched. These changes will be downloaded to your local repository, but they will not be merged into your working directory. o Git merge The git merge command is used to merge changes from one branch into another. It will take the changes from the specified branch and apply them to the current branch. If there are any conflicts between the two branches, Git will prompt you to resolve them before completing the merge. There are two ways to resolve conflicts: manually editing the conflicted files or using a merge tool. § Some popular merge tools include vimdiff, kdiff3, and meld. Conflicts can occur when the same lines of code have been changed in both branches. You will need to manually resolve these conflicts before the merge can be completed. The way to resolve conflicts is to open the conflicted file in your text editor and manually edit the file to resolve the conflicts. Once you have resolved all the conflicts, you can save the file and commit the changes. This manual edit process takes place in the working directory after a merge has been attempted and failed due to conflicts. o Pull requests are a way to propose changes to a repository and collaborate with others. When you create a pull request, you are asking the repository owner to review and merge your changes into the main branch. Moves changes from the remote repository to the local repository To create a pull request, you will need to push your changes to a remote branch and then create the pull request on the Git hosting service. You should run git pull every day before you start working on your project to get your team members’ changes. o Working with Branches Branches are an important concept in Git that allow you to work on different features or fixes in isolation. When you create a new branch, you are creating a new line of development that is independent of the main branch (usually called master or main). To create a new branch, you can use the git checkout -b command. This will create a new branch and switch to it in one step. Git checkout -b Once you have made changes on a branch, you can merge those changes back into the main branch using the git merge command. § Git checkout main § Git merge You can check which branch you are on using the git branch command. GITHUB (COMMUNITY) o Github is a code hosting platform for version control and social network for collaboration. o Your github profile is the place where you can share information about yourself, your projects, and your interests. o You can also use it to follow other developers, discover new code, and contribute to projects. o Repositories and projects: They’re easiest to imagine as a project’s folder. A repository contains all of the project files (including documentation), and stores each file’s revision history. Repositories can have multiple collaborators and can be either public or private. Repositories can contain folders and files, images, videos, spreadsheets, and data sets – anything your project needs. Often, repositories include a README file, a file with information about your project. README files are written in the plain text Markdown language. Github lets you add a README file at the same time you create your new repository. o Fork a repository A fork is a copy of a repository. Forks are used to create a copy of a repository under your account. Forking a repository allows you to freely experiment with changes without affecting the original project. Forks are used to either propose changes to someone else’s project or to use someone else’s project as a starting point for your own idea. After forking the repository, you can clone it to your local machine and make changes. Once you are done, you can create a pull request to propose your changes to the original repository. o Pull Requests Pull requests are used to merge changes from a branch to the main branch Pull requests show differences between the content from both branches. The changes, additions, and subtractions are shown in green and red. As soon as you make a commit, you can open a pull request and start a discussion, even before the code is finished. By using github’s @mention system in your pull request message, you can ask for feedback from specific people or teams, whether they’re down the hall or 10 time zones away. You can even open pull requests in your own repository and merge them yourself. PROGRAMMING IN JAVA o Java is a high-level, class-based, object-oriented programming language. o Java is used in a wide variety of computing platforms from embedded devices and mobile phones to enterprise servers and supercomputers. It is also used in developing games, mobile applications, and enterprise software. o Java Development Kit (JDK): JDK is a software development kit used to develop Java applications and applets. It includes the JRE, an interpreter/loader (Java), a compiler (javac), an archiver (jar), a documentation generator (Javadoc), and other tools needed in Java development. o Java Runtime Environment (JRE): JRE is a software package that provides Java class libraries, along with Java Virtual Machine (JVM), and other components to run applications written in Java programming. o Java Virtual Machine (JVM): JVM is a software that can be ported to different hardware platforms and execute Java bytecode. o Java is very strictly an object-oriented language, whereas Python is a multi-paradigm language. This means that Java is designed around the concept of objects and classes, whereas Python can be used in an object-oriented, procedural, or functional style. Every Java program must be inside a class, and every program must have a main method. In Python, you can write code outside of a class or function. o Java is a statically typed language. This means that you have to declare the type of a variable when you declare it. For example, in Java, you would declare a variable as int x = 5; whereas in Python, you would declare a variable as x = 5. o Java is a compiled language, where Python is an interpreted language. This means that running a Java program is a two-step process: First you compile the Java code into bytecode, and then you run the bytecode on the Java Virtual Machine (JVM). In Python, you run the Python code in one-step that is directly on the Python interpreter. public class HelloWorld { public static void main(String[] args) { System.out.println("Hello, World!"); } } public class HelloWorld { This line declares a class named HelloWorld. In Java, every program must be inside a class. The class name must match the name of the file. In this case, the file is named HelloWorld.java, so the class name must be HelloWorld. The public keyword is an access modifier that specifies that the class is accessible from other classes. Every class in Java must have an access modifier: public, private, protected, or package-private. The {} delimits the body of the class. In Java, {} is used to delimit blocks of code. public static void main(String[] args) { This line declares a method named main. In Java, the main method is the entry point of the program. The public keyword is an access modifier that specifies that the method is accessible from other classes. Similar to the class, every method in Java must have an access modifier. The static keyword means that the method belongs to the class, not an instance of the class. The void keyword means that the method does not return a value. The String[] args parameter is an array of strings that are passed to the main method. System.out.println("Hello, World!"); This line prints “Hello, World!” to the console. The System.out.println method is used to print a string to the console. This is a method of the System.out object, which is a static member of the System class that represents the standard output stream. This is imported by default in every Java program. Note that the println is equivalent to the built-in print() function in Python. Note that semi-colon ; is used to end the statement. This is mandatory in Java. Every statement in Java must end with a semi-colon. In contrast, Python uses whitespace to delimit statements. ABSTRACTIONS o Abstraction is the process of hiding the complex implementation details and showing only the necessary features of an object. In other words, it helps to reduce programming complexity and effort. o Why Abstraction? Reduce Complexity: Abstraction helps to reduce programming complexity and effort. Security: It helps to hide the internal details of an object and show only the necessary features of an object. Code Reusability: Abstraction helps to reuse the code and use it in multiple places. Flexibility: It helps to change the implementation of an object without affecting the rest of the code. o In programming, abstraction is achieved using three main concepts: Data and Variables: Variables are used to store data values. They are used to represent the state of an object. They allow abstracting operations and algorithms from the specific values they operate on. Functions and Methods: Functions and methods are used to define abstract behavior only in terms of inputs and outputs. They also allow abstracting the implementation details of an algorithm. Classes and Objects: Classes and objects are used to define abstract data types. They allow abstracting the data structure and behavior of an object. PRIMITIVE DATA TYPES o Values and Types The atomic indivisible unit of data in computer programming is called a value. Values are the most basic things that a computer program manipulates or calculates. Each value belongs to a type. The type of a value determines its interpretation by the computer and the operations that Can be performed on it. For example, the value 42 is of type int (short for integer) and the value "Hello World!" is of type str (short for string, so-called because it contains a string of letters). In Java, every value has a type, and every type is defined by the Java programming language. o Primitive Data Types byte: 8-bit signed integer. Range: -128 to 127. short: 16-bit signed integer. Range: -32,768 to 32,767. int: 32-bit signed integer. Range: -2,147,483,648 to 2,147,483,647. long: 64-bit signed integer. Range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. float: 32-bit floating point number. double: 64-bit floating point number. o Other primitive data types include: boolean: Represents true or false values. char: Represents a single 16-bit Unicode character. o Python versus Java Python Java Primitive Python does not have primitive data Java has a clear distinction between Data Types types in the same way Java does. primitive data types (e.g., int, char, float, versus boolean) and objects In Python, everything objects. is an object, including what Java would consider primitives like Primitive data types in Java are not integers and floats. objects and are stored directly in memory. This means even simple data types in Python have methods and are more They are more efficient but have fewer versatile, though capabilities compared to objects. potentially less efficient in terms of memory. Wrappers In Python, since everything is an Java has wrapper classes for each and object, there is no need for separate primitive type (e.g., Integer for int, Float Autoboxing wrapper classes or for float). autoboxing. These wrappers allow primitive values to be treated as objects when necessary. Java also supports autoboxing, which automatically converts between primitives and their corresponding wrapper types. Fixed-size In contrast, Python’s objects Java uses fixed-size memory allocation Memory (including what would be primitives in for primitive types, which makes Allocation Java) are dynamically allocated, memory usage predictable and efficient. meaning their memory size can vary. For example, an int always uses 4 bytes, Python integers and floats use more a float uses 4 bytes, etc. memory than their Java counterparts because they are objects and include metadata. o In Java, there are two types of type conversion: Automatic Type Conversion: Java supports automatic type conversion when values of different types are combined in an expression. This is known as type promotion. Explicit Type Conversion (Type Casting): Java also supports explicit type conversion, known as type casting, which allows you to convert a value from one type to another. o Automatic Type Conversion (Type Promotion) When values of different types are combined in an expression, Java automatically promotes the values to a common type. o The rules for type promotion are as follows: If one of the values is a double, the other value is converted to adouble If one of the values is a float, the other value is converted to a float If one of the values is a long, the other value is converted to a long Otherwise, both values are converted to an int. o Explicit Type Conversion Java allows you to explicitly convert a value from one type to another using type casting. Type casting is done by placing the desired type in parentheses before the value to be converted. Type casting can result in data loss if the value being converted cannot be represented in the target type. FIXED-SIZE REFERNCE DATA TYPES o Reference types store references (memory addresses) to objects rather than the objects themselves. This allows Java to manage memory more efficiently and handle complex data structures. o Java also has non-primitive data types, which are called reference types. These include: Arrays: A collection of elements of the same type. Classes: User-defined data types that can have fields, methods, and constructors. Interfaces: Similar to classes but with only method signatures and no implementation. Enums: A special type that defines a set of constants. Strings: A sequence of characters. Arrays: An array is a collection of elements of the same type stored in contiguous memory locations. Arrays in Java are fixed in size, meaning the number of elements in an array is determined when the array is created and cannot be changed. Arrays are indexed starting from 0. The index represents the position of an element in the array. To access an element in an array, you use the index enclosed in square brackets after the array name (e.g., arr). To access subarrays or slices of an array, you can use the Arrays.copyOfRange() method or loop through the array and copy elements to a new array. o Declaring and initializing arrays: To declare an array in Java, you specify the type of elements the array will hold, followed by square brackets [] and the array name. You can initialize the array with values using an array initializer enclosed in curly braces {}. int[] numbers = {10, 20, 30, 40, 50}; o Accessing Array Elements You can access elements in an array using the index enclosed in square brackets [] after the array name. The index starts from 0 for the first element and increments by 1 for each subsequent element. int firstElement = numbers; Array Length int length = numbers.length Iterating over arrays int[] numbers = {10, 20, 30, 40, 50}; for (int i = 0; i < numbers.length; i++) { System.out.println(numbers[i]);} Strings String class that represent sequences of characters. Strings in Java are immutable, meaning their values cannot be changed once they are created. Java provides a rich set of methods for working with strings, such as concatenation, substring extraction, searching, and more. Strings are part of the Java standard library and do not need to be imported explicitly. You can create strings using string literals or the String class constructor. o Creating strings You can create strings in Java using string literals or the String class constructor. String str1 = "Hello, World!"; o Read string length You can get the length of a string using the length() method of the String class. The length() method returns the number of characters in the string. int length = str.length(); o Accessing Characters To read an individual character from a string, you can use the charAt() method. The charAt() method returns the character at a specified index in the string. char lastChar = str.charAt(str.length() - 1); o Substring Extraction You can extract a substring from a string using the substring() method. The substring() method takes the starting index and optionally the ending index of the substring to be extracted. String str = "Hello, World!"; String subStr = str.substring(7); // subStr = "World!" o String concatenation You can concatenate strings in Java using the + operator or the concat() method of the String class. o Comparing to Numeric Types You can compare strings in Java using the equals() method or the compareTo() method of the String class. The equals() method checks if two strings have the same content. The compareTo() method compares two strings lexicographically. o Converting to Numeric Types You can convert strings to numeric types such as integers and doubles using the parseInt() and parseDouble() methods of the Integer and Double classes, respectively. o String Formatting You can format strings in Java using the String.format() method, which allows you to create formatted strings with placeholders for variables. String name = "Alice"; int age = 30; String message = String.format("Hello, %s! You are %d years old.", name, age); // message = "Hello, Alice! You are 30 years old." Enums o An enum in Java is a special data type that defines a set of constants. o Enums are used to represent fixed values that are known at compile time. Enum constants are implicitly static and final, meaning they cannot be changed once they are defined. o Enums are defined using the enum keyword, followed by the enum name and a list of constants enclosed in curly braces {}. Each constant is separated by a comma. o Enums are useful for representing a fixed set of values and can improve code readability and maintainability by providing meaningful names for constants. Dynamic-Size Reference Data Types o Collections in Java are dynamic-size data structures that store elements of different types and provide additional functionality for managing and manipulating the data. o Java provides a rich set of collection classes in the java.util package that implement various data structures such as lists, sets, maps, queues, and more. o Array List: The ArrayList class in Java implements the List interface and provides a dynamic array that can grow or shrink in size. ArrayList allows duplicate elements and maintains the insertion order of elements. The List interface in Java represents an ordered collection of elements that allows duplicate elements. Lists maintain the insertion order of elements and provide methods to add, remove, and access elements by index. o Hash Set: The HashSet class in Java implements the Set interface and provides a collection of unique elements. HashSet does not allow duplicate elements and does not maintain the insertion order of elements. The Set interface in Java represents a collection of elements that does not allow duplicate elements. Sets do not maintain the insertion order of elements and provide methods to add, remove, and check for the presence of elements. o Hash Map The HashMap class in Java implements the Map interface and provides a collection of key-value pairs. HashMap allows null keys and values, and does not maintain the insertion order of elements. The Map interface in Java represents a collection of key-value pairs where each key is unique. Maps provide methods to add, remove, and access elements by key. Java provides several implementations of the Map interface, such as HashMap, TreeMap, and LinkedHashMap. Classes: User Defined Types o Classes in Java are the blueprints for objects. o They define the structure and behavior of objects. o A class is a template for objects, and an object is an instance of a class. o Declaring a class: A class in Java is declared using the class keyword followed by the class name. The class name should start with an uppercase letter. o Class Member: Attributes (aka Fields): These are variables declared inside a class. They represent the state of an object. Methods: Methods are functions defined inside a class. They represent the behavior of an object. Constructors: Constructors are special methods used to initialize objects. Nested Classes: A class can be defined inside another class. These are called nested classes. o Creating Objects: To create an object of a class, we use the new keyword followed by the class name and constructor arguments. o Attributes Attributes are variables declared inside a class. They represent the state of an object. Attributes are also known as fields. The types of attributes are: § Instance Variables: These are non-static variables declared inside a class. Each object of the class has its own copy of instance variables. § Static Variables: These are class-level variables declared with the static keyword. They are shared among all objects of the class. § Final Variables: These are constants declared with the final keyword. Their value cannot be changed once initialized. o Methods are functions defined inside a class. They represent the behavior of an object. Methods can be of two types: Instance Methods: § These methods operate on the instance variables of the class. § They are called on objects of the class. Static Methods: § These methods are declared with the static keyword. § They can access only static variables of the class. o Constructors: Constructors are special methods used to initialize objects. They have the same name as the class and do not have a return type. Constructors are called when an object is created using the new keyword. o Default Constructor: If a class does not have any constructor defined, Java provides a default constructor that initializes the object with default values. o Parametrized Constructor: A parameterized constructor is a constructor with parameters. It is used to initialize the object with the specified values. o This Keyword: The this keyword in Java is a reference to the current object. It is used to refer to the current instance of the class. The this keyword is used to differentiate between instance variables and parameters with the same name. The this keyword is used to access instance variables and methods of the current object. Operations o An expression is a combination of values, variables, and operators. A value all by itself is considered an expression, and so is a variable, so the following are all legal expressions: 42, x, x+1 o When you type an expression at the prompt, the interpreter evaluates it, which means that it finds the value of the expression. o A statement is a unit of code that has an effect, like creating a variable or displaying a value. o Other types of statements include: Import statements: Import a module. If statements: Execute code depending on the value of a condition. For statements: Execute code for each item in a sequence. While statements: Execute code while a condition is true. Def statements: Define a function. Return statements: Exit a function and return a value. Break statements: Exit a loop. Continue statements: Skip the rest of the loop body. o A program is a sequence of statements. If there is more than one statement, the results appear one at a time as the statements execute. Arithmetic Operators o Comparison with Python: Both Python and Java use the same basic arithmetic operators: Addition ( + ): Adds two operands. Subtraction ( - ): Subtracts the second operand from the first. Multiplication ( * ): Multiplies two operands. Division ( / ): Divides the first operand by the second. Modulus ( % ): Returns the remainder of a division operation. o Division The / operator performs division, but the result depends on the operand types: § If both operands are integers, integer division is performed, and the result is an integer. § If either operand is a floating-point number (float or double), floating-point division is performed, returning a float or double. o Exponentiation Java does not have a built-in exponentiation operator. Instead, you use the Math.pow() method to perform exponentiation. o Unary Operators Java also supports unary operators + and -, as well as the increment (++) and decrement (--) operators. o Modulus Operator The % operator also returns the remainder, but the result has the same sign as the dividend. o Comparison operators are used primarily with primitive types and do not support chaining like Python. o Logical Operators: o Membership Operators: o Identity Operators: o Bitwise Operators o Operator Overloading: Does not support operator overloading. Operators have fixed behavior based on the data types they are used with. o String Functions o Mathematical Functions o Input or Output Functions: o To read data from a file in Java, you can use the FileInputStream and BufferedReader classes. The FileInputStream class is used to read bytes from a file, and the BufferedReader class is used to read text from a character-input stream. o To write data to a file in Java, you can use the FileOutputStream and BufferedWriter classes. The FileOutputStream class is used to write bytes to a file, and the BufferedWriter class is used to write text to a character-output stream. Testing in Java o Unit testing is a software testing technique where individual units or components of a software application are tested in isolation. o The goal of unit testing is to validate that each unit of the software performs as expected. In Java, unit testing is commonly done using the JUnit framework. o JUnit is a popular testing framework for Java that provides annotations and assertions to write and run unit tests. o JUnit tests are written as methods in a test class and are executed by a test runner. o To run JUnit tests in Java, you can use an IDE that supports JUnit, such as IntelliJ IDEA or Eclipse. o You can also run tests from the command line using the java command with the JUnit platform launcher. Control flow o Programs (multiple lines of code) run each line of code sequentially top to bottom o Statements (single line of code) run right to left Expression on right hand side of assignment/boolean operator is executed first Then, left hand expression of assignment operator is executed o Expressions ( < 1 line of code) run as per rules of precedence; for operators of equal precedence, expressions run left to right. o Conditionals: The if statement is used to execute a block of code only if a specified condition is true. If the condition is false, the code block is skipped. The if-else statement is used to execute one block of code if a specified condition is true and another block of code if the condition is false. The if-else if-else statement is used to execute one block of code if a specified condition is true, another block of code if a different condition is true, and a default block of code if none of the conditions are true. Nested if statements are if statements inside another if statement. They are used to test multiple conditions in sequence. The inner if statement is executed only if the outer if statement’s condition is true. o Ternary Operators: The ternary operator ? : is a shorthand way of writing an if-else statement. It is used to assign a value to a variable based on a condition. (variable = (condition) ? value1 : value2;) o Switch statement: The switch statement is used to execute different blocks of code based on the value of an expression. It is an alternative to using multiple if-else if-else statements. o Loops The for loop is used to iterate over a sequence of elements a fixed number of times. It consists of three parts: initialization, condition, and increment/decrement. The while loop is used to execute a block of code repeatedly as long as a specified condition is true. The loop continues to execute until the condition becomes false. The do-while loop is similar to the while loop, but the condition is checked after the block of code is executed. This means that the block of code is executed at least once, even if the condition is false. The break statement is used to exit the loop prematurely. It is often used to terminate the loop when a certain condition is met. When the break statement is encountered, the loop is terminated, and the control is transferred to the statement following the loop. The continue statement is used to skip the current iteration of the loop and continue with the next iteration. It is often used to skip certain elements in the loop based on a condition. The return statement is used to exit a method prematurely. When the return statement is encountered inside a loop, the loop is terminated, and the control is transferred back to the caller of the method. Nested loops are loops inside another loop. They are used to perform repetitive tasks that require multiple levels of iteration. Nested loops can be of any type, such as for, while, or do-while. Exceptions o The Throwable class is the superclass of all exceptions and errors in Java. o The Exception class is a subclass of Throwable and is the superclass of all checked exceptions. o The RuntimeException class is a subclass of Exception and is the superclass of all unchecked exceptions. o Checked Exceptions: Checked exceptions are exceptions that are checked at compile time. These exceptions are subclasses of the Exception class but not subclasses of the RuntimeException class. Checked exceptions must be caught or declared in the method signature using the throws keyword. Examples of checked exceptions include IOException, SQLException, ClassNotFoundException, etc. o Unchecked exceptions: Unchecked exceptions are exceptions that are not checked at compile time. These exceptions are subclasses of the RuntimeException class. Unchecked exceptions do not need to be caught or declared in the method signature. Examples of unchecked exceptions include ArithmeticException, NullPointerException, ArrayIndexOutOfBoundsExcept ion, etc. o Exception Handling in Java try Block § The try block is used to enclose the code that might throw an exception. It is followed by one or more catch blocks and an optional finally block. catch Block § The catch block is used to handle the exception that is thrown in the try block. It contains the code that handles the exception. A try block can have multiple catch blocks to handle different types of exceptions. finally Block § The finally block is used to execute the code that should be run regardless of whether an exception is thrown or not. It is executed after the try block and any associated catch blocks. o In addition to the built-in exceptions provided by Java, you can create your own custom exceptions by extending the Exception class or one of its subclasses. o Custom exceptions are useful for handling application-specific errors or conditions. TREES o Trees are one of the most important data structures in computer science. They are a non- linear data structure that store data in a hierarchical manner. o Trees offer efficient insertion, deletion, and search operations, making them a versatile data structure for a wide range of applications. o Trees are ubiquitous in computer science and are used for a wide variety of applications, including file systems, parsing languages, and artificial intelligence to plan out strategies and make decisions. o Terminology Nodes: The individual elements in a tree that store data. These are depicted as circles in the diagram. They are very similar to linked list nodes. Edges: The connections between nodes. These are depicted as lines in the diagram. Root Node: The topmost node in a tree. It is the only node in the tree that has no parent. Parent Node: A node that has child nodes. Each node in the tree, except the root node, has exactly one parent node. Child Node: A node that is connected to a parent node. Each node in the tree can have zero or more children. Leaf: Nodes that have no child nodes. They are the nodes at the bottom of the tree. Path: A sequence of nodes connected by edges. Height of a tree: The length of the longest path from the root to a leaf. Length is measured in the number of edges. Height of a node: The length of the longest path from the node to a leaf. Length is measured in the number of edges. Depth: The length of the path from the root to a node. Length is measured in the number of edges. Level: The depth of a node in the tree. The root node is at level 0. Subtree: A tree that is part of a larger tree. It consists of a node and all its descendants. Internal Node: A node that has at least one child node. Essentially, any node that is not a leaf. External Node: A node that has no child nodes. Essentially, a leaf node. Sibling: Nodes that share the same parent. Ancestors of a node: A node that is on the upward path a given node to the root. Descendant of a node: A node that is on the path from a given node to a leaf. o Types of trees There are many different types of trees, each with its own unique properties and characteristics. Note that we are going the following mathematical notation for trees: § n to represent the number of nodes in the tree, § N to represent the maximum number of children each node can have § h to represent the height of the tree. The most common types of trees Binary trees that have at most two children per node. Binary trees are further classified into different types based on their properties. o Binary Trees: Full Tree § An N-ary tree in which every node has either zero or N children. Complete Tree § An N-ary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible. Degenerate Tree § A tree in which each parent node has only one child node. This is essentially a linked list. Perfect Tree § An N-ary tree in which all interior nodes have N children and all leaves have the same depth or same level. Balanced Tree § A N-ary tree in which the heights of all the subtrees of each node differ by at most one. o Properties of Tress Trees have the following properties: Tree Property: § A tree with n nodes has n−1 edges. § The number of edges in a tree is always one less than the number of nodes. § This is because each node, except the root node, has exactly one parent node. Full Tree Property: § An N-ary tree with height h has at most 𝑵𝒉 nodes. § A full tree is a tree in which each node has exactly N children. Height Property: § The height of an N-ary tree with n nodes is log " (𝑛 + 1) − 1. § The height of a tree is the length of the longest path from the root to a leaf. PROPERTIES OF A BINARY TREE o The maximum number of nodes at level l of a binary tree is 𝟐𝒍 o A tree with n nodes has n-1 edges o A tree with height h has at most 𝟐𝒉$𝟏 − 𝟏 nodes. o The height of a binary tree tree with n nodes is log & (𝑛 + 1) − 1 TRAVERSALS o Traversal is the process of visiting each node in the tree in a specific order. o There are two main types of tree traversal: Breadth-First Traversal (BFS): A traversal algorithm that explores all the nodes at the present depth before moving on to the nodes at the next depth. Breadth-first traversal is also known as level-order traversal. Depth-First Traversal (DFS): A traversal algorithm that explores as far as possible along each branch before backtracking. § Depth-first traversal includes three types of traversals: preorder, inorder, and postorder. o Level Order Traversal (Breath-First Traversal) Level-order traversal is a breadth-first search (BFS) algorithm that visits all the nodes at the present depth before moving on to the nodes at the next depth. Psuedocode for level order traversal: § Create a queue and enqueue the root node. § While the queue is not empty: 2.1. Dequeue a node from the queue. 2.2. Process the node (e.g., print its value). 2.3. Enqueue the left child of the node if it exists. 2.4. Enqueue the right child of the node if it exists. The time complexity of level order traversal is O(n), where n is the number of nodes in the binary tree. The space complexity of level order traversal is O(n), where n is the number of nodes in the binary tree. o Preorder Traversal (DFS: Root-Left-Right) Visit the root node first, then recursively do a preorder traversal of the left subtree, followed by a preorder traversal of the right subtree. The preorder traversal of a binary tree is a depth-first traversal that visits the root node first, followed by the left subtree and then the right subtree. The pseudocode for preorder traversal is as follows: Visit the root node. Recursively do a preorder traversal of the left subtree. Recursively do a preorder traversal of the right subtree. Preorder traversal has the following applications: § Expression Tree: Preorder traversal is used to convert an infix expression to a prefix expression (also known as Polish notation). e.g., A + B to + A B. Prefix expressions are easier to evaluate using a stack-based algorithm. § Clone a Tree: Preorder traversal is used to clone a binary tree. This is because the root node is visited first, making it easy to create a copy of the tree and connecting the left and right subtrees. o In-order Traversal (DFS: Left-Root-Right) In-order traversal is a depth-first traversal that visits the left subtree first, followed by the root node, and then the right subtree. The pseudocode for in-order traversal is as follows: Recursively do an in-order traversal of the left subtree. Visit the root node. Recursively do an in-order traversal of the right subtree. In-order traversal has the following applications: § Binary Search Tree (BST): In-order traversal of a BST results in a sorted list of elements. This property is used to find the k-th smallest element in a BST. § Postfix Expression and Evaluation: In-order traversal is used to convert an infix expression to a postfix expression (also known as Reverse Polish notation). e.g., A + B to A B +. Postfix expressions are easier to evaluate using a stack-based algorithm. o Post-order Traversal (DFS: Left-Right-Root) Recursively do a postorder traversal of the left subtree, followed by a postorder traversal of the right subtree, and then visit the root node. The pseudocode for postorder traversal is as follows: Recursively do a postorder traversal of the left subtree. Recursively do a postorder traversal of the right subtree. Visit the root node. Postorder traversal has the following applications: § Expression Tree: Postorder traversal is used to convert an infix expression to a postfix expression (also known as Reverse Polish notation). e.g., A + B to A B +. Postfix expressions are easier to evaluate using a stack-based algorithm. § Delete a Tree: Postorder traversal is used to delete a binary tree. This is because the root node is visited last, making it easy to delete the tree from the leaves up to the root. § Postorder traversal is commonly used in expression trees to evaluate expressions o Insertion Insertion is the process of adding a new node to the binary tree. The new node is inserted based on the value of the node and the properties of the binary tree. The pseudocode for inserting a node in a binary tree that is not a binary search tree is as follows: Create a new node with the given value. If the root is null, set the new node as the root. Else, perform a level order traversal to find the first node that has an empty left or right child. Insert the new node as the left child if the left child is empty, otherwise insert it as the right child. The time complexity of inserting a node in a binary tree is O(n) in the worst case, where n is the number of nodes in the binary tree. The space complexity of inserting a node in a binary tree is O(n) in the worst case, where n is the number of nodes in the binary tree. o Deletion Deletion is the process of removing a node from the binary tree. The node to be deleted is removed based on the value of the node and the properties of the binary tree. The pseudocode for deleting a node in a binary tree is as follows: Find the node to be deleted using a level order traversal or any other traversal method. If the node has no children, simply remove the node. If the node has one child, replace the node with its child. If the node has two children, find the inorder successor or predecessor of the node. Replace the node with the inorder successor or predecessor. The time complexity of deleting a node in a binary tree is O(n) in the worst case, where n is the number of nodes in the binary tree. The space complexity of deleting a node in a binary tree is O(n) in the worst case, where n is the number of nodes in the binary tree. o Searching Searching is the process of finding a specific node in the binary tree based on its value. The pseudocode for searching a node in a binary tree is as follows: Perform a level order traversal or any other traversal method to search for the node. If the node is found, return the node. If the node is not found, return null. The time complexity of searching for a node in a binary tree is O(n) in the worst case, where n is the number of nodes in the binary tree. The space complexity of searching for a node in a binary tree is O(n) in the worst case, where n is the number of nodes in the binary tree. BINARY SEARCH TREE o A binary search tree (BST) is a binary tree data structure that is used to store data in a sorted order. o The binary search tree must at all times satisfy the binary search tree property: The left subtree of a node contains only nodes with values less than or equal to the node’s value. The right subtree of a node contains only nodes with values greater than or equal to the node’s value. Both the left and right subtrees must themselves be binary search trees>. o As a result of this property, the inorder traversal of a binary search tree will always return the nodes in sorted order. o All operations on a binary search tree have a time complexity of O(h), where h is the height of the tree. h is equal to the number of levels in the tree. o For a balanced binary search tree, the height is O(log n), where n is the number of nodes in the tree. o For an unbalanced binary search tree, the height is O(n). o Search Searching for a node in a binary search tree is very similar to searching for a value in a sorted array. o Depth First Search (DFS) o A traversal algorithm that explores as far as possible along each branch before backtracking. o Depth First Search can be implemented using the following steps: If the tree is empty, return False. If the current node is the target node, return True. Recursively search the left subtree. Recursively search the right subtree. o Breadth First Search (BFS) o A traversal algorithm that explores all the nodes at the present depth before moving on to the nodes at the next depth. o Breadth First Search can be implemented using the following steps: If the tree is empty, return False. Create a queue and add the root node to the queue. While the queue is not empty, remove the first node from the queue. § If the current node is the target node, return True. § Add the left child of the current node to the queue if it exists. § Add the right child of the current node to the queue if it exists. o Insertion o Adding a new node to the tree follows the following steps: If the tree is empty, create a new node and set it as the root of the tree. If the tree is not empty § Compare the value of the new node with the value of the current node. § If the value of the new node is less than the value of the current node, move to the left subtree. § If the value of the new node is greater than the value of the current node, move to the right subtree. § Repeat steps 2.1 to 2.3 until you reach a leaf node. Insert the new node as the left child if the value of the new node is less than the value of the leaf node, or as the right child if the value of the new node is greater than the value of the leaf node. o Deletion o Deletion of a node in a binary search tree involves the following steps: If the tree is empty, return the tree. If the value of the node to be deleted is less than the value of the current node, move to the left subtree. If the value of the node to be deleted is greater than the value of the current node, move to the right subtree. If the value of the node to be deleted is equal to the value of the current node, delete the node. § If the node has no children, simply delete the node. § If the node has one child, replace the node with its child. § If the node has two children, find the inorder successor of the node, replace the node with the inorder successor, and delete the inorder successor. § The inorder successor is the smallest node in the right subtree of the node to be deleted. o Update Updating a node in a binary search tree involves deleting the node and inserting a new node with the updated value. The time complexity of inserting a node in a binary search tree is O(h), where h is the height of the tree. The time complexity of deleting a node in a binary search tree is also O(h), where h is the height of the tree. Therefore, the time complexity of updating a node in a binary search tree is O(h). HEAPS A heap is a binary tree with the following properties: o It is a complete binary tree: § All levels of the tree are fully filled except possibly for the last level, which is filled from left to right. o It satisfies one of the heap property, depending on the type of heap: § In a Max-heap, the value of each node is greater than or equal to (≥) the values of its children. § In a Min-heap, the value of each node is less than or equal to (≤) the values of its children. The heap property guarantees that the largest (in a max-heap) or smallest (in a min-heap) value is always at the root, and the tree maintains a specific order based on the type of heap. Array-based Implementation o Heaps can be implemented using arrays or linked lists. The following is an implementation of a max-heap using an array: There are several ways to implement a heap, including using arrays, linked lists, or binary trees. Heaps are commonly implemented using arrays, where the root element is stored at index 0, and the children of the node at index i are stored at indices 2i+1 and 2i+2. Node Position in Array Root 0 Left Child of node at index i 2i + 1 Right Child of node at index i 2i + 2 Parent of node at index i (i-1)/2 This representation allows for efficient access to the parent and children of a node. A heap supports the following operations: 1. Insert: Insert an element into the heap, maintaining the heap property and the complete binary tree property. 2. Extract / Delete: Remove and return the root element of the heap, maintaining the heap property and the complete binary tree property. 3. Heap Sort: Sort an array using a heap. Insert 1. Add the element to the end of the heap, maintaining the complete binary tree property i.e. fill from left to right at the last level. 2. Heapify: Restore the heap property by swapping the element with its parent until the heap property is satisfied. The heapify operation involves moving the element up the heap (in the case of a max-heap) or down the heap (in the case of a min-heap) until the heap property is satisfied. The time complexity of inserting an element into a heap is O(logn), where n is the number of elements in the heap. Step 1 takes O(1) time, and step 2 (heapify) takes O(logn) time to go from the leaf to the root. Read / Peek o Reading or peeking at the root element of a heap involves returning the value of the root element without removing it from the heap. Update 1. Update the element with the new value. 2. Heapify: Restore the heap property by moving the element up or down the heap until the heap property is satisfied. Extract / Delete o Extracting the maximum element from a max-heap or the minimum element from a min- heap involves the following steps: 1. Replace the root element with the last element in the heap (rigthmost element in the last level). 2. Remove the last element from the heap. 3. Heapify: Restore the heap property by moving the new root element down the heap until the heap property is satisfied. Heapify o Heapify is an operation that restores the heap property in a heap. It involves moving an element up or down the heap to satisfy the heap property. o Generally, heapify is used after an insert or extract operation to maintain the heap property. o Heapify generally starts from the root and moves the element down the heap until the heap property is satisfied. o In case of a max-heap, heapify involves moving a node down the heap until it is greater than or equal to its children. Applications o Heaps are used in various applications, including: Priority Queues o A priority queue is an abstract data type, often implemented using a heap, that allows elements to be inserted and extracted based on their priority. o The element with the highest priority is at the root of the heap (max-heap) or the element with the lowest priority is at the root of the heap (min-heap). o Priority queues are commonly used in algorithms and applications where elements need to be processed based on their priority, such as task scheduling, Dijkstra’s algorithm, Prim’s algorithm, Huffman coding, and more. o Note that the terms “max-priority queue” and “min-priority queue” are used interchangeably with “max-heap” and “min-heap,” respectively, when implementing priority queues using heaps. o Please note that priority queue are distinct from queues or stacks, which are first-in-first- out (FIFO) and last-in-first-out (LIFO) data structures, respectively. Heap Sort Heap sort is a sorting algorithm that uses a heap to sort an array. It involves converting the array into a heap using the heapify operation and then extracting the root element of the heap until the heap is empty. Heap sort involves the following steps: 1. Build Heap: Convert the array into a heap using the heapify operation. 2. Extract Max/Min: Repeatedly extract the maximum (in a max-heap) or minimum (in a min-heap) element from the heap until the heap is empty. o Heap sort has a time complexity of O(n log n) and is an in-place sorting algorithm with a space complexity of O(1). Graphs A graph is a data structure that consists of a finite set of vertices (nodes) and a collection of edges that connect pairs of vertices. Graphs are used to model relationships between objects, such as cities connected by roads, friends connected by social networks, or dependencies between tasks in a project. Graph Terminology o Vertex (Node): A fundamental unit of a graph. It represents an entity, such as a city, person, or object. o Edge: A connection between two vertices. It represents a relationship between the entities represented by the vertices. o Directed Graph (Digraph): A graph in which the edges have a direction associated with them. The edges are ordered pairs of vertices. o Undirected Graph: A graph in which the edges do not have a direction associated with them. The edges are unordered pairs of vertices. o Weighted Graph: A graph in which each edge has an associated weight or cost. The weight can represent distance, time, cost, or any other relevant metric. o Path: A sequence of vertices connected by edges. It represents a route from one vertex to another. o Cycle: A path that starts and ends at the same vertex. o Connected Graph: A graph in which there is a path between every pair of vertices. o Disconnected Graph: A graph in which there are vertices that are not connected by any path. o Degree of a Vertex: The number of edges incident to a vertex. In a directed graph, the degree is divided into the in-degree (number of incoming edges) and out-degree (number of outgoing edges). o Adjacency: Two vertices are adjacent if there is an edge connecting them. o Adjacency Matrix: A matrix representation of a graph in which the rows and columns correspond to vertices, and the entries indicate whether an edge exists between the vertices. o Adjacency List: A list representation of a graph in which each vertex has a list of adjacent vertices. Graph Representation There are two common ways to represent a graph: adjacency matrix and adjacency list. Adjacency Matrix o An adjacency matrix is a 2D array of size V×V, where V is the number of vertices in the graph. If the value of the matrix element at row i and column j is 1, it indicates an edge between vertices i and j. o For weighted graphs, the matrix can store the weight of the edge instead of 1. There are two types of adjacency matrices: o Unweighted Graphs: The matrix elements are 0 or 1, indicating the absence or presence of an edge. o Weighted Graphs: The matrix elements are the weights of the edges. o Similarly, for directed graphs, the adjacency matrix can be asymmetric, with different values for the in-degree and out-degree. o Adjacency matrices are useful for dense graphs, where the number of edges is close to the maximum possible number of edges. Adjacency List o An adjacency list is a collection of lists or arrays used to represent the graph. Each vertex has a list of adjacent vertices. This representation is more space-efficient than an adjacency matrix for sparse graphs. In an adjacency list: o For an undirected graph, each edge (u,v) is stored twice: once in the list of u and once in the list of v. o For a directed graph, the edge (u,v) is stored only in the list of u. Adjacency lists are useful for sparse graphs, where the number of edges is much less than the maximum possible number of edges. Graph Traversal o Graph traversal is the process of visiting all the vertices and edges of a graph. Two common graph traversal algorithms are Depth-First Search (DFS) and Breadth-First Search (BFS). Depth-First Search (DFS) o DFS is an algorithm that explores as far as possible along each branch before backtracking. It uses a stack to keep track of the vertices to visit next. DFS Algorithm o Start at a vertex v and mark it as visited. o Recursively visit all adjacent vertices of v that have not been visited. o Repeat step 2 for each unvisited adjacent vertex. Example of DFS o Consider the following graph: 0 -- 1 | | 3 -- 2 o Starting at vertex 0, a DFS traversal would visit the vertices in the order 0, 1, 2, 3. Breadth-First Search (BFS) o BFS is an algorithm that explores all the vertices at the present depth before moving on to the vertices at the next depth. It uses a queue to keep track of the vertices to visit next. BFS Algorithm o Start at a vertex v and mark it as visited. o Enqueue v into a queue. o While the queue is not empty, dequeue a vertex u from the queue. o Visit all unvisited adjacent vertices of u and mark them as visited. Enqueue them into the queue. Example of BFS o Consider the same graph as before: 0 -- 1 | | 3 -- 2 o Starting at vertex 0, a BFS traversal would visit the vertices in the order 0, 1, 3, 2. Graph Algorithms o Graphs are used to solve a variety of problems, and several algorithms have been developed to work with graphs efficiently. Some common graph algorithms include: o Dijkstra’s Algorithm: Finds the shortest path between two vertices in a weighted graph. o Bellman-Ford Algorithm: Finds the shortest path between two vertices in a weighted graph with negative edge weights. o Prim’s Algorithm: Finds the minimum spanning tree of a connected, undirected graph. o Kruskal’s Algorithm: Finds the minimum spanning tree of a connected, undirected graph. o Floyd-Warshall Algorithm: Finds the shortest paths between all pairs of vertices in a weighted graph. o Topological Sorting: Arranges the vertices of a directed acyclic graph in a linear order. o Strongly Connected Components: Identifies the strongly connected components in a directed graph.

Use Quizgecko on...
Browser
Browser