Data Structures In Depth Using C++ PDF
Document Details
Uploaded by DistinctiveConnemara146
Zagazig University
2023
Mahmmoud A. Mahdi
Tags
Summary
This textbook, "Data Structures in Depth Using C++", by Mahmmoud A. Mahdi, covers fundamental data structures and their implementations in C++. It includes an introductory section on algorithm analysis and efficiency, emphasizing practical examples and exercises for readers to gain a thorough understanding of the subject.
Full Transcript
Data Structures In Depth Using C++ Mahmmoud A. Mahdi Draft v6.10 Copyright © 2023 Dr. Mahmmoud Mahdi C OMPUTER S CIENCE D EPART, FACULTY OF C OMPUTERS , Z AGAZIG U NIVERSITY www.qursaan.com First release, October 2023 To my dear father, may God forgive him ...
Data Structures In Depth Using C++ Mahmmoud A. Mahdi Draft v6.10 Copyright © 2023 Dr. Mahmmoud Mahdi C OMPUTER S CIENCE D EPART, FACULTY OF C OMPUTERS , Z AGAZIG U NIVERSITY www.qursaan.com First release, October 2023 To my dear father, may God forgive him Contents 1 Introduction................................................. 11 1.1 Mathematics Review 12 1.1.1 Series.......................................................... 12 1.2 Algorithm Analysis Foundations 14 1.2.1 What is an algorithm?........................................... 14 1.2.2 Simple Programming Problems vs. Algorithm Problems............... 14 1.2.3 Algorithm Design Techniques..................................... 14 1.2.4 Important Problem Types......................................... 15 1.3 Analysis of Algorithm Efficiency 16 1.3.1 What is the Algorithm Analysis.................................... 16 1.3.2 Algorithm Evaluation............................................ 16 1.3.3 Theoretical analysis of Time Efficiency.............................. 16 1.3.4 Empirical Analysis of Time Efficiency................................ 17 1.3.5 Understanding Order of Growth................................... 18 1.3.6 Best-Case, Average-Case, Worst-Case Analysis...................... 18 1.3.7 Asymptotic order of growth...................................... 19 1.4 Example Analysis of Non-recursive Algorithms 20 1.4.1 Analysis of simple for loop........................................ 20 1.4.2 Analysis of a Nested for Loop..................................... 22 1.5 Introduction to Data Structures 24 1.5.1 Importance of Data Structures.................................... 24 1.5.2 Types of Data Structure.......................................... 24 1. Introduction Objective In this chapter, we will focus on the following key objectives: Understanding Data Structures: Gain a solid understanding of fundamental concepts and the significance of data structures in programming and algorithm analysis. Asymptotic Notation: Learn how to utilize asymptotic notation to describe the execution time and efficiency of algorithms. Time Estimation: Develop the skills to estimate the time required to execute a program. 12 Chapter 1. Introduction 1.1 Mathematics Review In this section, we lay the mathematical foundation necessary for algorithm analysis. You will be introduced to fundamental mathematical formulas that are essential for your under- standing. These formulas are not only theoretical, but serve as powerful tools in analyzing algorithms. 1.1.1 Series In the realm of mathematics, series play a crucial role in summing and evaluating sequences of numbers. We will explore some fundamental series that will be particularly useful in algorithm analysis. General Algebraic Manipulation Formula This is a fundamental formula that allows us to manipulate algebraic expressions within a series: N ∑ f (n) = N f (N) (1.1) i=1 Geometric Series Geometric series involve a common ratio between terms. They are often encountered in algorithm analysis and beyond. One key formula is the following. N ∑ 2i = 2N+1 − 1 (1.2) i=0 along with N kN+1 − 1 ∑ ki = k−1 (1.3) i=0 Arithmetic Series Arithmetic series are the sum of consecutive integers. They appear frequently in various mathematical and computational contexts. Here are a couple of important formulas: N N(N + 1) N 2 ∑i= ≈ (1.4) i=1 2 2 1.1 Mathematics Review 13 More explanation for Equation 1.4: end (end − start + 1)(end + start) ∑ i= i=start 2 N N(N + 1)(2N + 1) N 3 ∑ i2 = 6 ≈ 3 (1.5) i=1 N N k+1 ∑ ik ≈ |k + 1| , k ̸= −1 (1.6) i=1 Harmonic Series The harmonic series is a famous mathematical series that arises in various contexts, including algorithm analysis. One of its approximations is [Wei14]: N 1 ∑i ≈ loge N (1.7) i=1 Example 1.1 — Summation of Consecutive Integers to 2N. 2N (2N)(2N + 1) ∑i= 2 i=1 Example 1.2 — Sum of Consecutive Integers with an Offset. N+a (N + a + 1)(N + a) ∑ i= 2 i=0 In the following sections, we will dive deeper into the applications of these mathematical concepts in the analysis of algorithms. 14 Chapter 1. Introduction 1.2 Algorithm Analysis Foundations 1.2.1 What is an algorithm? An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite time [Lev02]. 1.2.2 Simple Programming Problems vs. Algorithm Problems In the realm of problem-solving and programming, it’s important to distinguish between simple programming problems and algorithm problems. Understanding the differences can help you approach problem-solving more effectively. Simple Programming Problems Simple programming problems typically exhibit the following characteristics: Straightforward Implementation: These problems have solutions that are rela- tively easy to implement. They often involve common programming constructs and techniques. Linear Scan: The problem can be solved by scanning data or elements in a linear manner. There is no need for complex data structures or advanced algorithms. Limited Room for Optimization: Simple programming problems often have solu- tions that are close to optimal. There’s not much room for improving the efficiency of the solution. Algorithm Problems On the other hand, algorithm problems present a different set of challenges: Ambiguity in Problem Solving: Algorithm problems may not have a clear and straightforward path to a solution. Figuring out what to do can be a significant part of the challenge. Complexity in Execution: Even if you know what to do, it may not be clear how to efficiently execute the solution. Simple ideas can be too slow, and you need to find more optimized approaches. Room for Optimization: Algorithm problems often leave room for optimizing your solutions. Finding more efficient algorithms is a key aspect of problem solving. Designing Algorithms: These problems require you to design algorithms from scratch or adapt existing ones creatively. Analyzing Algorithm Efficiency: Understanding and analyzing the efficiency of algorithms is a fundamental skill when dealing with algorithm problems. Recognizing the distinctions between these two types of problem will help you approach problem-solving with a more appropriate mindset and strategy. 1.2.3 Algorithm Design Techniques When solving problems with algorithms, various strategies can be employed. Here are some common algorithm design techniques: 1.2 Algorithm Analysis Foundations 15 Brute Force: This technique involves trying all possible solutions and selecting the best one. Divide and Conquer: It’s about breaking a problem into smaller sub-problems, solving them, and combining their solutions. Greedy Approach: This strategy makes locally optimal choices at each step to find a global solution. Dynamic Programming: It’s used for problems with overlapping subproblems, where solutions to subproblems are reused. Iterative Improvement: Problems are solved by iteratively refining a solution until an optimal solution is reached. Backtracking: It’s useful for solving problems incrementally, undoing steps when they lead to a dead end. 1.2.4 Important Problem Types In the realm of algorithmic problem solving, various problem types arise. It’s essential to recognize and categorize them to apply suitable algorithms: Sorting: Problems related to arranging elements in a specific order. Searching: Finding specific elements within a dataset. String Processing: Dealing with operations on strings, such as searching or pattern matching. Graph Problems: Problems that involve relationships and connections between data points. Combinatorial Problems: Challenges related to selecting and arranging items from a set. Geometric Problems: Issues involving shapes, distances, and positions in space. Numerical Problems: Problems that deal with numerical computations and opera- tions. 16 Chapter 1. Introduction 1.3 Analysis of Algorithm Efficiency 1.3.1 What is the Algorithm Analysis Definition: The process of evaluating and comparing algorithms in terms of their resource usage (time and space) [Lev02]. Objective: Choose the most efficient algorithm for a given problem. 1.3.2 Algorithm Evaluation When evaluating an algorithm, we typically consider two main aspects: 1. Time Efficiency: How quickly does the algorithm solve the problem? 2. Space Efficiency: How much memory does the algorithm require to operate? In addition to assessing an algorithm’s performance, we also ask whether there is room for improvement: 1. Lower Bounds: Are there theoretical limits on how efficient an algorithm for this problem can be? 2. Optimality: Can we design an algorithm that is better in terms of time and space efficiency? Evaluation Methods To evaluate the algorithms, there are two main approaches: Theoretical Analysis: We analyze algorithms based on mathematical and theoretical principles to understand their performance. Empirical Analysis: We assess algorithms through practical experiments and real- world testing to observe their behavior in various scenarios. 1.3.3 Theoretical analysis of Time Efficiency Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size [Lev02]. Basic operation: the operation that contributes most towards the running time of the algorithm Estimating Running Time with Basic Operations We can estimate the running time T (n) of a program that implements an algorithm on a particular computer using the formula (Equation 1.8, Figure 1.1): T (n) ≈ copC(n) (1.8) In this formula: cop : represents the execution time of the basic operation of the algorithm on a specific computer. It quantifies how long it takes to perform the most fundamental operation. 1.3 Analysis of Algorithm Efficiency 17 Figure 1.1: Running-Time Equation Explanation C(n): is the count of how many times this basic operation needs to be executed within the algorithm for a given input size n. By multiplying the execution time of the basic operation (cop ) by the number of times it is executed (C(n)), we get an approximation of the program’s running time T (n). This estimation is useful for understanding the efficiency of the algorithm and predicting how it will perform on the chosen computer. Input Size & Basic Operation To analyze the efficiency of algorithms, we often consider the relationship between the input size and the basic operations performed. This allows us to understand how the algorithm’s performance scales with different problem sizes. For example, refer to Table 1.1 for illustrative examples. Table 1.1: Examples of Input Size and Basic Operation Relationships Problem Input Size Measure Basic Operation 1 Searching for a key in a Number of list items, i.e., Key comparison list of n items n 2 Multiplication of two Matrix dimensions or total Multiplication of two matrices number of elements numbers 3 Checking primality of a n’s size = number of digits Division given integer n (in binary representation) 4 Typical Graph problem #vertices and/or edges Visiting a vertex or traversing an edge 1.3.4 Empirical Analysis of Time Efficiency This approach involves practical experiments with real-world data. Here is how it is done: Selecting a Specific Sample of Inputs: To understand how an algorithm performs in practice, we choose a representative set of inputs. These inputs should cover a range of scenarios and problem sizes. 18 Chapter 1. Introduction Using Physical Units of Time: We measure the algorithm’s execution time us- ing a physical unit, such as milliseconds. This measurement provides a tangible understanding of the time required for an algorithm to complete a task. Counting Actual Basic Operations: In empirical analysis, we do not rely solely on theoretical predictions. Instead, we count the actual number of executions of basic operations during the algorithm’s run. This allows us to quantify the algorithm’s efficiency more accurately. Analyzing the Empirical Data: The collected data is then analyzed to determine the real-world performance of the algorithm. This analysis provides valuable information about the performance of the algorithm under various conditions. Empirical analysis complements theoretical analysis and helps assess how algorithms perform when faced with actual data and real-time constraints. 1.3.5 Understanding Order of Growth The order of growth is a fundamental concept that allows us to describe how an algorithm’s efficiency scales as the input size (n) approaches infinity (n → ∞). Example Scenarios to Consider: How much faster will the algorithm run on a computer that is twice as fast? The order of growth helps us predict how the algorithm’s performance scales with improvements in hardware. If the order of growth is linear (O(n)), for instance, doubling the computer’s speed might roughly halve the execution time. How much longer does it take to solve a problem of double the input size? If we understand the order of growth, we can estimate the impact of increasing the problem size. For an algorithm with quadratic growth (O(n2 )), doubling the input size could make the algorithm take four times longer. In summary, the order of growth is a powerful tool for assessing an algorithm’s scalability and predicting its behavior as the input size grows infinitely. It helps us focus on the most significant aspects of algorithm efficiency, which is vital in real-world applications. 1.3.6 Best-Case, Average-Case, Worst-Case Analysis We recognize that the efficiency of some algorithms depends on the specific characteristics of the input. To gain a comprehensive understanding of an algorithm’s performance, we explore three essential scenarios: Worst Case (Cworst (n)): This scenario involves identifying the maximum resource utilization over all possible inputs of size n. Worst-case analysis allows us to determine how the algorithm performs when faced with the most challenging input. Best Case (Cbest (n)): In contrast, the best-case scenario seeks the minimum resource usage over all inputs of size n. It characterizes the algorithm’s efficiency under the most favorable conditions. Average Case (Cavg (n)): Average case analysis provides a measure of the algorithm performance on "average" inputs of size n. It involves considering the expected resource consumption over a distribution of real-world inputs. 1.3 Analysis of Algorithm Efficiency 19 It is important to note that the "average case" does NOT represent the average between the best and worst cases. Instead, it reflects the number of times a basic operation is expected to be executed on inputs that the algorithm is likely to encounter in practical situations. Each of these scenarios provides a distinct view of an algorithm’s efficiency. Worst-case analysis guarantees that the algorithm will not perform poorly on any input, while best-case analysis illustrates its maximum potential. On the contrary, average case analysis provides insight into its real-world performance expectations. 1.3.7 Asymptotic order of growth We often compare functions in a way that abstracts away constant factors and focuses on how they behave as input sizes become significantly large. This approach is essential for gaining a high-level understanding of algorithm efficiency. Three important notations are used to express these comparisons: O(g(n)): represents the class of functions f (n) that do not grow faster than g(n). In other words, it characterizes an upper bound on the growth rate of f (n), which helps us understand how the algorithm’s efficiency behaves in the worst-case scenario (Figure 1.2). Θ(g(n)): represents the class of functions f (n) that grow at the same rate as g(n). It provides a precise characterization of how an algorithm’s performance scales as input size increases (Figure 1.4). Ω(g(n)): represents the class of functions f (n) that grow at least as fast as g(n). It helps us identify a lower bound on the growth rate of f (n), providing insight into the best-case algorithm performance (Figure 1.3). The asymptotic order of growth enables us to focus on the fundamental behaviors of algorithms, disregarding constant factors and small input sizes. It is a powerful tool for understanding how an algorithm will perform as it faces increasingly large datasets. Basic Asymptotic Efficiency Classes The table in Table 1.2 illustrates the fundamental asymptotic efficiency classes used in algorithm analysis. These notations are essential for characterizing how algorithms perform as their input sizes increase significantly, impacting their performance and scalability in various scenarios. 20 Chapter 1. Introduction Figure 1.2: Big-oh: t(n) ∈ O(g(n)) [Lev02] Figure 1.3: Big-Omega: t(n) ∈ Ω(g(n)) [Lev02] 1.4 Example Analysis of Non-recursive Algorithms In this section, we will explore the analysis of non-recursive algorithms using practical examples. To begin, we will demonstrate the process of transforming loops into mathe- matical series, a fundamental step in understanding and evaluating the efficiency of these algorithms. 1.4.1 Analysis of simple for loop Example 1.3 Analysis of simple for loop 1 for (int i = 1; i