Lecture Notes 3.pdf
Document Details
Uploaded by BrainiestObsidian
Tags
Full Transcript
COMP1210 - Lecture Note 3 Abstract data type An Abstract Data Type (ADT) is a conceptual model or a mathematical model for data types that defines the behaviors or operations that can be performed on data of that type. It specifies what the data type can do, but not how it should be implemented. ADT...
COMP1210 - Lecture Note 3 Abstract data type An Abstract Data Type (ADT) is a conceptual model or a mathematical model for data types that defines the behaviors or operations that can be performed on data of that type. It specifies what the data type can do, but not how it should be implemented. ADTs provide a high-level abstraction for working with data structures, allowing programmers to focus on the logical operations rather than the underlying implementation details. The main components of an ADT are: 1. Data Representation: The way data is stored or organized in memory. The data representation is hidden from the user of the ADT and is implementation-specific. 2. Operations: The set of operations or functions that can be performed on the data. These operations define the behavior of the ADT and are the only way to access and manipulate the data. 3. Invariants: The conditions or constraints that must be satisfied by the data and the operations, ensuring the integrity and validity of the data. ADTs are designed based on the principle of abstraction, which is one of the fundamental concepts in computer science and object-oriented programming. Abstraction allows us to separate the logical view of a data type from its implementation details, providing a clean interface for working with data structures. Some common examples of ADTs include: - List (ordered collection of elements) - Queue (First-In-First-Out) - Stack (Last-In-First-Out) - Priority Queue (elements with priorities) - Set (unordered collection of unique elements) - Map or Dictionary (key-value pairs) - Tree (hierarchical data structure) - Graph (collection of nodes and edges) ADTs are widely used in software development because they promote code reusability, modularity, and maintainability. By defining a clear interface for data structures, ADTs allow different implementations to be swapped out without affecting the rest of the code, as long as the implementation adheres to the specified operations and invariants. When working with ADTs, programmers can focus on the logical operations and algorithms, leaving the implementation details to be optimized or changed as needed, without affecting the overall functionality of the program. 1 1. List - A list is a sequence where elements can be added or removed from any position dynamically, unlike arrays that have a fixed size once created. - Lists allow you to access elements directly by their position (index) and extract sub-sequences (slices) efficiently, which facilitates operations like iteration, search, and manipulation of sub-lists. - Lists are commonly used when the order of elements matters, and there is a need for dynamic resizing. They are used in a wide range of applications, including implementing other data structures like stacks and queues, managing collections of items in various algorithms, and storing sequential data in applications 2. Queue - A linear data structure that follows the First-In-First-Out (FIFO) principle. - Basic operations: enqueue (add an element to the rear) and dequeue (remove an element from the front). - Used in scenarios where the order of processing elements is important, such as job scheduling, printer spooling, and breadth-first search algorithms. 3. Stack - A linear data structure that follows the Last-In-First-Out (LIFO) principle. - Basic operations: push (add an element to the top) and pop (remove an element from the top). - Used in scenarios where the last added element needs to be processed first, such as expression evaluation, undo/redo operations, and depth-first search algorithms. 4. Priority Queue - Similar to a regular queue, but each element has an associated priority. - Elements are served based on their priority, with the highest priority element being served first. - Used in scenarios where elements need to be processed based on their importance or urgency, such as job scheduling, event handling, and graph algorithms like Dijkstra's shortest path algorithm. 5. Double-ended Queue (Deque) - A generalized version of a queue, where elements can be inserted or removed from both ends (front and rear). - Supports operations like push_front, push_back, pop_front, and pop_back. - Used in scenarios where elements need to be added or removed from both ends, such as sliding window algorithms, undo/redo operations, and breadth- first search algorithms. 2 6. Tree - A hierarchical data structure consisting of nodes connected by edges. - Has a root node, and each node can have zero or more child nodes. - Common operations include insertion, deletion, and traversal (pre-order, in- order, post-order). - Used in scenarios where data needs to be organized hierarchically, such as file systems, decision trees, and various algorithms like binary search trees and heaps. These abstract data types define the behavior and operations of the data structures, without specifying the implementation details. The implementation can vary depending on the programming language and the specific requirements of the application. Stack vs Heap Memory Allocation Memory in a C++ program can either be allocated on stack or heap. Stack Allocation : - The allocation happens on contiguous blocks of memory. - We call it stack memory allocation because the allocation happens in the function call stack. - The size of memory to be allocated is known to compiler and whenever a function is called, its variables get memory allocated on the stack. - Whenever the function call is over, the memory for the variables is de-allocated. - The programmer does not have to worry about memory allocation and de-allocation of stack variables. - Memory from the stack is used by all the members which are declared inside functions. - Note that main is also a function. int main() { // All these variables get memory allocated on stack int a; int b; int n = 20; int c[n]; } Heap Allocation : - The memory is allocated during execution of instructions written by programmers. - Note that the name heap has nothing to do with heap data structure. - It is called heap because it is a pile of memory space available to programmers to allocated and de-allocate. 3 - If a programmer does not handle this memory well, memory leaks can happen in the program. The new operator is used to allocate memory at runtime. The memory is allocated in bytes. Let's first see how to allocate a variable dynamically. int *ptr = new int; By writing new int, we allocated the space in memory required by an integer. Then we assigned the address of that memory to an integer pointer ptr. We assign value to that memory as follows: *ptr = 4; Thus, we allocated that much space in memory that would be required by an int and then assigned the address of that memory to a pointer ptr and assigned the memory a value 4. When we dynamically allocate some memory to a variable, we actually use the heap memory. Key Differences Between Stack and Heap Allocations 1. In a stack, the allocation and de-allocation is automatically done but whereas, in the heap, it needs to be done by the programmer manually. 2. Handling of Heap frame is costlier than handling of stack frame. 3. Memory shortage problem is more likely to happen in stack whereas the main issue in heap memory is fragmentation. 4. Stack frame access is easier than the heap frame as the stack have small regions of memory and is cache friendly, but in case of heap frames which are dispersed throughout the memory so it cause more cache misses. 5. Stack is not flexible, the memory size allotted cannot be changed whereas a heap is flexible, and the allotted memory can be altered. 6. Accessing time of heap takes more than a stack. When to use the Heap? When should you use the heap, and when should you use the stack? 1. If you need to allocate a large block of memory (e.g. a large array, or a big class), and you need to keep that variable around a long time (like a global), then you should allocate it on the heap. 2. If you are dealing with relatively small variables that only need to persist as long as the function using them is alive, then you should use the stack, it's easier and faster. 4 The stack and the heap work together In C++, when you use the new operator to allocate memory, both the stack and the heap are utilized, for example, say you issue the following command: int *ptr = new int; Here's how the memory allocation works: Heap Allocation: - `new int` dynamically allocates memory for a single integer on the heap. The memory allocated on the heap is not automatically managed and must be explicitly freed by the programmer using `delete`. - The integer allocated on the heap is initially uninitialized, meaning it could contain any arbitrary value until it is explicitly assigned a value. Stack Allocation: - `int *ptr` declares a pointer variable `ptr` on the stack. This pointer variable will hold the address of the integer allocated on the heap. - The actual pointer `ptr` itself is stored on the stack, while the memory it points to is on the heap. To summarize: - Heap:- Memory for the integer (the `new int` part) is allocated on the heap. - Stack:- The pointer variable `ptr` is allocated on the stack and contains the address of the integer that resides on the heap. Here is a breakdown of the memory states: Stack: - `ptr` (which holds the address of the heap-allocated integer). Heap:- An integer (`int`) that `ptr` points to. If you visualize it, it looks something like this: Stack: ptr ----> 0x1234 (address of the allocated integer on the heap) Heap: 0x1234 ----> (int value, uninitialized) To avoid memory leaks, remember to free the allocated memory using `delete` when you no longer need it: delete ptr; // Frees the memory allocated on the heap ptr = nullptr; // Good practice to avoid dangling pointers This ensures that the memory allocated on the heap is properly released and can be reused by the system. 5 De-allocation: Heap Memory Address: When you use delete ptr;, the memory at the address ptr points to is de-allocated (freed), meaning that the memory is returned to the heap and can be reused by the system for other allocations. The actual pointer ptr still holds the same address value, but the memory at that address is no longer valid for use. Pointer Variable: The pointer variable ptr remains in scope until it goes out of scope naturally, such as when the function it is declared in returns, or if it's explicitly reassigned.\ To avoid confusion or accidental use of the pointer after the memory is freed, it's common practice to set the pointer to nullptr after deleting it. Advantages of Heap Memory Dynamic Memory Management: - Flexibility: - The heap allows for dynamic memory allocation, meaning you can allocate memory at runtime based on the program's needs. This is particularly useful for situations where the amount of memory required cannot be determined at compile time, such as when handling user input, working with data structures like linked lists, or managing large data sets. - Variable Lifetime:- Objects and variables can have a lifetime beyond the scope of a function call. Memory allocated on the heap remains allocated until explicitly de-allocated using `delete` (or automatically in languages with garbage collection), allowing for complex data structures and objects to persist as long as needed. Large Memory Allocation: - Larger Size- The heap can handle larger allocations compared to the stack. The stack has a limited size (stack size is typically set when a program starts and can be quite small), whereas the heap can allocate a much larger amount of memory, suitable for large data structures like arrays, buffers, and trees. Disadvantages of Heap Memory Performance Overhead: - Slower Allocation and De-allocation:- Allocating and de-allocating memory on the heap is generally slower than on the stack. This is because the heap involves more complex management to handle memory fragmentation and allocation requests, which can introduce overhead. 6 - Fragmentation:- Over time, the heap can become fragmented as blocks of memory are allocated and de-allocated in varying sizes. This fragmentation can lead to inefficient memory usage and increased allocation times as the system searches for suitable blocks of memory to fulfill allocation requests. Manual Memory Management: - Risk of Memory Leaks:- Since heap memory is manually managed, forgetting to free allocated memory can lead to memory leaks, where memory is no longer used but not returned to the system. Memory leaks can eventually exhaust the available memory, leading to program crashes or degraded performance. - Dangling Pointers and Double-Free Errors:- Incorrectly managing heap memory can lead to dangling pointers (pointers that reference freed memory) and double-free errors (attempting to free memory that has already been freed). These issues can cause undefined behavior, crashes, or security vulnerabilities. Summary While the heap provides flexibility and can handle large memory allocations, it also introduces complexity in terms of performance overhead and manual memory management. Proper handling of heap memory is crucial to avoid common pitfalls such as memory leaks and fragmentation. Reference Variables and Pointer Variables in C++ Introduction to References and Pointers In C++, references and pointers are two different ways to manipulate the memory addresses of variables. Understanding how to use them effectively is crucial for tasks such as dynamic memory allocation, implementing data structures, and optimizing program performance. Reference Variables A reference variable is an alias for another variable. Once a reference is initialized to a variable, it cannot be changed to refer to another variable. int a = 5; int &b = a; // b is a reference to a Key Characteristics: - Initialization:-Must be initialized when declared. - Alias:- Acts as an alias to the initial variable, meaning any changes to the reference affect the original variable. - No Null References:- Unlike pointers, references cannot be null. 7 - Immutability:- After initialization, a reference variable cannot be made to reference another variable. Example Usage: #include void increment(int &b) { b++; } int main() { int x = 10; increment(x); // x is now 11 std::cout data) { Node *new_node = new Node(d); new_node->next = head; head = new_node; return; } If the element after which we want to insert a new node is located at the first node of the list. We simply set the next variable of the newly inserted node to point to the head and then set the value of head as the new_node. 3. Finally, if the list is not NULL and the element is not found at the first node, we create a new pointer variable temp and assign the head variable to it. Next, we traverse through the linked list using while loop. The while-loop executes until temp->next becomes NULL. During each iteration, we check if the value stored in the node pointed to by the current node is equal to the value passed by the x parameter. If the comparison returns true, we break the loop. Next, if the item is found, the temp->next variable will not be NULL. The next variable of the new_node is set to next variable of temp and the next variable of temp is set to new_node. Look at the following code: if (temp->next == NULL) cout next = temp->next; temp->next = new_node; } 20 Below shows an implementation of the insert_before_node() function in the LinkedList class. #include #include using namespace std; class Node { public: int data; Node *next; Node(int d=0){ data = d; } }; class LinkedList { private: Node * head; public: LinkedList() { head = NULL; } void printList() { Node * node = head; while (node != NULL) { cout data next; } } void insert_at_start(int data) { Node *new_node = new Node(data); new_node->next = head; head = new_node; } void insert_before_node(int x, int d) { if (head == NULL) { cout data) { Node *new_node = new Node(d); new_node->next = head; head = new_node; return; } Node *temp = head; while (temp->next != NULL) 21 { if (temp->next->data == x) break; temp = temp->next; } if (temp->next == NULL) cout next = temp->next; temp->next = new_node; } } }; int main() { LinkedList myLst; myLst.insert_at_start(12); myLst.insert_at_start(18); myLst.insert_at_start(29); myLst.insert_at_start(3); myLst.insert_before_node(18,40); myLst.insert_before_node(3,78); myLst.printList(); } Output: 78 3 29 40 18 12 Deleting an element They are three ways to delete an element as seen below: Deletion from the Start Deleting an node from the start of the linked list is straightforward. We have to set the second node to be the head of the list, which we can do by simply assigning the value of the next variable of the head (which is pointing to the second node) to the head as shown below: void delete_at_start() { if (head == NULL) { cout next; } 22 In the script above, we first check if the list is empty or not. If the list is empty we display the message that the list has no element to delete. Otherwise, we assign the value of the head- >next to the head. The head will now point towards the second node. Deletion at the End To delete an element from the end of the list, we simply have to iterate through the linked list till the second last element, and then we need to set the next variable of the second last element to NULL, which will convert the second last element to last element. void delete_at_end() { if (head == NULL) { coutnext->next != NULL) { temp = temp->next; } temp->next = NULL; } Deletion by Node data value To delete the element by value, we first have to find the node that contains the item with the specified value and then delete the node. Finding the item with the specified value iterating through the list for the item. Once the item to be deleted is found, the next variable of the node before the item is set to the node that exists after the item being deleted. Look at the following code: void delete_element_by_value(int x) { if (head == nullptr) { cout next; delete temp; return; } Node* n = head; while (n->next != nullptr) { if (n->next->data == x) { break; } 23 n = n->next; } if (n->next == nullptr) { cout next = n->next->next; delete temp; } } In the code above, we first check if the list is empty. Next, we check if the element to be deleted is located at the start of the linked list. If the element is found at the start, we delete it by setting the head to the next variable of the first node (which basically refers to the second node). Finally, if the element is not found at the first index, we iterate through the linked list and check if the value of the node being iterated is equal to the value to be deleted. If the comparison returns true, we set the next variable of the previous node to the node which exists after the node which is being deleted. LinkedList Destructor The destructor in the LinkedList class is essential for proper memory management. When you use dynamic memory allocation (via new) to create nodes, those nodes are not automatically deleted when the LinkedList object is destroyed. Without a destructor, the allocated memory for the nodes would not be freed, resulting in a memory leak. ~LinkedList(){ Node *current = head; Node *next = nullptr; while (current != nullptr){ next = current->next; delete current; current = next; } head = nullptr; } Given such detail about the PrintList, InsertNode and DeleteNode functions studying them you should make you able to implement the following functions in the LiskedList: GetCount() SearchList() 24