Python Basics PDF

Python Basics Python is an example of an interpreted language (unlike C/C++) Each line is interpreted one at a time Does have some flexibility, especially when simply running a program from top-to-bottom ○ But is also dangerous since it doesn’t check for type errors and may fail in the middle of execution Python interactive shell (IDLE) can execute lines of code by typing it in Stores state of variables that can be reused, BUT… ○ Once you exit the interactive shell, memory is cleared and your work is lost ○ So python programs are usually NOT written in the interactive shell and in separate.py files Python Built-in Atomic Types Python has some basic types of data that come straight out-of-the-box Examples are integers (int) and floats (decimals) ○ May affect the output type you’re getting, even if it’s numerically the same Example Unset >>> 2/2 1.0 # float >>> 2 + 2 4 # int >>> 2 + 2.0 4.0 # float And there are certain functionality that may work with certain types, but not others Unset >>> x = 10.0 >>> int(x) 10 >>> float(x) 10.0 >>> x = "10.0" # string type >>> type(x) >>> x = float(x) >>> type(x) >>> x = int(x) >>> type(x) >>> x = "10.0" >>> x = int(x) Traceback (most recent call last): File "", line 1, in x = int(x) ValueError: invalid literal for int() with base 10: '10.0' >>> len(x) 4 >>> len(float(x)) Traceback (most recent call last): File "", line 1, in len(float(x)) TypeError: object of type 'float' has no len() Relational and Logical Operators Output of these operators result in a Boolean value ○ True , False Boolean values are important for control structures (while loops, if statements) Basically, allows you to fine-tune your program and define what / when instructions should be executed. Example Unset >>> 5 == 5 True >>> 5 != 5 False >>> 5 < 5 False >>> 5 >> 5 > 4 True >>> 4 >= 5 False >>> True or False True >>> not False or False True >>> False and True False Python Lists A dynamic collection of data (heterogeneous types) that can be indexed (index starting at 0). Example Unset >>> x = [] >>> len(x) # returns number of elements in list >>> x.append(1) # adds to the end of the list >>> x >>> len(x) 1 >>> x = [1,2,2,3,3,3] >>> len(x) 6 >>> x # extracts an element at an index (index starts at 0) 3 >>> x = "3" # assigns a value at a certain index >>> x [1, 2, 2, '3', 3, 3] Data types can have methods (functions that can be called on an instance of a type (or object)) Unset >>> x = [1,1,2,2,2] >>> x.count(2) # counts the number of times 2 appears in the list 3 >>> x.count(3) 0 >>> x = [1,'2',3,'4'] >>> '1' in x # Returns True if '1' is in the list, False otherwise False >>> 1 in x True >>> x.insert(2, 2.5) # inserts 2.5 in index 2 of the list, shifts right elements over >>> x [1, '2', 2.5, 3, '4'] >>> x.pop() # removes and returns last element of list '4' >>> x [1, '2', 2.5, 3] >>> x.pop(1) # removes element at index 1 '2' >>> x [1, 2.5, 3] >>> del x # Notice that there isn’t any output, but still removes element >>> x [1, 3] Python methods and functions may or may not have a return value A special value (None) is used to represent a non-returned value It may be useful to store a return value to a variable, because once a value returns and is not stored, then it’s gone! List Slicing [i:j] syntax starts (includes) element at index i up to (not including) element at index j Example Unset >>> x[1:4] [2, 3, 4] >>> x[1:7] [2, 3, 4, 5] >>> x[1:] [2, 3, 4, 5] >>> x[:3] [1, 2, 3] Strings Strings represent a collection of characters ○ But in python, characters aren’t a data type. It’s basically a string with one value Strings are very similar to Lists (not exactly though…) Example Unset >>> x = "CS9" >>> type(x) >>> x '9' But… Unset >>> x = "1" Traceback (most recent call last): File "", line 1, in x = "1" TypeError: 'str' object does not support item assignment Lists in Python are MUTABLE (can change the existing list) Strings are IMMUTABLE (cannot change the existing string) Some useful string methods Unset >>> x = x.replace("9", "1") #returns a string,replaces the “9” with “1” in x >>> x 'CS1' >>> x.split("S") # splits the string at the first occurrence of “S” ['C', '1'] >>> x.find("1") # returns the index of the first occurrence of “1” 2 Functions Unset # Function definition def double(n): ''' Returns 2 times the parameter ''' # Good to comment functions! return 2 * n Note: A function doesn’t have to return anything If a function doesn’t have a specific return value, it returns a special None type Another note: The function’s body’s scope is defined by tabs Mutable vs. Immutable Why should we care about immutable vs mutable properties? ○ Depending on whether or not something is mutable or immutable, it affects how the data is treated when passing it in a function When immutable types are passed into a function, a COPY of that variable is made and used within the function ○ Once the function returns, the immutable value does not change! When mutable types are passed into a function, the parameter is used within the function (no copy is made) ○ Once the function returns, the mutable value does change! Example Unset def changeListParameter(x): x = "!" return x a = ["C","S","9"] print(changeListParameter(a)) # ['!', 'S', '9'] print(a) # ['!', 'S', '9'] def changeStringParameter(x): x = x.replace("C", "!") return x b = "CS9" print(changeStringParameter(b)) # !S9 print(b) # CS9 - b didn't change! Control Structures if statements Unset if BOOLEAN_EXPRESSION: STATEMENT(s) If BOOLEAN_EXPRESSION is True, execute statements. If False, skip statements Note that tabs define the scope of the statements as part of the if body. if/else statements Unset if BOOLEAN_EXPRESSION: STATEMENT(s) #1 else: STATEMENT(s) #2 evaluate BOOLEAN_EXPRESSION ○ if True, execute statements #1 and then continue execution after if / else blocks ○ if False, execute statements #2 and then continue execution after if / else blocks elif statements Same as else if statements Example Unset x = False if x: print("Shouldn't print") elif 4 < 5: print("4 < 5") else: print("Shouldn't print") While Loops Unset while BOOLEAN_EXPRESSION: STATEMENT(S) if BOOLEAN_EXPRESSION is True, perform statements in the body of the loop if BOOLEAN_EXPRESSION is False, skip the body of loop entirely and continue execution Infinite Loop Example: Unset while True: print("Weee!!!") # Will always execute since BOOLEAN_EXPRESSION == True For loops Unset for VARIABLE in COLLECTION: STATEMENT(s) Assigns an element in the collection to the variable (starts with the 1st item COLLECTION) ○ Executes the STATEMENT(s) Assigns the next item in the collection to the variable ○ Executes the STATEMENT(s) … and so on… Continues loop execution until there are no more items in the collection range() is a function used for looping if we know the number of iterations we want to make ○ range(x) returns a collection of numbers from 0 up to (but not including) x ○ range(x,y) returns a collection of numbers starting at x up to (but not including) y Example Unset for x in range(4): print(x, "Hello!" * x) print("---") Sets Like a mathematical set A collection of items with: ○ No duplicates ○ Order and position do not matter Common set operators Unset s2 = set([2,4,6]) print(s2) print(type(s2)) print(3 in s2) print("?" in s2) print(5 not in s2) # True print(len(s2)) # Combine values from two sets s3 = set([4,5,6]) combined_set = s2 | s3 print(combined_set) # Get the common elements from two sets intersecting_set = s2 & s3 print(intersecting_set) Dictionaries Otherwise known as TABLES or MAPS ○ Works where a unique KEY maps to a VALUE Gives us more precise indexing than lists Dictionaries Otherwise known as TABLES or MAPS ○ Works where a unique KEY maps to a VALUE Gives us more precise indexing than lists Unset DICT = {:, :,... , :} D = {} # Empty dictionary. Notice the curly braces instead of [] print(D) print(len(D)) D = {'Jane':18, 'Jen':19, 'Joe':20, 'John':21} print(D) print(len(D)) print("Jane's age is:", D['Jane']) # Simple way to get key / value for x in D: print(x) # Prints the key print(D[x]) # Prints the value Restrictions on using Dictionaries Keys must be an immutable type (int, str, namedtuples, tuples, … not lists for example). Values can be any type (immutable or mutable) and do not have to be unique For our purposes, KEYS are UNIQUE. Don’t define {'Joe':17, 'Joe':18} ○ Python is actually OK with duplicate keys in the definition, and it will keep the last key / value in the dictionary Example Unset value = D.pop("Joe") # returns the value associated with Joe, error if doesn’t exist print(value) print(D) value = D.get("jksldf") # returns none if doesn’t exist print(value) D["Jem"] = 22 # Adds element to dictionary print(D) Python Objects and Classes Objects are a way for programmers to define their own Python types and create abstractions of things with the programming language. Each object may have a certain state that gets modified throughout program execution. Object Oriented (OO) programming is the way programs use and manipulate objects to solve problems and model real-world properties OO Programming is not REQUIRED ○ It’s more of a way to organize, read, maintain, test, and debug software in a manageable way We’ve been using objects already, for example: Unset >>> x = [1,2,3] >>> type(x) >>> x.count(3) 1 >>> x.count(-1) 0 >>> x.append(0) >>> x [1, 2, 3, 0] count, append, etc are all examples of methods that can be called on an object. Methods are like functions but are associated with an object In this case, Python already defined its own class called list that we can use, but sometimes we want to create our own specific objects for the applications we’re trying to build! Student Class Example Unset class Student: ''' Student class type that contains attributes for all students ''' def setName(self, name): self.name = name def setPerm(self, perm): self.perm = perm def printAttributes(self): print("Student name: {}, perm: {}" \.format(self.name, self.perm)) s = Student() s.setName("Chris Gaucho") s.setPerm(1111111) s.printAttributes() When defining methods, the self parameter represents the current object we’re calling the method on In the example above, s is the variable storing the created Student object. When we define the methods, the first parameter is the instance of the object stored in s We can then use and manipulate the state of the object with these methods using the self parameter Default Constructor We can provide either default values or set values of the object when constructing it through the parameter list In the example above, we can set an empty object without any initial attributes, which may cause an error when trying to use them The constructor below will set self.name and self.perm to the value None, which doesn’t require the set methods to set these fields in the object. Unset def __init__(self): self.name = None self.perm = None s = Student() s.printAttributes() We can even go a step further and define the attributes by putting them in the parameters when we create the object Unset def __init__(self, name, perm): self.name = name self.perm = perm s = Student("Richert", 1234567) s.printAttributes() Initializing default values in the constructor Unset def __init__(self, name=None, perm=None): self.name = name self.perm = perm In this case, we can pass in parameters or not. If not, then None will automatically be assigned to the self.name and self.perm Or we can pick and choose what to set. For example: Unset s = Student(perm=1234567) s.printAttributes() # Student name: None, perm: 1234567 We used the default value of name (None) and then explicitly set the perm (1234567) Example of using objects in code Unset s1 = Student("Jane", 1234567) s2 = Student("Joe", 7654321) s3 = Student("Jill", 5555555) studentList = [s1, s2, s3] for s in studentList: s.printAttributes() # Using assert statements to test correct functionality s1 = Student() assert s1.name == None assert s1.perm == None s1 = Student("Gaucho", 7654321) assert s1.name == "Gaucho" assert s1.perm == 7654321 Container Classes Let’s continue to expand on our Student class Classes are also useful to organize / maintain state of a program Student objects are useful to organize the attributes of a single student But let’s imagine we are trying to write a database of various students ○ The database may be useful to search for students with certain attributes… ○ We will need to add / remove things from our database ○ We can define a class to represent a collection of courses containing students ○ Let’s set up a collection of courses such that a dictionary key is defined by the course number and the value is a list of actual student objects Unset # Courses.py # from [filename (without.py)] import [component] from Student import Student class Courses: ''' Class representing a collection of courses. Courses are organized by a dictionary where the key is the course number and the corresponding value is a list of Students of this course ''' def __init__(self): self.courses = {} def addStudent(self, student, courseNum): # If course doesn't exist... if self.courses.get(courseNum) == None: self.courses[courseNum] = [student] elif not student in self.courses.get(courseNum): self.courses[courseNum].append(student) def printCourses(self): for courseNum in self.courses: print("CourseNum: ", courseNum) for student in self.courses[courseNum]: student.printAttributes() print("---") # Getter method def getCourses(self): return self.courses # Using assert statements to test correct functionality s1 = Student("Gaucho", 1234567) UCSB = Courses() UCSB.addStudent(s1, "CS9") courses = UCSB.getCourses() assert courses == {"CS9": [s1]} Shallow vs. Deep Equality Python allows us to check for equality using the == operator for our objects But Python doesn’t have any knowledge of what makes a Student equal to another Student in this case So by default, Python uses the memory address (not the values) to determine if two objects are the same This is known as a shallow equality Example Unset s1 = Student("Jane", 1234567) s2 = Student("Jane", 1234567) print(s1 == s2) # False, doesn’t compare values! In order to provide our meaning of equality for two Student objects, we will have to define the __eq__ method in our Student class In this case, we can assume two Students are the same if they have the same perm number ○ Comparing values (instead of memory addresses) is called deep equality Unset # Add the __eq__ method in Student.py # s1 == s2, self is the left operand (s1), rhs is the right operand (s2) def __eq__(self, rhs): return self.perm == rhs.perm Unset s1 = Student("Jane", 1234567) s2 = Student("Jane", 1234567) print(s1 == s2) # True, compares the perm values! Python Errors We’ve probably seen Python complain before even running the program For example: Unset print("Start") PYTHON! print( Hello ) In this case, the program didn’t run at all. Before anything happened Python basically is telling us that PYTHON! Is a syntax error. Before any Python script runs, it gets parsed through and there’s a simple check to make sure all expressions are legal. If not, then it will state an error. Note that the program is not running at this time. If we remove the syntactically incorrect line: Unset print("Start") print( Hello ) We get another type of error that happens WHILE the code is executing. Errors that happen during program execution is called a runtime error: Unset Traceback (most recent call last): File "/Users/richert/Desktop/UCSB/CS9/lecture.py", line 5, in print( Hello ) NameError: name 'Hello' is not defined The above message is basically saying Hello is a variable that hasn’t been defined, but we’re trying to use it in a print statement. Syntactically, there is nothing wrong with the above lines of code (since Hello could be a valid variable name). In this case, when python tries to execute the statement print( Hello ), it throws an exception Exceptions Exceptions are errors that occur during program execution So far, when we’ve encountered these runtime errors, we’ve just noticed that the program crashes However, there are ways to recover from runtime errors since we can handle exceptions in our code In the above code segment, it’s complaining about a certain type of error called NameError There are many types of exceptions types that can occur during runtime. For example: Unset print("Start") print (1/0) Gives us the exception: Unset Traceback (most recent call last): File "/Users/richert/Desktop/UCSB/CS9/lecture.py", line 5, in print( 1/0 ) ZeroDivisionError: division by zero In this case, the type of exception that has been thrown is called a ZeroDivisionError because a number divided by 0 is undefined Unset print("Start") print (‘5’ + 5) Gives us the exception: Unset Traceback (most recent call last): File "/Users/richert/Desktop/UCSB/CS9/lecture.py", line 5, in print( '5' + 5 ) TypeError: can only concatenate str (not "int") to str In this case, a TypeError occurred since ‘+’ cannot add str types. Handling Exceptions The general rule of exception handling is: If an exception was raised in a program and nobody catches the exception error, then the program will terminate. But we can handle exception messages with a general structure referred to as try / except An example: Unset while True: try: x = int(input("Enter an int: ")) # input() prompts user for input break # breaks out of the current loop except Exception: print("Input was not a number type") print("Let's try again...") print(x) print("Resuming execution") The flow of execution is: Everything within a try block is executed normally. If an exception occurs in the try block, execution halts and an exception message is passed to a corresponding except block. The except block tries to catch a certain exception type (note that Exception catches all types of exceptions (NameError, TypeError, ZeroDivisionError, etc). If there is a matching type in the except statements, then the first-matching except block is executed. Once done, program execution resumes past the except block(s). If no exceptions were caught, then the program will terminate with an error message. Catching Multiple Exceptions Let’s slightly modify our code so another type of exception (ZeroDivisionError) may happen (in addition to entering a non-int type): Unset while True: try: x = int(input("Enter an int: ")) print(x/0) break except ZeroDivisionError: print("Can't divide by zero") except Exception: print("Input was not a number type") print("Let's try again...") print("Resuming execution") In this case, the program will either complain that a number type was not entered, or if it was entered correctly, we’ll get a ZeroDivisionError ○ The program in this case will never execute “correctly” But the important thing to observe in this scenario is we can catch multiple exception types - depends on what type of exception was thrown The rule is: ○ except statements are checked top-down ○ The first matching exception type block is then executed ○ Then the program jumps past ALL the except statements (only one except block is executed) and code execution resumes Example of functions raising exceptions that are caught by the caller Unset def divide(numerator, denominator): if denominator == 0: raise ZeroDivisionError() # Change to ValueError() and observe return numerator / denominator try: print(divide(1,1)) print(divide(1,0)) print(divide(2,2)) # Notice this doesn’t get executed except ZeroDivisionError: print("Error: Cannot divide by zero") print("Resuming Execution...") In this scenario, we have an exception raised in the divide function Since there isn’t an except statement in divide(), the exception message gets passed to the calling function ○ Since divide was called in a try block, then we check the except statements for the first match ○ If a match exists, then the first except block is executed, then all except blocks are skipped and execution resumes If an exception is raised and we NEVER handle it in an except block, then Program will eventually crash with an error message (like we’ve seen) Testing Complete Test Complete Test: Testing every possible path through the code in every possible situation ○ Generally infeasible… ○ Imagine a simple program that takes in 4 integers and prints out the max In Python3, the range of valid integers is a lot! Limited to memory (unlike other languages like C++ / Java where an int type is stored in 32 bits (4 bytes) The number of computations to test EVERY POSSIBLE combination of the 4 integers will take A LONG TIME to compute! Unit Testing: Testing individual pieces (units) of a program to ensure correct behavior Test Driven Development (TDD) 1. Write test cases that describe what the intended behavior of a unit of software should. Kinda like the requirements of your piece of software 2. Implement the details of the functionality with the intention of passing the tests 3. Repeat until the tests pass. Imagine large software products where dozens of engineers are trying to add new features / implement optimizations all at the same time Having a “suite” of tests before deploying software to the public is essential ○ Someone may modify changes that work for a current version, but breaks functionality in a future version ○ Rigorous tests enable confidence in the stability in software Pytest Pytest is a framework that allows developers to write tests to check the correctness of their code Gradescope actually uses pytest to check the “correct” answer with students’ submissions We can write functions that start with test_, and the body of the function can contain assert statements (as seen in lab00) ○ Pytest will run each of these functions are report which tests passed and which tests failed Try and install Pytest on your computer (will use this in our examples) Installation Guide: https://docs.pytest.org/en/stable/getting-started.html Windows Installation Guide (created by previous Learning Assistants): Python and Pytest Installation Guide for Windows Example Write a function biggestInt(a,b,c,d) that takes 4 int values and returns the largest Let’s write our our tests first (TDD!) Unset # testFile.py # imports the biggestInt function from lecture.py from lecture import biggestInt def test_biggestInt1(): assert biggestInt(1,2,3,4) == 4 assert biggestInt(1,2,4,3) == 4 assert biggestInt(1,4,2,3) == 4 def test_biggestInt2(): assert biggestInt(5,5,5,5) == 5 # etc. def test_biggestInt3(): assert biggestInt(-5,-10,-12,-100) == -5 assert biggestInt(-100, 1, 100, 0) == 100 # etc. Now let’s write the function: Unset # lecture.py def biggestInt(a,b,c,d): biggest = 0 if a >= b and a >= c and a >= d: return a if b >= a and b >= c and b >= d: return b if c >= a and c >= b and c >= d: return c else: return d Command to run pytest on testFile.py: Navigate to the folder containing lecture.py and testFile.py (On mac terminal): python3 -m pytest testFile.py ○ Note: replace python3 with python on Windows. Operator Overloading We can define our own behavior for common operators in our classes ○ What does it mean if two student objects are equal (we defined it to mean perm numbers are equal)? ○ Or what does it mean to add (+) two students together? ○ Python allows us to define the functionality for operators! Defining __str__ When printing our defined objects, we may get something unusual. For example: Unset from Student import Student s1 = Student("Gaucho", 1234567) s2 = Student("Jane", 5555555) print(s1) All objects can be printed, but Python wouldn’t know what to print for user-defined objects like Student So it just prints the memory address (the 0x...) of where the object exists in memory In order to provide our own meaning of what Python should display when printing an object like Student, we will need to define a special __str__ method in our Student class: Unset def __str__(self): ''' returns a string representation of a student ''' return "Student name: {}, perm: {}".format(self.name, self.perm) Python will now use the return value of the __str__ method when determining what to display in the print statement ○ Now the print(s1) statement outputs Student name: Gaucho, perm: 1234567 Overriding the ‘+’ operator What would it mean to add (+) two students together? Maybe we can return a list collection … could be useful … maybe? Unset def __add__(self, rhs): ''' Takes two students and returns a list containing these two students ''' return [self, rhs] Unset x = s1 + s2 # returns a list of s1 + s2 print(type(x)) # list type for i in x: print(i) # Output of for loop # Student name: Gaucho, perm: 1234567 # Student name: Jane, perm: 5555555 Overloading the ‘=’ operator Example: Unset # = rhs.name.upper() # > # def __gt__ # < # def __lt__ Unset print(s1 = s2) # False print(s1 == s2) # False print(s1 < s2) # ERROR, we didn’t define the __lt__ method Good article on this as well as a list of common operators we can overload: https://thepythonguru.com/python-operator-overloading/ Inheritance Let’s write an Animal class and see what inheritance looks like in action: Unset # Animal.py class Animal: ''' Animal class type that contains attributes for all animals ''' def __init__(self, species=None, name=None): self.species = species self.name = name def setName(self, name): self.name = name def setSpecies(self, species): self.species = species def getAttributes(self): return "Species: {}, Name: {}".format(self.species, self.name) def getSound(self): return "I'm an Animal!!!" and now let’s define a Cow class that inherits from Animal Unset # Cow.py from Animal import Animal class Cow(Animal): # Available method for the Cow Class def setSound(self, sound): self.sound = sound Unset c = Cow("Cow", "Betsy") print(c.getAttributes()) c.setSound("Moo") # Sets a Cow sound attribute to "Moo" print(c.getSound()) # I’m an Animal!!! (calls the Animal.getSound method) Note that the Cow’s constructor (__init__) was inherited from the class Animal as well as the getAttributes() method Also note that we didn’t need to define the getSound() method since it was inherited from Animal But in this case, this inherited method getSound() may not be what we want. So we can redefine its functionality in the Cow class! Unset # in Cow class def getSound(self): return "{}!".format(self.sound) We changed the getSound() method in the Cow class, so in this case our Cow class overrode the getSound() method of Animal So now, cow objects will use its own version of getSound(), not the version that was inherited from Animal, as seen below: Unset c = Cow("Cow", "Betsy") c.setSound("Moo") # Sets a Cow sound to "Moo" print(c.getSound()) # Moo! We can still create Animal objects, and Animal objects will still use its own version of getSound() Unset a = Animal("Unicorn", "Lala") print(a.getAttributes()) print(a.getSound()) # I’m an Animal!!! Note: The constructed object type will dictate which method in which class is called. It first looks at the constructed object type and checks if there is a method defined in that class. If so, it uses that If the constructed object doesn’t have a method definition in its class, then it checks the parent(s) it inherited from, and so on … If there is no matching method call, then an error happens Extending Superclass Methods Some terminology: Animal in the previous example can be referred to as the Base / Parent / Super class Cow can be referred to as the Sub / Child / Derived class In the example above, we overrode the makeSound method from Animal However, sometimes we only want to extend the functionality, not completely replace it. ○ It is possible to override methods and still use the inherited functionality by calling the super() class methods ○ So in this case, we override the method in the child class, but we extend the base class’ functionality by using it in the child class’ overwritten method ○ Example: Unset class Cow(Animal): def getSound(self): s = "Using Super class getSound method\n" s += Animal.getSound(self) + "\n" # Uses Animal.getSound method s += "Extending it with our own getSound functionality" + "\n" s += "{}!!!".format(self.sound, self.sound) return s # Output: # Using super class getSound method # I'm an Animal!!! # Extending it with our own getSound functionality # Moo!!! Extending Constructors in a Child Class A common pattern is to redefine a subclass’ constructor by taking in all data from its parent class AND data specific to the subclass. Example: Unset # In Cow.py def __init__(self, species=None, name=None, sound=None): super().__init__(species, name) #Animal.__init__(self, species, name) also works self.sound = sound Unset c = Cow("Cow", "Betsy", "Moo") # Passes in data for Animal AND Cow a = Animal("Unicorn", "Lala") zoo = [c, a] for i in zoo: print(i.getAttributes()) print(i.getSound()) print("---") Inheritance and Exceptions We can create a hierarchy of Exception types using inheritance ○ Exception is the base class for ALL Exception types (Python allows us to raise these types) ○ Remember the sub-class IS-A type of the base class ○ This may be useful for fine-tuning certain behavior ○ For example, a network error could be because of: Malformed URL Timeout … ○ Or file reading may error out due to: Incorrect file name Incorrect access permissions … Example of creating our own Exception types: Unset class A(Exception): pass class B(A): # B inherits from A (B IS-A A type) pass class C(Exception): pass try: x = int(input("Enter a positive number: ")) if x < 0: raise B() # Change this to A() and C() and observe... except C: print("Exception of type C caught") except A: print("Exception of type A caught") except B: print("Exception of type B caught") # Will never get called except Exception: print("Exception of type Exception caught") print("Resuming execution") In the example above, B’s except block is never called. ○ This is because B IS-A type of A (B is a child of A). So when we catch the matching type, A always matches first. Algorithm Analysis There are many ways we can try to estimate an algorithm For example, we can benchmark the algorithm by calculating the time it takes for something to run We can do this in Python using some code: Unset import time def f1(n): l = [] for i in range(n): l.insert(0,i) return def f2(n): l = [] for i in range(n): l.append(i) return print("starting f1") start = time.time() f1(200000) end = time.time() print("time elapsed: ", end - start, "seconds") print("starting f2") start = time.time() f2(200000) end = time.time() print("time elapsed: ", end - start, "seconds") We’ll get to why the time difference between adding to the list in the beginning and adding to the end differ in time soon enough :) This way of measuring algorithms has some problems like: ○ Underlying hardware (fast / slow CPU, amount of memory, disk size / speed, network speed, etc.) ○ How busy the computer is, how the OS is managing other programs ○ How big n is (if n is small, time is almost the same) Asymptotic Behavior We want to analyze approximately how fast an algorithm runs when the size of the input approaches infinite So instead of calculating the raw time of how fast the algorithm runs on our computers, we can approximate the number of instructions the algorithm will take with respect to the size of the input ○ Steps can include things like assigning values to variables, evaluating boolean expressions, arithmetic operations, etc. Let’s try this with a simple algorithm: Unset for i in range(10): print(i) Counting expressions in this case: ○ Assignment: i = 0, i = 1, i = 2, etc. (10 steps) ○ print() (10 steps) ○ Algorithm takes 20 steps The algorithm runs in constant time, since there isn’t a variable input and will always take the same number of instructions to run. Let’s change our problem taking in a variable input size: Unset def f(n): for i in range(n): print(i) Now the number of instructions in this algorithm is dependent on the value of n So let’s try to express the number of instructions as a polynomial with respect to n ○ We can denote the number of instructions with respect to n as T(n) T(n) = n assignment statements + n print statements T(n) = 2n Order of magnitude function (Big-O) Since we have no idea how large n will be, we want to assume the worst-case scenario when analyzing our algorithms ○ And in this case, the worst-case is when n approaches infinite Since n approaches infinite, we can ignore lower order terms and coefficients ○ Does the 3 really matter when we have 1,000,000,000,000,000,000 + 3 ? ○ See: http://science.slc.edu/~jmarshall/courses/2002/spring/cs50/BigO/index. html for an example of how lower-order terms converge when n approaches infinite So we can express the above algorithm above by dropping all lower order terms, constants, and coefficients to O(n) Note: constant time algorithms are denoted as O(1) Another (slightly) more complex example: Unset def f(n) x = 0 for i in range(n): for j in range(n): x = i + j Initialize x = 0 is 1 instruction i assignment statements (n instructions) j assignment statements (n2 instructions) i + j computations (n2 instructions) x reassignment statements (n2 instructions) ○ T(n) = 1 + 3n2 + n Drop all lower order terms, coefficients, and constants and we get O(n2) Another example: Unset def f(n): for i in range(n): return i Note that in this example, i gets assigned to 0. But then i is immediately returned before re-assigning i to 1. So this algorithm is O(1). Recursion Recursion is when a function contains a call to itself Recursive solutions are useful when the result is dependent on the result of sub-parts of the problem The three laws of recursion 1. A recursive algorithm must have a base case 2. A recursive algorithm must change its state and move toward the base case 3. A recursive algorithm must call itself, recursively Common Example: Factorial N! = N * (N-1) * (N-2) * … * 3 * 2 * 1 N! = N * (N-1)! We can solve N! by solving the (N-1)! sub-part ○ Note: 0! == 1 Unset def factorial(n): if n == 0: # base case return 1 return n * factorial(n-1) Function Calls and the Stack The Stack data structure has the First in, Last out (FILO) property We can only access the elements at the top of the stack ○ Analogy: Canister of tennis balls In order to remove the bottom ball, we have to remove all the balls on top We won’t go through an implementation yet, but we can conceptualize this on a high-level Function calls utilize a stack (“call stack”) and organize how functions are called and how expressions that call functions wait for the functions’ return values ○ When a function is called, you can think of that function state being added (or “pushed”) on the stack ○ When a function finishes execution, it gets removed (or “popped”) from the stack ○ The top of the stack is the function that is currently running Example Unset def double(n): return 2 * n def triple(n): return n + double(n) print(triple(5)) Going back to our recursive factorial example, let’s trace factorial(3) Common Example: Fibonnaci Sequence A fibonacci sequence is the nth number in the sequence is the sum of the previous two (i.e. f(n) = f(n-1) + f(n-2)). f(0) = 0, f(1) = 1, f(2) = 1, f(3) = 2, f(4) = 3, f(5) = 5, f(6) = 8, … Unset def fibonnaci(n): if n == 1: # base cases return 1 if n == 0: return 0 return fibonnaci(n-1) + fibonnaci(n-2) Evaluate fibonnaci(4) by hand Unset fib(3) + fib(2) fib(2) + fib(1) + fib(2) fib(1) + fib(0) + fib(1) + fib(2) 1 + fib(0) + fib(1) + fib(2) 1 + 0 + fib(1) + fib(2) 1 + 0 + 1 + fib(2) 1 + 0 + 1 + fib(1) + fib(0) 1 + 0 + 1 + 1 + fib(0) 1 + 0 + 1 + 1 + 0 3 Python Lists vs. Python Dictionaries Let’s observe the performance between lists and dictionaries with an example. The following program counts the number of words in a file using a list and dictionary ○ They do the same thing, but the performance is vastly different… ○ wordlist.txt : File containing a collection of words, one per line. https://ucsb-cs8.github.io/m19-wang/lab/lab07/wordlist.txt ○ PeterPan.txt : You can download classic novels from the Gutenberg Project as a.txt file! https://www.gutenberg.org/ebooks/16 Unset # Set up our data structures DICT = {} infile = open("wordlist.txt", 'r') for x in infile: # x goes through each line in the file DICT[x.strip()] = 0 # Creates an entry in DICT with key x.strip() and value 0 print(len(DICT)) infile.close() # close the file after we’re done with it. WORDLIST = [] for y in DICT: # put the DICT keys into WORDLIST WORDLIST.append(y) print(len(WORDLIST)) # Algorithm 1 - Lists from time import time start = time() infile = open("PeterPan.txt", 'r') largeText = infile.read() infile.close() counter = 0 words = largeText.split() for x in words: x = x.strip("\"\'()[]{},.?:;-").lower() if x in WORDLIST: counter += 1 end = time() print(counter) print("Time elapsed with WORDLIST (in seconds):", end - start) # Algorithm 2 - Dictionaries start = time() infile = open("PeterPan.txt", 'r') largeText = infile.read() infile.close() words = largeText.split() counter = 0 for x in words: x = x.strip("\"\'()[]{},.?:;-").lower() if x in DICT: # Searching through the DICT counter += 1 end = time() print(counter) print("Time elapsed with DICT (in seconds):", end - start) Python lists are efficient if we know the index of the item we’re looking for ○ The reason why adding to the front of the list is costly is because lists have to “make room” for the element to be at index 0 All existing elements of the list need to shift one index up in order for the inserted element to be placed at index 0 ○ Adding to the end of the list is not nearly as costly because no shifting of existing elements needs to occur For this example, since we are looking for the value in the list (without knowing the index), we are checking through the entire WORDLIST for every word in PeterPan.txt Python dictionaries are efficient when looking up a key value ○ Dictionary values are actually stored in an underlying list ○ Keys are converted into a numerical value, which is passed into a hash function The purpose of the hash function is to output the index for the underlying list based on the key value We do not have to scan the underlying list structure since a key will always be placed into a specific location in the the underlying list We won’t go into the implementation now, but we’ll revist this in more detail later There are MANY ways to solve a problem in programming, but understanding how certain tools work and making the best decisions is important! Recall Recall our benchmarking algorithm when inserting elements to the front of a Python list: Unset def f1(n): l = [] for i in range(n): l.insert(0,i) return Since we’re inserting elements at the front of the Python list, we need to move the existing elements over by one in order to make room for them. ○ So the first insertion is cheap and takes one step ○ The 2nd insertion needs to move the first element over by one in order to make room for the 2nd element ○ The 3rd insertion needs to move the first and second element over ○ and so on… We can say the number of steps this for loop has is: ○ 1+2+3+…+n ○ And this simplifies to n(n + 1) / 2 ○ Therefore, f1 is O(n2) Binary Search Algorithm Binary Search is a useful algorithm to search for an item in a collection IF THE ITEMS ARE IN SORTED ORDER ○ Since the collection is in sorted order, we can check the middle to see if the item we’re looking for is there If the middle element is the one we’re looking for, then great! If the item we’re looking for is < the middle element, then we don’t have to search the right side of the collection If the item we’re looking for is > the middle element, then we don’t have to search the left side of the collection ○ Since each comparison is eliminating half the search space, this algorithm has a logarithmic property And this algorithm performs in O(log n) time Let’s do some TDD and write tests before we write the function! Unset def test_binarySearchNormal(): assert binarySearch([1,2,3,4,5,6,7,8,9,10], 5) == True assert binarySearch([1,2,3,4,5,6,7,8,9,10], -1) == False assert binarySearch([1,2,3,4,5,6,7,8,9,10], 11) == False assert binarySearch([1,2,3,4,5,6,7,8,9,10], 1) == True assert binarySearch([1,2,3,4,5,6,7,8,9,10], 10) == True def test_binarSearchDuplicates(): assert binarySearch([-10,-5,0,1,1,4,4,7,8], 0) == True assert binarySearch([-10,-5,0,1,1,4,4,7,8], 2) == False assert binarySearch([-10,-5,0,1,1,4,4,7,8], 4) == True assert binarySearch([-10,-5,0,1,1,4,4,7,8], 2) == False def test_binarySearchEmptyList(): assert binarySearch([], 0) == False def test_binarySearchSameValues(): assert binarySearch([5,5,5,5,5,5,5,5,5,5,5], 5) == True assert binarySearch([5,5,5,5,5,5,5,5,5,5,5], 0) == False Binary Search Implementation (iterative) Unset def binarySearch(intList, item): first = 0 last = len(intList) - 1 found = False while first

Document Details

Tags

Related

Summary

Full Transcript