Python Programming Notes PDF
Document Details
Uploaded by Deleted User
Summary
This document provides an introduction to Python programming. It covers basic concepts like variables, expressions, and input/output using the Python interpreter. The document also discusses various types of errors, including syntax and runtime errors. Additionally, it touches on computer hardware and the concept of programming language history, providing insights into historical language design.
Full Transcript
Everything in Comp 163

1. Introduction to Python

1.1 Programming (general)
A computer program consists of instructions executing one at a time. Basic instruction types are:
Input: A program receives data from a file, keyboard, touchscreen, network, etc.
Process: A program performs computations on that data, such as adding two values like x + y.
Output: A program puts that data somewhere, such as a file, screen, or network.
Programs use variables to refer to data, like x, y, and z. The name is due to a variable's value "varying" as a program assigns a variable like x with new values. A sequence of instructions that solves a problem is called an algorithm.

1.2 Programming using Python
The Python interpreter is a computer program that executes code written in the Python programming language. An interactive interpreter is a program that allows the user to execute one line of code at a time. Code is a common word for the textual representation of a program (and hence programming is also called coding). A line is a row of text. A statement is a program instruction. A program mostly consists of a series of statements, and each statement usually appears on its own line.
An expression is code that returns a value when evaluated; for example, the code wage * hours * weeks is an expression that computes a number. The symbol * is used for multiplication. The names wage, hours, weeks, and salary are variables, which are named references to values stored by the interpreter. A new variable is created by performing an assignment using the = symbol, such as salary = wage * hours * weeks, which creates a new variable called salary. The print() function displays variables or expression values. The "#" character denotes a comment, which is optional but can be used to explain portions of code to a human reader.

1.3 Basic input and output
Text is output from a Python program using the built-in function print().
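As a rough illustration of the assignment, expression, and print() ideas introduced so far, the sketch below computes a yearly salary. The specific numbers are assumptions chosen only for the example.

```python
# Hypothetical values, chosen only for illustration.
wage = 20      # dollars per hour
hours = 40     # hours worked per week
weeks = 50     # weeks worked per year

# The expression wage * hours * weeks is evaluated first, and the
# result is then assigned to a new variable named salary.
salary = wage * hours * weeks

print('Salary is', salary)   # displays: Salary is 40000
```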
Text enclosed in quotes is known as a string. Text in strings may have letters, numbers, spaces, or symbols like @ or #. Each use of print() outputs on a new line. However, sometimes a programmer may want to keep output on the same line. Adding end=' ' inside of print() keeps the output of the next print on the same line, separated by a single space character. Ex: print('Hello', end=' '). Output can be moved to the next line using the newline character "\n". An escape sequence is a character sequence that has a special meaning, like the newline character "\n", and always starts with a backslash "\". Other escape sequences exist, such as "\t" to insert a tab, or "\\" to print an actual backslash character. Any space, tab, or newline is called whitespace.
The input() function is used to read input from a user. Strings and integers are each an example of a type; a type determines how a value can behave. Reading from input always results in a string type. However, often a programmer wants to read in an integer, and then use that number in a calculation. If a string contains only digits, like '123', then the int() function can be used to convert that string to the integer 123.

1.4 Errors
A syntax error violates a programming language's rules on how symbols can be combined to create a program. An example is putting multiple prints on the same line. A runtime error occurs when a program's syntax is correct but the program attempts an impossible operation, such as dividing by zero or multiplying strings together (like 'Hello' * 'ABC'). Abrupt and unintended termination of a program is often called a crash of the program.
Common error types:
SyntaxError: The program contains invalid code that cannot be understood.
IndentationError: The lines of the program are not properly indented.
ValueError: An invalid value is used, which can occur if giving letters to int().
NameError: The program tries to use a variable that does not exist.
TypeError: An operation uses incorrect types, which can occur if adding an integer to a string.
Some errors are not reported by the interpreter at all: the program loads correctly but does not behave as intended. Such an error is known as a logic error, because the program is logically flawed. A logic error is often called a bug.

1.5 Development environment
Code development is usually done with an integrated development environment, or IDE. Various IDEs can be found online.

1.6 Computers and programs (general)
Computer engineers treated a positive voltage as a "1" and a zero voltage as a "0". 0s and 1s are known as bits (binary digits). To support different calculations, circuits called processors were created to process (or execute) a list of desired calculations, each called an instruction. A memory is a circuit that can store 0s and 1s in each of a series of thousands of addressed locations.
The programmer-created sequence of instructions is called a program, application, or just app. Instructions represented as 0s and 1s are known as machine instructions, and a sequence of machine instructions together form an executable program (or an executable). Programmers created programs called assemblers to automatically translate human-readable instructions, such as "Mul 97, #9, 98", known as assembly language instructions, into machine instructions. In the 1960s and 1970s, programmers created high-level languages to support programming using formulas or algorithms, so a programmer could write a formula such as F = (9 / 5) * C + 32. To support high-level languages, programmers created compilers, which are programs that automatically translate high-level language programs into executable programs.

1.7 Computer tour
Input/output devices: A screen (or monitor) displays items to a user. A keyboard allows a user to provide input to the computer, typically accompanied by a mouse for graphical displays.
Storage: A disk (aka hard drive) stores files and other data, such as program files, songs and movies, or office documents.
Memory: RAM (random-access memory) temporarily holds data read from storage and is designed so any address can be accessed much faster than from a disk. Memory size is typically listed in bits, or in bytes where a byte is 8 bits.
Processor: The processor runs the computer's programs, reading and executing instructions from memory, performing operations, and reading and writing data from and to memory. The operating system allows a user to run other programs and interfaces with the many other peripherals. Because speed is so important, a processor may contain a small amount of RAM on its own chip, called cache memory, accessible in one clock tick rather than several, for maintaining a copy of the most-used instructions/data.
Clock: A processor's instructions execute at a rate governed by the processor's clock, which ticks at a specific frequency.
After computers were invented and occupied entire rooms, engineers created smaller switches called transistors, which in 1958 were integrated onto a single chip called an integrated circuit or IC. Engineers continued to find ways to make smaller transistors, leading to what is known as Moore's law: The doubling of IC capacity roughly every 18 months, which continues today.

1.8 Language history
Programmers created scripting languages to execute programs without the need for compilation. A script is a program whose instructions are executed by another program called an interpreter. Interpreted execution is slower because it requires multiple interpreter instructions to execute one script instruction. Guido van Rossum began creating a scripting language called Python and an accompanying interpreter. Python is an open-source language, meaning the user community participates in defining the language and creating new interpreters.
1.9 Why whitespace matters
Whitespace is any blank space or newline. Programming is all about precision. Programs must be created precisely to run correctly. Ex: = and == have different meanings. Using i where j was meant can yield a hard-to-find bug. Not considering that n could be 0 in sum/n can cause a program to fail entirely in rare but not insignificant cases.
Counting from i being 0 to i < 10 vs. i 2 is True > a > b means a is greater than b. x > 3 is False x = 4 is False

Python supports operator chaining. For example, a < b < c determines whether b is greater than a but less than c. Chaining performs comparisons left to right, evaluating a < b first. If the result is True, then b < c is evaluated next. If the result of the first comparison a < b is False, then there is no need to continue evaluating the rest of the expression. Note that a is not compared to c.

4.5 Detecting ranges using logical operators
A logical operator treats operands as being True or False, and evaluates to True or False. Logical operators include AND, OR, and NOT. Programming languages typically use various symbols for those operators, but below the words AND, OR, and NOT are used for introductory purposes. A Boolean refers to a value that is either True or False. Note that True and False are keywords in Python and must be capitalized. A programmer can assign a Boolean value by specifying True or False, or by evaluating an expression that yields a Boolean. Keywords and, or, and not (lowercase) are used to represent the AND, OR, and NOT logical operators. Logical operators are commonly used in expressions of if-else statements.

4.6 Detecting ranges with gaps
Programmers often use logical operators to explicitly detect ranges with an upper and lower bound, including ranges with gaps that may have intermediate bounds.

4.7 Detecting multiple features with branches
Each if statement is independent and more than one branch can execute, in contrast to the multi-branch if-else arrangement.
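The contrast described in 4.7 can be sketched as follows. The function names and the particular conditions are hypothetical, chosen only to show that a series of independent if statements may execute several branches, while a multi-branch arrangement executes at most one.

```python
def describe_with_independent_ifs(num):
    """Independent if statements: every condition is checked."""
    features = []
    if num > 0:
        features.append('positive')
    if num % 2 == 0:
        features.append('even')
    if num > 100:
        features.append('large')
    return features


def describe_with_multi_branch(num):
    """Multi-branch if-elif-else: at most one branch executes."""
    if num > 100:
        return ['large']
    elif num % 2 == 0:
        return ['even']
    elif num > 0:
        return ['positive']
    else:
        return []


print(describe_with_independent_ifs(200))   # ['positive', 'even', 'large']
print(describe_with_multi_branch(200))      # ['large']
```

With the multi-branch version, the order of the conditions matters, because the first condition that is True decides which single branch runs.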
4.8 Comparing data types and common errors
The relational and equality operators work for integer, string, and floating-point built-in types. Strings are equal if they have the same number of characters and corresponding characters are identical.

4.9 Membership and identity operators
The in and not in operators, known as membership operators, yield True or False depending on whether the left operand matches the value of an element in the right operand, which is always a container. Membership operators can be used to check whether a string is a substring, or matching subset of characters, of a larger string. For example, 'abc' in '123abcd' returns True because the substring abc exists in the larger string.
The programmer can use the identity operator, is, to check whether two operands are bound to a single object. The inverse identity operator, is not, gives the negated value of "is". Thus, if the expression x is y evaluates to True, then x is not y evaluates to False. Identity operators do not compare object values; rather, identity operators compare object identities to determine equivalence. Object identity is usually the memory address of an object. Thus, identity operators return True only if the operands reference the same object.

4.10 Order of evaluation
The order in which operators are evaluated in an expression is determined by precedence rules. The entries below are listed from highest to lowest precedence, in the form operator/convention: description, followed by an explanation.
( ): Items within parentheses are evaluated first. In (a * (b + c)) - d, the + is evaluated first, then *, then -.
* / % + -: Arithmetic operators (using their precedence rules). z - 45 * y < 53 evaluates * first, then -, then <.
< <= > >= == != in not in: Relational, equality, and membership operators. In an expression such as x < 2 or x >= 10, the comparisons are evaluated first, as (x < 2) or (x >= 10), because < and >= have precedence over or.
not: not (logical NOT). not x or y is evaluated as (not x) or y.
and: Logical AND. x == 5 or y == 10 and z != 10 is evaluated as (x == 5) or ((y == 10) and (z != 10)) because and has precedence over or.
or: Logical OR. x == 7 or x < 2 is evaluated as (x == 7) or (x < 2) because < and == have precedence over or.

4.11 Code blocks and indentation
A code block is a series of statements grouped together. A code block in Python is defined by its indentation level, meaning the number of blank columns from the left edge. The initial code block is not indented. A new code block can follow a statement that ends with a colon, such as an "if" or "else". In addition, a new code block must be more indented than the previous code block.
The amount of indentation used to indicate a new code block can be arbitrary, as long as the programmer uses the same indentation consistently for each line in the block. Good practice is to use the standard recommended four columns per indentation level.

4.12 Conditional expressions
A conditional expression has the following form: (expr_when_true if condition else expr_when_false). A conditional expression has three operands and thus is sometimes referred to as a ternary operation.

5. Loops

5.1 Loops (general)
A loop is a program construct that repeatedly executes the loop's statements (known as the loop body) while the loop's expression is true; when the expression is false, execution proceeds past the loop. Each time through a loop's statements is called an iteration.

5.2 While loops
A while loop is a construct that repeatedly executes an indented block of code (known as the loop body) as long as the loop's expression is True. At the end of the loop body, execution goes back to the while loop statement and the loop expression is evaluated again. If the loop expression is True, the loop body is executed again. But, if the expression evaluates to False, then execution instead proceeds to below the loop body. Each execution of the loop body is called an iteration, and looping is also called iterating.
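A minimal sketch of the while loop behavior described in 5.2 follows. The starting value and the limit of 100 are assumptions made only for illustration.

```python
# Hypothetical starting value, chosen only for illustration.
value = 1
iterations = 0

# The loop expression value < 100 is evaluated before every iteration.
while value < 100:
    value = value * 2      # loop body: indented under the while statement
    iterations += 1        # count how many times the body has executed

# Execution proceeds here once the loop expression evaluates to False.
print(value, iterations)   # 128 7
```

The loop body runs 7 times (value becomes 2, 4, 8, 16, 32, 64, 128); the expression is then False because 128 is not less than 100.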
Ex: The statement while user_value != 'q': allows a user to end a face-drawing program by entering the character 'q'. The letter 'q' in this case is a sentinel value, a value that when evaluated by the loop expression causes the loop to terminate. An infinite loop is a loop that will always execute because the loop's expression is always True.

5.3 More while examples
A docstring is a multi-line string literal delimited at the beginning and end by triple quotes, using either single (') or double (") quote characters. The randint() function provides a new random number each time the function is called.

5.4 Counting
A programmer can use a variable, called a loop variable, to count the number of iterations.

5.5 For loops
A for loop statement loops over each element in a container one at a time, assigning a variable with the next element that can then be used in the loop body.

5.6 Counting using the range() function
The range() function allows counting in for loops as well. range() generates a sequence of integers between a starting integer that is included in the range, an ending integer that is not included in the range, and an integer step value.

5.7 While vs. for loops
Both while loops and for loops can be used to count a specific number of loop iterations. A for loop combined with range() is generally preferred over while loops, since for loops are less likely to become stuck in an infinite loop situation.

5.8 Nested loops
A nested loop is a loop that appears as part of the body of another loop. The nested loops are commonly referred to as the outer loop and inner loop.

5.9 Developing programs incrementally
A programmer should not write the entire program and then run the program hoping the program works.
Practice incremental programming by starting with a simple version of the program, and growing the program little by little into a complete version. A FIXME comment attracts attention to code that needs to be fixed in the future.

5.10 Break and continue
A break statement in a loop causes the loop to exit immediately. A break statement can sometimes yield a loop that is easier to understand. A continue statement in a loop causes an immediate jump to the while or for loop header statement. A continue statement can improve the readability of a loop.

5.11 Loop else
The loop else construct executes if the loop completes normally, that is, without executing a break statement. Ex: A special message "All names printed" can be displayed once an entire list of names has been completely iterated through.

5.12 Getting both index and value when looping: enumerate()
The enumerate() function retrieves both the index and corresponding element value at the same time, providing a cleaner and more readable solution. Unpacking is a process that performs multiple assignments at once, binding comma-separated names on the left to the elements of a sequence on the right. Ex: num1, num2 = [350, 400] is equivalent to the statements num1 = 350 and num2 = 400.

6. Strings

6.1 String slicing
An index is an integer matching a specific position in a string's sequence of characters. An individual character is read using an index surrounded by brackets. Ex: my_str[5] reads the character at index 5 of the string my_str. Indices start at 0, so index 5 is a reference to the 6th character in the string.
Slice notation has the form my_str[start:end], which creates a new string whose value contains the characters of my_str from index start to index end - 1. If my_str is 'Boggle', then my_str[0:3] yields string 'Bog'.
Slice notation also accepts an optional stride, as in my_str[start:end:stride]. The stride determines how much to increment the index after reading each element.
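The indexing, slicing, and stride rules from 6.1 can be checked with a short sketch that reuses the 'Boggle' string from the text. The variable names sixth_char, first_three, and every_second are hypothetical.

```python
my_str = 'Boggle'

sixth_char = my_str[5]        # index 5 is the 6th character
first_three = my_str[0:3]     # characters at indices 0, 1, and 2
every_second = my_str[0:6:2]  # stride of 2: indices 0, 2, and 4

print(sixth_char)    # e
print(first_three)   # Bog
print(every_second)  # Bgl
```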
6.2 Advanced string formatting
A format specification may include a field width that defines the minimum number of characters that must be inserted into the string. If the replacement value is smaller in size than the given field width, then the string is padded with space characters. Setting a field width on each column of tabular output keeps the columns aligned. A field width is defined in a format specification by including an integer after the colon, as in {name:16} to specify a width of 16 characters. Numbers will be right-aligned within the width by default, whereas most other types like strings will be left-aligned.
A format specification can include an alignment character that determines how a value should be aligned within the width of the field. Alignment is set in a format specification by adding a special character before the field width integer. The basic set of possible alignment options includes left-aligned (<), right-aligned (>), and centered (^).
The optional precision component of a format specification indicates how many digits should be included in the output of floating types. The precision follows the field width component in the format specification, if a width is specified at all, and starts with a period character. Ex: f'{1.725:.1f}' indicates a precision of 1, thus the resulting string would be '1.7'.

6.3 String methods
replace(old, new): Returns a copy of the string with all occurrences of the substring old replaced by the string new. The old and new arguments may be string variables or string literals.
replace(old, new, count): Same as above, except replace(old, new, count) only replaces the first count occurrences of old.
find(x): Returns the index of the first occurrence of item x in the string; if x is not found, find(x) returns -1. x may be a string variable or string literal.
Recall that in a string, the index of the first character is 0, not 1. If my_str is 'Boo Hoo!':
   o my_str.find('!') # Returns 7
   o my_str.find('Boo') # Returns 0
   o my_str.find('oo') # Returns 1 (first occurrence only)
find(x, start): Same as find(x), but begins the search at index start:
   o my_str.find('oo', 2) # Returns 5
find(x, start, end): Same as find(x, start), but stops the search at index end - 1:
   o my_str.find('oo', 2, 4) # Returns -1 (not found)
rfind(x): Same as find(x) but searches the string in reverse, returning the last occurrence in the string.
count(x): Returns the number of times x occurs in the string.
   o my_str.count('oo') # Returns 2
Methods that check a string value and return a True or False Boolean value:
   o isalnum(): Returns True if all characters in the string are lowercase or uppercase letters, or the numbers 0-9.
   o isdigit(): Returns True if all characters are the numbers 0-9.
   o islower(): Returns True if all cased characters are lowercase letters.
   o isupper(): Returns True if all cased characters are uppercase letters.
   o isspace(): Returns True if all characters are whitespace.
   o startswith(x): Returns True if the string starts with x.
   o endswith(x): Returns True if the string ends with x.
Methods to create new strings:
   o capitalize(): Returns a copy of the string with the first character capitalized and the rest lowercased.
   o lower(): Returns a copy of the string with all characters lowercased.
   o upper(): Returns a copy of the string with all characters uppercased.
   o strip(): Returns a copy of the string with leading and trailing whitespace removed.
   o title(): Returns a copy of the string as a title, with first letters of words capitalized.

6.4 Splitting and joining strings
The string method split() splits a string into a list of tokens. Each token is a substring that forms a part of a larger string.
A separator is a character or sequence of characters that indicates where to split the string into tokens. The join() string method performs the inverse operation of split() by joining a list of strings together to create a single string.

7. Functions

7.1 User-defined function basics
Program redundancy can be reduced by creating a grouping of predefined statements for repeated operations, known as a function. Even without redundancy, functions can prevent a main program from becoming large and confusing.
A function is a named series of statements. A function definition consists of the function's name and a block of statements. Ex: def calc_pizza_area(): is followed by an indented block of statements. A function call is an invocation of the function's name, causing the function's statements to execute. The def keyword is used to create new functions.
A function may return one value using a return statement. A function with no return statement, or a return statement with no following expression, returns the value None. None is a special keyword that indicates no value.
A programmer can influence a function's behavior via an input. A parameter is a function input specified in a function definition. Ex: A pizza area function might have diameter as an input. An argument is a value provided to a function's parameter during a function call. Ex: A pizza area function might be called as calc_pizza_area(12.0) or as calc_pizza_area(16.0).
A function's statements may include function calls, known as hierarchical function calls or nested function calls.

7.2 Print functions
A function that only prints typically does not return a value. A function with no return statement is called a void function, and such a function returns the value None.

7.3 Dynamic typing
A function's ability to operate on arguments of different types, such as adding together two integers or two strings, is a concept called polymorphism. Polymorphism is an inherent part of the Python language.
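A minimal sketch of polymorphism follows, assuming a hypothetical function named add() that simply applies the + operator to its two arguments. The same function body works for integers, strings, and lists because the meaning of + is determined by the types of the objects passed in.

```python
def add(x, y):
    """Return x + y; the behavior of + depends on the argument types."""
    return x + y


int_sum = add(5, 7)                 # integer addition
str_sum = add('Tora', 'Bora')       # string concatenation
list_sum = add([1, 2], [3])         # list concatenation

print(int_sum)    # 12
print(str_sum)    # ToraBora
print(list_sum)   # [1, 2, 3]
```

Mixing incompatible types, such as add(5, 'a'), is not caught ahead of time; it produces a TypeError only when that call executes.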
Python uses dynamic typing to determine the type of objects as a program executes. Many other languages like C, C++, and Java use static typing, which requires the programmer to define the type of every variable and every function parameter in a program's source code. Dynamic typing typically allows for more flexibility of the code that a programmer can write, but at the expense of potentially introducing more bugs, since there is no compilation process by which types can be checked.

7.4 Reasons for defining functions
Modular development is the process of dividing a program into separate modules that can be developed and tested separately and integrated into a single program. A function can be defined once, then called from multiple places in a program, thus avoiding redundant code. Examples of such functions are math module functions like sqrt() that relieve a programmer from having to write several lines of code each time a square root needs to be computed.

7.5 Writing mathematical functions
A function is commonly defined to compute a mathematical calculation involving several numerical parameters and returning a numerical result.

7.6 Function stubs
Programs are written using incremental development, meaning a small amount of code is written and tested, then a small amount more (an incremental amount) is written and tested, and so on. To assist with the incremental development process, programmers commonly introduce function stubs, which are function definitions whose statements haven't been written yet. The benefit of a function stub is that the high-level behavior of the program can be captured before diving into details of each function, akin to planning the route of a road trip before starting to drive. Capturing high-level behavior first may lead to better-organized code, reduced development time, and code with fewer bugs.
In some cases, a programmer may want a program to stop executing if an unfinished function is called.
In such cases, a NotImplementedError can be generated with the statement raise NotImplementedError. The NotImplementedError indicates that the function is not implemented and causes the program to stop execution. NotImplementedError and the "raise" keyword are explored elsewhere in material focusing on exceptions.

7.7 Functions with branches/loops
A function's block of statements may include branches, loops, and other statements.

7.8 Functions are objects
A part of the value of a function object is compiled bytecode that represents the statements to be executed by the function. A bytecode is a low-level operation, such as adding, subtracting, or loading from memory.

7.9 Functions: Common errors
A common error is to copy and paste code among functions but not complete all necessary modifications to the pasted code. Another common error is to return the wrong variable, such as accidentally writing return temperature in a temperature conversion program. The function will work and sometimes even return the correct value.

7.10 Scope of variables and functions
A variable or function object is only visible to part of a program, known as the object's scope. When a variable is created inside a function, the variable's scope is limited to inside that function. In fact, because a variable's name does not exist until bound to an object, the variable's scope is actually limited to after the first assignment of the variable until the end of the function. Such variables defined inside a function are called local variables. In contrast, a variable defined outside of a function is called a global variable. A global variable's scope extends from the assignment to the end of the file and can be accessed inside of functions.
A global statement must be used to change the value of a global variable inside of a function.

7.11 Namespaces and scope resolution
A namespace maps names to objects. The Python interpreter uses namespaces to track all of the objects in a program. A namespace is a normal Python dictionary whose keys are the names and whose values are the objects.
Scope is the area of code where a name is visible. Namespaces are used to make scope work. Each scope, such as global scope or a local function scope, has its own namespace. If a namespace contains a name at a specific location in the code, then that name is visible and a programmer can use it in an expression.
When a name is referenced in code, the local scope's namespace is the first checked, followed by the global scope, and finally the built-in scope. If the name cannot be found in any namespace, the interpreter generates a NameError. The process of searching for a name in the available namespaces is called scope resolution.

7.12 Function arguments
Arguments to functions are passed by object reference, a concept known in Python as pass-by-assignment. When a function is called, new local variables are created in the function's local namespace by binding the names in the parameter list to the passed arguments.
If the object is immutable, such as a string or integer, then the modification is limited to inside the function. Any modification to an immutable object results in the creation of a new object in the function's local scope, thus leaving the original argument object unchanged.
If the object is mutable, then in-place modification of the object is seen outside the scope of the function. Any operation like adding elements to a container or sorting a list that is performed within a function will also affect any other variables in the program that reference the same object.
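The difference between mutable and immutable arguments described in 7.12 can be sketched as follows. The function names append_item() and add_one() are hypothetical, introduced only for this illustration.

```python
def append_item(items):
    """In-place modification of a mutable list is visible to the caller."""
    items.append('new')


def add_one(number):
    """Rebinding the local name creates a new int object;
    the caller's object is left unchanged."""
    number = number + 1
    return number


my_list = ['a', 'b']
my_num = 5

append_item(my_list)        # modifies the same list object my_list refers to
result = add_one(my_num)    # my_num still refers to the original 5

print(my_list)   # ['a', 'b', 'new']
print(my_num)    # 5
print(result)    # 6
```

A function that should not alter the caller's list could work on a copy instead, for example by calling list(items) before modifying it.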
7.13 Keyword arguments and default parameter values
Python provides for keyword arguments that allow arguments to map to parameters by name, instead of implicitly by position in the argument list. When using keyword arguments, the argument list does not need to follow a specific order.
A function can have a default parameter value for one or more parameters, meaning that a function call can optionally omit an argument, and the default parameter value will be substituted for the corresponding omitted argument. A parameter's default value is the value used in the absence of an argument in the function call.

7.14 Arbitrary argument lists
A function definition can include an *args parameter that collects optional positional arguments into an arbitrary argument list tuple. Adding a final function parameter of **kwargs, short for keyword arguments, creates a dictionary containing "extra" arguments not defined in the function definition.

7.15 Multiple function outputs
Function return statements are limited to returning only one value. A workaround is to package the multiple outputs into a single container, commonly a tuple, and return that container. Unpacking is an operation that allows a statement to perform multiple assignments at once to variables in a tuple or list.

7.16 Help! Using docstrings to document functions
A docstring is a string literal placed in the first line of a function body. The help() function can aid a programmer by providing them with all the documentation associated with an object.

7.17 Engineering examples
No new information.

8. Files

8.1 Reading files
A common programming task is to retrieve input from a file using the built-in open() function instead of using keyboard entry. The file.close() method closes the file, after which no more reads or writes to the file are allowed. The file.read() method returns the file contents as a string.
The file.readlines() method returns a list of strings, where the first element is the contents of the first line, the second element is the contents of the second line, and so on. Both methods can be given an optional argument that specifies the number of bytes to read from the file. Each method stops reading when the end-of-file (EOF) is detected, which indicates no more data is available.

8.2 Writing files
The file.write() method writes a string argument to a file. A mode indicates how a file is opened, such as whether or not writing to the file is allowed, if existing contents of the file are overwritten or appended, etc.
The flush() file method can be called to force the interpreter to flush the output buffer to disk. Additionally, the os.fsync() function may have to be called on some operating systems. Closing an open file also flushes the output buffer.

8.3 Interacting with file systems
The computer's operating system, such as Windows or macOS, controls the file system, and a program must use functions supplied by the operating system to interact with files. The Python standard library's os module provides an interface to operating system function calls and is thus a critical piece of a Python programmer's toolbox.
Portability, the ability to access an item easily from multiple locations, must be considered when reading and writing files outside the executing program's directory since file path representations often differ between operating systems. The character between directories, "\\" or "/", is called the path separator, and using the incorrect path separator may result in that file not being found. os.path.sep stores the path separator for the current operating system.
The os.walk() function "walks" a directory tree, visiting each subdirectory in the specified path.
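The file and file-system operations from 8.1 through 8.3 can be combined in a short sketch. The directory name logs and the file name log.txt are hypothetical, and the sketch works inside a temporary directory so no existing files are touched. It also uses os.path.join(), which is not covered above but builds a path with the correct separator for the current operating system, and the with statement, which closes each file automatically.

```python
import os
import tempfile

# Build a small, hypothetical directory tree in a temporary location.
with tempfile.TemporaryDirectory() as root:
    logs_dir = os.path.join(root, 'logs')   # uses os.path.sep internally
    os.mkdir(logs_dir)

    file_path = os.path.join(logs_dir, 'log.txt')

    # Mode 'w' opens the file for writing, overwriting existing contents.
    with open(file_path, 'w') as f:
        f.write('first line\n')
        f.write('second line\n')

    # Mode 'r' opens the file for reading.
    with open(file_path, 'r') as f:
        lines = f.readlines()               # one list element per line

    # os.walk() yields (directory path, subdirectory names, file names)
    # for the starting directory and each subdirectory beneath it.
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.append(os.path.relpath(full_path, root))

print(lines)   # ['first line\n', 'second line\n']
print(found)   # one entry: the path to log.txt relative to root
```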
8.4 Binary data
Some files consist of data stored as a sequence of bytes, known as binary data, that is not encoded into readable text using an encoding like ASCII or UTF-8. A bytes object is used to represent a sequence of single byte values, such as binary data read from a file. Bytes objects are immutable, just like strings, meaning the value of a bytes object cannot change once created. A bytes object can be created using the bytes() built-in function:
bytes('A text string', 'ascii'): creates a sequence of bytes by encoding the string using ASCII.
bytes(100): creates a sequence of 100 bytes whose values are all 0.
bytes([12, 15, 20]): creates a sequence of 3 bytes with values from the list.
Programs can also access files using a binary file mode by adding a "b" character to the end of the mode string in a call to open(), as in open('myfile.txt', 'rb').
The struct module is a commonly used Python standard library module for packing values into sequences of bytes and unpacking sequences of bytes into values (like integers and strings). The struct.pack() function packs values such as strings and integers into sequences of bytes. The ">" character places the most significant byte first (big-endian), and "