Python Data Types PDF

Hello, Python: Data types

Grundlagen der Programmierung in Python (Fundamentals of Programming in Python)
Bio Data Science, 2024
Dmitrij Turaev

Contents

1 Programming is based on a few simple concepts
2 A brief history of Python
3 Python is a high-level programming language
4 The interactive interpreter lets you run Python code interactively
5 Know your interpreter
6 Expressions are values, statements are code
7 First Python program: "Hello, world"
8 Boolean (= logical) values and expressions
    8.1 Every object can be evaluated in boolean context
    8.2 Boolean operators
    8.3 Comparison operators
9 String operations produce new strings
10 Objects can be given names
11 One object can have multiple names and name bindings
12 Mutable objects can change their values

1 Programming is based on a few simple concepts

A program is a set of instructions that tell the computer how to solve a problem, i.e. how to convert inputs into outputs. This is similar to a cooking recipe, which tells you how to convert food products (the input) into a tasty dish (the output).

- First, you need to clearly define what you want to do (what the inputs and outputs are).
- Then you can think about how to do it (give precise instructions for converting inputs into outputs). You can write this as pseudocode first (on paper, if you like) and implement it as program code later.
- The key is to break up a large problem into smaller problems that are easy to solve.

All computer programs can be made from a few simple ideas:

1. Sequence: a list of consecutive instructions
2. Decision: if a condition is true, then outcome A, else outcome B
3. Repetition: repeat instructions a given number of times, or until something happens

[Figure: Main control flow constructs (T = True, F = False)]

Here is an example.

Problem: Given two strings ransomNote and magazine, return true if ransomNote can be constructed using the letters from magazine, and false otherwise. Each letter in magazine can only be used once in ransomNote.
Input: string ransomNote, string magazine
Output: true or false

Example 1: Input: ransomNote = "aa", magazine = "aab". Output: true
Example 2: Input: ransomNote = "aa", magazine = "abb". Output: false
Example 3: Input: ransomNote = "i love you", magazine = "the quick brown fox jumped over the lazy dog". Output: true

Give step-by-step instructions for solving this problem.

Solution: Here is one possible approach:

    For each letter in ransomNote:
        For each letter in magazine:
            If it's the same letter, delete it from magazine, and stop inner loop
        If letter wasn't found, return "false" and stop
    Return "true" and stop

There are two nested loops here, because for each letter in ransomNote you need to look through possibly all letters of magazine until a match is found. If it's found, you stop the inner loop, go to the next letter of ransomNote (the outer loop), and then again search possibly all letters of magazine, etc.

If you apply this algorithm in your mind to the examples above, you'll see that it works. That's great! But you probably noticed that we repeated the same operation (searching all letters of magazine, the inner loop) many times (as many times as there are letters in ransomNote). Let's see if we can do better. Here is another approach:

    For each letter in magazine:
        Count how often it occurred (save the count in a count table)
    For each letter in ransomNote:
        If it doesn't occur in the count table, or its count is 0, return "false" and stop
        Otherwise, reduce its count by 1
    Return "true" and stop

Again, there are two loops. But they are not nested. Instead, you go once through all letters of magazine, and then you go once through all letters of ransomNote. This is a much more efficient solution: with n letters in ransomNote and m letters in magazine, the first approach needs up to n × m comparisons, the second only about n + m steps.

2 A brief history of Python

Python was created around 1990 by Guido van Rossum, a Dutch programmer. It was strongly influenced by several preceding languages.
I know what you think: Guido loved snakes. That's not true, though. Guido loved the British TV show Monty Python's Flying Circus. The official Python documentation claims that it helps if you also like Monty Python.

Python 2.0 was released in 2000. Its final version was Python 2.7.18, released in 2020. Some people are still using it. However, this is the last version of Python 2; development has focused completely on Python 3.

Python 3.0 was released in 2008. It introduced several changes that made it backwards incompatible (Python 2 code didn't work with Python 3). Some of the core functionality was changed to make the language more consistent and future-proof. Not all software projects survive such a breaking point; Python did. All important libraries have been ported to Python 3.

Python has a regular release cycle (PEP 602). New Python 3 versions introduce new features, but remain backwards compatible.

Python development is based on PEPs, Python Enhancement Proposals. "A PEP is a design document providing information to the Python community, or describing a new feature for Python or its processes or environment" (PEP 1).

Currently, Python is probably the most popular scripting language for data science and machine learning. There are several interesting competitors, but they haven't caught up yet.

[Figure: Wrong Python version?]

3 Python is a high-level programming language

"From the program in its human-readable form of source code, a compiler or assembler can derive machine code, a form consisting of instructions that the computer can directly execute. Alternatively, a computer program may be executed with the aid of an interpreter." (Wikipedia: Computer program)

"In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate (or even hide entirely) significant areas of computing systems (e.g. memory management), making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is." (Wikipedia: High-level programming language)

The reality is slightly more complicated, and the distinction between compiled and interpreted languages is not always clear. For example, Python is considered an interpreted language. However, Python code is compiled to bytecode before execution. The bytecode is a low-level, platform-independent representation of your source code. Python scripts have the .py file ending, and bytecode files have the .pyc file ending. The bytecode is then sent for execution to the Python Virtual Machine, which is specific to the target machine.

The default implementation of the Python language engine (the program that runs Python programs) is CPython, which is written in the C programming language (Stackoverflow). CPython compiles the Python source code into bytecode, and this bytecode is then executed by the CPython virtual machine.

[Figure: High level vs. low level languages]

Tradeoff: Efficiency (execution speed) ⟷ Readability (high level of abstraction)

Self-test:
a. Difference between high-level and low-level languages?
b. Difference between interpreted and compiled languages?

[Figure: High level or low level?]

4 The interactive interpreter lets you run Python code interactively

- Start the interactive Python interpreter by entering python in the terminal.
- Use quit(), exit() or Ctrl + D to exit the interpreter.

Which Python version are you using?

Solution:

    $ which python
    $ python --version

A REPL (Read, Evaluate, Print, Loop) is an interactive way to execute Python code. You just type your commands and hit return for the computer to evaluate them.
The Python interpreter:

1. Reads the user input (your Python commands)
2. Evaluates your code (to work out what you mean)
3. Prints any results (so you can see the outcome)
4. Loops back to step 1 (to continue the conversation)

Programmers use the REPL to execute pieces of code, often to test ideas and explore problems. Because of the instant feedback you get, the REPL makes it easy to explore all the functionality that Python has to offer.

The interpreter tells you it's waiting for instructions by presenting you with three chevrons ( >>> ). The easiest way to use Python is as a calculator, to add, multiply, subtract and divide numbers. Type something and see what happens.

    >>> 4 * 5    # multiplication
    20
    >>> 2 / 3    # division (also try this in Python 2)
    0.6666666666666666
    >>> 2 ** 3   # exponent
    8
    >>> 7 % 2    # modulo (remainder of the division)
    1

What about square root? Additional functionality is often outsourced into modules (the generic term is "library"). Modules have to be imported to access this functionality:

    >>> import math      # import the math module
    >>> math.sqrt(16)    # function "sqrt" from the math module calculates the square root
    4.0

- A module is a file with Python code. It contains definitions (functions, classes and variables) and statements (instructions).
- The module name is the file name without the file name extension .py
- Many modules are provided with Python as part of the standard library. Some of them are written in C ("built-in" modules), most are written in Python.
- Many more third-party modules can be additionally installed.

We just imported a Python module, called a function from this module and passed it an argument. That's pretty advanced! Let's try something else:

    >>> "abc" + 3
    >>> "abc" + "def"
    >>> "abc" - "def"
    >>> "abc" * 3

We can see that numbers and strings behave in different ways. How does Python know which is which? Every object in Python has a type. (Bash has no comparable type system, which is one of the reasons why its capabilities are very limited.)
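To see how the operand types decide which operations are allowed, the four string expressions above can be run in one go. This is only a sketch: the names attempts and results are illustrative, and the standard-library operator module is used just to pass +, - and * around as functions.

```python
import operator

# (source text, left operand, operator function, right operand)
attempts = [
    ('"abc" + 3', "abc", operator.add, 3),
    ('"abc" + "def"', "abc", operator.add, "def"),
    ('"abc" - "def"', "abc", operator.sub, "def"),
    ('"abc" * 3', "abc", operator.mul, 3),
]

results = {}
for text, left, operation, right in attempts:
    try:
        results[text] = operation(left, right)
    except TypeError as error:
        # Python refuses operations that the operand types do not define.
        results[text] = f"TypeError: {error}"

for text, outcome in results.items():
    print(f"{text:15} -> {outcome}")
```

Two of the expressions produce new strings, and the other two raise a TypeError, because str does not define adding an integer or subtracting a string.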
"In computer science and computer programming, a data type or simply type is an attribute of data which tells the compiler or interpreter how the programmer intends to use the data. Most programming languages support common data types of real, integer and boolean. ... This data type defines the operations that can be done on the data, the meaning of the data, and the way values of that type can be stored." (Wikipedia: Data type)

The Python function type returns the type (= class) of an object:

    >>> type("abc")   # determine type of object "abc"
    >>> type(123)
    >>> type(True)
    >>> type(None)
    >>> type(Nothing)

1. How many arguments does the function type accept?
2. How many arguments does the function sqrt from the math module accept?

Solution:
1. In the example above, type accepts one argument and returns the object type. Functions may accept arguments, and always return something. An argument is an object that you pass to a function.
2. math.sqrt accepts one argument.

Here are some types we've seen so far:

- int: integer
- float: floating point; examples of scientific notation with base/exponent: 2.34e5, 2e-3
- str: string; " or ' enclose strings, triple quotes enclose multiline strings
- bool: boolean, only two possible values: True and False
- None: special "NoneType", only one possible value: None ("Explicit is better than implicit", Zen of Python)

None is used when you want to explicitly state that something exists but doesn't have a value. Many other programming languages call it NULL. This is sometimes useful, you'll see examples later.

What is the result type of the following expressions?

    2 + 1
    2.0 + 1
    "abc" * 2
    True and False

Solution: Find out using the type function, e.g. type(2 + 1) or type('abc' * 2). This works because first the expression (e.g. 2 + 1) is evaluated, and then the result (3) is passed to the type function, so that it receives one argument. You could also write type(3) or type("abcabc") instead.
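For reference, the result types of the four expressions can be collected in one go. This is a minimal sketch; the names values and result_types are just illustrative choices.

```python
# Each expression is evaluated first; type() then receives the resulting object.
values = {
    "2 + 1": 2 + 1,
    "2.0 + 1": 2.0 + 1,
    '"abc" * 2': "abc" * 2,
    "True and False": True and False,
}

# Map each expression text to the name of its result type.
result_types = {text: type(value).__name__ for text, value in values.items()}

for text, type_name in result_types.items():
    print(f"{text:15} -> {type_name}")
```

Note that mixing an integer and a float yields a float.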
Another function tests if an object has a particular type:

    >>> isinstance("abc", str)   # test if an object is an instance of a particular type/class

- Almost everything in Python is an object. How can you tell? Easy: if you pass the object as argument to the type function, and type doesn't throw an error, then it's an object.
- Try it: type(math). You can see that the imported module is represented by an object.

Self-test:
a. What is a Python module? Example?
b. What property does every Python object have?
c. What is a data type?
d. What Python types do you know so far?
e. Which functions provide information about the object type?

[Figure: If toilet paper was a data type]
[Figure: Difference between 0, None and not defined]

5 Know your interpreter

The interactive interpreter is very useful for running a few lines of code, e.g. performing small calculations and testing code snippets. Like the Bash shell, the Python interpreter knows several keyboard shortcuts:

    ↑ / ↓       scroll in command history
    Ctrl + A    go to beginning of line
    Ctrl + E    go to end of line
    Ctrl + →    jump one word to the right
    Ctrl + ←    jump one word to the left
    Ctrl + K    delete rest of line after cursor
    Ctrl + _    undo
    Ctrl + R    search command history

and, of course: Tab for autocompletion.

IPython is an interactive Python shell with additional abilities. The project goal was to create a comprehensive environment for interactive and exploratory computing. A part of the project, most importantly the notebook and related tools, was moved into a separate project, Jupyter.

You can run IPython from the command line by entering ipython in an environment where it's installed. The IPython kernel (the program that runs and introspects the user's code) is also used by tools like Jupyter (jupyterlab.readthedocs.io) and Spyder.

You can always use IPython instead of Python. It has several advantages:

- Syntax highlighting: e.g. functions are green, strings are orange, and module imports are blue
- Autocompletion: You remember the sqrt function from the math module? Wondering what other functions math has? Type math. and press Tab. When you go through the functions, IPython also tells you how many arguments they accept
- Copy/paste of multi-line code: This is not reliably possible with the regular Python interpreter

Here is another of IPython's superabilities: any command that works in Bash can be used in IPython by prefixing it with the ! character.

    !which python
    !python --version
    !which ipython
    !ipython --version

IPython also has a number of "magic commands" (IPython magics), which are sometimes useful. E.g.:

- %paste or %cpaste let you paste multi-line code snippets (if regular copy-paste doesn't work as intended)
- %lsmagic lists all available magic functions
- %quickref shows a quick reference sheet

The regular Python interpreter has three chevrons ( >>> ) as prompt, while IPython uses a numbered prompt ( In [1]: ). Many code examples in this and other chapters use the >>> prompt by convention. You can execute them in IPython.

[Additional information]

JupyterLab is a web-based user interface for Project Jupyter (jupyter.org). It enables you to work with Jupyter notebooks, text editors and terminals.

Jupyter notebooks (.ipynb files) are documents that combine live runnable code with narrative text (Markdown) and output/visualizations.

A notebook kernel is a "computational engine" that executes the code in the notebook file. (In this architecture, the web interface in the browser is called "frontend", and the kernel that runs in the background is called "backend".) The default is the IPython kernel. It is very similar to the IPython running in the terminal, but there are a few minor differences. For example, not all output is printed by default (this can be changed, see Stackoverflow).

A notebook consists of a sequence of cells.
"A code cell allows you to edit and write new code, with syntax highlighting and tab completion. The programming language you use depends on the kernel, and the default kernel (IPython) runs Python code. When a code cell is executed, code that it contains is sent to the kernel associated with the notebook. The results that are returned from this computation are then displayed in the notebook as the cell's output. The output can be text, figures, HTML tables and more. This is known as IPython's rich display capability." (Jupyter docs)

[Figure: The Jupyter notebook interface]

In addition to running your code, Jupyter stores code and output, together with markdown notes, in an editable document called a notebook. When you save it, this is sent from your browser to the notebook server, which saves it on disk as a text document (JSON format) with a .ipynb extension.

The notebook server, not the kernel, is responsible for saving and loading notebooks, so you can edit notebooks even if you don't have the kernel for that language; you just won't be able to run code. The kernel doesn't know anything about the notebook document: it just gets sent cells of code to execute when the user runs them.

Which Jupyter version are you using?

Solution:

    !jupyter --version

6 Expressions are values, statements are code

You will often hear the terms "expressions" and "statements". What do they mean?

- An expression is something that evaluates to a single value
- A statement is code that does something, like assigning a variable or displaying a value

An expression is a combination of values, variables, operators, and function calls that is evaluated to one resulting value (Stackoverflow). The interpreter evaluates expressions interactively:

    >>> 2 + 3
    5
    >>> 1 + 2 + 3 * (8 ** 9) - math.sqrt(4.0)
    402653185.0
    >>> 23    # A value all by itself is also a (simple) expression
    23
    >>> 4.0
    4.0

Operator, operand, literal

An operator is a symbol that represents an action.
It tells the interpreter to perform a specific mathematical or logical operation on the operands and produce the final result. The operands are the values the operator acts on: in 2 + 3, the operator is + and the operands are 2 and 3.

A literal is the literal notation for representing a fixed (constant) value, e.g. 42, 3.14, 1.6e-10 (numeric literals), or "Hello, world" (string literals). If you need a number in your code, and you are not reading it from a file, or from the keyboard, or from a database, or calculating it, or importing it from a module, then you can use a numeric literal.

More examples of expressions:

    >>> min(2, 22)    # function calls are always evaluated to one value
    2
    >>> max(3, 94)
    94
    >>> round(81.5)
    82
    >>> math.pi * 2
    6.283185307179586
    >>> "foo"    # every value is also an expression
    'foo'
    >>> "foo" + "bar"
    'foobar'
    >>> "abc" * 2
    'abcabc'
    >>> None    # the interactive interpreter displays nothing for None
    >>> True    # this is a value of the type "bool"
    True

If you ask Python to print an expression, the interpreter evaluates the expression and prints the result:

    >>> print(min(max(3, 10), 5))    # evaluation order: max(3, 10) → min(10, 5) → print(5)
    5

Statements are everything that can make up a line (or several lines) of Python code. Statements usually do something. Examples of statements are loops and conditionals. In Python, expressions are also considered statements. Examples:

    >>> x = 17    # Variable assignment: does not return a value, but is simply executed
    >>> if x == 17: print("hello")
    hello
    >>> print(42)
    42
    >>> 3 + 7
    10

Self-test:
a. What is an expression?
b. What are examples of statements?
c. What is an operator? What is an operand?

7 First Python program: "Hello, world"

The interpreter is great for testing short pieces of code. Anything longer than a few lines should go in a script, to make it reproducible.

A Python script is a text file with Python instructions. It can be saved, modified and executed. The script is run by the Python interpreter as if the commands were entered line by line. There is, however, a small difference.
If you type an expression in interactive mode, the interpreter evaluates it and displays the result:

    >>> 1 + 1
    2

But in a script, an expression all by itself doesn't do anything! You need to print the value explicitly using the print function.

1. Open a new file in a text editor (VS Code, Spyder, JupyterLab editor, vim, ...)
2. Enter these lines:

       #!/usr/bin/env python
       print("Hello, World!")

3. Save the file under an expressive file name, for example hello_world.py
   - The file extension .py is optional, but strongly recommended
   - General rules for file names:
     - Only alphanumeric characters and the characters _ , - and .
     - No umlauts, no spaces, no special characters!
     - English names (you never know who is going to read it later)
     - Descriptive names: e.g., guess_what.py is not as good as calculate_protein_properties.py
4. Run the program in a terminal: python path/to/hello_world.py
   If the script is in your current directory: python hello_world.py

You should see the output of your first program! Although it's pretty short, it's a fully valid program. Congrats, you're officially a Python programmer now.

TODO:
1. What is the difference between the shebang lines #!/usr/bin/env python and #!/usr/bin/python ?
2. What is required to be able to run the program using the syntax $ path/to/hello_world.py ?
3. How many arguments did you pass to the print function?
4. Does the print function accept more than one argument? Try it in the interpreter, then in your script. (If multiple arguments are passed to a function, they are separated by commas; e.g. min(2, 22) → 2 arguments were passed to the min function.)
5. Add some more print calls that output the values of some expressions/functions of your choice. E.g., what will print(min(2, 22)) do?

Solution:
1. First case: You want to use the currently active python executable, as given by $ which python. Second case: You want to use one specific Python executable at a fixed path.
2. Make it executable: $ chmod +x hello_world.py
3. One argument.
4. Yes, it accepts an unlimited number of arguments. (If you look at the help message of the print function, which you can do by typing print? in IPython, you'll see that it says print(*args, sep=' ', end='\n', file=None, flush=False). The notation *args signifies an unlimited number of arguments; the rest are optional arguments, similar to command options in shell commands.)
5. It will evaluate the expression and print the resulting value.

Important note

You should be very clear on the distinction between a Python script and a Jupyter notebook.

A Python script is a plain text file with Python code and the file extension .py. It can be opened and edited in any text editor. It is run in the terminal from the Bash command line, and the standard output also goes to the terminal (unless redirected to a file). When you run the script, it is executed completely from top to bottom. This is the traditional way to write and execute Python code, and the most useful approach for many applications.

A Jupyter notebook is a text file with a more complex format that distinguishes text and code cells, and includes code output and plots. It is edited in a specialized IDE like JupyterLab or VS Code. You can run code cells separately and out of order. Conceptually, this is very similar to working in the interactive IPython interpreter. Notebooks are well-suited for data science and exploratory data analysis, reproducible workflows, presentations and teaching.

Note that if an exercise explicitly asks you to write a Python script, you should write a Python script and demonstrate its execution in the terminal. Generally, you should be able to use both approaches, and pick the one that is more suited to your use case.

Self-test:
a. What are two ways of executing Python code?
b. What is a Python script?
c. What is the shebang line?
d. What is a function argument?
e. How do you print something in Python?
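To make the optional arguments of print (sep, end, file) from the solution above more concrete, here is a small sketch. It writes into an in-memory text stream instead of the terminal so that the exact output can be inspected; the names buffer and captured are illustrative.

```python
import io

buffer = io.StringIO()  # in-memory text stream standing in for the terminal

# Several positional arguments, joined by the default separator " ".
print("value:", 42, file=buffer)

# sep replaces the separator between the arguments.
print("A", "C", "G", "T", sep="-", file=buffer)

# end replaces the newline that print normally appends.
print("no newline", end="", file=buffer)

captured = buffer.getvalue()
print(repr(captured))  # repr makes the newline characters visible
```

Without file=buffer, the same calls would write directly to the terminal, which is what a script normally does.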
8 Boolean (= logical) values and expressions

Boolean logic lets you evaluate conditions and decide what operations your programs will execute, depending on the truth value of those conditions. It is one of the constructs that determine the program flow.

A Boolean expression (logical expression) evaluates to true or false. In Python, this is represented by a dedicated Boolean type, with two constants: True and False.

8.1 Every object can be evaluated in boolean context

Some functions and operations directly return Boolean objects, e.g. isinstance(5, int). However, an object of any type can be evaluated as Boolean:

1. explicitly, using the bool function
2. implicitly, in Boolean context, e.g. in an if statement

Explicit evaluation:

    >>> bool("abc")   # non-empty strings are True
    True
    >>> bool("")      # 0 and empty values of any type are False
    False
    >>> bool(0)       # 0 is False
    False
    >>> bool(5)       # other integers are True
    True

Implicit evaluation happens in conditional statements (if) and in while loops. Objects that evaluate to True or False in this way are called "truthy" and "falsy" (Stackoverflow, freecodecamp.org). In such cases, bool(object) is called behind the scenes:

    if "abc":    # implicit evaluation of a string
        print("hello abc")
    if 5:        # implicit evaluation of an integer
        print("hello 5")
    if 0:        # 0 is False, remember?
        print("hello 0")
    if None:     # None is False
        print("hello None")
    i = -3
    while i:
        print("loop variable:", i)
        i = i + 1

How can you tell if the function isinstance returns a boolean type?

Solution: Test it in the interpreter, e.g. type(isinstance(5, int)). Just for fun, you can check the type of the return value of the isinstance function using the same function: isinstance(isinstance(5, int), bool). The inner expression is evaluated first, therefore this is the same as writing isinstance(True, bool).

8.2 Boolean operators

You saw numerical operators (like +, -, *, **, /, %) and string operators (like + and *).
The Boolean operators and, or and not modify and join together expressions evaluated in Boolean context to create more complex conditions:

not: reverses the Boolean value of the operand (not is a unary operator, and requires only one operand)

    >>> not True    # reverse "True"
    False
    >>> not 0
    True
    >>> not isinstance(3, int)
    False

The operators or and and return one of their operands, similar to Bash. This effectively corresponds to Boolean logic.

- and: returns the right side if the left side is truthy, otherwise the left side; the result is truthy only if both operands are truthy
- or: returns the right side if the left side is falsy, otherwise the left side; the result is truthy if at least one operand is truthy

    >>> True and False   # evaluates to True only if both parts evaluate to True
    False
    >>> True or False    # evaluates to True if at least one part evaluates to True
    True
    >>> 3 and 5          # evaluates to right side (corresponds to True, "truthy")
    5
    >>> 3 or 5           # evaluates to left side (corresponds to True, "truthy")
    3

1. Why do Boolean operators behave like that with non-boolean operands (e.g. 3 and 5 vs. 3 or 5)?
2. What will be the output of this code (Hint: Operator precedence):

    not False and False
    not (False and False)

Solution:
1. "This is sometimes useful, e.g., if s is a string that should be replaced by a default value if it is empty, the expression s or "foo" yields the desired value." (Python docs)
2. Think about it, make a hypothesis, then test it in the interpreter.

8.3 Comparison operators

The comparison operators ==, !=, <, >, <= and >= compare the values of two objects and return True or False. We can always test such expressions in the interactive interpreter:

    >>> 3 > 10
    False
    >>> "TATA" == "TATA"
    True

They are often used in conditional statements (if ... else ...) to determine which code is executed. (The order in which program statements are executed is called "control flow". The conditional statement is one of the constructs which determine the control flow, together with loops and function calls.)
- In case of if ... else ..., either one code block (set of statements) is executed, or the other block
- Python uses the indentation level to define where the block starts and where it ends

This is best illustrated by an example. It uses the modulo operator, which produces the remainder of an integer division. This remainder can be compared to 0 to determine if the number is even or odd:

    >>> 17 % 2    # modulo operator
    1
    >>> 18 % 2
    0
    >>> 17 % 2 == 0
    False
    >>> 18 % 2 == 0
    True

    if 17 % 2 == 0:    # also try "18" instead of "17"
        # condition is true, "if" block is executed
        print("value is even")
        print("hi there, it's still the first code block")
    else:
        # condition is false, "else" block is executed
        print("value is odd")
        print("second code block here")
        print("the second block can go on as long as you want")
    print("I'm not part of the conditional statement, I'll be printed no matter what")

More info: "Using Boolean in Python" by CS Dojo

TODO:
1. What will be the output of the code below?

    if 99 % 2 == 0:
        print("my name is Alice")
    else:
        print("my name is Bob")

Solution:
1. Make a hypothesis and test it in the interpreter. Note that this code doesn't make a lot of sense, because you know that 99 is odd. Usually, you will be testing the value of a variable, which can have different values.

Self-test:
a. What are possible values of the Boolean type?
b. What does it mean if an object is evaluated in Boolean context?
c. What is a Boolean expression?
d. What are Boolean operators?
e. What is a conditional statement?

9 String operations produce new strings

Do strings know only two operators, + and *? Absolutely not. Let's say we have a string of DNA, and want to:

- check if it contains a particular substring,
- create new strings from it, or
- extract a substring.

This is why we have string operators like in, +, *, indexing and slicing.
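As a first impression, the three tasks just listed might look like this for a short DNA string. This is a sketch with an arbitrary example sequence; all variable names are illustrative.

```python
dna = "AAACCCACTG"  # arbitrary example sequence

contains_motif = "ACC" in dna   # check for a substring, gives a bool
doubled = dna + dna             # create a new string by concatenation
first_three = dna[0:3]          # extract a substring (items at indices 0, 1, 2)
last_base = dna[-1]             # extract a single item, counted from the end

print(contains_motif, doubled, first_three, last_base)

# Every operation produced a new object; the original string is unchanged.
print(dna)
```

The last line is the point of this section's title: string operations never modify the string they are applied to.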
Test the following expressions in the interactive interpreter:

    "aac" + "ttg"                # string concatenation
    "ACTG" * 2                   # string repetition
    "ACC" in "AAACCCACTG"        # check for substring
    "ACT" not in "AAACCCACTG"    # check for absence of substring
    False

If you are wondering if you can write not "ACT" in "AAACCCACTG" instead of using not in: indeed you can. For more information, see Python docs and Stackoverflow.

Indexing: return one item (substring of length 1):

    # Sequence of amino acids (one-letter code): "MALTEKDAVK"
    "MALTEKDAVK"[0]     # indexing in Python (and many other languages) starts at 0
    "MALTEKDAVK"[1]     # get second item
    "MALTEKDAVK"[-1]    # negative index: count from the end
    "MALTEKDAVK"[10]    # carefully read the error messages, they are helpful

Slicing: return multiple items (substring):

    "MALTEKDAVK"[2:10]      # slicing syntax: [start_position:end_position:step]
    "MALTEKDAVK"[2:10:2]    # get every second item from items 3-9
    "MALTEKDAVK"[::2]       # get every second item
    "MALTEKDAVK"[4::2]      # get every second item starting with 5th item
    "MALTEKDAVK"[4::-2]     # use negative step

TODO:
1. Get the first five amino acids from "MALTEKDAVK".
2. Reverse the sequence of the amino acids using slicing.

Solution:

    "MALTEKDAVK"[:5]
    "MALTEKDAVK"[::-1]

[Additional information]

There is a historical and a usability perspective on zero-based numbering. Guido van Rossum, the creator of Python, motivated his choice of zero-based indexing mostly by his preference for the half-open interval notation, which includes only one of its endpoints. This leads to simpler and more intuitive code. For example, suppose you split a string into three parts at indices i and j: the parts would be a[:i], a[i:j], and a[j:]. Here are some more examples of why the half-open interval notation can be nicer for daily use (using the range function as an example).

[Figure: How to scare a Python programmer]

What this cartoon calls arrays corresponds to the list type in Python.
(Python also has something called "arrays", which are special list-like sequences that store values of a single numeric type, and are much less common. Instead, people usually use the NumPy library for numerical calculations.) Indexing and slicing work the same way with all these types (ordered collections).

Self-test:
a. How can you access a letter of the string?
b. How can you access a substring (a "slice" of the string)?
c. What is the difference between indexing and slicing?

[Figure: quiz with a code snippet that defines x]

Answer options:
1. Hello
2. olleH
3. World!
4. !dlroW
5. None of the above

Bonus question:

    >>> s = x[-1:5:-1]
    >>> s == s.strip()    # True or False?

10 Objects can be given names

OK, we can do operations on strings. But do we really have to type the whole string every time? There is a better option: using variables (Python calls them names). The general syntax of name binding in Python is name = object (unlike in Bash, spaces around = don't matter).

Now we gave the object a name, so we can call it by name, and don't need to use literals anymore! We say that we bind names to objects. It's also OK to say that we bind names to values, because every object has a value. (The value is what gets printed when you evaluate the object in the interactive interpreter.)

What Python calls "names" and "bindings", most programming languages call "variables" and "assignments", as explained in the next section. If people talk about assigning to variables in Python, they often come from another programming language, or maybe they didn't use a good textbook. However, the term "variable" is extremely common, so it's OK to use it.

    myseq = "ACGT"
    my_favorite_num = 42

Now we can access the objects as often as we want:

    print(myseq)    # print object value
    print(myseq * 10)
    print("AC" in myseq)
    import math
    print(math.sqrt(my_favorite_num))

Hint: Use the variable explorer (sometimes called "variable inspector", etc.) in IDEs like VS Code, Spyder or Jupyter notebook to keep track of your variables.
Abbreviated syntax like the one below is called "syntactic sugar", because it doesn't change the basic functionality, but is nice to have:

    a = b = c = 1    # simultaneous binding of multiple names
    print(a)
    print(b)
    print(c)
    #print(d)    # will this work?

    a += 1    # a = a + 1
    print("testing the += operator ->", a)
    a *= 2    # a = a * 2
    print("testing the *= operator ->", a)
    # etc.

Don't confuse the assignment ( = ) with the comparison operator ==, which tests whether the values of two objects are equal:

    a = 7    # bind name "a" to the integer 7
    b = 7    # bind name "b" to 7
    print("test if a equals b:", a == b)

Conventions for Python names (PEP 8):
- Variable names: only letters_underscores_digits, starting with a letter
- Only lowercase, words separated by underscores
- Function names follow the same convention as variable names
- Use English names (you never know who is going to read the code later)
- Use clear and descriptive names: a variable for sequence length should be called seq_length rather than x or hi_there

TODO: Rewrite the code below using a variable.

    if 17 % 2 == 0:
        print("value is even")
    else:
        print("value is odd")

Solution:

    input_value = 17
    if input_value % 2 == 0:
        print("value is even")
    else:
        print("value is odd")

Self-test:
a. What are variables?
b. What are name bindings?

11 One object can have multiple names

Why does the terminology differ between programming languages? In C and C++ we talk about variables/assignments; Python calls them names/bindings:
- In C++, you assign values to variables
- In Python, you bind names to objects

This is because a variable is a different thing in C and Python:
- In C, C++, etc., a variable is a named memory location (a memory cell)
- In Python, a variable is a name used to refer to an object; the memory location doesn't matter

Let's illustrate this with an example.

    n = 300
    m = n

n and m have the same value:

    n == m    # compare object VALUE (the output of `print`)

But it's more than that, they are the same thing!
    n is m          # compare object IDENTITY (corresponds to memory location)
    print(id(n))
    print(id(m))    # this means that m is another name for the same object

- The id function returns an object's identity; you can think of it as its memory location.
- is is an operator that compares identities; a is b gives the same result as id(a) == id(b).

[Figure: One object can have many names]

Now let's try to "assign" another value to m:

    m = 400
    print(id(m))    # m is now bound to a different object, its identity changed

What happened? We bound m to a different object:

[Figure: A name can be re-bound to any object]

    n = "foo"
    print(id(n))

We bound n to another object (a string), so its identity changed. After re-binding both names, we can't access the original object anymore, because it doesn't have a name. The nameless object is useless, and Python will discard it from memory soon... Sad, huh?

Objects live as long as something references them. Otherwise they are automatically collected by the Python garbage collector after a while, so they don't take up memory.

Self-test:
a. Why does Python have name bindings, while other languages have variable assignments?
b. What happens when a Python variable is re-bound?
c. What happens to objects which can't be referenced?

12 Mutable objects can change their values

Let's look at another data type, the list, which is a sequence (an ordered collection) of objects. ("Sequence" means that it has an order, i.e. there is a "first item", a "second item" and so on. There are also unordered item collections, like the set, which is comparable to a bag filled with items, where there is no first or second item.)

The list syntax is [obj1, obj2,...]. E.g., [2, 4, 1] is a list of three integers:

    >>> [2, 4, 1]
    [2, 4, 1]
    >>> type([2, 4, 1])    # list
    <class 'list'>
    >>> a = [2, 4, 1]
    >>> b = [2, 4, 1]
    >>> a == b    # True: the two lists have the same value
    True

Let's look at the identity of the two objects. (You remember that object identity corresponds to memory location.)
    # Print object identity
    print(id(a))
    print(id(b))

Both objects have the same value, but they are two different objects and have different identities.

Now, we can append an item to one of the lists using its built-in method append:

    a.append(7)      # This list method appends an item to the list
    print("a is now", a)
    print(a == b)    # Do both objects still have the same value?
    print(id(a))     # Did the object identity change?

The object a was changed in-place: its value changed, but it's still the same object as before (as determined by its identity/memory location).

The value of a list can change. Such types are called mutable. Other types, like int and str, are immutable: unlike lists, they can't be changed in-place; you can only create new int or str objects.

- Objects of basic types like integers and strings can't change their value. They are immutable.
- Objects of other types like lists can change their value. They are mutable.

The list is the first mutable type we've seen.

What if we want to make a copy of the list, to keep the original list but modify the new list? We know that newlist = oldlist just creates another name for the same list:

    oldlist = [2, 4, 1]
    newlist = oldlist
    print("newlist:", newlist)
    print("oldlist:", oldlist)
    print("newlist has same value as oldlist:", newlist == oldlist)    # compare object value
    print("newlist is same object as oldlist:", newlist is oldlist)    # compare object identity

Before executing the next cell, make an assumption about what the output will be:

    newlist.append(99)
    print("newlist:", newlist)
    print("oldlist:", oldlist)

This is not what we wanted! To actually create a new list, we can use slicing (as with strings).
We will take not just a small slice, but a full slice, from the first to the last item:

    newlist = oldlist[:]    # taking a slice creates a new object
    print(newlist == oldlist)    # same value?
    print(newlist is oldlist)    # same object?
    newlist.append(0.5)
    print("newlist:", newlist)
    print("oldlist:", oldlist)

Remember:
- We say: "Names refer to objects", or "names are bound to objects".
- Many names can be bound to one object.
- Python has mutable and immutable objects: this is determined by their type.
- The value of mutable objects can change.
- Changes in a mutable object are visible through all its names.
- The terms "assignment" and "variable" are very common (and it's OK to use them), but the correct terminology is "binding" and "name".

1. What will be the output of the code below? Think about it first, and execute the code to verify your assumptions.
2. Solve this quiz and explain the result.

Code for task 1:

    a = [5, 9, 0]
    b = a
    print(a)
    print(b)
    print(a is b)
    b.append(100)
    a.append("abc")
    print(a)
    print(b)
    a.pop()    # the method `pop` removes the last item (IPython: enter `a.pop?` or `list.pop?` to get help)
    print(a)
    print(b)

Solution:
1. You can see that both names are bound to the same object, which is later modified by adding and removing elements.
2. Test it in the interpreter, and check the identity of the objects to understand what's going on. The + operator concatenates two lists, and returns a new list object. The name numbers is re-bound to this new object, while the name integers remains bound to the original object.

Self-test:
a. How many names can be bound to one object?
b. What is the difference between mutable and immutable types?
c. What determines if an object is mutable or immutable?

Does the text below make sense to you?

Objects are Python’s abstraction for data. All data in a Python program is represented by objects or by relations between objects. Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory.
The is operator compares the identity of two objects; the id() function returns an integer representing its identity.

An object’s type determines the operations that the object supports (e.g., “does it have a length?”) and also defines the possible values for objects of that type. The type() function returns an object’s type. Like its identity, an object’s type doesn't change.

The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable. ... An object’s mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable. (The value of an object is simply what gets printed when you use the print function on it.)

Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected.

If yes, you can be proud: this is the official Python documentation!

Self-test:
a. What are the three basic properties of every Python object?
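To check your understanding of the quoted passage, the properties it describes can be inspected directly with id(), type(), is and ==. The following is a minimal sketch; all names in it (codons, alias, full_copy, seq, longer_seq) are illustrative choices, not taken from the course material.

```python
# All names below are hypothetical, chosen only for this illustration.
codons = ["ATG", "TAA"]    # a list: a mutable type
alias = codons             # a second name bound to the same object
full_copy = codons[:]      # a full slice creates a new list object

identity_before = id(codons)
codons.append("TGA")       # in-place change of the list's value

print(type(codons))                     # the object's type
print(codons)                           # the object's value (it changed)
print(id(codons) == identity_before)    # the identity did not change
print(alias is codons)                  # both names refer to one object
print(full_copy == codons)              # the copy kept the old value
print(full_copy is codons)              # the copy is a different object

seq = "ACGT"               # a str: an immutable type
longer_seq = seq + "A"     # concatenation creates a new string object
print(seq)                 # the original string is unchanged
print(longer_seq)
```

The change made through the name codons is visible through alias, because both names are bound to one object, while full_copy is a separate object and keeps its value.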
