DS242 - All slides.pdf
Uploaded by GallantReal
Saudi Electronic University
2021
Saudi Electronic University
26/12/2021
College of Computing and Informatics
Bachelor of Science in Computer Science
DS242 - Advanced Data Science Programming

Week 2
Understanding the Five Areas of Data Science

Contents
1. Working with Big, Big Data
   Volume
   Variety
   Velocity
   Managing volume, variety, and velocity
2. The Five-Step Process of Data Science

Weekly Learning Outcomes
1. Learning about data science
2. Understanding big data
3. Discovering and outlining the five steps of data science

Required Reading
(Book 5: Doing Data Science) Chapter 1. Understanding the Five Areas of Data Science (John C. Shovic and Alan Simpson, Python All-in-One For Dummies, 2nd edition, 2021).

About Data Science
Data science is the domain of study that deals with vast volumes of data, using modern tools and techniques to find unseen patterns, derive meaningful information, and make business decisions. The data used for analysis can come from many different sources and be presented in various formats.

Data science affects our lives in more ways than you may think. When you use Google or Bing or DuckDuckGo, you're using a sophisticated application of data science. The suggestions for other search terms that come up when you're typing? They come from data science.

Medical diagnoses and interpretations of images and symptoms are examples of data science. Doctors rely on data science interpretations more and more these days. Data science looks intimidating to the uninitiated: inferences, data graphs, and statistics.

In this chapter, we introduce you to the use of Python in data science and talk about just enough theory to get you started. If nothing else, we want to leave you with an understanding of the process of data science and give you a better idea of what's behind some of the results of big data analysis that are touted in the news.
Python and its myriad tools and libraries can make data science much more accessible to non-computer scientists. One thing to remember is that most scientists (including data scientists) are not necessarily experts in computer science. They like to use tools that simplify coding and enable them to focus on getting answers and performing data analysis.

Working with Big, Big Data
Big data refers to complex datasets that are too large for conventional data-processing software (databases, spreadsheets, and traditional statistics packages such as SPSS Statistics) to handle. The industry talks about big data using three concepts, called the "three V's": volume, variety, and velocity.

Volume
Volume refers to the size of the dataset. The volume can be really, really big, almost hard-to-believe big. For example, Facebook has more users than the population of China. There are over 250 billion images on Facebook and 2.5 trillion posts. That's a lot of data.

And what about the upcoming world of IoT? "Gartner, one of the world's leading analysis companies, estimates 22 billion devices by 2022." That's 22 billion devices, each producing thousands of pieces of data. Imagine that you're sampling the temperature in your kitchen once a minute for a year: that's over half a million data points (60 x 24 x 365 = 525,600). "Data science is how we make use of all this data."

Variety
Photos and images are different data types from temperature and humidity or location information. In a sense, the information is concentrated in, say, a temperature reading and more smeared out in images. Photos are sophisticated data structures that are hard to interpret and harder still for machines to classify correctly.

Let's talk about voice data for a minute. Amazon's Alexa is very good at translating voice to text but not as good at assigning meaning to the text.
One reason is the lack of context (environmental factors such as social cues, tone, and body language), but another reason is the many ways that people ask for things, make comments, and so on.

Velocity
Velocity refers to how fast the data is changing and how fast it is being added to the data piles. Facebook users upload about 1 billion pictures a day, so in the next couple of years Facebook will have over 1 trillion images. Facebook is a high-velocity dataset.

A low-velocity dataset (not changing at all) might be the set of temperature and humidity readings from your house in the last five years. Needless to say, high-velocity datasets require different techniques than low-velocity datasets.

Managing volume, variety, and velocity
The management of volume, variety, and velocity is a complex topic. Data scientists have developed many methods for processing data. The three V's describe the dataset and give you an idea of the data parameters. The process of gaining insights from data is called data analytics. After doing data science for a few years, you'll be very good at managing this process.

The Five-Step Process of Data Science
We can generally break down the process of doing science on data (especially big data) into five steps:
1. Capture the data.
2. Process the data.
3. Analyze the data.
4. Communicate the results.
5. Maintain the data.

Capturing the data
To have something to analyze, you must capture data. In any real-world situation, you probably have a number of potential sources of data, such as company records, public databases, or your own gathered data. Inventory them and decide what to include in your project. But before you can know what to include, you have to carefully define your business questions and goals. With well-defined goals, it's easier to know if you have achieved them.
If you can, combine your data sources so it's easy to get to the information you need to find insights and build all those nifty reports you just can't wait to show off to management.

Processing the data
Processing the data is the part of data science that should be easy but almost never is. Some data scientists spend months massaging and manipulating their data so they can process and trust it. You need to identify anomalies and outliers, eliminate duplicates, remove missing entries, and determine what data is inconsistent.

You need to clean and process your data carefully so that you don't remove data important to your upcoming analysis work or introduce bias that will destroy your ability down the line to make good inferences or get good answers.

One more data-processing issue to worry about: According to Marketing Week in 2015, 60 percent of consumers provide intentionally incorrect information when submitting data online. Even good data scientists have been accused of cherry-picking data while cleaning it to support a hypothesis.

Analyzing the data
By the time you've expended all that energy to get to the point of looking at the data to see what you can find, you would think that asking the questions would be relatively simple. It's not. Analyzing big datasets for insights and inferences, or asking complex questions, requires the most human intuition in all of data science. Some questions, such as "What is the average money spent on cereal in 2020?", can be easily defined and calculated, even on huge amounts of data.

But a useful question, such as "How can I get more people to buy Sugar Frosted Flakes?", is the $64,000 question. A question such as that has layers and layers of complexity. You want a baseline of how much Sugar Frosted Flakes your customers are currently buying. That answer should be easy to get. Then you have to define what you mean by more people.
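To make the contrast concrete, here is a minimal sketch of the "easily defined and calculated" kind of question, plus the baseline mentioned above. The purchase records, product names, and amounts are made up purely for illustration; they do not come from the slides or from any real dataset.

```python
from statistics import mean

# Hypothetical purchase records: (product name, amount spent in dollars).
# These values are invented for illustration only.
purchases_2020 = [
    ("Sugar Frosted Flakes", 4.99),
    ("Corn Flakes", 3.49),
    ("Sugar Frosted Flakes", 4.99),
    ("Granola", 6.10),
    ("Oat Rings", 5.25),
]

# Easy question: what is the average amount spent on cereal?
amounts = [amount for _, amount in purchases_2020]
average_spend = mean(amounts)
print(f"Average spend: ${average_spend:.2f}")

# Baseline: how many purchases were Sugar Frosted Flakes?
baseline = sum(1 for name, _ in purchases_2020 if name == "Sugar Frosted Flakes")
print("Sugar Frosted Flakes purchases:", baseline)
```

Both numbers fall out of a single pass over the data. Deciding what "more people" means, and how to get there, is the part that no one-line calculation answers.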
Analyzing the data requires skill and experience in statistical techniques such as linear and logistic regression, and in finding correlations between different data types by using a variety of probability algorithms and formulas such as Naïve Bayes.

Communicating the results
After you've crunched and mangled your data into the format you need and then analyzed the data to answer your questions, you need to present the results to management or the customer. Most people visualize and understand information better and faster when they see it in a graphical format rather than just in text. Data scientists commonly use two tools to communicate results: the R language and the Python library Matplotlib. We use Matplotlib to display our big data graphics.

Maintaining the data
Maintaining the data is the step in data science that many people ignore. After they've asked their first round of questions and received their first round of answers, many professionals basically move on to the next project. However, there's a reasonable chance that at some point, perhaps much later, they will have to ask more questions about the same data.

Be sure to archive and document the following information so you can restart a project quickly or, even more likely, reuse parts of the project for a new one:
   Data and sources
   Models you used to modify the data (including any exception data and "data throw-out criteria")
   Queries and results

DATA SCIENCE VERSUS DATA ANALYTICS
Currently, data science refers to the process of working out insights from large datasets of unstructured data. Data science uses predictive analytics, statistics, and machine learning to wade through large amounts of data. Data analytics focuses on using and creating statistical analysis for existing sets of data to achieve insights on that data.
With these vague descriptions and the fact that more techniques are being developed to do data analysis on big data (not surprisingly named big data analytics), you can see how the two areas are moving closer and closer. At the risk of ridicule from our fellow academics, we would definitely call data analytics, which covers steps 3–5 (analyze the data, communicate the results, and maintain the data), a subset of data science.

Thank You

Week 3
Speeding Along with Lists and Tuples (Part 1)

Contents
1. Defining and Using Lists

Weekly Learning Outcomes
1. Define lists
2. Work with lists

Required Reading
(Book 2: Understanding Python Building Blocks) Chapter 3: Speeding Along with Lists and Tuples (John C. Shovic and Alan Simpson, Python All-in-One For Dummies, 2nd edition, 2021).

Speeding Along with Lists and Tuples

Introduction
Sometimes in code you work with one data item at a time, such as a person's name, unit price, or username. Other times, you work with larger sets of data, such as a list of people's names or a list of products and their prices. These sets of data are often referred to as lists or arrays in most programming languages. Python has lots of easy, fast, and efficient ways to deal with all kinds of data collections.

Defining and Using Lists
The simplest data collection in Python is a list. A list is any sequence of data items, separated by commas, inside square brackets. Typically, you assign a name to the list using an = character, just as you would with variables. If the list contains numbers, don't use quotation marks around them.
For example, here is a list of test scores:

    scores = [88, 92, 78, 90, 98, 84]

If the list contains strings, as always, those strings should be enclosed in single or double quotation marks, as in this example:

    students = ["Mark", "Amber", "Todd", "Anita", "Sandy"]

To display the contents of a list on the screen, you can print it just as you would print any regular variable. For example, executing print(students) in your code after defining that list displays the following on the screen:

    ['Mark', 'Amber', 'Todd', 'Anita', 'Sandy']

Referencing list items by position
Each item in a list has a position number, starting with 0, even though you don't see any numbers. You can refer to any item in the list by its number, using the name of the list followed by a number in square brackets. In other words, use this syntax:

    listname[x]

Replace listname with the name of the list you're accessing and replace x with the position number of the item you want. Remember, the first item is always 0, not 1. For example, in the following code, the first line defines a list named students, and the second line prints item number 0 from that list. The result, when executing the code, is the name Mark displayed:

    students = ["Mark", "Amber", "Todd", "Anita", "Sandy"]
    print(students[0])

    Mark

Looping through a list
To access each item in a list, just use a for loop with this syntax:

    for x in list:

Replace x with a variable name of your choosing. Replace list with the name of the list. An easy way to make the code readable is to always use a plural for the list name (such as students, scores). Then you can use the singular name (student, score) for the variable name.

Remember to always indent the code that's to be executed in the loop. The slide shows a more complete example, with the result of running the code in a Jupyter notebook.
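As a minimal sketch of the loop syntax described above, the following reuses the students list from the earlier examples. The counter variable and the printed text are illustrative assumptions, not the slide's original example.

```python
# List of names reused from the earlier examples.
students = ["Mark", "Amber", "Todd", "Anita", "Sandy"]

# Plural name for the list, singular name for the loop variable.
shown = 0
for student in students:
    # Everything indented under the for line runs once per item.
    print(student)
    shown += 1

# This line is not indented, so it runs once, after the loop ends.
print("Done. Names shown:", shown)
```

The indentation is what tells Python which lines belong to the loop. The final print() runs only once because it sits at the same level as the for statement.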
Seeing whether a list contains an item
If you want your code to check the contents of a list to see whether it already contains some item, use the in operator (item in listname) in an if statement or a variable assignment. The slide's example creates a list of names. Then, two variables store the results of searching the list for the names Anita and Bob. Printing the contents of each variable displays True for Anita, because that name is in the list. The test to see whether Bob is in the list proves False.

Getting the length of a list
To determine how many items are in a list, use the len() function (short for length). Put the name of the list inside the parentheses. For example, type the following code in a Jupyter notebook or at the Python prompt:

    students = ["Mark", "Amber", "Todd", "Anita", "Sandy"]
    print(len(students))

Running that code produces this output:

    5

Adding an item to the end of a list
To add an item to the end of a list, use the .append() method with the value you want to add inside the parentheses. You can use either a variable name or a literal value inside the parentheses. For instance, in the slide's example, the line students.append("Goober") adds the name Goober to the list. The line students.append(new_student) adds whatever name is stored in the new_student variable to the list. The .append() method always adds to the end of the list. So when you print the list, those two new names are at the end.

You can use a test to see whether an item is in a list and then append it only when the item isn't already there. For example, the following code won't add the name Amanda to the list if that name is already in the list:

    student_name = "Amanda"
    # Add student_name but only if not already in the list.
    if student_name in students:
        print(student_name + " already in the list")
    else:
        students.append(student_name)
        print(student_name + " added to the list")

Inserting an item into a list
Whereas the append() method adds an item to the end of a list, the insert() method adds an item to the list in any position. The syntax for insert() is:

    listname.insert(position, item)

Replace listname with the name of the list, and position with the position at which you want to insert the item. Replace item with the value, or the name of a variable that contains the value, that you want to put in the list.

For example, the following code makes Lupe the first item in the list:

    # Create a list of strings (names).
    students = ["Mark", "Amber", "Todd", "Anita", "Sandy"]
    student_name = "Lupe"
    # Add student name to front of the list.
    students.insert(0, student_name)
    # Show me the new list.
    print(students)

Output for the code:

    ['Lupe', 'Mark', 'Amber', 'Todd', 'Anita', 'Sandy']

Changing an item in a list
You can change an item in a list using the = assignment operator, just like you do with variables. Make sure you include the index number in square brackets to indicate which item you want to change. The syntax is:

    listname[index] = newvalue

Replace listname with the name of the list; replace index with the subscript (index number) of the item you want to change; and replace newvalue with whatever you want to put in the list item.

For example, look at the following code:

    # Create a list of strings (names).
    students = ["Mark", "Amber", "Todd", "Anita", "Sandy"]
    students[3] = "Hobart"
    print(students)

Output for the code:

    ['Mark', 'Amber', 'Todd', 'Hobart', 'Sandy']

Combining lists
If you have two lists that you want to combine into a single list, use the extend() method with the following syntax:

    original_list.extend(additional_items_list)

In your code, replace original_list with the name of the list to which you'll be adding new list items.
Replace additional_items_list with the name of the list that contains the items you want to add to the first list.

For example, look at the following code using lists named list1 and list2. After executing list1.extend(list2), the first list contains the items from both lists, as you can see in the output of the print() statement at the end.

    # Create two lists of names.
    list1 = ["Zara", "Lupe", "Hong", "Alberto", "Jake"]
    list2 = ["Huey", "Dewey", "Louie", "Nader", "Bubba"]
    # Add list2 names to list1.
    list1.extend(list2)
    # Print list 1.
    print(list1)

    ['Zara', 'Lupe', 'Hong', 'Alberto', 'Jake', 'Huey', 'Dewey', 'Louie', 'Nader', 'Bubba']

Removing list items
Python offers a remove() method so you can remove any value from the list. If the item is in the list multiple times, only the first occurrence is removed. The following code displays a list of letters with the letter C repeated a few times. Then the code uses letters.remove("C") to remove the letter C from the list:

    # Create a list of strings.
    letters = ["A", "B", "C", "D", "C", "E", "C"]
    # Remove "C" from the list.
    letters.remove("C")
    # Show me the new list.
    print(letters)

    ['A', 'B', 'D', 'C', 'E', 'C']

If you need to remove every occurrence of an item, you can use a while loop to repeat the .remove() as long as the item still remains in the list. For example, this code repeats the .remove() as long as "C" is still in the list:

    while "C" in letters:
        letters.remove("C")

If you want to remove an item based on its position in the list, use pop() with an index number rather than remove() with a value. If you want to remove the last item from the list, use pop() without an index number.

The following code creates a list, removes the first item (0), and then removes the last item (pop() with nothing in the parentheses). Printing the list proves that those two items have been removed:

    # Create a list of strings.
    letters = ["A", "B", "C", "D", "E", "F", "G"]
    # Remove the first item.
    letters.pop(0)
    # Remove the last item.
    letters.pop()
    # Show me the new list.
    print(letters)

Running the code shows that popping the first and last items did, indeed, work:

    ['B', 'C', 'D', 'E', 'F']

When you pop() an item off the list, you can store a copy of that value in some variable. The slide's example repeats the preceding code but stores copies of what's been removed in variables named first_removed and last_removed. At the end it prints the list, and also shows which letters were removed.

Python also offers a del (short for delete) command that deletes any item from a list based on its index number (position). But again, you have to remember that the first item is 0. So, let's say you run the following code to delete item number 2 from the list:

    # Create a list of strings.
    letters = ["A", "B", "C", "D", "E", "F", "G"]
    # Remove item sub 2.
    del letters[2]
    print(letters)

    ['A', 'B', 'D', 'E', 'F', 'G']

Clearing out a list
To delete the contents of a list but not the list itself, use .clear(). The list still exists, but it contains no items. In other words, it's an empty list. The following code shows how you could test this. Running the code displays [] at the end, which lets you know the list is empty:

    # Create a list of strings.
    letters = ["A", "B", "C", "D", "E", "F", "G"]
    # Clear the list of all entries.
    letters.clear()
    # Show me the new list.
    print(letters)

    []

Counting how many times an item appears in a list
Use the Python count() method to count how many times an item appears in a list. As with other list methods, the syntax is simple:

    listname.count(x)

Replace listname with the name of your list, and x with the value you're looking for (or the name of a variable that contains that value).
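A short sketch of count(), using the grades list that appears in the index example later in these slides. The variable names b_count, c_count, and look_for are illustrative choices rather than the slide's original code.

```python
# List of letter grades (same values as the index() example in these slides).
grades = ["C", "B", "A", "D", "C", "B", "C"]

# Count using a literal value in quotation marks.
b_count = grades.count("B")
print("Number of B grades:", b_count)

# Count using a variable that holds the value to look for.
look_for = "C"
c_count = grades.count(look_for)
print("Number of " + look_for + " grades:", c_count)

# Counting a value that is not in the list returns 0 rather than raising an error.
f_count = grades.count("F")
print("Number of F grades:", f_count)
```

Unlike index(), count() is safe to call for a value that may be missing, because it simply returns 0.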
The slide's example counts how many times the letter B appears in the list, using a literal B inside the parentheses of .count() like this: grades.count("B"). Because B is in quotation marks, you know it's a literal, not the name of some variable.

Finding a list item's index
Python offers an .index() method that returns a number indicating the position of an item in a list, based on the index number. The syntax is:

    listname.index(x)

In the slide's example, the program crashes at the line f_index = grades.index(look_for) because there is no F in the list (Python raises a ValueError when the item isn't found).

An easy way to get around this problem is to use an if statement to see whether an item is in the list before you try to get its index number. If the item isn't in the list, display a message saying so. Otherwise, get the index number and show it in a message. That code follows:

    # Create a list of strings.
    grades = ["C", "B", "A", "D", "C", "B", "C"]
    # Decide what to look for.
    look_for = "F"
    # See if the item is in the list.
    if look_for in grades:
        # If it's in the list, get and show the index.
        print(str(look_for) + " is at index " + str(grades.index(look_for)))
    else:
        # If not in the list, don't even try for index number.
        print(str(look_for) + " isn't in the list.")

Alphabetizing and sorting lists
Python offers a sort() method for sorting lists. In its simplest form, it alphabetizes the items in the list (if they're strings). If the list contains numbers, they're sorted smallest to largest. For a simple sort like that, just use sort() with empty parentheses:

    listname.sort()

[Figure: sorting example in which each sorted list is stored under a new list name.]

Reversing a list
You can also reverse the order of items in a list using the .reverse() method. This is not the same as sorting in reverse. When you sort in reverse, you still sort: Z–A for strings, largest to smallest for numbers, and latest to earliest for dates.
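To make "sorting in reverse" concrete, here is a small sketch using lists from earlier slides. The reverse=True argument to sort() is standard Python, but it is not shown in this transcript, so treat its use here as an addition for illustration.

```python
names = ["Zara", "Lupe", "Hong", "Alberto", "Jake"]
scores = [88, 92, 78, 90, 98, 84]

# sort() reorders the list in place: A-Z for strings, smallest first for numbers.
names.sort()
scores.sort()
print(names)
print(scores)

# Sorting in reverse still sorts: Z-A for strings, largest first for numbers.
names.sort(reverse=True)
scores.sort(reverse=True)
print(names)
print(scores)
```

Note that sort() changes the list itself and returns None, so writing something like new_list = names.sort() would store None rather than a sorted list.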
When you reverse a list, you simply reverse the items in the list, no matter their order, without trying to sort them.

In the following code, we reverse the order of the names in the list and then print the list.

    # Create a list of strings.
    names = ["Zara", "Lupe", "Hong", "Alberto", "Jake"]
    # Reverse the list.
    names.reverse()
    # Print the list.
    print(names)

    ['Jake', 'Alberto', 'Hong', 'Lupe', 'Zara']

Copying a list
If you need to work with a copy of a list so as not to alter the original list, use the .copy() method. The following code is similar to the preceding code, except that instead of reversing the order of the original list, we make a copy of the list and reverse that one. Printing the contents of each list shows how the first list is still in the original order whereas the second one is reversed:

    # Create a list of strings.
    names = ["Zara", "Lupe", "Hong", "Alberto", "Jake"]
    # Make a copy of the list.
    backward_names = names.copy()
    # Reverse the copy.
    backward_names.reverse()
    # Print the lists.
    print(names)
    print(backward_names)

    ['Zara', 'Lupe', 'Hong', 'Alberto', 'Jake']
    ['Jake', 'Alberto', 'Hong', 'Lupe', 'Zara']

Methods for Working with Lists
[Table not included in this transcript.]

Thank You

Week 4
Speeding Along with Lists and Tuples (Part 2)

Contents
1. What's a Tuple and Who Cares?
2. Working with Sets

Weekly Learning Outcomes
1. Understanding tuples in Python
2. Understanding that, unlike the items in a list, the items in a set have no specific order

Required Reading
(Book 2: Understanding Python Building Blocks) Chapter 3: Speeding Along with Lists and Tuples (John C. Shovic and Alan Simpson, Python All-in-One For Dummies, 2nd edition, 2021).

What's a Tuple and Who Cares?
Introduction
In addition to lists, Python supports a data structure known as a tuple. It's not spelled tupple or touple, so our best guess is that it's pronounced "two-pull." A tuple is like a list, but you can't change it after it's defined. The syntax for creating a tuple is the same as the syntax for creating a list, except you don't use square brackets. You have to use parentheses, like this:

    prices = (29.95, 9.98, 4.95, 79.98, 2.95)

Most of the techniques and methods that you learned for lists in Week 3 don't work with tuples, because they are used to modify something in a list, and a tuple can't be modified. However, you can get the length of a tuple using len(), like this:

    print(len(prices))

You can use .count() to see how many times an item appears in a tuple. For example:

    print(prices.count(4.95))

You can use in to see whether a value exists in a tuple, as in the following sample code:

    print(4.95 in prices)

This returns True if the tuple contains 4.95 or False if it doesn't. If an item exists in the tuple, you can get its index number. You'll get an error, though, if the item doesn't exist in the tuple, so check first:

    look_for = 12345
    if look_for in prices:
        position = prices.index(look_for)
    else:
        position = -1
    print(position)

You can loop through the items in a tuple and display them in any format you want by using format strings. For example, this code displays each item with a leading dollar sign and two digits for the pennies:

    # Loop through and display each item in the tuple.
    for price in prices:
        print(f"${price:.2f}")

The output from running this code with the sample tuple follows:

    $29.95
    $9.98
    $4.95
    $79.98
    $2.95

You can't change the value of an item in a tuple using this kind of syntax:

    prices[1] = 234.56

You'll get an error message that reads TypeError: 'tuple' object does not support item assignment.
(This message is telling you that you can't use the assignment operator, =, to change the value of an item in a tuple because a tuple is immutable, meaning its content cannot be changed.) A tuple makes sense if you want to show data to users without giving them any means to change any of the information.

Working with Sets

Introduction
Python also offers sets as a means of organizing data. The difference between a set and a list is that the items in a set have no specific order. Even though you may define the set with the items in a certain order, none of the items get index numbers to identify their position. To define a set, use curly braces where you would use square brackets for a list and parentheses for a tuple. For example, here's a set with some numbers in it:

    sample_set = {1.98, 98.9, 74.95, 2.5, 1, 16.3}

Sets are similar to lists and tuples in a few ways. You can use len() to determine how many items are in a set. Use in to determine whether an item is in a set. But you can't get an item in a set based on its index number. Nor can you change an item already in the set. You can't change the order of items in a set either. You can't use .sort() to sort the set or .reverse() to reverse its order.

You can add a single new item to a set using .add(), as in the following example:

    sample_set.add(11.23)

Note that, unlike a list, a set never contains more than one instance of a value. (So even if you add 11.23 to the set multiple times, the set will still contain only one copy of 11.23.)

You can also add multiple items to a set using .update(). The items you're adding should be defined as a list in square brackets, as in the following example:

    sample_set.update([88, 123.45, 2.98])

You can copy a set. However, because the set has no defined order, when you display the copy, its items may not be in the same order as the original set, as shown in this code and its output:

    # Define a set named sample_set.
    sample_set = {1.98, 98.9, 74.95, 2.5, 1, 16.3}
    # Show the whole set.
    print(sample_set)
    # Make a copy and show the copy.
    ss2 = sample_set.copy()
    print(ss2)

    {1.98, 98.9, 2.5, 1, 74.95, 16.3}
    {16.3, 1.98, 98.9, 2.5, 1, 74.95}

The slide's example creates a set named sample_set and then uses a variety of print() statements to output information. Each number is neatly formatted, because the code uses the format specifier >6.2f in an f-string, which right-aligns each number in a field six characters wide with two digits after the decimal point.

Summary
Lists and tuples are two of the most commonly used Python data structures. Sets don't seem to get as much play as the other two, but it's good to know about them.

Thank You

Week 5
Cruising Massive Data with Dictionaries (Part 2)

Contents
1. Deleting Dictionary Items
2. Having Fun with Multi-Key Dictionaries

Weekly Learning Outcomes
1. Deleting items in a dictionary
2. Using multi-key dictionaries

Required Reading
(Book 2: Understanding Python Building Blocks) Chapter 4: Cruising Massive Data with Dictionaries (John C. Shovic and Alan Simpson, Python All-in-One For Dummies, 2nd edition, 2021).

Deleting Dictionary Items
You can remove data from data dictionaries in several ways. The del keyword (short for delete) can remove any item based on its key. The syntax is as follows:

    del dictionaryname[key]

For example, the following code creates a dictionary named people. Then it uses del people["zmin"] to remove the item that has zmin as its key:

    # Define a dictionary named people.
    people = {
        'htanaka': 'Haru Tanaka',
        'zmin': 'Zhang Min',
        'afarooqi': 'Ayesha Farooqi',
    }
    # Show original people dictionary.
    print(people)
    # Remove zmin from the dictionary.
    del people["zmin"]
    # Show what's in people now.
    print(people)

If you forget to include a specific key with the del keyword and specify only the dictionary name, the entire dictionary is deleted, even its name. Suppose you executed del people instead of del people["zmin"] in the preceding code. The output of the second print(people) would be an error, as in the following, because after the people dictionary is deleted, it no longer exists, and its content can't be displayed:

    {'htanaka': 'Haru Tanaka', 'zmin': 'Zhang Min', 'afarooqi': 'Ayesha Farooqi'}
    ----------------------------------------------------------
    NameError                 Traceback (most recent call last)
    in ()
         13
         14 # Show what's in people now.
    ---> 15 print(people)
    NameError: name 'people' is not defined

To remove all key-value pairs from a dictionary without deleting the entire dictionary, use the clear() method with this syntax:

    dictionaryname.clear()

The following code creates a dictionary named people, puts some key-value pairs in it, and then prints the dictionary so you can see its content. Then, people.clear() empties all the data:

    # Define a dictionary named people.
    people = {
        'htanaka': 'Haru Tanaka',
        'zmin': 'Zhang Min',
        'afarooqi': 'Ayesha Farooqi',
    }
    # Show original people dictionary.
    print(people)
    # Remove all data from the dictionary.
    people.clear()
    # Show what's in people now.
    print(people)

The pop() method offers another way to remove data from a data dictionary. The pop() method actually does two things:
   If you store the result of the pop() method in a variable, that variable gets the value of the popped key.
   Regardless of whether you store the result of the pop() method in a variable, the specified key is removed from the dictionary.
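The two effects of pop() described above can be sketched with the people dictionary from the earlier examples. The variable name adios follows the description in these slides; the printed layout is an assumption.

```python
# Define a dictionary named people (same data as the earlier examples).
people = {
    'htanaka': 'Haru Tanaka',
    'zmin': 'Zhang Min',
    'afarooqi': 'Ayesha Farooqi',
}
# Show the original dictionary.
print(people)

# Remove the zmin key and keep its value in a variable.
adios = people.pop("zmin")

# The variable holds the value that was stored under the popped key.
print(adios)

# The key is gone from the dictionary whether or not the result was stored.
print(people)
```

If the key might not exist, calling pop() with only the key raises a KeyError, so it can help to check with the in operator first.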
Data dictionaries offer a variation on pop() that uses this syntax:

    dictionaryname.popitem()

In current Python versions, popitem() removes and returns the most recently added key-value pair.

Deleting Dictionary Items
In the pop() example, the output first shows the entire dictionary. Then adios = people.pop("zmin") is executed, putting the value of the zmin key in a variable named adios. We then print the adios variable so we can see that it contains Zhang Min, the value of the zmin key. Printing the entire people dictionary again proves that zmin has been removed from the dictionary.

Having Fun with Multi-Key Dictionaries
So far, you've worked with a dictionary that has one value (a person's name) for each key (an abbreviation of that person's name). But it's not unusual for a dictionary to have multiple key-value pairs for one item of data.

Suppose that just knowing the person's full name isn't enough. You also want to know the year the person was hired, his or her date of birth, and whether or not that employee has been issued a company laptop. The dictionary for any one person might look like this:

    employee = {
        'name': 'Haru Tanaka',
        'year_hired': 2005,
        'dob': '11/23/1987',
        'has_laptop': False
    }

Having Fun with Multi-Key Dictionaries
Suppose you need a dictionary of products that you sell. For each product, you want to know its name, its unit price, whether or not it's taxable, and how many you currently have in stock. The dictionary might look something like this (for one product):

    product = {
        'name': 'Ray-Ban Wayfarer Sunglasses',
        'unit_price': 112.99,
        'taxable': True,
        'in_stock': 10
    }

Having Fun with Multi-Key Dictionaries
The value for a key can be a list, tuple, or set; it doesn't have to be a single value. For example, for the sunglasses product, maybe you offer two models, black and tortoise.
You could add a colors or models key and list the items as a comma-separated list in square brackets, like this:

    product = {
        'name': 'Ray-Ban Wayfarer Sunglasses',
        'unit_price': 112.99,
        'taxable': True,
        'in_stock': 10,
        'models': ['Black', 'Tortoise']
    }

Having Fun with Multi-Key Dictionaries
Let's look at how you might display the dictionary data. You can use the simple dictionaryname[key] syntax to print just the value of each key. For example, using that last product example, this code:

    print(product['name'])
    print(product['unit_price'])
    print(product['taxable'])
    print(product['in_stock'])
    print(product['models'])

produces this output:

    Ray-Ban Wayfarer Sunglasses
    112.99
    True
    10
    ['Black', 'Tortoise']

Having Fun with Multi-Key Dictionaries
You could get fancier by adding descriptive text to each print statement, followed by a comma and the code. You could also loop through the list to print each model on a separate line. And you can use an f-string to format the data. For example, here is a variation on the previous print() statements:
    product = {
        'name': 'Ray-Ban Wayfarer Sunglasses',
        'unit_price': 112.99,
        'taxable': True,
        'in_stock': 10,
        'models': ['Black', 'Tortoise']
    }
    print('Name: ', product['name'])
    print('Price: ', f"${product['unit_price']:.2f}")
    print('Taxable: ', product['taxable'])
    print('In Stock:', product['in_stock'])
    print('Models:')
    for model in product['models']:
        print(" " * 10 + model)

Output:

    Name:  Ray-Ban Wayfarer Sunglasses
    Price:  $112.99
    Taxable:  True
    In Stock: 10
    Models:
              Black
              Tortoise

Using the mysterious fromkeys and setdefault methods
Data dictionaries in Python offer two methods, named fromkeys() and setdefault(), which are the cause of much head-scratching among Python learners, and rightly so, because it's not easy to find practical applications for them. But we'll take a shot at it and at least show you what to expect if you ever use these methods in your code. The fromkeys() method uses this syntax:

    newdictionaryname = dict.fromkeys(iterable[, value])

Using the mysterious fromkeys and setdefault methods
Replace newdictionaryname with whatever you want to name the new dictionary. It doesn't have to be a generic name like product. Replace the iterable part with any iterable, meaning something the code can loop through; a simple list will do. The value part is optional. If omitted, each key in the dictionary gets a value of None, which is simply Python's way of saying no value has been assigned to this key in this dictionary yet.

Nesting dictionaries
Suppose you define a bunch of dictionaries, each with its own name. How could you loop through the whole kit and caboodle without specifically accessing each dictionary by name? The answer is to make each dictionary a key-value pair in some containing dictionary, where the key is the unique identifier for each dictionary (for example, a UPC or SKU for each product). The value for each key would then be a dictionary of all the key-value pairs for that dictionary. The syntax is:

    containingdictionaryname = {
        key: {dictionary},
        key: {dictionary},
        key: {dictionary},
        ...
    }

Nesting dictionaries
That's just the syntax for the dictionary of dictionaries. You have to replace all the placeholder names as follows:
containingdictionaryname: This is the name assigned to the dictionary as a whole. It can be any name you like but should describe what the dictionary contains.
key: Each key must be unique, such as the UPC or SKU for a product, the username for a person, or even just some sequential number, as long as it's never repeated.
{dictionary}: Enclose all the key-value pairs for that one dictionary item in curly braces, and follow that with a comma if another dictionary follows.

Nesting dictionaries
Using a combination of f-strings and some loops, you could get Python to display the data from the nested dictionaries in a neat, tabular format.

Week 6
Cruising Massive Data with Dictionaries (Part 1)

Contents
1. Understanding Data Dictionaries
2. Creating a Data Dictionary
3. Looping through a Dictionary
4. Data Dictionary Methods and Copying a Dictionary

Weekly Learning Outcomes
1. Producing a data dictionary and seeing how to loop through a dictionary
2. Copying dictionaries and deleting items in a dictionary
3. Using multi-key dictionaries

Required Reading
(Book 2: Understanding Python Building Blocks) Chapter 4: Cruising Massive Data with Dictionaries (John C. Shovic and Alan Simpson, Python All-in-One For Dummies, 2nd edition, 2021).

Cruising Massive Data with Dictionaries
Introduction
Data dictionaries, also called associative arrays in some languages, are kind of like lists, which are discussed in an earlier chapter.
But each item in a dictionary is identified not by its position but by a key. You define the key, which can be a string or a number. All that matters is that it is unique to each item in the dictionary.

Understanding Data Dictionaries
A data dictionary is similar to a list, except that each item has a unique key. The value you associate with a key can be a number, string, list, tuple, or just about anything, really. So, you can think of a data dictionary as being similar to a table where the first column contains the key, a single item of information unique to that row, and the second column, the value, contains information relevant to, and perhaps unique to, that key.

Understanding Data Dictionaries
In the slide's example, the left column contains a key unique to each row. The second column is the value assigned to each key. The left column shows an abbreviation of a person's name. Some businesses use names like these when assigning user accounts and email addresses to their employees. The value corresponding to each key doesn't have to be a string or an integer. It can be a list or a tuple.

Understanding Data Dictionaries
In the next example, the value of each key includes a name, a year (perhaps the year of hire or birth year), a number (for example, the number of dependents the person claims for taxes), and a Boolean True or False value (which may indicate, for example, whether the person has a company cellphone). For now, it doesn't matter what each item of data represents. What matters is that for each key, you have a list (enclosed in square brackets) that contains four pieces of information about that key.

Understanding Data Dictionaries
A dictionary may also consist of several different keys, each representing a piece of data. For example, rather than have a row for each item with a unique key, you might make each employee their own little dictionary. Then you can assign a key name to each unit of information.
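The difference between a positional list and a per-employee dictionary can be sketched as follows. The key names (full_name, year_hired, dependents, has_company_cell) and the variable ppatel_list are assumptions chosen for illustration, not names fixed by the slides at this point.

```python
# Positional version: the reader must remember what each index means.
ppatel_list = ['Priya Patel', 2015, 1, True]

# Keyed version: each unit of information gets a descriptive key name.
# The key names here are illustrative assumptions.
ppatel = {
    'full_name': 'Priya Patel',
    'year_hired': 2015,
    'dependents': 1,
    'has_company_cell': True,
}

# Both lines retrieve the same fact, but the second one documents itself.
print(ppatel_list[1])
print(ppatel['year_hired'])
```

The keyed version costs a little more typing when the data is created, but the lookups remain readable even if more fields are added later.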
Understanding Data Dictionaries
Having multiple keys for each dictionary entry is common in Python, because the language makes it easy to isolate the specific item of data you want using object.key syntax, like this:

    ppatel.full_name = 'Priya Patel'
    ppatel.year_hired = 2015
    ppatel.dependents = 1
    ppatel.has_company_cell = True

(For a plain dictionary, the same items are reached with square brackets, for example ppatel['full_name'].) The key name is more descriptive than an index based on position, as you can see in the following example:

    ppatel[0] = 'Priya Patel'
    ppatel[1] = 2015
    ppatel[2] = 1
    ppatel[3] = True

Creating a Data Dictionary
The code for creating a data dictionary follows this basic syntax:

    name = {key:value, key:value, key:value, key:value, ...}

The name is a name you make up; it generally describes to whom or what the key-value pairs refer.
The key:value pairs are enclosed in curly braces.
The key is usually a string enclosed in quotation marks, but you can use integers instead.
Each colon (:) separates the key name from the value assigned to it.
The value is whatever you want to store for that key name, and can be a number, string, list, or pretty much anything.
The ellipsis (...) just means that you can have as many key-value pairs as you want. Just remember to separate key:value pairs with commas, as shown in the syntax example.

Creating a Data Dictionary
To make the code more readable, developers often place each key:value pair on a separate line. The syntax is still the same. The only difference is that a line break follows each comma, as in the following:

    name = {
        key:value,
        key:value,
        key:value,
        key:value,
        ...
    }

Creating a Data Dictionary
Open a Jupyter notebook, a .py file, or a Python prompt, and type the following code. Note that we created a dictionary named people that contains multiple key:value pairs, each separated by a comma. The keys and values are strings, so they're enclosed in quotation marks, and each key is separated from its value with a colon. It's important to keep all that straight; otherwise, the code won't work. Yes, even one missing, misplaced, or mistyped quotation mark, colon, comma, or curly brace can mess up the whole thing:

    people = {
        'htanaka': 'Haru Tanaka',
        'ppatel': 'Priya Patel',
        'bagarcia': 'Benjamin Alberto Garcia',
        'zmin': 'Zhang Min',
        'afarooqi': 'Ayesha Farooqi',
        'hajackson': 'Hanna Jackson',
        'papatel': 'Pratyush Aarav Patel',
        'hrjackson': 'Henry Jackson'
    }

Accessing dictionary data
After you've added the data, you can work with it in a number of ways. Using print(people), that is, a print() function with the name of the dictionary in the parentheses, you get a printout of the entire dictionary, as follows:

    print(people)
    {'htanaka': 'Haru Tanaka', 'ppatel': 'Priya Patel', 'bagarcia': 'Benjamin Alberto Garcia', 'zmin': 'Zhang Min', 'afarooqi': 'Ayesha Farooqi', 'hajackson': 'Hanna Jackson', 'papatel': 'Pratyush Aarav Patel', 'hrjackson': 'Henry Jackson'}

Typically, this is not what you want. More often, you're looking for one specific item in the dictionary. In that case, use this syntax:

    dictionaryname[key]

Accessing dictionary data
Here dictionaryname is the name of the dictionary, and key is the key you're searching for. For example, if you want to know the value of the zmin key, enter:

    print(people['zmin'])

Think of this line as saying, "print people sub zmin," where sub just means the specific key. When you do that, Python returns the value for that one person: the full name for zmin.

Accessing dictionary data
Note that in the code, zmin is in quotation marks because it's a string. You can use a variable name instead, as long as it contains a string. For example, consider the following two lines of code. The first one creates a variable named person and puts the string 'zmin' into that variable.
The next line doesn't require quotation marks because person is a variable name:

    person = 'zmin'
    print(people[person])

Getting the length of a dictionary
The number of items in a dictionary is considered its length. As with lists, you can use the len() function to determine a dictionary's length. The syntax is:

    len(dictionaryname)

Replace dictionaryname with the name of the dictionary you're checking. For example, the following code creates a dictionary and then stores its length in the howmany variable. The print statement shows 8:

    people = {
        'htanaka': 'Haru Tanaka',
        'ppatel': 'Priya Patel',
        'bagarcia': 'Benjamin Alberto Garcia',
        'zmin': 'Zhang Min',
        'afarooqi': 'Ayesha Farooqi',
        'hajackson': 'Hanna Jackson',
        'papatel': 'Pratyush Aarav Patel',
        'hrjackson': 'Henry Jackson'
    }
    # Count the number of key:value pairs and put in a variable.
    howmany = len(people)
    # Show how many.
    print(howmany)

Seeing whether a key exists in a dictionary
Use the in keyword to see whether a key exists. If the key exists, in returns True. If the key doesn't exist, in returns False. The slide's example uses two print() statements. The first one checks whether hajackson exists in the dictionary. The second checks whether schmeedledorp exists in the dictionary. The first print() statement shows True because hajackson is in the dictionary. The second one shows False because schmeedledorp isn't in the dictionary.

Getting dictionary data with get()
Having the program crash and burn when you look for something that isn't in the dictionary is a little harsh. A more elegant way to handle that situation is to use the .get() method of a data dictionary. The syntax is:

    dictionaryname.get(key)

Replace dictionaryname with the name of the dictionary you're searching. Replace key with the thing you're looking for:

    # Look for a person.
    person = 'bagarcia'
    print(people.get(person))

Note that get() uses parentheses, not square brackets.
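The three lookup styles discussed so far (square brackets, the in keyword, and get()) can be compared in one short sketch. The shortened people dictionary and the variable names has_hanna, has_schmee, found, not_found, and fallback are assumptions for this illustration; the optional second argument to get() is standard Python behavior that the slides do not mention.

```python
# A shortened version of the sample dictionary from the slides.
people = {
    'bagarcia': 'Benjamin Alberto Garcia',
    'hajackson': 'Hanna Jackson',
}

# The in keyword tests whether a key exists.
has_hanna = 'hajackson' in people
has_schmee = 'schmeedledorp' in people
print(has_hanna, has_schmee)

# get() returns the value for an existing key ...
found = people.get('bagarcia')
# ... and None, rather than an error, for a missing key.
not_found = people.get('schmeedledorp')
# Assumption for illustration: a second argument supplies a custom fallback.
fallback = people.get('schmeedledorp', 'Unknown')
print(found, not_found, fallback)

# Square brackets raise KeyError for a missing key.
try:
    people['schmeedledorp']
except KeyError:
    print('KeyError raised')
```

Square brackets are a reasonable choice when a missing key indicates a bug, while get() suits cases where a missing key is an expected outcome.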
Getting dictionary data with get()
What makes .get() different is what happens when you search for a nonexistent key: you don't get an error, and the program doesn't crash and burn. Instead, get() gracefully returns None to let you know that, for example, no person named schmeedledorp is in the people dictionary.

Changing the value of a key
Dictionaries are mutable, which means you can change the contents of the dictionary from code (not that you can make the dictionary shut up). The syntax is simply:

    dictionaryname[key] = newvalue

Replace dictionaryname with the name of the dictionary, key with the key that identifies the item, and newvalue with whatever you want the new value to be.

Adding or changing dictionary data
You can use the dictionary update() method to add a new item to a dictionary or to change the value of a current key. The syntax is:

    dictionaryname.update({key: value})

Replace dictionaryname with the name of the dictionary. Replace key with the key of the item you want to add or change. If the key you specify doesn't exist in the dictionary, it is added as a new item with the value you specify. If the key you specify does exist, nothing is added; the value of that key is changed to whatever you specify as the value.

Adding or changing dictionary data
The following Python code creates a data dictionary named people and puts two people's names into it:

    # Make a data dictionary named people.
    people = {
        'papatel': 'Pratyush Aarav Patel',
        'hrjackson': 'Henry Jackson'
    }
    # Change the value of the hrjackson key.
    people.update({'hrjackson' : 'Henrietta Jackson'})
    print(people)
    # Update the dictionary with a new key:value pair.
    people.update({'wwiggins' : 'Wanda Wiggins'})

Adding or changing dictionary data
The first update line changes the value for hrjackson from Henry Jackson to Henrietta Jackson because the hrjackson key already exists in the data dictionary:

    people.update({'hrjackson' : 'Henrietta Jackson'})

The second update() reads as follows:

    people.update({'wwiggins' : 'Wanda Wiggins'})

There is no wwiggins key in the dictionary, so update() can't change the name for wwiggins; it adds the pair as a new item instead.

Adding or changing dictionary data
If the key already exists in the dictionary, its value is updated because no two items in a dictionary are allowed to have the same key. If the key does not already exist, the key-value pair is added because nothing in the dictionary already has that key, so the only choice is to add it. After running the code, the dictionary contains three items: papatel, hrjackson (with the new name), and wwiggins. Adding the following lines to the end of that code displays everything in the dictionary:

    # Show what's in the data dictionary now.
    for person in people.keys():
        print(person + " = " + people[person])

Looping through a Dictionary
You can loop through each item in a dictionary in much the same way you can loop through lists and tuples, but you have some extra options. If you just specify the dictionary name in the for loop, you get all the keys, as follows:

    for person in people:
        print(person)

    htanaka
    ppatel
    bagarcia
    zmin
    afarooqi
    hajackson
    papatel
    hrjackson

Looping through a Dictionary
To see the value of each item, keep the for loop the same, but print dictionaryname[key], where dictionaryname is the name of the dictionary (people in our example) and key is whatever name you use right after the for in the loop. Running this code against the sample people dictionary lists all the names, as follows:

    for person in people:
        print(people[person])

    Haru Tanaka
    Priya Patel
    Benjamin Alberto Garcia
    Zhang Min
    Ayesha Farooqi
    Hanna Jackson
    Pratyush Aarav Patel
    Henry Jackson

Looping through a Dictionary
You can also get all the names by using a slightly different syntax in the for loop: add .values() to the dictionary name, as in the following. Then you can just print the variable name (person) inside the loop.

    for person in people.values():
        print(person)

You can loop through the keys and values at the same time by using .items() after the dictionary name in the for loop.

Week 7
Wrangling Bigger Chunks of Code

Contents
1. Creating a Function
2. Commenting a Function
3. Passing Information to a Function
4. Returning Values from Functions
5. Unmasking Anonymous Functions

Weekly Learning Outcomes
1. Creating your own function and including a comment in a function
2. Seeing how to pass information to a function
3. Returning values from a function
4. Understanding anonymous functions

Required Reading
(Book 2: Understanding Python Building Blocks) Chapter 5: Wrangling Bigger Chunks of Code (John C. Shovic and Alan Simpson, Python All-in-One For Dummies, 2nd edition, 2021).

Wrangling Bigger Chunks of Code
Introduction
Functions provide a way to compartmentalize your code into small tasks that can be called from multiple places in an app. For example, if something you need to access throughout the app requires a dozen lines of code, chances are you don't want to repeat that code over and over every time you need it. If all that code were contained in a function, you would have to change or fix it in only one location.
To access the task that the function performs, you call the function from your code, just like you call a built-in function such as print.

Creating a Function
Creating a function is easy. Start a new line with def (short for definition) followed by a space, and then a name of your own choosing followed by a pair of parentheses with no spaces before or inside. Then put a colon at the end of that line. For example, to create a simple function named hello(), type:

    def hello():

Creating a Function
This is a function, but it doesn't do anything. To make the function do something, you have to write Python code on subsequent lines. To ensure that the new code is "inside" the function, indent each of those lines. We'll start by just having the function print Hello. So, type print('Hello') indented under the def line. Now your code looks like this:

    def hello():
        print('Hello')

Creating a Function
If you run the code now, nothing happens, because the code inside a function isn't executed until the function is called. Call your own functions the same way you call built-in functions: by writing code that calls the function by name, including the parentheses at the end. When that code executes, you should see the output, which is just the word Hello.

Commenting a Function
Comments are always optional in code. But it's customary to make the first line under the def statement a docstring (text enclosed in triple quotation marks) that describes what the function does. It's also common to put a comment, preceded by a # sign, to the right of the parentheses in the first line. Here's an example using the simple hello() function:

    def hello(): # Practice function
        """ A docstring describing the function """
        print('Hello')

Commenting a Function
Comments don't have any effect on what the code does. Comments are just notes to yourself or to programming team members describing what the code is about.
Running the code again displays the same results.

Passing Information to a Function
You can pass information to a function for it to work on. To do so, enter a parameter name in the def statement for each piece of information you'll be passing to the function. You can use any name for the parameter, as long as it starts with a letter or underscore, followed by any combination of letters, underscores, and digits. The name should not contain spaces or punctuation. (Parameter names and variable names follow the same rules.) Ideally, the parameter name should describe what's being passed in, for code readability, but you can use generic names like x and y, if you prefer. Any name you provide as a parameter is local only to that function.

Passing Information to a Function
The technical term for the way variables work inside functions is local scope, meaning the scope of the variables' existence and influence stays inside the function and does not extend further. Variables created and modified inside a function literally cease to exist the moment the function stops running. Any variables defined outside the function are unaffected by the goings-on inside the function.

Passing Information to a Function
A function can return a value, and that returned value is visible outside the function. More on how this process works in a moment. For example, suppose you want to pass a person's name into the hello function and then use the name in the print() statement. You could use a generic name for the parameter, like this:

    def hello(x): # Practice function
        """ A docstring describing the function """
        print('Hello ' + x)

Passing Information to a Function
Inside the parentheses of hello(x), the x is a parameter, a placeholder for whatever is being passed in. Inside the function, that x refers only to the value passed into the function.
Any variables named x outside the function are separate from the x used in the parameter name and inside the function. A more descriptive parameter name makes the code easier to read:

    def hello(user_name): # Practice function
        """ A docstring describing the function """
        print('Hello ' + user_name)

Passing Information to a Function
In the print() function, we added a space after the o in Hello so there'd be a space between Hello and the name in the output. When a function has a parameter, you have to pass it a value when you call it, or it won't work. For example, if you added the parameter to the def statement and still tried to call the function without passing a value, as in the following code, running the code would produce an error:

    def hello(user_name): # Practice function
        """ A docstring describing the function """
        print('Hello ' + user_name)

    hello()

Passing Information to a Function
The error would read something like the following:

    hello() missing 1 required positional argument: 'user_name'

The value you pass can be a literal (the exact data you want to pass in) or the name of a variable that contains that information. For example, when you run this code, the output is Hello Alan:

    def hello(user_name): # Practice function
        """ A docstring describing the function """
        print('Hello ' + user_name)

    hello('Alan')

Passing Information to a Function
You can use a variable to pass data too.

Defining optional parameters with defaults
When you call a function that expects parameters without passing those parameters, you get an error. You can write a function so that passing a parameter is optional, but you have to tell the function what to use if nothing gets passed.
The syntax follows:

    def functionname(parametername=defaultvalue):

For example:

    def hello(user_name = 'nobody'): # Practice function
        """ A docstring describing the function """
        print('Hello ' + user_name)

Passing multiple values to a function
So far in all our examples we've passed just one value to the function. But you can pass as many values as you want. Just provide a parameter name for each value and separate the names with commas. Suppose you want to pass the user's first name, last name, and maybe a date to the function. You could define those three parameters like this:

    def hello(fname, lname, datestring): # Practice function
        """ A docstring describing the function """
        print('Hello ' + fname + ' ' + lname)
        print('The date is ' + datestring)

Passing multiple values to a function
If you want to use some (but not all) optional parameters with multiple parameters, make sure the optional ones are the last ones entered. For example, consider the following, which would not work:

    def hello(fname, lname='unknown', datestring):

In the working version, where only the last parameter (datestring) is optional, the code inside the function logically does the following:
Create a variable named msg and put in Hello and the first and last name.
If the datestring passed has a length greater than 0, add "you mentioned" and that datestring to the msg variable.
Print whatever is in the msg variable at this point.

Passing multiple values to a function
In the example, the first call passes three values, and the second call passes only two. Both work because the third parameter is optional. The output from the first call is the full output including the date, and the output from the second omits the part about the date.

Using keyword arguments (kwargs)
If you've read the official Python documentation at Python.org, you may have noticed that it throws around the term kwargs a lot. The term is short for keyword arguments, which are yet another way to pass data to a function. The term argument is the technical term for "the value you are passing to a function's parameters." So far, we've used strictly positional arguments.
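The contrast between positional and keyword arguments can be sketched as follows. The function name greeting and the sample names are assumptions for this illustration; the body follows the msg logic described on the earlier slide, but it returns the string instead of printing it so that the two calls can be compared.

```python
def greeting(fname, lname, datestring=''):
    """Hypothetical helper: build the greeting and return it as a string."""
    msg = 'Hello ' + fname + ' ' + lname
    # Only mention the date when one was actually passed in.
    if len(datestring) > 0:
        msg += ', you mentioned ' + datestring
    return msg

# Positional arguments: the order of the values decides which parameter gets which value.
positional = greeting('Kylie', 'Janda', '12/30/2019')

# Keyword arguments: each value is labeled, so the order no longer matters.
keyword = greeting(datestring='12/30/2019', lname='Janda', fname='Kylie')

print(positional)
print(keyword)

# The optional third parameter can still be omitted in either style.
print(greeting('Kylie', lname='Janda'))
```

Keyword arguments are mostly a readability aid: a call with labeled values is easier to check against the function signature than a row of unlabeled literals.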
The examples that follow assume this function signature:

    def hello(fname, lname, datestring=''):

Pass variables
The values to be passed to the function are first placed in variables named appt_date, last_name, and so forth. Then the last line calls the hello() function again as in previous examples. But the value assigned to each parameter name is the name of a variable, not a literal value being passed in.

    appt_date = '12/30/2019'
    last_name = 'Janda'
    first_name = 'Kylie'
    hello(datestring=appt_date, lname=last_name, fname=first_name)

Figure: Calling a function with keyword arguments (kwargs).

Passing multiple values in a list
You can also pass iterables to a function. Remember, an iterable is anything that Python can loop through to get values. A list is a simple and perhaps the most commonly used iterable.

Passing multiple values in a list
Trick to working with lists: If you want to alter the list contents (for example, by sorting the contents), make a copy of the list in the function and then make changes to the copy. The function doesn't receive its own separate list; it receives a reference to the original list, which indicates the list's location. Changing the list in place through that reference would also change the caller's list. The function can do anything it likes with its own copy of the list, and the original list remains unchanged.

Passing multiple values in a list
Here is a new function named alphabetize() that takes one argument, a list of names. The name of the parameter is original_list. The entire parameter declaration is original_list=[]. The square brackets indicate an empty list as the default, in case nothing is passed in as a parameter. In other words, we're using =[] to define the default input as an empty list. The function can alphabetize a list of any number of words or names:

    def alphabetize(original_list=[]):
        """ Pass any list in square brackets, displays a string with items sorted """
        # Inside the function make a working copy of the list passed in.
        sorted_list = original_list.copy()
        # Sort the working copy.
        sorted_list.sort()
        # Make a new empty string for output
        final_list = ''
        # Loop through sorted list and append name and comma and space.
        for name in sorted_list:
            final_list += name + ', '
        # Knock off last comma space if the string is not blank
        final_list = final_list[:-2]
        # Print the alphabetized list.
        print(final_list)

Passing in an arbitrary number of arguments
A list provides one way of passing a lot of values into a function. You can also design the function so that it accepts any number of arguments. (Note that this method is not particularly faster or better, so use whichever is easiest or makes the most sense.) To pass in any number of arguments, use *args as the parameter name, like this:

    def sorter(*args):

Whatever you pass in becomes a tuple named args inside the function.

Passing in an arbitrary number of arguments
Remember, a tuple is an immutable list (a list you can't change). So again, if you want to change things, you need to copy the tuple to a list and then work on that copy. Here is an example where the code uses the simple statement newlist = list(args). You can read that as: the variable named newlist is a list of all the things that are in the args tuple. The next line, newlist.sort(), sorts the list, and print displays the contents of the list:

    def sorter(*args):
        """ Pass in any number of arguments separated by commas
        Inside the function, they are treated as a tuple named args. """
        # Create a list from the passed-in tuple.
        newlist = list(args)
        # Sort and show the list.
        newlist.sort()
        print(newlist)

Returning Values from Functions
A function can return some value, which the calling code can then put in a variable of its own.
The line that does the returning is typically the last line of the function: the keyword return, followed by a space and the name of the variable (or some expression) that contains the value to be returned. Here is a variation of the alphabetize function. It contains no print statement. Instead, at the end, it simply returns the alphabetized list (final_list) that the function created:

Returning Values from Functions

    def alphabetize(original_list=[]):
        """ Pass any list in square brackets, returns a string with items sorted """
        # Inside the function make a working copy of the list passed in.
        sorted_list = original_list.copy()
        # Sort the working copy.
        sorted_list.sort()
        # Make a new empty string for output
        final_list = ''
        # Loop through sorted list and append name and comma and space.
        for name in sorted_list:
            final_list += name + ', '
        # Knock off last comma space
        final_list = final_list[:-2]
        # Return the alphabetized list.
        return final_list

Returning Values from Functions
The most common way to use functions is to store whatever they return in some variable. For example, in the following code, the first line defines a variable called random_list, which is just a list containing names in no particular order, enclosed in square brackets (which tells Python it's a list). The second line creates a new variable named alpha_list by passing random_list to the alphabetize() function and storing whatever that function returns. The final print statement displays whatever is in the alpha_list variable:

    random_list = ['McMullen', 'Keaser', 'Maier', 'Wilson', 'Yudt', 'Gallagher', 'Jacobs']
    alpha_list = alphabetize(random_list)
    print(alpha_list)

Unmasking Anonymous Functions
Python supports the concept of anonymous functions, also called lambda functions. The anonymous part of the name is based on the fact that the function doesn't need to have a name (but can have one if you want it to).
The lambda part is based on the use of the keyword lambda to define anonymous functions in Python. The minimal syntax for defining a lambda expression (with no name) follows:

    lambda arguments : expression

Replace arguments with the data being passed into the expression. Replace expression with an expression (formula) that defines what you want the anonymous function to return.

Write the following code to put the names in a list, sort the list, and then print it:

    names = ['Adams', 'Ma', 'diMeola', 'Zandusky']
    names.sort()
    print(names)

The output follows:

    ['Adams', 'Ma', 'Zandusky', 'diMeola']

Note that diMeola comes last because the default sort is case-sensitive, and lowercase letters sort after uppercase letters.

Week 8
Doing Python with Class

Contents
1. Mastering Classes and Objects
2. Creating a Class and Creating an Instance from a Class
3. Giving an Object Its Attributes
4. Giving a Class Methods

Weekly Learning Outcomes
1. Mastering classes and objects
2. Creating a class and an instance from a class
3. Defining attributes with default values
4. Understanding passing parameters to methods and calling a class method by class name

Required Reading
Book 2: Understanding Python Building Blocks, Chapter 6: Doing Python with Class (part 1), pp. 217-237 (John C. Shovic and Alan Simpson, Python All-in-One For Dummies, 2nd edition, 2021).

Doing Python with Class

Introduction: This week covers classes, which allow you to compartmentalize code and data. You discover all the wonder, majesty, and beauty of classes and objects. Classes have become a defining characteristic of modern object-oriented programming languages such as Python.

Mastering Classes and Objects

Python is an object-oriented programming language. The concept of object-oriented programming (OOP) has been a major buzzword in the computer world for at least a couple of decades.
The term object stems from the fact that the model resembles objects in the real world in that each object is a thing that has certain attributes and characteristics that make it unique. For example, a chair is an object. Lots of different chairs exist that differ in size, shape, color, and material. But they're all still chairs.

The same concept applies to cars: all cars (although not identical) have certain attributes and methods in common. In this case, you can think of the class Car as being a factory that creates all cars. After each car is created, it is an independent object. Changing one car has no effect on the other cars or the Car class.

Class: A piece of code from which you can generate a unique object, where each object is a single instance of the class. Think of a class as a blueprint or factory from which you can create individual objects.

Instance: One unit of data plus code generated from a class as an instance of that class. Each instance of a class is also called an object. Just as all the different cars are objects, all created by some car factory, every instance is created by some class.

Attribute: A characteristic of an object that contains information about the object. Also called a property of the object. An attribute name is preceded by a dot, as in member.username, which may contain the username for one site member.

Method: A Python function associated with the class. A method defines an action that an object can perform. You call a method by preceding the method name with a dot and following it with a pair of parentheses.

Creating a Class

You create your own classes like you create your own functions. You are free to name the class whatever you want, so long as it's a legitimate name that starts with a letter or underscore and contains no spaces or punctuation. It's customary to start a class name with an uppercase letter to help distinguish classes from variables.
To get started, all you need is the word class followed by a space, a class name of your choosing, and a colon.

To create a new class named Member, use class Member:. To make your code more descriptive, feel free to put a comment above the class definition. You can also put a docstring below the class line, which will show up whenever you type the class name in VS Code.

    # Define a new class named Member.
    class Member:
        """ Create a new member. """

EMPTY CLASSES

If you start a class with class name: and then run your code before finishing the class, you'll actually get an error. To get around that, you can tell Python that you're just not quite ready to finish writing the class by putting the keyword pass below the definition, as in the following code:

    # Define a new class named Member.
    class Member:
        pass

In essence, what you're doing there is telling Python, "Hey, I know this class doesn't really work yet, but just let it pass and don't throw an error message telling me about it."

Creating an Instance from a Class

To give your class the capability to set up instances (objects) for you, you give the class an init method. The word init is short for initialize. As a method, it's really just a function defined inside a class. But it must have the specific name __init__ (that's two underscores followed by init followed by two more underscores). The syntax for creating an init method is:

    def __init__(self[, suppliedprop1, suppliedprop2,...])

The def is short for define, and __init__ is the name of the special method that Python calls automatically to initialize each new object created from the class. The self part is just a variable name and is used to refer to the object being created at the moment. You can use a name of your own choosing instead of self. But most would consider self a best practice because it's explanatory and customary.
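The following is a minimal sketch of those two points: Python runs __init__ automatically when an instance is created, and the first parameter is an ordinary variable name. The class names Greeter and OddlyNamed are hypothetical and exist only for this illustration.

```python
class Greeter:
    """Hypothetical class used only to illustrate __init__."""

    def __init__(self, name):
        # Python calls __init__ automatically when Greeter(...) runs.
        # self refers to the new object being set up.
        self.name = name


class OddlyNamed:
    """Same idea, but with a different name for the first parameter."""

    def __init__(this, name):
        # This works because the first parameter is just a variable name,
        # but self remains the customary choice.
        this.name = name


first = Greeter('Ada')
second = OddlyNamed('Grace')
print(first.name)
print(second.name)
```

Both classes behave the same way; the second form is shown only to make the point, not as a recommendation.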
Creating an Instance from a Class

This business of classes is easier to learn and understand if you start simply. So, for a working example, you'll create a class named Member, into which you'll pass a username (uname) and full name (fname) whenever you want to create a member. As always, you can precede the code with a comment.

You can also put a docstring (in triple quotation marks) under the first line, both as a comment and as an IntelliSense reminder when typing code in VS Code:

    # Define a class named Member for making member objects.
    class Member:
        """ Create a member from uname and fname """
        def __init__(self, uname, fname):

When the def __init__ line executes, you have an empty object named self inside the class.

Giving an Object Its Attributes

When you have a new, empty Member object, you can start giving it attributes and populate (store values in) those attributes. For example, let's say you want each member to have a .username attribute that contains the user's username (perhaps for logging in). You have a second attribute named fullname, which is the member's full name. To define and populate those attributes, use the following:

    self.username = uname
    self.fullname = fname

The first line creates an attribute named username for the new instance (self) and puts into it whatever was passed in as the uname parameter when the class was called. The second line creates an attribute named fullname for the new self object, and puts into it whatever was passed in as the fname parameter.

Add some comments and the entire class looks like this:

    # Define a new class named Member.
    class Member:
        """ Create a new member. """
        def __init__(self, uname, fname):
            # Define attributes and give them values.
            self.username = uname
            self.fullname = fname

Creating an instance from a class

When you've created the class, you can create instances (objects) from it using this simple syntax:

    this_instance_name = Member('uname', 'fname')

Replace this_instance_name with a name of your own choosing (in much the same way you may name a dog, who is an instance of the Dog class). Replace uname and fname with the username and full name you want to put into the object that will be created. Make sure you don't indent that code; otherwise, Python will think that new code still belongs to the class's code. It doesn't. It's new code to test the class.

For example, let's say you want to create a member named new_guy with the username Rambo and the full name Rocco Moe. Here's the code for that:

    new_guy = Member('Rambo', 'Rocco Moe')

To see what's really in the new_guy instance of Member, you can print it as a whole. You can also print type(new_guy) to ask Python what type new_guy is. This code does it all:

    print(new_guy)
    print(new_guy.username)
    print(new_guy.fullname)
    print(type(new_guy))

In the output, the first line looks like this (the hexadecimal number varies from run to run):

    <__main__.Member object at 0x...>

This output tells you that new_guy is an object created from the Member class. The number at the end is its location in memory. The next three lines of output are

    Rambo
    Rocco Moe
    <class '__main__.Member'>

The first line is the username of new_guy (new_guy.username), and the second line is the full name of new_guy (new_guy.fullname). The last line is the type and tells you that new_guy is an instance of the Member class.

Changing the value of an attribute

The attribute:value pairs of an instance can look a lot like a group of values stored in a tuple. There is one major difference, though: Tuples are immutable, meaning that after they're defined, your code can't change anything about them.
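The contrast with tuples can be sketched as follows. The short Member class here is a cut-down stand-in for the class built in the slides, and the names guy_tuple and tuple_changed are hypothetical, introduced only for this illustration.

```python
class Member:
    """Cut-down Member class, for illustration only."""

    def __init__(self, uname, fname):
        self.username = uname
        self.fullname = fname


# A tuple holding the same two values.
guy_tuple = ('Rambo', 'Rocco Moe')
try:
    guy_tuple[0] = 'Princess'
    tuple_changed = True
except TypeError:
    # Tuples do not support item assignment.
    tuple_changed = False

# An instance holding the same two values can be changed freely.
new_guy = Member('Rambo', 'Rocco Moe')
new_guy.username = 'Princess'

print(tuple_changed)
print(new_guy.username)
```

The attempt to change the tuple raises a TypeError, while assigning to the instance's attribute simply replaces the old value.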
After you create an object, you can change the value of any attribute at any time using the following simple syntax:

    objectname.attributename = value

Replace objectname with the name of the object (which you've already created via the class). Replace attributename with the name of the attribute whose value you want to change. Replace value with the new value.

    new_guy.username = "Princess"

Defining attributes with default values

You don't have to pass in the value of every attribute for a new object. If you're always going to give an attribute some default value at the moment the object is created, you can just use self.attributename = value, the same as before, in which attributename is a name of your own choosing. The value can be something you set directly, such as True or False for a Boolean, or today's date, or anything that Python can calculate or determine without you providing the value.

If you're going to be doing anything with dates and times, you'll want to import the datetime module, so put that import at the top of your file, even before the class Member: line. Then you can add the following lines before or after the other lines that assign values to attributes within the class:

    self.date_joined = dt.date.today()
    self.is_active = True

Add the import and those two new attributes to the class:

    import datetime as dt

    # Define a new class named Member.
    class Member:
        """ Create a new member. """
        def __init__(self, username, fullname):
            # Define attributes and give them values.
            self.username = username
            self.fullname = fullname
            # Default date_joined to today's date.
            self.date_joined = dt.date.today()
            # Set is_active to True initially.
            self.is_active = True

Note that a default value is just that: It's a value that is assigned automatically when you create the object.
But you can change a default value in the same way you would change any other attribute's value, using this syntax:

    objectname.attributename = value

For example, you might use the is_active attribute to determine whether a user is active and can log into your site. If a member turns out to be an obnoxious troll and you don't want him logging in anymore, you could just change the is_active attribute to False like this:

    newmember.is_active = False

Giving a Class Methods

Any object you define can have any number of attributes, each given any name you like, to store information about the object, such as a dog's breed and color or a car's make and model. You can also define your own methods for any object, which are more like behaviors than facts about the object. For example, a dog can eat, sleep, and bark. A car can go, stop, and turn. A method is really just a function. What makes it a method is the fact that it's associated with a particular class and with each specific object you create from that class.

Method names are distinguished from attribute names for an object by the pair of parentheses that follow the name. To define what the methods will be in your class, use this syntax for each method:

    def methodname(self[, param1, param2,...]):

Replace methodname with a name of your choosing (all lowercase, no spaces). Keep the word self in there as a reference to the object being defined by the class. Optionally, you can also pass in parameters after self using commas, as with any other function.

Suppose you want to create a method named .show_datejoined() that returns the user's name and the date the user joined in a formatted string. Here is how you could define this method:

    # A method to return a formatted string showing date joined.
    def show_datejoined(self):
        return f"{self.fullname} joined on {self.date_joined:%m/%d/%y}"

To call the method from your code, use this syntax:

    objectname.methodname()

Passing parameters to methods

You can pass data into methods in the same way you do functions: by using parameter names inside the parentheses. However, keep in mind that self is always the first name after the method name, and you never pass data to the self parameter. For example, let's say you want to create a method called .activate() and pass it True if the user is allowed to log in or False if the user isn't. Whatever you pass in is assigned to the .is_active attribute. Here's how to define that method in your code:

    # Method to activate (True) or deactivate (False) account.
    def activate(self, yesno):
        """ True for active, False to make inactive """
        self.is_active = yesno

[Figure: the entire class followed by some code to test it.]

Calling a class method by class name

You can call a class's method using the following syntax:

    specificobject.method()

An alternative is to use the specific class name, which can help make the code easier for humans to understand:

    Classname.method(specificobject)

Replace Classname with the name of the class (which we typically define starting with an uppercase letter), followed by the method name, and then put the specific object (which you've presumably already created) inside the parentheses.

Using class variables

So far you've seen examples of attributes, which are sometimes called instance variables, because they're placeholders that contain information that varies from one instance of the class to another. Another type of variable you can use with classes is called a class variable, which is applied to all new instances of the class that haven't been created yet.
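As a rough sketch of how the pieces covered so far fit together, the class below combines the attributes and the two methods described above, followed by a few lines that exercise both calling styles. It is assembled from the fragments in the text rather than copied from the original figure, and the test values are assumptions carried over from the earlier new_guy example.

```python
import datetime as dt


class Member:
    """ Create a new member. """

    def __init__(self, username, fullname):
        # Define attributes and give them values.
        self.username = username
        self.fullname = fullname
        # Default date_joined to today's date.
        self.date_joined = dt.date.today()
        # Set is_active to True initially.
        self.is_active = True

    def show_datejoined(self):
        """ Return a formatted string showing date joined. """
        return f"{self.fullname} joined on {self.date_joined:%m/%d/%y}"

    def activate(self, yesno):
        """ True for active, False to make inactive """
        self.is_active = yesno


new_guy = Member('Rambo', 'Rocco Moe')
print(new_guy.show_datejoined())

# Usual style: call the method on the object itself.
new_guy.activate(False)
print(new_guy.is_active)

# Alternative style: call through the class name and pass the object in.
Member.activate(new_guy, True)
print(new_guy.is_active)
```

Both calls reach the same activate() method; in the second form the object is passed explicitly and becomes self inside the method.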
Class variables inside a class don't have any tie-in to self because the name self always refers to the specific object being created at the moment.

To define a class variable, place the cursor above the def __init__ line and define the variable using the standard syntax:

    variablename = value

Replace variablename with a name of your own choosing, and replace value with the specific value you want to assign to that variable. For example, let's say your code includes a free_days variable that grants people three months (90 days) of free access on sign-up.

    # Define a class named Member for making member objects.
    class Member:
        """ Create a member object """
        free_days = 90
        def __init__(self, username, fullname):
            self.username = username
            self.fullname = fullname

Using class methods

A class method is a method associated with the class as a whole, not specific instances of the class. In other words, class methods are similar in scope to class variables in that they apply to the whole class and not just individual instances of the class. As with class variables, you don't need self with class methods because that name always refers to the specific object being created at the moment, not to all objects created by the class.

So for starters, if you want a method to do something to the class as a whole, don't use def name(self) because the self immediately ties the method to one object. To define a class method, you first need to type this into your code:

    @classmethod

The @ at the start of this defines classmethod as a decorator (yep, yet another term to add to your ever-growing list of nerd-o-rama buzzwords). A decorator is generally something that alters or extends the functionality of that to which it is applied.

Suppose you want to define a method that sets the number of free days just before you start creating objects, so that all objects get the same free_days amount.
The following code accomplishes that by first defining a class variable named free_days that has a given default value of 0. (The default value can be anything.)

    # Class methods follow @classmethod decorator and refer to cls rather than
    # to self.
    @classmethod
    def setfreedays(cls, days):
        cls.free_days = days

The setfreedays() method is a class method in the Member class.

Using static methods

A static method starts with this decorator: @staticmethod. It is a generic function, and the only reason to define it as part of a class is if you want to use the same name elsewhere in another class in your code. Wherever you want a static method, you type the @staticmethod line. Below that line, you