DATA1002/1902 Informatics: Data and Computation 2B: Spreadsheets PDF

Summary

This is a lecture presentation about spreadsheet concepts. It provides information about the history, and how to use spreadsheets. It also details how spreadsheets are used in the cloud.

Full Transcript

DATA1002_1902 DATA1002/1902 Informatics: Data and Computation 2B: Spreadsheets...

DATA1002_1902 DATA1002/1902 Informatics: Data and Computation 2B: Spreadsheets Dr. Josiah Poon School of Computer Science Image source: 1 https://www.pngegg.com/en/png-pylpy/download Acknowledge: content from material by James Curran, Tara Murphy, Alan Fekete DATA1002_1902 COMMONWEALTH OF AUSTRALIA Copyright Regulations 1969 WARNING This material has been reproduced and communicated to you by or on behalf of the University of Sydney pursuant to Part VB of the Copyright Act 1968 (the Act). The material in this communication may be subject to copyright under the Act. Any further reproduction or communication of this material by you may be the subject of copyright protection under the Act. Do not remove this notice. 2 DATA1002_1902 Acknowledgement of Country We acknowledge the traditional custodianship and law of the Country on which the University of Sydney campuses stand. We pay respects to those who have cared and continue to care for the Country. Do not remove this notice. 3 DATA1002_1902 Discussion We expect most students have some experience with using spreadsheets But different students will have used different features, and to different extents This large slide set is a reference Make sure you think about the concepts Make sure you practice with whichever functions/mechanisms you haven’t mastered 4 DATA1002_1902 Spreadsheet overview Especially prevalent in business and organizational decision-making But are also widely used for scholarship in many fields Special strength: “what-if” investigations see the impact of changing some parameter or data value Microsoft Excel is widely used https://www.istockphoto.com/illustrations/online-spreadsheet most often on a desktop/laptop can be cloud-hosted Gentle learning curve simple things can be done very quickly and easily lots of powerful features for more complicated tasks See https://en.wikipedia.org/wiki/Spreadsheet 5 DATA1002_1902 Spreadsheets in the cloud Spreadsheet can be Software-as-a-service (“SaaS”) eg Microsoft Office 365, Google Sheets Instead of having the code and data on a local, owned machine, the code and data are on a machine in a datacenter owned/managed by a cloud provider (AWS, Google, Microsoft etc) Users access the spreadsheet as a page seen in a web browser Much better for collaborative work than running locally! Better for data backup, security updates, etc Code running in cloud is owned/managed either by cloud provider or by a third party Management exploits large scale sharing, for elastic resource consumption (use resources like CPU, memory, only as needed) Sometimes service is free for the casual user, or else charged based on usage (“pay as you go”) Cloud provider may charge a fee, owner of code may charge a fee 6 DATA1002_1902 Agenda in this Session Brief history (non-examinable) Spreadsheet concepts Introducing formulas Data import and arrangement Data exploration Autofill Lookup Conditional summaries Pivot table 7 DATA1002_1902 The beginnings (non-examinable) The first spreadsheets really were ‘sheets’ - all accounts information arranged on a single sheet of paper 1961 Mattessich used matrices to model budgets — Simulation of the Firm Through a Budget Computer Program 1969 Pardo and Landau co-invented LANPAR (LANguage for Programming Arrays at Random) 1979 Bricklin develops VisiCalc, the first visual spreadsheet, for Apple II it had 5 columns and 20 rows of data http://www.bricklin.com 8 DATA1002_1902 A killer app(non-examinable) “Many bought $2000 Apples to run the $100 software” https://en.wiki pedia.org/wiki/ VisiCalc Image By User:Gortu - apple2history.org, Public Domain, https://commons.wikimedia.org/w/index.php?curid=342925 9 DATA1002_1902 Improving the tools (non-examinable) 1983 Lotus 1-2-3 released. It added Charting spreadsheet macros context-sensitive help 1985 Bill Gates wrote Excel for the Apple Macintosh. It added a GUI with point and click capability pull down menus 1987 Excel 2.0 was the flagship product for Windows 2.0 2006 release of free-to-use cloud-hosted Google Sheets (then called “Google Spreadsheets”) 2011 Microsoft Office 365 launched with Excel as well as many other tools 2013 consumer plans added to previous corporate ones Today spreadsheets have remained similar to VisiCalc in appearance Excel is dominant but there are other alternatives including some Open Source: Apache OpenOffice Calc, LibreOffice 10 Most can work with one another file formats DATA1002_1902 Agenda in this Session Brief history (non-examinable) Spreadsheet concepts Introducing formulas Data import and arrangement Data exploration Autofill Lookup Conditional summaries Pivot table 11 DATA1002_1902 Which spreadsheet? To keep it concrete and specific, we will describe Microsoft Excel Others use the same concepts, and (mostly) the same formulas etc Though layout of screen may vary, even between versions of the same software 12 DATA1002_1902 13 DATA1002_1902 First key ideas Cell location contents value format appearance Formula Range Worksheet Lots of guidance available https://support.office.com/en-us/excel 14 DATA1002_1902 What is in a cell? Location where in the spreadsheet the cell is placed described by column (letter(s)) and row (number) eg C6, AB142 this can change when one inserts or deletes rows or columns Contents what has been placed there most recently current contents has a type (number, text, date, empty, etc) one can edit to change the contents, with same or different type Value often, just what has been placed there but sometimes, when the contents is a formula, the value is the result of a calculation which may make use of values from other cells 15 DATA1002_1902 Display a cell Format Associated to a cell, controls how the value is displayed on the screen The format of a cell can have a big impact on usability: changing the colour, font, shading etc to draw the audience attention Alignment also influences ease of comparisons etc Special formats for dates, currency, etc Appearance what is shown on the screen Based on the value, but adjusted to fit the format and cell size note: any formula that refers to this cell, uses the true value, not the appearance 16 DATA1002_1902 Number formats Formatting can affect the apparent precision of numbers But, the contents are not changed, and the exact value will still be used when calculating formulas for other cells Numbers are stored with 15 digit precision The number of digits displayed depends on format (and also cell size) Values are displayed rounded as appropriate Too many digits to be displayed in the available space for a cell results in #### You can increase the number of digits shown, beyond what is really stored — the number is padded with zeros 17 DATA1002_1902 Example A B 1 3.14159 Here, we typed in A1 giving the exact value 2 4.142 3.14159 3 5.1416 This is the contents of A1 4 6.141590000 Value of A1 is also 3.14159 A1 has format general number What is displayed is 3.14159 (provided cell is wide enough on screen; otherwise appearance is ####) In A2 we typed =A1+1 and we put format to be three decimal digits The contents of A2 is the formula =A1+1 The value of A2 is 4.14159 What is displayed (provided cell is wide enough on screen) is 4.142 A3 contains =A2+1 with format 4 decimal digits; A4 contains =A3+1 with format 10 decimal digits 18 DATA1002_1902 Formulas Each formula is text that starts with an = sign =(A1 + B1)*2 =1/(C1 - C3) Cell references are entered directly, or by clicking on cell – use location – or you can use a name which the cell has been given – Eg =(Population * RateHighSchoolGraduation) 19 DATA1002_1902 Name You can give a cell a name by selecting it (by clicking on it, or by typing its location into the name box in the top area) Then edit the name box (which starts with the usual location eg B5) to have text that will be an alternate name for this cell Name box Formula bar 20 DATA1002_1902 Editing You can select a cell then enter new contents by typing there Or type into the formula bar in the top area You can insert a whole row or a whole column You can delete a whole row or a whole column You can delete a cell (shift neighbours to prevent a gap) 21 DATA1002_1902 Changes? Discuss: how can you change the value of a cell without changing its contents? how can you change the location of a cell without changing its contents? how can you change the appearance of a cell without changing its contents? https://courses.uchicago.edu/2019/11/22/spark-effective-discussions-with-canvas-discussion-boards/ 22 DATA1002_1902 Range Refer to a rectangular collection of cells A B C D E Used as arguments in many functions 1 Expressed with top-left- location:bottom- 2 right- location 3 Eg B3:D5 Special case: a part of a column eg C2:C6, 4 or part of a row eg C5:E5 5 6 23 DATA1002_1902 Worksheets A spreadsheet document can comprise more than one sheet sheets are displayed as tabs in the window Each sheet has cells A1, A2, B1, etc etc Each sheet has a name use Format>Sheet>Rename to alter the name Formula can refer to cells on other sheets as well eg Sheetname!D6 24 DATA1002_1902 Agenda in this Session Brief history (non-examinable) Spreadsheet concepts Introducing formulas Data import and arrangement Data exploration Autofill Lookup Conditional summaries Pivot table 25 DATA1002_1902 Built-in functions Excel has many functions already available to use in formulas most are also in Google Sheets, OpenOffice etc They take arguments and return a result (which is determined by the values in the arguments) the result can be used in further calculations usual notation: FUNCTION(arguments, separated by commas) eg =SUM(3,5,7) If argument is a reference, use the value of that cell in calculation of the function eg suppose value in A2 is 6, then MAX(3, 4, A2) has result 6 Combine and/or nest functions eg MAX(3,5)+MIN(2,4) eg MAX(3, A5, SUM(B2, C2)) make sure parentheses match up 26 DATA1002_1902 Function arguments Functions can have a fixed or variable number of arguments: zero eg =PI(), one eg =SQRT(A2) variable eg =SUM(A1:A10,A50,B10) or =SUM(C2:C4) Most functions require the correct data type to be passed SUM and AVERAGE are exceptions Data types can be checked use functions ISTEXT, ISNUMBER Excel helps with syntax when you are typing a formula 27 DATA1002_1902 Simple summaries (1) =COUNT(arg1,arg2, …) Count each element among all the arguments that is a number, date, or text representation of a number argument can be a range, this counts from all the cells within that range error values or text that cannot be translated into numbers are ignored Eg Suppose B3:E3 contains; fred,4,[empty],-5 then =COUNT(B3:E3) has value 2 from C3 (4) and E3 (-5) 28 DATA1002_1902 Simple summaries (2) =MAX(arg1, arg2, … ) or =MIN(arg1, arg2, …) or =SUM(arg1, arg2, …) or =AVERAGE(arg1, arg2, …) use a range as an arg, this just includes all cells from the range ignore empty cells, text values for MAX, MIN, SUM, when the arguments contain no numbers at all, result is 0 DATA1002_1902 Errors Possible error values are: #DIV/0 Division by zero #NAME? Undefined variable or function name (or a space between the function name and the opening parenthesis) #N/A No value is available #NULL! A result has no value #NUM! Numerical error (eg. SQRT(-1)) #REF! Invalid cell reference #VALUE! Invalid argument type A circular reference error occurs when a formula contains a reference to its own location eg. =SUM(A1:A10) in the cell A7 Excel displays a dialog and prevents user entering a formula with this mistake Image source: https://www.elegantthemes.com/blog/wordpress/the-ultimate-guide-to-common-http-error-codes 30 DATA1002_1902 Choices, choices The IF function allows a calculation that differs in varying situations Conditional calculation is very common in business and government but also science Price discounts for volume purchased, customer loyalty, etc Tax rates change at thresholds Eligibility rules for benefits Boundary conditions for system behaviour Physical restrictions on calculated quantities Eg mass should be non-negative, so a calculated negative value should be treated specially Conditional calculation is crucial for data cleaning Eg check for missing values, out of range values Image source: https://www.lifescienceleader.com/doc/do-you-as-a-leader-create-great-choices-0001 31 DATA1002_1902 Making choices IF(condition,value_if_condition_is_true,value_if_ condition_is_false) condition can have the form exp1 operator exp2 the relational operator is one of: =, , =, “” is Excel syntax for “not equal to” exp1, exp2 can be cell-location, explicit constant, or complicated formulas themselves similar for value_if_condition_is_true, value_if_condition_is_false Image source: https://www.lifescienceleader.com/doc/do-you-as-a-leader-create-great-choices-0001 32 DATA1002_1902 Making choices - Example (Hypothetical) dataset from political science name of governing party in column B governmenttype in column C reported turnout in election in column F survey response to “do you intend to vote” in column H organization that ran the survey in column I IF(C2”dictator”,F2,IF(I2=B2,0.6*H2,H2)) this returns the value in cell F2, except if the government is a dictatorship, in which case return a value from the survey: if the survey was run by the government party itself, we reduce that value by 40%, otherwise use the survey result directly 33 DATA1002_1902 Complex conditions Conditions can be described by logic-valued expressions These are built up from comparisons etc, combined with logical operators: AND, OR and NOT In Excel conditions, these operators are written as functions (prefix, not infix notation) e.g. OR(AND(weight > 50, height

Use Quizgecko on...
Browser
Browser