CSE2DV Lecture 1 - Introduction to Visualization 2024 PDF
Document Details
La Trobe University
2024
Ali Abdul Karim
Summary
This document is an introductory lecture on computer-aided visualization. It covers topics such as data visualization, data analysis, and visualization design.
Full Transcript
CSE2DV Lecture 1 - Introduction to Visualization
Ali Abdul Karim
29 July 2024
La Trobe University CRICOS Provider Code Number 00115M
latrobe.edu.au

Your friendly lecturer: Ali Abdul Karim
BA (Hons) in Information Technology, La Trobe University
6+ years of industry experience in an analytics capacity
A PhD candidate at La Trobe University
Interested in exploring the impact of using new data for different forecasting applications. This includes feature engineering, feature selection and data analysis.
email: [email protected]

Subject Coordinator: Chang Liu
Dr Chang Liu is currently a Lecturer with the Department of Computer Science and Information Technology, La Trobe University.
email: [email protected]

Why did you sign up for this subject?

What are your goals for this subject?

What questions do you have for me? (Anything goes!)

Introduction to visualization questions

Visualization, or vis for short
Visualization (or visualisation, or vis) is the graphical representation of information or data
In the 21st century, this almost always involves the use of a computer
– This computer-based visualization is what we’re going to focus on in this subject

Why have a human in the loop?
When we talk about “the loop”, we mean the decision-making loop, or process
– As in, why do we want to have a human involved in the decision-making process?
Often the questions you want to ask your data aren’t well-defined in advance
– The problems are ill-specified
These are problems that computers are bad at solving, but that humans are good at, especially when the powerful pattern matching of the visual system is involved

Why have a computer in the loop?
So, if humans are so great, why do we need to involve computers?
– Modern datasets may involve many millions or billions of observations
These are completely infeasible to analyze or draw by hand
The goal of vis is to let humans and computers do only the things they’re good at in the analysis of data
– Humans are really good at pattern recognition and open-ended analysis
– Computers are really good at solving well-posed problems really fast

Why use an external representation at all?
External representations, like visualizations, enable humans to go beyond their natural limitations
– We only have so much “working memory”, so we are limited in the amount of stuff we can keep in our thoughts at once
– Our brains only work so fast, so we are limited in the number and speed of computations we can do
With well-designed external representations, we can “upgrade” our memory and processing

Why vision?
Might it be more effective to wire data directly into our brains?
– Well, maybe someday, but even that’s not so clear
Our visual system is very highly evolved, having had hundreds of millions of years to get really good at recognizing certain types of patterns in the world
– Well-designed visualizations take advantage of these highly optimized pathways
Vision is a very high-bandwidth channel
Other sensory channels (hearing, touch, smell, taste) are less efficient, and hardware for stimulating them effectively is difficult or impossible to implement

Why show data in full detail?
Modern (i.e., computer-aided) visualizations give their users the ability to view a summary of the data, but also the ability to see the data in full detail
It is sometimes necessary to see the “raw” data in order to draw correct conclusions
– Summaries, even good ones, are lossy

Details matter
Anscombe’s Quartet: identical statistics
x mean            9
x variance        10
y mean            7.5
y variance        3.75
x/y correlation   0.816

Why is interactivity worthwhile?
So, both summaries and detailed views have their benefits
– And, in fact, modern datasets with many millions of observations make it impossible to “just show all the data”
An interactively changing visualization can provide the benefits of both summary and detail views
It can also give you a more complete picture of the data
– A single visualization is like just one viewpoint on a complex scene

Why is the design space so huge?
There are so many choices to make in the process of creating a visualization
– What data to use?
– How to process that data?
– How to represent a single data point?
– How to summarize multiple data points?
– How to transition between views?
– …

Why the focus on tasks in this subject?
A tool that works very well for one task might fail completely for another.
– Consider a hammer and a nail, versus a hammer and a bolt
This is also true for visualizations
– Sometimes a simple bar chart is the right tool for the job. Other times, it will be inadequate, or even misleading

Why do we care about effectiveness?
In my view (and therefore the official view of this subject), a visualization is good exactly to the extent that it helps a user accomplish their analysis task
– Goodness == effectiveness
A visualization, unlike other visual and non-visual media, succeeds to the extent it is correct, accurate, and truthful
– Vis is often beautiful, but in this case, truth is not beauty

Why are most graphic designs kind of bad?
A few slides ago, we talked about just how many choices go into producing a single visualization
– Each of those choices presents an opportunity to, well, get it wrong
A naïve optimization process is almost certain to fail
For vis to be successful, it needs to be a match between the characteristics of the data, the characteristics of the analysis task, the characteristics of the computing hardware and software, and the perceptual and cognitive abilities of the human user

Why is vis evaluation difficult?
Just like the design space is huge, the evaluation space is also huge
– How do you know if a visualization works?
Recall that one of the main reasons we employ vis is that we don’t yet know the questions we need to answer
What does good mean? What does better mean?
If it’s based on effectiveness, how do you define effective?

Why are there resource limitations?
We’ve alluded to this before, but there are at least three classes of limitations that always need to be considered when designing or analyzing vis
– Compute capacity
– Display capacity
– Human capacity
   Perceptual (visual)
   Cognitive

Why analyze visualizations?
I/we believe that a good way to learn to design new visualizations is to analyze existing ones, taking ideas that seem to work well and leaving behind those that don’t
– We’re going to present our analysis framework over the coming weeks
What (data) are we dealing with? What is its natural structure?
Why are we analyzing that data?
How should we present the data?

How are we feeling? Any questions so far?
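
Returning to the “Details matter” slide: the claim that Anscombe’s Quartet has identical statistics can be checked with a few lines of Python. This is only a sketch. The data values below are the standard published values of the quartet and do not appear on the slide, and the names `QUARTET`, `pearson` and `summarize` are made up for illustration. The slide’s variances (10 and 3.75) correspond to the population variance, so the sketch uses `pvariance`; the sample variance would come out slightly higher.

```python
from math import sqrt
from statistics import mean, pvariance

# Standard published values of Anscombe's quartet.
# Assumption: these numbers are NOT on the slide; they are supplied here
# only so the summary statistics can be recomputed.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return the five summary statistics listed on the slide."""
    return {
        "x_mean": mean(xs),
        "x_var": pvariance(xs),  # population variance, matching the slide
        "y_mean": mean(ys),
        "y_var": pvariance(ys),
        "xy_corr": pearson(xs, ys),
    }


# Print one summary row per dataset, rounded for readability.
for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four datasets should print nearly the same summary, agreeing with the slide to within rounding, even though their scatter plots look nothing alike. That is the sense in which summaries are lossy.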

A brief history of vis

The Graph GOAT, William Playfair
Scottish engineer, political economist, and secret agent
Born 1759, died 1823
Credited with the invention of the line chart, bar chart, area chart, and pie chart
Also credited with bringing down the French government through an elaborate counterfeiting scheme
– Life stories from the 1700s are wild

Example of Playfair’s bar chart (1781)

Example of Playfair’s area chart (1785)

Example of Playfair’s time-series chart (1786)

Example of Playfair’s pie chart (1805)

History’s greatest vis?
Charles Joseph Minard, 1869
The graph depicts a variety of data related to Napoleon’s 1812 invasion of Russia.
An abstracted (but to scale) map depicting the journey from the Neman river in Lithuania to Moscow.
Two bars (tan for the march TO Moscow, black for the retreat FROM Moscow) depicting the size of Napoleon’s army over time (width = army size)
A line chart depicting the temperature at various times during the retreat from Moscow.

John Snow’s 1854 Cholera outbreak map
An early dot map visualization, this map illustrates the number of cholera deaths at different locations with small bar graphs.
The visualization clearly indicates that the outbreak is centered on Broad Street. Snow used this evidence to convince the parish authorities to close a contaminated well.

Florence Nightingale’s Crimean War “Nightingale rose” (polar area or circular histogram) vis
Non-preventable deaths
Preventable deaths

Then…not much happened.
For, like, a hundred years.
The dominant view was that images were too imprecise for statistics.

What changes in, oh, about 1950?

How are we feeling? Any questions so far?

Subject mechanics: ILOs and Assessments

Intended Learning Outcomes
Write code to clean and format data in preparation for data visualization.
Design appropriate and effective visualizations that help users gain deep insight into complex data sets.
Create interactive data visualizations for all users to effectively explore the data.
Generate informative reports using data visualizations.

Assessment 1
25%: Data visualization using Power BI
– This will be an individual assignment; more details on what your applications will have to look like will be forthcoming

Assessment 2
25%: Data visualization using Python
– This will be an individual assignment; more details on what your applications will have to look like will be forthcoming

Assessment 3
50%: 2-hour examination
– Exam period; date and time TBD

Questions regarding ILOs and assessments? Can I clear anything up?

NEXT LECTURE: The what of vis: Data abstraction

Thank you. Be well.

© Copyright La Trobe University 2021