
Transcript

CSE5DEV DATA EXPLORATION AND ANALYSIS
Week 10: Reporting and Data Communication

Overview
1 CSE5DEV Syllabus
2 Week-Overview
3 Reporting & Data Communication
4 Examples of data communication

Subject Syllabus
Lecture 1: Introduction
Lecture 2: Data Collection & R Programming
Lecture 3: Data Wrangling & R Programming
Lecture 4: Data Cleaning & Normalisation
Lecture 5: Data Visualisation
Lecture 6: Data Exploration 1 (Analysis)
Lecture 7: Data Exploration 2 (Analysis)
Lecture 8: Data Exploration 3 (Analysis)
Lecture 9: Correlation & Pattern Discovery Analysis
Lecture 10: Reporting & Data Communication
Lecture 11: Case Study
Lecture 12: Revision

Data Science Project
Almost all data science and analysis projects require the same set of stages to be performed. These are:
Stage 1: Identify the problem (question)
Stage 2: Collect & prepare the data
Stage 3: Explore the data
Stage 4: Communicate the results

Questions and tasks that come up across these stages:
• What is the goal? What do you want to estimate? How to track house prices across different areas?
• Data resources
• Data representation
• Clean and normalise the data
• Descriptive statistics
• Visualisation
• What are the findings?
• What did we learn?
• Report the findings
• Does the result make sense?
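The four stages above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the `houses` data frame, its column names, and its values are made up for this example, and a real project would read data from a file instead.

```r
# Stage 1: identify the question.
# Hypothetical question: how do median house prices differ across areas?

# Stage 2: collect and prepare the data.
# Made-up toy data (prices in thousands); NA marks a missing value.
houses <- data.frame(
  area  = c("North", "North", "South", "South", "East", "East"),
  price = c(520, 480, NA, 610, 450, 470)
)
houses_clean <- houses[!is.na(houses$price), ]  # drop rows with missing prices

# Stage 3: explore the data with a descriptive statistic per area.
median_by_area <- aggregate(price ~ area, data = houses_clean, FUN = median)
print(median_by_area)

# Stage 4: communicate the result as one plain sentence.
top <- median_by_area[which.max(median_by_area$price), ]
message_text <- paste0("Highest median price: ", top$area,
                       " (", top$price, " thousand)")
print(message_text)
```

Ending on a single sentence is a deliberate choice: it forces the analysis to answer the Stage 1 question rather than stop at a table of numbers.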
Week 10 Overview: Reporting and Data Communication
This week we will be covering the basics of Reporting and Data Communication.
Learning outcomes:
• Develop a high-level understanding of data communication.
• Understand the guidelines for result communication.
• Understand and implement graphics that effectively communicate information.

What have we learned so far?
Data can be in different formats, but a computer program expects your data to be organised in a well-defined structure.

What have we learned so far? (Theory)
1 Collecting and Wrangling: working with data
  • Read & correct data
2 Cleaning and Normalising: convert dirty data into correct data
  • Cleaning & handling missing values
  • Normalising or standardising data
3 Data Visualisation
  • Scatter plots, boxplots, and line plots
4 Data Exploration
  • Univariate Analysis
  • Bivariate (multivariate) Analysis
  • Time Series Data Analysis
  • Correlation & Pattern Discovery

What have we learned so far? (R Programming)
1 Install R and RStudio, create an R Markdown file, write and run basic code, etc.
2 Data types and data structures (vector, factor, matrix and data frame)
3 View, access, change, etc.
4 Correct or change the format of the data to make it tidy
5 Clean the data
6 Normalise the data
7 Data visualisation using ggplot2
8 Data Exploration: Tabular and Graphical Explorations
9 Correlation & Pattern Discovery

Note: The above steps (reading, viewing, accessing, changing, etc.) are crucial for Lecture 3 to Lecture 11. If you DON'T know how to perform them in R, please let us know as soon as possible.

Base R Cheat Sheet
(CC BY Mhairi McNeill. RStudio® is a trademark of RStudio, Inc. Updated: 3/15.)

Getting Help
Accessing the help files
  ?mean                          # Get help on a particular function.
  help.search('weighted mean')   # Search the help files for a word or phrase.
  help(package = 'dplyr')        # Find help for a package.
More about an object
  str(iris)                      # Get a summary of an object's structure.
  class(iris)                    # Find the class an object belongs to.

Using Libraries
  install.packages('dplyr')      # Download and install a package from CRAN.
  library(dplyr)                 # Load the package into the session, making all its functions available to use.
  dplyr::select                  # Use a particular function from a package.
  data(iris)                     # Load a built-in dataset into the environment.

Working Directory
  getwd()                        # Find the current working directory (where inputs are found and outputs are sent).
  setwd('C://file/path')         # Change the current working directory.
Use projects in RStudio to set the working directory to the folder you are working in.

Vectors
Creating Vectors
  c(2, 4, 6)                     # 2 4 6          Join elements into a vector
  2:6                            # 2 3 4 5 6      An integer sequence
  seq(2, 3, by=0.5)              # 2.0 2.5 3.0    A complex sequence
  rep(1:2, times=3)              # 1 2 1 2 1 2    Repeat a vector
  rep(1:2, each=3)               # 1 1 1 2 2 2    Repeat elements of a vector
Vector Functions
  sort(x)                        # Return x sorted.
  rev(x)                         # Return x reversed.
  table(x)                       # See counts of values.
  unique(x)                      # See unique values.
Selecting Vector Elements
By Position
  x[4]                           # The fourth element.
  x[-4]                          # All but the fourth.
  x[2:4]                         # Elements two to four.
  x[-(2:4)]                      # All elements except two to four.
  x[c(1, 5)]                     # Elements one and five.
By Value
  x[x == 10]                     # Elements which are equal to 10.
  x[x < 0]                       # All elements less than zero.
  x[x %in% c(1, 2, 5)]           # Elements in the set 1, 2, 5.
Named Vectors
  x['apple']                     # Element with name 'apple'.

Programming
For Loop
  for (variable in sequence){
    Do something
  }
Example
  for (i in 1:4){
    j <- i + 10
    print(j)
  }
While Loop
  while (condition){
    Do something
  }
Example
  while (i < 5){
    print(i)
    i <- i + 1
  }
If Statements
  if (condition){
    Do something
  } else {
    Do something different
  }
Example
  if (i > 3){
    print('Yes')
  } else {
    print('No')
  }
Functions
  function_name <- function(var){
    Do something
    return(new_variable)
  }
Example
  square <- function(x){
    squared <- x*x
    return(squared)
  }

Reading and Writing Data
  Input                          | Output                         | Description
  df <- read.table('file.txt')   | write.table(df, 'file.txt')    | Read and write a delimited text file.
  df <- read.csv('file.csv')     | write.csv(df, 'file.csv')      | Read and write a comma separated value file. This is a special case of read.table/write.table.
  load('file.RData')             | save(df, file = 'file.Rdata')  | Read and write an R data file, a file type special for R.

Conditions
  a == b      # Are equal
  a != b      # Not equal
  a > b       # Greater than
  a < b       # Less than
  a >= b      # Greater than or equal to
  a <= b      # Less than or equal to
  is.na(a)    # Is missing
  is.null(a)  # Is null

Types
Converting between common data types in R. Can always go from a higher value in the table to a lower value.
  as.logical    | TRUE, FALSE, TRUE                | Boolean values (TRUE or FALSE).
  as.numeric    | 1, 0, 1                          | Integers or floating point numbers.
  as.character  | '1', '0', '1'                    | Character strings. Generally preferred to factors.
  as.factor     | '1', '0', '1', levels: '1', '0'  | Character strings with preset levels. Needed for some statistical models.

Maths Functions
  log(x)        # Natural log.
  exp(x)        # Exponential.
  max(x)        # Largest element.
  min(x)        # Smallest element.
  round(x, n)   # Round to n decimal places.
  signif(x, n)  # Round to n significant figures.
  cor(x, y)     # Correlation.
  sum(x)        # Sum.
  mean(x)       # Mean.
  median(x)     # Median.
  quantile(x)   # Percentage quantiles.
  rank(x)       # Rank of elements.
  var(x)        # The variance.
  sd(x)         # The standard deviation.

Variable Assignment
  > a <- 'apple'
  > a
  [1] 'apple'

The Environment
  ls()             # List all variables in the environment.
  rm(x)            # Remove x from the environment.
  rm(list = ls())  # Remove all variables from the environment.
You can use the environment panel in RStudio to browse variables in your environment.

Matrices
  m <- matrix(x, nrow = 3, ncol = 3)  # Create a matrix from x.
  m[2, ]       # Select a row
  m[ , 1]      # Select a column
  m[2, 3]      # Select an element
  t(m)         # Transpose
  m %*% n      # Matrix multiplication
  solve(m, n)  # Find x in: m * x = n

Lists
  l <- list(x = 1:5, y = c('a', 'b'))  # A list is a collection of elements which can be of different types.
  l[[2]]       # Second element of l.
  l[1]         # New list with only the first element.
  l$x          # Element named x.
  l['y']       # New list with only element named y.

Data Frames
Also see the dplyr library.
  df <- data.frame(x = 1:3, y = c('a', 'b', 'c'))  # A special case of a list where all elements are the same length.
    x  y
    1  a
    2  b
    3  c
List subsetting
  df$x
  df[[2]]
Matrix subsetting
  df[ , 2]
  df[2, ]
  df[2, 2]
Understanding a data frame
  View(df)     # See the full data frame.
  head(df)     # See the first 6 rows.
  nrow(df)     # Number of rows.
  ncol(df)     # Number of columns.
  dim(df)      # Number of rows and columns.
  cbind        # Bind columns.
  rbind        # Bind rows.

Strings
Also see the stringr library.
  paste(x, y, sep = ' ')     # Join multiple vectors together.
  paste(x, collapse = ' ')   # Join elements of a vector together.
  grep(pattern, x)           # Find regular expression matches in x.
  gsub(pattern, replace, x)  # Replace matches in x with a string.
  toupper(x)                 # Convert to uppercase.
  tolower(x)                 # Convert to lowercase.
  nchar(x)                   # Number of characters in a string.

Factors
  factor(x)           # Turn a vector into a factor. Can set the levels of the factor and the order.
  cut(x, breaks = 4)  # Turn a numeric vector into a factor by 'cutting' it into sections.

Statistics
  lm(x ~ y, data=df)   # Linear model.
  glm(x ~ y, data=df)  # Generalised linear model.
  summary              # Get more detailed information out of a model.
  t.test(x, y)         # Perform a t-test for difference between means.
  pairwise.t.test      # Perform a t-test for paired data.
  prop.test            # Test for a difference between proportions.
  aov                  # Analysis of variance.

Distributions
            | Random Variates | Density Function | Cumulative Distribution | Quantile
  Normal    | rnorm           | dnorm            | pnorm                   | qnorm
  Poisson   | rpois           | dpois            | ppois                   | qpois
  Binomial  | rbinom          | dbinom           | pbinom                  | qbinom
  Uniform   | runif           | dunif            | punif                   | qunif

Plotting
Also see the ggplot2 library.
  plot(x)     # Values of x in order.
  plot(x, y)  # Values of x against y.
  hist(x)     # Histogram of x.

Dates
See the lubridate library.

Reporting & Data Communication: Reporting
Once you have explored the given data and interpreted key findings, the next step is to report and communicate your findings (results).

Report
A report is a written document that presents information in an organised format for a specific audience and purpose.
A report can fulfil many functions or purposes:
• To ensure proper departmental functioning.
• To provide information.
• To provide the results of an analysis.
• To persuade others to act.
• To create an organisational memory.
Reporting & Data Communication: Reporting
Keys to successful reporting projects:
• Clarity
• Brevity
• Completeness
• Correctness

Report types (in terms of content and format):
1 Informal: a single letter or a memo.
2 Formal: 10 to 100 pages. Includes cover, summary and text.
3 Short report: periodic, informative, investigative.

Reporting & Data Communication: Data Communication
Data communication is the process of presenting data in a clear, accurate, and compelling way that will ultimately support decision making. This is also known as storytelling with data. Indeed, data exploration is just a collection of numbers and figures until you turn it into a story.
A key to success in the data exploration and analysis process is being able to communicate results clearly so that different stakeholders can understand them: explain, justify, convey, persuade and express.
Why is it important? Telling a great story helps people make better decisions, which in turn improves business intelligence and organisations.

Data communication is the process of turning numbers and figures into a story.
• How to tell a story with data?
• How can we do it?
According to Storytelling with Data: A Data Visualization Guide for Business Professionals by C. N. Knaflic, these questions can be answered by understanding and applying the following key points:
▶ Understand the context
▶ Choose an appropriate visual display
▶ Eliminate clutter
▶ Focus attention where you want it
▶ Think like a designer
▶ Tell a story
The examples in the next slides are adapted from Storytelling with Data: A Data Visualization Guide for Business Professionals by C. N. Knaflic. The figures are taken from: https://github.com/adamribaudo/storytelling-with-data-ggplot

Reporting & Data Communication: Understand the context
Before you start preparing and selecting the graphics for the data visualisation, you need to know:
• Who is your audience? A good understanding of who your audience is and how they perceive you helps you identify common ground and ensure they hear your message.
• What do you need them to know or do? Make what you communicate relevant to your audience, and form a clear understanding of why they should care about what you say.
• How will you communicate to your audience? Decide on the method you will use: a live presentation, a written document or an email.

Example: Who, what, and how.
A science teacher
• conducted a summer learning program on science that was aimed at giving kids exposure to the unpopular subject
• ran a survey at the end of the program to understand how their perceptions of science changed.
The teacher believes the data shows a great success story and would like to offer the program again next summer.
• Who: the parents of students (current and future), the future students, teachers, the budget committee.
• What: The summer program was very successful. Please approve a budget of $X to offer it next summer.
• How: Illustrate success using the data collected before and after the program.
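The "How" step above, illustrating success with before-and-after data, might look like the following base R sketch. The response categories and counts are made up for illustration and are not the teacher's actual survey results; the variable names are likewise hypothetical.

```r
# Made-up survey counts for 50 students (illustration only).
before <- c(Bored = 6, OK = 24, Interested = 20)
after  <- c(Bored = 5, OK = 10, Interested = 35)

# Convert counts to percentages so the two surveys are comparable.
pct_before <- round(100 * before / sum(before))
pct_after  <- round(100 * after / sum(after))

# Percentage-point change per response category.
change <- pct_after - pct_before
print(change)

# Side-by-side bars: one group per response, one bar per survey.
survey <- rbind(Before = pct_before, After = pct_after)
barplot(survey, beside = TRUE, legend.text = TRUE,
        ylim = c(0, 100), ylab = "% of students",
        main = "How do you feel about science?")
```

Reporting the change in percentage points keeps the message tied to the "What": the audience sees directly how far opinions shifted.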
Reporting & Data Communication: Choose an appropriate visual display
Think of a time when you attended a presentation that included poorly designed graphs containing extraneous information. Was it memorable? What do you think about the graph shown on the slide? Is it easy to understand and interpret? No, right? That is because it does not communicate the results effectively.

For example, do you ever use a weather app to decide how to dress for the day? If you open the app and see a cloud with lightning at the top, you have a good idea that it is going to be a stormy, rainy day without having to read any data about temperature, barometric pressure, and humidity. This example shows how a simple visual helps you gain quick insight and make a quick decision (in this case, to wear a raincoat and carry an umbrella). Believe it or not, you just consumed a good data visualisation.

Good and effective data visualisation methods should be:
• Useful: People use it on a regular basis and can make relevant decisions by viewing all the information they need in one place.
• Desirable: It is not only easy to use but also pleasurable to use.
• Usable: People who use it can accomplish their goals quickly and easily.
Reporting & Data Communication: Eliminate clutter
Clutter
Visual clutter creates excessive cognitive load that can hinder the transmission of our message. Often a single graph contains many visual elements that occupy space but do not increase our audience's understanding. Every element you add takes up cognitive load on the part of your audience: it takes brain power to process.
We should
• identify anything that is not adding informative value and remove it.

Example: The figure on the slide plots the monthly trend of incoming tickets and those processed over the past calendar year. What can we remove or change? Several changes can be made to reduce clutter:
1 Remove chart border
2 Remove gridlines
3 Remove data markers
4 Clean up axis labels
5 Label data directly
6 Leverage consistent colour

Reporting & Data Communication: Focus your audience's attention
Once we have selected an effective graphic and removed all unnecessary clutter, we need to figure out how to draw the audience's attention to particular findings using preattentive attributes.
Examples of preattentive attributes are:
• Colour, Length, Width, Orientation, Shape, Enclosure, Position, Grouping.
Preattentive attributes are very helpful in:
• directing your audience's attention to where you want them to focus it
• creating a visual hierarchy of elements to lead your audience through the information you want to communicate in the way you want them to process it.
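The decluttering steps and the use of colour as a preattentive attribute can be tried out in base R graphics. The sketch below is an assumption-laden illustration, not the figure from the slides: the ticket counts are made up, and the colour names and label positions are arbitrary choices.

```r
# Made-up monthly ticket counts (illustration only).
months    <- 1:12
received  <- c(150, 160, 170, 165, 175, 180, 190, 200, 195, 205, 210, 220)
processed <- c(150, 158, 168, 160, 170, 172, 175, 170, 160, 165, 160, 165)

grey   <- "grey60"      # de-emphasised context
accent <- "darkorange"  # the series we want the audience to notice

# Empty frame: no chart border (bty = "n"), no default axes, no gridlines.
plot(months, received, type = "n", bty = "n", axes = FALSE,
     xlim = c(1, 14), ylim = c(0, max(received)),
     xlab = "", ylab = "Number of tickets",
     main = "Ticket volume over the year")

# Clean axis labels: month abbreviations, no tick marks, muted y axis.
axis(1, at = months, labels = month.abb, tick = FALSE)
axis(2, las = 1, col = grey, col.axis = grey)

# Plain lines without data markers; colour and width carry the emphasis.
lines(months, received,  col = grey,   lwd = 2)
lines(months, processed, col = accent, lwd = 3)

# Label each series directly at its last point instead of using a legend.
text(12.2, received[12],  "Received",  col = grey,   adj = 0)
text(12.2, processed[12], "Processed", col = accent, adj = 0)

# The gap the accent colour draws attention to.
gap <- received - processed
print(gap)
```

Reusing the same colour for a line and its label is one way to keep colour consistent, so the audience never has to match a legend entry to a series.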
Reporting & Data Communication: Focus your audience's attention
Examples of preattentive attributes (figure slides), including preattentive attributes in text.

Reporting & Data Communication: Think Like a Designer
Thinking like a designer means designing an effective product by crafting your own graphics. Designers often focus on how data visualisations communicate with the audience and how the audience interacts with the presented graphics.
There are three traditional design concepts that can be applied to data communication:
• Affordances
• Accessibility
• Aesthetics

Affordances
• Highlight the important stuff
  • Bold, italics, and underlining: Use for titles, labels, captions, and short word sequences to differentiate elements.
  • CASE and typeface: Uppercase text in short word sequences is easily scanned.
  • Inversing elements: effective at attracting attention.
  • Size: another way to attract attention and signal importance.
• Eliminate distractions
• Create a clear visual hierarchy of information

Examples (figure slides): affordances, eliminating distractions, creating a clear visual hierarchy of information, accessibility, and aesthetics.

Reporting & Data Communication: Tell a Story
Use stories to engage your audience emotionally in a way that goes beyond what facts can do. Keep it simple and be authentic.
Example: 3-minute story and Big Idea using the summer learning program on science example.

Examples of data communication
The following examples are taken from the book R Programming for Research. You can get the book from https://geanders.github.io/RProgrammingForResearch/.

End of Week 10
See you next lecture (Week 11): Case Study
CSE5DEV timetable: check LMS
