Full Transcript

CSE5DEV DATA EXPLORATION AND ANALYSIS
Week 5: Data Visualisation

Overview
1 CSE5DEV Syllabus
2 Week-Overview
3 Data Visualisation
4 Examples of Data Visualisation

Subject Syllabus
— Lecture 1 — Introduction
— Lecture 2 — Data Collection & R Programming
— Lecture 3 — Data Wrangling & R Programming
— Lecture 4 — Data Cleaning & Normalisation
— Lecture 5 — Data Visualisation
— Lecture 6 — Data Exploration 1 (Analysis)
— Lecture 7 — Data Exploration 2 (Analysis)
— Lecture 8 — Data Exploration 3 (Analysis)
— Lecture 9 — Correlation & Pattern Discovery (Analysis)
— Lecture 10 — Case Study 1
— Lecture 11 — Case Study 2
— Lecture 12 — Revision

Data Science Project
Almost all data science and analysis projects go through the same set of stages:
Stage 1 Identify the problem (question)
Stage 2 Collect & prepare the data
Stage 3 Explore the data
Stage 4 Communicate the results

Prompts and tasks shown alongside the stages:
• What is the goal?
• What do you want to estimate?
• How do we track house prices across different areas?
• Data resources
• Descriptive statistics
• What are the findings?
• Data representation
• Visualisation
• What did we learn?
• Clean and normalise the data
• Report the findings
• Does the result make sense?
Week 5 Overview: Data Visualisation
This week covers the basics of data visualisation.
Learning outcomes:
• Learn about the benefits of visualisation.
• Learn about data visualisation methods.
• Learn how to use charts and graphs.

What have we learned so far?
Data can come in different formats, but a computer program expects your data to be organised in a well-defined structure.

—— Theory ——
1 Data source and format: CSV data, text data, etc.
2 Variable names, data types and data structure.
3 Data representation: tabular representation (observations-by-features).
4 Data cleaning and normalising.

—— R Programming ——
1 Install R and RStudio, create an R Markdown file, write and run basic code, etc.
2 Data types and data structures (vector, factor, matrix and data frame).
3 View, access and change data, etc.
4 Import data into the R environment (text and CSV files).
5 Correct or change the format of the data to make it tidy.
6 Clean the data.
7 Handle missing values.
8 Normalise the data.

Note
The above steps (reading, viewing, accessing, changing, etc.) are crucial for Lecture 3 to Lecture 11. If you DON'T know how to perform them in R, please let us know as soon as possible.

Base R Cheat Sheet

Getting Help
Accessing the help files
?mean    Get help on a particular function.
help.search('weighted mean')    Search the help files for a word or phrase.
help(package = 'dplyr')    Find help for a package.
More about an object
str(iris)    Get a summary of an object's structure.
class(iris)    Find the class an object belongs to.
Using Libraries
install.packages('dplyr')    Download and install a package from CRAN.
library(dplyr)    Load the package into the session, making all its functions available to use.
dplyr::select    Use a particular function from a package.
data(iris)    Load a built-in dataset into the environment.

Working Directory
getwd()    Find the current working directory (where inputs are found and outputs are sent).
setwd('C://file/path')    Change the current working directory.
Use projects in RStudio to set the working directory to the folder you are working in.

Vectors
Creating Vectors
c(2, 4, 6)    2 4 6    Join elements into a vector.
2:6    2 3 4 5 6    An integer sequence.
seq(2, 3, by=0.5)    2.0 2.5 3.0    A complex sequence.
rep(1:2, times=3)    1 2 1 2 1 2    Repeat a vector.
rep(1:2, each=3)    1 1 1 2 2 2    Repeat elements of a vector.

Vector Functions
sort(x)    Return x sorted.
rev(x)    Return x reversed.
table(x)    See counts of values.
unique(x)    See unique values.

Selecting Vector Elements
By Position
x[4]    The fourth element.
x[-4]    All but the fourth.
x[2:4]    Elements two to four.
x[-(2:4)]    All elements except two to four.
x[c(1, 5)]    Elements one and five.
By Value
x[x == 10]    Elements which are equal to 10.
x[x < 0]    All elements less than zero.
x[x %in% c(1, 2, 5)]    Elements in the set 1, 2, 5.
Named Vectors
x['apple']    Element with name 'apple'.

Programming
For Loop
for (variable in sequence){
  Do something
}
Example
for (i in 1:4){
  j <- i + 10
  print(j)
}

While Loop
while (condition){
  Do something
}
Example
while (i < 5){
  print(i)
  i <- i + 1
}

If Statements
if (condition){
  Do something
} else {
  Do something different
}
Example
if (i > 3){
  print('Yes')
} else {
  print('No')
}

Functions
function_name <- function(var){
  Do something
  return(new_variable)
}
Example
square <- function(x){
  squared <- x*x
  return(squared)
}

Reading and Writing Data
Input    Output    Description
df <- read.table('file.txt')    write.table(df, 'file.txt')    Read and write a delimited text file.
df <- read.csv('file.csv')    write.csv(df, 'file.csv')    Read and write a comma separated value file. This is a special case of read.table/write.table.
load('file.RData')    save(df, file = 'file.Rdata')    Read and write an R data file, a file type special for R.

Conditions
a == b    Are equal
a != b    Not equal
a > b    Greater than
a < b    Less than
a >= b    Greater than or equal to
a <= b    Less than or equal to
is.na(a)    Is missing
is.null(a)    Is null

Types
Converting between common data types in R. Can always go from a higher value in the table to a lower value.
as.logical    TRUE, FALSE, TRUE    Boolean values (TRUE or FALSE).
as.numeric    1, 0, 1    Integers or floating point numbers.
as.character    '1', '0', '1'    Character strings. Generally preferred to factors.
as.factor    '1', '0', '1', levels: '1', '0'    Character strings with preset levels. Needed for some statistical models.

Maths Functions
log(x)    Natural log.
exp(x)    Exponential.
max(x)    Largest element.
min(x)    Smallest element.
round(x, n)    Round to n decimal places.
signif(x, n)    Round to n significant figures.
cor(x, y)    Correlation.
sum(x)    Sum.
mean(x)    Mean.
median(x)    Median.
quantile(x)    Percentage quantiles.
rank(x)    Rank of elements.
var(x)    The variance.
sd(x)    The standard deviation.

Variable Assignment
> a <- 'apple'
> a
[1] 'apple'

The Environment
ls()    List all variables in the environment.
rm(x)    Remove x from the environment.
rm(list = ls())    Remove all variables from the environment.
You can use the environment panel in RStudio to browse variables in your environment.

Matrices
m <- matrix(x, nrow = 3, ncol = 3)    Create a matrix from x.
m[2, ]    Select a row.
m[ , 1]    Select a column.
m[2, 3]    Select an element.
t(m)    Transpose.
m %*% n    Matrix multiplication.
solve(m, n)    Find x in: m * x = n.

Lists
l <- list(x = 1:5, y = c('a', 'b'))    A list is a collection of elements which can be of different types.
l[[2]]    Second element of l.
l[1]    New list with only the first element.
l$x    Element named x.
l['y']    New list with only element named y.

Data Frames
Also see the dplyr library.
df <- data.frame(x = 1:3, y = c('a', 'b', 'c'))    A special case of a list where all elements are the same length.
List subsetting
df$x
df[[2]]
Matrix subsetting
df[ , 2]
df[2, ]
df[2, 2]
Understanding a data frame
View(df)    See the full data frame.
head(df)    See the first 6 rows.
nrow(df)    Number of rows.
ncol(df)    Number of columns.
dim(df)    Number of rows and columns.
cbind    Bind columns.
rbind    Bind rows.

Strings
Also see the stringr library.
paste(x, y, sep = ' ')    Join multiple vectors together.
paste(x, collapse = ' ')    Join elements of a vector together.
grep(pattern, x)    Find regular expression matches in x.
gsub(pattern, replace, x)    Replace matches in x with a string.
toupper(x)    Convert to uppercase.
tolower(x)    Convert to lowercase.
nchar(x)    Number of characters in a string.

Factors
factor(x)    Turn a vector into a factor. Can set the levels of the factor and the order.
cut(x, breaks = 4)    Turn a numeric vector into a factor by 'cutting' it into sections.

Statistics
lm(x ~ y, data=df)    Linear model.
glm(x ~ y, data=df)    Generalised linear model.
summary    Get more detailed information out of a model.
t.test(x, y)    Perform a t-test for difference between means.
pairwise.t.test    Perform pairwise t-tests between group levels, with correction for multiple testing.
prop.test    Test for a difference between proportions.
aov    Analysis of variance.

Distributions
            Random Variates    Density Function    Cumulative Distribution    Quantile
Normal      rnorm              dnorm               pnorm                      qnorm
Poisson     rpois              dpois               ppois                      qpois
Binomial    rbinom             dbinom              pbinom                     qbinom
Uniform     runif              dunif               punif                      qunif

Plotting
Also see the ggplot2 library.
plot(x)    Values of x in order.
plot(x, y)    Values of x against y.
hist(x)    Histogram of x.

Base R Cheat Sheet: CC BY Mhairi McNeill • Updated: 3/15 • RStudio® is a trademark of RStudio, Inc. • rstudio.com
Dates
See the lubridate library.

Data Visualisation
Given an example dataset, what is the best way to explore its variables?
▶ If your data involves a small number of samples, you might just print them out on the screen or on paper and inspect them quickly before doing any analysis.
▶ If you have a huge dataset, you need some assistance to explore the data.
▶ We can use R to display a huge dataset as tables, but a table only lets us explore the pattern of one specific parameter.
▶ What if we want to explore the patterns or relationships of one or several parameters?
Data visualisation can help you explore and analyse a huge dataset effectively.

Data visualisation
Data visualisation is the process of displaying data or information in graphical charts, figures and bars. It helps to make our data more understandable.
Visualising data via graphics is an important stage of data analysis. It helps us:
1 to gain valuable insights that we cannot find by just scanning the raw data on paper or in a spreadsheet.
2 to understand basic properties of the data.
3 to suggest possible modelling strategies.
4 to see and understand trends, outliers, etc.
5 to "debug" an analysis if an unexpected result occurs, or to communicate your findings to others.

Big Data and Data Visualisation
• In the Big Data era, data visualisation methods and technologies are essential to analyse massive amounts of information and make data-driven decisions.
• Tables can be used where users need to see the pattern of a specific parameter, while charts can be used to show patterns or relationships in the data for one or more parameters.
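To make the table-versus-chart contrast concrete, here is a minimal sketch in base R. It assumes the built-in iris dataset, which the lecture does not name at this point, so treat the dataset and variable choices as illustrative only.

```r
# Assumption: the built-in iris dataset stands in for "a dataset".
data(iris)

# Tables: each one describes a single variable at a time.
sepal_summary <- summary(iris$Sepal.Length)   # min, quartiles, mean, max
species_counts <- table(iris$Species)         # number of rows per species
print(sepal_summary)
print(species_counts)

# Chart: a scatter-plot matrix shows the pairwise relationships
# between all four measurements at once, coloured by species.
pairs(iris[, 1:4],
      col = as.integer(iris$Species),
      main = "Pairwise relationships in iris")
```

The two tables answer questions about one variable each, while the single chart lets the eye pick out relationships across several variables.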
Data visualisation can help you explore and analyse huge datasets effectively.

— Advantages of Data Visualisation —
▶ Graphics and figures can be easily communicated and understood by readers.
▶ They can be accessed quickly by a wider audience.
▶ They convey a lot of information in a small space.
▶ They make your report more visually appealing.
▶ They can convert raw data into insights.
▶ They can help to find simple and complex patterns in data.

Data visualisation charts can be used for four basic presentation types:
1 Comparison
2 Distribution
3 Relationship
4 Composition
The following are the most common chart types in data visualisation:
Bar Chart, Pie Chart, Heat Map, Indicators, Funnel Chart, Line Chart, Histogram, Gauge, Tables, Area Chart, Scatterplot, Box Plot, Maps, Tree Map, Radar or Spider

— Common Graph Types —

Line plot (chart)
Line plots are generally used to present observations collected at regular intervals. The x-axis represents the regular interval and the y-axis shows the observations, ordered by the x-axis and connected by a line.

Example: Line plot (chart)
A line plot shows the relationship between TWO numerical variables when the variable on the x-axis, also called the explanatory variable, is of a sequential nature.
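A line plot of this kind can be sketched in base R as follows. The lecture's own figure uses hourly temperatures; this sketch instead assumes the built-in airquality dataset (daily temperatures in New York, 1973), so the data and labels are stand-ins rather than the lecturer's example.

```r
# Assumption: built-in airquality data replaces the lecture's hourly data.
data(airquality)

# Keep one month and make sure the sequential variable is in order.
may <- airquality[airquality$Month == 5, ]
may <- may[order(may$Day), ]

# type = "l" connects the ordered observations with a line.
plot(may$Day, may$Temp,
     type = "l",
     xlab = "Day of May 1973",
     ylab = "Temperature (degrees F)",
     main = "Daily temperature, May 1973")
```

Sorting by the x variable before plotting matters: the line is drawn in row order, so unsorted data would produce a zig-zag instead of a trend.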
Figure: Hourly temp for Jan 1-15 (line plot example)

Bar chart
Bar charts are used to present relative quantities for multiple categories.
• The x-axis represents the categories, which are spaced evenly.
• The y-axis represents the quantity for each category, drawn as a bar from the baseline to the appropriate level on the y-axis.

Example: Bar chart
A bar chart is used to visualise the distribution of numerical or categorical variables.
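As a minimal sketch of a bar chart in base R, the example below counts the cars in each cylinder category of the built-in mtcars dataset. The dataset is an assumption made for illustration; it is not the data behind the lecture's own figure.

```r
# Assumption: built-in mtcars data, with cylinder count as the category.
data(mtcars)

# Count the observations in each category.
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# One evenly spaced bar per category; bar height is the count.
barplot(cyl_counts,
        xlab = "Number of cylinders",
        ylab = "Count",
        main = "Cars by number of cylinders")
```

`table()` does the counting and `barplot()` only draws, which keeps the summary available for printing or checking before it is plotted.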
Figure: Bar chart of count (y-axis, 0 to 20000) by cut (x-axis: Fair, Good, Very Good, Premium, Ideal)

Pie chart
Pie charts are similar to bar charts and can be used to visualise the distribution of numerical or categorical variables.

Histogram plot
A histogram plot is used to summarise the distribution of a data sample.
• The x-axis represents discrete bins or intervals for the observations.
• The y-axis represents the frequency or count of the number of observations in the dataset that belong to each bin.

Example: Histogram plot
A histogram is used to visualise the distribution of a numerical variable.
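The pie chart and histogram described above can be sketched in base R as follows. The built-in mtcars dataset and the choice of eight bins are assumptions for illustration only.

```r
# Assumption: built-in mtcars data; variables chosen only for illustration.
data(mtcars)

# Pie chart: share of cars in each gear category (a categorical variable).
gear_counts <- table(mtcars$gear)
pie(gear_counts,
    labels = paste(names(gear_counts), "gears"),
    main = "Cars by number of gears")

# Histogram: distribution of a numerical variable split into bins.
# breaks = 8 is only a suggestion; R may pick a nearby "pretty" number.
mpg_hist <- hist(mtcars$mpg,
                 breaks = 8,
                 xlab = "Miles per gallon",
                 main = "Distribution of fuel economy")
```

`hist()` returns the bin boundaries and counts invisibly, so storing the result (here `mpg_hist`) gives access to the numbers behind the picture.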
Box and Whisker Plot
A box and whisker plot is used to summarise the distribution of a data sample. It displays the five-number summary of the data: the minimum, first quartile (Q1), median, third quartile (Q3) and maximum.
• The x-axis can be used to represent the data sample; multiple box plots can be drawn side by side on the x-axis if desired.
• The y-axis represents the observation values.

Example: Box and Whisker Plot
A box plot is a standardised way of showing the distribution of data based on a five-number summary (minimum, first quartile (Q1), median, third quartile (Q3), and maximum).
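A minimal base R sketch of the five-number summary and side-by-side box plots follows. It assumes the built-in airquality dataset, which happens to contain temperature by month; it is not claimed to be the data behind the lecture's figure.

```r
# Assumption: built-in airquality data (New York, May to September 1973).
data(airquality)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum.
temp_five <- fivenum(airquality$Temp)
print(temp_five)

# One box per month, drawn side by side along the x-axis.
temp_box <- boxplot(Temp ~ Month,
                    data = airquality,
                    xlab = "Month",
                    ylab = "Temperature (degrees F)",
                    main = "Temperature by month")
```

Note that R's `boxplot()` by default extends the whiskers only to the most extreme points within 1.5 times the interquartile range of the box and draws anything beyond that as separate points, so the whisker ends equal the minimum and maximum only when there are no such outliers.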
Figure: Month by temp boxplot

Scatter Plot
A scatter plot is used to summarise the relationship between two paired data samples. Each point on the plot represents a single observation.
• x-axis represents the observation values for the first sample.
• y-axis represents the observation values for the second sample.

Example: Scatter Plot
A scatter plot, also called a bivariate plot, allows you to visualise the relationship between two numerical variables.
Figure: Arrival Delays vs Departure Delays

Heat Map
A heat map can be used to show the relationship between two, three or more variables in a two-dimensional image.
It lays two variables out along the axes and shows a third dimension through the intensity of colour.

Example: Heat Map
A heat map is used to analyse data as colours in two dimensions. In this example, it displays the correlations between all numerical variables in the dataset.

Correlograms
Correlograms can be used to visualise the data in correlation matrices.
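One possible way to draw a correlation matrix as a heat map (a simple correlogram) is sketched below. It uses the built-in iris data set and geom_tile(); the data set, the variable names var1, var2 and correlation, and the colour choices are all assumptions made for illustration, not the code behind the lecture figures.

```r
library(ggplot2)

# Correlation matrix of the numerical columns of iris.
num_cols <- iris[, sapply(iris, is.numeric)]
corr <- cor(num_cols)

# Reshape to long form: one row per pair of variables.
corr_long <- as.data.frame(as.table(corr))
names(corr_long) <- c("var1", "var2", "correlation")

# Two variables on the axes, the correlation shown by colour intensity.
p <- ggplot(data = corr_long,
            mapping = aes(x = var1, y = var2, fill = correlation)) +
  geom_tile() +
  scale_fill_gradient2(low = "steelblue", mid = "white", high = "firebrick",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Correlation")
print(p)
```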
Example: Correlograms
A correlogram is used to highlight the most correlated variables in a correlation matrix.

Examples of data visualisation
In this lecture, we will learn
• How to use R to visualise single or several variables.
• How to use the R ggplot2 plotting package to create simple and complex plots from data in a data frame structure.
To this end, it is assumed that you know how to
1 import data
2 organise data
3 clean data
4 normalise data

Recall...
Data variable values can be:
▶ Numeric:
  1 Discrete - integer values.
  2 Continuous - any value in a pre-defined range (float, double).
▶ Categorical: values are selected from a predefined number of categories.
  1 Ordinal - categories can be meaningfully ordered.
  2 Nominal - categories have no order.
  3 Binary - a special case of nominal, with only 2 possible categories.
▶ Date: datetime, timestamp.
▶ Text: Multidimensional data
▶ Time series: data points indexed in time order.

Recall... R - Factors
▶ Factors are the data objects used to categorise data and store it as levels.
▶ They can store both strings and integers.
▶ They are useful for columns that have a limited number of unique values, for example Male/Female or True/False.
Factors are created using the factor() function, which takes a vector as input.

R - Factors
# Create a vector as input.
data <- c("East", "West", "East", "North", "North", "East", "West", "West",
          "West", "East", "North")
print(data)
##  [1] "East"  "West"  "East"  "North" "North" "East"  "West"  "West"
##  [9] "West"  "East"  "North"
print(is.factor(data))
## [1] FALSE
# Apply the factor function.
factor_data <- factor(data)
print(factor_data)
##  [1] East  West  East  North North East  West  West  West  East  North
## Levels: East North West
print(is.factor(factor_data))
## [1] TRUE

Examples of data visualisation
To visualise the variables of the given data, we need to
1 Step 1: Import data into RStudio.
2 Step 2: Install and load the ggplot2 package.
3 Step 3: Use ggplot2 to plot data variables.

Step 1: Import data into RStudio
In this lecture, we will use the following data sets to create graphs and figures.
1 Salary: salary attributes: Age, Education, Experience, and other attributes.
2 data w: weather dataset.
3 Iris: flower data set: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species.
4 diamonds: diamonds attributes: Cut, Colour, Clarity, Price, and other attributes.
5 cars: automobile attributes: Fuel Consumption, Design Performance, and other attributes.
Data sources: Kaggle, UCI Machine Learning Repository, Climate Data Online.
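Two of the data sets listed above do not need to be downloaded: iris ships with base R and diamonds ships with ggplot2. The short sketch below, which assumes ggplot2 is already installed, inspects their structure before any plotting; the variable names iris_species_levels and diamond_cut_levels are made up for this example.

```r
library(ggplot2)  # provides the diamonds data set

# iris is part of base R (the datasets package).
str(iris)

# First rows of diamonds.
print(head(diamonds))

# Categorical columns are stored as factors; their levels are the categories.
iris_species_levels <- levels(iris$Species)
diamond_cut_levels <- levels(diamonds$cut)
print(iris_species_levels)
print(diamond_cut_levels)
```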
Reading Data from CSV Files
• Read the data from a CSV file that includes a header row (data_name.csv below stands for your file's name, for example data1.csv).
• Save the data into dat. By default, dat will be a data frame.
dat <- read.csv("data_name.csv", header = TRUE)
• The read.csv() function creates a data frame from the data in the .csv file.
• If we pass header = TRUE, the function uses the very first row to name the variables in the resulting data frame.

Verify the results
• Use the names() function to print the names of the columns.
names(dat)
##  [1] "MinTemp"       "MaxTemp"       "Rainfall"      "Evaporation"
##  [5] "Sunshine"      "WindGustDir"   "WindGustSpeed" "WindDir9am"
##  [9] "WindDir3pm"    "WindSpeed9am"  "WindSpeed3pm"  "Humidity9am"
## [13] "Humidity3pm"   "Pressure9am"   "Pressure3pm"   "Cloud9am"
## [17] "Cloud3pm"      "Temp9am"       "Temp3pm"       "RainToday"
## [21] "RISK_MM"       "RainTomorrow"

Step 2: Install and Load ggplot2 packages
• In this subject, we will use the ggplot2 package to visualise our data.
• ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics.
• You provide the data, tell ggplot2 how to map variables to aesthetics and what graphical primitives to use, and it takes care of the details.
We can install and load ggplot2 as follows:
1 Install the ggplot2 package (this only needs to be done once).
install.packages("ggplot2")
2 Load ggplot2:
library("ggplot2")

Note: we can install all the important packages in one command as follows
pkgs <- c("ggplot2", "dplyr", "tidyr", "mosaicData", "carData", "VIM",
          "scales", "treemapify", "gapminder", "ggmap", "choroplethr",
          "choroplethrMaps", "CGPfunctions", "ggcorrplot", "visreg",
          "gcookbook", "forcats")
# pass the vector itself, not the string "pkgs"
install.packages(pkgs)

Step 3: Use ggplot2 to plot data variables
ggplot2 uses a grammar to create graphics. The grammar specifies the plot's building blocks, their types and other features.
R Code: ggplot2 grammar
library("ggplot2")
ggplot(data = <DATA>, mapping = aes(x = <x>, y = <y>)) +
  <Geometric_object>(color = "", alpha = , size = )
• Data: the input data. It should be a data frame.
• Aesthetic mapping (aes): the mapping of the variables to visual properties of the graph.
• Geometric object: points, lines, bars, etc.
• color: controls the point colour.
• size: controls the point size.
• alpha: controls the point transparency. Transparency ranges from 0 (completely transparent) to 1 (completely opaque). Adding a degree of transparency can help visualise overlapping points.

ggplot2 - other functions
• + : add layers, scales, coords and facets.
• ggsave(filename, plot = last_plot(), device = NULL, path = NULL, scale = 1, width = NA, height = NA, units = c("in", "cm", "mm"), dpi = 300, limitsize = TRUE, ...): save a plot to disk.

Examples of geometric objects are
• geom_bar(): Bar chart
• geom_point(): Scatterplot
• geom_line(): Line diagram, connecting observations in order by x-value
• geom_boxplot(): Box-and-whisker plot
• geom_path(): Line diagram, connecting observations in original order
• geom_smooth(): Add a smoothed conditional mean
• geom_histogram(): Histogram

ggplot2: step by step example.
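The steps that follow read salary_data.csv. If that file is not to hand, a small synthetic data frame with the same column names (exper, wage, sex) is enough to try the code; the sketch below builds one and also shows ggsave() writing a plot to disk. All values are randomly generated for illustration and are not the lecture's salary data, and the file name wage_vs_exper.png is an arbitrary choice.

```r
library(ggplot2)

# Synthetic stand-in for salary_data.csv (made-up values, illustration only).
set.seed(1)
n <- 200
dat <- data.frame(
  exper = sample(0:50, n, replace = TRUE),
  sex   = factor(sample(c("F", "M"), n, replace = TRUE))
)
dat$wage <- round(pmax(1, 5 + 0.1 * dat$exper + rnorm(n, sd = 3)), 2)

# A scatterplot of wage against experience.
p <- ggplot(data = dat, mapping = aes(x = exper, y = wage)) +
  geom_point(alpha = 0.7)

# Save the plot to a temporary PNG file; width and height are in inches.
out_file <- file.path(tempdir(), "wage_vs_exper.png")
ggsave(filename = out_file, plot = p, width = 6, height = 4, dpi = 300)
```

Passing plot = p explicitly avoids relying on last_plot(), which only tracks the most recently displayed plot.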
Step 1
# load package
library(ggplot2)
# read the salary data and save it in dat
dat <- read.csv("salary_data.csv", header = TRUE)
# call ggplot, specify dataset, and mapping
ggplot(data = dat, mapping = aes(x = exper, y = wage))

Step 1: plot output
[Figure: empty plot with axes exper (x) and wage (y)]

Step 2: add geoms.
Geoms are the geometric objects (points, lines, bars, etc.) that can be placed on a graph. We add them using the geom_*() functions. In this example, we add points using the geom_point() function to create a scatterplot.
# add points
ggplot(data = dat, mapping = aes(x = exper, y = wage)) +
  geom_point()

Step 2: plot output
[Figure: scatterplot of wage against exper]

Step 3: change point colour, size and transparency.
In this example we make the points blue, larger and semi-transparent.
# make points blue, larger, and semi-transparent
ggplot(data = dat, mapping = aes(x = exper, y = wage)) +
  geom_point(color = "cornflowerblue", alpha = .7, size = 3)

Step 3: plot output
[Figure: scatterplot of wage against exper with larger, semi-transparent blue points]

Step 4: add a line of best fit.
We can add a line of best fit using the geom_smooth() function. The line can be linear, quadratic or nonparametric.
We can also control the thickness of the line, the line's colour and whether the confidence interval is shown. In this example, we use a linear regression line (method = "lm", where lm stands for linear model).
# add a line of best fit.
ggplot(data = dat, mapping = aes(x = exper, y = wage)) +
  geom_point(color = "cornflowerblue", alpha = .7, size = 3) +
  geom_smooth(method = "lm")

Step 4: plot output
[Figure: scatterplot of wage against exper with a linear line of best fit]

Step 5: grouping.
In grouping, we map variables to the colour, shape, size, transparency and other visual characteristics of geometric objects. In this example, we add sex to the plot and represent it by colour.
# indicate sex using color
# (ggplot2 3.4.0 and later prefer linewidth = 1.5 for line thickness)
ggplot(data = dat, mapping = aes(x = exper, y = wage, color = sex)) +
  geom_point(alpha = .7, size = 3) +
  geom_smooth(method = "lm", se = FALSE, size = 1.5)

Step 5: plot output
[Figure: scatterplot of wage against exper, coloured by sex (F, M), with a linear line of best fit per group]

Step 6: scales.
Scale functions (scale_*) control how variable values are mapped to visual properties such as axis ranges and colours. In this example, we change the x and y axis scaling, and the colours.
# modify the x and y axes and specify the colors to be used
ggplot(data = dat,mapping = aes(x = exper,y = wage, color
