Questions and Answers
What is one of the 3Es of displaying data that emphasizes clarity in conveying a message?
- Efficient
- Ethical
- Engaging
- Effective (correct)
Which of the following tips aims to improve the viewer's understanding of data through placement?
- Ensure visuals are placed near the corresponding text (correct)
- Avoid any graphical elements
- Limit the amount of data presented
- Use multiple colors to attract attention
What should be avoided to maintain an ethical display of data?
- Inflating results to create a stronger impact (correct)
- Citing sources for data created by others
- Representing quantities accurately
- Including all relevant data
When using color in graphics, what is a recommended practice?
What is a common mistake to avoid when using tables in data display?
Which movie had the highest weekend gross ticket sales during Christmas weekend of 2017?
What is one important limitation of bar plots mentioned in the content?
In a grouped bar plot, what determines the positions of the bars along the x-axis?
What is indicated about the bar order when representing unordered categories?
What might be a suitable alternative to represent data where bar length requires strict proportionality?
What does the content imply regarding the visualization of median income levels categorized by age and race?
How should one interpret the age distribution of Titanic passengers?
Why might a heatmap be considered effective for visualizing data values?
What is the primary objective during the data exploration phase?
Which of the following is a characteristic of the data presentation phase?
In the data exploration phase, which of the following aspects is considered less important?
What feature is crucial for effective data exploration tools?
Which type of plotting is typically emphasized during the data exploration stage?
Why can some visualization tools hinder efficient data exploration?
What should you prioritize when trying to understand a new dataset during exploration?
What happens during the transition from data exploration to data presentation?
What is a common application for using 3D visualizations?
What is a choropleth map primarily used for?
Which is a significant challenge when designing maps?
What distinguishes a cartogram from a traditional map?
How is the earth's structure described?
What does the equator represent on the earth?
In what scenarios are 3D visualizations particularly effective?
What aspect of geospatial datasets is essential in ecological studies?
What was the trend observed in submissions to bioRxiv soon after its launch?
What does the author suggest about depicting figures in presentations?
What is meant by 'making a figure for the generals'?
Why might a single figure be more suitable for certain mediums, like social media?
What is a recommended approach before presenting complex figures?
What was the observed relationship between q-bio and bioRxiv submissions?
What is a potential issue when audiences view complex visualizations?
Why is it important to help readers understand visualizations?
Which of the following statements about longitude is true?
What do lines of equal latitude represent?
What is the significance of the World Geodetic System (WGS) 84?
Which of the following describes altitude in the context of geospatial data?
What happens during the process of map projection?
What is the latitude of the South Pole?
Which projection property can be preserved when mapping the Earth's surface?
How is the meridian opposite the prime meridian designated?
What do meridians terminate at?
Flashcards
Ethical Data Display
A visual representation that accurately reflects the data and avoids manipulating or exaggerating results.
Visuals as Pauses
Using visuals to guide the reader through the information, allowing them to pause and reflect on the presented concepts.
Effective Visuals
Employing visual elements in a way that is engaging without overwhelming the reader.
Efficient Visual Display
Effective Visual
Latitude
Longitude
Datum
Prime Meridian
Meridians
Parallels
Projection
Conformal Projection
Figure for the Generals
Building up to Complex Figures
Simplifying Figures
Rapid Processing of Complex Visualisations
Figures the Audience Can Understand
Visualisation
Meaningful Visualisation
Breaking Down Complex Information
Choropleth Map
Cartogram
Map Projection
Equator
Poles
Spheroid
Geospatial Data Visualization
3D Visualization
Data Exploration
Data Presentation
Visualization Flexibility
Well-Designed Data Exploration Tool
Plot-Type-Organized Visualization Tools
Challenges of Plot-Type-Organized Tools
Speed and Efficiency in Data Exploration
Discovering Hidden Patterns
Bar Plot
Order in Bar Plots
Grouped Bar Plot
Stacked Bar Plot
Dot Plot
Heatmap
Density Plot
Study Notes
STAT 288: Data Visualization
- Course taught by Dr. Abdulla Eid
- Focuses on data visualization concepts and techniques
- Covers course overview and key concepts, syllabus, data science cycle, data collection examples, open data sources, principles of effective and ethical data visualization, tips for effective and efficient display, and conclusion/next steps.
Welcome & Course Syllabus
- Introduction to Data Visualization
- Overview of course content and structure
- Learning objectives and expectations
Data Science Cycle
- Data Collection (Open-Source data, large sample representation)
- Data Cleaning (Simple Data Mining)
- Data Visualization (Focus of this course, data extraction, story telling)
Data Collection Examples
- Student results
- Stock markets & financial institutions
- Demographics
- Sport Data (Olympics, France 2024, American Football)
- Election Data (2024 US Presidential Election)
Open Data Sources
- Data.gov: Science, research, manufacturing, climate
- Census.gov: US demographic data
- OpenDataNetwork.com: Robust data search engine
- NCES.ed.gov: Education data
- HealthData.gov: Health-related databases
- DBpedia.org: Structured data extracted from Wikipedia
- Kaggle.com: Community-published datasets
What is Data Visualization?
- Data visualization is part art and part science.
- The challenge is to get the art right without losing the science.
- Accurately conveying the data is paramount.
- It must not mislead or distort the data.
- Visualizations should be aesthetically pleasing and enhance the message.
- Visual elements, colors, and design should not be distracting.
Ugly, Bad, and Wrong...
- Examples of poor data visualizations.
- Visualizations that do not accurately reflect the data
- Bad visual design (e.g., inappropriate colors, unbalanced elements)
3Es of Displaying Data
- Effective: Clearly convey the message
- Efficient: Minimal use of resources
- Ethical: Represent data truthfully
Tips for Effective Display
- Visuals should be placed close to the relevant text.
- Visuals should allow readers to pause and consider the ideas.
- Graphics visually reinforce arguments.
- A simple data story is essential for good delivery.
Effective Display:
- Human-induced change: Certain naturally occurring gases (CO₂ and H₂O) trap atmospheric heat, leading to a greenhouse effect. Burning fossil fuels contributes CO₂ to the atmosphere, raising it to levels not seen in 650,000 years.
- The greenhouse effect: Solar radiation passes through the atmosphere, while greenhouse gases absorb and re-emit infrared radiation, warming the Earth's surface and lower atmosphere.
Tips for Efficient Display
- Use color carefully and sparingly.
- Avoid color if black and white is sufficient.
- Use color to establish visual patterns but do not overuse it. Limit color interpretation to 2-3 colors at a time.
Which of these looks better?
- Emphasis on strong contrast, best for visual annotations or charts on screen
- Using black and white is better than various colors
- Avoid misrepresenting data with the use of color
Tips for Ethical Display
- Be honest in data presentation; do not exaggerate.
- Cite sources for graphics or data not created by the presenter.
- Obtain the proper permissions for use of any graphics or data.
- Include all relevant data even if unexplained.
- Accurately represent quantities/magnitudes.
- Avoid using tables to hide obvious data points
- Do not use color to mislead or misrepresent the data.
Conclusion & Next Steps
- Review key points
- Upcoming assignments and topics
- Resources for further study
Chapter 2: Data Visualization: Mapping Data Onto Aesthetics
- Data values converted to visual elements
Aesthetics and types of data
- position, shape, size, color, line width and line type
- data type classification (continuous or discrete)
Data Types
- Quantitative/numerical (continuous, discrete)
- Qualitative/categorical (unordered, ordered)
- Date or Time
- Text
Example of data type
- Table of data with month, day, location, station ID, and temperature values for different locations.
Scales map data values onto aesthetics
- Mapping values to positions, shapes or colors.
Example: Map Data to Aesthetics
- Graph showing temperatures of different US locations over time.
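A scale, as described above, is a function from data values to aesthetic values. The following is a minimal sketch of that idea, assuming a linear position scale and a lookup-table color scale; the function name `linear_scale`, the 500-pixel axis, and the station names are illustrative and not taken from the course material.

```python
def linear_scale(domain, out_range):
    """Return a function mapping values in `domain` linearly onto `out_range`."""
    d0, d1 = domain
    r0, r1 = out_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


# Position scale: temperatures from 0 to 100 degrees mapped onto a 500-pixel axis.
temp_to_x = linear_scale((0, 100), (0, 500))

# Color scale: a discrete lookup from category to color (hypothetical names).
location_to_color = {"Station A": "tab:blue", "Station B": "tab:orange"}

print(temp_to_x(25))                   # 125.0
print(location_to_color["Station B"])  # tab:orange
```

Continuous data usually get a function like `temp_to_x`, while unordered categorical data get a lookup like `location_to_color`.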
3.2 Non-Linear Scale
- original data, linear scale
- log-transformed data, linear scale
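To illustrate why log-transformed data look different on a linear scale, here is a small sketch with made-up values spanning several orders of magnitude. It only compares the spacing between neighbouring values before and after a base-10 log transform.

```python
import math

# Hypothetical values, each ten times the previous one.
values = [1, 10, 100, 1000]

# On a linear scale the gaps between neighbours grow with the values.
linear_gaps = [b - a for a, b in zip(values, values[1:])]

# After a log10 transform, each tenfold step covers the same distance.
log_values = [math.log10(v) for v in values]
log_gaps = [b - a for a, b in zip(log_values, log_values[1:])]

print(linear_gaps)  # [9, 90, 900]
print(log_gaps)
```

Equal ratios in the original data become equal distances after the transform, which is why a log scale suits data that vary multiplicatively.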
3.3 Coordinate systems with curved axes
- Cartesian Coordinates
Example of Curved Coordinate System
- Graph with circular data for temperature, location, and time
Different System (Subject specific)
- Cartesian longitude and latitude
- Interrupted Goode homolosine
- Robinson, Winkel tripel
Chapter 4: Color Scales
- Fundamental use cases: distinguish data groups, represent data values and highlight.
Qualitative Color Scale
- Color choice for visually distinct but equivalent categories or groups
Chapter 5: Directory of Visualizations
- Charts, plots, and graphs.
5.1 Amounts
- Grouped Bars
- Stacked Bars
- Heatmap
5.2 Distribution
- Histogram
- Density Plot
- Cumulative Density
- Quantile-Quantile Plot
Distribution (Cont):
- Boxplots
- Violins
- Strip Charts
- Sina Plots
- Stacked Histograms
- Overlapping Densities
- Ridgeline Plots
5.3 Proportions
- Pie Chart
- Bars
- Stacked Bars
- Multiple Pie Charts
- Grouped Bars
- Stacked Densities
- Mosaic Plot
- Treemap
- Parallel Sets
5.4 X-Y Relationships
- Scatterplot
- Bubble Chart
- Paired Scatterplot
- Slopegraph
- Density Contours
- 2D Bins
- Hex Bins
- Correlogram
5.5 Geospatial Data
- Map
- Choropleth
- Cartograms
- Cartogram Heatmap
5.6 Uncertainty
- Error bars indicate the range of likely values for an estimate or measurement. The reference point they extend from can be visualized as a dot or a bar.
Uncertainty (Cont):
- Confidence strips provide a visual sense of uncertainty but are difficult to read accurately. Eye and half-eye plots combine error bars with approaches for visualizing distributions.
Uncertainty (Cont):
- Shows different ways to visualize uncertainty with confidence bands, graded confidence bands, and fitted draws.
Chapter 6 Visualizing amounts
- In many scenarios, we are interested in the magnitude of a set of numbers, such as the total sales of products, the number of people in a city, or the age of athletes from around the world.
- Bar plots are the standard approach in this case, though there are alternatives such as dot plots and heatmaps.
Bar Plot (Chart)
- A common visualization for amounts and counts.
Ugly (Why?)
- Examples of poorly designed bar plots.
Better Solution
- A better alternative to poorly designed bar plots
Bad (Why?)
- Showing examples of bad design choices, such as incorrect or misleading plotting choices
Order Matters (In Some Sense)
- The order of the bars is a design choice when the categories are not inherently ordered; in that case the order should be chosen deliberately, for example by sorting the bars by value.
6.2 Grouped and stacked bars:
- Visualizing amounts broken down by categories grouped (grouped bar plots), stacked (stacked bar plots).
Chapter 7: Visualizing Distributions:
- Histograms, and Density plots
- How many passengers of each age were on the Titanic, broken down by age group?
- The counts for each age group are called the age distribution.
- Visualizing single distribution, like how often certain ages appear
- Using histograms and density plots
Histogram
- Visually representing frequencies by using rectangles of varying heights proportional to different age bins
- All bins in a histogram have the same width.
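The binning behind a histogram can be sketched in a few lines. This is a minimal illustration, assuming half-open bins `[start, start + width)` and a made-up list of ages; the function name `histogram_counts` is hypothetical.

```python
def histogram_counts(data, start, width, n_bins):
    """Count values into n_bins equal-width, half-open bins beginning at `start`."""
    counts = [0] * n_bins
    for x in data:
        idx = int((x - start) // width)
        if 0 <= idx < n_bins:  # values outside the binned range are ignored
            counts[idx] += 1
    return counts


# Made-up ages, not the actual Titanic data.
ages = [2, 9, 18, 21, 22, 25, 29, 30, 34, 36, 41, 47, 58, 63]

print(histogram_counts(ages, start=0, width=10, n_bins=7))  # [2, 1, 4, 3, 2, 1, 1]
print(histogram_counts(ages, start=0, width=20, n_bins=4))  # [3, 7, 3, 1]
```

The same data produce a differently shaped histogram at a different bin width, so it is worth trying more than one.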
Density Plots
- Drawing an underlying probability distribution for data.
- Kernel estimation.
Bandwidth (Appearance)
- Illustrating how the bandwidth/width affects the visualization.
What is the area under the curve in density plot?
- Understanding the relationship to probability.
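Kernel density estimation, the role of the bandwidth, and the area under the curve can be illustrated together. The sketch below assumes a Gaussian kernel and made-up ages; `gaussian_kde` and `area_under` are hypothetical helper names, and the integral is approximated with the trapezoid rule.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a density estimate built from Gaussian kernels of the given bandwidth."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


def area_under(f, lo, hi, steps=2000):
    """Trapezoid-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    total += sum(f(lo + i * h) for i in range(1, steps))
    return total * h


# Made-up ages, not the actual Titanic data.
ages = [2, 9, 18, 21, 22, 25, 29, 30, 34, 36, 41, 47, 58, 63]

for bw in (2.0, 5.0, 10.0):
    kde = gaussian_kde(ages, bw)
    # The curve's shape changes with the bandwidth, but the total area stays close to 1.
    print(bw, round(area_under(kde, -100, 200), 4))
```

Because the total area is 1, the area under the curve between two ages can be read as the estimated proportion of passengers in that age range.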
Warning!
- Realizing that histograms and density plots are estimates.
- The picture can change considerably with the choices made (histogram versus density plot, bin width, bandwidth), so compare several options.
7.2 Visualizing multiple distributions at the same time
- Visualizing overlapping distributions with stacked histograms.
- Comparing age distribution between male and female passengers.
Why?
- Illustrating why a particular visualization is bad.
Better Solution
- Showing examples of visualizations that are better than the previous examples.
- Illustrating the comparison between stacked bar charts and better data visualizations, and how one visualization is more suitable than the other. Data visualization should be clear and informative for the intended audience.
- Visualizations need to be easily interpreted, and show the data in a clear and understandable manner rather than a complex 3-D appearance.
- The presentation of data in simple terms will effectively capture the target population's attention and understanding
Chapter 8 Visualizing distributions: Empirical Cumulative Distribution Functions and Q-Q plots
- How many students scored at or below a given number of points on an exam
- Data points can be ranked in ascending or descending order; the same ranking underlies quantile-quantile (Q-Q) plots
8.1 Empirical Cumulative Distribution Functions (ECDFs)
- Ranking and plotting the data points in order to visualize the cumulative proportion at a given point in the data.
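As a small sketch of the cumulative proportion described above, the function below returns the fraction of values at or below a given point. The exam scores are made up and the name `ecdf_at` is hypothetical.

```python
def ecdf_at(data, x):
    """Proportion of data points that are less than or equal to x."""
    return sum(1 for v in data if v <= x) / len(data)


# Made-up exam points for ten students.
points = [52, 61, 61, 70, 74, 78, 83, 85, 90, 97]

print(ecdf_at(points, 61))   # 0.3
print(ecdf_at(points, 100))  # 1.0
print(ecdf_at(points, 50))   # 0.0
```

Plotting `ecdf_at` against every observed score gives the step-shaped ECDF curve.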
8.3 Quantile-Quantile Plots (QQ)
- Visually determine how well a dataset follows a particular distribution, such as a normal distribution.
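One way to compute the points of a normal Q-Q plot is sketched below. It assumes the reference distribution is a normal with the sample's own mean and standard deviation, and it assumes plotting positions of (i + 0.5)/n; other conventions exist. The scores are made up and `normal_qq_points` is a hypothetical name.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(data):
    """Pair each ranked observation with the matching quantile of a fitted normal."""
    xs = sorted(data)
    n = len(xs)
    reference = NormalDist(mean(xs), stdev(xs))
    # Plotting positions (i + 0.5) / n avoid probabilities of exactly 0 or 1.
    return [(reference.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]


# Made-up exam points for ten students.
scores = [52, 61, 61, 70, 74, 78, 83, 85, 90, 97]

for theoretical, observed in normal_qq_points(scores):
    print(round(theoretical, 1), observed)
```

If the data follow the reference distribution, the pairs fall close to the line where the theoretical quantile equals the observed value.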
Methodologies
- Using ranked data points to visually see the distribution following a normal distribution.
Example
- Showing how data points would look on a distribution based on a theoretical normal distribution.
Regression Line
- Illustrating how a regression line is different from the line that represents a theoretical normal distribution
Chapter 9 Visualizing many distributions at once
- Visualizing many distributions (e.g., monthly temperatures) at once with various plots such as boxplots, violin plots, and ridgeline plots.
9.1 Visualizing distributions along the vertical axis
- Showing mean or median values with indication of variation using error bars.
But why Bad?
- Discussing what is incorrect or misleading about a visualization.
Box Plot
- Displaying summary statistics such as the median, quartiles, and minimum and maximum values.
- Showing outliers in a dataset.
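The summary statistics behind a box plot can be computed as follows. This sketch assumes the common convention that whiskers extend to the most extreme points within 1.5 times the interquartile range of the box and that anything beyond is drawn as an outlier; the notes do not state which convention the course uses. The data and the name `box_summary` are made up.

```python
from statistics import quantiles


def box_summary(data):
    """Quartiles, whisker ends, and outliers using the 1.5 * IQR convention."""
    xs = sorted(data)
    q1, median, q3 = quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in xs if low_fence <= x <= high_fence]
    return {
        "whisker_low": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": [x for x in xs if x < low_fence or x > high_fence],
    }


# Made-up temperatures with one unusually high reading.
temperatures = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = box_summary(temperatures)
print(summary["median"], summary["whisker_high"], summary["outliers"])  # 5.5 9 [30]
```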
Temperature Example
- Plots showing box plots comparing temperatures across the months.
Violin Plots
- Showing similar data as box plots, but visualizing the distribution shape of the data.
Strip Chart
- Showing scatterplot-like visualizations.
- Displaying the frequency of different points for particular months.
Better (More points) Jittering
- Showing the improvement to the strip chart
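Jittering adds a small random offset to each point's position along the categorical axis so that overlapping points become visible. The sketch below assumes uniform noise and a fixed seed for repeatability; the function name `jitter` and the month codes are illustrative.

```python
import random


def jitter(positions, width, seed=0):
    """Shift each position by uniform random noise in [-width, width]."""
    rng = random.Random(seed)  # fixed seed so the same plot can be regenerated
    return [p + rng.uniform(-width, width) for p in positions]


# Six observations: three in month 1 and three in month 2.
months = [1, 1, 1, 2, 2, 2]
jittered = jitter(months, width=0.2)
print(jittered)
```

Only the categorical position is jittered; the measured value on the other axis is left untouched so the data are not distorted.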
Sina Plots
- Similar visualizations to strip charts, but the points are spread out to visualize variability across months
9.2 Visualizing Distributions along the horizontal axis
- Displaying distributions along the horizontal axis (e.g. temperatures) for a dataset with multiple categories
Chapter 10: Visualizing Proportions
- How to visually represent how different categories/groups compare to each other
Pie Charts
- Visualizing proportions as slices of a circle
Rectangular Pie Charts (Stacked Bars)
- Visualizing proportions using stacked bars
Side-by-Side Bars
- Visualizing proportions using side-by-side bars
Table 10.1: Pros and cons of common approaches to visualizing proportions.
- Summary of different approaches to visualize proportions
A case for side-by-side bars
- Showing an example of an improved/more appropriate visual representation of proportions.
A Better way
- Providing a better visualization
10.3 A case for stacked bars and stacked densities
- Illustrating the use of stacked plots
Heat Maps
- Representing proportions in grid form with gradients or colors
Chapter 11: Nested Proportions
- Visualizing how one variable is broken down by other variables using nested visualizations.
11.1 Nested proportions gone wrong
- Providing examples of incorrect visualizations about the proportions of different variables
Why Wrong?
- Problems with the incorrect visualizations of proportions
11.2 Mosaic plots and treemaps
- Mosaic plots and treemaps are used for displaying proportions within multiple categories.
Treemaps
- Representing hierarchical data with nested boxes to show the proportion sizes
11.3 Nested pies
- Visualizing proportions with nested pie charts.
Chapter 12: Visualizing Associations among Two or More Quantitative Variables
- Scatter plots and other options.
12.1 Scatter plots
- Visualizing the relationship between two quantitative variables using scatter plots.
12.2 Correlograms
- Visualizing correlation between different variables.
Correlograms
- Showing correlation of different variables against each other.
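A correlogram displays a matrix of pairwise correlation coefficients. The sketch below computes such a matrix with the Pearson coefficient, which is an assumption since the notes do not name the coefficient; the column names and values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Nested dict holding the correlation of every column with every other."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up columns: one rises with x, one falls with x.
data = {
    "x": [1, 2, 3, 4, 5],
    "double_x": [2, 4, 6, 8, 10],
    "reversed_x": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In a correlogram each cell of this matrix is drawn as a colored tile or circle, typically with color encoding the sign and strength of the correlation.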
12.4 Paired Data
- Plotting paired quantitative data through time on scatter plots.
Chapter 14: Visualizing Trends
- Overall trends in data over time, and how to visualize these trends
14.1 Smoothing
- Using moving averages to smooth out data and see overall trends.
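A moving average replaces each point with the mean of the points in a window around it. The following sketch assumes a centered window of odd length and drops the positions at either end where the window is incomplete; the series is made up and `centered_moving_average` is a hypothetical name.

```python
def centered_moving_average(values, window):
    """Average each point with its neighbours; `window` must be odd."""
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centered on a point")
    half = window // 2
    smoothed = []
    for i in range(half, len(values) - half):
        chunk = values[i - half : i + half + 1]
        smoothed.append(sum(chunk) / window)
    return smoothed


# Made-up noisy series with an upward trend.
series = [10, 12, 9, 14, 13, 17, 15, 20, 18]
print(centered_moving_average(series, window=3))
print(centered_moving_average(series, window=5))
```

A wider window gives a smoother curve but a shorter one, because more points are lost at the edges.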
Moving Average (After or at)
- Showing example plots comparing different moving averages and the use or limitations of the different types.
Observations from the previous figure
- Observation of what kind of smoothing is a better choice against time, given a particular dataset.
Another Moving Average (LOESS)
- Showing an alternative way to smooth data by using a LOESS smoother.
14.2 Showing trends with a defined functional form
- Fitting trends to data using linear or exponential functions
Log scale and Exp
- Showing log scales for comparison
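The link between exponential trends and log scales can be sketched as follows: an exponential curve y = a * exp(b * x) becomes a straight line after taking the logarithm of y, so it can be fitted with ordinary least squares on the transformed values. This is one possible approach, not necessarily the one used in the course; the data and function names are made up.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by fitting a straight line to log(y)."""
    b, log_a = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(log_a), b


# Made-up data that doubles at every step, starting from 3.
xs = [0, 1, 2, 3, 4]
ys = [3 * 2 ** x for x in xs]

a, b = fit_exponential(xs, ys)
print(round(a, 6), round(b, 6), round(math.log(2), 6))
```

Here the fitted `b` matches ln 2, the growth rate of a series that doubles each step. On a log-scaled y-axis the same data would fall on a straight line.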
Chapter 26: Don't go 3D
- 3-D visualizations are often inappropriate
26.1 Avoid gratuitous 3D
- Avoid turning standard 2-D visualizations into 3-D objects, as the extra dimensions do not typically convey any meaningful information.
Focus on the size of 25% in these four graphs
- Demonstrating how 3-D projections distort the impression of the data
A case for side-by-side bars
- Giving examples as to why side-by-side bar charts are better than other visualizations, such as pie charts
A Better way
- Shows a different/better visualization choice for depicting proportions
10.3 A case for stacked bars and stacked densities
- Illustrating stacked bars and stacked densities
Chapter 15: Geospatial Data
- Plotting data points that have locations to create geographic visualizations
15.2 Layers
- Using multiple layers to show the relationship between variables in the data, such as terrain, road networks, or city locations
15.3 Choropleth mapping
- Coloring the regions of a map according to a data value that varies across spatial areas.
15.4 Cartograms
- Distorting the regions of a map so that each region's size is proportional to a data value
Chapter 28: Choosing visualization software
- Choosing the best/most appropriate software program for data exploration and visualization, based on the user's familiarity, data amount, and time pressure
28.1: Reproducibility and Repeatability
- Ensuring that graphs are well-documented and reproducible/repeatable for future viewers
Interactive Software
- Explaining how interactive software is not useful in data reproducibility
Advice
- Recommendations for better visualizations
28.2 Data exploration versus data presentation
- Discussing the need for efficient and rapid visualizations versus high-quality presentation figures
Data Presentation
- Providing final/polished/publication-level visualizations
Exploration Stage and Software
- Discussing the characteristics and utility of software that allows easy figure manipulation
Themes
- Discussing using themes/patterns for easier data comparison
Benefit of Separation
- Advantages of separating content from design.
Summary of the Chapter
- Summary and discussion points of visualization programs
Chapter 29: Story Telling
- Explaining the basics of communicating data and making a point.
- How to tell a story with data using compelling stories with specific formats.
- Presenting well-developed and structured stories for audience understanding
- Presenting simple versus complex visualizations to show storytelling in graphics.
- Providing clear explanations and use of data.
29.1 What is a story?
- Important characteristics of good storytelling, such as strong and focused story delivery and consistent message
Story!!
- Importance of narrative in data visualization
Benefit of Story
- How effective narrative/story structure benefits data visualization
Compelling Story!
- Using stories to convey meaning and insights
29.2 Make a figure for the generals
- Making a figure simple and clear to the intended audience
29.3 Build up towards complex figures
- Presenting complex figures by starting with simple ones
29.4 Make your figures memorable
- Showing how to present data in various ways, and what are the key aspects for making your data more memorable
29.5 Be consistent but don't be repetitive
- Using similar/consistent visual language for a set of related figures.
Week 10 - Chapter Supplementary Information
- Provide supplementary data/information for further learning/study