Data Visualization Best Practices

Questions and Answers

What is one of the 3Es of displaying data that emphasizes clarity in conveying a message?

  • Efficient
  • Ethical
  • Engaging
  • Effective (correct)

Which of the following tips aims to improve the viewer's understanding of data through placement?

  • Ensure visuals are placed near the corresponding text (correct)
  • Avoid any graphical elements
  • Limit the amount of data presented
  • Use multiple colors to attract attention

What should be avoided to maintain an ethical display of data?

  • Inflating results to create a stronger impact (correct)
  • Citing sources for data created by others
  • Representing quantities accurately
  • Including all relevant data

When using color in graphics, what is a recommended practice?

  • Limit usage to two or three colors for clarity

    What is a common mistake to avoid when using tables in data display?

    • Hiding obvious data points within tables

    Which movie had the highest weekend gross ticket sales during Christmas weekend of 2017?

    • Star Wars: The Last Jedi

    What is one important limitation of bar plots mentioned in the content?

    • They need to start at zero, so that bar length is proportional to the amount shown.

    In a grouped bar plot, what determines the positions of the bars along the x-axis?

    • One categorical variable

    What is indicated about the bar order when representing unordered categories?

    • It should be arranged by data value.

    What might be a suitable alternative to represent data where bar length requires strict proportionality?

    • Dot plots

    What does the content imply regarding the visualization of median income levels categorized by age and race?

    • It is best represented using grouped bar plots.

    How should one interpret the age distribution of Titanic passengers?

    • By assessing the relative proportions of different ages.

    Why might a heatmap be considered effective for visualizing data values?

    • It presents data values through color representation.

    What is the primary objective during the data exploration phase?

    • Understanding key features of the dataset

    Which of the following is a characteristic of the data presentation phase?

    • Preparing high-quality, publication-ready figures

    In the data exploration phase, which of the following aspects is considered less important?

    • The aesthetic appeal of the figures

    What feature is crucial for effective data exploration tools?

    • Ability to adapt input data as required for different plots

    Which type of plotting is typically emphasized during the data exploration stage?

    • Quick iterations over various plots

    Why can some visualization tools hinder efficient data exploration?

    • They require unique input data for each plot type

    What should you prioritize when trying to understand a new dataset during exploration?

    • The efficiency of testing various visualizations

    What happens during the transition from data exploration to data presentation?

    • Focus shifts from understanding to visual appeal

    What is a common application for using 3D visualizations?

    • To show topographic relief of geographical features

    What is a choropleth map primarily used for?

    • Representing data values as differently colored spatial areas

    Which is a significant challenge when designing maps?

    • Determining how to accurately represent angles and areas

    What distinguishes a cartogram from a traditional map?

    • It distorts areas or represents them in a stylized way

    How is the earth's structure described?

    • As a slightly flattened spheroid along its axis

    What does the equator represent on the earth?

    • The line that separates the northern and southern hemispheres

    In what scenarios are 3D visualizations particularly effective?

    • When visualizing actual 3D objects and data mapped onto them

    What aspect of geospatial datasets is essential in ecological studies?

    • Locations of specific plants or animals

    What was the trend observed in submissions to bioRxiv soon after its launch?

    • Rapid, exponential growth in submissions

    What does the author suggest about depicting figures in presentations?

    • Simplified figures help audiences understand key points

    What is meant by 'making a figure for the generals'?

    • Simplifying figures to highlight only the important points

    Why might a single figure be more suitable for certain mediums, like social media?

    • It accommodates audiences with shorter attention spans

    What is a recommended approach before presenting complex figures?

    • Show a simplified version first to ease understanding

    What was the observed relationship between q-bio and bioRxiv submissions?

    • q-bio submissions decreased as bioRxiv submissions increased

    What is a potential issue when audiences view complex visualizations?

    • They may struggle to grasp the overall trends

    Why is it important to help readers understand visualizations?

    • To assist them in inferring the intended messages

    Which of the following statements about longitude is true?

    • The prime meridian is located at 0° longitude.

    What do lines of equal latitude represent?

    • Distance from the equator.

    What is the significance of the World Geodetic System (WGS) 84?

    • It serves as a datum for GPS technology.

    Which of the following describes altitude in the context of geospatial data?

    • It indicates how far the location is from the Earth's center.

    What happens during the process of map projection?

    • Distortions occur because it is impossible to flatten a sphere without losing some properties.

    What is the latitude of the South Pole?

    • -90°

    Which projection property can be preserved when mapping the Earth's surface?

    • Either angles or areas, but not both.

    How is the meridian opposite the prime meridian designated?

    • 180°W

    What do meridians terminate at?

    • The poles.

    Study Notes

    STAT 288: Data Visualization

    • Course taught by Dr. Abdulla Eid
    • Focuses on data visualization concepts and techniques
    • Covers course overview and key concepts, syllabus, data science cycle, data collection examples, open data sources, principles of effective and ethical data visualization, tips for effective and efficient display, and conclusion/next steps.

    Welcome & Course Syllabus

    • Introduction to Data Visualization
    • Overview of course content and structure
    • Learning objectives and expectations

    Data Science Cycle

    • Data Collection (Open-Source data, large sample representation)
    • Data Cleaning (Simple Data Mining)
    • Data Visualization (Focus of this course, data extraction, storytelling)

    Data Collection Examples

    • Student results
    • Stock markets & financial institutions
    • Demographics
    • Sports data (Paris 2024 Olympics, American football)
    • Election Data (2024 US Presidential Election)

    Open Data Sources

    • Data.gov: Science, research, manufacturing, climate
    • Census.gov: US demographic data
    • OpenDataNetwork.com: Robust data search engine
    • NCES.ed.gov: Education data
    • HealthData.gov: Health-related databases
    • DBpedia.org: Structured data extracted from Wikipedia
    • Kaggle.com: Community-published datasets

    What is Data Visualization?

    • Data visualization is part art and part science.
    • The challenge is to get the art right without losing the science.
    • Accurately conveying the data is paramount.
    • It must not mislead or distort the data.
    • Visualizations should be aesthetically pleasing and enhance the message.
    • Visual elements, colors, and design should not be distracting.

    Ugly, Bad, and Wrong...

    • Examples of poor data visualizations.
    • Visualizations that do not accurately reflect the data
    • Bad visual design (e.g., inappropriate colors, unbalanced elements)

    3Es of Displaying Data

    • Effective: Clearly convey the message
    • Efficient: Minimal use of resources
    • Ethical: Represent data truthfully

    Tips for Effective Display

    • Visuals should be placed close to the relevant text.
    • Visuals should allow readers to pause and consider the ideas.
    • Graphics visually reinforce arguments.
    • A simple data story is essential for good delivery.

    Effective Display:

    • Human-induced change: Certain naturally occurring gases (CO₂ and H₂O) trap heat in the atmosphere, producing a greenhouse effect. Burning fossil fuels adds CO₂ to the atmosphere, raising its concentration to levels not seen in 650,000 years.
    • The greenhouse effect: Solar radiation passes through the atmosphere, while greenhouse gases absorb and re-emit infrared radiation, warming the Earth's surface and lower atmosphere.

    Tips for Efficient Display

    • Use color carefully and sparingly.  
    • Avoid color if black and white is sufficient. 
    • Use color to establish visual patterns, but do not overuse it. Ask the viewer to interpret no more than 2-3 colors at a time.

    Which of these looks better?

    • Emphasis on strong contrast, best for visual annotations or charts on screen
    • Using black and white is better than various colors
    • Avoid misrepresenting data with the use of color

    Tips for Ethical Display

    • Be honest in data presentation; do not exaggerate.
    • Cite sources for graphics or data not created by the presenter.
    • Obtain the proper permissions for use of any graphics or data. 
    • Include all relevant data even if unexplained.
    • Accurately represent quantities/magnitudes. 
    • Avoid using tables to hide obvious data points
    • Do not use color to mislead or misrepresent the data.

    Conclusion & Next Steps

    • Review key points
    • Upcoming assignments and topics
    • Resources for further study

    Chapter 2: Data Visualization: Mapping Data Onto Aesthetics

    • Data values converted to visual elements

    Aesthetics and types of data

    • position, shape, size, color, line width and line type
    • data type classification (continuous or discrete)

    Data Types

    • Quantitative/numerical (continuous, discrete)
    • Qualitative/categorical (unordered, ordered)
    • Date or Time
    • Text

    Example of data type

    • Table of data with month, day, location, station ID, and temperature values for different locations.

    Scales map data values onto aesthetics

    • Mapping values to positions, shapes or colors.
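
A scale of this kind can be sketched in a few lines of Python. This is a minimal illustration, not any plotting library's implementation; the function names `linear_scale` and `categorical_scale`, the pixel range, and the city names and colors are assumptions made for the example.

```python
def linear_scale(value, domain, range_):
    """Map a quantitative value from the data domain onto an aesthetic range."""
    d_min, d_max = domain
    r_min, r_max = range_
    fraction = (value - d_min) / (d_max - d_min)
    return r_min + fraction * (r_max - r_min)


def categorical_scale(value, categories, aesthetics):
    """Map a categorical value onto a shape or color by position in a lookup list."""
    return aesthetics[categories.index(value)]


# Hypothetical example: a temperature of 50 on a 0-100 domain, drawn on a 400-pixel axis.
x_pixel = linear_scale(50.0, domain=(0.0, 100.0), range_=(0.0, 400.0))

# Hypothetical example: each location gets its own color.
color = categorical_scale("Chicago", ["Chicago", "Houston"], ["blue", "orange"])
```

The same idea applies to any aesthetic: a quantitative value is rescaled onto a continuous range, while a categorical value is looked up in a discrete set.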

    Example: Map Data to Aesthetics

    • Graph showing temperatures of different US locations over time.

    3.2 Non-Linear Scale

    • original data, linear scale
    • log-transformed data, linear scale
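
The effect of a log transformation can be seen with a small sketch. The values below are hypothetical and chosen only because they span several orders of magnitude; on the original scale the gaps between them grow rapidly, while after a base-10 log transform they are evenly spaced.

```python
import math

# Hypothetical data spanning several orders of magnitude.
values = [1, 10, 100, 1000]

# Log-transformed data, which would then be plotted on a linear scale.
log_values = [math.log10(v) for v in values]

# Gaps between neighboring values before and after the transformation.
linear_gaps = [b - a for a, b in zip(values, values[1:])]
log_gaps = [b - a for a, b in zip(log_values, log_values[1:])]
```

Equal distances on the log-transformed axis correspond to equal ratios in the original data, which is why log scales suit data that differ by multiplicative factors.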

    3.3 Coordinate systems with curved axes

    • Cartesian Coordinates

    Example of a Curved Coordinate System

    • Graph with circular data for temperature, location, and time
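
As a rough sketch of how circular data can be placed in a polar coordinate system, the snippet below maps time of year to an angle and temperature to the distance from the center. The function name `to_polar_xy`, the 365-day year, and the choice of which variable becomes the angle are assumptions made for illustration.

```python
import math


def to_polar_xy(day_of_year, temperature, days_in_year=365):
    """Place one observation on a circular plot.

    The day of the year becomes the angle and the temperature becomes
    the radial distance; the result is the x/y position to draw at.
    """
    angle = 2 * math.pi * day_of_year / days_in_year
    x = temperature * math.cos(angle)
    y = temperature * math.sin(angle)
    return x, y


# The start and the end of the year land at (almost) the same position,
# which is what makes polar coordinates a natural fit for periodic data.
start_of_year = to_polar_xy(0, 50.0)
end_of_year = to_polar_xy(365, 50.0)
```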

    Different System (Subject specific)

    • Cartesian longitude and latitude
    • Interrupted Goode homolosine
    • Robinson, Winkel tripel

    Chapter 4: Color Scales

    • Fundamental use cases: distinguish data groups, represent data values and highlight.

    Qualitative Color Scale

    • Color choice for visually distinct but equivalent categories or groups

    Chapter 5: Directory of Visualizations

    • Charts, plots, and graphs.

    5.1 Amounts

    • Grouped Bars
    • Stacked Bars
    • Heatmap

    5.2 Distribution

    • Histogram
    • Density Plot
    • Cumulative Density
    • Quantile-Quantile Plot

    Distribution (Cont):

    • Boxplots
    • Violins
    • Strip Charts
    • Sina Plots
    • Stacked Histograms
    • Overlapping Densities
    • Ridgeline Plots

    5.3 Proportions

    • Pie Chart
    • Bars
    • Stacked Bars
    • Multiple Pie Charts
    • Grouped Bars
    • Stacked Densities
    • Mosaic Plot
    • Treemap
    • Parallel Sets

    5.4 X-Y Relationships

    • Scatterplot
    • Bubble Chart
    • Paired Scatterplot
    • Slopegraph
    • Density Contours
    • 2D Bins
    • Hex Bins
    • Correlogram

    5.5 Geospatial Data

    • Map
    • Choropleth
    • Cartograms
    • Cartogram Heatmap

    5.6 Uncertainty

    • Error bars indicate the range of likely values for an estimate or measurement. The reference point representing the estimate can be shown as a dot or a bar.
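
One common way to obtain the ends of an error bar is the mean plus or minus one standard error. The sketch below assumes that convention; error bars can equally represent standard deviations or confidence intervals, so a figure should always state which one is shown. The sample values and the function name are hypothetical.

```python
import math
import statistics


def mean_with_standard_error(sample):
    """Return the sample mean and the standard error of the mean."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, sem


# Hypothetical measurements.
sample = [4.0, 5.0, 6.0, 5.0]
mean, sem = mean_with_standard_error(sample)

# The reference point is drawn at `mean`; the bar runs from `lower` to `upper`.
lower, upper = mean - sem, mean + sem
```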

    Uncertainty (Cont):

    • Confidence strips provide a visual sense of uncertainty but are difficult to read accurately. Eyes and half-eyes combine error bars with approaches for visualizing distributions.

    Uncertainty (Cont):

    • Shows different ways to visualize uncertainty with confidence bands, graded confidence bands, and fitted draws.

    Chapter 6 Visualizing amounts

    • In many scenarios, we are interested in the magnitude of a set of numbers, such as the total sales of products, the number of people in a city, or the age of athletes from around the world.
    • Bar plots are the standard approach in this case, though there are alternatives such as dot plots and heatmaps.

    Bar Plot (Chart)

    • A common visualization for amounts and counts.

    Ugly (Why?)

    • Examples of poorly designed bar plots.

    Better Solution

    • A better alternative to poorly designed bar plots

    Bad (Why?)

    • Showing examples of bad design choices, such as incorrect or misleading plotting choices

    Order Matters (In Some Sense)

    • The order of the bars matters when the categories have no inherent order; in that case, arrange the bars by data value.
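
Ordering unordered categories by value is a one-line sort. The sketch below uses hypothetical category names and amounts and prints a crude text bar chart so the effect of the ordering is visible without a plotting library.

```python
# Hypothetical amounts for four unordered categories.
amounts = {"A": 12, "B": 30, "C": 7, "D": 21}

# Sort the categories by their value, largest first.
ordered = sorted(amounts.items(), key=lambda item: item[1], reverse=True)

# Draw a simple text bar chart in the sorted order.
for name, value in ordered:
    print(f"{name} | {'#' * value} {value}")
```

Categories with a natural order (for example age groups) should keep that order instead of being sorted by value.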

    6.2 Grouped and stacked bars:

    • Visualizing amounts broken down by categories, with bars either placed side by side (grouped bar plots) or stacked on top of each other (stacked bar plots).

    Chapter 7: Visualizing Distributions:

    • Histograms and density plots
    • Example: how many passengers of each age were on the Titanic, broken down into age groups
    • The counts for each age group together form the age distribution
    • Visualizing a single distribution, such as how often certain ages appear
    • Using histograms and density plots

    Histogram

    • Representing frequencies with rectangles whose heights are proportional to the number of observations in each age bin
    • All bins in a histogram have the same width
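
The counting step behind a histogram can be sketched as follows. The ages are hypothetical (they are not the Titanic data), and the function name, the bin start, and the convention that each bin includes its lower edge but not its upper edge are assumptions made for the example.

```python
def histogram_counts(values, bin_start, bin_width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Bin i covers the interval [bin_start + i * bin_width,
    bin_start + (i + 1) * bin_width). Values outside all bins are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - bin_start) // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical ages, binned into decades 0-9, 10-19, ..., 60-69.
ages = [2, 8, 15, 22, 24, 29, 31, 35, 47, 63]
counts = histogram_counts(ages, bin_start=0, bin_width=10, n_bins=7)
```

Changing `bin_width` changes the shape of the resulting histogram, which is why it is worth trying several widths before settling on one.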

    Density Plots

    • Drawing a continuous curve that estimates the underlying probability distribution of the data.
    • The curve is obtained by kernel density estimation.

    Bandwidth (Appearance)

    • Illustrating how the kernel bandwidth (and, for histograms, the bin width) affects the appearance of the plot.

    What is the area under the curve in density plot?

    • The total area under a density curve is 1, so the area over an interval gives the estimated probability of observing a value in that interval.
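
A minimal kernel density estimate can be written without any libraries. The sketch below assumes a Gaussian kernel and hypothetical ages; the function names, the integration limits, and the two bandwidths are illustrative choices. It also checks numerically that the area under the estimated curve is close to 1 regardless of the bandwidth.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


def area_under(density, start=-50.0, stop=150.0, steps=2000):
    """Approximate the area under a curve with the midpoint rule."""
    width = (stop - start) / steps
    return sum(density(start + (i + 0.5) * width) for i in range(steps)) * width


# Hypothetical ages (not the Titanic data).
ages = [2, 8, 15, 22, 24, 29, 31, 35, 47, 63]

smooth = gaussian_kde(ages, bandwidth=5.0)   # wider kernel, smoother curve
wiggly = gaussian_kde(ages, bandwidth=2.0)   # narrower kernel, more peaks

area_smooth = area_under(smooth)
area_wiggly = area_under(wiggly)
```

A small bandwidth follows individual data points closely, while a large one smooths over real features, so the bandwidth plays the same role as the bin width in a histogram.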

    Warning!

    • Histograms and density plots are estimates of the distribution, not the data itself.
    • The choice of bin width or bandwidth, and the choice between histograms, density plots, and other options, can make a large difference in how the distribution appears.

    7.2 Visualizing multiple distributions at the same time

    • Visualizing overlapping distributions with stacked histograms.
    • Comparing age distribution between male and female passengers.

    Why?

    • Illustrating why a particular visualization is bad.

    Better Solution

    • Showing examples of visualizations that are better than the previous examples.
    • Comparing stacked bar charts with better alternatives and explaining why one visualization suits the data more than the other.
    • Visualizations should be clear, informative, and easy for the intended audience to interpret, rather than relying on a complex 3-D appearance.
    • Presenting data in simple terms helps capture the audience's attention and understanding.

    Chapter 8 Visualizing distributions: Empirical Cumulative Distribution Functions and Q-Q plots

    • An ECDF answers questions such as how many students received a given number of points or fewer on an exam
    • The ranked data can be shown in ascending or descending order

    8.1 Empirical Cumulative Distribution Functions (ECDFs)

    • Ranking and plotting the data points in order to visualize the cumulative proportion at a given point in the data.
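
The ranking step can be sketched as follows. The exam points are hypothetical and the function name `ecdf` is an assumption; the code sorts the values and pairs each one with the fraction of observations that are less than or equal to it (tied values would each get their own step in this simple version).

```python
def ecdf(values):
    """Return (value, cumulative proportion) pairs in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


# Hypothetical exam points for five students.
points = [72, 85, 60, 91, 78]
curve = ecdf(points)

for value, proportion in curve:
    print(f"{proportion:.0%} of students scored {value} points or fewer")
```

Unlike a histogram or density plot, this construction needs no bin width or bandwidth, because every data point appears in the curve.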

    8.3 Quantile-Quantile Plots (QQ)

    • Visually determine how well a dataset follows a particular distribution, such as a normal distribution.

    Methodologies

    • Using ranked data points to check visually whether the data follow a normal distribution.
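
The points of a normal q-q plot can be computed with Python's standard library. This is a sketch under stated assumptions: the reference normal distribution uses the sample mean and standard deviation, and the plotting positions `(i + 0.5) / n` are one common convention among several. The exam points and the function name are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Pair each ranked observation with its theoretical normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist(mean(values), stdev(values))
    return [
        (reference.inv_cdf((i + 0.5) / n), x)  # (theoretical, observed)
        for i, x in enumerate(ordered)
    ]


# Hypothetical exam points.
points = [72, 85, 60, 91, 78]
pairs = normal_qq_points(points)
```

If the data follow the reference distribution, the pairs fall close to the line where the theoretical and observed quantiles are equal.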

    Example

    • Showing how data points would look on a distribution based on a theoretical normal distribution.

    Regression Line

    • Illustrating how a regression line is different from the line that represents a theoretical normal distribution

    Chapter 9 Visualizing many distributions at once

    • Visualizing many distributions (e.g., monthly temperatures) at once with various plots such as boxplots, violin plots, and ridgeline plots.

    9.1 Visualizing distributions along the vertical axis

    • Showing mean or median values with indication of variation using error bars.

    But why Bad?

    • Discussing what is incorrect or misleading about a visualization.

    Box Plot

    • Displaying summary statistics such as the median, the quartiles, and the minimum and maximum values.
    • Showing outliers in a dataset.
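
The numbers behind a box plot can be computed as in the sketch below. It assumes the widely used convention that points more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers, and it uses the "inclusive" quartile method of Python's `statistics.quantiles`; other tools may use slightly different quartile definitions. The data and function name are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Compute the quartiles, whisker ends, and outliers of a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical data with one unusually large value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
```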

    Temperature Example

    • Plots showing box plots comparing temperatures across the months.

    Violin Plots

    • Showing similar data as box plots, but visualizing the distribution shape of the data.

    Strip Chart

    • Showing scatterplot-like visualizations.
    • Displaying the frequency of different points for particular months.

    Better (More Points): Jittering

    • Showing the improvement to the strip chart
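
Jittering amounts to adding a small random offset to each point's position along the categorical axis so that overlapping points become visible. The sketch below is a minimal version; the function name, the jitter width of 0.2, and the fixed seed (used only to make the example repeatable) are assumptions.

```python
import random


def jitter(positions, width=0.2, seed=0):
    """Shift each position by a random amount between -width and +width."""
    rng = random.Random(seed)
    return [p + rng.uniform(-width, width) for p in positions]


# Hypothetical strip chart: three observations in month 1, three in month 2.
months = [1, 1, 1, 2, 2, 2]
jittered = jitter(months)
```

The width should stay well below half the spacing between categories, so that points from neighboring categories cannot be mistaken for one another.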

    Sina Plots

    • Similar visualizations to strip charts, but the points are spread out to visualize variability across months

    9.2 Visualizing Distributions along the horizontal axis

    • Displaying distributions along the horizontal axis (e.g., temperatures) for a dataset with multiple categories

    Chapter 10: Visualizing Proportions

    • How to visually represent how different categories/groups compare to each other

    Pie Charts

    • Visualizing proportions as slices of a circle

    Rectangular Pie Charts (Stacked Bars)

    • Visualizing proportions using stacked bars

    Side-by-Side Bars

    • Visualizing proportions using side-by-side bars

    Table 10.1: Pros and cons of common approaches to visualizing proportions.

    • Summary of different approaches to visualize proportions

    A case for side-by-side bars

    • Showing an example of an improved/more appropriate visual representation of proportions.

    A Better way

    • Providing a better visualization

    10.3 A case for stacked bars and stacked densities

    • Illustrating the use of stacked plots

    Heat Maps

    • Representing proportions in grid form with gradients or colors

    Chapter 11: Nested Proportions

    • Visualizing how one variable is broken down by other variables using nested visualizations.

    11.1 Nested proportions gone wrong

    • Providing examples of incorrect visualizations about the proportions of different variables

    Why Wrong?

    • Problems with the incorrect visualizations of proportions

    11.2 Mosaic plots and treemaps

    • Mosaic plots and treemaps are used for displaying proportions within multiple categories.

    Treemaps

    • Representing hierarchical data with nested boxes to show the proportion sizes

    11.3 Nested pies

    • Visualizing proportions with nested pie charts.

    Chapter 12: Visualizing Associations among Two or More Quantitative Variables

    • Scatter plots and other options.

    12.1 Scatter plots

    • Visualizing the relationship between two quantitative variables using scatter plots.

    12.2 Correlograms

    • Visualizing correlation between different variables.

    Correlograms

    • Showing correlation of different variables against each other.
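
A correlogram is a picture of a correlation matrix, which can be computed as in the sketch below. It assumes Pearson correlation coefficients; the column names and values are hypothetical and constructed so that the expected correlations are easy to see.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlate every column with every other column."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical variables: one rises with x, one falls with x.
data = {
    "x": [1, 2, 3, 4, 5],
    "double_x": [2, 4, 6, 8, 10],
    "minus_x": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(data)
```

In the plot, each cell of this matrix is typically drawn as a colored tile or circle, with color encoding the sign and strength of the correlation.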

    12.4 Paired Data

    • Plotting paired quantitative data through time on scatter plots.
    • Showing overall trends in data over time, and how to visualize these trends

    14.1 Smoothing

    • Using moving averages to smooth out data and see overall trends.
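
A simple moving average can be sketched as follows. The function name and the data are hypothetical; each output value is the mean of one window of consecutive observations, and it can be plotted either at the end or at the center of that window.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical time series.
series = [1, 2, 3, 4, 5, 6]
smoothed = moving_average(series, window=3)
```

A larger window gives a smoother curve but a shorter one, since the first and last few observations have no complete window around them.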

    Moving Average (After or at)

    • Showing example plots comparing different moving averages and the use or limitations of the different types.

    Observations from the previous figure

    • Observations on which kind of smoothing is the better choice for a given time series.

    Another Moving Average (LOESS)

    • Showing an alternative way to smooth data by using a LOESS smoother.
    • Fitting trends to data using linear or exponential functions

    Log scale and Exp

    • Showing log scales for comparison
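
The link between exponential trends and log scales can be illustrated with a small sketch: an exponential curve y = a·exp(b·x) becomes a straight line after taking the logarithm of y, so it can be fitted with ordinary least squares on the log-transformed values. The function names and the synthetic data are assumptions made for the example, and this approach requires all y values to be positive.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept of a straight line through the points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by fitting a line to log(y)."""
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(intercept), slope


# Synthetic data generated from y = 2 * exp(0.5 * x).
xs = [0, 1, 2, 3, 4]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
```

This is also why exponential growth appears as a straight line when the y-axis uses a log scale.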

    Chapter 26: Don't go 3D

    • 3-D visualizations are often inappropriate

    26.1 Avoid gratuitous 3D

    • Avoid turning standard 2-D visualizations into 3-D objects, as the extra dimensions do not typically convey any meaningful information.

    Focus on the size of 25% in these four graphs

    • Demonstrating how 3-D projections distort the impression of the data

    A case for side-by-side bars

    • Giving examples as to why side-by-side bar charts are better than other visualizations, such as pie charts

    A Better way

    • Shows a different/better visualization choice for depicting proportions

    10.3 A case for stacked bars and stacked densities

    • Illustrating stacked bars and stacked densities

    Chapter 15: Geospatial Data

    • Plotting data points that have locations to create geographic visualizations

    15.2 Layers

    • Using multiple layers to show the relationship between variables in the data, such as terrain, road networks, or city locations

    15.3 Choropleth mapping

    • Coloring spatial areas according to the data values they represent, to depict quantities that vary across regions.
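
The coloring step of a choropleth can be sketched as a lookup from data value to color class. The region names, values, class breaks, and hex colors below are all hypothetical; real maps would additionally need the region geometries, which are outside the scope of this sketch.

```python
from bisect import bisect_right


def color_class(value, breaks, colors):
    """Pick the color of the class that `value` falls into.

    `breaks` are the ascending class boundaries, so there must be
    exactly one more color than there are breaks.
    """
    if len(colors) != len(breaks) + 1:
        raise ValueError("need exactly len(breaks) + 1 colors")
    return colors[bisect_right(breaks, value)]


# Hypothetical class breaks and a light-to-dark sequential color scale.
breaks = [10, 20, 30]
colors = ["#eff3ff", "#bdd7e7", "#6baed6", "#2171b5"]

# Hypothetical data values per region.
values = {"Region A": 5, "Region B": 20, "Region C": 42}
fills = {region: color_class(v, breaks, colors) for region, v in values.items()}
```

A sequential scale that runs from light to dark keeps the ordering of the data values readable on the map.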

    15.4 Cartograms

    • Distorting the areas of a geographic map, or representing them in a stylized way, so that their size is proportional to a data value

    Chapter 28: Choosing visualization software

    • Choosing the best/most appropriate software program for data exploration and visualization, based on the user's familiarity, data amount, and time pressure

    28.1: Reproducibility and Repeatability

    • Ensuring that graphs are well-documented and reproducible/repeatable for future viewers

    Interactive Software

    • Explaining why figures made with interactive software are hard to reproduce

    Advice

    • Recommendations for better visualizations

    28.2 Data exploration versus data presentation

    • Discussing the need for efficient and rapid visualizations versus high-quality presentation figures

    Data Presentation

    • Providing final/polished/publication-level visualizations

    Exploration Stage and Software

    • Discussing the characteristics and utility of software that allows easy figure manipulation

    Themes

    • Discussing using themes/patterns for easier data comparison

    Benefit of Separation

    • Advantages of separating content from design.

    Summary of the Chapter

    • Summary and discussion points of visualization programs

    Chapter 29: Storytelling

    • Explaining the basics of communicating data and making a point.
    • How to tell a compelling story with data using specific story formats.
    • Presenting well-developed and structured stories for audience understanding
    • Contrasting simple and complex visualizations to show storytelling in graphics.
    • Providing clear explanations and use of data.

    29.1 What is a story?

    • Important characteristics of good storytelling, such as strong and focused story delivery and consistent message

    Story!!

    • Importance of narrative in data visualization

    Benefit of Story

    • How effective narrative/story structure benefits data visualization

    Compelling Story!

    • Using stories to convey meaning and insights

    29.2 Make a figure for the generals

    • Making a figure simple and clear to the intended audience

    29.3 Build up towards complex figures

    • Presenting complex figures by starting with simple ones

    29.4 Make your figures memorable

    • Showing how to present data in various ways, and identifying the key aspects that make a figure memorable

    29.5 Be consistent but don't be repetitive

    • Using similar/consistent visual language for a set of related figures.

    Week 10 - Chapter Supplementary Information

    • Provide supplementary data/information for further learning/study


    Description

    This quiz focuses on essential principles and practices for effective data visualization. Test your understanding of the 3Es of displaying data, ethical considerations, and common mistakes to avoid. Enhance your skills in presenting clear and impactful data.
