CAP483_Introduction 1 (1).pdf
Data Visualization
UNIT 1

What is Data Visualization?

Data visualization is the graphical representation of information and data. It presents data as an image or graphic to make it easier to identify patterns and understand difficult concepts. Technology allows users to interact with the data by changing the parameters to see more detail and create new insights.

By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data. This gives employees or business owners an excellent way to present data to non-technical audiences without confusion. In the world of Big Data, data visualization tools and technologies are essential for analyzing massive amounts of information and making data-driven decisions. Especially when working with very large data sets, developing a cohesive format is vital to creating visualizations that are both useful and visually appealing.

Why Use Data Visualization?

According to IBM, 2.5 quintillion bytes of data are created every day. Research scientist Andrew McAfee and Professor Erik Brynjolfsson of MIT point out that “more data cross the internet every second than were stored in the entire internet just 20 years ago.” As the world becomes more and more connected with an increasing number of electronic devices, the volume of data will continue to grow exponentially. IDC predicts there will be 163 zettabytes (163 trillion gigabytes) of data by 2025.

All of this data is hard for the human brain to comprehend. In fact, it is difficult for the human brain to comprehend numbers larger than five without drawing some kind of analogy or abstraction. Data visualization designers can play a vital role in creating those abstractions. Let’s take an example.
Suppose you compile the company’s profits from 2013 to 2023 and create a line chart. It would be very easy to see the line going steadily up, with a drop only in 2018. So you can observe in a second that the company’s profits rose in every year except 2018, when they dropped. It would not be that easy to get this information so fast from a data table. This is just one demonstration of the usefulness of data visualization. Let’s see some more reasons why visualization of data is so important.

The human mind is very receptive to visual information. That's why data visualization is a powerful tool for communication.

How many 3s can you count?

24872184012387409216590147609856093247209
12562906509852659048275829856809609863095
84390564095878950374509284750989475092984

How many 6s can you count?

04235109021714853253129349421260142530
48105210237103201223104253909042310402
32415010284053296405721603753053446210
35109003253001353015312431040283521006
48053729310432853142109006251002583004
03256053110420163005290105101210140258
84253003164071124053261410284010210352
21352010046853210125384001270320104285
53210041253109042580210421853010642530
48526092305210068531201428061003652810

Why Use Data Visualization?

◦ To make data easier to understand and remember.
◦ To discover unknown facts, outliers, and trends.
◦ To visualize relationships and patterns quickly.
◦ To ask better questions and make better decisions.
◦ To perform competitive analysis.
◦ To improve insights.

Why is Data Visualization Important?

◦ Understanding Complex Data: It helps to simplify complex data sets and make patterns, trends, and correlations more apparent. This visual representation allows analysts and decision-makers to grasp difficult concepts or identify new insights that might not be apparent in raw data.
◦ Communication: Visualizations are powerful tools for communication. They can convey information quickly and efficiently to stakeholders, regardless of their technical expertise. Visuals such as charts, graphs, and maps make data more accessible and understandable, enabling better-informed decisions.
◦ Identifying Trends and Outliers: By presenting data visually, trends, outliers, and patterns can be easily spotted. This helps in understanding historical performance, predicting future trends, and identifying areas that require attention or improvement.
◦ Facilitates Decision-Making: By providing clear insights, data visualization enables faster and more accurate decision-making.
◦ Reveals Hidden Insights: It can uncover insights and anomalies that might not be apparent in textual or numerical data.
◦ Interactive Exploration: Many data visualization tools offer interactivity, allowing users to explore the data in different ways, ask questions, and get immediate visual feedback.
◦ Supports Data-Driven Culture: Promoting the use of data visualization in an organization fosters a data-driven culture, encouraging evidence-based decisions and strategies.

When to Use It

Since large numbers are so difficult to comprehend in any meaningful way, and many of the most useful data sets contain huge amounts of valuable data, data visualization has become a vital resource for decision-makers.
To take advantage of all this data, many businesses see the value of data visualizations in the clear and efficient comprehension of important information, enabling decision-makers to understand difficult concepts, identify new patterns, and get data-driven insights in order to make better decisions.

Benefits of data visualization

The benefits of data visualization include the following:

◦ Actionable insights. A broad spectrum of an organization's personnel can understand visuals presented in business intelligence dashboards. This lets users absorb information quickly, get better insights and figure out the next steps faster.
◦ Exploration of complex relationships. Visualization platforms with advanced capabilities can display complex relationships among data points and metrics, allowing an organization to make faster data-based decisions.
◦ Compelling storytelling. Data dashboards that are visually compelling will maintain the audience's interest with information they can understand.
◦ Accessibility. Visualization tools make data more accessible and understandable, so that laypersons or semi-technical users who aren't data scientists can interpret and analyze it.
◦ Interactivity. Interactive dashboards allow users to click on various aspects of data displays to get more information. This is especially useful for those with less expertise in the subject area covered by the data. Static displays don't allow this.

Advantages and Disadvantages of Data Visualization

Advantages of Data Visualization:

◦ Enhanced Comparison: Visualizing the performance of two elements or scenarios side by side streamlines analysis, saving time compared to traditional data examination.
◦ Improved Methodology: Representing data graphically offers a superior understanding of situations, exemplified by tools like Google Trends, which illustrate industry trends in graphical form.
◦ Efficient Data Sharing: Visual data presentation facilitates effective communication, making information more digestible and engaging compared to sharing raw data.
◦ Sales Analysis: Data visualization aids sales professionals in comprehending product sales trends, identifying influencing factors through tools like heat maps, and understanding customer types, geography impacts, and repeat customer behaviors.
◦ Identifying Event Relations: Discovering correlations between events helps businesses understand external factors affecting their performance, such as online sales surges during festive seasons.
◦ Exploring Opportunities and Trends: Data visualization empowers business leaders to uncover patterns and opportunities within vast datasets, enabling a deeper understanding of customer behaviors and insights into emerging business trends.

Disadvantages of Data Visualization:

◦ Can be time-consuming: Creating visualizations can be a time-consuming process, especially when dealing with large and complex datasets.
◦ Can be misleading: While data visualization can help identify patterns and relationships in data, it can also be misleading if not done correctly. Visualizations can create the impression of patterns or trends that may not exist, leading to incorrect conclusions and poor decision-making.
◦ Can be difficult to interpret: Some types of visualizations, such as those that involve 3D or interactive elements, can be difficult to interpret and understand.
◦ May not be suitable for all types of data: Certain types of data, such as text or audio data, may not lend themselves well to visualization. In these cases, alternative methods of analysis may be more appropriate.
◦ May not be accessible to all users: Some users may have visual impairments or other disabilities that make it difficult or impossible for them to interpret visualizations. In these cases, alternative methods of presenting data may be necessary to ensure accessibility.
Data visualization Tools

There are numerous tools available for data visualization, each with its own set of features and capabilities. Here are some popular data visualization tools:

◦ Tableau: A widely used tool for business intelligence and data visualization, Tableau offers a range of interactive and shareable dashboards.
◦ Microsoft Power BI: A powerful tool for data visualization and business analytics, Power BI allows users to create interactive reports and dashboards.
◦ Google Data Studio: A free tool that integrates with various Google services and other data sources, providing interactive and customizable reports.
◦ D3.js: A JavaScript library for producing dynamic, interactive data visualizations in web browsers. It leverages web standards such as SVG, HTML, and CSS.
◦ QlikView/Qlik Sense: QlikView offers guided analytics applications and dashboards, while Qlik Sense provides more self-service visualization capabilities.
◦ Plotly: A graphing library that makes interactive, publication-quality graphs online. It is available in several programming languages, including Python, R, and JavaScript.
◦ Chart.js: A simple yet flexible JavaScript charting library for designers and developers, offering animated, responsive charts.
◦ Looker: A business intelligence software and big data analytics platform that helps you explore, analyze, and share real-time business analytics.
◦ SAP Lumira: A self-service solution that allows users to create and share data visualizations and stories.
◦ Highcharts: A charting library written in pure JavaScript, offering a wide range of chart types and extensive documentation.
◦ Matplotlib: A plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications.
Data visualization methodology

Data visualization methodology involves the process and techniques used to create visual representations of data to make it more understandable and actionable. Here are the key steps and considerations involved in data visualization methodology:

Define the Purpose and Audience

◦ Purpose: Understand why the data needs to be visualized. Is it for analysis, reporting, storytelling, or monitoring?
◦ Audience: Identify who will be viewing the visualization. Different audiences may have varying levels of expertise and interest.

Before choosing your data visualization design, it’s essential that you know what you want to achieve from your visualizations, and who will be viewing them. If you design based on what you want to communicate to your end viewer, it’s more likely that they will easily be able to grasp that information. Your job is to make it easy for your viewer to make the business decisions they need to make based on the data you are displaying for them. So, you will need to ask yourself what question they are trying to answer with this data and work from there. You will also need to assess how familiar they are with the information you are presenting, and keep in mind their ability to read different kinds of graphs and charts. From there you can decide how simple or complex your visualization can be, and whether you need to add any explanatory notes.

Data Collection and Preparation

Collecting the data: Relevant data is gathered from various sources such as data warehouses and data lakes, operational systems, marketing platforms, etc. The data comes in many different formats and compositions. It is important to understand which data is relevant to what you are trying to achieve and whether it is accurate; this helps the later steps run more smoothly. Gather data from reliable and relevant sources.
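As a small illustration of gathering data that arrives in different formats, the sketch below reads a CSV export and a JSON export into one common list of records. It uses only the Python standard library. The sample data, the field names (region, sales, source) and the helper names from_csv and from_json are all invented for this example and are not part of the original material.

```python
import csv
import io
import json

# Hypothetical exports from two different systems (invented sample data).
CSV_EXPORT = "region,sales\nNorth,120\nSouth,95\n"
JSON_EXPORT = '[{"region": "East", "sales": 130}, {"region": "West", "sales": 88}]'


def from_csv(text, source):
    """Read CSV text into dictionaries, converting sales to an integer."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"region": r["region"], "sales": int(r["sales"]), "source": source}
        for r in rows
    ]


def from_json(text, source):
    """Read a JSON array of objects into the same dictionary layout."""
    return [
        {"region": r["region"], "sales": int(r["sales"]), "source": source}
        for r in json.loads(text)
    ]


# One combined list; the origin of every record is kept for later accuracy checks.
records = from_csv(CSV_EXPORT, "operational_csv") + from_json(JSON_EXPORT, "marketing_json")
for record in records:
    print(record)
```

Keeping a source label on each record is one possible design choice: it makes it easier to trace a suspicious value back to the system it came from.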
Cleaning the data: This stage entails removing mistakes and discrepancies in order to improve the accuracy and quality of the data. This could mean replacing null or missing values, fixing errors and typos, getting rid of duplicates, or addressing outliers. The strategy for cleaning data is determined by the kind of data and the particular needs of the study.

Data cleaning involves:

◦ Removing unnecessary variables
◦ Deleting duplicate rows/observations
◦ Addressing outliers or invalid data
◦ Dealing with missing values
◦ Standardizing or categorizing values
◦ Correcting typographical errors

Data Integration

Data integration:

◦ Combines data from multiple sources into a coherent store.
◦ Careful integration can help reduce and avoid redundancies and inconsistencies in the resulting data set.
◦ This can help improve the accuracy and speed of the subsequent data mining process.

Data Transformation: Convert data into a suitable format for analysis and visualization. Strategies for data transformation are:

◦ Smoothing: noise is removed from the data.
◦ Attribute Construction: new attributes are constructed from the given ones.
◦ Aggregation: summarization, data cube construction.
◦ Normalization: values are scaled to fall within a small, specified range such as 0.0 to 1.0.
◦ Discretization: raw values of a numeric attribute (e.g., age) are replaced by interval labels (e.g., 0–10, 11–20, etc.) or conceptual labels (e.g., youth, adult, senior).

Choose the Right Visualization Type

◦ Charts and Graphs: Bar charts, line graphs, pie charts, scatter plots, etc.
◦ Infographics: Combination of graphics and text to convey complex information.
◦ Maps: Geographical data representations.
◦ Dashboards: Interactive panels with multiple visualizations.

Design Principles

◦ Clarity: Make sure the visualization is easy to understand.
◦ Accuracy: Ensure the data is represented correctly without distortion.
◦ Simplicity: Avoid unnecessary elements that could clutter the visualization.
◦ Consistency: Use consistent colors, fonts, and styles throughout the visualizations.

Use of Tools and Software

◦ Spreadsheet Software: Excel, Google Sheets.
◦ Visualization Tools: Tableau, Power BI, QlikView.
◦ Programming Languages: Python (matplotlib, seaborn, Plotly), R (ggplot2).

Develop and Refine

◦ Prototyping: Create initial drafts of the visualizations.
◦ Feedback: Gather feedback from stakeholders and make necessary adjustments.
◦ Iteration: Refine the visualizations based on feedback and further analysis.

Implement Interactivity

◦ Filters: Allow users to filter data to see different views.
◦ Tooltips: Provide additional information when users hover over elements.
◦ Drill-downs: Enable users to click on elements to see more detailed data.

Test and Validate

◦ Accuracy Check: Verify the data and calculations.
◦ Usability Testing: Ensure the visualization is user-friendly and intuitive.
◦ Performance Testing: Make sure the visualization performs well with large datasets.

Presentation and Communication

◦ Storytelling: Use the visualization to tell a compelling story.
◦ Context: Provide context and explanations to help the audience understand the data.
◦ Engagement: Encourage audience interaction and engagement with the visualization.

Maintain and Update

◦ Regular Updates: Keep the data and visualizations updated with the latest information.
◦ Monitoring: Continuously monitor the performance and effectiveness of the visualizations.
◦ Iteration: Regularly review and improve the visualizations based on feedback and new requirements.

Data Visualization Design:

Data visualization can be a powerful tool, but only when done correctly. As we’ve mentioned, a poorly designed visualization can end up doing more harm than good. So, it’s important to make sure that your data visualizations are effective. When designing your dashboards and visualizations, there are certain principles or tips you should keep in mind to achieve this efficacy.
These will enhance the value and effectiveness of your visualizations. They are:

1. Know your audience and your objective.
2. Choose the right types of visualizations.
3. Make them organized, consistent and intuitive.
4. Give context.
5. Less is more.
6. Use colors wisely.

Know Your Audience and Your Objective: Before choosing your data visualization design, it’s essential that you know what you want to achieve from your visualizations, and who will be viewing them. If you design based on what you want to communicate to your end viewer, it’s more likely that they will easily be able to grasp that information. Your job is to make it easy for your viewer to make the business decisions they need to make based on the data you are displaying for them. So, you will need to ask yourself what question they are trying to answer with this data and work from there. You will also need to assess how familiar they are with the information you are presenting, and keep in mind their ability to read different kinds of graphs and charts. From there you can decide how simple or complex your visualization can be, and whether you need to add any explanatory notes.

Choose the Right Types of Visualizations: In order to choose the right kinds of visualizations for you and your stakeholders, you have to know a little about the different kinds and what purpose they serve.

Make Your Visualizations Organized, Consistent and Intuitive: The whole point of data visualization is that the viewer will understand the data better than if it were in its raw form. It makes sense, then, that the visualizations need to be intuitive and well organized.

6 key components of effective data visualization

Clear and precise

Your data visualization should illuminate insights with clarity, precision, and ease. Whether it shows trends, comparisons, or correlations, make sure your audience can effortlessly comprehend the insights hidden in the data.
◦ Simplicity: Example: A bar chart showing the sales of different products should not include 3D effects, excessive colors, or unnecessary grid lines. A simple 2D bar chart with clear labels is more effective.
◦ Legibility: Example: In a line chart displaying monthly temperatures, use a readable font size for the axis labels and data points. Avoid using overly stylized fonts.
◦ Consistency: Example: If blue is used to represent data for one category in a pie chart, use the same shade of blue to represent the same category in a corresponding bar chart.

Audience-specific

Your visualization will be a guide to your audience, so before creating visuals from the data you must know your audience and their specific requirements. To achieve this, put yourself in your audience’s shoes and ask questions like:

◦ Do I know the context of my visualization?
◦ Am I serving my audience’s purpose?
◦ Is my visualization clear and concise? Or am I providing too much information?

Once you can answer these questions, you will have unlocked the key to creating good data visualizations.

Key elements

Representation

The representation of data is the way you decide to depict data through a choice of physical forms. Whether it is via a line, a bar, a circle, or any other visual variable, you are taking data as the raw material and creating a representation to best portray its attributes.

Presentation

The presentation of data goes beyond the representation of data and concerns how you integrate your data representation into the overall communicated work, including the choice of colors, annotations, and interactive features.

Visual perception abilities

Exploiting our visual perception abilities relates to the scientific understanding of how our eyes and brains process information most effectively, as we've just discussed. This is about harnessing our abilities with spatial reasoning, pattern recognition, and big-picture thinking.
Amplify cognition

Amplifying cognition is about maximizing how efficiently and effectively we are able to process the information into thoughts, insights, and knowledge. Ultimately, the objective of data visualization should be to make readers or users feel like they have become better informed about a subject.

The data visualization methodology

The data visualization methodology presents a sequence of important analytical and design tasks and decisions that need to be handled effectively. As any fellow student of Operational Research (the "Science of Better") will testify, through planning and preparation, and the development and deployment of strategy, complex problems can be overcome with greater efficiency, effectiveness, and elegance. Data visualization is no different.

The design challenges involved in data visualization are predominantly technology related; the creation and execution of a visualization design will typically require the assistance of a variety of applications and programs. However, the focus of this methodology is intended to be technology-neutral, placing an emphasis on the concepting, reasoning, and decision-making.

Visualization design objectives

◦ Strive for form and function
◦ Justify the selection of everything we do
◦ Create accessibility through intuitive design
◦ Never deceive the receiver

Key Factors

◦ Clarifying the purpose of your project
◦ Establishing intent – the visualization's function
◦ Establishing intent – the visualization's tone

Clarifying the purpose of your project

The reason for existing

A project will typically form in one of the following two ways: you've either been asked to do it or you've decided to do something yourself. You might think that's obvious, but these are very different scenarios for working creatively. The self-initiated scenario is a completely self-defined, self-determined, and more flexible context than that of a commissioned project.
It doesn't involve a client, or a brief, or a set of instructions, or restrictions on scope, timescales, or audience. You've got a blank canvas to follow the scent of what it is that motivated you in the first place.

The intended effect

It's important to capture your early thoughts about the intended effect if they do form. Make sure you keep notes, in your sketchbook, on your tablet, or on a cigarette packet. It doesn't matter where, just do it before you forget. While we don't want to be closed off and commit ourselves to the pursuit of the first thing we think of, these instinctive thoughts could prove valuable later on.

For example, a visualization to assist with the monitoring of signals or facilitating a visual lookup of data will be very different from a design that is intended to grab attention or change behavior. Similarly, presenting arguments and telling a story is a very different setting to conducting analysis or 'playing' with data.

Establishing intent – the visualization's function

The intended function of a data visualization concerns the functional experience you create between your design, the data, and the reader/user. There are three broad functions:

◦ Convey an explanatory portrayal of data to a reader
◦ Provide an interface to data in order to facilitate visual exploration
◦ Use data as an exhibition of self-expression

When the function is to explain

Explanatory data visualization is about conveying information to a reader in a way that is based around a specific and focused narrative. It requires a designer-driven, editorial approach to synthesize the requirements of your target audience with the key insights and most important analytical dimensions you are wishing to convey. There are many ways in which you can "explain" data:

◦ An information dashboard in a corporate setting.
◦ A graphic in a newspaper.
◦ An animated design to display patterns of population migration over time.
◦ A physical or ambient visualization designed to draw attention to the sugar content of certain drinks.
Your objective as the designer is to create a graphical display, made accessible through intuitive visual design, that clearly portrays the narrative you are seeking to impart. An example of an explanatory visualization is one based on a chart type called a Sankey diagram, which portrays an analysis of the top ten freshwater-consuming countries and the breakdown of their usage.

When the function is to explore

Exploratory data visualization design is a slightly different matter compared to creating an explanatory piece. Exploratory solutions aim to create a tool, providing the user with an interface to visually explore the data. The key feature that differentiates an exploratory piece from an explanatory piece is the amount of work you have to do as a reader to discover insights.

A scatterplot matrix is a good example of an exploratory visualization design. It is a method used to reveal correlations across a multivariate dataset, enabling the eye to efficiently scan the entire matrix to quickly identify variable pairings with strong or weak relationships.

Exploratory visualizations are not limited to being interactive. Visual analysis can be facilitated through static portrayals of data. The previous example is actually interactive, but a static version would still offer a discovery of the relationships and patterns of the dataset.

When the function is to exhibit data

Data art is characterized by a lack of structured narrative and absence of any visual analysis capability. Instead, the motivation is much more about creating an artifact, an aesthetic representation or perhaps a technical/technique demonstration.
At the extreme end, a design may be more guided by the idea of fun or playfulness, or maybe the creation of ornamentation.

Establishing intent – the visualization's tone

Let's look at the language of two potential motives behind creating a data visualization:

◦ "We need a chart to help monitor…"
◦ "We need to present this in a way that convinces people…"

The reaction of a user reading, for example, a dashboard full of bar charts and line charts to help monitor monthly performance will be quite analytical and pragmatic in style.

Pragmatic and analytical

"A visualization is more effective than another visualization if the information conveyed by one visualization is more readily perceived than the information in the other."

Creating a visualization with a pragmatic tone is about recognizing a need for a design that delivers fast, efficient and precise portrayals of data. Typically, you will have a captive audience, a readership who want to or need to interact and learn from the data. This could be a corporate environment, where people need to simply learn about recent performance of operational activity or undertake visual analysis to discover potentially revealing patterns.

Pattern: Visualization methods that can reveal forms or patterns in the data to give it meaning.

In general, there are two basic types of data visualization:

◦ Exploration, which helps find a story the data is telling you.
◦ Explanation, which tells a story to an audience.

Both types of data visualization must take into account the audience’s expectations.

Data visualization

The primary goal of data visualization is to communicate information clearly and effectively through graphical means. It does not mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful.
To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex data set by communicating its key aspects more intuitively.

Why is data visualization such a powerful tool?

◦ Intuitive
◦ Fast
◦ Flexible
◦ Insightful

Importance of Data Exploration

“Data exploration tasks are those of examining data without having an a priori understanding of what patterns, information, or knowledge it might contain.”

While the majority of research has been on the output and creation of visualizations, the important first step in these tasks is to understand the data that is being presented. The main aspect of exploration is understanding how people perceive a comparison of two measures in a data source. Data exploration is the process of finding the best way to pull out an outcome that a specific audience can perceive.

By displaying data graphically (for example, through scatter plots, density plots or bar charts) users can see if two or more variables correlate and determine if they are good candidates for further analysis, which may include:

◦ Univariate analysis: The analysis of one variable.
◦ Bivariate analysis: The analysis of two variables to determine their relationship.
◦ Multivariate analysis: The analysis of multiple outcome variables.
◦ Principal components analysis: The analysis and conversion of possibly correlated variables into a smaller number of uncorrelated variables.

Emotive and abstract

Abstract visualization, in terms of its tone, is more about creating an aesthetic that portrays a general story or sense of pattern. You might not be able to pick out every data point or category, but there is enough visual information to give you a feel for the physicality of the data.

An example is a project to visualize the global airline transportation network, consisting of all commercial flights worldwide. The routes highlighted are those flights in and out of Toronto Pearson airport.
The project was designed to assess the threat of infectious diseases. The design does not intend to offer an analytical summary of air travel statistics. Instead it creates a more immersive experience of the data, offering a visual interface to establish a greater sense of how interconnected the world is through air travel. It causes us to imagine just how easy it could be for diseases to spread across the globe in a short period of time.

Data visualization is a means to an end, not an end in itself. It's merely a bridge connecting the messenger to the receiver, and its limitations are framed by our own inherent irrationalities, prejudices, assumptions, and irrational tastes. All these factors can undermine the consistency and reliability of any predicted reaction to a given visualization, but that is something we can't realistically influence.

The Seven Stages of Visualizing Data

Why Data Display Requires Planning

Each set of data has unique presentation requirements, and the purpose for which you're using the data set influences those requirements just as much as the data itself. There are dozens of quick tools for developing graphics in office programs, on the Web, and in other software, but complex data sets used for specialized applications require unique treatment. Four issues stand out:

◦ Too Much Information
◦ Data Collection
◦ Thinking About Data
◦ Data Never Stays the Same

Too Much Information

When you hear the term “information overload,” you probably know exactly what it means because it’s something you deal with daily. Performing sophisticated data analysis no longer requires a research laboratory, just a cheap machine and some code. Complex data sets can be accessed, explored, and analyzed by the public in a way that simply was not possible in the past.

Data Collection

We’re getting better and better at collecting data, but we lag in what we can do with it.
Lots of data is out there, but it’s not being used to its greatest potential because it’s not being visualized as well as it could be. This is the greatest challenge of our information-rich era: how can questions about this data be answered quickly, if not instantaneously?

Thinking About Data
We also do very little sophisticated thinking about information itself.

Data Never Stays the Same
We tend to think about data as fixed values to be analyzed, but data is a moving target. How do we build representations of data that adjust to new values every second, hour, or week? This is a necessity because most data comes from the real world, where there are no absolutes. The temperature changes, the train runs late, or a product launch causes the traffic pattern on a web site to change drastically.

Seven Stages
Acquire, Parse, Filter, Mine, Represent, Refine, Interact

Acquire: Obtain the data, whether from a file on a disk or a source over a network.
Parse: Provide some structure for the data's meaning, and order it into categories.
Filter: Remove all but the data of interest.
Mine: Apply methods from statistics or data mining as a way to discern patterns or place the data in mathematical context.
Represent: Choose a basic visual model, such as a bar graph, list, or tree.
Refine: Improve the basic representation to make it clearer and more visually engaging.
Interact: Add methods for manipulating the data or controlling what features are visible.

Acquire
Zip codes in the format provided by the U.S. Census Bureau.

Parse (Structure of acquired data)
After you acquire the data, it needs to be parsed, that is, changed into a format that tags each part of the data with its intended use. Each line of the file must be broken into its individual parts; in this case, it must be split at each tab character. Then, each piece of data needs to be converted to a useful format.
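The Parse step for a tab-delimited file can be sketched in Python as follows. This is a minimal illustration, not the original course material: the column layout (zip code, latitude, longitude, place name), the placeholder sample rows, and the names ZipRecord and parse_zip_lines are all assumptions made for the example. Only the tab delimiter and the idea of converting each piece to a useful format come from the text.

```python
from typing import NamedTuple


class ZipRecord(NamedTuple):
    """One parsed row. The field layout is assumed for illustration."""
    zip_code: str     # kept as text so leading zeros survive
    latitude: float
    longitude: float
    name: str


def parse_zip_lines(lines):
    """Split each tab-delimited line and convert each piece to a useful type."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 4:
            continue  # skip rows that do not match the assumed layout
        zip_code, lat, lon, name = parts
        records.append(ZipRecord(zip_code, float(lat), float(lon), name))
    return records


# Placeholder rows standing in for the acquired file (not real data).
sample = [
    "00001\t10.5\t-20.25\tExample Town\n",
    "00002\t11.0\t-21.0\tSample City\n",
    "\n",
]
records = parse_zip_lines(sample)
```

Keeping the zip code as a string is a deliberate choice in this sketch: converting it to a number would drop leading zeros, while the coordinates are converted to floats so they can be compared and plotted later.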
Mine
Mining the data here is simple: just compare values to find the minimum and maximum.

Represent
This step determines the basic form that a set of data will take. Some data sets are shown as lists, others are structured like trees, and so on. The Represent stage is a linchpin that informs the single most important decision in a visualization project and can make you rethink earlier stages. How you choose to represent the data can influence the very first step (what data you acquire) and the third step (what particular pieces you extract).

Refine (clarify the representation)
In this step, graphic design methods are used to further clarify the representation by calling more attention to particular data (establishing hierarchy) or by changing attributes (such as color) that contribute to readability.

Interact
The next stage of the process adds interaction, letting the user control or explore the data. Interaction might cover things like selecting a subset of the data or changing the viewpoint.

[Figure: Interactions between the seven stages]

Widgets
These widgets provide a means to display data in picture or graph form.
Line Chart: a two-dimensional graph that shows a left-to-right series of data points, with each pair of consecutive data points connected by a straight line.
Pie Chart: a two-dimensional circle divided into differently colored sections, one for each data value.
Column Chart: a chart with vertical rectangular bars representing the amount of a particular discrete value.
Heat Map: a Google map with the results' geographic frequency represented as colored blobs located at distinct points on the map.
Bar Chart: a chart with horizontal rectangular bars representing the amount of a particular discrete value.
Point Map: a Google map with data points representing search results, each associated with a geospatial value.

Data visualization tools