Document Details

SDU University

2024

Meraryslan Meraliyev

Tags

python, data visualization, data visualization libraries, python programming, data analysis

Summary

This is a presentation about Python data visualization techniques, covering Matplotlib, Seaborn, and Plotly. It includes examples, use cases, and tasks. It was created on October 1, 2024, and is intended for an undergraduate audience.

Full Transcript

Python Data Visualization Using Matplotlib, Seaborn, and Plotly
Meraryslan Meraliyev
SDU University
October 1, 2024

Outline
1. Introduction to Data Visualization
2. Matplotlib Basics
3. Seaborn for Advanced Visualization
4. Interactive Visualizations with Plotly
5. Case Study: Real-World Data Visualization
6. Conclusion

What is Data Visualization?
- Visualization is the graphical representation of data.
- Helps in understanding trends, patterns, and outliers in data.

Why Visualize Data?
- Simplifies data interpretation.
- Communicates complex data insights.
- Enables better decision-making.

Common Python Libraries for Visualization
- Matplotlib: Basic and flexible static plots.
- Seaborn: Statistical visualizations with high-level abstraction.
- Plotly: Interactive, dynamic, and 3D visualizations.

Applications of Visualization in Industry
- Business: Sales trends, market analysis.
- Science: Data exploration and research communication.
- Data Science: Machine learning model performance, feature relationships.

Types of Plots and Their Uses
- Line Plot: Trends over time (e.g., stock prices).
- Bar Chart: Comparing categorical data (e.g., sales per region).
- Histogram: Data distribution (e.g., age distribution).
- Scatter Plot: Relationship between two continuous variables (e.g., height vs. weight).
- Heatmap: Correlation or frequency (e.g., correlation matrix).

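To make the plot types listed above concrete, here is a minimal sketch that draws one of each with Matplotlib. All data is synthetic and invented for illustration, and using Matplotlib alone (including `imshow` for the heatmap) is an assumption of this sketch, not something the slides prescribe.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; all data below is made up

fig, axes = plt.subplots(1, 5, figsize=(20, 3.5))

# Line plot: a trend over time
axes[0].plot(np.arange(30), 100 + rng.normal(0, 1, 30).cumsum())
axes[0].set_title("Line: trend over time")

# Bar chart: comparing categories
axes[1].bar(["North", "South", "East", "West"], [350, 480, 230, 510])
axes[1].set_title("Bar: sales per region")

# Histogram: distribution of a single variable
ages = rng.normal(40, 12, 500)
axes[2].hist(ages, bins=20)
axes[2].set_title("Histogram: age distribution")

# Scatter plot: relationship between two continuous variables
height = rng.normal(170, 10, 100)
weight = 0.9 * height - 85 + rng.normal(0, 6, 100)
axes[3].scatter(height, weight)
axes[3].set_title("Scatter: height vs. weight")

# Heatmap: correlation matrix of three made-up variables
corr = np.corrcoef(np.vstack([height, weight, rng.normal(0, 1, 100)]))
im = axes[4].imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
fig.colorbar(im, ax=axes[4])
axes[4].set_title("Heatmap: correlation matrix")

fig.tight_layout()
fig.savefig("plot_types.png")
```

When running interactively, the `matplotlib.use("Agg")` line can be dropped and `fig.savefig(...)` replaced with `plt.show()`.
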
Importance of Data Visualization
- Bridges the gap between raw data and understanding.
- Identifies trends, relationships, and outliers.
- Helps decision-makers interpret data more effectively.

Why Use Matplotlib?
- Highly customizable for static plots.
- Essential for publication-ready visuals.
- Supports a wide range of plot types.

Matplotlib Structure
- Figure: The whole figure or window containing the plot.
- Axes: The area where data is plotted (one figure can contain multiple axes).
- Labels: Axis labels, titles, legends, etc.

Use Case: Stock Price Analysis with Line Plots
- Visualizing the trend of stock prices over time.
- Can identify upward or downward trends.
- Useful for financial analysis and forecasting.

Line Plot Example: Stock Prices

    import matplotlib.pyplot as plt
    import numpy as np  # needed for the random noise below
    import pandas as pd

    # Generate random stock data: a linear upward trend plus Gaussian noise
    dates = pd.date_range(start='2022-01-01', periods=100)
    prices = 100 + np.arange(100) + np.random.randn(100)

    plt.plot(dates, prices)
    plt.title("Stock Price Over Time")
    plt.xlabel("Date")
    plt.ylabel("Price")
    plt.xticks(rotation=45)
    plt.show()

Task: Customize the Line Plot
- Change the line color to red.
- Add grid lines to the plot.
- Use dashed lines for the trend.

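One possible way to approach the customization task above, as a sketch. It reuses the simulated stock data from the line plot example; the output file name and the non-interactive backend are assumptions made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same simulated data as the line plot example: linear trend plus noise
dates = pd.date_range(start="2022-01-01", periods=100)
prices = 100 + np.arange(100) + np.random.randn(100)

# color="red" changes the line color, linestyle="--" makes it dashed
(line,) = plt.plot(dates, prices, color="red", linestyle="--")
plt.grid(True)  # add grid lines
plt.title("Stock Price Over Time")
plt.xlabel("Date")
plt.ylabel("Price")
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig("stock_prices_custom.png")  # hypothetical file name
```

The same styling can also be written with a format string, for example `plt.plot(dates, prices, 'r--')`; the keyword form is used here because it is easier to read.
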
Use Case: Product Sales Comparison with Bar Charts
- Comparing sales across different regions or categories.
- Useful for visualizing categorical data.
- Common in business analytics.

Bar Chart Example: Product Sales

    import matplotlib.pyplot as plt

    categories = ['North', 'South', 'East', 'West']
    values = [350, 480, 230, 510]

    plt.bar(categories, values)
    plt.title("Product Sales by Region")
    plt.xlabel("Region")
    plt.ylabel("Sales (in units)")
    plt.show()

Task: Create a Grouped Bar Chart
- Create a bar chart showing product sales for two categories (e.g., Q1 and Q2).
- Customize the bar colors and add a legend.

Use Case: Distribution of Customer Ages with Histograms
- Histograms are used to visualize data distribution.
- Useful for seeing how data is spread across intervals (e.g., ages).
- Common in demographic studies or data analysis.

Histogram Example: Customer Age Distribution

    import matplotlib.pyplot as plt
    import numpy as np

    # Standard-normal samples stand in for ages here, so the x-axis is centered on 0;
    # realistic ages would need a shifted and scaled distribution
    data = np.random.randn(1000)
    plt.hist(data, bins=30)
    plt.title("Customer Age Distribution")
    plt.show()

Task: Customize the Histogram
- Adjust the number of bins for better clarity.
- Change the color of the histogram bars.

Multiple Subplots in Matplotlib
- Creating multiple plots within the same figure.
- Useful when comparing different datasets or variables.

Multiple Subplots Example

    import matplotlib.pyplot as plt

    # Illustrative data so the snippet runs on its own (x and y are made up;
    # categories and values are reused from the bar chart example)
    x, y = [1, 2, 3, 4], [10, 20, 15, 25]
    categories, values = ['North', 'South', 'East', 'West'], [350, 480, 230, 510]

    # One row, two columns: line plot on the left, bar chart on the right
    fig, (ax1, ax2) = plt.subplots(1, 2)
    ax1.plot(x, y, color='r')
    ax2.bar(categories, values)
    plt.show()

Why Use Seaborn?
- Built on top of Matplotlib for better aesthetics.
- Simplifies the creation of complex statistical plots.
- Ideal for exploratory data analysis.

Seaborn vs. Matplotlib
- Seaborn automates common visualizations that require more manual steps in Matplotlib.
- Seaborn provides cleaner default styles.

Use Case: Exploring Relationships with Pair Plots
- Pair plots show the relationships between multiple features.
- Useful for exploring feature relationships in machine learning datasets.
- Helps identify potential correlations.

Pair Plot Example: Iris Dataset

    import seaborn as sns
    import matplotlib.pyplot as plt

    # Load the built-in Iris dataset
    df = sns.load_dataset('iris')
    sns.pairplot(df, hue='species')
    plt.show()

Task: Create a Pair Plot for Another Dataset
- Use the "tips" dataset in Seaborn.
- Create a pair plot to explore relationships between total bill, tip, and size.

Use Case: Correlation Between Variables with Heatmaps
- Heatmaps are useful for visualizing correlations between variables.
- Used in financial data, stock correlations, or feature selection.
- Color intensity helps highlight the strength of correlations.

Heatmap Example: Correlation Matrix

    # df is the Iris DataFrame loaded in the pair plot example
    # numeric_only=True skips the non-numeric 'species' column (required in pandas 2.0+)
    corr = df.corr(numeric_only=True)
    sns.heatmap(corr, annot=True, cmap="coolwarm")
    plt.title("Correlation Heatmap")
    plt.show()

Task: Customize the Heatmap
- Change the color palette to a "cool" theme.
- Annotate the heatmap to display correlation values.

Why Use Plotly?
- Creates interactive plots for web applications.
- Supports 3D and real-time visualizations.
- Ideal for dashboards and exploratory analysis.

Use Case: Real Estate Data with Interactive Scatter Plots
- Interactive scatter plots allow deeper exploration of relationships.
- Useful in real estate data (e.g., price vs. square footage).
- Can use hover tools to explore more information dynamically.

Interactive Scatter Plot Example: Gapminder Dataset

    import plotly.express as px

    # Load dataset
    df = px.data.gapminder()
    fig = px.scatter(df, x='gdpPercap', y='lifeExp', color='continent',
                     hover_name='country', size='pop', log_x=True)
    fig.show()

Task: Customize the Scatter Plot
- Add hover text to display additional features (e.g., continent).
- Change the color scheme based on another feature.

Use Case: Visualizing Stock Prices with Interactive Line Plots
- Interactive line plots allow exploration of stock price trends over time.
- Useful for financial analysis and stock market dashboards.
- Users can zoom and pan to explore data at different time scales.

Interactive Line Plot Example

    import plotly.graph_objects as go

    x = ['2020-01-01', '2020-01-02', '2020-01-03']
    y = [100, 105, 102]
    fig = go.Figure([go.Scatter(x=x, y=y, mode='lines+markers')])
    fig.show()

Task: Create an Interactive Line Plot for Another Dataset
- Use the "gapminder" dataset in Plotly Express.
- Visualize life expectancy over time for different continents.

Visualizing Sales Trends
- Analyze historical sales data for a company.
- Visualize trends and seasonality using line and bar plots.
- Explore relationships between sales and marketing spend using scatter plots.

Code: Sales Trends Line Plot

    import pandas as pd
    import matplotlib.pyplot as plt

    # Simulated sales data
    # freq='M' means month-end; pandas 2.2+ prefers the alias 'ME' and warns on 'M'
    data = {'Date': pd.date_range(start='2022-01-01', periods=12, freq='M'),
            'Sales': [230, 250, 220, 300, 340, 360, 400, 380, 420, 500, 460, 480]}

    df = pd.DataFrame(data)
    plt.plot(df['Date'], df['Sales'], marker='o')
    plt.title("Sales Trends Over Time")
    plt.xlabel("Date")
    plt.ylabel("Sales")
    plt.xticks(rotation=45)
    plt.show()

Task: Analyze Real-World Data
- Choose a dataset (e.g., financial, weather, or sales).
- Create visualizations using Matplotlib, Seaborn, and Plotly.

Summary of Key Visualization Tools
- Matplotlib: Best for static and publication-ready plots.
- Seaborn: High-level interface for statistical graphics.
- Plotly: Ideal for interactive plots and dashboards.

Further Reading and Resources
- Matplotlib Documentation
- Seaborn Documentation
- Plotly Documentation
