Pandas Data Visualization (BMAN73701) PDF

Document Details

Uploaded by ExuberantKunzite6723

The University of Manchester, Alliance Manchester Business School

Manuel López-Ibáñez

Tags

Python programming, data analysis, pandas, data visualization

Summary

This is a lecture on programming in Python for business analytics, covering tabular data with pandas and data visualization. The lecture notes, from the University of Manchester's Alliance Manchester Business School, introduce pandas and then data visualization with Matplotlib.

Full Transcript

The University of Manchester
Alliance Manchester Business School
BMAN73701 Programming in Python for Business Analytics
Week 4: Lecture 1
Tabular Data (Pandas) and Data Visualisation
Prof. Manuel López-Ibáñez
[email protected]
Office hours: Mon 4pm-5pm, Fri 9am-10am
https://calendly.com/manuel-lopez-ibanez

Agenda

– Intro to Pandas
– Intro to Data Visualisation
– Matplotlib + Pandas
– Matplotlib in detail
– Programming visualisations

Part 1: Tabular Data using Pandas
Part 2: Data Visualisation

Intro to Pandas

[Figures: pandas logo, with the equation y_it = β' x_it + μ_i + ε_it; O'Reilly book cover by Wes McKinney, subtitled "Data Wrangling with Pandas, NumPy, and IPython"]

pandas
– Python library for tabular data manipulation and analysis
– Strong support for time-series and visualisation
– Website: http://pandas.pydata.org/
– Documentation: http://pandas.pydata.org/pandas-docs/stable/

    import pandas as pd  # Just a shortcut.

pandas Basics

Basic structure: DataFrame
– A table with columns and rows
– Columns have “names” (labels), they are obligatory
– Rows have “names” (index), they are generated if not given

             Date  Fuel_Price  IsHoliday
    0  2010-02-05         0.1       True
    1  2010-02-12         0.2      False
    2  2010-02-19         0.3       True
    3  2010-02-26         0.4       True
    4  2010-03-05         0.5       True

From Lists, Dicts, … to DataFrame

From a list

    In [2]: data = [['a', 1], ['b', 2]]
    In [3]: pd.DataFrame(data, columns=['model', 'price'])
    Out[3]:
      model  price
    0     a      1
    1     b      2

From a dictionary

    In [4]: data = {'model': ['a', 'b'], 'price': [1, 2]}
    In [5]: pd.DataFrame(data)
    Out[5]:
      model  price
    0     a      1
    1     b      2

pandas Read/Write

    Date        Fuel_Price  IsHoliday  Unemployment
    2010-02-05         0.1       True         8.625
    2010-02-12         0.2      False         8.625
    2010-02-19         0.3       True         8.335
    2010-02-26         0.4       True
    2010-03-05         0.5       True         8.335

Read from / Write to Excel in Comma-Separated-Values (CSV) format

    import pandas as pd
    df = pd.read_csv('filename.csv')
    df.to_csv('filename.csv')

From Excel to DataFrame

    In [1]: import pandas as pd
    In [2]: df = pd.read_csv('C:/work/store1_features.csv.zip')
    In [3]: df.head()
    Out[3]:
             Date  Store  Temperature  Fuel_Price  MarkDown1  MarkDown2  \
    0  2010-02-05      1        42.31       2.572        NaN        NaN
    1  2010-02-12      1        38.51       2.548        NaN        NaN
    2  2010-02-19      1        39.93       2.514        NaN        NaN
    3  2010-02-26      1        46.63       2.561        NaN        NaN
    4  2010-03-05      1        46.50       2.625        NaN        NaN

       MarkDown3  MarkDown4  MarkDown5         CPI  Unemployment  IsHoliday
    0        NaN        NaN        NaN  211.096358         8.106      False
    1        NaN        NaN        NaN  211.242170         8.106       True
    2        NaN        NaN        NaN  211.289143         8.106      False
    3        NaN        NaN        NaN  211.319643         8.106      False
    4        NaN        NaN        NaN  211.350143         8.106      False

    In [4]: type(df)
    Out[4]: pandas.core.frame.DataFrame

    In [5]: df.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 182 entries, 0 to 181
    Data columns (total 12 columns):
    Date            182 non-null object
    Store           182 non-null int64
    Temperature     182 non-null float64
    Fuel_Price      182 non-null float64
    MarkDown1       90 non-null float64
    MarkDown2       73 non-null float64
    MarkDown3       89 non-null float64
    MarkDown4       90 non-null float64
    MarkDown5       90 non-null float64
    CPI             169 non-null float64
    Unemployment    169 non-null float64
    IsHoliday       182 non-null bool
    dtypes: bool(1), float64(9), int64(1), object(1)
    memory usage: 15.9+ KB

The Index

    In [6]: df = pd.read_csv('C:/work/store1_features.csv.zip')
       ...: df['Date'] = pd.to_datetime(df['Date'])
       ...: df.index = df['Date']
       ...: df.head()
    Out[6]:
                     Date  Store  Temperature  Fuel_Price  MarkDown1  MarkDown2  \
    Date
    2010-02-05 2010-02-05      1        42.31       2.572        NaN        NaN
    2010-02-12 2010-02-12      1        38.51       2.548        NaN        NaN
    2010-02-19 2010-02-19      1        39.93       2.514        NaN        NaN
    2010-02-26 2010-02-26      1        46.63       2.561        NaN        NaN
    2010-03-05 2010-03-05      1        46.50       2.625        NaN        NaN

    In [7]: df.info()
    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 182 entries, 2010-02-05 to 2013-07-26
    Data columns (total 12 columns):
    Date            182 non-null datetime64[ns]
    Store           182 non-null int64
    Temperature     182 non-null float64
    Fuel_Price      182 non-null float64
    MarkDown1       90 non-null float64
    MarkDown2       73 non-null float64
    MarkDown3       89 non-null float64
    [...]
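The DataFrame construction, CSV reading and index steps shown on the slides above can be tried together in one short script. This is a minimal sketch, not the lecture's own code: it assumes pandas is installed, and it reads from an in-memory string (the hypothetical `csv_text`, filled with the values from the Read/Write slide) instead of a file on disk, so it runs without the course data.

```python
import io

import pandas as pd

# From a list of rows; the column labels are supplied separately.
from_list = pd.DataFrame([["a", 1], ["b", 2]], columns=["model", "price"])

# From a dictionary mapping each column label to its values.
from_dict = pd.DataFrame({"model": ["a", "b"], "price": [1, 2]})

# Assumption: in-memory CSV text standing in for 'filename.csv'.
csv_text = """Date,Fuel_Price,IsHoliday,Unemployment
2010-02-05,0.1,True,8.625
2010-02-12,0.2,False,8.625
2010-02-19,0.3,True,8.335
2010-02-26,0.4,True,
2010-03-05,0.5,True,8.335
"""
df = pd.read_csv(io.StringIO(csv_text))

# Dates are read as text by default; convert them, then use them as the index.
df["Date"] = pd.to_datetime(df["Date"])
df.index = df["Date"]

# With no path argument, to_csv returns the CSV as a string.
round_trip = df.to_csv(index=False)

df.info()
```

The empty Unemployment cell is read as a missing value (NaN), which is why `df.info()` reports fewer non-null entries for that column than for the others.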
Hierarchical clustering

Tree diagrams: Dendrograms

[Figure: dendrogram clustering car models; legible leaf labels include DUSTER, DODGE, TOYOTA, LOTUS, VOLVO]

Treemaps

[Figure: Treemap of Benin's exports by product category, 2009. Total: $589M. Legible labels: Poultry Meat; Coconuts, Brazil Nuts, and Cashews 12.12%; Rice; Other Vegetable Residues 1.51%]

Relationships: Chord diagram

[Figure: chord diagram]

Networks/Graphs

https://networkx.github.io/documentation/stable/

Geographical Data: Contour maps

[Figure: contour map]

Matplotlib Basemap Toolkit: https://matplotlib.org/basemap/stable/users/examples.html

Geographical Data: Choropleth Maps

[Figure: 2011 US Agriculture Exports by State (Hover for breakdown); colour scale in Millions USD]

https://plotly.com/python/choropleth-maps/
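For the hierarchical clustering slide, a dendrogram can be computed in a few lines. The slides do not say which library produced the figure; the sketch below assumes SciPy's `scipy.cluster.hierarchy` module, and the four labelled points are hypothetical toy data, not the car dataset from the slide.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical toy data: four items described by two numeric features.
labels = ["A", "B", "C", "D"]
features = np.array([[1.0, 1.0], [1.1, 0.9], [5.0, 5.0], [5.2, 4.8]])

# Agglomerative clustering; "average" linkage is an arbitrary choice here.
merges = linkage(features, method="average")

# no_plot=True returns the tree layout without drawing anything. Remove it
# (with Matplotlib installed) to draw the dendrogram on the current axes.
tree = dendrogram(merges, labels=labels, no_plot=True)

print(tree["ivl"])  # leaf labels in left-to-right order
```

Each row of `merges` records one merge step: the two clusters joined, the distance between them, and the size of the new cluster.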
