NumPy Programming Lecture Notes
The University of Manchester, Alliance Manchester Business School
Manuel López-Ibáñez
Summary
These lecture notes cover numerical analysis using the NumPy library in Python. The document provides information on vectors, matrices, broadcasting, and other NumPy-related concepts.
Full Transcript
The University of Manchester, Alliance Manchester Business School
BMAN73701 Programming in Python for Business Analytics
Week 3: Lecture 2
Numerical Analysis
Prof. Manuel López-Ibáñez
[email protected]
Office hours: Mon 4pm-5pm, Fri 9am-10am
https://calendly.com/manuel-lopez-ibanez

Contact
Manuel López-Ibáñez (or Lopez-Ibanez)
Discussion Board in Blackboard
Office hours (AMBS 3.050): Mon 4pm-5pm, Fri 9am-10am
https://calendly.com/manuel-lopez-ibanez
Email: [email protected]
– 22 711 emails received last year!
– Allow me 2-3 days to reply
– If no reply, please remind me
– Include BMAN24771 in the subject

What is new?
From basics to advanced analytics
Complete Python code available in BB
Labs: Less exercise-like, more case-study-like

Plan
Week 3: Numerical Analysis (NumPy)
Week 4, Lecture 1: Data Exploration and Visualization (Pandas + Matplotlib)
Week 4, Lecture 2: Data preprocessing and preparation (Pandas + Scikit-learn)
Week 5: Machine learning (Scikit-learn)

Week 3: Lecture 2, Numerical Analysis
Part 1: Vectors and matrices
Part 2: Broadcasting
Part 3: Aggregations and other useful operations

Summing up 10M numbers
a = list(range(10**7))

For-loop:
    s = 0
    for i in a:
        s += i
    Time: 0.1 seconds

sum() over a list:
    s = sum(a)
    Time: 0.05 seconds

np.sum():
    x = np.array(a)
    s = np.sum(x)
    Time: 0.0009 seconds

NumPy
Good integration with Pandas and Matplotlib
Scikit-learn (ML) is built on top of NumPy
Fast computations
Large multi-dimensional arrays (vectors and matrices)
Reference docs: https://numpy.org/doc/stable/reference/index.html
import numpy as np  # Just a shortcut.

NumPy arrays
import numpy as np

Vectors: a = [0 1 2]
In: a = np.array([0,1,2])
Out: array([0, 1, 2])
In: a.shape
Out: (3,)  # 1D (3 columns)

Matrices: A = [[0 1 2], [3 4 5]]
In: A = np.array([[0, 1, 2], [3, 4, 5]])
Out: array([[0, 1, 2],
            [3, 4, 5]])
In: A.shape
Out: (2,3)  # 2D (rows, columns)

NumPy arrays ≠ Lists

List:
In: a = list(range(4))
Out: [0, 1, 2, 3]
In: a * 2  (same result as a + a)
Out: [0, 1, 2, 3, 0, 1, 2, 3]
In: a.append(10)
    a
Out: [0, 1, 2, 3, 10]

NumPy array:
In: a = np.arange(4)
Out: array([0, 1, 2, 3])
In: a * 2  (same result as a + a)
Out: array([0, 2, 4, 6])
In: a = np.append(a, 10)
    a
Out: array([0, 1, 2, 3, 10])

Indexing: List vs NumPy
A = [[0 1 2], [3 4 5]]

# A list of lists
A = [[0,1,2], [3,4,5]]
Return first row, first column:  A[0][0]
Return first row:                A[0]
Return first column:             ??
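One possible answer to the "??" for the list of lists, as a minimal sketch: a list of lists has no column index, so the first column has to be collected row by row, for instance with a list comprehension. The names `A_list` and `A_np` are illustrative and do not come from the slides.

```python
import numpy as np

# A list of lists: rows are easy to reach, columns are not.
A_list = [[0, 1, 2], [3, 4, 5]]
first_element = A_list[0][0]             # first row, first column
first_row = A_list[0]                    # first row
first_col = [row[0] for row in A_list]   # first column, collected row by row

# The same data as a NumPy array (illustrative name A_np).
A_np = np.array(A_list)
first_col_np = A_np[:, 0]                # all rows, column 0

print(first_element, first_row, first_col)  # 0 [0, 1, 2] [0, 3]
print(first_col_np)                          # [0 3]
```

The list version has to touch every row explicitly, whereas the array version expresses "all rows, column 0" in a single index.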
# A NumPy matrix
A = np.array([[0,1,2], [3,4,5]])
Return first row, first column:  A[0, 0]
Return first row:                A[0, :]
Return first column:             A[:, 0]

Slice Notation
x[START:END:STEP]
Start counting at START (default 0)
Stop counting before END (default len or num. rows/columns)
Increment by STEP (default 1)
x[:]      same as x[0:len(x):1]
x[0:2]    same as x[[0,1]] or x[range(2)]
x[1:]     same as x[range(1,len(x))]
x[:5]     same as x[range(5)]
x[:]      same as x[range(len(x))]
x[:0:-1]  same as x[[len(x)-1,len(x)-2,…,1]]
If STEP < 0, the default START is len(x) - 1 and the default END lies just past the first element, so x[::-1] runs down to and including index 0. (Writing END = -1 explicitly would mean the last element instead.)

Slice Notation
For a matrix A with 3 rows and 3 columns (indices 0, 1, 2):
Expression     Shape
A[0:2, 1:3]    (2,2)
A[:2, 1:]      (2,2)
A[2]           (3,)
A[2, :]        (3,)
A[2:, :]       (1,3)
A[:, :2]       (3,2)
A[:, [0,1]]    (3,2)
A[:2, 1]       (2,)
A[:2, 1:2]     (2,1)

Quiz
x = np.array([ [3,2,1], [4,5,6] ])
X = [[3 2 1], [4 5 6]]
In: x[:, 1]
Out: array([2, 5])
In: x[-1, :]
Out: array([4, 5, 6])
In: x[:2, 1:]
Out: array([[2, 1],
            [5, 6]])
In: x[::-1, [2,1,0]]
Out: array([[6, 5, 4],
            [1, 2, 3]])

Boolean Indexing
# A NumPy vector
x = np.array([0,3,1,4,2,5])
x > 2       [False,True,False,True,False,True]
x[x > 2]    [3,4,5]
# A list
x = [0,3,1,4,2,5]
x[x > 2]    Error !!!
# A NumPy Matrix
A = np.array([[0,3,1], [4,2,5]])
A[A > 2]    [3,4,5]

Copies vs Views
A = [[0 1 2], [3 4 5]]

B = A           No copy is made: B refers to the same array, so it behaves like a view
B[0, 0] = 10
A[0, 0] value?

B = A.copy()    Creates a copy
B[0, 0] = 20
A[0, 0] value?

Slices are views, not copies!
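The difference between plain assignment, a slice, and an explicit copy can be checked directly. The sketch below uses `np.shares_memory` to test whether two arrays use the same underlying data; the names `row_view` and `row_copy` are illustrative and not taken from the slides.

```python
import numpy as np

A = np.array([[0, 1, 2], [3, 4, 5]])

B = A                       # plain assignment: the very same array object
row_view = A[0, :]          # slice: a view on A's data
row_copy = A[0, :].copy()   # explicit copy: independent data

print(B is A)                           # True
print(np.shares_memory(A, row_view))    # True
print(np.shares_memory(A, row_copy))    # False

row_view[0] = 10    # writes through to A
row_copy[1] = 99    # does not touch A
print(A[0, 0], A[0, 1])                 # 10 1
```

When a function must not modify its input, calling `.copy()` on the slice first is the safe choice, at the cost of extra memory.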
Modifying a slice modifies the original array, the same as with B = A:
A[0, 0] = 1
row0 = A[0, :]    # returns first row
row0[0] = 5       # change first element of first row
A[0, 0] value?
slice_copy = A[0, :].copy()  # Creates a copy!

Example: Total gains and losses
Given a matrix X where X_{tj} is the net profit of department j in time period t, calculate:
gains = \sum_{t}^{T} \sum_{j}^{D} X_{tj} \quad \text{if } X_{tj} > 0
losses = \sum_{t}^{T} \sum_{j}^{D} X_{tj} \quad \text{if } X_{tj} < 0

for-loop:
    gains = 0
    losses = 0
    for t in range(X.shape[0]):
        for j in range(X.shape[1]):
            if X[t,j] > 0:
                gains += X[t,j]
            else:
                losses += X[t,j]
    Time: 0.04 seconds

No loops!
    gains = np.sum(X[X > 0])
    losses = np.sum(X[X < 0])
    Time: 0.004 seconds

(4*365 days, 50 depts)

Element-wise operators
(Most) operations are element-wise (+ - / * **)
A * 2          # multiply each element of A by 2
A ** 2         # square each element of A
A * B          # element-wise product (not matrix product)
np.dot(A, B)   # matrix product: A × B

Broadcasting!
A + x          # Matrix (2,3) + Vector (3,)
[[3 2 1], [4 5 6]] + [0 1 2] = ?

Recap
NumPy arrays represent vectors and matrices ≠ Lists!
Indexing NumPy arrays:
– Slicing similar to lists
– More powerful than lists
– Boolean indexing
Element-wise mathematical operations
Mathematical operations that apply to many elements of a NumPy array are much faster than indexing with for-loops

Part 2: Broadcasting

Broadcasting
[Figure: broadcasting of arrays with different shapes. Source: https://scipy-lectures.org/intro/numpy/operations.html#broadcasting, Creative Commons Attribution 4.0 International License (CC-by), http://creativecommons.org/licenses/by/4.0/]

Broadcasting
x = np.array([0,10,20,30])
y = np.array([0,1,2])
In: x + y
Out: ValueError: operands could not be broadcast together with shapes (4,) (3,)
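The error for shapes (4,) and (3,) follows from NumPy's broadcasting rule: shapes are compared from the last dimension backwards, and two dimensions are compatible only if they are equal or one of them is 1. The sketch below checks this with `np.broadcast_shapes`, which assumes NumPy 1.20 or later; the helper `compatible` is a hypothetical name introduced only for illustration.

```python
import numpy as np

x = np.array([0, 10, 20, 30])   # shape (4,)
y = np.array([0, 1, 2])         # shape (3,)

def compatible(shape_a, shape_b):
    """Hypothetical helper: True if NumPy can broadcast the two shapes."""
    try:
        np.broadcast_shapes(shape_a, shape_b)
        return True
    except ValueError:
        return False

print(compatible(x.shape, y.shape))        # False: 4 vs 3 in the last dimension
print(compatible((4, 1), y.shape))         # True: a dimension of 1 can be stretched
print(np.broadcast_shapes((4, 1), (3,)))   # (4, 3)
```

Checking shapes this way avoids building the arrays first, which can be convenient when the operands are large.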
Reshaping x into a column of shape (4,1) makes the shapes compatible:
In: x.reshape((4,1)) + y
Out: array([[ 0,  1,  2],
            [10, 11, 12],
            [20, 21, 22],
            [30, 31, 32]])

Reshaping
x = np.array([0,10,20,30])
In: x
Out: array([ 0, 10, 20, 30])
In: x.reshape((4,1))
Out: array([[ 0],
            [10],
            [20],
            [30]])
In: x.reshape((2,2))
Out: array([[ 0, 10],
            [20, 30]])

Example: Total of outer product
Given two vectors x, y, calculate the sum of every pairwise product of their elements:
x = [1 10 20 30]
y = [2 3 4 5]
total = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i \cdot y_j = \sum_{i,j} (x \times y^T)_{ij}

for-loops:
    n = x.shape[0]
    total = 0
    for i in range(n):
        for j in range(n):
            total += x[i] * y[j]

No loops!
    n = x.shape[0]
    total = np.sum(x.reshape((n,1)) * y)

Here x.reshape((n,1)) is broadcast to [[1 1 1 1], [10 10 10 10], [20 20 20 20], [30 30 30 30]] and y to four rows of [2 3 4 5] before the element-wise product.

Broadcasting and reshape (QUIZ)
In: X = np.array([[1,2,3],[4,5,6]])
    a = np.array([0,1,0])
    print(X * a)
Out: [[0 2 0]
      [0 5 0]]
In: b = np.array([0,1])
    print(X * b)
Out: ValueError: operands could not be broadcast together with shapes (2,3) (2,)
In: b = b.reshape((2,1))
    print(X * b)
Out: [[0 0 0]
      [4 5 6]]

Recap
Broadcasting allows mathematical operations between NumPy arrays of different shapes
If shapes cannot be broadcast: Error!
Mathematical operations using broadcasting are, compared with for-loops:
– Faster to execute
– Shorter to write

Part 3: Aggregations and other useful operations

Aggregations (Reductions)
Some operations aggregate: sum, min, max, mean, all, any, ...
A = [[0 1 2], [3 4 5]]  (axis = 0 runs down the rows; axis = 1 runs across the columns)
np.min(A)             returns 1 number
np.min(A, axis = 0)   min along rows, returns 1 number per column
np.min(A, axis = 1)   min along columns, returns 1 number per row

Maximum Mean Squared Error (QUIZ)
P = \begin{pmatrix} P_{11} & \cdots & P_{1n} \\ \vdots & \ddots & \vdots \\ P_{k1} & \cdots & P_{kn} \end{pmatrix}    each row gives the n predictions of one of the k ML methods
obs = [obs_1 \cdots obs_n]    the actual observed values
Calculate the maximum Mean Squared Error (MSE), given as:
maxMSE = \max_{j=1,\dots,k} \left( \frac{1}{n} \sum_{i=1}^{n} (P_{ji} - obs_i)^2 \right)
maxMSE = np.max(np.mean((P-obs)**2, axis = 1))

Functions vs. Methods
Most functions in NumPy have an equivalent method
np.min(A)              A.min()
np.min(A, axis = 0)    A.min(axis = 0)
Methods can only be applied to NumPy arrays!
np.min([1,2,3])    OK
[1,2,3].min()      Error
Some methods modify the array in-place!

Boolean arrays: Any vs. All
b = np.array([1, 1, 0, 0])  # 1 is True, 0 is False
np.logical_not(b)
np.logical_and(b, b)
np.logical_or(b, b)
np.all(b)  # all True?
np.any(b)  # any True?
B = np.array([[1,0],[0,1]])  (axis = 0 runs down the rows; axis = 1 runs across the columns)
np.all(B, axis = 0)    all True along rows? returns 1 value per column
np.any(B, axis = 1)    any True along columns? returns 1 value per row

Quiz
A = [[-1 2 3], [-4 -5 6]]
Are there negative values?                      np.any(A < 0)
Are all values negative?                        np.all(A < 0)
Which columns have only negative values?        np.all(A < 0, axis = 0)
Which rows have at least one negative value?    np.any(A < 0, axis = 1)

Sorting
Direct sorting: np.sort(array, axis=)
np.sort(a)            sort ascending
-np.sort(-a)          sort descending
np.sort(A, axis=0)    sort each column (along rows)
x = np.array([11,12,10,9])
In: np.sort(x)
Out: [9,10,11,12]
In: -np.sort(-x)
Out: [12,11,10,9]

Indirect Sorting
Direct sorting (ascending): np.sort(array, axis=)
Indirect sorting: np.argsort(array, axis=), np.argmin(array, axis=), np.argmax(array, axis=)
x = np.array([12,11,10,9])
In: np.sort(x)
Out: [9,10,11,12]
In: np.argsort(x)
Out: [3,2,1,0]
In: x[np.argsort(x)]
Out: [9,10,11,12]
In: np.max(x)
Out: 12
In: np.argmax(x)
Out: 0

Minimisation by indirect sorting
Find the x value that produces the minimum of cos(x) between [0, 6] with a precision of 0.0000001
x = np.arange(0, 6, 0.0000001)
y = np.cos(x)
i = np.argmin(y)
print(x[i])
Out: 3.1415926999999
[Figure: plot of y = cos(x) for x between 0 and 6]

numpy.random
Sub-module of NumPy with lots of functions for random number generation, random distributions, etc.
Different from the built-in module (import random). Better not to mix the two, to avoid confusion!
Basic functions:
np.random.permutation(array)
np.random.rand(N)
np.random.randn(N)
np.random.seed(N)

SciPy
Library for numerical analysis and scientific computations
scipy.optimize (BMAN60101): Multi-variate numerical optimization, linear programming, non-linear optimization, differential evolution
scipy.linalg: Faster (than np.linalg) linear algebra, matrix inversion, determinant, norms, decompositions, eigenvectors, etc.
scipy.stats (BMAN71791): Statistical distributions, tests, trimmed mean, geometric mean, interquartile range, etc.
Documentation: https://docs.scipy.org/doc/scipy/reference/

Recap
Aggregations (reductions): sum, min, max, mean, all, any, ...
– Along axis=0 or axis=1
np.sort()
np.argsort(): indirect sorting
np.argmin(), np.argmax()
SciPy: lots of useful mathematical and statistical functions using NumPy arrays

Going further
NumPy User Guide: https://docs.scipy.org/doc/numpy/user/index.html
Advanced numerical tutorial: https://lectures.scientific-python.org

Next week
pandas: Python library for data manipulation and analysis
Matplotlib:
Advanced customisation requires using Matplotlib functions
Complex plots require Matplotlib concepts
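As a closing sketch of the Part 3 recap, the maximum-MSE calculation can be combined with indirect sorting to find not only the largest error but also which method produced it. The values of `P` and `obs` below are made up for illustration, and the names `mse`, `worst`, and `ranking` are not from the slides.

```python
import numpy as np

# Made-up predictions: 2 methods (rows) x 3 observations (columns).
P = np.array([[1.0, 2.0, 3.0],
              [2.0, 2.0, 2.0]])
obs = np.array([1.0, 2.0, 4.0])

# (P - obs) broadcasts obs across the rows; axis=1 averages within each row.
mse = np.mean((P - obs) ** 2, axis=1)   # one MSE per method
max_mse = np.max(mse)                   # aggregation: the largest MSE
worst = np.argmax(mse)                  # indirect: index of the method with that MSE
ranking = np.argsort(mse)               # method indices ordered from lowest to highest MSE

print(worst)     # 1
print(ranking)   # [0 1]
```

Using `np.argmax` and `np.argsort` keeps the link between each error and the row it came from, which `np.max` and `np.sort` alone would lose.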