NumPy Arrays & Vectors in Python (University of Petra) PDF

Summary

These notes provide an introduction to NumPy, a Python library used for working with large, multi-dimensional arrays. The document outlines basic array creation, indexing, and manipulations, including vectorization techniques.

Full Transcript

University of Petra
Faculty of Information Technology
Department of Data Science and Artificial Intelligence
Part 2: Arrays and Vectors Programming for Computation using NumPy
Data Science 606315 (2025-2024 First Semester)
Dr. Hossam M. Mustafa

Contents

1. Introduction
2. Create an Array
3. Array Indexing
4. Array Slicing
5. Data Types
6. Array Copy vs View
7. Array Shape
8. Array Iterating
9. Joining Array
10. Splitting Array
11. Searching Arrays
12. Sorting Arrays
13. Filtering Arrays
14. Random Numbers
15. NumPy ufuncs

Introduction

NumPy is a Python library used for working with arrays. NumPy stands for Numerical Python. It also has functions for working in linear algebra, Fourier transforms, and matrices. NumPy was created in 2005 by Travis Oliphant. It is a free, open-source project.

Why use NumPy?

Python lists can serve the purpose of arrays, but they are slow to process. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in NumPy is called ndarray, and it provides a lot of supporting functions. Arrays are used very frequently in data science, where speed and resources are very important.

Creating Arrays

Import NumPy:
Import it into your applications by adding the import keyword. NumPy is usually imported under the np alias.

    import numpy as np

To create an ndarray, we can pass a list, tuple, or any array-like object into the array() function, and it will be converted into an ndarray.

    import numpy as np
    arr = np.array((1, 2, 3, 4, 5))
    print(arr)

Create a NumPy ndarray Object:
The array object in NumPy is called ndarray. We can create a NumPy ndarray object by using the array() function.

Dimensions in Arrays:
A dimension in arrays is one level of array depth (nested arrays).

0-D arrays, or scalars, are the elements in an array. Each value in an array is a 0-D array.

Creating an ndarray from a list and checking its type:

    import numpy as np
    arr = np.array([1, 2, 3, 4, 5])
    print(arr)
    print(type(arr))  # <class 'numpy.ndarray'>

A 0-D array:

    import numpy as np
    arr = np.array(40)
    print(arr)

1-D Arrays:
An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array. These are the most common and basic arrays.

    arr = np.array([1, 2, 3, 4, 5])

2-D Arrays:
An array that has 1-D arrays as its elements is called a 2-D array. These are often used to represent matrices.

    arr = np.array([[1, 2, 3], [4, 5, 6]])

3-D Arrays:
An array that has 2-D arrays (matrices) as its elements is called a 3-D array. These are often used to represent a 3rd-order tensor.

    arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

Checking the dimensions of the Arrays:
NumPy arrays provide the ndim attribute, which returns an integer giving the number of dimensions the array has.

    a = np.array(40)
    b = np.array([1, 2, 3, 4, 5])
    c = np.array([[1, 2, 3], [4, 5, 6]])
    print(a.ndim)  # 0
    print(b.ndim)  # 1
    print(c.ndim)  # 2

Examples

Create a null vector of size n, or a null matrix of size n x m.

    arr = np.zeros(10)
    arr2 = np.zeros((2, 2))

Create a vector with values ranging from n to m (m itself is excluded).

    arr = np.arange(10, 50)

Create an n x n identity matrix.

    arr = np.eye(3)

Create a ones vector of size n.

    arr = np.ones(10)
    arr2 = np.ones((2, 2))

Array Indexing

Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. Indexes start at 0.

    arr = np.array([1, 2, 3, 4])
    print(arr[0])  # the first element

Access 2-D Arrays:
To access elements from 2-D arrays we can use comma-separated integers representing the dimension and the index of the element.

    arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
    print('2nd element on 1st dim: ', arr[0, 1])

Negative Indexing:
Use negative indexing to access an array from the end.

    arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
    print('Last element from 2nd dim: ', arr[1, -1])

Array Slicing

Slicing in Python means taking elements from one given index to another given index.

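A minimal sketch contrasting an index with a slice, reusing the 2-D array from the indexing examples above. The variable names element and sub are illustrative and do not come from the notes.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# Indexing with one integer per dimension returns a single element.
element = arr[1, -1]      # last element of the second row

# Slicing returns a sub-array covering a range of indexes.
sub = arr[1, 1:4]         # positions 1, 2 and 3 of the second row

print(element, element.ndim)      # 10 0
print(sub.tolist(), sub.ndim)     # [7, 8, 9] 1
```

The index yields a scalar with zero dimensions, while the slice keeps one dimension even though both expressions start from the same row.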
We pass a slice instead of an index like this: [start:end]. We can also define the step, like this: [start:end:step]. The result includes the element at start but excludes the element at end.

If we don't pass start, it is considered 0. If we don't pass end, it is considered the length of the array in that dimension. If we don't pass step, it is considered 1.

Negative slicing uses the minus operator to refer to an index from the end. Use the step value to determine the step of the slicing.

    import numpy as np
    arr = np.array([1, 2, 3, 4, 5, 6, 7])
    print(arr[1:5])    # [2 3 4 5]
    print(arr[4:])     # [5 6 7]
    print(arr[0:4])    # [1 2 3 4]
    print(arr[:])      # the whole array
    print(arr[-3:-1])  # [5 6]
    print(arr[1:5:2])  # [2 4]

    arr2 = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
    print(arr2[1, 1:4])    # [7 8 9]
    print(arr2[0:2, 1:4])  # columns 1 to 3 of both rows

Data Types

NumPy refers to data types by one-character codes, including:

    Code  Datatype  Description
    S     strings   Text data, e.g. "ABCD"
    i     integer   Integer numbers, e.g. -2, -3
    f     float     Real numbers, e.g. 1.2, 42.42
    ?     boolean   True or False (the code 'b' denotes a signed byte in NumPy, not a boolean)

The NumPy array object has a property called dtype that returns the data type of the array.

We use the array() function to create arrays. This function can take an optional argument, dtype, that allows us to define the expected data type of the array elements. If a type is given to which the elements can't be cast, NumPy raises a ValueError.

    import numpy as np
    arr = np.array([1, 2, 3, 4])
    print(arr.dtype)

    arr2 = np.array([1, 2, 3, 4], dtype='S')
    print(arr2)
    print(arr2.dtype)

    # the following statement raises a ValueError
    arr3 = np.array(['a', '2', '3'], dtype='i')

Converting Data Type on Array

The best way to change the data type of an existing array is to make a copy of the array with the astype() method. The astype() method creates a copy of the array and allows you to specify the data type as a parameter. The data type can be specified using a string, like 'f' for float, 'i' for integer, etc.

    import numpy as np
    arr = np.array([1.1, 2.1, 3.1])
    newarr = arr.astype('i')   # the fractional parts are dropped
    print(newarr)
    print(newarr.dtype)

    arr2 = np.array([1, 0, 3])
    newarr2 = arr2.astype(bool)   # 0 becomes False, non-zero becomes True
    print(newarr2)
    print(newarr2.dtype)

The data type can also be given directly, like float for float and int for integer.

Array Copy vs. Array View

The copy creates a new array, and the view is only a view of the original array. The copy owns the data, and any changes made to the copy will not affect the original array. The view does not own the data, so any change made to the view or to the original array affects both.

All arrays have an attribute base that returns None if the array owns the data. Otherwise, the base attribute refers to the original object. The copy returns None, and the view returns the original array.

    import numpy as np

    ########### copy #################
    arr = np.array([1, 2, 3, 4, 5])
    x = arr.copy()
    arr[0] = 42    # change the original in place (arr = 42 would only rebind the name)
    print(arr)     # first element is now 42
    print(x)       # the copy is unaffected

    ########### view #################
    arr2 = np.array([1, 2, 3, 4, 5])
    x2 = arr2.view()
    arr2[0] = 42   # change original
    x2[1] = 32     # change view
    print(arr2)    # both changes are visible
    print(x2)      # both changes are visible

    print(x.base)   # None: the copy owns its data
    print(x2.base)  # the original array arr2

Shape of an Array

The shape of an array is the number of elements in each dimension. NumPy arrays have an attribute called shape that returns a tuple in which each index holds the number of elements in the corresponding dimension. For example, the shape (n, m) means that the array has 2 dimensions, where the first dimension has n elements and the second has m.

    import numpy as np
    arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
    print(arr.shape)  # (2, 4)

Reshaping the arrays
Reshaping arrays can add or remove dimensions or change the number of elements in each dimension, as long as the total number of elements stays the same.

Flattening the arrays
Flattening an array means converting a multidimensional array into a 1-D array. To do this we can use reshape(-1).

    import numpy as np
    ########### Reshaping the arrays #######
    arr2 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
    newarr = arr2.reshape(4, 3)   # 12 elements arranged as 4 rows of 3
    print(newarr)
    ########### Flattening the arrays #######
    arr3 = np.array([[1, 2, 3], [4, 5, 6]])
    newarr2 = arr3.reshape(-1)
    print(newarr2)  # [1 2 3 4 5 6]

Array Iterating

Iterating means going through elements one by one. In multi-dimensional arrays, we can do this using the basic for loop of Python. If we iterate on a 1-D array it will go through each element one by one. On a 2-D array the outer loop yields rows, so a nested loop is needed to reach each scalar.

    import numpy as np
    arr = np.array([[1, 2, 3], [4, 5, 6]])
    for x in arr:
        for y in x:
            print(y)

Iterating Arrays Using nditer()
The function nditer() can be used from very basic to very advanced iterations. It solves some basic issues which we face in iteration, such as needing one nested loop per dimension.

    ##### nditer example ##########
    arr2 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
    for x in np.nditer(arr2):
        print(x)

Joining Arrays

Joining means putting the contents of two or more arrays in a single array. We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If the axis is not explicitly passed, it is taken as 0.

    import numpy as np
    arr1 = np.array([1, 2, 3])
    arr2 = np.array([4, 5, 6])
    arr = np.concatenate((arr1, arr2))
    print(arr)

    ############### 2-D array ##############
    arr1 = np.array([[1, 2], [3, 4]])
    arr2 = np.array([[5, 6], [7, 8]])
    arr = np.concatenate((arr1, arr2), axis=1)
    print(arr)

Joining Arrays Using Stack Function
Stacking is like concatenation, except that the arrays are joined along a new axis. For example, two 1-D arrays can be stacked into a 2-D array, which puts them one over the other. We pass a sequence of arrays that we want to join to the stack() function along with the axis. If the axis is not explicitly passed, it is taken as 0.

    # arr1 and arr2 are the 2-D arrays defined above
    arr_stacked = np.stack((arr1, arr2), axis=1)
    print(arr_stacked)

Stacking Along Rows and Columns
NumPy provides helper functions: hstack() to stack along rows (horizontally, side by side) and vstack() to stack along columns (vertically, one over the other).

    arr_hstack = np.hstack((arr1, arr2))
    print(arr_hstack)
    arr_vstack = np.vstack((arr1, arr2))
    print(arr_vstack)

Splitting Array

Splitting is the reverse operation of joining.

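Since splitting undoes joining, it may help to first compare the shapes that the joining functions above produce. The following is a minimal sketch using the same two 2-D arrays; the names a, b, cat0, cat1, and stk are illustrative and not taken from the notes.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# concatenate() joins along an existing axis, so the result stays 2-D.
cat0 = np.concatenate((a, b))           # axis=0 by default
cat1 = np.concatenate((a, b), axis=1)

# stack() joins along a new axis, so the result gains a dimension.
stk = np.stack((a, b))

print(cat0.shape)   # (4, 2)
print(cat1.shape)   # (2, 4)
print(stk.shape)    # (2, 2, 2)

# For 2-D inputs, hstack() matches concatenate(axis=1)
# and vstack() matches concatenate(axis=0).
print(np.array_equal(np.hstack((a, b)), cat1))  # True
print(np.array_equal(np.vstack((a, b)), cat0))  # True
```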
Joining merges multiple arrays into one, and splitting breaks one array into multiple. We use array_split() for splitting arrays; we pass it the array we want to split and the number of splits. The return value is a list of arrays. You can also specify the axis along which to split.

    import numpy as np
    arr = np.array([1, 2, 3, 4, 5, 6])
    newarr = np.array_split(arr, 3)   # three arrays of two elements each
    print(newarr)

    arr2 = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
    newarr2 = np.array_split(arr2, 3)   # three 2-D arrays of two rows each
    print(newarr2)

Searching Arrays

You can search an array for a certain value and return the indexes that get a match. To search an array, use the where() function.

    import numpy as np
    arr = np.array([1, 2, 3, 4, 5, 4, 4, 7, 8])
    x = np.where(arr == 4)
    print(x)   # the value 4 is present at indexes 3, 5, and 6

    x2 = np.where(arr % 2 == 0)
    print(x2)  # the positions of the even numbers

Searching Arrays Using Search Sorted
The function searchsorted() performs a binary search in the array and returns the index where the specified value would be inserted to maintain the sort order. The array is assumed to be sorted already, so an unsorted array must be sorted first. To search for more than one value, use an array with the specified values.

    x3 = np.searchsorted(np.sort(arr), 4)   # arr above is not sorted, so sort it first
    print(x3)

Sorting Arrays

Sorting means putting elements in an ordered sequence. An ordered sequence is any sequence that has an order corresponding to its elements, like numeric or alphabetical, ascending or descending. NumPy has a function called sort() that returns a sorted copy of the specified array and leaves the original unchanged.

    import numpy as np
    arr = np.array([3, 2, 0, 1])
    print(np.sort(arr))

    arr2 = np.array(['banana', 'cherry', 'apple'])
    print(np.sort(arr2))

Filtering Arrays

Filtering means getting some elements out of an existing array and creating a new array from them. You filter an array using a boolean index list. A boolean index list is a list of booleans corresponding to indexes in the array.

    import numpy as np
    arr = np.array([41, 42, 43, 44])
    x = [True, False, True, False]
    newarr = arr[x]
    print(newarr)

If the value at an index is True, that element is contained in the filtered array; if the value at that index is False, that element is excluded from the filtered array.

We can directly substitute the array for the iterable variable in our condition, and it will work just as we expect it to.

    filter_arr = arr > 42
    newarr2 = arr[filter_arr]
    print(newarr2)

    filter_arr = arr % 2 == 0
    newarr3 = arr[filter_arr]
    print(newarr3)

Random Numbers

Random number does NOT mean a different number every time. Random means something that cannot be predicted logically. NumPy offers a random module to work with random numbers.

The random module's rand() method returns a random float between 0 and 1. To create random arrays, the randint() method takes a size parameter where you can specify the shape of the array. The rand() method also allows you to specify the shape of the array.

The choice() method allows you to generate a random value based on an array of values. The choice() method also allows you to return an array of values.

    from numpy import random

    x = random.randint(100)   # an integer from 0 up to, but not including, 100
    print(x)
    x2 = random.rand()
    print(x2)
    x3 = random.randint(100, size=(5))
    print(x3)
    x4 = random.randint(100, size=(3, 5))
    print(x4)
    x5 = random.rand(5)
    print(x5)
    x6 = random.choice([3, 5, 7, 9])
    print(x6)
    x7 = random.choice([3, 5, 7, 9], size=(10))
    print(x7)

Shuffling Arrays
Shuffle means changing the arrangement of elements in place, i.e. in the array itself. The shuffle() method makes changes to the original array. The permutation() method returns a re-arranged array (and leaves the original array unchanged).

    from numpy import random
    import numpy as np
    arr = np.array([1, 2, 3, 4, 5])
    random.shuffle(arr)
    print(arr)
    print(random.permutation(arr))

Random Data Distribution

A random distribution is a set of random numbers that follow a certain probability density function.

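The behaviour of shuffle() versus permutation() described above, and the earlier point that random does not mean unpredictable, can be checked with a minimal sketch. Note that random.seed() is not covered in these notes; it is used here as an assumption to make the generator repeatable, and the variable names are illustrative.

```python
import numpy as np
from numpy import random

arr = np.array([1, 2, 3, 4, 5])
original = arr.copy()

# permutation() returns a new array; arr itself is left untouched.
perm = random.permutation(arr)
unchanged = np.array_equal(arr, original)

# shuffle() rearranges arr in place and returns None.
result = random.shuffle(arr)

# Seeding the generator makes the "random" sequence repeatable.
random.seed(42)
first = random.randint(100, size=5)
random.seed(42)
second = random.randint(100, size=5)

print(unchanged)                       # True
print(result)                          # None
print(sorted(arr.tolist()))            # [1, 2, 3, 4, 5]
print(np.array_equal(first, second))   # True
```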
We can generate random numbers based on defined probabilities using the choice() method of the random module. The choice() method allows us to specify the probability for each value. The probabilities must sum to 1.

    from numpy import random
    x = random.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=(100))
    print(x)   # 9 never appears because its probability is 0.0

NumPy ufuncs

ufuncs stands for "Universal Functions"; they are NumPy functions that operate on the ndarray object. ufuncs are used to implement vectorization in NumPy, which is much faster than iterating over elements. They also provide broadcasting and additional methods like reduce, accumulate, etc. that are very helpful for computation.

Vectorization
Converting iterative statements into a vector-based operation is called vectorization. It is faster because modern CPUs are optimized for such operations.

Add the Elements of Two Lists: Examples

Using Python's built-in zip() function:

    x = [1, 2, 3, 4]
    y = [4, 5, 6, 7]
    z = []
    for i, j in zip(x, y):
        z.append(i + j)
    print(z)

Using NumPy's add(x, y) ufunc:

    import numpy as np
    x = [1, 2, 3, 4]
    y = [4, 5, 6, 7]
    z = np.add(x, y)
    print(z)

Create Your ufunc
To create your own ufunc, you define a function as you do with normal functions in Python, and then you add it to your NumPy ufunc library with the frompyfunc() function. The frompyfunc() function takes the following arguments:
▪ function: the name of the function.
▪ inputs: the number of input arguments (arrays).
▪ outputs: the number of output arrays.

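    import numpy as np

    def myadd(x, y):
        return x + y

    # wrap the Python function as a ufunc with 2 inputs and 1 output
    myadd = np.frompyfunc(myadd, 2, 1)
    print(myadd([1, 2, 3, 4], [5, 6, 7, 8]))

The examples in the following table use these arrays:

    import numpy as np
    arr1 = np.array([10, 11, 12, 13, 14, 14])
    arr2 = np.array([3, 5, 10, 8, 2, 33])

    Type of Function     Function               Example
    Simple Arithmetic    add()                  newarr = np.add(arr1, arr2)
                         subtract()             newarr = np.subtract(arr1, arr2)
                         multiply()             newarr = np.multiply(arr1, arr2)
                         divide()               newarr = np.divide(arr1, arr2)
                         power()                np.power(arr1, arr2)
                         remainder() or mod()   np.mod(arr1, arr2)
                         absolute() or abs()    newarr = np.absolute(arr)
    Rounding Decimals    trunc()                arr = np.trunc([-3.1666, 3.6667])
                         around()               arr = np.around(3.1666, 2)
                         floor()                np.floor([-3.1666, 3.6667])
                         ceil()                 arr = np.ceil([-3.1666, 3.6667])
    Set Operations       unique()               x = np.unique(arr)
                         union1d()              newarr = np.union1d(arr1, arr2)
                         intersect1d()          np.intersect1d(arr1, arr2, assume_unique=True)
                         setdiff1d()            newarr = np.setdiff1d(set1, set2, assume_unique=True)

In the table, arr, set1, and set2 stand for any arrays. Pass assume_unique=True only when both inputs contain no duplicates; arr1 above contains 14 twice, so the argument should be omitted for these arrays. With the integer arrays above, np.power(arr1, arr2) also overflows the integer type for the last pair (14 to the power 33), so smaller exponents or float arrays are safer.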
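The notes mention the reduce and accumulate methods and broadcasting without showing them. The following is a minimal sketch, reusing arr1 and arr2 from the table, that exercises a few of the listed functions together with those features. The variable names are illustrative, and assume_unique is left at its default because arr1 contains a repeated value.

```python
import numpy as np

arr1 = np.array([10, 11, 12, 13, 14, 14])
arr2 = np.array([3, 5, 10, 8, 2, 33])

# Element-wise arithmetic ufuncs.
total = np.add(arr1, arr2)
remainder = np.mod(arr1, arr2)

# ufunc methods: reduce() collapses the array to one value,
# accumulate() keeps every intermediate result.
sum_all = np.add.reduce(arr1)
running = np.add.accumulate(arr1)

# Broadcasting: the scalar 2 is stretched to match the shape of arr1.
doubled = np.multiply(arr1, 2)

# Rounding functions.
truncated = np.trunc([-3.1666, 3.6667])
floored = np.floor([-3.1666, 3.6667])
ceiled = np.ceil([-3.1666, 3.6667])

# Set operations return sorted, unique values.
uniq = np.unique(arr1)
common = np.intersect1d(arr1, arr2)
only_in_arr1 = np.setdiff1d(arr1, arr2)

print(total.tolist())         # [13, 16, 22, 21, 16, 47]
print(remainder.tolist())     # [1, 1, 2, 5, 0, 14]
print(int(sum_all))           # 74
print(running.tolist())       # [10, 21, 33, 46, 60, 74]
print(doubled.tolist())       # [20, 22, 24, 26, 28, 28]
print(truncated.tolist())     # [-3.0, 3.0]
print(floored.tolist())       # [-4.0, 3.0]
print(ceiled.tolist())        # [-3.0, 4.0]
print(uniq.tolist())          # [10, 11, 12, 13, 14]
print(common.tolist())        # [10]
print(only_in_arr1.tolist())  # [11, 12, 13, 14]
```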
