Machine Learning Lab Manual

Questions and Answers

What is the output shape of a numpy array created using three lists, each containing four elements?

  • (3, 3)
  • (2, 6)
  • (4, 3)
  • (3, 4) (correct)
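A minimal sketch of the case in this question, assuming illustrative values for the three lists (the numbers themselves are not from the lab manual):

```python
import numpy as np

# Three inner lists of four elements each give 3 rows and 4 columns.
sample_array = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

print(sample_array.shape)  # (3, 4)
```

The outer square brackets hold the list of rows, which is why the lists are passed inside [] to np.array().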

Which NumPy attribute provides information on the number of elements along each axis of an array?

  • dimensions
  • size
  • length
  • shape (correct)

When creating a numpy multi-dimensional array, what notation should be used to include multiple lists?

  • {} inside np.array()
  • [] inside np.array() (correct)
  • || inside np.array()
  • () inside np.array()

Given the numpy array created from 5 lists, each containing 3 elements, what would be the output of sample_array.shape?

  • (5, 3) (correct)

What does Axis 0 represent in the context of a numpy array?

  • The first dimension of the array, that is, the first direction along which it can be traversed (correct)

What must be specified when reading a text file into a DataFrame?

  • The separator, using the sep argument (correct)

How do you specify which sheet to read from an Excel file that contains multiple sheets?

  • With the sheet_name argument (correct)

Which code snippet correctly exports a DataFrame as a text file with specified separation?

  • df.to_csv('diabetes_out.txt', sep=' ') (correct)

What does the index=False argument do when saving a DataFrame to a CSV file?

  • It excludes the DataFrame's row index from the output file (correct)
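The sep and index=False arguments from the last few questions can be sketched together as follows. The column names and values are hypothetical, and in-memory buffers stand in for real files so the example is self-contained; a path such as 'diabetes_out.txt' would be passed in the same position.

```python
import io
import pandas as pd

# Hypothetical space-separated text standing in for a file on disk.
text_data = io.StringIO("age bmi outcome\n50 33.6 1\n31 26.6 0\n")

# sep tells pandas which character separates the fields on each line.
df = pd.read_csv(text_data, sep=" ")

# Export with the same separator; index=False leaves out the row index column.
out = io.StringIO()
df.to_csv(out, sep=" ", index=False)
exported = out.getvalue()
print(exported)
```

Without index=False, each exported line would start with an extra field holding the row index (0, 1, ...).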

Which function is used to read an Excel file into a DataFrame?

  • pd.read_excel() (correct)

What is the output of the expression df.shape?

  • (768, 9) (correct)

How can you obtain only the number of columns from a DataFrame's shape?

  • df.shape[1] (correct)
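As a small illustration of indexing into shape, assuming a hypothetical three-row DataFrame rather than the 768-row dataset from the lab:

```python
import pandas as pd

# Hypothetical data; the lab's own DataFrame is much larger.
df = pd.DataFrame({"glucose": [148, 85, 183], "bmi": [33.6, 26.6, 23.3]})

n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple

print(df.shape)     # (3, 2)
print(df.shape[0])  # 3, the number of rows
print(df.shape[1])  # 2, the number of columns
```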

What does the .isnull() method return when applied to a DataFrame?

  • A DataFrame indicating the presence of missing values (correct)

Which method would you use to count the total number of missing values in the entire DataFrame?

  • df.isnull().sum().sum() (correct)
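The difference between one .sum() and two can be seen in a short sketch. The DataFrame below is hypothetical and contains two missing values on purpose:

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing entries.
df = pd.DataFrame({
    "glucose": [148, np.nan, 183],
    "bmi": [33.6, 26.6, np.nan],
    "age": [50, 31, 32],
})

mask = df.isnull()        # DataFrame of booleans, True where a value is missing
per_column = mask.sum()   # missing values counted per column
total = mask.sum().sum()  # missing values counted over the whole DataFrame

print(per_column)
print(total)  # 2
```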

What happens when you use the .copy() method on a DataFrame?

  • It creates a deep copy of the DataFrame. (correct)
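A brief sketch of why the copy matters, using a hypothetical one-column DataFrame. Plain assignment only creates a second name for the same object, while .copy() (deep by default) creates independent data:

```python
import pandas as pd

df = pd.DataFrame({"bmi": [33.6, 26.6]})  # hypothetical data

alias = df           # a second name for the same DataFrame
df_copy = df.copy()  # an independent deep copy

df_copy.loc[0, "bmi"] = 0.0  # changes only the copy
alias.loc[1, "bmi"] = -1.0   # changes the original, since alias is df

print(df)
print(df_copy)
```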

Flashcards

pd.read_excel()

A command used in pandas to read data from an Excel file into a DataFrame.

Separator (sep)

The character that separates the fields (columns) within each line of a text file read or written by pandas. Examples include ',' (comma), ' ' (space), '\t' (tab), and ':' (colon).

pd.read_csv()

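A pandas function that reads a delimited text file, such as a CSV or other text file, into a DataFrame. Use the sep argument for separators other than a comma; Excel files are read with pd.read_excel() instead.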

DataFrame

The two-dimensional, labeled pandas data structure (rows and columns) used to store data for analysis and manipulation.

df.to_csv()

A command used in pandas to save a DataFrame to a CSV file.

What does the NumPy shape attribute do?

NumPy's shape attribute holds a tuple representing the dimensions of an array. Each element in the tuple is the number of elements along the corresponding axis.

How to create a multi-dimensional array?

A multi-dimensional array in NumPy is created by nesting lists within a list, representing rows and columns. For instance, each inner list represents a row in a 2-dimensional array.

What is an axis in a NumPy array?

Arrays can have more than one dimension. Each direction you can travel through the array is an axis. Axis 0 of an array is the first direction you can traverse, Axis 1 is the second direction, etc.
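One way to see the two directions is to sum a small 2-D array along each axis. This is a sketch with illustrative values:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Summing along axis 0 moves down the rows, giving one total per column.
col_totals = arr.sum(axis=0)  # [5 7 9]

# Summing along axis 1 moves across the columns, giving one total per row.
row_totals = arr.sum(axis=1)  # [ 6 15]

print(col_totals, row_totals)
```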

What is NumPy?

NumPy is a Python library for numerical computation with powerful features for array manipulation and analysis. Key advantages include efficient, performance-optimized operations on arrays and a diverse set of mathematical and statistical functions.

What is a one-dimensional array?

A one-dimensional array is a single sequence of elements, represented by a single list.

What is the .shape attribute used for in pandas?

The .shape attribute gives the number of rows and columns in a pandas DataFrame. It is a tuple of the form (rows, columns), and each value can be accessed by indexing, e.g. df.shape[0] for rows and df.shape[1] for columns.

What does the .columns attribute return in pandas?

Returns the column labels of a pandas DataFrame as an Index object, which can be converted to a list with list(df.columns). Each label identifies one column.

What purpose does the .copy() method serve?

A pandas method that creates a copy of a DataFrame. By default the copy is deep, so changes to the copy don't affect the original data.

How does the .isnull() method work?

This pandas method identifies missing values (NaN) in a DataFrame. It returns a DataFrame of booleans, where True indicates a missing value.

How can you check the amount of missing data using pandas?

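The .isnull() method, in combination with .sum(), counts the number of missing values (NaN) in each column of a pandas DataFrame. Chaining a second .sum(), as in df.isnull().sum().sum(), gives the total for the entire DataFrame.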

Study Notes

University of Science and Technology Lab Manual

  • Introduction to Machine Learning
  • Lab Manual
  • Prepared by: Prof. Noureldien A. Noureldien
  • October 2024

Table of Contents

  • Week #1: Python Built-in Functions and the Math Module
  • Week #2: NumPy Module
  • Week #3: CSV Files
  • Week #4: Pandas Package
  • Week #5: Data Preprocessing with Pandas
  • Week #6: Modelling with Scikit-learn
  • Week #7: Dataset Feature Selection Techniques
  • Week #8: Building Supervised Learning Classification Model
  • Week #9: Building Learning Regression Models

Lab (1): Python Built-in Functions and the Math Module

  • Objectives: Introduce basic machine learning concepts, real-world problems, and different algorithms
  • Outcomes: Describe basic concepts of machine learning, various algorithms, and how to evaluate the performance of machine learning algorithms. Apply machine learning to learn, predict, and classify real-world problems. Students will acquire multidisciplinary skills.
  • Learning Outcomes: Use Python built-in functions and write code that uses math module functions.
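As a rough illustration of this lab's topic, the sketch below calls a few built-in functions and a few math module functions. The particular functions and values are assumptions chosen for illustration; the manual's own exercises may use others.

```python
import math

values = [3, -7, 12, 5]  # illustrative data

# Built-in functions need no import.
print(len(values))        # 4
print(max(values))        # 12
print(abs(min(values)))   # 7
print(sum(values))        # 13
print(round(3.14159, 2))  # 3.14

# math module functions are accessed through the imported module.
print(math.sqrt(16))      # 4.0
print(math.factorial(5))  # 120
print(math.floor(2.7))    # 2
print(math.ceil(2.1))     # 3
```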

Lab (2): NumPy Module

  • Description: This lab describes the NumPy module, used for creating and manipulating arrays in Python.
  • Learning Outcomes: Use Python NumPy module functions to write code that manipulates arrays.

Lab (3): CSV Files

  • Description: This lab covers reading from and writing to a CSV file.
  • Learning Outcomes: Demonstrate reading and writing data from and to a CSV file using Python.
  • Details: What a CSV is, structure of a CSV file in Python, reading a CSV file in Python (csv.reader), and writing a CSV file in Python.
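Reading and writing with the standard csv module might look like the sketch below. The file name, column names, and rows are hypothetical, and a temporary directory is used so the example leaves no files behind in the working directory.

```python
import csv
import os
import tempfile

# Hypothetical rows: a header followed by two records.
rows = [["name", "score"], ["Amal", "91"], ["Omar", "84"]]

path = os.path.join(tempfile.mkdtemp(), "scores.csv")

# newline="" lets the csv module control line endings itself.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

with open(path, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)   # first line holds the column names
    records = list(reader)  # remaining lines, each a list of strings

print(header)
print(records)
```

Note that csv.reader returns every field as a string, so numeric columns need an explicit conversion such as int() or float().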

Lab (4): Pandas Package

  • Description: This lab provides basic knowledge of Python's Pandas package.
  • Learning Outcomes: Use the Pandas package to manipulate and explore data sets, and write code that uses the Pandas package.

Lab (5): Cleaning of Data Using Pandas

  • Description: Data cleaning using the Pandas package.
  • Learning Outcomes: Use Pandas package functions to clean data.

Lab (6): Understand and Visualize Your Data

  • Description: Instructions on how to understand and visualize data using Python.
  • Learning Outcomes: Understand and visualize the machine learning dataset. Write Python code to explore and visualize the dataset.

Lab (7): Data Preprocessing and Machine Learning Modeling using Scikit-learn

  • Description: Data preprocessing and machine learning modeling using Scikit-learn library.
  • Learning Outcomes: Perform data preprocessing and machine learning modeling using the Scikit-learn library; write Python code implementing Scikit-learn tools for data preprocessing and modeling.
  • Details: Machine learning basics, scikit-learn features, and different data preprocessing techniques (e.g., binarization, standardization, scaling).
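The three techniques named above can be sketched with scikit-learn's preprocessing classes. The data and the threshold are illustrative assumptions, and MinMaxScaler is only one possible reading of "scaling"; the manual may use a different scaler.

```python
import numpy as np
from sklearn.preprocessing import Binarizer, MinMaxScaler, StandardScaler

# Illustrative data: two features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Binarization: values above the threshold become 1, all others 0.
binarized = Binarizer(threshold=2.0).fit_transform(X)

# Scaling: each column is rescaled to the range [0, 1].
scaled = MinMaxScaler().fit_transform(X)

# Standardization: each column is shifted to mean 0 and scaled to unit variance.
standardized = StandardScaler().fit_transform(X)

print(binarized)
print(scaled)
print(standardized)
```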

Lab (8): Dataset Feature Selection Techniques

  • Description: How to apply feature selection techniques to datasets.
  • Learning Outcomes: Understand feature selection techniques for datasets; write Python code that applies feature selection techniques to datasets.
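The summary does not say which feature selection techniques the lab uses. As one possible example, the sketch below applies univariate selection with scikit-learn's SelectKBest to the bundled iris dataset; the dataset, the scoring function, and k=2 are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

# Assumed dataset: iris ships with scikit-learn, so no download is needed.
X, y = load_iris(return_X_y=True)

# Keep the k features with the highest ANOVA F-score against the class labels.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

# Boolean mask marking which of the original columns were kept.
mask = selector.get_support()

print(X.shape, "->", X_selected.shape)
print(mask)
```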

Lab (9): Building Supervised Machine Learning Classification Model

  • Description: Details on how to build supervised machine learning classification models.
  • Learning Outcomes: Understand the steps of building supervised machine learning models; write Python code to implement supervised machine learning classification models.
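The usual steps (split the data, fit a model, predict, evaluate) can be sketched as follows. The iris dataset, logistic regression, and the 70/30 split are assumptions for illustration, not necessarily the choices made in the lab.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumed dataset and model, chosen only to keep the example self-contained.
X, y = load_iris(return_X_y=True)

# Step 1: hold out part of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Step 2: fit the classifier on the training portion.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Step 3: predict labels for unseen data and measure accuracy.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Test accuracy: {accuracy:.2f}")
```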

Lab (10): Building Learning Regression Models

  • Description: Basic knowledge on how to build supervised machine learning regression models.
  • Learning Outcomes: Write Python code to implement supervised machine learning regression models.
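A minimal regression sketch, assuming linear regression on synthetic data generated from y = 3x + 2. The model choice and the data are illustrative only; the lab may cover other regression models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data following y = 3x + 2.
X = np.arange(10, dtype=float).reshape(-1, 1)  # one feature, ten samples
y = 3.0 * X.ravel() + 2.0

model = LinearRegression().fit(X, y)

# The fitted slope and intercept should recover the generating line.
print(model.coef_, model.intercept_)

# Predict the target for a new input value.
prediction = model.predict(np.array([[10.0]]))
print(prediction)
```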

Additional information

  • Exercise 1, Exercise 2, etc.: Specific exercises related to the material in each lab.
  • Modules: The Python modules imported for each exercise (e.g., pandas, numpy, matplotlib, stats).
  • Data Analysis Techniques: Summary operators, descriptive statistics, and specific feature extraction techniques demonstrated in each exercise (e.g., plotting histograms and density plots).
