Questions and Answers
What is the purpose of np.where in the context of the provided program?
What will be the output of np.prod(array1_e[5:]) * np.prod(array2_e[5:])?
Which function is used to compute the covariance in the provided example?
What will the command series_a.sort_values() return?
What type of arrays are created in the initial part of the program?
What does the command np.isnan(array_c) check for in the array?
In comparing Array1 and Array4, which statistical measure is calculated?
What does the term 'index' refer to in the context of a Pandas Series?
What method is used to sort the DataFrame based on the first column?
What function is utilized to find the correlation between the first and second columns?
Which of the following correctly identifies the output when removing duplicates from column 'A'?
What function is used to compute the mean of a two-dimensional array along the second axis?
How many bins are created when discretizing the second column?
Which method is appropriate for reshaping a NumPy array?
What will happen if you try to reshape an array to a new shape that has a different total number of elements?
When merging two DataFrames, which method is used to find the names of students who attended both workshops?
What is the result of using pd.concat() on two DataFrames to find total records?
In the context of the provided Python programs, which types of elements does the function np.isnan check for?
How can you create a random integer array of size m x n in NumPy?
What does pd.concat([df1, df2]).drop_duplicates(keep=False) accomplish?
Which columns are used as multi-row indexes when merging two DataFrames row-wise?
When subtracting two arrays of the same size, what will be the dimensions of the resulting array?
What does the np.cov function compute?
What type of data does the dtype attribute of a NumPy array return?
What is the method used to obtain the minimum rank of a Pandas Series?
What does the 'max' method in ranking return for a Pandas Series?
How can you identify the index of the minimum element in a Pandas Series?
In the DataFrame creation example, how many rows are generated?
What percentage of values in the DataFrame is replaced by null values?
Which function is used to count the number of missing values in the DataFrame?
What is the column drop criterion based on the number of null values?
What happens to the row with the maximum sum of all values in the DataFrame?
What method is used to calculate the average monthly income of female members in the DataFrame?
Which DataFrame method is utilized to group data by a specific attribute?
What is the purpose of the idxmax() function in the context provided?
In the Titanic dataset, how is the total number of passengers under 30 determined?
How do you calculate the familywise gross monthly income in the example provided?
What will the 'high_income_members' DataFrame contain?
What data type is the 'MonthlyIncome (Rs.)' column expected to be?
Which function would you use to load a CSV file in Pandas?
Study Notes
NumPy Programs for Data Analysis
- Generate a 2D random integer array and calculate the mean, standard deviation, and variance along the second axis.
- Create a 2D array of size m x n (with m and n supplied by the user) filled with random integers, display its shape, type, and data type, then reshape it to n x m.
- For a 1D array, identify indices of elements that are zero, non-zero, and NaN, storing these indices in separate arrays.
- Create three random arrays, subtract the second from the third, double the values of the first, and calculate the covariance and correlation between specified pairs of the resulting arrays.
- Generate two random arrays of size 10, and compute the sum of the first half and the product of the second half of both arrays.
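The NumPy tasks above can be sketched roughly as follows. This is a minimal illustration, not the assignment's own solution: the seed, the sizes `m` and `n`, the value ranges, and the names `a4` and `a5` are assumptions, as is the choice of which array pairs get covariance and correlation. The non-zero index set here deliberately excludes NaN, since `np.nan != 0` would otherwise count NaN entries as non-zero.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for repeatable runs

# 2D random integer array; statistics along the second axis (axis=1, per row)
m, n = 3, 4  # hypothetical sizes; the notes take these from user input
arr = rng.integers(low=0, high=100, size=(m, n))
row_mean = arr.mean(axis=1)
row_std = arr.std(axis=1)
row_var = arr.var(axis=1)

# Shape, Python type, and element data type, then reshape m x n -> n x m
print(arr.shape, type(arr), arr.dtype)
reshaped = arr.reshape(n, m)

# Indices of zero, non-zero, and NaN elements of a 1D array, kept separately
array_c = np.array([0.0, 2.5, np.nan, 0.0, 7.0, np.nan])
zero_idx = np.where(array_c == 0)[0]
nan_idx = np.where(np.isnan(array_c))[0]
nonzero_idx = np.where((array_c != 0) & ~np.isnan(array_c))[0]

# Three random arrays: subtract the second from the third, double the first
a1 = rng.integers(1, 50, size=10)
a2 = rng.integers(1, 50, size=10)
a3 = rng.integers(1, 50, size=10)
a4 = a3 - a2   # assumed name for the difference
a5 = 2 * a1    # assumed name for the doubled array
cov_1_4 = np.cov(a1, a4)[0, 1]        # off-diagonal entry of the 2x2 covariance matrix
corr_1_5 = np.corrcoef(a1, a5)[0, 1]  # off-diagonal entry of the correlation matrix

# Two arrays of size 10: sum of the first halves, product of the second halves
array1_e = rng.integers(1, 10, size=10)
array2_e = rng.integers(1, 10, size=10)
first_half_sum = array1_e[:5].sum() + array2_e[:5].sum()
second_half_product = np.prod(array1_e[5:]) * np.prod(array2_e[5:])
```

Because `a5` is an exact multiple of `a1`, their correlation comes out as 1 up to floating-point error, which makes it a convenient sanity check.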
Pandas Series Tasks
- Create a Pandas Series with five elements, displaying it sorted by index and values separately.
- Generate a Series with duplicate values, calculating the minimum and maximum ranks using 'first' and 'max' methods.
- Determine the index positions of the minimum and maximum elements of the Series.
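A short sketch of the Series tasks, assuming made-up values and labels (only the name `series_a` appears in the quiz questions; `series_b` and the data are hypothetical):

```python
import pandas as pd

# Five-element Series with a deliberately unsorted label index
series_a = pd.Series([30, 10, 50, 20, 40], index=["d", "a", "e", "b", "c"])
by_index = series_a.sort_index()    # ordered by label: a, b, c, d, e
by_value = series_a.sort_values()   # ordered by value: 10, 20, 30, 40, 50

# Series with duplicates; compare how ties are ranked
series_b = pd.Series([7, 3, 7, 1, 3, 9])
rank_first = series_b.rank(method="first")  # ties ranked in order of appearance
rank_max = series_b.rank(method="max")      # ties all get the highest rank in the group

# Label and integer position of the minimum and maximum elements
min_label, max_label = series_a.idxmin(), series_a.idxmax()
min_pos, max_pos = series_a.argmin(), series_a.argmax()
```

Note that `idxmin()` and `idxmax()` return index labels, while `argmin()` and `argmax()` return integer positions; which one counts as the "index position" depends on how the task is read.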
DataFrame Manipulations
- Create a DataFrame with 3 columns and 50 rows using random numerical data, replacing 10% of values with NaN.
- Identify total missing values in the DataFrame and drop any columns with more than 5 nulls.
- Identify the row whose values have the largest sum, drop that row, sort the DataFrame by the first column, and remove duplicates from the first column.
- Calculate the correlation between the first and second columns, and covariance between the second and third columns.
- Discretize the second column into 5 bins.
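One possible way to walk through the DataFrame steps above is sketched below. The column names `A`, `B`, `C`, the seed, and the value range are assumptions. The column-drop result is kept in its own variable (`few_nulls`) so that the later steps still have three columns to work with even if a column happens to exceed five nulls.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)  # arbitrary seed for repeatable runs

# 50 rows x 3 columns of random numbers, with 10% of all cells set to NaN
values = rng.integers(0, 100, size=(50, 3)).astype(float)
n_missing = values.size // 10                      # 15 of 150 cells
flat_positions = rng.choice(values.size, size=n_missing, replace=False)
values.flat[flat_positions] = np.nan
df = pd.DataFrame(values, columns=["A", "B", "C"])  # hypothetical column names

# Total missing values, and the frame without columns that have more than 5 nulls
total_missing = int(df.isnull().sum().sum())
few_nulls = df.loc[:, df.isnull().sum() <= 5]

# Drop the row with the largest row sum (NaN is skipped when summing)
trimmed = df.drop(index=df.sum(axis=1).idxmax())

# Sort by the first column, then remove duplicates in that column
first_col = trimmed.columns[0]
sorted_df = trimmed.sort_values(by=first_col)
deduped = sorted_df.drop_duplicates(subset=first_col)

# Correlation of columns 1 and 2, covariance of columns 2 and 3 (NaN pairs ignored)
corr_ab = df["A"].corr(df["B"])
cov_bc = df["B"].cov(df["C"])

# Discretize the second column into 5 equal-width bins
binned = pd.cut(df["B"], bins=5)
```

`pd.cut` leaves NaN entries unbinned, so the missing values in the second column stay missing in `binned`.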
Excel Data Handling
- Import workshop attendance data from two Excel files into separate DataFrames.
- Merge the DataFrames to find students who attended both workshops.
- Identify students who attended only one workshop by concatenating both DataFrames and dropping duplicates.
- Merge the DataFrames row-wise to count total records and perform hierarchical indexing using names and dates.
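The workshop steps might look like the sketch below. The real data would come from two Excel files via `pd.read_excel`; since the file names and column layout are not given, small in-memory DataFrames with hypothetical columns `Name`, `Date`, and `Hours` stand in for them. The "only one workshop" step compares on `Name` alone, an assumption made because the two files presumably carry different dates.

```python
import pandas as pd

# Stand-ins for pd.read_excel("...") on the two attendance files (names are made up)
df1 = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "Date": ["2024-01-10"] * 3,
    "Hours": [3, 3, 3],
})
df2 = pd.DataFrame({
    "Name": ["Ben", "Dev", "Chen"],
    "Date": ["2024-01-17"] * 3,
    "Hours": [2, 2, 2],
})

# Students who attended both workshops: inner merge on the name column
both = pd.merge(df1, df2, on="Name", suffixes=("_ws1", "_ws2"))

# Students who attended only one: concatenate names, drop every duplicated name
only_one = pd.concat([df1[["Name"]], df2[["Name"]]]).drop_duplicates(keep=False)

# Row-wise merge: total record count, then hierarchical index on name and date
all_records = pd.concat([df1, df2], ignore_index=True)
total_records = len(all_records)
indexed = all_records.set_index(["Name", "Date"]).sort_index()
```

With `keep=False`, `drop_duplicates` discards all copies of a repeated name rather than keeping one, which is what leaves only the single-workshop attendees.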
Income Data Analysis
- Create a DataFrame with members' names, genders, and monthly incomes.
- Calculate family-wise gross monthly income by summing incomes grouped by names.
- Identify the member with the highest income and display monthly incomes of members earning more than Rs. 60,000.
- Calculate the average monthly income of female members.
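A minimal sketch of the income analysis, using invented rows. The column name `MonthlyIncome (Rs.)` and the variable `high_income_members` appear in the quiz questions; the other names and all of the data are illustrative assumptions, with `Name` treated as the family name.

```python
import pandas as pd

col = "MonthlyIncome (Rs.)"
income = pd.DataFrame({
    "Name": ["Shah", "Vats", "Vats", "Kumar", "Shah", "Kumar"],   # illustrative data
    "Gender": ["Male", "Male", "Female", "Female", "Female", "Male"],
    col: [114000, 65000, 43150, 69500, 112400, 55000],
})

# Family-wise gross monthly income: sum of incomes per family name
family_income = income.groupby("Name")[col].sum()

# Member with the highest income (idxmax returns the row label of the maximum)
top_member = income.loc[income[col].idxmax()]

# Members earning more than Rs. 60,000
high_income_members = income[income[col] > 60000]

# Average monthly income of female members
female_mean = income.loc[income["Gender"] == "Female", col].mean()
```

An equivalent route to the last figure is `income.groupby("Gender")[col].mean()["Female"]`, which also gives the male average in the same pass.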
Titanic Dataset Analysis
- Load the Titanic dataset and count the total number of passengers aged under 30.
- Calculate the total fare paid by first-class passengers.
- Compare the number of survivors across different passenger classes.
- Compute descriptive statistics for any numeric attribute, differentiated by gender.
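The Titanic steps can be sketched as below. The dataset would normally be loaded with `pd.read_csv`; because the file path is not given, a tiny made-up sample stands in for it. The column names `Pclass`, `Sex`, `Age`, `Fare`, and `Survived` follow the commonly distributed version of the dataset and are an assumption here.

```python
import pandas as pd

# Stand-in for pd.read_csv("...") on the Titanic file; rows are invented
titanic = pd.DataFrame({
    "Pclass": [1, 3, 1, 2, 3, 1],
    "Sex": ["female", "male", "male", "female", "male", "female"],
    "Age": [38.0, 22.0, 54.0, 27.0, None, 19.0],
    "Fare": [71.28, 7.25, 51.86, 13.0, 8.05, 30.0],
    "Survived": [1, 0, 0, 1, 0, 1],
})

# Passengers under 30: missing ages compare as False, so they are not counted
under_30 = int((titanic["Age"] < 30).sum())

# Total fare paid by first-class passengers
first_class_fare = titanic.loc[titanic["Pclass"] == 1, "Fare"].sum()

# Number of survivors in each passenger class
survivors_by_class = titanic.groupby("Pclass")["Survived"].sum()

# Descriptive statistics of a numeric attribute (Age here), split by gender
age_stats_by_sex = titanic.groupby("Sex")["Age"].describe()
```

Any other numeric column, such as `Fare`, can be substituted for `Age` in the final line.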
Description
This assignment focuses on data visualization and statistical analysis using Python's NumPy library. Students will write programs to compute essential statistical measures such as mean, standard deviation, and variance, as well as create two-dimensional arrays. This is particularly relevant for those studying biomedical science and interested in data analytics.