5 Questions
Consider the censusSplit RDD from task 3 of lab 5. Suppose you are asked to output the total income of all people in the dataset. Which of the following sets of RDD transforms/actions best solves the problem?
map, reduce
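The map-then-reduce pattern can be sketched in plain Python as below. This is only an illustration: the sample records, the column layout, and the name INCOME_COL are assumptions, since the actual structure of censusSplit is defined in the lab and not shown here.

```python
from functools import reduce

# Hypothetical stand-in for censusSplit: each record is a list of string fields.
# The column layout (age, education, income) is assumed for illustration only.
census_split = [
    ["39", "Bachelors", "52000"],
    ["50", "HS-grad", "38000"],
    ["28", "Masters", "61000"],
]
INCOME_COL = 2  # assumed position of the income field

# map: turn each record into its numeric income.
incomes = map(lambda rec: float(rec[INCOME_COL]), census_split)

# reduce: combine the incomes pairwise into a single total.
total_income = reduce(lambda a, b: a + b, incomes)

# Rough PySpark equivalent (not executed here):
#   censusSplit.map(lambda rec: float(rec[INCOME_COL])).reduce(lambda a, b: a + b)
print(total_income)
```

In Spark, map is a lazy transformation and reduce is the action that triggers the computation and returns a single value to the driver.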
Consider the censusSplit RDD from task 3 of lab 5. Suppose you are asked to output the age and fnlwgt of the person with the maximum fnlwgt among all people in the dataset. Which of the following sets of RDD transforms/actions best solves the problem?
map, sortByKey, first
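A plain-Python sketch of the map, sortByKey, first sequence follows. The sample records and the column positions AGE_COL and WEIGHT_COL are hypothetical, as the real layout of censusSplit comes from the lab.

```python
# Hypothetical stand-in for censusSplit: [age, weight] as strings.
census_split = [
    ["39", "77516"],
    ["50", "83311"],
    ["28", "338409"],
]
AGE_COL, WEIGHT_COL = 0, 1  # assumed column positions

# map: build (key, value) pairs keyed by the weight field.
keyed = [(int(rec[WEIGHT_COL]), int(rec[AGE_COL])) for rec in census_split]

# sortByKey(ascending=False): order the pairs by key, largest first.
ordered = sorted(keyed, key=lambda kv: kv[0], reverse=True)

# first: take the leading pair, which holds the maximum weight.
max_weight, age = ordered[0]

# Rough PySpark equivalent (not executed here):
#   censusSplit.map(lambda r: (int(r[WEIGHT_COL]), int(r[AGE_COL]))) \
#              .sortByKey(ascending=False).first()
print(age, max_weight)
```

Sorting the whole dataset to read one record is more work than strictly necessary, but it matches the listed sequence and returns both fields together.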
You are given an RDD of three-element tuples (, , ). You are asked to output the total salary of all people in the 'teacher' occupation. Which of the following sets of RDD transforms best solves the problem?
map, filter, reduceByKey
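The map, filter, reduceByKey sequence can be sketched in plain Python as below. The tuple layout (name, occupation, salary) is an assumption, since the field names are missing from the question text, and reduce_by_key is a hypothetical helper that imitates what Spark's reduceByKey does on a single machine.

```python
def reduce_by_key(pairs, func):
    """Hypothetical local imitation of reduceByKey: merge values that share a key."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged

# Assumed tuple layout: (name, occupation, salary).
people = [
    ("Ann", "teacher", 40000),
    ("Bob", "engineer", 70000),
    ("Cy", "teacher", 45000),
]

# map: keep only (occupation, salary) pairs.
pairs = [(occupation, salary) for _, occupation, salary in people]

# filter: retain the pairs whose key is 'teacher'.
teachers = [(occ, sal) for occ, sal in pairs if occ == "teacher"]

# reduceByKey: sum the salaries for each remaining key.
totals = reduce_by_key(teachers, lambda a, b: a + b)

# Rough PySpark equivalent (not executed here):
#   rdd.map(lambda t: (t[1], t[2])) \
#      .filter(lambda kv: kv[0] == 'teacher') \
#      .reduceByKey(lambda a, b: a + b)
print(totals["teacher"])
```

Because reduceByKey is a transformation, a Spark program would still need an action such as collect to bring the result back to the driver.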
Which of the following is NOT an advantage of using structured programming with SparkSQL dataframes compared to programming using the Spark RDD API?
Structured programming allows data to be cached in RAM.
Consider the censusSplit RDD from task 3 of lab 5. Suppose you are asked to find the maximum fnlwgt wage for each education category. Which of the following sets of RDD transforms/actions best solves the problem?
map, sortByKey, first
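For comparison, a per-category maximum can also be computed by keying each record by education and keeping the larger value for each key. The plain-Python sketch below uses that reduceByKey-style aggregation, which is a swapped-in technique and not the sequence listed above. The sample records, column positions, and the reduce_by_key helper are all hypothetical.

```python
def reduce_by_key(pairs, func):
    """Hypothetical local imitation of reduceByKey: merge values that share a key."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged

# Hypothetical stand-in for censusSplit: [education, weight] as strings.
census_split = [
    ["Bachelors", "77516"],
    ["HS-grad", "215646"],
    ["Bachelors", "83311"],
    ["HS-grad", "234721"],
]
EDU_COL, WEIGHT_COL = 0, 1  # assumed column positions

# map: key each record by its education category.
pairs = [(rec[EDU_COL], int(rec[WEIGHT_COL])) for rec in census_split]

# reduceByKey with max: keep the largest weight seen for each category.
max_per_education = reduce_by_key(pairs, max)

# Rough PySpark equivalent (not executed here):
#   censusSplit.map(lambda r: (r[EDU_COL], int(r[WEIGHT_COL]))).reduceByKey(max)
print(max_per_education)
```

This yields one result per education category, whereas a sort followed by first returns a single record for the whole dataset.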
Test your knowledge of Spark RDD transforms/actions and Dataframes by answering questions related to solving problems with given datasets using different sets of RDD transforms/actions.