Assignment 1 Quiz 4 CSE5BDC T5 2023
5 Questions

Questions and Answers

Consider the censusSplit RDD from task 3 of lab 5. Suppose you are asked to output the total income of all people in the dataset. Which of the following sets of RDD transforms/actions best solves the problem?

  • map, reduce, reduceByKey
  • map, reduce (correct)
  • map, sortByKey, first
  • map, reduceByKey, count
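
As a rough illustration of why a map followed by a reduce is sufficient for a single grand total, the sketch below mimics the two steps in plain Python rather than Spark. The record layout (income as the last field) and the sample rows are assumptions, not the lab's actual censusSplit schema; in PySpark the analogous calls would be `rdd.map(...)` followed by `rdd.reduce(...)`.

```python
from functools import reduce

# Hypothetical stand-in for censusSplit: each record is a list of string fields.
# Assumption: the last field holds income; the real lab schema may differ.
census_split = [
    ["39", "Bachelors", "50000"],
    ["50", "HS-grad", "30000"],
    ["38", "Masters", "70000"],
]

# map: project each record to its numeric income
incomes = map(lambda rec: int(rec[-1]), census_split)

# reduce: fold all incomes into one total (no keys are needed for a grand total)
total_income = reduce(lambda a, b: a + b, incomes)
print(total_income)
```

No key-based transform such as reduceByKey is needed, because the result is one value for the whole dataset rather than one value per group.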

Consider the censusSplit RDD from task 3 of lab 5. Suppose you are asked to output the age and fnlwgt of the person with the maximum fnlwgt among all people in the dataset. Which of the following sets of RDD transforms/actions best solves the problem?

  • map, filter, reduceByKey
  • map, sortByKey, first (correct)
  • map, reduceByKey
  • map, reduceByKey, sortByKey
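
The map, sortByKey, first pattern can be sketched in plain Python as follows. This is only an analogue of the Spark calls, and the (age, weight) sample pairs are made up for illustration; in PySpark the corresponding chain would be roughly `rdd.map(...).sortByKey(ascending=False).first()`.

```python
# Hypothetical (age, weight) pairs standing in for two census fields.
people = [(39, 77516), (50, 83311), (38, 215646)]

# map: re-key each record so the weight becomes the key
keyed = [(weight, age) for age, weight in people]

# sortByKey (descending): order the pairs by weight, largest first
ordered = sorted(keyed, key=lambda kv: kv[0], reverse=True)

# first: take the head of the sorted data, i.e. the maximum-weight record
max_weight, age_of_max = ordered[0]
print(age_of_max, max_weight)
```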

You are given an RDD of three-element tuples (, , ). You are asked to output the total salary of all people in the 'teacher' occupation. Which of the following sets of RDD transforms best solves the problem?

  • map, filter, reduceByKey (correct)
  • map, filter
  • map, reduceByKey
  • map, reduceByKey, sortBy
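
A minimal plain-Python sketch of the map, filter, reduceByKey pipeline is shown below. The tuple order (name, occupation, salary) is an assumption, since the question text does not show the field names, and the sample data is invented; in PySpark the same steps would use `rdd.map(...)`, `rdd.filter(...)`, and `rdd.reduceByKey(...)`.

```python
from collections import defaultdict

# Hypothetical records; the (name, occupation, salary) order is assumed.
records = [
    ("Ann", "teacher", 60000),
    ("Bob", "nurse", 55000),
    ("Cy", "teacher", 65000),
]

# map: keep only what is needed, as (occupation, salary) pairs
pairs = [(occupation, salary) for _, occupation, salary in records]

# filter: retain only the 'teacher' pairs
teachers = [kv for kv in pairs if kv[0] == "teacher"]

# reduceByKey: sum the salaries that share the same key
totals = defaultdict(int)
for occupation, salary in teachers:
    totals[occupation] += salary

result = dict(totals)
print(result)
```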

Which of the following is NOT an advantage of using structured programming with SparkSQL dataframes compared to programming using the Spark RDD API?

  • Structured programming allows data to be cached in RAM.

Consider the censusSplit RDD from task 3 of lab 5. Suppose you are asked to find the maximum fnlwgt wage for each education category. Which of the following sets of RDD transforms/actions best solves the problem?

  • map, sortByKey, first
