Questions and Answers
Which of the following statements regarding data caching in Apache Spark is false?
- Caching only a part of an RDD has no performance benefits. (correct)
- Caching data is especially important for the performance of iterative programs.
- Caching reduces the amount of disk access and therefore speeds up query execution.
- RDDs in Apache Spark are only cached if you explicitly specify that you want the RDD to be cached.
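The statements about iterative programs and explicit caching can be illustrated with a small pure-Python toy model. This is only a sketch and not the Spark API: the `LazyDataset` class, its `computations` counter, and the `run_iterations` helper are hypothetical names invented for this illustration. It assumes a dataset that is rebuilt from its source every time it is used, unless caching was explicitly requested.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute          # function that rebuilds the data from its source
        self._cached = None              # materialized data, filled on first use after cache()
        self._cache_requested = False    # caching is opt-in
        self.computations = 0            # how many times the data was rebuilt

    def cache(self):
        # Only marks the dataset; nothing is materialized until the data is first used.
        self._cache_requested = True
        return self

    def collect(self):
        if self._cached is not None:
            return self._cached          # served from memory, no recomputation
        self.computations += 1
        data = self._compute()
        if self._cache_requested:
            self._cached = data
        return data


def run_iterations(dataset, n):
    """Simulate an iterative program that scans the same dataset n times."""
    total = 0
    for _ in range(n):
        total += sum(dataset.collect())
    return total


uncached = LazyDataset(lambda: [x * x for x in range(1000)])
cached = LazyDataset(lambda: [x * x for x in range(1000)]).cache()

uncached_total = run_iterations(uncached, 5)
cached_total = run_iterations(cached, 5)

print("rebuilds without cache:", uncached.computations)
print("rebuilds with cache:", cached.computations)
```

In this toy model both runs produce the same result, but the uncached dataset is rebuilt on every iteration while the cached one is rebuilt only once, which is the reason caching matters most for iterative workloads.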
Which of the following statements about the Parquet storage format is false?
- The Parquet storage format stores the schema with the data.
- Given a DataFrame with 100 columns, querying a single column is faster when the data is stored in the CSV format than when it is stored in the Parquet format. (correct)
- The Parquet storage format stores all values of the same column together.
- Given a DataFrame with 100 columns, querying a single column is faster when the data is stored in the Parquet format than when it is stored in the CSV format.
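The difference between row-oriented and column-oriented storage can be sketched with the Python standard library. This is a toy model, not the real Parquet file format: the in-memory `columnar` dictionary, the `"int"` type labels, and both reader functions are assumptions made for illustration. It counts how many field values each layout has to touch in order to read one column out of 100.

```python
import csv
import io

NUM_COLUMNS = 100
NUM_ROWS = 50
column_names = [f"c{i}" for i in range(NUM_COLUMNS)]
rows = [[r * NUM_COLUMNS + c for c in range(NUM_COLUMNS)] for r in range(NUM_ROWS)]

# Row-oriented text layout: every record holds all 100 fields side by side.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(column_names)
writer.writerows(rows)
csv_text = buffer.getvalue()

# Column-oriented layout: the schema travels with the data, and all values
# of one column are stored together.
columnar = {
    "schema": {name: "int" for name in column_names},
    "columns": {name: [row[i] for row in rows] for i, name in enumerate(column_names)},
}


def read_column_from_csv(text, name):
    """Parse every record in full, then keep only the requested field."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    index = header.index(name)
    values, fields_parsed = [], 0
    for record in reader:
        fields_parsed += len(record)      # all fields of the row had to be parsed
        values.append(int(record[index]))
    return values, fields_parsed


def read_column_from_columnar(store, name):
    """Jump straight to the requested column; other columns are never touched."""
    values = store["columns"][name]
    return list(values), len(values)


csv_values, csv_fields = read_column_from_csv(csv_text, "c42")
col_values, col_fields = read_column_from_columnar(columnar, "c42")

print("fields touched (row layout):", csv_fields)
print("fields touched (column layout):", col_fields)
```

Both readers return the same values, but the row layout has to parse every field of every record, whereas the column layout touches only the requested column.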
Which of the following statements is false?
- Executing queries using SparkSQL DataFrame and Dataset functions is at least as fast as using their RDD counterparts, and often faster.
- You can add columns to a DataFrame using the withColumn function.
- Datasets contain schemas whereas DataFrames do not contain schemas. (correct)
- After performing a self-join on a DataFrame, the resulting columns will contain duplicate column names.
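Why a self-join leaves duplicate column names, and how adding a column works conceptually, can be shown with a tiny pure-Python model. This is a sketch under stated assumptions and not the SparkSQL API: a table is modeled as a `(columns, rows)` pair, and the helpers `with_column`, `join`, and `alias` as well as the `employees` data are hypothetical names made up for this example.

```python
def with_column(table, name, fn):
    """Return a new table with one extra column computed from each row."""
    columns, rows = table
    new_rows = [row + [fn(dict(zip(columns, row)))] for row in rows]
    return columns + [name], new_rows


def join(left, right, left_key, right_key):
    """Inner join: the output simply concatenates the columns of both sides."""
    lcols, lrows = left
    rcols, rrows = right
    li, ri = lcols.index(left_key), rcols.index(right_key)
    joined = [l + r for l in lrows for r in rrows if l[li] == r[ri]]
    return lcols + rcols, joined


def alias(table, prefix):
    """Qualify every column name with a prefix so the two sides stay distinguishable."""
    columns, rows = table
    return [f"{prefix}.{c}" for c in columns], rows


employees = (
    ["id", "manager_id", "name"],
    [[1, None, "Ada"], [2, 1, "Grace"], [3, 1, "Edsger"]],
)

# Self-join without aliases: each column name appears twice in the result.
ambiguous_columns, _ = join(employees, employees, "manager_id", "id")

# Self-join with aliases: every column name is unique.
clear_columns, clear_rows = join(
    alias(employees, "e"), alias(employees, "m"), "e.manager_id", "m.id"
)

# Adding a derived column returns a new table and leaves the original untouched.
with_length = with_column(employees, "name_length", lambda row: len(row["name"]))

print(ambiguous_columns)
print(clear_columns)
```

Qualifying each side with its own prefix before joining is one way to keep the result unambiguous in this model.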
What is a benefit of using the partitionBy function in SparkSQL?
Which of the following statements about query optimisation in Spark is false?
Which of the following statements is false?