R Programming with Tidyverse Quiz
5 Questions

Questions and Answers

Which function in the tidyverse is used to restrict to certain observations in the data?

  • mean()
  • summarize()
  • group_by()
  • filter() (correct)

Which function in the tidyverse is used to restrict to certain columns in the data?

  • mean()
  • select() (correct)
  • summarize()
  • group_by()

What code would correctly construct a data frame with the average price per date from the Uber dataset?

  • uber %>% group_by(date) %>% summarize(meancost=mean(cost)) (correct)
  • uber %>% by(cost) %>% summarize(meancost=mean(date))
  • uber %>% group_by(date,cost) %>% summarize(meancost=mean(cost))
  • uber %>% group_by(date,ride_id) %>% summarize(meancost=mean(cost))

If you want to calculate the total cost over time using the Uber dataset, which approach would you NOT use?

  • uber %>% mutate(total_cost=sum(cost)) (correct, option D)
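The other answer options are not shown on this page, so the following is only a sketch of why mutate() with sum() is a poor fit here. It uses a hypothetical toy data frame (the column names ride_id, date, and cost are assumed from the quiz text), and the two alternatives are illustrative guesses at what "total cost over time" could mean, not the quiz's own options.

```r
library(dplyr)

# Hypothetical toy data; the real Uber dataset is not shown in the quiz.
uber <- data.frame(
  ride_id = 1:4,
  date    = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02")),
  cost    = c(10, 20, 30, 50)
)

# mutate() keeps every row, and sum(cost) is a single number,
# so the same grand total is repeated on every row.
with_total <- uber %>% mutate(total_cost = sum(cost))

# Assumed alternative 1: a running total in date order.
running <- uber %>% arrange(date) %>% mutate(total_cost = cumsum(cost))

# Assumed alternative 2: one total per date.
per_date <- uber %>% group_by(date) %>% summarize(total_cost = sum(cost))
```

Only the last two results change as the date changes; the mutate() version carries no information about time.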

Which function should you avoid using if you want to obtain the average cost per ride in the Uber dataset?

  • ungroup() (correct, option C)

Flashcards

Restricting observations (data)

Using the filter() function from the "dplyr" package to select specific rows in a data frame based on certain conditions.
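A minimal sketch of filter() on a hypothetical toy data frame. The columns ride_id, date, and cost are assumed from the quiz text, and the threshold of 15 is arbitrary.

```r
library(dplyr)

# Hypothetical toy data; the real Uber dataset is not shown in the quiz.
uber <- data.frame(
  ride_id = 1:4,
  date    = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02")),
  cost    = c(10, 20, 30, 50)
)

# Keep only the rows (observations) where cost exceeds 15.
# All columns are retained; only the set of rows changes.
expensive <- uber %>% filter(cost > 15)
```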

Restricting columns (data)

Using the select() function from the "dplyr" package to choose specific columns from a data frame.
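A minimal sketch of select(), again on a hypothetical toy data frame with column names assumed from the quiz text.

```r
library(dplyr)

# Hypothetical toy data; the real Uber dataset is not shown in the quiz.
uber <- data.frame(
  ride_id = 1:4,
  date    = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02")),
  cost    = c(10, 20, 30, 50)
)

# Keep only the date and cost columns.
# All rows are retained; only the set of columns changes.
date_cost <- uber %>% select(date, cost)
```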

Calculating average cost by date

Using group_by(date) and summarize(meancost = mean(cost)) to calculate the average cost for each unique date in a dataset (e.g., Uber rides).

Incorrect Uber Data Aggregation

Grouping by both "date" and "ride_id" does not give the average cost per date. If ride_id identifies a single ride, each (date, ride_id) group contains exactly one row, so the "mean" is just that ride's own cost. Grouping by "date" alone is the correct approach.
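The difference shows up in the number of rows each grouping returns. This is a sketch on hypothetical toy data that assumes ride_id is unique per ride. The .groups = "drop" argument (available in dplyr 1.0.0 and later) is added here only to return an ungrouped result and is not part of the quiz's code.

```r
library(dplyr)

# Hypothetical toy data; ride_id is assumed to be unique per ride.
uber <- data.frame(
  ride_id = 1:4,
  date    = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02")),
  cost    = c(10, 20, 30, 50)
)

# Grouping by date alone: one row per date.
by_date <- uber %>%
  group_by(date) %>%
  summarize(meancost = mean(cost))

# Grouping by date and ride_id: one row per ride,
# so each "mean" is the cost of a single ride.
by_ride <- uber %>%
  group_by(date, ride_id) %>%
  summarize(meancost = mean(cost), .groups = "drop")

nrow(by_date)  # number of distinct dates
nrow(by_ride)  # number of rides, the same as nrow(uber)
```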

Incorrect mean calculation

Base R's by() is not a dplyr grouping verb: it expects a data frame, grouping indices, and a function as arguments, so uber %>% by(cost) does not group the data as intended. That option also averages "date" rather than "cost". Inside a dplyr pipeline, use group_by() and summarize() for this kind of operation.

Study Notes

Question 1

  • To restrict a data frame to certain observations (rows) in the tidyverse, use dplyr's filter() function.

Question 2

  • To restrict a data frame to certain columns, use dplyr's select() function.

Question 3

  • The correct code to calculate the average cost per date is: uber %>% group_by(date) %>% summarize(meancost=mean(cost)).
  • This groups the data by date, then computes the mean of the cost column within each date, returning one row per date.
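The pipeline from Question 3 can be run on a small example. The data frame below is hypothetical (the real Uber dataset is not shown), with column names taken from the quiz's code.

```r
library(dplyr)

# Hypothetical toy data standing in for the Uber dataset.
uber <- data.frame(
  ride_id = 1:4,
  date    = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02")),
  cost    = c(10, 20, 30, 50)
)

# Split rows by date, then collapse each group to its mean cost.
avg_by_date <- uber %>%
  group_by(date) %>%
  summarize(meancost = mean(cost))

print(avg_by_date)
```

With this toy data the result has two rows: a mean cost of 15 for 2024-01-01 and 40 for 2024-01-02.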

Description

Test your knowledge of the Tidyverse in R programming with this quiz. It covers essential functions like filter() and select() and includes a question on calculating averages using group_by() and summarize(). Perfect for students looking to reinforce their data manipulation skills!
