R Programming and Statistical Challenge MCQs PDF
Nikolidaki Marina
Summary
This document presents a series of multiple-choice questions (MCQs) on R programming and statistical challenges. The questions cover topics such as sampling methods and hypothesis testing, providing practice for assessing skills in R programming.
Full Transcript
The R Programming and Statistical Challenge: MCQs to Test Your Skills

1. Simple Random Sampling is best described as:
a) Selecting every nth individual from a population
b) Selecting individuals based on specific characteristics
c) Randomly selecting individuals, giving each an equal chance of being chosen
d) Dividing the population into clusters and selecting a few for study
Answer: c) Randomly selecting individuals, giving each an equal chance of being chosen

2. Stratified Sampling is used to:
a) Increase the efficiency of sampling by selecting random samples from each subgroup
b) Select samples based on convenience or accessibility
c) Choose samples based on a predetermined quota
d) Randomly select samples without considering subgroups
Answer: a) Increase the efficiency of sampling by selecting random samples from each subgroup

3. Which of the following is a limitation of Simple Random Sampling?
a) It is complex to implement
b) It may not capture specific subgroups of interest
c) It is only suitable for small populations
d) It requires detailed prior knowledge of the population structure
Answer: b) It may not capture specific subgroups of interest

4. In Hypothesis Testing, the Null Hypothesis (H0) usually states that:
a) There is a significant effect or difference
b) There is no significant effect or difference
c) The sample data is biased
d) The observed effect is due to chance
Answer: b) There is no significant effect or difference

5. In R programming, which function is typically used for random sampling?
a) rnorm()
b) sample()
c) set.seed()
d) seq()
Answer: b) sample()

6. The 'set.seed()' function in R is used to:
a) Generate a sequence of numbers
b) Set the starting point for producing a sequence of random numbers
c) Sample a set number of observations from data
d) Create a variable
Answer: b) Set the starting point for producing a sequence of random numbers

7. Cluster sampling involves:
a) Dividing the population into groups and selecting all members from randomly chosen groups
b) Selecting every nth individual from a list
c) Randomly selecting individuals without any group division
d) Dividing the population into strata and randomly selecting from each stratum
Answer: a) Dividing the population into groups and selecting all members from randomly chosen groups

8. In hypothesis testing, a Type I error occurs when:
a) The null hypothesis is incorrectly rejected
b) The null hypothesis is incorrectly accepted
c) There is a mistake in data collection
d) The test statistic is miscalculated
Answer: a) The null hypothesis is incorrectly rejected

9. Which function in R is used for linear regression?
a) lm()
b) glm()
c) lapply()
d) regress()
Answer: a) lm()

10. For data visualization in R, which package provides extensive graphing capabilities?
a) dplyr
b) tidyr
c) ggplot2
d) shiny
Answer: c) ggplot2

11. In stratified sampling, strata are formed based on:
a) Random selection
b) Characteristics irrelevant to the study
c) Specific characteristics that are relevant to the research question
d) Geographic location only
Answer: c) Specific characteristics that are relevant to the research question

12. In R, which command is used to read a CSV file?
a) read.csv()
b) fileopen()
c) opencsv()
d) loadcsv()
Answer: a) read.csv()

13. The main advantage of stratified sampling over simple random sampling is:
a) It is easier to implement
b) It ensures representation of all subgroups
c) It requires less knowledge about the population
d) It is less time-consuming
Answer: b) It ensures representation of all subgroups

14. What is the primary use of the 't.test()' function in R?
a) To perform ANOVA
b) To compare means between two groups
c) To create plots
d) To perform cluster analysis
Answer: b) To compare means between two groups

15. Which of the following is not a principle of sampling?
a) Representativeness
b) Bias elimination
c) Maximum variability
d) Randomization
Answer: c) Maximum variability

16. In R, 'NA' represents:
a) A named argument
b) A non-applicable function
c) Missing or undefined data
d) Negative association
Answer: c) Missing or undefined data

17. Which sampling method is best for ensuring each subgroup within a population is represented?
a) Simple random sampling
b) Stratified sampling
c) Cluster sampling
d) Systematic sampling
Answer: b) Stratified sampling

18. A p-value in hypothesis testing signifies:
a) The probability that the null hypothesis is true
b) The probability of obtaining data at least as extreme as those observed, assuming the null hypothesis is true
c) The likelihood of a Type II error
d) The effect size of the test
Answer: b) The probability of obtaining data at least as extreme as those observed, assuming the null hypothesis is true

19. In R, which command is used for installing packages?
a) install.packages()
b) library()
c) require()
d) loadpackages()
Answer: a) install.packages()

20. The 'ggplot2' package in R is primarily used for:
a) Data manipulation
b) Statistical modeling
c) Data visualization
d) Machine learning
Answer: c) Data visualization

21. Simple random sampling ensures:
a) Each cluster is represented
b) Each stratum is represented
c) Each individual has an equal chance of being selected
d) The sample is convenient to collect
Answer: c) Each individual has an equal chance of being selected

22. In hypothesis testing, a Type II error occurs when:
a) The null hypothesis is falsely rejected
b) The null hypothesis is falsely accepted
c) The alternative hypothesis is falsely rejected
d) The alternative hypothesis is falsely accepted
Answer: b) The null hypothesis is falsely accepted

23. Which R function is used for generating normal distribution values?
a) rnorm()
b) runif()
c) rbinom()
d) rpois()
Answer: a) rnorm()

24. Which of the following is an advantage of cluster sampling?
a) It eliminates bias
b) It is cost-effective for large populations
c) It guarantees every subgroup is represented
d) It is the most accurate sampling method
Answer: b) It is cost-effective for large populations

25. In R, how do you view the structure of a dataset?
a) view()
b) str()
c) head()
d) summary()
Answer: b) str()

26. What is the primary purpose of ANOVA?
a) To compare means of two groups
b) To compare variances within groups
c) To compare means across more than two groups
d) To assess correlation between two variables
Answer: c) To compare means across more than two groups

27. What does 'ANOVA' stand for?
a) Analysis Of Variance
b) Aggregate Normalization Of Variants
c) Association Of Variable Analysis
d) Analytical Observation Of Variables
Answer: a) Analysis Of Variance

28. ANOVA is used to compare:
a) Two means
b) Two variances
c) Means of three or more groups
d) Variances of three or more groups
Answer: c) Means of three or more groups

29. Which of the following is a key assumption of ANOVA?
a) All groups are dependent
b) Normal distribution of data
c) All variables are qualitative
d) Data is collected through observation only
Answer: b) Normal distribution of data

30. In ANOVA, if the null hypothesis is true, then:
a) There is a significant difference between group means
b) There is no significant difference between group means
c) The p-value is less than the significance level
d) The F-statistic is equal to zero
Answer: b) There is no significant difference between group means

31. The `ggvis` package in R is used primarily for:
a) Data manipulation
b) Data visualization
c) Statistical modeling
d) Machine learning algorithms
Answer: b) Data visualization

32. Which of the following is a key feature of the `dplyr` package in R?
a) Creating interactive web applications
b) Data manipulation and transformation
c) Statistical tests
d) Generating reports
Answer: b) Data manipulation and transformation

33. In R, the `%%` operator is used for:
a) Matrix multiplication
b) Exponentiation
c) Integer division
d) Finding the remainder of division
Answer: d) Finding the remainder of division

34. A scatter plot in R can be created using which function?
a) plot()
b) scatter()
c) graph()
d) scatterplot()
Answer: a) plot()

35. In R, the `cor()` function is used to:
a) Calculate correlation
b) Conduct regression analysis
c) Plot data
d) Clean data
Answer: a) Calculate correlation

36. The `RMarkdown` framework is used for:
a) Statistical analysis
b) Data visualization
c) Creating dynamic documents
d) Database connections
Answer: c) Creating dynamic documents

37. In R, the `sapply()` function is a type of:
a) Loop
b) Conditional statement
c) User-defined function
d) Apply function
Answer: d) Apply function

38. What does a boxplot visually represent in data analysis?
a) Correlation between two variables
b) Distribution of a dataset
c) Linear regression model
d) Time series data
Answer: b) Distribution of a dataset

39. What is the purpose of the `hist()` function in R?
a) To create scatter plots
b) To generate histograms
c) To perform hypothesis testing
d) To compute correlations
Answer: b) To generate histograms

40. In R, which function is used for calculating the mean?
a) sum()
b) median()
c) mean()
d) average()
Answer: c) mean()

41. What does the function `sd()` compute in R?
a) Sum of data
b) Standard deviation
c) String distance
d) Sequential digits
Answer: b) Standard deviation

42. In R, what does the function `var()` compute?
a) Variance
b) Vector length
c) Variable type
d) Value assignment
Answer: a) Variance

43. What does the `summary()` function in R provide for a dataset?
a) A detailed analysis of each variable
b) A textual plot of the data
c) Summary statistics like Min, Max, Mean
d) A list of variables and functions used
Answer: c) Summary statistics like Min, Max, Mean

44. What type of plot does the `boxplot()` function in R create?
a) Histogram
b) Pie chart
c) Boxplot
d) Scatter plot
Answer: c) Boxplot

45. What is the main purpose of the `pairs()` function in R?
a) To pair columns and rows in a data frame
b) To create a matrix of scatter plots
c) To calculate pairwise correlation
d) To merge two datasets
Answer: b) To create a matrix of scatter plots

46. In R, the `rbind()` function is used for:
a) Binding vectors
b) Randomly shuffling data
c) Row-wise binding of matrices or data frames
d) Recursive binding
Answer: c) Row-wise binding of matrices or data frames

47. The `barplot()` function in R is used to create:
a) Bar charts
b) Line graphs
c) 3D plots
d) Heatmaps
Answer: a) Bar charts

48. In R, the `scale()` function is commonly used for:
a) Resizing plots
b) Standardizing variables
c) Scaling up computations
d) Scanning data frames
Answer: b) Standardizing variables

49. In R, the `pie()` function is used for creating:
a) 3D plots
b) Pie charts
c) Histograms
d) Network diagrams
Answer: b) Pie charts

50. The function `t.test()` in R is used for:
a) Time-series testing
b) Testing table structures
c) Performing t-tests for comparing means
d) Testing for data trends
Answer: c) Performing t-tests for comparing means

51. In R, a data.frame is:
a) A function for data analysis
b) A data structure for storing tabular datasets
c) A type of plot
d) An R package
Answer: b) A data structure for storing tabular datasets

52. The function prcomp() in R is used for:
a) Linear regression
b) Principal component analysis
c) Plotting data
d) Calculating p-values
Answer: b) Principal component analysis

53. In R, the apply() function is typically used for:
a) Applying a function to margins of an array
b) Plotting data
c) Database management
d) Machine learning
Answer: a) Applying a function to margins of an array

54. In R, the table() function is used for:
a) Creating pivot tables
b) Data frame creation
c) Generating contingency tables
d) Data visualization
Answer: c) Generating contingency tables

55. The shiny package in R is used to:
a) Clean data
b) Build interactive web applications
c) Conduct statistical tests
d) Generate reports
Answer: b) Build interactive web applications

56. The survival package in R is used for:
a) Survival analysis
b) Data manipulation
c) Creating plots
d) Machine learning
Answer: a) Survival analysis

57. In R, the dbplyr package is primarily used for:
a) Database interfacing with dplyr
b) Data visualization
c) Statistical modeling
d) Text analysis
Answer: a) Database interfacing with dplyr

58. What does a boxplot visually represent in data analysis?
a) Correlation between two variables
b) Distribution of a dataset
c) Linear regression model
d) Time series data
Answer: b) Distribution of a dataset

59. What is the main purpose of using the hist() function in R?
a) To create scatter plots
b) To generate histograms
c) To perform hypothesis testing
d) To compute correlations
Answer: b) To generate histograms

60. The barplot() function in R is used to create:
a) Bar charts
b) Line graphs
c) 3D plots
d) Heatmaps
Answer: a) Bar charts
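
Several of the base R functions named in the questions (set.seed(), sample(), str(), summary(), table(), t.test(), lm(), cor()) can be tried together in one short session. The following is only a minimal sketch, not part of the original quiz: it assumes the built-in iris dataset as example data, and the sample sizes (30 rows, 10 per stratum) and variable names are arbitrary choices made for illustration.

```r
# Minimal sketch using only base R and the built-in iris dataset
# (the dataset, sample sizes, and object names are illustrative assumptions).

set.seed(42)  # fix the starting point of the random number generator

# Simple random sampling: every row has the same chance of being chosen.
srs_rows <- sample(nrow(iris), size = 30)
srs <- iris[srs_rows, ]

# Stratified sampling: draw 10 rows at random from each Species stratum.
rows_by_species <- split(seq_len(nrow(iris)), iris$Species)
strat_rows <- unlist(lapply(rows_by_species, sample, size = 10))
strat <- iris[strat_rows, ]
table(strat$Species)  # contingency table: 10 rows per species by construction

# Inspect the structure and summary statistics of the stratified sample.
str(strat)
summary(strat$Sepal.Length)

# Two-sample t-test: compare mean sepal length between two species.
two_species <- droplevels(subset(iris, Species %in% c("setosa", "versicolor")))
tt <- t.test(Sepal.Length ~ Species, data = two_species)
tt$p.value

# One-way ANOVA: compare mean sepal length across all three species.
fit_aov <- aov(Sepal.Length ~ Species, data = iris)
summary(fit_aov)

# Linear regression and correlation between two numeric variables.
fit_lm <- lm(Petal.Width ~ Petal.Length, data = iris)
coef(fit_lm)
cor(iris$Petal.Length, iris$Petal.Width)
```

Because set.seed() is called first, rerunning the script reproduces the same samples; changing or removing the seed gives different rows each time, while the stratified sample still contains exactly 10 rows per species.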