Questions and Answers
Which statement best describes R's capability as a tool?
R is used only for reading files and does not perform any calculations.
False
What are the two main roles R serves as mentioned in the introduction?
Calculator and data analysis tool
In R, the term ______ refers to the various types of data that can exist within an object.
Match the following features of R with their descriptions:
How can R be utilized as a calculator?
What is the significance of functions in R?
What types of objects can be examined in R?
Describe the process of reading files in R.
What is one simple and useful function in R and its purpose?
Study Notes
Introduction to R
- R is a calculator
- R can evaluate complex mathematical expressions
- Variables are assigned using the <- operator (e.g., x <- 10)
- Basic arithmetic operators and functions are available (e.g., 1+1, sqrt())
- Functions for creating sequences and repeated values (seq(), rep())
- Functions for calculating absolute values (abs())
- Scientific notation shifts the decimal point (e.g., 1e2 is 100, 1e-2 is 0.01)
- Element-wise products are computed with * (e.g., a*b; u*v)
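The calculator features listed above can be tried directly at the R console. The following is a minimal sketch; the vectors a and b are illustrative values, not taken from the notes.

```r
# Assignment and basic arithmetic
x <- 10
x + 1                      # 11
sqrt(16)                   # 4

# Sequences and repeated values
seq(1, 10, by = 3)         # 1 4 7 10
rep(c(1, 2), times = 2)    # 1 2 1 2

# Absolute value and scientific notation
abs(-3.5)                  # 3.5
1e2                        # 100
1e-2                       # 0.01

# Element-wise product of two vectors of equal length (illustrative values)
a <- c(1, 2, 3)
b <- c(4, 5, 6)
a * b                      # 4 10 18
```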
Functions in R
- R has built-in functions for performing various tasks, including mathematical calculations.
- Functions can be used to perform simple arithmetic operations or complex analyses.
Reading Files in R
- read.delim() is used for tab-separated files (.txt); the default decimal separator is "."
- read.table() reads a file in tabular format and creates a data frame
- read.csv() reads comma-separated values files (.csv) into a data frame
- read.csv2() is used when the decimal separator is "," and the field separator is ";"
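As a rough illustration of how these readers differ, the sketch below writes three tiny temporary files and reads each one back. The file contents and column names (id, height) are made up for the example.

```r
# Comma-separated file with "." as the decimal separator
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,height", "1,1.75", "2,1.62"), csv_path)
df_csv <- read.csv(csv_path)

# Same data with ";" as field separator and "," as decimal separator
csv2_path <- tempfile(fileext = ".csv")
writeLines(c("id;height", "1;1,75", "2;1,62"), csv2_path)
df_csv2 <- read.csv2(csv2_path)

# Tab-separated version, read with read.delim()
txt_path <- tempfile(fileext = ".txt")
writeLines(c("id\theight", "1\t1.75", "2\t1.62"), txt_path)
df_txt <- read.delim(txt_path)

# read.table() is the general reader: header and separator are given explicitly
df_tab <- read.table(txt_path, header = TRUE, sep = "\t")

str(df_csv)   # all four data frames hold the same two columns
```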
Exploring Data
- View(x) displays the data frame's contents.
- head(x) shows the first 6 rows; head(x, n=n) shows the first n rows.
- tail(x) shows the last 6 rows; tail(x, n=n) shows the last n rows.
- names(x) shows the names of the variables in the data frame.
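A short sketch of these exploration functions, using the mtcars data set that ships with R as stand-in data (the notes do not name a data set):

```r
# mtcars comes with R's datasets package; it is used here only as sample data
x <- mtcars

head(x)          # first 6 rows
head(x, n = 3)   # first 3 rows
tail(x, n = 2)   # last 2 rows
names(x)         # variable (column) names

# View(x)        # opens a data viewer; only useful in an interactive session
```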
Types of data in R
- R supports several data types (numeric, character and logical) and several data structures (vectors, matrices and data frames).
Data Types
- Scalars: Single value (e.g., a <- 5)
- Vectors: Multiple values (e.g., v <- c(1,2,3))
- Matrices: Two-dimensional array of values (e.g., m <- matrix(v, 3, 2))
- Lists: Can contain varied data types (e.g., q <- list(a=v, b=x, c=u))
- Data frames: Table-like structure, typically used for storing data with different variable types.
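The examples above can be run together as follows. The notes do not define x and u, so the values given to them here are hypothetical, as is the small data frame d.

```r
a <- 5                       # scalar (in R, a vector of length 1)
v <- c(1, 2, 3)              # numeric vector
m <- matrix(v, 3, 2)         # 3 x 2 matrix; v is recycled to fill both columns

x <- "text"                  # hypothetical value, not defined in the notes
u <- c(TRUE, FALSE)          # hypothetical value, not defined in the notes
q <- list(a = v, b = x, c = u)   # list mixing numeric, character and logical

# Illustrative data frame with one numeric and one character column
d <- data.frame(id = 1:3, name = c("A", "B", "C"))

dim(m)        # 3 2
class(q$b)    # "character"
str(d)
```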
Data Structures
- Vectors store multiple values of the same type.
- Matrices are two-dimensional structures.
- Lists can contain elements of different data types.
- Data frames are tabular structures, organized into rows and columns.
Operations (Arithmetic)
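- 3 + exp(4) * 2^2 evaluates as 3 + (exp(4) * 4): exponentiation first, then multiplication, then addition.
- (3 + exp(4)) * 2^2 uses parentheses to force the addition before the multiplication.
- R follows operator precedence rules when evaluating expressions.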
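Evaluating the two expressions side by side shows the effect of the parentheses. The names without_parens and with_parens are introduced here only for illustration.

```r
without_parens <- 3 + exp(4) * 2^2     # 3 + (exp(4) * 4), about 221.39
with_parens    <- (3 + exp(4)) * 2^2   # (3 + exp(4)) * 4, about 230.39

without_parens
with_parens
with_parens - without_parens           # 9, because the 3 is now multiplied by 4
```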
Data Frames
- A table-like structure with different variable types.
Data Merging
- cbind() combines objects by columns.
- rbind() combines objects by rows.
- merge() joins two datasets based on a common variable.
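A minimal sketch of the three functions on two made-up data frames (left and right, sharing an id column):

```r
# Illustrative data; the names and values are assumptions for this example
left  <- data.frame(id = c(1, 2, 3), age = c(30, 41, 25))
right <- data.frame(id = c(2, 3, 4), smoker = c(TRUE, FALSE, TRUE))

by_cols <- cbind(left, score = c(7, 8, 9))             # adds a column
by_rows <- rbind(left, data.frame(id = 4, age = 52))   # adds a row
joined  <- merge(left, right, by = "id")               # keeps ids present in both: 2 and 3

joined
```

By default merge() keeps only the rows whose key appears in both inputs; its all, all.x and all.y arguments change that behaviour.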
Data Manipulation
- na.omit() removes rows with NA (missing) values.
- complete.cases() returns TRUE for each row with no missing values, so it can be used to filter out incomplete rows.
- apply(), colMeans(), rowMeans() and mean() are used to calculate summaries over rows, columns or vectors.
- summary() provides summary statistics for data frames or vectors.
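The sketch below applies these functions to a small made-up data frame with two missing values:

```r
# Illustrative data frame; values are assumptions for this example
df <- data.frame(x = c(1, NA, 3), y = c(4, 5, NA), z = c(7, 8, 9))

complete.cases(df)                 # TRUE FALSE FALSE
df[complete.cases(df), ]           # keeps only the first row
na.omit(df)                        # same result via na.omit()

colMeans(df, na.rm = TRUE)         # x = 2, y = 4.5, z = 8
rowMeans(df, na.rm = TRUE)
apply(df, 2, max, na.rm = TRUE)    # column maxima: 3 5 9
summary(df)
```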
Functions Creation
- User-defined functions are created in R using the function() syntax.
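As a small example of the function() syntax, the sketch below defines a hypothetical helper, coef_var, that returns the coefficient of variation of a numeric vector. The helper is not part of the notes.

```r
# Hypothetical helper: standard deviation divided by the mean
coef_var <- function(x, na.rm = TRUE) {
  sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

coef_var(c(2, 4, 6))   # 0.5
```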
Data Types in R (continued)
- Character data: Text data (e.g., name <- c("Ahmed", "Laila")).
- Logical data: TRUE/FALSE values (e.g., smoker <- c(TRUE, FALSE, FALSE)).
- Numeric data stores numbers.
- Ordering variables: sort(c(4,2,6))
- colnames(m)=paste("X",1:ncol(m),sep="") renames the columns.
- rownames(m)=1:nrow(m) renames the rows.
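These snippets can be run as written once a matrix m exists. The sketch assumes an illustrative 3 x 2 matrix, since the notes do not fix its contents here.

```r
name   <- c("Ahmed", "Laila")
smoker <- c(TRUE, FALSE, FALSE)
class(name)                           # "character"
class(smoker)                         # "logical"

sort(c(4, 2, 6))                      # 2 4 6
sort(c(4, 2, 6), decreasing = TRUE)   # 6 4 2

m <- matrix(1:6, nrow = 3)            # illustrative 3 x 2 matrix
colnames(m) <- paste("X", 1:ncol(m), sep = "")   # "X1" "X2"
rownames(m) <- 1:nrow(m)                          # "1" "2" "3"
m
```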
Data Preparation
- Data import from different formats: read.csv, read.table.
Missing Values Analysis
- is.na(dataNA) identifies missing values.
- sum(is.na(data)) gives the total number of missing values; apply(is.na(data), 2, sum) gives the count by column.
- Common functions include na.omit(), complete.cases() and colSums(is.na(data)) to find or drop missing values in a column or data set.
- Handling missing values includes replacing them by imputation (mean, median, etc.).
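A minimal sketch of counting and then imputing missing values. The data frame and its columns (age, bmi) are invented for the example, and simple mean or median imputation is only one of several possible strategies.

```r
# Illustrative data with three missing values
data <- data.frame(age = c(23, NA, 31, 40), bmi = c(NA, 22.5, NA, 27.1))

n_missing <- sum(is.na(data))          # total number of missing values: 3
per_col   <- colSums(is.na(data))      # age 1, bmi 2
apply(is.na(data), 2, sum)             # same counts via apply()

# Mean imputation for age, median imputation for bmi
data$age[is.na(data$age)] <- mean(data$age, na.rm = TRUE)
data$bmi[is.na(data$bmi)] <- median(data$bmi, na.rm = TRUE)

sum(is.na(data))                       # 0 after imputation
```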
Outliers
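- Identifying: boxplot(data).
- Removal: data[data > bench] selects the values above a benchmark bench; keeping the complementary condition (data[data <= bench]) drops them.
- Filtering data using calculated quartiles (Q1, Q3, IQR).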
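One common way to filter with quartiles is the 1.5 x IQR rule. The notes do not say which rule is intended, so the multiplier of 1.5 and the sample vector below are assumptions.

```r
x <- c(10, 12, 11, 13, 12, 11, 95)     # illustrative data; 95 is an obvious outlier

q1  <- unname(quantile(x, 0.25))       # 11
q3  <- unname(quantile(x, 0.75))       # 12.5
iqr <- q3 - q1                         # 1.5

# Fences at 1.5 * IQR beyond the quartiles (assumed rule)
lower <- q1 - 1.5 * iqr                # 8.75
upper <- q3 + 1.5 * iqr                # 14.75

outliers <- x[x < lower | x > upper]   # 95
cleaned  <- x[x >= lower & x <= upper] # the remaining six values
```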
Descriptive Statistics
- Calculating measures like mean, median, min, max, range, IQR, standard deviation, variance.
- Using functions such as mean(), median(), min(), max(), range(), IQR(), sd(), var().
- Determining the mode: Mode() from the DescTools library.
- summary() summarises numeric, logical and/or factor variables.
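The base R functions above can be tried on a small illustrative vector. Because DescTools may not be installed, the mode is computed here with table() instead of Mode(); that substitution is a choice made for this sketch, not something the notes prescribe.

```r
x <- c(2, 4, 4, 5, 7, 9)   # illustrative values

mean(x)      # about 5.17
median(x)    # 4.5
min(x)       # 2
max(x)       # 9
range(x)     # 2 9
IQR(x)
sd(x)
var(x)

# Mode without extra packages: the most frequent value(s)
freq   <- table(x)
mode_x <- as.numeric(names(freq)[freq == max(freq)])   # 4

summary(x)
```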
Tables and Plots
- Creating frequency tables, contingency tables (crosstabulations)
- Creating histograms to plot data (including specific quantitative or qualitative variable(s)).
- Constructing boxplots to identify data distribution.
- Using table(), chisq.test(), fisher.test(), oddsratio(), assocstats(), and cor().
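A short sketch using only base R functions from the list above. The smoker and sex vectors are made up, and fisher.test() is used instead of chisq.test() because the counts are too small for the chi-squared approximation. oddsratio() and assocstats() are left out because they require additional packages.

```r
# Illustrative categorical data
smoker <- c("yes", "no", "no", "yes", "no", "yes", "no", "no")
sex    <- c("m", "f", "f", "m", "m", "f", "f", "m")

table(smoker)                 # frequency table: no 5, yes 3
tab <- table(smoker, sex)     # 2 x 2 contingency table
tab

ft <- fisher.test(tab)        # exact test of association for small counts
ft$p.value

# Correlation between two numeric variables (mtcars ships with R)
r <- cor(mtcars$mpg, mtcars$wt)   # negative: heavier cars have lower mpg
r
```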
Association and Correlation
- Calculate and display correlations of variables.
- Create contingency tables for association analysis, using measures like the odds ratio, relative risk or the chi-squared test, with the vcd, vcdExtra and nnet libraries.
Data Manipulation & Visualisation
- Data manipulation using data.frame(), cbind(), rbind(), dplyr.
- Scatter plots: The plot() function can be used.
Additional Considerations:
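- Load the necessary libraries at the start of your script using the library() command.
- Use attach() to make variables in a data frame directly accessible.
- Handle data types appropriately when using specific functions (e.g., converting factors to numerical values with as.numeric()). Note that as.numeric() applied directly to a factor returns its internal level codes; as.numeric(as.character(f)) recovers the original numbers.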
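The factor conversion deserves a quick check, because as.numeric() on a factor does not return the labels as numbers. The factor f below is an illustrative example.

```r
f <- factor(c("10", "20", "10", "30"))   # levels: "10" "20" "30"

codes  <- as.numeric(f)                  # 1 2 1 3 (internal level codes)
values <- as.numeric(as.character(f))    # 10 20 10 30 (the original numbers)

codes
values
```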
Description
This quiz covers the basics of R programming, including its functions as a calculator, variable assignment, and file reading techniques. Learn about various built-in functions for mathematical operations and data manipulation in R. Test your knowledge on sequences, absolute values, and reading different file formats in R.