Questions and Answers
What is the name of the method introduced by John Tukey used in Exploratory Data Analysis?
Median Polish
What kind of table does Median Polish use to extract effects?
Two-way table
Median Polish is robust to outliers because it uses medians instead of means.
True (A)
Median Polish is more robust than ANOVA for identifying the significance of factors in a multifactor model.
How is Median Polish used to analyze data and identify the role of each factor?
Describe the goal of the first step in conducting Median Polish according to John Tukey.
What does the overall effect represent in the second step of Median Polish?
What is the name of the formula used to represent the fit for each cell in a two-way table with row 'i' and column 'j'?
What is the name of the formula used to represent the residual for each cell in a two-way table with row 'i' and column 'j'?
What is the name of the parameter used in the medpolish() function that allows you to control the maximum number of iterations?
The medpolish() function assumes the data is pre-formatted in a long form.
The eda_pol() function requires that the data be in long format.
What kind of output does the eda_pol() function provide?
What is the name of the argument used in the eda_pol() function to sort the output values by their effect size?
What kind of object is the df.pol object in R?
What are the three key output values stored in the df.pol object?
How do you access the global effect stored in the df.pol object?
How do you access the row effects stored in the df.pol object?
Flashcards
What is Median Polish?
Median polish is a method for analyzing two-way tables that helps identify the effect of row and column factors on a response variable.
How does median polish work?
The iterative process of median polish involves repeatedly subtracting row and column medians and recalculating medians until they converge to a stable value.
What is a 'row effect' in median polish?
The row effect represents the influence of a particular row on the response variable. For example, a positive row effect might indicate a higher response value for that row.
What is a 'column effect' in median polish?
What is a 'residual' in median polish?
What is the 'common value' in median polish?
What makes median polish robust?
When does median polish stop?
What is the goal of median polish?
How is a table cell represented in median polish?
What can we learn from median polish?
How can median polish be implemented in R?
Row effect
Column effect
Two-way table
Residual value
Common term
Robustness of Median Polish
Purpose of Median Polish
R function for Median Polish
Controlling Iterations in Median Polish
Applications of Median Polish
Components of Median Polish Analysis
Interpreting Median Polish Results
Importance of Understanding Median Polish
Fields of Application for Median Polish
Benefits of Using Median Polish
Study Notes
STT157 Exploratory Data Analysis (EDA)
- The course is about exploratory data analysis (EDA)
- The subject is Median Polish, a data analysis technique
Median Polish
- Median polish is a method introduced by John Tukey
- It's a simple and robust technique in EDA
- It's used to extract effects from a two-way table
- The method is robust to outliers because it uses medians instead of means for analysis
- It is more robust than ANOVA for examining the significance of the different factors in a multi-factor model
- It iteratively extracts row and column effects to characterize how each factor contributes to the expected value.
- The aim is to pinpoint each factor's role by successively subtracting the row and column effects.
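To make the robustness claim concrete, here is a minimal R sketch. The vectors x and x_out are made-up numbers for illustration, not course data. Replacing one value with an extreme outlier pulls the mean away, while the median stays where it was.

```r
# Hypothetical values, chosen only to illustrate the point
x     <- c(18, 19, 20, 21, 22)
x_out <- replace(x, 5, 220)   # same values, but the last one becomes an outlier

# Compare the two summaries before and after the outlier is introduced
c(mean = mean(x),     median = median(x))
c(mean = mean(x_out), median = median(x_out))
```

Because median polish builds every effect from medians, a single extreme cell in a two-way table tends to end up in the residuals rather than distorting the row or column effects.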
Steps for Conducting Median Polish
- Take the median of each row and record it next to the row. Subtract the row median from every value in that row.
- Compute the median of row medians, consider it the overall effect, and subtract this effect from each row median.
- Take each column's median, record beneath the column, and subtract each column median from the values in that particular column.
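- Compute the median of the column medians and add it to the current overall effect. Subtract that same median (the amount just added, not the updated overall effect) from each column median to obtain the column effects.
- Repeat the four steps above, adding each new row or column median to the effect already recorded, until the row and column medians no longer change.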
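The steps above can be sketched by hand in R. This is an illustrative sketch, not the course's code: the table tab, the function name polish_by_hand, and the passes argument are all hypothetical, and the loop runs a fixed number of passes instead of testing for convergence.

```r
# Hypothetical 3 x 3 two-way table (made-up numbers; 30 is a deliberate outlier)
tab <- matrix(c(10, 12, 15,
                14, 16, 20,
                11, 13, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

# polish_by_hand is an illustrative helper, not part of any package
polish_by_hand <- function(x, passes = 3) {
  overall <- 0
  row_eff <- setNames(rep(0, nrow(x)), rownames(x))
  col_eff <- setNames(rep(0, ncol(x)), colnames(x))
  res <- x
  for (k in seq_len(passes)) {
    # Step 1: take each row's median and subtract it from that row
    row_med <- apply(res, 1, median)
    res     <- sweep(res, 1, row_med)
    row_eff <- row_eff + row_med
    # Step 2: move the median of the row effects into the overall effect
    d       <- median(row_eff)
    overall <- overall + d
    row_eff <- row_eff - d
    # Step 3: take each column's median and subtract it from that column
    col_med <- apply(res, 2, median)
    res     <- sweep(res, 2, col_med)
    col_eff <- col_eff + col_med
    # Step 4: move the median of the column effects into the overall effect
    d       <- median(col_eff)
    overall <- overall + d
    col_eff <- col_eff - d
  }
  list(overall = overall, row = row_eff, col = col_eff, residuals = res)
}

out <- polish_by_hand(tab)
out
```

At every stage the original table equals the overall effect plus the row effect plus the column effect plus the residual, which gives a quick way to check the bookkeeping.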
Fit for Each Cell
- The fit for each cell (row i, column j) is the common term plus the row effect plus the column effect: fitᵢⱼ = common + rowᵢ + columnⱼ
- Residuals are the differences between the raw data and the fitted values: residualᵢⱼ = dataᵢⱼ - fitᵢⱼ
Model
- The resulting model is additive: the response variable (yᵢⱼ) is equal to the common value (μ) plus the row effect (αᵢ) plus the column effect (βⱼ) plus the residual (εᵢⱼ), that is, yᵢⱼ = μ + αᵢ + βⱼ + εᵢⱼ.
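The fit and residual formulas can be checked numerically. The sketch below uses R's built-in medpolish() (covered in the implementation notes further down) on a hypothetical table of made-up numbers, rebuilds the fit from the stored effects, and compares the hand-computed residuals with those the function returns.

```r
# Hypothetical two-way table (made-up numbers, not the course data)
tab <- matrix(c(10, 12, 15,
                14, 16, 20,
                11, 13, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

mp <- medpolish(tab, trace.iter = FALSE)   # suppress the iteration trace

# fit_ij = common + row_i + column_j
fit <- mp$overall + outer(mp$row, mp$col, "+")

# residual_ij = data_ij - fit_ij
res <- tab - fit

# The hand-computed residuals should agree with the stored ones
all.equal(res, mp$residuals, check.attributes = FALSE)
```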
Example (Infant Mortality)
- The example data present infant mortality by region and father's education level in the United States (1964-1966)
- Data is reported as the number of deaths per 1000 live births.
Stopping Iteration
- Iterate through the row and column smoothing operations until the medians of the row and column effects approach zero.
- Do not iterate indefinitely: Hoaglin et al. (1983) suggest that a few steps are sufficient.
Implementing Median Polish in R
- R has a built-in function, medpolish(), that implements median polish.
- Users can set the maxiter parameter to define the maximum number of iterations (10 by default). R does not pick a "best" number of iterations: it stops when the proportional reduction in the sum of absolute residuals falls below the tolerance eps, or when maxiter is reached.
- The data frame needs to be loaded first, as a two-way (wide) table (the lesson provides sample R code for this).
- The result of the medpolish() function is stored in an object, here named df.med.
- The console output from the medpolish() function shows the sum of absolute residuals at each step during the iteration.
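A minimal sketch of a medpolish() call follows. The table is hypothetical (made-up numbers standing in for the course data); the object name df.med follows the notes. The arguments maxiter, eps, and trace.iter and the components overall, row, col, and residuals are part of medpolish() in base R's stats package.

```r
# Hypothetical two-way table in wide (matrix) form
tab <- matrix(c(10, 12, 15,
                14, 16, 20,
                11, 13, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

# maxiter caps the number of iterations; eps is the convergence tolerance.
# trace.iter = TRUE (the default) prints the sum of absolute residuals
# at each iteration.
df.med <- medpolish(tab, maxiter = 5, eps = 0.01, trace.iter = TRUE)

df.med$overall     # common value
df.med$row         # row effects
df.med$col         # column effects
df.med$residuals   # what is left after removing the effects
```

If the iteration cap is reached before the tolerance is met, medpolish() issues a warning, which is a cue to inspect the residuals or raise maxiter slightly.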
Using eda_pol
- The tukeyedar package provides a custom function, eda_pol.
- eda_pol outputs the polished table as a graphic element, unlike the basic medpolish() function, and it requires the data in long format.
- Example R code demonstrating how to use eda_pol is provided in the lesson.
- The df.pol object contains the values of the polished table: the common (global) effect and the row and column effects.
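Since long format matters here, the following sketch shows one way to reshape a wide two-way table into long form using only base R. The table and the names region, level, and value are hypothetical. The eda_pol call itself is left out because its arguments are not given in these notes.

```r
# Hypothetical wide table; the dimnames are named so that the long
# version gets meaningful column names
tab <- matrix(c(10, 12, 15,
                14, 16, 20,
                11, 13, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(region = c("r1", "r2", "r3"),
                              level  = c("c1", "c2", "c3")))

# One row per cell: a row factor, a column factor, and the value
long <- as.data.frame(as.table(tab), responseName = "value")
head(long)
```

Each row of long identifies one cell of the original table, which is the shape a long-format function would expect to receive.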