Summary

This document summarizes Stata commands for graphical analysis (histograms, pie charts, bar charts, box plots, kernel density estimation, scatter plots, and graph matrices) and for bivariate inferential statistics (independent t-test, ANOVA, and chi-squared test). For each method it gives the syntax, what the method does, and the main options or how to read the results.

Full Transcript


GRAPHICAL ANALYSIS

histogram
Syntax: histogram XXX
What does it do? Provides the histogram of the variable listed after the command. It is the graphical representation of the variable represented in a one-way table (e.g. tab XXX).
Options: bin(#) sets the number of bins to #; density draws the histogram as density (default); fraction draws the histogram as fractions; frequency draws the histogram as frequencies; percent draws the histogram as percentages; normal adds a normal density to the graph.

pie chart
Syntax: graph pie, over(YYY)
What does it do? Pie charts are alternative ways of representing the shares of categorical variables. They are often used but sometimes misused. Moreover, they are less helpful than histograms.
Options: plabel() specifies the labels to report on the slices. Within the brackets, percent adds the percentage to each slice; name adds the label attached to each value; sum adds the frequency of each slice.

bar chart
Syntax: graph bar XXX YYY
What does it do? Compares characteristics of a variable (e.g. mean, median, standard deviation, etc.) across different categories of a categorical variable.
Options: over() reproduces the different bars in the same graph; by() reproduces the different representations per category in separate graph areas.

box plot
Syntax: graph box XXX
What does it do? Provides information on the central tendency and on the spread of a variable (or more variables), and on the skewness and tails. It summarizes and visualizes the key statistical properties of a variable.
Options: over() draws box plots for specific subsamples defined on the basis of the values taken by the categorical variable written in parentheses. If noout is added, no outlier will appear in the graph. If one wants horizontal box plots, the command graph hbox must be used.

kdensity
Syntax: kdensity XXX
What does it do? Kernel density provides a smooth, continuous representation of the variable's distribution (unlike the histogram, which divides data into discrete bins).
Options: the option normal adds a normal density to the kdensity graph. It is also possible to superimpose two densities. In that case, the option to use is addplot(). Within the brackets one must specify the other density to be drawn.

scatter plot
Syntax: scatter XXX YYY
What does it do? Plots the relationship between two continuous variables. It is useful for examining bivariate distributions.
Options: lfit XXX YYY fits and plots a linear regression on the data, which makes the scatter plot easier to interpret. If lfitci is used, the confidence interval is shown around the regression line. || is used to overlay two or more graphs. by() can also be used to distinguish the scatter plot by categories.

graph matrix
Syntax: graph matrix vars
What does it do? Generates a matrix of scatter plots for several variables.
Options: the option half produces only the lower triangle of the matrix (which is symmetric to the upper part).

BIVARIATE INFERENTIAL STATISTICS

independent t-test
Which kind? CONT + DUMMY
What does it do? The independent t-test is used to test whether the means of a certain variable in two independent samples significantly differ from each other.
Syntax: ttest XXX, by(YYY)
Results: if the p-value reported from a t-test is less than 0.05, the result is said to be statistically significant. If the p-value is greater than 0.05, the result is not significant.
Where in the output: in the middle, e.g. Pr(|T| > |t|) = 0.6802

analysis of variance (ANOVA)
Which kind? CONT + CAT
What does it do? When there appear to be some differences among group means (seen using tabstat), an ANOVA is performed to find out whether these differences are statistically significant.
Syntax: anova XXX YYY
Results: if the F statistic is higher than the critical value (the value of F that corresponds to your alpha value, usually 0.05), the difference among groups is deemed statistically significant.
Where in the output: in the upper right part

chi-squared test
Which kind? CAT + CAT
What does it do? The chi-squared test tests the relationship between two categorical variables.
Syntax: tab XXX YYY, col chi2 nofreq
Results: if the p-value reported from the chi-squared test is less than 0.05, the result is said to be statistically significant. If the p-value is greater than 0.05, the result is not significant.
Where in the output: in the bottom right part, e.g. Pr = 0.000

Answer: if the p-value is < 0.05, I can reject H0; if it is > 0.05, I cannot reject H0.
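As a rough illustration, the commands summarized above can be tried in a short do-file. This is only a sketch under stated assumptions: it uses Stata's bundled auto dataset, and the variables price, mpg, weight, foreign, and rep78 are stand-ins for XXX and YYY that do not come from the sheet itself. nooutsides is assumed to be the full spelling of the option the sheet calls noout.

```stata
* Sketch only: the dataset and variable names are illustrative assumptions.
sysuse auto, clear

* --- Graphical analysis ---
* Histogram with 10 bins, drawn as percentages, with a normal density overlaid
histogram price, bin(10) percent normal

* Pie chart of a categorical variable, labelling every slice with its percentage
graph pie, over(rep78) plabel(_all percent)

* Bar chart of the mean of a continuous variable across categories
graph bar (mean) price, over(foreign)

* Box plots by category without outside values, then the horizontal version
graph box price, over(foreign) nooutsides
graph hbox price, over(foreign)

* Kernel density with a normal density, then two superimposed densities
kdensity price, normal
kdensity price if foreign == 0, addplot(kdensity price if foreign == 1)

* Scatter plot overlaid with a linear fit, then with its confidence interval
twoway scatter price mpg || lfit price mpg
twoway lfitci price mpg || scatter price mpg

* Lower triangle of a scatter plot matrix
graph matrix price mpg weight, half

* --- Bivariate inferential statistics ---
* Inspect group means first, then test them (CONT + DUMMY)
tabstat price, by(foreign) statistics(mean sd n)
ttest price, by(foreign)
display "two-sided p-value: " r(p)

* One-way ANOVA (CONT + CAT)
anova price rep78

* Chi-squared test with column percentages only (CAT + CAT)
tabulate foreign rep78, column chi2 nofreq
display "chi-squared p-value: " r(p)
```

Each displayed p-value can then be read against the 0.05 threshold in the same way as the Answer rule in the sheet.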
