Applied Econometrics Lecture 1

Applied Econometrics
Lecture 1
Autumn 2023
D.Sc. (Econ.) Elias Oikarinen
Professor (Associate) of Economics
University of Oulu, Oulu Business School

This lecture handout
• Introduction to the course
• Motivation
• Basics of the linear regression model
• Recap of what you have already learned in earlier courses

Aim of the course
The main aim is to give students an understanding of the basics of applied econometrics and to lower the threshold for applying the basic time series, cross-sectional, and panel data estimation techniques, by studying these techniques through examples and practical applications.

Aim of the course
The aim is thus to give students an understanding that helps with e.g.
• Applying econometric methods in their Master’s thesis, including appropriate interpretation and reporting of the empirical results.
• Correctly interpreting econometric analysis and results reported in the scientific (especially economics and finance) literature.
• Using econometric methods at work (i.e. conducting research/analysis using econometric methods) to provide valuable information for decision-making.
• Independently deepening their understanding and finding more information on various econometric models & topics.

Learning Outcomes
Upon completion of the course, a student is able to
• analyze econometric modeling problems
• perform appropriate econometric analyses
• report the empirical results in a coherent manner (learning from articles!)
related to the empirics of economics and/or finance
• Student has basic knowledge of the various econometric techniques
  • For cross-sectional data
  • For time series data
  • (For panel data)
  • Including knowledge of the potential complications with the techniques when applied to real-life data

Contents
• Various econometric techniques used for cross-sectional, time series and panel data (panel data mainly for self-study)
• The very basics of the underlying econometric theory
• Main focus on applications and intuition
• Emphasis on time series methods

General on the Course
• Learning activities and teaching methods: approx. 24 hours of lectures and approx. 12 hours of empirical exercises
  • I might be able to schedule some time for actual data exercises within the lectures, too
  • For this, self-study (before lectures) is important!
• Independent studying(!!) and completing assignment(s): 124 hours (+ the empirical exercise videos, approx. 12 h).
• Target group: Second year Master’s students in Economics and Finance
• Prerequisites and co-requisites: 806116P Basic Methods in Statistics (in Finnish: 806116P Tilastotiedettä kauppatieteilijöille) or elementary knowledge of statistics and probability theory, and 721066S Principles of Econometrics

General Cont’d
• Working life cooperation: The course provides students with various econometric techniques that are of both academic and vocational value. The obtained skills can be applied e.g. in providing valuable information for decision-making in both public and private sectors, and in providing quantitative insights into economic and financial problems.
• NB. This course is an alternative to the course 721954S Financial Econometrics. If you have already taken Financial Econometrics, you cannot include 721351S Applied Econometrics in your degree.
POLL

Learning Material
• Lecture handouts
  • NOTE: the handouts only provide the core structure for lectures – students are expected to independently make additional notes.
• Textbooks – including self-study before lectures!
• Additional material that is provided in Moodle during the course, e.g.:
  • Data for exercises
  • Research articles
Tip (e.g. for thesis work): it is always useful to read and learn from scientific articles in which the econometric method of interest is applied, the question of interest is studied, and/or the data of interest is analyzed.

Textbooks
• Valuable help in the learning process
• Pre-readings before lectures (list in Moodle)
Relevant parts from:
• Gujarati, “Econometrics by Example”, 1st edition or newer
  • General book of econometrics, with strong applied emphasis
  • NOTE: the sections and pages of Gujarati’s book referred to in the lecture handouts reflect the 1st edition
• Brooks, “Introductory Econometrics for Finance”
  • Especially for financial econometrics
  • NOTE: the sections and pages of Brooks’ book referred to in the lecture handouts reflect the 3rd edition
  • Available in the Uni of Oulu library

Textbooks
• Relevant parts from:
• Enders, “Applied Econometric Time Series”
  • Excellent for time series econometrics
  • NOTE: the sections and pages of Enders’s book referred to in the lecture handouts reflect the 4th edition

Textbooks
NOTE: You are always expected to independently read through the relevant parts of the book(s) (even if not mentioned as pre-reading). “Relevant parts” = the parts of a given book that are mentioned in the lecture handouts regarding each topic.
Hint: it might be possible to find e-versions of the textbooks – you can share information among students on potential sources.

Assessment Methods and Criteria
• Passing the course requires acceptable completion of the major end-of-course learning assignment
  • This is a demanding task and requires a lot of effort!
  • Very hard to complete successfully if one does not put substantial effort into the course (lectures, books, empirical exercises) regularly during the course!!
• Short e-exam
• Both need to be completed successfully in order to pass the course
• Evaluation/grading: 0-5; 70% weight for the assignment, 30% for the exam
• As is typical, it requires substantial effort to get a good grade

Lectures
Lecture handouts will be updated to 2023 versions in Moodle at the latest in the morning before each lecture.
For successful completion of the course, participation in the lectures is strongly recommended:
1) The lectures and textbooks only partly overlap, i.e. they are also complementary.
2) The lecture handouts only provide the core structure on which the lectures are "built", i.e. they do not cover all details and issues discussed in the lectures.
3) The lectures concentrate on explaining key issues from the point of view of the learning aims of the course. Further information is provided in the textbooks and in the lecture handouts.
Lecture handouts include material that will not be discussed in detail, or at all, during the lectures, but that provides additional info also discussed in the textbooks – summary info aiming to help/guide you.

Lectures; Pre-Reading
IMPORTANT: Often there is an allocated pre-reading that should be read before the lecture.
In addition to enhancing your learning process, this allows us to spend a bit less time on the basics (i.e. to concentrate on repeating key issues regarding the topic, rather than going through all the details & basics explained in the textbook).
This gives us more time to concentrate on working on empirical examples and illustrations!

Empirical Exercises
• Application of econometric methods covered in the lectures to real-life data.
• The aim is to give the students the ability to independently apply the econometric methods, and to understand the many complications of real-life data.
• Very important also in terms of successfully completing the course assignment(s).
• Each week, home exercises may be assigned to students. Whether a student has done the exercises will not be checked, but completing the empirical tasks is highly recommended to enhance learning during the course. The solutions for these exercises are then shown the next week (or earlier).

Exercises, cont’d
The exercises are delivered as video recordings so that you can go through the tasks in your own time and at a pace that suits you.
The program used by the lecturer in the exercises is EViews (a student can use different software as well, if (s)he wishes, but instructions are provided using EViews).
About EViews:
• Menu based, but also lets you write your own code.
• Easy to use – enables the student to concentrate on the substance (i.e. learning to apply the methods and interpret the results) rather than spending a lot of time learning how to write code.
• Works well for cross-section and panel data, too, but especially handy for time series data

About EViews
Each student should install EViews (or another statistical package they aim to use during the course) on their laptop to be able to do the exercises and the assignment.
EViews documentation is included in Moodle in the Exercise Material folder, so that an interested student can get guidance with EViews already before the exercise classes (videos), or look for additional information on EViews, e.g. to conduct a nice econometric analysis for a Master’s thesis.
However, you DO NOT necessarily need to use the EViews documentation if you do not wish to: the aim of the exercise videos is to provide you with the basic EViews capabilities needed to successfully complete the major course assignment.

About EViews
Students can use EViews through the salava2.student.yo.oulu.fi remote server. You can find instructions on how to install the VPN connection etc. in order to use EViews through the server here: https://www.oulu.fi/ict/remote.
EViews is not mentioned among the other programs on that web page, but it should still be available. This also enables you to save workfiles.
Another option (with restrictions on data size and saving opportunities) is a free student version, available at: http://www.eviews.com/download/student10/

Timetable, Week I
Lecture timetable together with preliminary contents:
Note: we may conduct some empirical illustrations with EViews within some of the face-to-face sessions as well.
All lectures in SÄ118 unless mentioned otherwise
Mon 04.09., 12:15 to approx. 14:00
  Introduction & Recap of the basics of the linear regression model
Wed 06.09., 12:15 to approx. 14:00
  Regression diagnostics & complications (incl. cross-section regression example)
Empirical exercises (video link in Moodle)
  - Learning the very basics of using EViews
  - Cross-sectional regression (estimation, diagnostics, interpretation)

Timetable, Week II
Lecture timetable together with preliminary contents:
Mon 11.09., 12:15 to approx. 14:00
  Endogeneity & instrumental variable regression
Tue 12.09., 12:15 to approx. 14:00
  IV regression, cont’d
Empirical exercises (video link in Moodle)
  Endogeneity & instrumental variable regression

Timetable, Week III
Lecture timetable together with preliminary contents:
Mon 18.09., 12:15 to approx. 14:00
  Basics of time series econometrics
Tue 19.09., 12:15 to approx. 14:00
  Autoregression & autoregressive integrated moving average (ARIMA) models
Empirical exercises (video link in Moodle)
  ARIMA models

Timetable, Week IV
Lecture timetable together with preliminary contents:
Mon 26.09., 12:15 to approx. 14:00
  Modelling volatility: GARCH models
Tue 27.09., 12:15 to approx. 14:00
  Forecasting very briefly (only some key issues & examples during the lecture – pre-reading important!)
  Stationarity & Unit root
Empirical exercises (video link in Moodle)
  GARCH models; Forecasting with ARMA & GARCH models

Timetable, Week V
Lecture timetable together with preliminary contents:
Mon 02.10., 12:15 to approx. 14:00; L7 (note!)
  Cointegration
  Vector autoregressive (VAR) model
Tue 03.10., 12:15 to approx. 14:00
  Vector autoregressive (VAR) model (cont’d)
  Vector error-correction model (VECM) very briefly (the basic idea)
Empirical exercises (video links in Moodle)
  Unit root; Cointegration; VAR

Timetable, Week VI
Lecture timetable together with preliminary contents:
Mon 9.10., 12:15 to approx. 14:00
  Vector autoregressive (VAR) model (cont’d)
  Vector error-correction model (VECM) very briefly (the basic idea)
  Logit & Probit models (started)
Tue 10.10., 12:15 to approx. 14:00
  Logit & Probit models (started)
  The basic idea of panel regressions
  Special topics very briefly
Empirical exercises (video links in Moodle)
  VAR; Probit & Logit
  (There are additionally panel exercises – these are outside the scope of this course, but are included in Moodle to help an interested student)

Major Assignment
• Assignment deadline: Wed 8.11. at 16:00
• Passing the course requires successful completion of the assignment
• Independent work in pairs
• Application of the methods learned during the course to data assigned by the lecturer
  • Similar analyses as in the exercise classes
• Probably some room for students to choose between different kinds of data and methods
  • E.g.
whether one wishes to conduct part of the analysis with relatively high-frequency financial data, or (other) economic data with lower frequency
• Detailed instructions are provided later on during the course

On Econometrics
• Econometrics is concerned with developing and applying quantitative or statistical methods to the study and elucidation of economic principles
• Econometrics combines economic theory with statistics to analyze and test economic relationships
• Theoretical econometrics considers questions about the statistical properties of estimators and tests
• Econometrics is an extremely wide area: not even the Economics Nobel Prize winners who are econometricians master the whole range (or even come close)
  • Time series econometrics alone includes a huge number of different estimators and topics
• While it is useful to know the underlying statistical theory, this course concentrates on APPLYING econometric methods

On Applied Econometrics
• Applied econometrics = the application of econometric methods to real-world data to
  • Assess economic theories
  • Study relationships between economic (incl. financial) variables
  • Analyze economic history (i.e. explain observed economic events / phenomena in the past)
  • Investigate the dynamics of the economy & economic variables
  • Examine the effects of policy changes on various variables
  • Forecast

On Applied Econometrics
• Academic applied econometric analyses are often highly policy relevant (both for public authorities and private businesses)
• Of both academic and business-oriented value
• Examples of business-oriented uses: e.g. prediction models for sales volume / price development / asset returns, detecting mispricings etc.
  → Support for decision-making in e.g. financial institutions, consultancy companies, price setting for consumer products, public organizations etc.
• What is a major complication in applied econometrics?
On Applied Econometrics
• Real-life data often includes complications w.r.t. theoretical assumptions
• Theoretical econometrics (mathematics) is precise; in applied analysis we get probabilities, estimates with confidence bands, etc.
• Four principles of applied econometric modeling (G. Box)
  • All models are wrong
  • …But some models are useful
  • …And some models are more useful than others
  • Simple models are generally preferred to more complicated ones (“Occam’s razor”)

Why study Econometrics?
• Provides a set of analytical tools that are useful to economics and finance students as well as students in other areas
• Econometric methods are used to analyze practical business and planning problems, ranging from federal government taxation plans to small business sales campaigns.
• Quantitative analytical skills are highly valued in the workplace: knowledge of econometric or quantitative methods often improves job prospects.
+ It is great fun to dig into real-life data in detail and to be able to analyze that data in a useful manner to provide interesting findings, do forecasts etc.

THE LINEAR REGRESSION MODEL: AN OVERVIEW & RECAP
(Gujarati, Part I, Ch. 1, or Brooks Ch. 3-4)

THE LINEAR REGRESSION MODEL (LRM)
➢ The general form of the LRM is: $Y_i = B_1 + B_2 X_{2i} + B_3 X_{3i} + \dots + B_k X_{ki} + u_i$
➢ Or, as written in short form: $Y_i = BX + u_i$
➢ Y is the regressand, X is a vector of regressors, and u is an error term.
➢ The subscript i denotes the ith observation
Damodar Gujarati, Econometrics by Example, second edition

On the Meaning of Linear Regression
• Refers to linearity in the regression coefficients, the Bs, and not linearity in the Y and X variables
• For instance, the Y and X variables can be logarithmic (e.g. $\ln X_2$), or reciprocal ($1/X_3$) or raised to a power (e.g. $X^2$)
• Linearity in the B coefficients means that they are not raised to any power (e.g. $B_1^2$) or divided by other coefficients (e.g.
$B_2/B_3$) or transformed, such as $\ln B_4$
• There are occasions where regression models that are not linear in the regression coefficients are considered (non-linear econometrics; outside the scope of this course)

POPULATION (TRUE) MODEL
$Y_i = B_1 + B_2 X_{2i} + B_3 X_{3i} + \dots + B_k X_{ki} + u_i$
➢ This equation is known as the population or true model.
➢ It consists of two components:
➢ (1) A deterministic component, BX (the conditional mean of Y, or E(Y|X)).
➢ (2) A nonsystematic, or random, component, $u_i$.

REGRESSION COEFFICIENTS
➢ $B_1$ is the intercept.
➢ $B_2$ to $B_k$ are the slope coefficients.
➢ Collectively, they are the regression coefficients or regression parameters.
➢ Each slope coefficient measures the (partial) rate of change in the mean value of Y for a unit change in the value of a regressor, ceteris paribus.
➢ Any causal interpretation of the relationship between Y and the Xs should be based on the relevant theory

SAMPLE REGRESSION FUNCTION
➢ The sample counterpart is: $Y_i = b_1 + b_2 X_{2i} + b_3 X_{3i} + \dots + b_k X_{ki} + e_i$
➢ Or, as written in short form: $Y_i = bX + e_i$, where e is a residual.
➢ The deterministic component is written as: $\hat{Y}_i = b_1 + b_2 X_{2i} + b_3 X_{3i} + \dots + b_k X_{ki} = bX$
➢ $Y_i = \hat{Y}_i + e_i$
  Actual value = estimated (deterministic) value + residual

Terminological issues
• Alternative way to put it:
  • Regressand = dependent variable (= response variable)
  • Regressor = independent variable = explanatory variable
• $b_1$ (or $B_1$) is often denoted as $b_0$ (or $B_0$)
• The deterministic component can be thought of as the expected value of Y given X [i.e. E(Y|X)]
• The b coefficients are the estimators of the (true) B coefficients

THE NATURE OF DATA
➢ Cross-Section Data
➢ Data on one or more variables collected at the same point in time.
➢ Examples are the census of population conducted by the Census Bureau every 10 years, opinion polls conducted by various polling organizations, and temperature at a given time in several places.

THE NATURE OF DATA
➢ Time Series Data
➢ A set of observations that a variable takes at different times, such as daily (e.g., stock prices), weekly (e.g., money supply), monthly (e.g., the unemployment rate), quarterly (e.g., GDP), annually (e.g., government budgets), quinquennially or every five years (e.g., the census of manufactures), or decennially or every ten years (e.g., the census of population).

THE NATURE OF DATA
➢ Panel, Longitudinal or Micro-panel Data
➢ Combines features of both cross-section and time series data.
➢ The same cross-sectional units are followed over time.
➢ Panel data is a special type of pooled data (pooled data simply combines time series and cross-sectional observations, and the same cross-sectional units are not necessarily followed over time).

Macro Data
• Macro data: data at the national level
  • E.g. GDP, inflation rate, etc.
  • Often also refers to market-level data; e.g. housing price index for Oulu, OMX Helsinki index returns, etc.
• Macro econometrics
  • Tools and techniques needed to model aggregate economic data, e.g. unemployment, wages, prices
  • In practice, typically based on the methods of time series econometrics

Micro Data
• Micro data: data on the characteristics of units of a population, such as individuals, households, or establishments
  • E.g. income at the household level, data on individual housing transactions
• Conventionally (in the past) largely correlation based, with complications regarding causality
• More recently, approaches aiming to identify causal relations: difference-in-differences (diff-in-diff), regression discontinuity design (RDD)
• Economics Nobel prize for Joshua D. Angrist, David Card, and Guido W.
Imbens in 2021 (https://www.nber.org/news/joshua-angrist-david-cardand-guido-imbens-awarded-2021-nobel-prize)

METHOD OF ORDINARY LEAST SQUARES
➢ The method of Ordinary Least Squares (OLS) does not minimize the sum of the error terms, but minimizes the error sum of squares (not to be confused with the explained sum of squares, which is also abbreviated ESS in the goodness-of-fit slides):
  $\sum u_i^2 = \sum (Y_i - B_1 - B_2 X_{2i} - B_3 X_{3i} - \dots - B_k X_{ki})^2$
➢ To obtain values of the regression coefficients, derivatives are taken with respect to the regression coefficients and set equal to zero.

CLASSICAL LINEAR REGRESSION MODEL
➢ Assumptions of the Classical Linear Regression Model (CLRM):
➢ A-1: Model is linear in the parameters.
➢ A-2: Regressors (RHS variables) are fixed or nonstochastic.*
➢ A-3: Given X, the expected value of the error term is zero, or $E(u_i|X) = 0$.
* In the sense that their values are fixed in repeated sampling

CLASSICAL LINEAR REGRESSION MODEL
➢ A-4: Homoscedastic, or constant, variance of $u_i$, or $var(u_i|X) = \sigma^2$.
➢ A-5: No autocorrelation, or $cov(u_i, u_j|X) = 0$, $i \neq j$.
➢ A-6: No multicollinearity, or no perfect linear relationships among the X variables.
➢ A-7: No specification bias.

GAUSS-MARKOV THEOREM
➢ On the basis of assumptions A-1 to A-7, the OLS method gives best linear unbiased estimators (BLUE):
➢ (1) Estimators are linear functions of the dependent variable Y.
➢ (2) The estimators are unbiased; in repeated applications of the method, the estimates on average equal the true values.
➢ (3) In the class of linear unbiased estimators, OLS estimators have minimum variance; i.e., they are efficient, or the “best” estimators. (→ the true parameter values can be estimated with the least possible uncertainty; an unbiased estimator with the least variance is called an efficient estimator)
➢ For more details, see Gujarati 1.4 (pp. 8-10).
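For the two-variable case $Y_i = B_1 + B_2 X_i + u_i$, setting the derivatives of the sum of squared residuals to zero gives well-known closed-form estimators. The following is a minimal Python sketch of that calculation (the course itself works in EViews; the function name `ols_simple` and the data are made up purely for illustration). It also checks the two first-order conditions: the residuals sum to zero and are orthogonal to X.

```python
from statistics import fmean

def ols_simple(x, y):
    """OLS for Y = b1 + b2*X + e, using the closed-form solution of the normal equations."""
    x_bar, y_bar = fmean(x), fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx           # slope estimate
    b1 = y_bar - b2 * x_bar    # intercept estimate
    return b1, b2

# Hypothetical illustration data (not from the course material)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b1, b2 = ols_simple(x, y)
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]

print(f"b1 = {b1:.3f}, b2 = {b2:.3f}")
# First-order conditions of the minimisation problem (both should be ~0):
print(f"sum of residuals     = {sum(residuals):.2e}")
print(f"sum of X * residuals = {sum(xi * ei for xi, ei in zip(x, residuals)):.2e}")
```

With more than one regressor the same conditions are solved as a system of k equations, which is what a package such as EViews does internally.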
Violations of the assumptions
➢ Often one or more of the assumptions are violated.
➢ This does not mean that OLS could not be used at all.
➢ In many cases, there are techniques to deal with the violations, or at least to diminish their biasing influence.
➢ These will be discussed especially in lecture (handout) II.
➢ Also considered within many of the topics later during the course.

Hypothesis testing
• We do not know the true model – instead, the aim is to estimate a model that reflects the true one
• That is, the estimated parameters are only estimates of the true coefficients, not the exact true values
• OLS estimators, the bs, are random variables, for their values will vary from sample to sample
• Hence, we need statistical tests and confidence intervals in order to draw conclusions / derive implications

HYPOTHESIS TESTING: t TEST
➢ To test the following hypothesis:
  $H_0: B_k = 0$
  $H_1: B_k \neq 0$
  we calculate the following and use the t table to obtain the critical t value with n-k degrees of freedom for a given level of significance (or α, equal to 10%, 5%, or 1%):
  $t = \frac{b_k}{se(b_k)}$
  If the absolute value of t is greater than the critical t value, we can reject $H_0$.
  On the estimation of $se(b_k)$, see Gujarati 1.5 (p. 10)

HYPOTHESIS TESTING: t TEST
➢ An alternative method is to see whether zero lies within the confidence interval:
  $\Pr[b_k - t_{\alpha/2}\, se(b_k) \le B_k \le b_k + t_{\alpha/2}\, se(b_k)] = 1 - \alpha$
➢ If zero lies in this interval, we cannot reject $H_0$.
➢ The p-value gives the exact level of significance, or the lowest level of significance at which we can reject $H_0$.
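To make the link between the t statistic and the confidence interval concrete, here is a small Python sketch for a two-variable regression. It is an illustration under stated assumptions, not the course's EViews workflow: the function name `slope_t_test` and the data are hypothetical, and the critical value is passed in by hand as it would be read from a t table (3.182 is the 5% two-sided value for 3 degrees of freedom).

```python
from math import sqrt
from statistics import fmean

def slope_t_test(x, y, t_crit):
    """t test of H0: B2 = 0 in Y = B1 + B2*X + u, plus the matching confidence interval.

    t_crit is the two-sided critical value from a t table for n - k degrees of freedom.
    """
    n, k = len(x), 2                       # k = 2 estimated coefficients (b1 and b2)
    x_bar, y_bar = fmean(x), fmean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b1 = y_bar - b2 * x_bar
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - k)             # estimated error variance
    se_b2 = sqrt(sigma2_hat / s_xx)        # standard error of the slope
    t_stat = b2 / se_b2
    ci = (b2 - t_crit * se_b2, b2 + t_crit * se_b2)
    return {"b2": b2, "se": se_b2, "t": t_stat, "ci": ci,
            "reject_h0": abs(t_stat) > t_crit}

# Hypothetical illustration data (not from the course material)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

result = slope_t_test(x, y, t_crit=3.182)  # n - k = 5 - 2 = 3 degrees of freedom
print(result)
```

The two decision rules always agree: zero falls outside the interval exactly when the absolute t value exceeds the critical value.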
Type I and II Errors
• Type I error is the rejection of a true null hypothesis (also known as a "false positive" finding or conclusion)
• Type II error is the non-rejection of a false null hypothesis (also known as a "false negative" finding or conclusion)
• The size of a test is the probability of committing a Type I error, i.e., of incorrectly rejecting the null hypothesis when the null hypothesis is true
• The power of a binary hypothesis test is the probability of correctly rejecting the null hypothesis if it is false (i.e., it indicates the probability of avoiding a Type II error)

GOODNESS OF FIT, $R^2$
➢ $R^2$, the coefficient of determination, is an overall measure of goodness of fit of the estimated regression line.
➢ Gives the percentage of the total variation in the dependent variable that is explained by the regressors.
➢ It is a value between 0 (no fit) and 1 (perfect fit).
➢ Let:
  Explained Sum of Squares (ESS) = $\sum (\hat{Y} - \bar{Y})^2$
  Residual Sum of Squares (RSS) = $\sum e^2$
  Total Sum of Squares (TSS) = $\sum (Y - \bar{Y})^2$
➢ Then: $R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$

HYPOTHESIS TESTING: F TEST
➢ Testing the following hypothesis is equivalent to testing the hypothesis that all the slope coefficients are 0:
  $H_0: R^2 = 0$
  $H_1: R^2 \neq 0$
➢ Calculate the following and use the F table to obtain the critical F value with k-1 degrees of freedom in the numerator and n-k degrees of freedom in the denominator for a given level of significance:
  $F = \frac{ESS/(k-1)}{RSS/(n-k)} = \frac{R^2/(k-1)}{(1 - R^2)/(n-k)}$
  If this value is greater than the critical F value, reject $H_0$.
  n = number of usable observations; k = number of estimated coefficients (the regressors including the intercept)

On the t and F tests
• In sum, in the basic linear regression model
  • the t-test concerns the hypothesis that a given single regression coefficient has a specific value (often, but not necessarily, zero)
  • the F-test concerns the whole regression: whether the regression model has any explanatory power at all w.r.t. Y

Functional Forms of Regression Models (Gujarati, Part I, Ch 2)

LOG-LINEAR, DOUBLE LOG, OR CONSTANT ELASTICITY MODELS
➢ The Cobb-Douglas Production Function:
  $Q_i = B_1 L_i^{B_2} K_i^{B_3}$
  can be transformed into a linear model by taking natural logs of both sides:
  $\ln Q_i = \ln B_1 + B_2 \ln L_i + B_3 \ln K_i$
➢ The slope coefficients can be interpreted as elasticities.
➢ If $(B_2 + B_3) = 1$, we have constant returns to scale.
➢ If $(B_2 + B_3) > 1$, we have increasing returns to scale.
➢ If $(B_2 + B_3) < 1$, we have decreasing returns to scale.

LOG-LIN MODELS
➢ Log-lin models follow this general form:
  $\ln Y_i = B_1 + B_2 X_i$
➢ $B_2$ is the relative change in Y in response to a one-unit absolute change in X
➢ If X increases by 1 unit, predicted Y increases by approximately $100 \cdot B_2$ %

LIN-LOG MODELS
➢ Lin-log models follow this general form:
  $Y_i = B_1 + B_2 \ln X_i + u_i$
➢ $B_2$ is the absolute change in Y in response to a relative change in X
➢ If X increases by 1%, predicted Y increases by approximately $B_2/100$ units
➢ Used in Engel expenditure functions: “The total expenditure that is devoted to food tends to increase in arithmetic progression as total expenditure increases in geometric proportion.”

RECIPROCAL MODELS
➢ Sometimes the relationship between the regressand and regressor(s) is reciprocal or inverse:
  $Y_i = B_1 + B_2 \left(\frac{1}{X_i}\right) + u_i$
➢ Note that:
➢ As X increases indefinitely, the term $B_2 \left(\frac{1}{X_i}\right)$ approaches zero and Y approaches
the limiting or asymptotic value $B_1$.
➢ The slope is: $\frac{dY}{dX} = -B_2 \left(\frac{1}{X^2}\right)$
➢ Therefore, if $B_2$ is positive, the slope is negative throughout, and if $B_2$ is negative, the slope is positive throughout.

POLYNOMIAL REGRESSION MODELS
➢ The following regression predicting housing prices (HP) is an example of a quadratic function, or more generally, a second-degree polynomial in the variable size:
  $HP = B_0 + B_1\, size + B_2\, size^2$
➢ The slope is not constant; it depends on size:
  $\frac{dHP}{dsize} = B_1 + 2 B_2\, size$

SUMMARY OF FUNCTIONAL FORMS
(Slope = $\frac{dY}{dX}$; elasticity = $\frac{dY}{dX} \cdot \frac{X}{Y}$)
• Linear: $Y = B_1 + B_2 X$; slope $B_2$; elasticity $B_2 \left(\frac{X}{Y}\right)$
• Log-linear: $\ln Y = B_1 + B_2 \ln X$; slope $B_2 \left(\frac{Y}{X}\right)$; elasticity $B_2$
• Log-lin: $\ln Y = B_1 + B_2 X$; slope $B_2 Y$; elasticity $B_2 X$
• Lin-log: $Y = B_1 + B_2 \ln X$; slope $B_2 \left(\frac{1}{X}\right)$; elasticity $B_2 \left(\frac{1}{Y}\right)$
• Reciprocal: $Y = B_1 + B_2 \left(\frac{1}{X}\right)$; slope $-B_2 \left(\frac{1}{X^2}\right)$; elasticity $-B_2 \left(\frac{1}{XY}\right)$
NOTE: We cannot directly compare the fits of two models that have different dependent variables, but we can transform the models and compare RSS (see Gujarati, 2.8)

STANDARDIZED VARIABLES
➢ We can avoid the problem of having variables measured in different units by expressing them in standardized form:
  $Y_i^* = \frac{Y_i - \bar{Y}}{S_Y}; \quad X_i^* = \frac{X_i - \bar{X}}{S_X}$
  where $S_Y$ and $S_X$ are the sample standard deviations and $\bar{Y}$ and $\bar{X}$ are the sample means of Y and X, respectively
➢ The mean value of a standardized variable is always zero and its standard deviation is always 1.
➢ Gujarati pp.
41-43

MEASURES OF GOODNESS OF FIT
➢ $R^2$: Measures the proportion of the variation in the regressand explained by the regressors.
➢ Adjusted $R^2$: Denoted as $\bar{R}^2$, it takes degrees of freedom into account:
  $\bar{R}^2 = 1 - (1 - R^2)\frac{n-1}{n-k}$
POLL
➢ Various Information Criteria
  ➢ Akaike’s Information Criterion (AIC)
  ➢ Schwarz’s Information Criterion (SIC)
  ➢ Also other info criteria
  ➢ To be discussed more in the time series econometrics “stage”

Qualitative Explanatory Variables in Regression Models (Gujarati, Part I, Ch 3)

QUALITATIVE VARIABLES
➢ Qualitative variables are nominal scale variables which have no particular numerical values.
➢ We can “quantify” them by creating so-called dummy variables, which take values of 0 and 1
  ➢ 0 indicates the absence of an attribute
  ➢ 1 indicates the presence of the attribute
  ➢ For example, a variable denoting gender can be quantified as female = 1 and male = 0 or vice versa.
➢ Dummy variables are also called indicator variables, categorical variables, and qualitative variables.
➢ Examples: gender, race, color, religion, nationality, geographical region, party affiliation, and political upheavals

DUMMY VARIABLE TRAP
➢ If an intercept is included in the model and if a qualitative variable has m categories, then introduce only (m – 1) dummy variables.
➢ For example, gender has only two categories; hence we introduce only one dummy variable for gender.
➢ This is because if a female gets a value of 1, ipso facto a male gets a value of zero.
➢ If we consider self-reported health as a choice among excellent, good, and poor, we can have at most two dummy variables to represent the three categories.
➢ If we do not follow this rule, we will fall into what is called the dummy variable trap, the situation of perfect collinearity.
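The trap can be seen directly in the design matrix: with an intercept and all m dummies, the dummy columns add up to the intercept column, so the matrix loses full column rank. The sketch below is a hypothetical illustration in plain Python (the helper names `matrix_rank` and `design_matrix` and the six health observations are made up) that compares the number of columns with the rank for the self-reported health example.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry in this column
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical sample of self-reported health
health = ["excellent", "good", "poor", "good", "excellent", "poor"]
categories = ["excellent", "good", "poor"]

def design_matrix(dummy_categories):
    """Intercept column followed by one 0/1 dummy per listed category."""
    return [[1] + [int(h == c) for c in dummy_categories] for h in health]

X_trap = design_matrix(categories)       # intercept + m = 3 dummies
X_ok = design_matrix(categories[:-1])    # intercept + (m - 1) = 2 dummies; "poor" is the reference

print("trap: columns =", len(X_trap[0]), "rank =", matrix_rank(X_trap))
print("ok:   columns =", len(X_ok[0]), "rank =", matrix_rank(X_ok))
```

When the rank is smaller than the number of columns, the OLS normal equations have no unique solution, which is why statistical packages either refuse to estimate such a model or drop one of the columns.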
REFERENCE CATEGORY
➢ The category that gets the value of 0 is called the reference, benchmark, or comparison category.
➢ All comparisons are made in relation to the reference category.
➢ If there are several dummy variables, you must keep track of the reference category; otherwise, it will be difficult to interpret the results.

POINTS TO KEEP IN MIND
➢ If there is an intercept in the regression model, the number of dummy variables must be one less than the number of classifications of each qualitative variable.
➢ If you drop the (common) intercept from the model, you can have as many dummy variables as the number of categories of the qualitative variable.
➢ The coefficient of a dummy variable must always be interpreted in relation to the reference (i.e. omitted) category.
➢ Dummy variables can interact with quantitative regressors as well as with qualitative regressors. If a model has several qualitative variables with several categories, introducing dummies for all the combinations can consume a large number of degrees of freedom.

INTERPRETATION OF DUMMY VARIABLES
➢ Dummy coefficients are often called differential intercept dummies, for they show the differences in the intercept values of the category that gets the value of 1 as compared to the reference category.
➢ The common intercept value refers to all those categories that take a value of 0.

USE OF DUMMY VARIABLES IN SEASONAL DATA
➢ The process of removing the seasonal component from a time series is called deseasonalization or seasonal adjustment.
➢ The resulting time series is called a deseasonalized or seasonally adjusted time series.
➢ Consider the following model predicting the sales of fashion clothing:
  $Sales_t = A_1 + A_2 D_{2t} + A_3 D_{3t} + A_4 D_{4t} + u_t$
  where $D_2 = 1$ for the second quarter, $D_3 = 1$ for the third quarter, $D_4 = 1$ for the fourth quarter, and Sales = real sales per thousand square feet of retail space.

USE OF DUMMY VARIABLES IN SEASONAL DATA (pp. 58-61)
➢ In order to deseasonalize the sales time series, we proceed as follows:
➢ 1. From the estimated model we obtain the estimated sales volume.
➢ 2. Subtract the estimated sales value from the actual sales volume and obtain the residuals.
➢ 3. To the estimated residuals, we add the (sample) mean value of sales. The resulting values are the deseasonalized sales values.

➢ Dummy variables can also be used e.g. to capture structural changes in data or parameters over time
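The three deseasonalization steps for the quarterly sales model can be sketched in a few lines of Python. This is an illustration only: the function name `deseasonalize` and the eight sales figures are hypothetical, and the sketch relies on the fact that, in a regression on an intercept and dummies for quarters 2 to 4 alone, the OLS fitted value of each observation equals the sample mean of its quarter, so no matrix algebra is needed.

```python
from statistics import fmean

def deseasonalize(sales, quarters):
    """Seasonal adjustment with quarterly dummies (quarter 1 is the reference category)."""
    # Fitted values of the dummy regression = quarter-specific sample means
    quarter_mean = {
        q: fmean([s for s, qq in zip(sales, quarters) if qq == q])
        for q in sorted(set(quarters))
    }
    fitted = [quarter_mean[q] for q in quarters]            # step 1: estimated sales
    residuals = [s - f for s, f in zip(sales, fitted)]      # step 2: actual minus estimated
    overall_mean = fmean(sales)
    adjusted = [e + overall_mean for e in residuals]        # step 3: add back the sample mean
    # Implied coefficients: A1 is the quarter-1 mean, A2..A4 are differences from quarter 1
    coefs = {"A1": quarter_mean[1]}
    coefs.update({f"A{q}": quarter_mean[q] - quarter_mean[1] for q in (2, 3, 4)})
    return adjusted, coefs

# Hypothetical two years of quarterly sales (not from the course material)
sales = [10.0, 14.0, 12.0, 20.0, 12.0, 16.0, 14.0, 22.0]
quarters = [1, 2, 3, 4, 1, 2, 3, 4]

adjusted, coefs = deseasonalize(sales, quarters)
print("coefficients:  ", coefs)
print("deseasonalized:", adjusted)
```

In the made-up data the only non-seasonal movement is a rise of 2 units between the two years, and that is what remains in the adjusted series once the quarterly pattern is removed.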
