QTTM 409 Class Presentation of Dr P James Daniel Paul MSB LPU PDF
Document Details
LPU
Dr P James Daniel Paul
Summary
This document is a presentation on statistics. It covers various topics including types of statistical methods, their importance in business decisions, data classification, tabulation, measures of central tendency (mean, median, mode), dispersion, correlation and regression analysis, distributions (normal, uniform, exponential, binomial, Poisson, geometric) and forecasting methods. It also includes examples and applications.
Full Transcript
QTT 509 Syllabus

Unit I
Statistics: types of statistical methods, importance and scope of statistics in business decisions, types of data
Data Classification, Tabulation and Presentation: classification of data, bases of classification, tabulation of data, objectives of tabulation, parts and types of tables, diagrammatic presentation of data

Unit II
Measures of Central Tendency: mean, median, mode, quartiles, percentiles, deciles
Dispersion: significance of measuring dispersion, range, standard deviation, coefficient of variation, concept of outlier

Unit IV
Correlation Analysis: meaning, Karl Pearson's coefficient of correlation, rank correlation, multiple correlation, partial correlation
Regression Analysis: concept of simple linear regression, multiple regression, estimation of coefficients, basics of non-linear regression

Unit V
Index Numbers: types of index numbers, uses of index numbers, methods of construction, unweighted vs weighted price indexes, consumer price indexes, problems in the construction of index numbers

Unit VI
Forecasting and Time Series Analysis: types of forecasts, timing of forecasts, time series analysis, forecasting methods, objectives of time series forecasting, steps of forecasting, time series decomposition models, quantitative forecasting methods

Structure of the research report (CA1)
Title (Country / Sector / Company / Domain)
Abstract
Key words
Introduction
Methodology
Review of literature (30 PDFs)
  Opt 1: Copy 30 abstracts, each beginning with the author name (year).
  Opt 2: Fill a table with the columns Author, Year, Variables used, Statistical tools used, Key findings, creating 30 rows from the PDFs; copy it to Word; convert the table to text and add conjunctions.
Analysis
Conclusion
Bibliography (30 citations, one for each PDF, in APA format, taken from Google Scholar while downloading)

Mean for grouped data
Median for grouped data
Median vs Quartile vs Decile vs Percentile

Mode for grouped data
\[ \text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h \]
Where:
\(L\) is the lower boundary of the modal class.
\(f_1\) is the frequency of the modal class.
\(f_0\) is the frequency of the class preceding the modal class.
\(f_2\) is the frequency of the class succeeding the modal class.
\(h\) is the size of the class interval.

Distributions: data types and tests of significance
Distribution | Shape | Key Parameters | Typical Applications | Data Type | Test of Significance
Normal | Symmetric (bell-shaped) | Mean (μ), Standard Deviation (σ) | Heights, weights, IQ scores | Continuous | T-test (for comparing means), Z-test (for large samples), ANOVA (for comparing multiple means)
Uniform | Rectangular | Minimum (a), Maximum (b) | Random number generation, discrete uniform outcomes | Continuous or Discrete | Chi-square goodness-of-fit test
Exponential | Skewed right | Rate parameter (λ) | Time between events (e.g., customer arrivals, machine failures) | Continuous | Kolmogorov-Smirnov test (for goodness-of-fit), Chi-square test (for discrete data)
Binomial | Discrete, bar chart | Number of trials (n), Probability of success (p) | Coin flips, quality control | Discrete | Chi-square test (for goodness-of-fit), Exact binomial test
Poisson | Discrete, bar chart | Rate parameter (λ) | Number of events in a fixed interval (e.g., phone calls, accidents) | Discrete | Chi-square test (for goodness-of-fit), Poisson exact test
Geometric | Discrete, decreasing | Probability of success (p) | Number of trials until the first success | Discrete | Chi-square test (for goodness-of-fit), Exact geometric test

Estimate the mean, median and mode for the given share prices and explain their uses.
Answers given: 42.57, 41.26, 37.5
Dr P James Daniel Paul, [email protected]; [email protected]; +91 98402 94590

Converting a long array of data to a frequency distribution (grouped data)
\( H = \text{Range} / \text{desired number of classes} \), where \(H\) is the class interval.

Mean for grouped data (table columns: Price range, Mid value (x), Number of trades (f))
Median for grouped data
Mode
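The grouped-data measures named on these slides can be sketched roughly as follows. The frequency table is the weight table (10-20 to 70-80 grams) that this transcript gives under "Median for Grouped Data". The mode uses the symbols defined earlier (L, f_1, f_0, f_2, h). The median formula, L + ((N/2 - cf) / f) × h, is the standard textbook form and is an assumption here, since the slide's own formula did not survive. All function names are illustrative.

```python
# Frequency table from the "Median for Grouped Data" slide (weights in grams).
lower_bounds = [10, 20, 30, 40, 50, 60, 70]
frequencies = [14, 20, 42, 54, 45, 18, 7]
h = 10  # class interval (size of each class)


def grouped_mean(lowers, freqs, width):
    """Mean = sum(f * x) / sum(f), where x is the mid value of each class."""
    mids = [low + width / 2 for low in lowers]
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)


def grouped_median(lowers, freqs, width):
    """Median = L + ((N/2 - cf) / f) * h for the class holding the N/2-th item."""
    half = sum(freqs) / 2
    cumulative = 0  # cumulative frequency of the classes before the current one
    for low, f in zip(lowers, freqs):
        if cumulative + f >= half:
            return low + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("empty frequency table")


def grouped_mode(lowers, freqs, width):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for the modal class."""
    i = freqs.index(max(freqs))
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    return lowers[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


mean = grouped_mean(lower_bounds, frequencies, h)
median = grouped_median(lower_bounds, frequencies, h)
mode = grouped_mode(lower_bounds, frequencies, h)
print(f"mean={mean:.2f}, median={median:.2f}, mode={mode:.2f}")
# prints: mean=43.90, median=44.44, mode=45.71
```

For this table the median and modal class are both 40-50, which is why the three measures land close together.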
Ungrouped data
Mean = \( \sum X / n \)
Median is the middle value.
Mode is the most frequently repeated value.

Average share price
1. Weekly average: for the daily trader
2. Monthly average: for the monthly trader
3. Yearly average: for the long-term investor

Median for grouped data
Weight (in grams) | f | cf
10-20 | 14 | 14
20-30 | 20 | 34
30-40 | 42 | 76
40-50 | 54 | 130
50-60 | 45 | 175
60-70 | 18 | 193
70-80 | 7 | 200

Quartile
Decile
Percentiles
Find the Q3 of the following data set
Decile
Mean (central tendencies) for grouped data

Arithmetic mean vs geometric mean
Year | Growth rate (%) | Sales turnover (M $)
2019 | 5.0 | 105
2020 | 7.5 | 112.87
2021 | 2.5 | 115.69
2022 | 5.0 | 121.47
2023 | 10.0 | 133.61

Arithmetic mean: \( \bar{x} = (5 + 7.5 + 2.5 + 5 + 10)/5 = 6 \)
Geometric mean: \( (x_1 \times x_2 \times x_3 \times x_4 \times x_5)^{1/5} = \sqrt[5]{5 \times 7.5 \times 2.5 \times 5.0 \times 10} = \sqrt[5]{4687.5} \approx 5.42 \)
Computed on growth factors instead, \( (1.05 \times 1.075 \times 1.025 \times 1.05 \times 1.10)^{1/5} - 1 \approx 5.97\% \), which is the compound annual growth rate consistent with the turnover column.

Mean deviation vs Variance vs Standard deviation
Mean deviation
Standard deviation (ungrouped data)
Standard deviation (grouped data)

Median, Quartile, Decile and Percentile for discrete data
Step 1: Arrange the data in ascending order.

Median, Quartile, Decile and Percentile for continuous data
Step 1: Arrange the data in ascending order.
Step 2: Find the middle number.
Quartile (continuous)

Median, Quartile, Decile and Percentile for discrete data
Step 1: Arrange the data in ascending order.
Decile: use N/10 instead of N/4.

Skewness and kurtosis vs the normal distribution (for the normal distribution, kurtosis K = 3)
Positive skewness: tail on the right, data concentrated on the left.
Negative skewness: tail on the left, data concentrated on the right.

Examples of Skewness and Kurtosis

Skewness
1. Income Distribution in Economics:
   Example: Income distribution is often positively skewed, where a small number of individuals earn much higher incomes than the majority. This results in the mean income being higher than the median, with a long tail to the right.
   Application: Policymakers use skewness to understand income inequality and to design tax policies. Positive skewness indicates wealth concentration among a few, requiring progressive tax systems.
2. Stock Returns in Finance:
   Example: A negatively skewed distribution of stock returns may indicate that there are frequent small gains and occasional large losses.
   Application: Investors might avoid negatively skewed stocks, as the potential for large losses outweighs the frequent small gains. Understanding skewness helps in portfolio management and risk assessment.
3. Customer Satisfaction Surveys:
   Example: A customer satisfaction survey might have negative skewness if most customers are highly satisfied, with a few being moderately satisfied.
   Application: Companies use skewness to identify overall customer sentiment. Negative skewness suggests high satisfaction, allowing companies to focus on retaining these satisfied customers.

Kurtosis
1. Quality Control in Manufacturing:
   Example: In a manufacturing process, a leptokurtic distribution of product weights may indicate that most products are close to the target weight, but there are occasional defects with extreme deviations.
   Application: High kurtosis in quality control metrics indicates that while most products are of high quality, occasional defects are severe. This helps in improving process consistency.
2. Risk Management in Finance:
   Example: Financial returns with high kurtosis suggest that extreme returns (both gains and losses) are more likely than a normal distribution would predict.
   Application: Risk managers use kurtosis to assess the likelihood of extreme market events. Leptokurtic distributions signal the need for robust risk management strategies to hedge against potential market crashes.
3. Insurance Claim Analysis:
   Example: Insurance claims data might show high kurtosis if most claims are of moderate size, but there are rare, extremely large claims (e.g., natural disasters).
   Application: Insurance companies use kurtosis to price policies and maintain reserves for catastrophic events. High kurtosis indicates a need for higher reserves to cover the potential for large claims.

Skewness and Kurtosis Compared

Skewness Example 1
Question: A retail chain wants to analyze the sales performance of its stores across different regions during the last quarter. They have collected sales data from 8 stores and wish to understand the distribution of sales to make informed strategic decisions. Calculate the skewness of the sales data and interpret the results.
Example 1: Solution with steps
Example 1: Solution and interpretation

Skewness Example 2
A service company collects customer satisfaction scores on a scale from 1 to 10 to assess its service quality. After gathering data from recent feedback, the company wants to determine the distribution and skewness of the satisfaction scores to identify areas for improvement. Calculate the skewness and interpret the findings.
Example 2: Calculating mean, median and standard deviation
Example 2: Skewness calculation

Skewness Example 3
A tech company aims to analyze the salary distribution among its employees to assess equity and fairness. The HR department has collected salary data from 9 employees across different roles. Calculate the skewness of the salary distribution and interpret the results to inform potential salary adjustments.
Example 3, Step 5: Interpretation
Skewness value: approximately 1.98, which is positive.
Interpretation: A skewness value of 1.98 indicates a significant right skew in the salary distribution. Most employees earn below the mean salary, with a few high earners pulling the average upwards. This suggests considerable disparity in salaries within the company.
The HR department should review compensation structures to ensure fairness and equity, considering whether the high salaries are justified by roles, experience, and performance. Addressing significant disparities can improve employee satisfaction and retention.

Question 1: Kurtosis Classification
Given dataset: 5, 7, 9, 12, 15, 19, 22, 24, 30, 35
Task: Calculate the kurtosis of this dataset by hand. Classify the distribution as leptokurtic, platykurtic, or mesokurtic based on the result.
Kurtosis: Solution with steps
Kurtosis Example 2
Kurtosis Example 3

Kelly's Coefficient of Skewness
Price range | Mid value (m) | Number of trades (f) | CF | d = (m - a)/h | fd | d² | fd² | Note
21-25 | 23 | 5 | 5 | -3 | -15 | 9 | 45 |
26-30 | 28 | 15 | 20 | -2 | -30 | 4 | 60 |
31-35 | 33 | 28 | 48 | -1 | -28 | 1 | 28 | fm-1
36-40 | 38 | 42 | 90 | 0 | 0 | 0 | 0 | fm
41-45 | 43 | 15 | 105 | 1 | 15 | 1 | 15 | fm+1
46-50 | 48 | 12 | 117 | 2 | 24 | 4 | 48 |
51-55 | 53 | 3 | 120 | 3 | 9 | 9 | 27 |
Total | | 120 | | | -25 | | |

Karl Pearson's Coefficient of Skewness (worked table; surviving totals: N = 104, 16, 288)
Coefficient of Skewness
Skewness: Karl Pearson's method of estimation
Karl Pearson's Correlation Coefficient, Example: direct method, Type 2
Kelly's Coefficient of Skewness
Kelly's Coefficient of Skewness (Method 2)
Karl Pearson's vs Kelly's skewness measure
Karl Pearson's Correlation Coefficient, Example: direct method, Type 1
Spearman's Rank Correlation

Regression Analysis 1
Substitute the values in the normal equations and estimate the values of a and b.
Regression 2
a = 0; b = 0.25

Index numbers:
Laspeyres, Paasche and Fisher examples
P0 | Q0 | P1 | Q1 | P1Q0 | P0Q0 | P1Q1 | P0Q1
100 | 1000 | 120 | 800 | 120000 | 100000 | 96000 | 80000
1500 | 180 | 2000 | 100 | 360000 | 270000 | 200000 | 150000
150 | 500 | 100 | 800 | 50000 | 75000 | 80000 | 120000
750 | 300 | 700 | 200 | 210000 | 225000 | 140000 | 150000
1200 | 120 | 1000 | 150 | 120000 | 144000 | 150000 | 180000
Total | | | | 860000 | 814000 | 666000 | 680000

With these totals: Laspeyres = \( \sum P_1 Q_0 / \sum P_0 Q_0 \times 100 = 860000/814000 \times 100 \approx 105.65 \); Paasche = \( \sum P_1 Q_1 / \sum P_0 Q_1 \times 100 = 666000/680000 \times 100 \approx 97.94 \); Fisher = \( \sqrt{\text{Laspeyres} \times \text{Paasche}} \approx 101.72 \).

Rapid Review of Literature
Steps for review of literature:
1. Search for your favourite research title in Google Scholar, e.g. "GDP, Share Prices".
2. Download the articles with full PDF into a folder.
3. Create an Excel sheet with the following columns: S.No, Name of the author, Year of publication of the article, Title of the article, Type of data used (primary, secondary), Sample size, Variables used for analysis, Tools used for analysis, Key findings, Citation as bibliography.
4. Change the colour of the columns other than the citation.
5. Copy the columns to a Word file and paste.
6. Convert the table to text.
7. Insert a conjunction between different colours in the sentences.
8.
Copy the citations separately from the Excel file as the bibliography. (If there are 25 reviews, you can send your paper to an international journal for publication.)

Regression Analysis
Regression quantifies the impact of one variable on the other.

Variants of regression analysis: linear, semi-log, double log, logit, dichotomous independent variables, simultaneous equation models, machine learning

Round II
Different types of Mean formulae with examples
Different Median, Quartile, Decile, Percentile
Variation Formulae
Skewness
Kurtosis & Moments
Correlation
Correlation Coefficient through another method
Karl Pearson Correlation Coefficient
Regression with Assumed Mean P1
Regression with Assumed Mean P2
Regression with grouped data
Regression Textbook Example 2
Regression Example 2
Regression Example 3
Forecasting: freehand method
Least squares method
Quadratic method
Customer data analysis [Default
Prediction]
Default [Y] | Age [X] | XY | X²
1 | 20 | 20 | 400
0 | 30 | 0 | 900
1 | 22 | 22 | 484
0 | 35 | 0 | 1225
1 | 25 | 25 | 625
Total: 3 | 132 | 67 | 3634

Equation of the straight line: \( Y = a + bX \) ... (1)
Adding sigma on both sides: \( \sum Y = na + b\sum X \) ... (2)
Multiplying both sides by X: \( \sum XY = a\sum X + b\sum X^2 \) ... (3)

If we estimate a and b from equations 2 and 3, we can forecast Y from X.
If 1 is default, who is more likely to default: the young or the old?
You can also do it in JASP or SPSS and cross-verify the answers.

Variants of Regression
S.No | Equation of the straight line / with modification | Name of the model
1 | \( Y = a + bX \) | Simple linear model
2 | \( \ln Y = a + bX \) | Semi-log model
3 | \( \ln Y = a + b \ln X \) | Double-log model
4 | \( Y = a + bX^2 \) | Exponential model
5 | \( Y(0,1) = a + bX \) | Logit

SEM Variants
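As a minimal sketch of the customer default example, the code below solves normal equations 2 and 3 for a and b using the five rows of the Default/Age table. The variable names, the `predict` helper and the choice of 2x2 elimination (Cramer's rule) are illustrative assumptions, not the slide's own working.

```python
# Data from the customer default table: Y = 1 if the customer defaulted, X = age.
default = [1, 0, 1, 0, 1]
age = [20, 30, 22, 35, 25]

n = len(age)
sum_x = sum(age)                                    # 132
sum_y = sum(default)                                # 3
sum_xy = sum(x * y for x, y in zip(age, default))   # 67
sum_x2 = sum(x * x for x in age)                    # 3634

# Normal equations:
#   sum_y  = n * a     + b * sum_x
#   sum_xy = a * sum_x + b * sum_x2
# Solved as a 2x2 linear system by elimination (Cramer's rule).
det = n * sum_x2 - sum_x ** 2
b = (n * sum_xy - sum_x * sum_y) / det
a = (sum_y * sum_x2 - sum_x * sum_xy) / det


def predict(x):
    """Fitted value of Y for age x on the straight line Y = a + b*x."""
    return a + b * x


print(f"a = {a:.4f}, b = {b:.4f}")
print(f"fitted value at age 20: {predict(20):.3f}")
print(f"fitted value at age 35: {predict(35):.3f}")
```

On this small sample the slope b is negative (about -0.08), so the fitted line points to younger customers as more likely to default. The fitted values at ages 20 and 35 fall outside the 0 to 1 range, which is one reason the logit variant is listed for a 0/1 outcome.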