Tourism & Hotel Statistics PDF
Document Details
South Valley University, Hurghada Campus
2025
Dr. Maged Gabr
Summary
This document is a collection of lectures on tourism and hotel statistics for first-year students at Hurghada University in the 2024/2025 academic year. It covers topics such as introduction to statistics, statistical data, tourism statistics, and descriptive statistics. The lectures look at both descriptive and inferential statistical concepts.
Full Transcript
Tourism & Hotel Statistics
Hurghada University, 2024/2025
Lectures on Tourism and Hotel Statistics
Collected by Dr. Maged Gabr
Level 1

Contents

Chapter One: Introduction To Statistics
1) Introduction
1.1 A brief history of statistics
1.2 Definition of statistics
1.3 Departments of Statistics
1.4 Basic Statistical Concepts
1.5 The importance of statistics
2- Statistical data
2.1 Statistical data sources
2.2 Statistical data collection methods
2.3 Statistical data collection styles
2.4 Samples

Chapter Two: TOURISM STATISTICS
1. Introduction
2. Historical Development Of Tourism Statistics
3. Basic Concepts Of Tourism Statistics
4. Demand Perspective of Tourism Statistics
5. Measurement Of Demand Elements
6. Supply Perspective of Tourism Statistics
7. TOURISM SATELLITE ACCOUNT

Chapter 3: Tab and Display Statistical Data
3.1: Statistical data tab
3.2: Display data statistics
First: the tabular presentation of the statistical data
Second: graphic display of statistical data

Chapter 4: Descriptive Statistics: Numerical Methods
3.1 Measures of Central Tendency for Ungrouped Data
1- Mean
2- Median
3. Mode
4. Relationships Among the Mean, Median, and Mode
4. Percentile
5. Quartiles
3.2 Measures of Dispersion for Ungrouped Data

Chapter One: Introduction To Statistics

1) Introduction
1.1 A brief history of statistics
Statistics has been known since the earliest civilizations, such as those of the Babylonians, Assyrians, Pharaohs, and Greeks. From its inception it was associated with the census and counting operations that the state conducted to determine the number of members of society and then impose the taxes needed to finance the army that defended it. Census and inventory operations were later extended to include data on births and deaths, production, and consumption. These operations were called "the science of the state" or "the science of kings". Accordingly, the word statistics is derived from the Latin word status, which means the state, or from the Italian word stato, which also means the state.
The eighteenth century witnessed a new beginning for the development of statistics in Europe, where it was introduced for the first time as a subject taught in universities in 1748 in Germany. In 1839, William Cogswell (1787-1850) established the American Statistical Organization, which later became known as the American Statistical Association (ASA). By the end of the nineteenth century, statistics was taught in most colleges and institutes. As a result of the great developments in the scientific, economic, and social fields, statistics now has a close relationship with many other disciplines (medicine, literature, agriculture, tourism, etc.). The development of electronic computers in the second half of the twentieth century also advanced statistics significantly.
It was once believed that statistics was only the science of collecting, organizing, and displaying data in graphs or tables, for example for censuses or for forecasting production and consumption. In general, however, statistics is concerned with data as a whole: it collects, categorizes, organizes, and analyzes data, draws conclusions from it, uses it in decision-making, and then predicts the future.

First level- First Term 2024/2025

1.2 Definition of statistics
Statistics is the science concerned with scientific methods for collecting, summarizing, organizing, presenting, and analyzing data in order to reach acceptable results and sound decisions. It is a set of standard scientific methods that can be employed to collect statistical data on a phenomenon and then categorize, summarize, and evaluate the data and draw conclusions about the population. It can also be defined as the science of inferring facts from numbers in a scientific manner.
1.3 Departments of Statistics
Statistics is divided into two parts:

Firstly: Descriptive statistics
Descriptive statistics represents the first part of the analysis process. It is concerned with collecting, organizing, and summarizing data according to the field of study through numerical measures, and with describing the results in the form of frequency tables and graphs. It depends on:
▪ Measures of central tendency: these identify a typical value or central point. Their tools include the median, the mean, the mode, etc.
▪ Measures of dispersion: these express the amount of variation in a random variable (the extent to which the values are spread out from one another). Their tools include the variance, the standard deviation, etc.
▪ Relationship measures: these express the degree of relationship between two variables. Their tools include simple correlation, rank correlation, partial correlation, etc.
The purpose of descriptive statistics is to estimate the features of the statistical population and describe it as a prelude to inferential statistics.

Secondly: Inferential statistics
Inferential statistics is the second, complementary part of the analysis process. After the data have been described through different numerical measures, it is necessary to study their characteristics, their impact, and the relationships between them. This is done by analyzing and interpreting a sample taken from the population under study, drawing conclusions from it, and then generalizing the results and forming hypotheses about the impact of these features on the population. Inferential statistics goes deeper into the data and deals with generalization, prediction, and estimation.

1.4 Basic Statistical Concepts
▪ Data
Data are a set of observations and primary facts (raw material) collected about a phenomenon for the purpose of interpretation and analysis. Data are divided into two types:
1. Quantitative data: data whose units are measured on a quantitative scale. They may be discrete, such as the number of guests in a hotel or the number of students in the second year, or continuous, such as the heights and weights of a group of students or temperatures.
2. Qualitative data: data whose units are categorized according to a specific characteristic, such as students' grades, skin color, or gender. Qualitative (descriptive) data describe phenomena that cannot be expressed numerically.
▪ Information
Information is the useful result obtained from processing and analyzing data. The information collected for the study of a particular phenomenon is called a data set, and each data set is based on three basic concepts:
1. Experimental unit (element)
Also known as the sampling unit or element, this is the thing (a person, a particular phenomenon, an animal, a plant, etc.) in the study area from which the data are collected.
2. Variable
A variable is a characteristic of the thing studied, and it is the subject of measurement and analysis in the study. It is a characteristic of each element, for example sex, age, income, or social status. Statistical variables are divided into two types:
A. Quantitative variables: variables that take numerical values, so they are also called numerical variables, such as the number of students in the Faculty of Tourism, income, age, or the number of food and beverage workers.
B. Qualitative variables: variables that sort observations into groups whose members share a particular characteristic, such as marital status (single, married, divorced) or sex (male, female).
3. Observations
Observations are the measurements or information obtained by measuring or observing the variables studied. They are the values collected for a particular element.

Example 1: The following data were obtained by a hotel's human resources manager when interviewing five individuals for the waiter job at the restaurant:

Name     Type     Exam mark   Character assessment
Ahmed    Male     83          Tense
Esraa    Female   85          Introverted
Ali      Male     90          Balanced
Talia    Female   70          Emotional
Hamza    Male     65          Self-confident

Identify the elements, variables, and observations.

Example 1 solution:
Elements: Ahmed, Esraa, Ali, Talia, and Hamza
Variables: type, exam mark, and character assessment
Observations: male, female, etc.; 83, 85, etc.; tense, balanced, etc.

▪ Knowledge
Knowledge is the perception, awareness, and understanding of facts through the abstract mind. It is acquired by experimenting and interpreting the results of the experiment, by interpreting a story, by reflecting on the nature of things and contemplating the self, or by reading about the experiences of others and their conclusions. It is also defined as the set of information we hold about something.
▪ Statistical population
The statistical population is the set of all units that share a particular characteristic. These units may be human beings, objects, or phenomena, for example college students, hotel staff, or cancer patients. The statistical population is divided into two types:
A. Limited (finite) population: a population in which all units can be counted, for example the students of the Department of Hotel Studies in the college or the employees of a hotel.
B. Unlimited (infinite) population: a population in which it is not possible to count all units, for example the fish in the Red Sea or grains of wheat.
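To make the three concepts from Example 1 concrete, the interview data can be written as a small set of records. This is only an illustrative sketch: the field names (`name`, `type`, `exam_mark`, `assessment`) and the spelling of the values are assumptions based on a reading of the table in Example 1, not part of the original lecture.

```python
# Example 1 as a list of records: one dictionary per element (interviewee).
# Field names are hypothetical; values follow the table in Example 1.
applicants = [
    {"name": "Ahmed", "type": "Male", "exam_mark": 83, "assessment": "tense"},
    {"name": "Esraa", "type": "Female", "exam_mark": 85, "assessment": "introverted"},
    {"name": "Ali", "type": "Male", "exam_mark": 90, "assessment": "balanced"},
    {"name": "Talia", "type": "Female", "exam_mark": 70, "assessment": "emotional"},
    {"name": "Hamza", "type": "Male", "exam_mark": 65, "assessment": "self-confident"},
]

# Elements: the individuals from whom the data were collected.
elements = [row["name"] for row in applicants]

# Variables: the characteristics recorded for every element.
variables = [key for key in applicants[0] if key != "name"]

# Observations: the values recorded for each variable.
observations = {var: [row[var] for row in applicants] for var in variables}

# A quantitative variable holds numbers; a qualitative one holds categories.
kinds = {
    var: "quantitative"
    if all(isinstance(value, (int, float)) for value in values)
    else "qualitative"
    for var, values in observations.items()
}

print("Elements:", elements)
print("Variables:", variables)
print("Variable kinds:", kinds)
```

In this representation each record is an element, each key is a variable, and each stored value is an observation, which mirrors the solution of Example 1.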
1.5 The importance of statistics
▪ Statistics is the science concerned with studying, analyzing, and presenting data and phenomena concisely, drawing possible conclusions, and making decisions about them. Statistics is used across the sciences: in the physical sciences it is used in gas theory, radiology, and astronomy; it is also used in the biological sciences, in public health policy, and in the medical sciences for diagnosis and for scientific and pharmaceutical research.
▪ Statistics has become a necessity in our daily lives, which are characterized by continuous technological development. We encounter statistics every day: the points table of a football club, weather forecasts, stock market indicators, the achievements and successes of governments, the number of people infected with the coronavirus, changes in currency rates, and daily changes in commodity prices. One may wonder about the importance of statistics for tourism and hotel students, believing that statistics belongs at the heart of the business and economics specializations.
▪ In fact, a researcher in the social sciences often needs to use numbers to summarize and present a set of observations about a phenomenon he is studying, and may be asked to report on the number of guests at the hotel where he works. On each of these occasions, the researcher will need a statistical tool to summarize and express his ideas precisely and effectively. The statement "We have 463 hotel guests" is stronger and more precise than the statement "We have a lot of hotel guests."
▪ Statistics is also important in modern scientific research: hardly any study is free of a statistical analysis that examines the origin of the phenomenon or phenomena studied, depicts their reality in numerical form, and identifies their most prominent trends and their relationships with other phenomena.
▪ The study of statistics undoubtedly has many benefits for the social sciences, especially now that many areas of work have opened up in police organizations, corporate public relations, research centers, and other fields. Knowledge of statistics can even benefit people on a personal level by giving them the skill to plan their own economic life.
▪ The importance of statistics is reflected in the fact that it is used to guide the process of data collection and to interpret the relationships reflected in the data. One of the most prominent uses of statistical treatment is comparison. Human life can be seen as a series of situations in which the individual makes a decision based on a comparison between several possibilities, and this comparison is essentially a statistical process associated with measurement, evaluation, and estimation. A person's success in life is judged against a certain measure in his mind by which success is assessed, and the freedom of the individual in society is likewise measured according to criteria that the members of that society are familiar with.
▪ In other words, our lives are full of statistical measurements and estimates.
For example, when we go to the market to buy certain items in the sales season, we find ourselves, almost without noticing, calculating the price of an item relative to the total money we have, estimating whether the remaining money will last until the end of the month, and judging whether the discount on the item is real or false, etc.
▪ In all these mental processes we use statistical operations and continuous comparisons between different situations. Moreover, what we call a social or natural phenomenon is in fact only a recurring series of events whose continuous occurrence, over a period and at the same pace, can be monitored statistically.
▪ In the tourism sector, statistical analyses are used in decision-making: tourism statistics is concerned with collecting, classifying, presenting, and analyzing data for the tourism sector to reach results that help decision makers decide correctly.

2- Statistical data
2.1 Statistical data sources
There are two main sources of statistical data: existing sources and statistical studies. Each is explained below.
First: Existing sources
There are two types of existing sources:
1. Company: these data are generally available from the internal information sources of most organizations, whether private or public, including employee records, production records, inventory records, sales records, credit records, customer profiles, etc.
2. Government agency: these data are generally available from the internal information sources of most government institutions, for example the Department of Labor, the Bureau of Statistics, the Federal Reserve, the Office of Management and Budget, the Department of Commerce, etc.
Secondly: Statistical studies
A statistical study can be conducted to obtain data that are not readily available from existing sources. Such studies can be classified as follows:
1) Experimental study: the data are obtained through an experiment in which an attempt is made to control or influence the variables of interest, for example drug testing and industrial product testing.
2) Observational study: the data are obtained through observation, with no attempt to control or influence the variable of interest, for example a survey.

2.2 Statistical data collection methods
Statistical data collection methods vary according to the statistical population, the data to be collected, and the funds available for the study. They include:
A. Observational method: the researcher observes phenomena in their natural environment, for example by living among a Badia (desert) community, people with a certain disability, or a group of prisoners.
B. Survey method: the researcher sets a list of questions (a questionnaire) and the respondents answer them, for example the opinions of hotel guests about their stay in the hotel.
C. Interview method: the researcher collects data through a personal interview with the respondents, for example the opinions of hotel managers about the effectiveness of their employees.
D. Phone method: the researcher collects data through telephone contact with the respondents.
Figure 1.2: Methods of collecting statistical data (observational, survey, interview, phone).
2.3 Statistical data collection styles
After the researcher determines the source of the data for his research, the second stage is to choose the appropriate style of data collection. There are two styles:
A- Comprehensive survey style: data are collected from all elements of the statistical population, as in the census of a country. Because every element is covered, this style avoids sampling error, but the researcher may not be able to use it for the following reasons:
✓ Difficulty in accessing every element of the statistical population.
✓ It takes a lot of time and effort.
✓ High cost.
✓ The impossibility of studying the whole population. In some studies, collecting data damages or contaminates the elements of the population. This is known as destructive examination, for example a patient's blood test.
B- Sample style: for the reasons above, researchers resort to the sample style. A sample is defined as a part of the statistical population to be studied, provided that it is representative of that population. Data are collected from the sample and analyzed, and the results are then generalized to the statistical population. Example: a patient's blood sample.

2.4 Samples
1) Concept of the sample
A sample is a part or segment of the population that reflects the characteristics of the original population we want to study. It must correctly represent all the units of that population.
The use of samples has been known since ancient times, and there are many examples in practical life.
A chemist in his lab studies the properties of a substance from a sample of it, and a doctor analyzes a patient's blood from a small sample of a few drops.

2) Sample selection methods
When you conduct research about a group of people, it's rarely possible to collect data from every person in that group. Instead, you select a sample. The sample is the group of individuals who will participate in the research. To draw valid conclusions from your results, you must carefully decide how you will select a sample that is representative of the group. This is called a sampling method. There are two primary types of sampling methods that you can use in your research:
Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group.
Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect data.

A. Probability sampling methods
Probability sampling means that every member of the population has a chance of being selected. It is mainly used in quantitative research. If you want to produce results that are representative of the whole population, probability sampling techniques are the most valid choice. There are five main types of probability sample:
1- Simple random sampling
In a simple random sample, every member of the population has an equal chance of being selected. Your sampling frame should include the whole population. To conduct this type of sampling, you can use tools like random number generators or other techniques that are based entirely on chance.
Example: Simple random sampling
You want to select a simple random sample of 100 employees of a social media marketing company.
You assign a number to every employee in the company database from 1 to 1000 and use a random number generator to select 100 numbers.
2- Systematic sampling
Systematic sampling is like simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals.
Example: Systematic sampling
All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.
3- Stratified sampling
Stratified sampling involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample. To use this sampling method, you divide the population into subgroups (called strata) based on the relevant characteristic (e.g., gender identity, age range, income bracket, job role). Based on the overall proportions of the population, you calculate how many people should be sampled from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.
Example: Stratified sampling
The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people.
4- Cluster sampling
Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population.
Instead of sampling individuals from each subgroup, you randomly select entire subgroups. If it is practically possible, you might include every individual from each sampled cluster. If the clusters themselves are large, you can also sample individuals from within each cluster using one of the techniques above. This is called multistage sampling. This method is good for dealing with large and dispersed populations, but there is more risk of error in the sample, as there could be substantial differences between clusters. It's difficult to guarantee that the sampled clusters are representative of the whole population.
Example: Cluster sampling
The company has offices in 10 cities across the country (all with roughly the same number of employees in similar roles). You don't have the capacity to travel to every office to collect your data, so you use random sampling to select 3 offices; these are your clusters.
5- Multi-stage sampling
Multi-stage sampling is a process of moving from a broad to a narrow sample, using a step-by-step process (Ackoff, 1953). If, for example, a Malaysian publisher of an automobile magazine were to conduct a survey, it could simply take a random sample of automobile owners within the entire Malaysian population. Obviously, this is both expensive and time consuming. A cheaper alternative would be to use multi-stage sampling. In essence, this would involve dividing Malaysia into a number of geographical regions. Subsequently, some of these regions are chosen at random, and then subdivisions are made, perhaps based on local authority areas. Next, some of these are again chosen at random and then divided into smaller areas, such as towns or cities. The main purpose of multi-stage sampling is to select samples which are concentrated in a few geographical regions. Once again, this saves time and money.

B. Non-probability sampling methods
In a non-probability sample, individuals are selected based on non-random criteria, and not every individual has a chance of being included. This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means the inferences you can make about the population are weaker than with probability samples, and your conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as representative of the population as possible. Non-probability sampling techniques are often used in exploratory and qualitative research. In these types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial understanding of a small or under-researched population.
1- Convenience sampling
A convenience sample simply includes the individuals who happen to be most accessible to the researcher. This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is representative of the population, so it can't produce generalizable results. Convenience samples are at risk for both sampling bias and selection bias.
Example: Convenience sampling
You are researching opinions about student support services in your university, so after each of your classes, you ask your fellow students to complete a survey on the topic. This is a convenient way to gather data, but as you only surveyed students taking the same classes as you at the same level, the sample is not representative of all the students at your university.
2- Voluntary response sampling
Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g. by responding to a public online survey).
Voluntary response samples are always at least somewhat biased, as some people will inherently be more likely to volunteer than others, leading to self-selection bias.
Example: Voluntary response sampling
You send out the survey to all students at your university and a lot of students decide to complete it. This can certainly give you some insight into the topic, but the people who responded are more likely to be those who have strong opinions about the student support services, so you can't be sure that their opinions are representative of all students.
3- Purposive or judgmental sampling
This type of sampling, also known as judgement sampling, involves the researcher using their expertise to select a sample that is most useful to the purposes of the research. It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a specific phenomenon rather than make statistical inferences, or where the population is very small and specific. An effective purposive sample must have clear criteria and rationale for inclusion. Always make sure to describe your inclusion and exclusion criteria and beware of observer bias affecting your arguments.
Example: Purposive sampling
You want to know more about the opinions and experiences of disabled students at your university, so you purposefully select a number of students with different support needs in order to gather a varied range of data on their experiences with student services.
4- Snowball sampling
If the population is hard to access, snowball sampling can be used to recruit participants via other participants. The number of people you have access to "snowballs" as you get in contact with more people. The downside here is also representativeness, as you have no way of knowing how representative your sample is due to the reliance on participants recruiting others. This can lead to sampling bias.
Example: Snowball sampling
You are researching experiences of homelessness in your city. Since there is no list of all homeless people in the city, probability sampling isn't possible. You meet one person who agrees to participate in the research, and she puts you in contact with other homeless people that she knows in the area.

5- Quota sampling

Quota sampling relies on the non-random selection of a predetermined number or proportion of units. This is called a quota. You first divide the population into mutually exclusive subgroups (called strata) and then recruit sample units until you reach your quota. These units share specific characteristics, determined by you prior to forming your strata. The aim of quota sampling is to control what or who makes up your sample.

Example: Quota sampling
You want to gauge consumer interest in a new produce delivery service in Boston, focused on dietary preferences. You divide the population into meat eaters, vegetarians, and vegans, drawing a sample of 1000 people. Since the company wants to cater to all consumers, you set a quota of 200 people for each dietary group. In this way, all dietary preferences are equally represented in your research, and you can easily compare these groups. You continue recruiting until you reach the quota of 200 participants for each subgroup.

3) Statistical methods for determining sample size

▪ Determining the sample size from a limited (finite) statistical population

In the case of a limited population, the appropriate sample size is determined using Yamane's equation, which is as follows:

n = N / (1 + N × e²)

n = appropriate sample size
N = population size
e = margin of error = 0.05; this value is suitable for discrete data.

Example: Calculate the appropriate sample size from the second-year students at the Faculty of Tourism, knowing that the total number of students is 2600.
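The finite-population formula n = N / (1 + N × e²) stated above can be sketched in Python as follows. This is only an illustrative sketch: the function name `yamane_sample_size` is our own, and rounding the result up to the next whole person is an assumption about how a fractional sample size should be handled.

```python
import math


def yamane_sample_size(population_size: int, margin_of_error: float = 0.05) -> int:
    """Sample size for a finite population: n = N / (1 + N * e^2).

    The function name and the round-up rule are illustrative assumptions.
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    # A sample cannot contain a fraction of a person, so round up.
    return math.ceil(n)


# The lecture's example: 2600 second-year students, e = 0.05
print(yamane_sample_size(2600))
```

The margin of error is left as a parameter so that the effect of a stricter or looser value than 0.05 can be explored.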
The answer:
n = N / (1 + N × e²)
n = 2600 / (1 + 2600 × (0.05)²)
n = 2600 / 7.5 = 346.67
Therefore, the appropriate sample size is 347 students.

▪ Determining the sample size from an unlimited (infinite) statistical population

In the case of an unlimited population, the appropriate sample size is determined using Cochran's formula, which is as follows:

n = (Z² × σ²) / e²

n = appropriate sample size
σ² = the variance of the population (σ = standard deviation)
Z = standard score (1.96 at a significance level of 0.05)
e = maximum allowable error (0.05 at a significance level of 0.05)

Example: Calculate the appropriate sample size of tourists visiting the TripAdvisor website, given that the standard deviation is 0.14.

The answer:
n = ((1.96)² × (0.14)²) / (0.05)²
n = 30.1181
Therefore, the appropriate sample size is 30 tourists.

Questions: Answer the following questions:
1. Discuss the sources of statistical data.
2. Discuss the conditions for selecting the sample.
3. Discuss multistage cluster sampling.
4. Discuss simple random sampling.
5. Discuss sample collection methods.
6. Compare current sources and statistical studies.
7. If you know that the number of employees in Hurghada Marriott Hotel is 1800 workers, calculate the appropriate sample size.
8. In the case of an infinite population, which formula is used?
9. In the case of a finite population, the equation… ?
10. If you know that the number of students at the Faculty of Tourism is 3600, what is the appropriate sample size?

Chapter Two: TOURISM STATISTICS

1. Introduction

We all know that tourism is a complex phenomenon with many diverse elements that act together to deliver a quality experience to tourists.
Because of the nature of tourism activity and its associated system, tourism is a complex adaptive system that presents several challenges for policy makers when it comes to measuring tourism. Nevertheless, countries across the world collect and publish tourism statistics. International organizations such as the World Tourism Organization (UNWTO) and the World Travel and Tourism Council (WTTC) also collect, compile and publish tourism statistics for assessing global trends and making projections. At the national level, tourism statistics help in identifying tourist flows across various destinations, which helps policy makers develop destination-specific marketing plans.

In this unit, we will understand the concept of tourism statistics, its need and the scope of its measurement. We will also highlight the various problems and issues regarding the measurement of tourism statistics and its relevance to the tourism industry. At the end, we will present the global tourism statistics and the Indian statistics published by the relevant credible agencies.

2. Historical Development Of Tourism Statistics

The concept of "international visitor" was first established in 1953 by the International Union of Official Travel Organizations (now known as UNWTO, United Nations World Tourism Organization) and was recommended during the United Nations Conference on International Travel and Tourism held in Rome in 1963.
Until that time, the global community used the term 'foreign visitor', which was defined as "a person who visits a country other than that in which he habitually lives for a period of at least 24 hours and a maximum period of six months."

In 1963, during the conference, a new term 'visitor' was proposed, which was defined as "any person visiting a country other than that in which he has his usual place of residence, for any reason other than following an occupation remunerated from within the country visited."

The term visitor was divided into two different categories:
a) Tourist – defined as "temporary visitors staying at least twenty-four hours in a country visited, and whose purpose was for leisure, business, family or meeting".
b) Excursionist – defined as "temporary visitors staying less than 24 hours in a destination visited and not staying overnight." Excursionists are also known as same-day visitors, for example cruise visitors or border shoppers.

In 1980, UNWTO in its Manila Declaration extended these definitions to all forms of tourism, including domestic tourism. During the fifth session of its General Assembly (1983) in New Delhi, UNWTO issued a report stressing the need for a uniform and comprehensive means of measurement and comparison with other sectors of the economy. Subsequently, in 1991, UNWTO organized the International Conference on Travel and Tourism in Ottawa, Canada, with the objective of standardizing industrial classifications and tourism terminology.
In the conference resolution, a new definition of tourism was recommended as "The activities of a person travelling to a place outside his or her usual environment for less than a specified period of time and whose main purpose of travel is other than the exercise of an activity remunerated from within the place visited…."

In 1994, UNWTO published Recommendations on Tourism Statistics, which represents the first international standard to set up the basic foundations of a System of Tourism Statistics in terms of concepts, definitions, classifications and indicators. Based on the recommendations of the Ottawa Conference, many countries initiated the development of a Tourism Satellite Account (TSA) that would give credibility to the measurement of tourism and provide comparability with other economic and social activities. In 1997, the OECD Tourism Committee made its first proposal for a TSA for OECD countries. In 1999, at the UNWTO conference held in Nice, the TSA standard was proposed; it was published in 2001 as Tourism Satellite Account – Recommended Methodological Framework, which was structurally linked with the System of National Accounts 1993.

In 2008, the United Nations, based on the decision of the UN Statistical Commission, published International Recommendations for Tourism Statistics, which was drafted by UNWTO in collaboration with other agencies including the International Labour Organization, the Organisation for Economic Co-operation and Development (OECD), the International Monetary Fund (IMF) and other similar agencies. This document provided a comprehensive methodological framework for the collection and compilation of tourism statistics in all countries.

3. Basic Concepts Of Tourism Statistics

While measuring tourism activity, it is important for us to understand the basic forms of tourism, their classification and the basic tourism units that are considered as statistical units in the survey.
Tourism is classified as (a) International – Inbound and Outbound, and (b) Domestic. These three basic forms of tourism are defined as:
i) Domestic Tourism: residents visiting their own country.
ii) Inbound Tourism: non-residents travelling in the given country.
iii) Outbound Tourism: residents travelling in another country.

Another way to look at these basic forms of tourism is by dividing them into the following categories:
i) Internal Tourism: includes domestic and inbound tourism.
ii) National Tourism: includes domestic and outbound tourism.
iii) International Tourism: includes inbound and outbound tourism.

Another term that is used is 'traveler', which refers to all individuals making a trip between two or more geographical locations. Travelers include visitors, direct transit travelers, commuters (travel for study or work) and other non-commuting travelers (e.g. diplomats, migrants etc.). The entire system of tourism statistics is based on 'visitors' who are engaged in the activity of tourism.

According to International Recommendations for Tourism Statistics 2008, the basic concepts in tourism statistics are:

a) Economic Territory and Economy – The term economic territory corresponds to the geographic reference (or country of reference). The term economy refers to the economic agents that are resident in the country of reference.

b) Residence – To distinguish different forms of tourism, the concept of residence helps in classifying visitors according to their place of origin.

c) Nationality and Citizenship – It is important to understand that a person's country of residence may differ from his / her nationality. For example, an Indian citizen might be residing in the United States of America and planning to visit Europe on a vacation. The nationality of an individual depends upon which country has issued his / her passport.
d) Usual environment of an individual – This is an important concept which was introduced to exclude from visitors those travelers who commute regularly between their place of usual residence and frequently visited places within their current life routine. For example, many people commute every day between Mumbai and Pune or between Agra and New Delhi for work or study. Such travelers cannot be considered domestic tourists.

e) Tourism Trips and Visits – The trips taken by tourists are called tourism trips. A tourism trip is characterized by its main destination, which is the place that is central to the decision to take that trip. For example, a domestic tourist who intends to visit the Taj Mahal will have Agra as the main destination, though he / she might visit Mathura as well. The term 'tourism visit' refers to a stay in a place visited during a tourism trip.

f) Tourism and being employed – In case a traveler is employed during the trip and the payment received is only incidental (and not the purpose of the trip), the traveler will still be considered a visitor. It is important to understand the employer-employee relationship to examine whether a traveler can be classified as a tourist or not.

The following international travelers are excluded from the visitors list – all forms of workers (border, seasonal, short-term, long-term), nomads and refugees, transit passengers, crew of public modes of transport (airlines, ships), long-term students and patients, diplomats, consular staff, military personnel or any other armed forces.

4. Demand Perspective of Tourism Statistics

As tourism is primarily an economic activity, tourism statistics are viewed from a demand-supply perspective. Tourism is primarily seen as a demand-side phenomenon.
Demand can be measured by considering four elements:
a) people - includes all visitors / tourists
b) money - includes the expenditure incurred by the tourists
c) time - includes the travel duration and period of stay
d) space - includes the distance and length of the trips

International Recommendations for Tourism Statistics 2008 laid emphasis on the characterization of the visitor, the tourism trip and tourism expenditure. It proposed that the following information should be captured:

▪ Personal characteristics of the visitor – age, gender, economic activity status, occupation, annual household income and education.

For the characteristics of tourism trips, the report recommended capturing the following information:

▪ Main purpose of the trip – This information is essential as it helps in differentiating between a traveler and a visitor. The various classifications include – holiday, leisure and recreation; visiting friends and relatives; education and training; health and medical care; religion or pilgrimage; shopping; transit; business and professional; or others. It is important to understand that a visitor may undertake additional activities considered as secondary. For example, a business tourist might also like to visit places of historical importance or go shopping.

▪ Types of Tourism Product – To market specific packages or destinations, tourism professionals use the term 'tourism product', which represents a combination of different aspects around a specific centre of interest. There is no international classification, and each country defines tourism products in its own way.

▪ Duration of trip or visit – To assess the level of demand for tourism services, the duration of the trip is an important input. The duration is also important for estimating the expenditure associated with the trip.
Generally, the number of nights is used to assess the duration of the trip. Based on the number of nights, destinations can differentiate between long stays (four nights or more) and short stays (less than four nights).

▪ Origin and Destination – According to the report, all inbound trips should be classified based on country of residence rather than nationality. Similarly, for outbound trips, the main destination of the trip should be considered. For domestic tourism, the place of usual residence of the visitor should be considered.

▪ Modes of Transport – The main mode of transport is identified as the one:
a. on which most kilometres are travelled
b. on which most time is spent
c. which has the highest share of the transportation cost.
The standard classification of modes of transport includes:
a. Air – scheduled flights, unscheduled flights, private aircraft
b. Water – passenger line and ferry, cruise ship, yacht
c. Land – railways, bus, car, taxi, foot, two-wheeler or any other.

In addition to the above-mentioned information, the contribution of tourism to the economy is also measured using monetary variables.

▪ Tourism expenditure is referred to as "the amount paid for the acquisition of consumption goods and services, as well as valuables, for own use or to give away, for and during tourism trips." The expenditure can be incurred by the visitor himself or can be reimbursed by any other person or company. For example, the expenditure paid by the employer in the context of business travel will also be included. In case the expenditure is borne by the government or is reimbursed by a third party, it shall also be included in tourism expenditure.

The following payments are excluded from tourism expenditure – taxes and duties, payment of interest, purchase of financial or non-financial assets, purchase of goods for resale, donations or charities.
It is important to understand that tourism expenditure happens the moment the transfer of ownership of goods takes place or the services are delivered. It is not linked to the time of payment. For example, a tourist might use his / her credit card at the hotel and choose to pay the amount in equated monthly instalments (EMI). In this case, the tourism expenditure is incurred during the period of the stay and is not linked to the timing of the payments. In fact, all the goods and services acquired before the travel, specifically for the purpose of the trip, should also be included in tourism expenditure.

A major question that arises while assessing tourism expenditure is – which economy benefits? It is not necessary that all the expenditure is incurred by a visitor at the places visited during the trip. For example, an Indian tourist visiting New York might travel by Air India, and therefore the travel expenditure accrues to the country of origin and not to the country visited by the tourist. Similarly, compare a tourist purchasing clothes before the trip at his usual place of residence with one buying them at the destination. In the former case, the tourism expenditure accrues to the home economy, whereas in the latter it accrues to the place visited.

Depending upon the form of tourism, the expenditure can be classified as – domestic tourism expenditure, inbound tourism expenditure and outbound tourism expenditure.

5. Measurement Of Demand Elements

The guidelines for measuring the characteristics of visitors, tourism trips and tourism expenditure are well specified in the International Recommendations for Tourism Statistics 2008. These are:
1) The data collected, either through survey or any other procedure, should provide information on all visitors.
2) The classification should be identical to those used in expenditure surveys.
3) The data should be collected on the entry / departure cards at the borders, at the destination (accommodation surveys) or as part of a household survey (for domestic tourism).
4) The data on inbound and outbound tourism are generally captured by immigration authorities. Here the information on the purpose of the trip becomes critical to assess whether the traveler is to be considered a tourist or not.
5) The duration of the stay is an important determinant while examining the nature of tourists.
6) Countries should include a specific expenditure module in surveys of inbound visitors.
7) In case it is difficult to collect information through surveys at borders, data can be collected from guests at places of accommodation.
8) In the case of domestic and outbound tourists, tourism-specific household surveys are recommended.
9) Alternative estimation methods can also be used, drawing on administrative data from travel agencies and credit card companies.
10) In terms of assessing tourism expenditure, there are various challenges that should be addressed – the residency of the visitor and the provider should be clearly identified; in the case of a package tour, the components of the package should be identified; the modes of transport should be clearly identified; all expenditures, whether made by the visitor, a company or a third party, should be included.

6. Supply Perspective of Tourism Statistics

In addition to the demand perspective, it is equally important to understand the supply perspective. To understand and describe tourism in a country, it is important to study the supply of consumption goods and services to visitors. Tourism expenditure is made up of goods and services that are provided as part of the tourism supply. The statistical units from the supply perspective include institutional units and establishments.
All establishments with a particular tourism characteristic activity that serve visitors directly are included and are further categorized based on their main activity. For example, a hotel not only provides accommodation, but also provides food and beverage and other tourism-connected products. According to International Recommendations for Tourism Statistics 2008, the selected tourism industry components include:

▪ Accommodation for visitors – Such services are provided either on a commercial basis (e.g. hotels) or on a non-commercial basis (free stay with friends). Accommodation on a commercial basis can take various forms - fully furnished guest rooms, self-contained units with kitchen, shared accommodation, bed and breakfast units. The services provided by accommodation units can also vary, including food and beverage, laundry services, conference facilities, swimming pools and gyms etc.

Data collected from accommodation service providers (supply side) allow policy makers to understand the geographic diversity of the services available and to carry out an in-depth analysis of the market segments that are catered to. Surveys of accommodation establishments are very important from a policy perspective, as they provide direct information on tourism flows, and the number of nights stayed gives credible information on the duration of the stay. Many countries have made it mandatory for all accommodation providers to maintain a record of the stay of visitors.

The following indicators are generally used while describing accommodation capacities – number of operating months / days; number of rooms; number of beds; occupancy rates; revenue per available room.
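The last two indicators in this list are simple ratios. The sketch below uses the definitions commonly applied in the hotel industry (occupancy rate = room-nights sold ÷ room-nights available; revenue per available room = room revenue ÷ room-nights available). These formulas are not spelled out in the text above, and the function names and hotel figures are hypothetical.

```python
def occupancy_rate(room_nights_sold: int, room_nights_available: int) -> float:
    """Occupancy rate in percent (commonly used definition, assumed here)."""
    if room_nights_available <= 0:
        raise ValueError("room_nights_available must be positive")
    return 100.0 * room_nights_sold / room_nights_available


def revenue_per_available_room(room_revenue: float, room_nights_available: int) -> float:
    """Revenue per available room (RevPAR), commonly used definition."""
    if room_nights_available <= 0:
        raise ValueError("room_nights_available must be positive")
    return room_revenue / room_nights_available


# Hypothetical hotel: 200 rooms open for 30 nights = 6000 room-nights available
available = 200 * 30
print(occupancy_rate(4500, available))                 # share of room-nights sold, in %
print(revenue_per_available_room(360000.0, available))  # revenue per room-night available
```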
▪ Food and Beverage Serving Activities – Even though food and beverage serving activities are considered an important element of the tourism industry, establishments in this sector also serve many non-visitors, for example residents. Validating the expenditure on food and beverage becomes challenging, as these services can also be provided on a non-market basis by friends and family members. Also, the informal segment of the industry (e.g. street vendors) makes measurement difficult. According to the report, all countries should have a proper classification of establishments offering food and beverage service, which may include – full-service restaurants, self-service restaurants, take-away establishments, street vendors, bars, night clubs etc. While measuring the supply side, the following information should be captured – number of tables, number of seats, number of meals that can be served daily, number of meals served, number of drinks served.

▪ Passenger Transportation – One of the major components of travel expenditure is transportation, particularly in the case of international travel where visitors travel by air. Transportation expenditure falls under two categories – transportation to or from the destination and transportation at the destination. Further, the transportation services may have been purchased, or visitors may use their own resources (e.g. motorcycle, car or on foot). Third-party individuals / organizations can also provide free transportation (e.g. friends, employers).

▪ Travel Agencies and Other Reservation Activities – Visitors travelling to a destination often use the services of travel agents for making reservations and bookings of transport, accommodation and other local recreation activities.
Since travel agencies act as mediators between the tourist and the actual service provider, the total value paid by the customer is split into two parts – the margin of the travel agent and the value of the tourism services included. Data should be collected from travel agencies and should include – number and value of products sold; types of clients; categories of destination sold; trips with and without packages.

With regard to the measurement issues, the following points should be kept in mind:
1. Classification of the type of accommodation should correspond to the data collected from the visitors.
2. The data from unincorporated businesses (e.g. family homes etc.) should also be addressed and captured.
3. The data on available rooms, beds and occupancy rate should be captured on a frequent basis and should be updated.
4. In the context of food and beverage services, due care should be taken to eliminate the services provided to non-visitors. Also, methods should be developed to include the informal economy, such as street vendors.

While developing tourism statistics, all countries are encouraged to follow guidelines such as:
A. Reliable statistical sources should be used to develop the estimates.
B. The data should be collected in an ongoing process and all the observations should be statistical in nature.
C. Data collected should be comparable between countries and over time.
D. Data should be internally consistent.

7. TOURISM SATELLITE ACCOUNT

In 2010, UNWTO, in collaboration with OECD, Eurostat and the UN Statistics Division, published Tourism Satellite Account: Recommended Methodological Framework (TSA: RMF) 2008. The purpose of a TSA is to analyze the demand for goods and services associated with the activity of visitors and to understand how the supply of these goods and services interacts with other economic activities. The TSA: RMF takes both tourism expenditure and tourism consumption into account.
While tourism expenditure refers to the amount paid for the acquisition of consumption goods and services for and during the trip, tourism consumption goes beyond tourism expenditure and includes services associated with vacation accommodation, tourism social transfers and other imputed consumption. Tourism consumption includes barter transactions, transactions on own account, remuneration in kind, as well as transactions by governments described as social transfers.

The complete Tourism Satellite Account (TSA) provides:
▪ Macroeconomic aggregates that provide information on tourism direct gross value added (TDGVA) and tourism direct gross domestic product (TDGDP).
▪ Detailed data on tourism consumption.
▪ Detailed data on production by tourism industries, including information related to employment, gross fixed capital formation and linkage with other economic activities.
▪ A link between non-monetary information and economic data.

A TSA can be viewed from two different perspectives – as a statistical tool and as a framework for the development of tourism statistics. The framework can be used by various countries as a starting point to improve their system of tourism statistics.

Example for TSA reports in Egypt:

Chapter 3: Tabulation and Display of Statistical Data

3.1: Tabulation of statistical data

Data tabulation means displaying the raw data in appropriate tables so that they can be summarized, understood and absorbed, results can be deduced from them, and they can be compared with other data.
The presentation and tabulation of statistical data is considered the second step of statistical analysis (after collecting the raw data). The researcher counts and classifies the data and presents them in a concise manner that helps to understand and analyze them statistically, to identify and describe them, to compare them with other phenomena, and to draw statistical implications for the study population.

3.2: Display of statistical data

The way the data are presented depends on the type of data and the facts to be presented. There are two main ways to display and tabulate statistical data:
1. Tabular display of statistical data.
2. Graphic display of statistical data.

First: the tabular presentation of the statistical data:

After sorting the observations and identifying the characteristics that distinguish them, the results are recorded in appropriate tables that show the final form of the distinct groups. This process, in which data are collected in distinct and homogeneous groups, is called the classification process. Classification may be:
1. Geographical classification.
2. Historical or chronological classification.
3. Qualitative or descriptive classification.
4. Quantitative classification.

The following forms of statistical tables can be distinguished:

1. Tabulating the raw data into a simple frequency table:

A simple table is a table in which the values (for example, grades) are arranged in ascending order in the first column, while the second column, called the frequency (repetition) column, gives the number of times each value or event is repeated.

Example 1: The following data are the grades obtained by twenty students in the subject of tourism statistics in the first year, Department of Tourism Studies, in the end-of-year exam:

What is required: tabulate these data in a simple frequency distribution table.
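As an illustration of how a simple frequency table is formed, the following Python sketch counts how often each distinct value occurs and lists the values in ascending order. The function name and the grades used here are made up for illustration; they are not the data of Example 1.

```python
from collections import Counter


def simple_frequency_table(values):
    """Return (value, frequency) rows with the values in ascending order."""
    counts = Counter(values)
    return [(x, counts[x]) for x in sorted(counts)]


# Hypothetical grades, used only to show the layout of the table
grades = [12, 15, 12, 18, 15, 12, 20, 18, 15, 12]

print("  x   f")
for x, f in simple_frequency_table(grades):
    print(f"{x:>3} {f:>3}")
```

The frequencies in the second column always add up to the number of observations, which is a quick check that no value has been missed.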
The solution: The distinct values are arranged in ascending order and placed in the first column of the table, called (x). The number of repetitions is recorded with tally marks in the second column, and the third column gives the frequency, denoted by the symbol (f).

Example 2: The following data are the grades (ratings) of 20 students in the front office subject in the third year of the Hotel Studies Department in the academic year 2015/2016. What is required: put these data in a simple table.

The solution:

2. Tabulating the data in a frequency table with categories:

Before preparing this table, we will first learn the meaning of categories and the ways to write them.

Meaning of categories:
A category (class) is a set of data that are very similar in characteristics. If the number of raw data values obtained from the questionnaire increases, simple tables cannot be used to express them, otherwise we would need hundreds of pages. Instead, the data are divided into groups with similar attributes, which are called classes.

How to write categories:
There are several ways to write categories, which are as follows:

First method: We mention both the minimum and the maximum for the category, as in the following table:

For example, the first category is pronounced (20 to 30) and not (20 - 30). This method is defective because the end of the first category is the same as the beginning of the second category, and so on. In this case, we do not know to which category a value on the boundary belongs.
The second method: We mention both the minimum and the upper limit for the category, but we leave an interval of one integer between the end of the first category and the beginning of the second category, and so on, as in the following table:

The drawback of this method is that it is not valid for data that contain fractions.

Third method: We mention only the minimum for the category and put a dash after it; the first category is pronounced, for example, (10 to less than 20). This method is suitable for all phenomena.

Fourth method: We mention only the upper limit of the category and put a dash before it; the first category is pronounced, for example, (more than zero to 20). This method is suitable for all phenomena as well, but it is less common.

Steps to construct a frequency distribution table with categories:
1. Calculate the range = largest value - smallest value.
2. Calculate the number of classes = 1 + 3.3 × log(n), where n is the number of observations and log is the base-10 logarithm.
3. Calculate the class length = range / number of classes.
4. Choose the beginning of the first category: its minimum is equal to the lowest value in the data or slightly less than it, for example a round number ending in zero, to simplify later calculations.
5. Build the table and enter the tally marks that represent the frequency.

Example 3: A researcher has collected data representing the computer test scores for fifty students of the second stage of high school in the following table. What is required: prepare a frequency distribution table with categories for these data.
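The construction steps listed above can be sketched in Python as follows. This is an illustrative sketch, not the lecture's own procedure: the function names are ours, classes are written in the third style ("lower limit to less than upper limit"), and the class length is rounded up rather than to the nearest integer so that the largest value is always covered. The scores used are invented and only share the size (50), minimum (20) and maximum (88) mentioned for this example.

```python
import math


def class_plan(values):
    """Return (range, number of classes, class length) for a list of numbers."""
    n = len(values)
    data_range = max(values) - min(values)
    # Number of classes = 1 + 3.3 * log10(n), rounded to the nearest whole number
    num_classes = round(1 + 3.3 * math.log10(n))
    # Assumption: round the class length up (at least 1) so every value fits
    class_length = max(1, math.ceil(data_range / num_classes))
    return data_range, num_classes, class_length


def grouped_frequency_table(values, start=None):
    """Return (lower, upper, frequency) rows; each class is lower <= x < upper."""
    _, _, class_length = class_plan(values)
    lower = min(values) if start is None else start
    rows = []
    # Keep adding classes until the largest value has been covered
    while lower <= max(values):
        upper = lower + class_length
        frequency = sum(1 for v in values if lower <= v < upper)
        rows.append((lower, upper, frequency))
        lower = upper
    return rows


# Invented scores: 50 observations between 20 and 88
scores = [20] * 5 + [35] * 10 + [47] * 12 + [58] * 9 + [66] * 7 + [79] * 4 + [88] * 3

print(class_plan(scores))
for lower, upper, f in grouped_frequency_table(scores):
    print(f"{lower} to less than {upper}: {f}")
```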
The solution:

Range = largest value - smallest value = 88 - 20 = 68

Number of classes = 1 + 3.3 × log(n) = 1 + 3.3 × log(50) = 1 + 3.3 × 1.699 ≈ 6.6

We round the number of categories to the nearest whole number, so Number of classes = 7

Class length = Range / Number of classes = 68 / 7 ≈ 9.7

We round the class length to the nearest integer, so Class length = 10

We choose the beginning of the first category to be the smallest value = 20

We then build the table as follows:

3. Tabulating the data in the ascending cumulative frequency table:

The ascending cumulative frequency of a category is its frequency added to the frequencies of all previous categories, so that the ascending cumulative frequency of the last category is equal to the total of the frequencies.

Example 4: From the same data as the previous example, create the ascending cumulative frequency table.

The solution: Following the same steps as before, we create the frequency distribution table with categories, and from it we create the ascending cumulative frequency table as follows:

4. Tabulating the data in the descending cumulative frequency table:

The descending cumulative frequency of a category is its frequency added to the frequencies of all subsequent categories, so that the descending cumulative frequency of the first category is equal to the total of the frequencies.

Example 5: From the same data as the previous example, create the descending cumulative frequency table.

The solution: Following the same steps as before, we create the frequency distribution table with categories, and from it we create the descending cumulative frequency table as follows:

5. Double table

It is a table that links two variables at the same time, where each variable has its own categories.
It is built in several steps, as follows:
1. Define the two variables.
2. Determine the independent variable and the dependent variable.
3. Define the categories of each variable.
4. Set up the table so that the independent variable occupies the top of the table, that is, it is horizontal, and the dependent variable runs down the side of the table, that is, it is vertical.
5. Enter the tally marks that represent the frequencies.
6. Rewrite the table with numbers.

Example 6: The following table shows the data obtained by a researcher in a study of gender and the watching of educational programs among a group of third-year secondary students. It is required to form a double table for the relationship between the two variables (gender and watching educational programs).

The solution:
1. The two variables are gender and watching educational programs.
2. The independent variable is gender, and the dependent variable is watching educational programs.
3. The categories of the gender variable are (male, female), and the categories of the watching variable are (watches, does not watch).
4. Set up the table so that the independent variable occupies the top of the table, that is, it is horizontal, and the dependent variable runs down the side, that is, it is vertical, as follows:

Second: Graphic display of statistical data

The graphic display of statistical data summarizes the data in a form that makes the characteristics of the subject of the study easier to understand.
The methods of displaying classified data differ from those used for unclassified data, and we discuss each of them in detail as follows:

First: Graphic display of unclassified data:

Unclassified data are individual values, that is, there are no categories. There are several ways to display unclassified data, including the following:

(1) Simple bar graph method: In this method, the x-axis represents the values of the variable, and the y-axis represents the quantity corresponding to each of them. A column is drawn at each value of the variable, and its height represents the corresponding quantity.

Example 7: The following table shows the numbers of students in some departments of the Faculty of Arts, Mansoura University. It is required to display this data using the simple bar graph method.

(2) Simple curve method: In this method, the x-axis represents the variable, while the y-axis represents the value of the variable. A point is plotted for each value of the variable on the x-axis and the corresponding value on the y-axis, and the points are then connected by a smooth curve drawn by hand.

Example 8: The following table shows the numbers of students in some departments of the Faculty of Arts, South Valley University. It is required to display this data using the simple curve method.

(3) Broken line method: In this method, the x-axis represents the variable, while the y-axis represents the value of the variable. A point is plotted for each value of the variable on the x-axis and the corresponding value on the y-axis, and the points are then connected by straight line segments using a ruler.

Example 9: The following table shows the numbers of students in some departments of the Faculty of Arts, Mansoura University. It is required to display this data using the broken line method.
(4) Pie chart (circle diagram) method: In this method, a circle is drawn, then the angle of the sector is calculated for each value separately, and each angle is drawn inside the circle until the circle is complete. The angle of each sector is calculated from the relationship:

$\text{sector angle} = \dfrac{\text{value}}{\text{total of all values}} \times 360^\circ$

Example 10: The following table shows the numbers of students in some departments of the Faculty of Arts, Mansoura University. It is required to display this data using the circle graph method.

(5) Adjacent bar graph method: This method is also called the adjacent column graph method. It is like the simple bar graph method, but a number of adjacent columns are drawn, each representing one of the values of the variable.

Example 11: The following table shows the numbers of students in some departments of the Faculty of Arts, Mansoura University. It is required to display this data using the adjacent column graph method.

(6) Segmented bar graph method: This method is like the simple bar graph method, but a column is drawn representing the first value of the variable, followed by a column for each of the remaining values, with each column beginning where the previous one ends.

Example 12: The following table shows the numbers of students in some departments of the Faculty of Arts, Mansoura University. It is required to display this data using the segmented column method.

Second: Graphic display of classified data:

Classified data are data that are divided into categories.
There are several ways to display classified data, including the following:

(1) Histogram: A column is assigned to each category and its frequency, so that the class length is the base of the column and the frequency is its height. It is preferable to leave enough space before the first category and after the last category. The middle of each column is the center of the category.

Example 13: Show this table graphically using a histogram.

(2) Frequency polygon: A point is assigned to each category and its frequency, so that its x-coordinate is the center of the category and its y-coordinate is the frequency. Successive points are then connected by straight lines.

Note: The area under the histogram = the area under the frequency polygon.

Example 14: Show this table graphically using the frequency polygon.

(3) Frequency curve: After plotting the points as in the previous method, we connect each two successive points with a smooth curve drawn by hand.

Example 15: Show this table graphically using the frequency curve.

Chapter 4: Descriptive Statistics: Numerical Methods

Measures of Central Tendency for Ungrouped Data

We often represent a data set by numerical summary measures, usually called the typical values.
A measure of central tendency gives the center of a histogram or a frequency distribution curve. This section discusses three different measures of central tendency: the mean, the median, and the mode; however, a few other measures of central tendency, such as the trimmed mean, the weighted mean, and the geometric mean, are explained in exercises following this section. We will learn how to calculate each of these measures for ungrouped data. Data that give information on each member of the population or sample individually are called ungrouped data, whereas grouped data are presented in the form of a frequency distribution table.

1- Mean

The mean, also called the arithmetic mean, is the most frequently used measure of central tendency. This book will use the words mean and average synonymously. For ungrouped data, the mean is obtained by dividing the sum of all values by the number of values in the data set:

$\text{Mean} = \dfrac{\text{sum of all values}}{\text{number of values}}$

The mean calculated for sample data is denoted by $\bar{x}$ (read as "x bar"), and the mean calculated for population data is denoted by $\mu$ (Greek letter mu). We know from the discussion in Chapter 2 that the number of values in a data set is denoted by $n$ for a sample and by $N$ for a population. In Chapter 1, we learned that a variable is denoted by $x$, and the sum of all values of $x$ is denoted by $\sum x$. Using these notations, we can write the following formulas for the mean.

Mean for population data: $\mu = \dfrac{\sum x}{N}$

Mean for sample data: $\bar{x} = \dfrac{\sum x}{n}$

where $\sum x$ is the sum of all values, $N$ is the population size, $n$ is the sample size, $\mu$ is the population mean, and $\bar{x}$ is the sample mean.

EXAMPLE 3–1 Table 3.1 lists the total cash donations (rounded to millions of dollars) given by eight U.S. companies during the year 2010 (Source: Based on U.S.
Internal Revenue Service data analyzed by The Chronicle of Philanthropy and USA TODAY).

Table 3.1 Cash Donations in 2010 by Eight U.S. Companies

Company              Cash Donations (millions of dollars)
Wal-Mart             319
Exxon Mobil          199
Citigroup            110
Home Depot           63
Best Buy             21
Goldman Sachs        315
American Express     26
Nike                 63

Find the mean of cash donations made by these eight companies.

Solution The variable in this example is the 2010 cash donations by a company. Let us denote this variable by $x$. Then, the eight values of $x$ are

$x_1 = 319,\ x_2 = 199,\ x_3 = 110,\ x_4 = 63,\ x_5 = 21,\ x_6 = 315,\ x_7 = 26,\ x_8 = 63$

where $x_1 = 319$ represents the 2010 cash donations (in millions of dollars) by Wal-Mart, $x_2 = 199$ represents the 2010 cash donations by Exxon Mobil, and so on. The sum of the 2010 cash donations by these eight companies is

$\sum x = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 = 319 + 199 + 110 + 63 + 21 + 315 + 26 + 63 = 1116$

Note that the given data include only eight companies. Hence, they represent a sample, and $n = 8$. Substituting the values of $\sum x$ and $n$ in the sample formula, we obtain the mean of the 2010 cash donations of the eight companies as follows:

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{1116}{8} = 139.5$

Thus, these eight companies donated an average of $139.5 million in 2010 for charitable purposes.

EXAMPLE 3–2 The following are the ages (in years) of all eight employees of a small company:

53 32 61 27 39 44 49 57

Find the mean age of these employees.

Solution: Because the given data set includes all eight employees of the company, it represents the population. Hence, $N = 8$. We have

$\sum x = 53 + 32 + 61 + 27 + 39 + 44 + 49 + 57 = 362$

The population mean is

$\mu = \dfrac{\sum x}{N} = \dfrac{362}{8} = 45.25$

Thus, the mean age of all eight employees of this company is 45.25 years, or 45 years and 3 months.

Reconsider Example 3–2. If we take a sample of three employees from this company and calculate the mean age of those three employees, this mean will be denoted by $\bar{x}$.
Suppose the three values included in the sample are 32, 39, and 57. Then, the mean age for this sample is

$\bar{x} = \dfrac{32 + 39 + 57}{3} = \dfrac{128}{3} \approx 42.67$ years

If we take a second sample of three employees of this company, the value of $\bar{x}$ will (most likely) be different. Suppose the second sample includes the values 53, 27, and 44. Then, the mean age for this sample is

$\bar{x} = \dfrac{53 + 27 + 44}{3} = \dfrac{124}{3} \approx 41.33$ years

Consequently, we can state that the value of the population mean $\mu$ is constant. However, the value of the sample mean $\bar{x}$ varies from sample to sample. The value of $\bar{x}$ for a particular sample depends on what values of the population are included in that sample.

Sometimes a data set may contain a few very small or a few very large values. As mentioned in Chapter 2, such values are called outliers or extreme values. A major shortcoming of the mean as a measure of central tendency is that it is very sensitive to outliers. Example 3–3 illustrates this point.

EXAMPLE 3–3 Table 3.2 lists the total number of homes lost to foreclosure in seven states during 2010.

Note that the number of homes foreclosed in California is very large compared to those in the other six states. Hence, it is an outlier. Show how the inclusion of this outlier affects the value of the mean.

Solution If we do not include the number of homes foreclosed in California (the outlier, 173,175), the mean of the number of foreclosed homes in the other six states is

$\text{Mean without the outlier} = \dfrac{49{,}723 + 20{,}352 + 10{,}824 + 40{,}911 + 18{,}038 + 61{,}848}{6} = \dfrac{201{,}696}{6} = 33{,}616$ homes

If we include California, the mean of all seven values is

$\text{Mean with the outlier} = \dfrac{201{,}696 + 173{,}175}{7} = \dfrac{374{,}871}{7} = 53{,}553$ homes

Thus, including the single outlier raises the mean considerably.

2- Median

Another important measure of central tendency is the median. It is defined as follows.

Median: The median is the value of the middle term in a data set that has been ranked in increasing order.

As is obvious from the definition of the median, it divides a ranked data set into two equal parts. The calculation of the median consists of the following two steps:
1. Rank the data set in increasing order.
2. Find the middle term. The value of this term is the median.
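The two steps above can be sketched in Python. The function name `median_of` is an assumption made for this illustration. When the number of values is even there is no single middle term, so the sketch averages the two middle terms. The two calls use the ages from Example 3–2 and the first sample of three employees taken from it.

```python
def median_of(values):
    """Median of a non-empty list of numbers, following the two steps."""
    ranked = sorted(values)      # Step 1: rank the data in increasing order
    n = len(ranked)
    middle = n // 2              # Step 2: locate the middle term
    if n % 2 == 1:
        # Odd number of values: a single middle term exists.
        return ranked[middle]
    # Even number of values: average the two middle terms.
    return (ranked[middle - 1] + ranked[middle]) / 2


print(median_of([32, 39, 57]))                       # sample of three ages
print(median_of([53, 32, 61, 27, 39, 44, 49, 57]))   # all eight ages
```

For the eight ages the ranked data are 27, 32, 39, 44, 49, 53, 57, 61, so the median is the average of the fourth and fifth terms.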
Note: The value of the middle term in a data set ranked in decreasing order will also give the value of the median.

Note: If the number of observations in a data set is odd, then the median is given by the value of the middle term in the ranked data. However, if the number of observations is even, then the median is given by the average of the values of the two middle terms.

EXAMPLE 3–4 Refer to the data on the number of homes foreclosed in seven states given in Table 3.2 of Example 3–3. Those values are listed below.

173,175 49,723 20,352 10,824 40,911 18,038 61,848

Find the median for these data.

Solution First, we rank the given data in increasing order as follows:

10,824 18,038 20,352 40,911 49,723 61,848 173,175

Since there are seven values in this data set, the middle term is the fourth term, and the median is given by the value of the fourth term in the ranked data, which is 40,911.

Thus, the median number of homes foreclosed in these seven states was 40,911 in 2010.

EXAMPLE 3–5 Table 3.3 gives the total compensations (in millions of dollars) for the year 2010 of the 12 highest-paid CEOs of U.S. companies.

Find the median for these data.

Solution: First, we rank the total compensations of the 12 CEOs as follows:

21.6 21.7 22.9 25.2 26.5 28.0 28.2 32.6 32.9 70.1 76.1 84.5

There are 12 values in this data set. Because there is an even number of values in the data set, the median will be given by the average of the two middle values. The two middle values are the sixth and seventh in the arranged data, and these two values are 28.0 and 28.2. The median, which is given by the average of these two values, is calculated as follows:

$\text{Median} = \dfrac{28.0 + 28.2}{2} = \dfrac{56.2}{2} = 28.1$

Thus, the median compensation for 2010 of these 12 CEOs is $28.1 million.
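The effect of an outlier on the mean and on the median can be checked with Python's standard `statistics` module. The sketch below uses the foreclosure values of Example 3–4 and the compensations of Example 3–5; the variable names are assumptions made for this illustration.

```python
from statistics import mean, median

# Homes foreclosed in seven states (Example 3-4); 173,175 is the outlier.
foreclosures = [173175, 49723, 20352, 10824, 40911, 18038, 61848]
without_outlier = [v for v in foreclosures if v != max(foreclosures)]

print(mean(foreclosures))       # mean of all seven values: 53553
print(mean(without_outlier))    # mean without the outlier: 33616
print(median(foreclosures))     # median of all seven values: 40911
print(median(without_outlier))  # median of the remaining six values

# Total compensations of the 12 CEOs (Example 3-5), in millions of dollars.
compensations = [21.6, 21.7, 22.9, 25.2, 26.5, 28.0,
                 28.2, 32.6, 32.9, 70.1, 76.1, 84.5]
print(round(median(compensations), 1))  # average of the 6th and 7th values
```

Removing the outlier lowers the mean by almost 20,000 homes, while the median moves far less, which is why the median is the safer summary for data of this kind.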
The median gives the center of a histogram, with half of the data values to the left of the median and half to the right of the median. The advantage of using the median as a measure of central tendency is that it is not influenced by outliers. Consequently, the median is preferred over the mean as a measure of central tendency for data sets that contain outliers.

3. Mode

Mode is a French word that means fashion, that is, an item that is most popular or common. In statistics, the mode represents the most common value in a data set.

Mode definition: The mode is the value that occurs with the highest frequency in a data set.

EXAMPLE 3–6 The following data give the speeds (in miles per hour) of eight cars that were stopped on I-95 for speeding violations.

77 82 74 81 79 84 74 78

Find the mode.

Solution In this data set, 74 occurs twice, and each of the remaining values occurs only once. Because 74 occurs with the highest frequency, it is the mode. Therefore,

Mode = 74 miles per hour

A major shortcoming of the mode is that a data set may have none or may have more than one mode, whereas it will have only one mean and only one median. For instance, a data set with each value occurring only once has no mode. A data set with only one value occurring with the highest frequency has only one mode. The data set in this case is called unimodal. A data set with two values that occur with the same (highest) frequency has two modes. The distribution, in this case, is said to be bimodal. If more than two values in a data set occur with the same (highest) frequency, then the data set contains more than two modes and it is said to be multimodal.

EXAMPLE 3–7 Last year's incomes of five randomly selected families were $76,150, $95,750, $124,985, $87,490, and $53,740. Find the mode.

Solution Because each value in this data set occurs only once, this data set contains no mode.

EXAMPLE 3–8 A small company has 12 employees.
Their commuting times (rounded to the nearest minute) from home to work are 23, 36, 12, 23, 47, 32, 8, 12, 26, 31, 18, and 28, respectively. Find the mode for these data.

Solution In the given data on the commuting times of these 12 employees, each of the values 12 and 23 occurs twice, and each of the remaining values occurs only once. Therefore, this data set has two modes: 12 and 23 minutes.

EXAMPLE 3–9 The ages of 10 randomly selected students from a class are 21, 19, 27, 22, 29, 19, 25, 21, 22, and 30 years, respectively. Find the mode.

Solution This data set has three modes: 19, 21, and 22. Each of these three values occurs with a (highest) frequency of 2.

One advantage of the mode is that it can be calculated for both kinds of data, quantitative and qualitative, whereas the mean and median can be calculated for only quantitative data.

EXAMPLE 3–10 The status of five students who are members of the student senate at a college are senior, sophomore, senior, junior, and senior, respectively. Find the mode.

Solution Because senior occurs more frequently than the other categories, it is the mode for this data set. We cannot calculate the mean and median for this data set.

To sum up, we cannot say for sure which of the three measures of central tendency is a better measure overall. Each of them may be better under different situations. Probably the mean is the most-used measure of central tendency, followed by the median. The mean has the advantage that its calculation includes each value of the data set. The median is a better measure when a data set includes outliers. The mode is simple to locate, but it is not of much use in practical applications.

4. Relationships Among the Mean, Median, and Mode

Two of the many shapes that a histogram or a frequency distribution curve can assume are symmetric and skewed.
This section describes the relationships among the mean, median, and mode for three such histograms and frequency distribution curves. Knowing the values of the mean, median, and mode can give us some idea about the shape of a frequency distribution curve.

1. For a symmetric histogram and frequency distribution curve with one peak (see Figure 3.2), the values of the mean, median, and mode are identical, and they lie at the center of the distribution.

2. For a histogram and a frequency distribution curve skewed to the right (see Figure 3.3), the value of the mean is the largest, that of the mode is the smallest, and the value of the median lies between these two. (Notice that the mode always occurs at the peak point.) The value of the mean is the largest in this case because it is sensitive to outliers that occur in the right tail. These outliers pull the mean to the right.

3. If a histogram and a frequency distribution curve are skewed to the left (see Figure 3.4), the value of the mean is the smallest and that of the mode is the largest, with the value of the median lying between these two. In this case, the outliers in the left tail pull the mean to the left.

5. Percentile:

The $p$th percentile is a value such that at least $p$ percent of the data have this value or less and at least $(100 - p)$ percent of the data have this value or more.

Note: 50th percentile = median.

The procedure to calculate the $p$th percentile:
1. Arrange the data in ascending order.
2. Compute an index $i$: $i = n \cdot \left( \dfrac{p}{100} \right)$, where $n$ is the number of values.
3. (a) If $i$ is not an integer, round up: the next integer greater than $i$ denotes the position of the $p$th percentile.
(b) If $i$ is an integer, the $p$th percentile is the average of the data values in positions $i$ and $i + 1$.

Example 1 (continued): Find the 40th percentile and the 26th percentile for the previous data.

[Solution] Step 1: the data in asce