
L2_احصاء تطبيقي في علم البيانات 85z .pdf


Full Transcript


LECTURE 2: KEY CONCEPTS

Probability: Size of the Sample Space

The Fundamental Counting Principle is a shortcut for finding the size of the sample space when there are many trials and outcomes: if one event has p possible outcomes and another event has m possible outcomes, then there are a total of p × m possible outcomes for the two events together. It gives the number of outcomes without listing and counting every one of them.

Examples
Rolling two six-sided dice: each die has 6 equally likely outcomes, so the sample space has 6 × 6 = 36 equally likely outcomes.
Flipping three coins: each coin has 2 equally likely outcomes, so the sample space has 2 × 2 × 2 = 8 equally likely outcomes.
Rolling a six-sided die and flipping a coin: the sample space has 6 × 2 = 12 equally likely outcomes.

Terms Used in Probability and Statistics

Various terms are used in probability and statistics, such as:
Random Experiment
Sample Space
Expected Value

Random Experiment
An experiment whose result cannot be predicted until it is observed is called a random experiment. For example, when we throw a die, the result is uncertain to us: we can get any outcome from 1 to 6. Hence, this experiment is random.

Sample Space
A sample space is the set of all possible results or outcomes of a random experiment. Suppose we throw a die at random; the sample space for this experiment is then the set of all possible outcomes of throwing a die:
Sample Space = {1, 2, 3, 4, 5, 6}

Expected Value
The expected value is the mean of a random variable: the long-run average value anticipated from a random experiment. It is also called the expectation, mathematical expectation, or first moment. For example, if we roll a die with six faces, the expected value is the average of all the possible outcomes:
(1 + 2 + 3 + 4 + 5 + 6)/6 = 21/6 = 3.5

Probability Terms and Definitions

Term: Sample Space
Definition: The set of all the possible outcomes that can occur in any trial.
Example: 1. Tossing a coin, Sample Space (S) = {H, T}. 2. Rolling a die, Sample Space (S) = {1, 2, 3, 4, 5, 6}.

Term: Experiment or Trial
Definition: A series of actions where the outcomes are always uncertain.
Example: Tossing a coin, throwing a die.

Term: Event
Definition: A single outcome of an experiment (more generally, any set of outcomes).
Example: Getting heads while tossing a coin is an event.

Term: Outcome
Definition: A possible result of a trial or experiment.
Example: T (tail) is a possible outcome when a coin is tossed.

Term: Impossible Event
Definition: An event that cannot happen.
Example: In tossing a coin, it is impossible to get both head and tail at the same time.

Population and sample

Population
A population includes all the elements from the data set. Measurable characteristics of the population, such as the mean and standard deviation, are known as parameters. For example, all people living in Saudi Arabia make up the population of the country.

Types of Population
There are different types of population. They are:
Finite Population
Infinite Population
Existent Population
Hypothetical Population

Finite Population
The finite population is also known as a countable population, in which the population can be counted. In other words, it is defined as the population of all the individuals or objects that are finite. For statistical analysis, the finite population is more advantageous than the infinite population. An example of a finite population is the employees of a company.

Infinite Population
The infinite population is also known as an uncountable population, in which counting the units in the population is not possible. An example of an infinite population is the number of germs in a patient's body, which cannot practically be counted.

Existent Population
The existent population is defined as the population of concrete individuals. In other words, a population whose units are available in solid form is known as an existent population. Examples are books, tables, etc.
Hypothetical Population
A population whose units are not available in solid form is known as a hypothetical population. A population consists of sets of observations, objects, etc. that all have something in common, and in some situations the population is only hypothetical. Examples are the outcomes of rolling a die or tossing a coin.

Sample
A sample includes one or more observations that are drawn from the population, and a measurable characteristic of a sample is a statistic. Sampling is the process of selecting the sample from the population. For example, some people living in Saudi Arabia represent a sample of the population.

Differences between Population and Sample

Comparison: Meaning
Population: Collection of all the units or elements that possess common characteristics.
Sample: A subgroup of the members of the population.

Comparison: Includes
Population: Each and every element of a group.
Sample: Only a number of units of the population.

Comparison: Data Collection
Population: Such as a complete census.
Sample: Such as surveys.

Comparison: Focus on
Population: Identification of the characteristics.
Sample: Making inferences about the population.

Sampling frame
The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population).

Example of a sampling frame
You are doing research on working conditions at a social media marketing company. Your population is all 1000 employees of the company. Your sampling frame is the company's HR database, which lists the names and contact details of every employee.

Types of sampling:
Probability sampling
Non-probability sampling

Probability sampling methods
Probability sampling means that every member of the population has a chance of being selected. If you want to produce results that are representative of the whole population, probability sampling techniques are the most valid choice. They are often used in quantitative research, where the aim is to test a hypothesis about a broad population.

Types of probability sampling methods
There are four main types of probability samples.
1. Simple random sampling
2. Systematic sampling
3. Stratified sampling
4. Cluster sampling

1. Simple random sampling
In a simple random sample, every member of the population has an equal chance of being selected. Your sampling frame should include the whole population.
To conduct this type of sampling, you can use tools like random number generators or other techniques that are based entirely on chance.

Example of simple random sampling
You want to select a simple random sample of 100 of the 1000 employees of a social media marketing company. You assign a number from 1 to 1000 to every employee in the company database, and use a random number generator to select 100 numbers.

2. Systematic sampling
Systematic sampling is similar to simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals.

Example of systematic sampling
All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: e.g., number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.
If you use this technique, it is important to make sure that there is no hidden pattern in the list that might skew the sample. For example, if the HR database groups employees by team, and team members are listed in order of seniority, there is a risk that your interval might skip over people in junior roles, resulting in a sample that is skewed towards senior employees.

3. Stratified sampling
Stratified sampling involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample.
To use this sampling method, you divide the population into subgroups (called strata) based on the relevant characteristic (e.g., gender identity, age range, job role). Based on the overall proportions of the population, you calculate how many people should be sampled from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.

Example of stratified sampling
The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people.

4. Cluster sampling
Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population. Instead of sampling individuals from each subgroup, you randomly select entire subgroups.
If it is practically possible, you might include every individual from each sampled cluster. If the clusters themselves are large, you can also sample individuals from within each cluster using one of the techniques above. This is called multistage sampling.
This method is good for dealing with large and dispersed populations, but there is more risk of error in the sample, as there could be substantial differences between clusters. It is difficult to guarantee that the sampled clusters are really representative of the whole population.

Example of cluster sampling
The company has offices in 10 cities across the country (all with roughly the same number of employees in similar roles). You don't have the capacity to travel to every office to collect your data, so you use random sampling to select 3 offices. These are your clusters.

Non-probability sampling methods
In a non-probability sample, individuals are selected based on non-random criteria, and not every individual has a chance of being included. This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means the inferences you can make about the population are weaker than with probability samples, and your conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as representative of the population as possible.
Non-probability sampling techniques are often used in exploratory and qualitative research. In these types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial understanding of a small or under-researched population.

Types of non-probability sampling methods
There are five main types of non-probability samples.
1. Convenience sampling
2. Voluntary response sampling
3. Purposive sampling
4. Snowball sampling
5. Quota sampling

1. Convenience sampling
A convenience sample simply includes the individuals who happen to be most accessible to the researcher.
This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is representative of the population, so it can't produce generalizable results. Convenience samples are at risk of both sampling bias and selection bias.

Example of convenience sampling
You are researching opinions about student support services in your university, so after each of your classes, you ask your fellow students to complete a survey on the topic. This is a convenient way to gather data, but as you only surveyed students taking the same classes as you at the same level, the sample is not representative of all the students at your university.

2. Voluntary response sampling
Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g., by responding to a public online survey).
Voluntary response samples are always at least somewhat biased, as some people will inherently be more likely to volunteer than others, leading to self-selection bias.

Example of voluntary response sampling
You send out a survey form about student support services to all students at your university and a lot of students decide to complete it. This can certainly give you some insight into the topic, but the people who responded are more likely to be those who have strong opinions about the student support services, so you can't be sure that their opinions are representative of all students.

3. Purposive sampling
This type of sampling, also known as judgement sampling, involves the researcher using their expertise to select a sample that is most useful to the purposes of the research.
It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a specific topic rather than make statistical inferences, or where the population is very small and specific. An effective purposive sample must have clear criteria and rationale for inclusion. Always make sure to describe your inclusion and exclusion criteria and beware of observer bias affecting your arguments.

Example of purposive sampling
You want to know more about the opinions and experiences of students with special needs at your university, so you purposefully select a number of students with different support needs in order to gather a varied range of data on their experiences with student services.

4. Snowball sampling
If the population is hard to access, snowball sampling can be used to recruit participants via other participants. The number of people you have access to "snowballs" as you get in contact with more people. The downside here is also representativeness, as you have no way of knowing how representative your sample is due to the reliance on participants recruiting others. This can lead to sampling bias.

Example of snowball sampling
You are researching the experiences of parents of children with a special health condition in your city. Since there is no list of these people in the city, probability sampling isn't possible. You meet one person who agrees to participate in the research, and she or he puts you in contact with other people that she or he knows.

5. Quota sampling
Quota sampling relies on the non-random selection of a predetermined number or proportion of units. This is called a quota.
You first divide the population into mutually exclusive subgroups (called strata) and then recruit sample units until you reach your quota. These units share specific characteristics, determined by you prior to forming your strata. The aim of quota sampling is to control what or who makes up your sample.

Example of quota sampling
You want to explore consumer interest in a new produce delivery service in your city, focused on dietary preferences. You divide the population into meat eaters, vegetarians, and vegans, drawing a sample of 1000 people. Since the company wants to cater to all consumers, you set a quota of 200 people for each dietary group. In this way, all dietary preferences are equally represented in your research, and you can easily compare these groups. You continue recruiting until you reach the quota of 200 participants for each subgroup.

Sampling bias
Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others.
Sampling bias limits the generalizability of findings because it is a threat to validity, specifically population validity. In other words, findings from biased samples can only be generalized to populations that share characteristics with the sample.

Causes of sampling bias
Your choice of research design or data collection method can lead to sampling bias. This type of research bias can occur in both probability and non-probability sampling.

Sampling bias in probability samples
In probability sampling, every member of the population has a known chance of being selected. For instance, you can use a random number generator to select a simple random sample from your population.
Although this procedure reduces the risk of sampling bias, it may not eliminate it. If your sampling frame (the actual list of individuals that the sample is drawn from) does not match the population, this can result in a biased sample.

Example of sampling bias in a simple random sample
You want to study anxiety levels experienced by undergraduate students at your university using a simple random sample. You assign a number from 1 to 1500 to every student in the research participant database and use a random number generator to select 120 numbers.
Although you used a random sample, not every member of your target population (undergraduate students at your university) had a chance of being selected. Your sample misses anyone who did not sign up to be contacted about participating in research. This may bias your sample towards people who have lower anxiety levels and are more willing to participate in research.

Sampling bias in non-probability samples
A non-probability sample is selected based on non-random criteria. For instance, in a convenience sample, participants are selected based on accessibility and availability.
Non-probability sampling often results in biased samples because some members of the population are more likely to be included than others.

Example of sampling bias in a convenience sample
You want to study the popularity of plant-based foods amongst undergraduate students at your university. For convenience, you send out a survey to everyone enrolled in Introduction to Psychology courses at your university. Because this is a convenience sample, it is not representative of your target population.

Main types of sampling bias

Type: Self-selection bias
Explanation: People with specific characteristics are more likely to agree to take part in a study than others.
Example: People who are more thrill-seeking are likely to take part in pain research studies.

Type: Nonresponse bias
Explanation: People who refuse to participate or drop out from a study systematically differ from those who take part.
Example: In a study on stress and workload, employees with high workloads are less likely to participate. The resulting sample may not vary greatly in terms of workload.

Type: Undercoverage bias
Explanation: Some members of a population are inadequately represented in the sample.
Example: Administering general national surveys online may miss groups with limited internet access, such as the elderly and lower-income households.

Type: Pre-screening or advertising bias
Explanation: The way participants are pre-screened or where a study is advertised may bias a sample.
Example: When seeking volunteers to test a novel sleep intervention, you may end up with a sample that is more motivated to improve their sleep habits than the rest of the population. As a result, they may have been likely to improve their sleep habits regardless of the effects of your intervention.

How to avoid or correct sampling bias
Using careful research design and sampling procedures can help you avoid sampling bias. Also consider:
1. Define a target population and a sampling frame (the list of individuals that the sample will be drawn from). Match the sampling frame to the target population as much as possible to reduce the risk of sampling bias.
2. Make online surveys as short and accessible as possible.
3. Follow up on non-responses.
4. Avoid convenience sampling.
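
The Fundamental Counting Principle and the expected value of a die, both from the start of this lecture, can be checked by brute-force enumeration. The following is only a minimal Python sketch, not part of the lecture; the variable names (die, coin, sample_space_sizes) are chosen here for illustration. It lists every outcome with itertools.product and compares the counts with 6 × 6, 2 × 2 × 2 and 6 × 2.

```python
from fractions import Fraction
from itertools import product

die = range(1, 7)      # faces of a six-sided die
coin = ("H", "T")      # faces of a coin

# Enumerate every outcome explicitly instead of multiplying.
two_dice = list(product(die, repeat=2))
three_coins = list(product(coin, repeat=3))
die_and_coin = list(product(die, coin))

sample_space_sizes = {
    "two dice": len(two_dice),          # should equal 6 * 6
    "three coins": len(three_coins),    # should equal 2 * 2 * 2
    "die and coin": len(die_and_coin),  # should equal 6 * 2
}

# Expected value of one fair die: each face has probability 1/6,
# so the mean is (1 + 2 + 3 + 4 + 5 + 6) / 6.
expected_value = Fraction(sum(die), len(die))

print(sample_space_sizes)
print(float(expected_value))
```

Using Fraction keeps the expected value exact (21/6 reduces to 7/2) before it is shown as 3.5.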
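
The four probability sampling methods described earlier can be sketched on the lecture's running example of a company with 1000 employees. This is an illustrative sketch under stated assumptions, not a reference implementation: the employee records, the assignment of offices, the fixed seed, and all function names are hypothetical and are only there so that the example runs.

```python
import random

rng = random.Random(42)  # fixed seed (an assumption) so reruns give the same draw

# Hypothetical sampling frame: IDs 1..1000, 800 women and 200 men,
# spread evenly over 10 offices.
employees = [
    {"id": i, "gender": "F" if i <= 800 else "M", "office": (i - 1) % 10 + 1}
    for i in range(1, 1001)
]

def simple_random_sample(frame, n):
    # Every unit has the same chance of selection; no repeats.
    return rng.sample(frame, n)

def systematic_sample(frame, step):
    # Random start within the first `step` units, then every `step`-th unit.
    start = rng.randrange(step)
    return frame[start::step]

def stratified_sample(frame, key, n):
    # Split into strata, then sample each stratum in proportion to its size.
    # Rounding may make the total differ from n for less tidy proportions.
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(frame))
        sample.extend(rng.sample(members, share))
    return sample

def cluster_sample(frame, key, n_clusters):
    # Randomly pick whole clusters and keep every unit inside them.
    clusters = sorted({unit[key] for unit in frame})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in frame if unit[key] in chosen]

srs = simple_random_sample(employees, 100)
systematic = systematic_sample(employees, 10)
stratified = stratified_sample(employees, "gender", 100)
clustered = cluster_sample(employees, "office", 3)

print(len(srs), len(systematic), len(stratified), len(clustered))
```

With these assumptions the stratified draw contains 80 women and 20 men, matching the lecture's example, while the cluster draw keeps all employees of the 3 selected offices.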
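
The quota sampling procedure described earlier (recruit until each subgroup reaches its quota) can also be sketched in a few lines. The stream of respondents below is entirely hypothetical: the group weights, the seed, and the names passers_by and quota_sample are assumptions made for illustration, not data from the lecture. Only the quota of 200 per dietary group comes from the lecture's example.

```python
import random
from collections import Counter
from itertools import count

rng = random.Random(7)  # fixed seed (an assumption) for reproducibility

GROUPS = ["meat eater", "vegetarian", "vegan"]
WEIGHTS = [0.7, 0.2, 0.1]  # hypothetical shares of each group among respondents

def passers_by():
    # Hypothetical, endless stream of people the researcher happens to meet.
    for person_id in count(1):
        yield {"id": person_id, "diet": rng.choices(GROUPS, weights=WEIGHTS)[0]}

def quota_sample(stream, quota):
    # Non-random recruitment: accept people in arrival order
    # until every group has reached its quota.
    counts = Counter()
    sample = []
    for person in stream:
        group = person["diet"]
        if counts[group] < quota.get(group, 0):
            sample.append(person)
            counts[group] += 1
        if all(counts[g] >= q for g, q in quota.items()):
            break
    return sample, counts

quota = {group: 200 for group in GROUPS}
sample, counts = quota_sample(passers_by(), quota)

print(dict(counts))
```

Because people are accepted in the order they turn up rather than by a chance mechanism, the groups end up equally sized for comparison, but the sample is still a non-probability sample and carries the risk of sampling bias discussed above.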
