Data Collection: Types, Sources, and Methods

Questions and Answers

In systematic sampling, researchers select participants based on their subjective judgment of who would be most helpful.

False (B)

Qualitative data are measures of values or counts and are expressed as numbers.

False (B)

A Type I error occurs when we reject a false null hypothesis.

False (B)

In a well-designed experiment, it is important to introduce bias when assigning treatments to avoid confounding variables.

False (B)

The union of two disjoint sets contains all the elements that are common to both sets.

False (B)

A parameter is a numerical summary of a sample.

False (B)

The probability of an event and its complement always sums to 2.

False (B)

A hypothesis is defined as a proven fact that has been tested numerous times.

False (B)

In stratified sampling, the population is divided into subgroups that share similar characteristics, and then a random sample is taken from each subgroup.

True (A)

Experimental studies allow researchers to establish correlation but not causation.

False (B)

Flashcards

Data

Figures collected in a systematic manner for a predetermined purpose. Data can be defined as the quantitative or qualitative value of a variable and is one of the most vital aspects of any research study.

Primary Data

Collected from first-hand experience. It is more reliable and authentic, and has not been published anywhere.

Secondary Data

Data that have already been collected by others.

Sampling

Process of identifying a subset of a population that provides an accurate reflection of the whole.

Random Sampling

Picking respondents with no design or order, like picking names out of a hat.

Probability

Statistical measure of how likely an event is to occur or how likely it is that a proposition is true.

Conditional Probability

The probability of an event given that another event has occurred; it is based on only part of the sample space.

Hypothesis

A premise or claim that we want to test.

Null Hypothesis

Currently accepted value for a parameter.

Egon S. Pearson

Co-developed the Neyman-Pearson theory (1933).

Study Notes

Obtaining Data: Methods and Types

  • Data comes from figures collected systematically for a predetermined purpose
  • Data includes both quantitative and qualitative values of a variable for research

Types of Data

  • Quantitative data measure values or counts and are expressed as numbers
  • Qualitative data measure types (categories) and can be names, symbols, or number codes

Sources of Data

  • Primary data originates from first-hand experience; it is more reliable and authentic, and is unpublished
  • Secondary data refers to data that others have already collected

Data Collection Process

  • Data collection involves a researcher gathering the information required to address a research problem
  • Data collection factors: which data to collect, how to collect it, who will collect it, and when it should be collected

Methods of Collecting Primary Data

  • Primary data collection includes the observation method, interview method, questionnaire, and survey/schedule method

Observation Method

  • Observation involves collecting information without asking questions.
  • Advantages: Free of respondent bias, since nothing is self-reported; yields current data
  • Disadvantages: Time-consuming, and dependent on the observer's subjective interpretation
  • Divided into structured, unstructured, participant, non-participant, controlled and uncontrolled types

Interview Method

  • Interviewing is a direct method of data collection via social interaction
  • Advantages: In-depth information and flexibility
  • Disadvantages: Can be expensive and time-consuming
  • Types: Structured, unstructured, focused, clinical, group, qualitative, quantitative, individual, selection, and depth

Questionnaire

  • A questionnaire is a set of questions sent to respondents, who answer them on their own
  • Advantages: Low cost, free of interviewer bias, and adequate response time
  • Disadvantages: Low response rate and requires literate respondents

Survey/Schedule Method

  • Involves a document of questions (a schedule) that enumerators fill in while putting the questions to respondents
  • Advantages: Clarifies questions and has a high response rate
  • Disadvantages: Requires trained enumerators, time-consuming

Secondary Data Collection

  • Sources of data include publications from government, technical/trade journals, books, magazines, newspapers, reports from industries/banks/stock exchanges, and reports from researchers/universities/economists/public records
  • Factors to consider: Reliability, suitability, and adequacy of data

Sampling Methods

  • Sampling identifies a population subset to accurately represent the entire group

Random Sampling

  • Random sampling is picking respondents without design or order

Systematic Sampling

  • Systematic sampling follows a fixed rule, such as selecting every k-th member of a list, to create regularity in the sampling process

Convenience Sampling

  • Convenience sampling is the easiest but least reliable method; it involves selecting the closest and easiest-to-reach participants

Clustered Sampling

  • Clustered sampling uses predefined groups and subgroups (clusters) of the population, selecting whole clusters rather than individuals

Stratified Sampling

  • Stratified sampling divides a population into subgroups that share characteristics, and then takes a random sample from each subgroup
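
The random, systematic, and stratified methods above can be sketched in a few lines of Python. This is only an illustration: the population, the function names, and the fixed seed are assumptions made for the example, not part of the lesson.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Pick n members with no design or order (names out of a hat)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(population, n)

def systematic_sample(population, step, start=0):
    """Pick every `step`-th member, beginning at index `start`."""
    return population[start::step]

def stratified_sample(population, key, per_stratum, seed=0):
    """Group members by `key`, then draw a random sample from each group."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(key(member), []).append(member)
    sample = []
    for group in strata.values():
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample

# Hypothetical population: 20 respondents, each labelled with a year level.
population = [("respondent%02d" % i, "year%d" % (i % 4 + 1)) for i in range(20)]

random_pick = simple_random_sample(population, 5)
systematic_pick = systematic_sample(population, step=4)
stratified_pick = stratified_sample(population, key=lambda m: m[1], per_stratum=2)
```

Here the stratified sample always contains two respondents from each year level, whereas the simple random sample gives no such guarantee.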

Surveys

  • Surveys are used to ask a lot of people well-constructed questions
  • A survey contains a series of unbiased questions for the subject to answer

Steps for Designing a Survey

  • Determine the survey goal
  • Identify the sample population
  • Choose an interviewing method
  • Determine the questions to ask, their order, and their phrasing
  • Conduct the interview and collect the information
  • Analyze results by creating graphs and drawing conclusions

Planned Experiments

  • An experiment refers to any process that generates sets of data

Experiment Terminology

  • Response: The measurable outcome of interest
  • Factors: Variables manipulated
  • Treatment: Specified factor levels for an experimental run
  • Replications: Systematic duplication of a series of experimental runs
  • Hypothesis: A supposition or proposed explanation
  • Accuracy: Conformity to the correct value or a standard
  • Precision: Measure of statistical variability, that is, how close repeated measurements are to one another
  • Trueness: Closeness to the true value
  • Control Group: The baseline measure
  • Treatment Group: Item or subject being manipulated

Experiment Design

  • A well-designed and well-conducted experiment is controlled, replicated, and randomized

Randomized Design

  • Experimental units are randomly assigned to either the control or treatment group

Block Randomization

  • Involves placing subjects into groups (blocks) of similar individuals, then randomly assigning treatments within each block
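
As a rough sketch of block randomization, the code below groups hypothetical subjects by an assumed blocking variable (age group) and randomly splits each block between control and treatment. The subject list, the `block_randomize` name, and the half-and-half split are illustrative assumptions.

```python
import random

def block_randomize(subjects, block_of, seed=0):
    """Randomly assign subjects to control/treatment within each block."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)

    assignment = {}
    for members in blocks.values():
        shuffled = members[:]          # copy so the input is not modified
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            assignment[subject] = "control"
        for subject in shuffled[half:]:
            assignment[subject] = "treatment"
    return assignment

# Hypothetical subjects: (name, age group); age group is the blocking variable.
subjects = [("s1", "young"), ("s2", "young"), ("s3", "young"), ("s4", "young"),
            ("s5", "old"), ("s6", "old"), ("s7", "old"), ("s8", "old")]

assignment = block_randomize(subjects, block_of=lambda s: s[1])
```

Because the shuffle happens inside each block, both age groups end up evenly represented in the control and treatment groups.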

Scientific Method Importance

  • The scientific method is the backbone of any experiment
  • It is a process used to explore observations and answer questions by following scientific procedures

Statistical Methods for Experiment Design

  • Includes analysis of variance (ANOVA), linear regression, and factorial design

Analysis of Variance

  • ANOVA tests whether there are statistically significant differences between the means of samples/treatments
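
A minimal sketch of the one-way ANOVA F statistic, which compares variation between treatment means with variation inside each treatment. The data are made up, and the function name is an assumption; deciding significance would still require comparing F with a critical value, which is not done here.

```python
def one_way_anova_f(groups):
    """Return the F statistic for a list of samples (one list per treatment)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical responses under three treatments.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_statistic = one_way_anova_f(groups)  # approximately 13.0 for this data
```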

Linear Regression

  • Linear regression models the relationship between a dependent variable and independent variable(s)
  • Simple linear regression has only one independent variable
  • Multiple linear regression has more than one independent variable
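
For the simple (one independent variable) case, a least-squares line can be fitted by hand. The sketch below assumes ordinary least squares, which the notes do not name explicitly, and uses invented data that lie exactly on y = 2x + 1.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable xs, dependent variable ys.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_simple_linear_regression(xs, ys)  # about 2.0 and 1.0
```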

Factorial Design

  • Factorial design evaluates how multiple factors, each set at several levels, affect a response variable
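
One way to picture a full factorial design is to list every combination of factor levels; each combination is one treatment. The factor names and levels below are hypothetical.

```python
from itertools import product

# Hypothetical factors and their levels.
factors = {
    "temperature": [150, 200],
    "time": [10, 20, 30],
}

# Every combination of levels is one treatment (2 x 3 = 6 treatments).
treatments = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for treatment in treatments:
    print(treatment)
```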

Experiment Arithmetic

  • The arithmetic involved in experimentation includes mean, error, percent error, deviation, and percent deviation

Mean

  • The average of a set of numerical values

Error

  • Error is the unknown difference between the measured (retained) value and the true value

Percent Error

  • Percent error is the inaccuracy of a measurement relative to the size of the quantity being measured, expressed as a percentage

Deviation

  • Deviation measures the difference between an observed value and another value, such as the mean

Percent Deviation

  • Percent deviation expresses how far a data point lies from the average, as a percentage of that average
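
The quantities in this group can be computed as follows. The formulas use a common convention (percent error taken relative to the true value, percent deviation relative to the mean), which is an assumption since the notes give no formulas; the measurements and the "true" value are made up.

```python
from statistics import mean

def error(measured, true_value):
    """Difference between the measured value and the true value."""
    return measured - true_value

def percent_error(measured, true_value):
    """Size of the error as a percentage of the true value (assumed convention)."""
    return abs(measured - true_value) / abs(true_value) * 100

def deviation(observed, reference):
    """Difference between an observed value and a reference such as the mean."""
    return observed - reference

def percent_deviation(observed, average):
    """Deviation as a percentage of the average (assumed convention)."""
    return (observed - average) / average * 100

# Hypothetical repeated measurements and a hypothetical true value.
measurements = [9.5, 10.0, 10.5, 10.0]
true_value = 8.0

average = mean(measurements)                 # 10.0
err = error(average, true_value)             # 2.0
pct_err = percent_error(average, true_value) # 25.0
dev = deviation(10.5, average)               # 0.5
pct_dev = percent_deviation(10.5, average)   # 5.0
```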

Probability Definition

  • Probability is a numerical description of how likely an event is to occur
  • On the scale, 0 indicates impossibility and 1 indicates certainty

Calculating Probability

  • Probability is the number of favorable outcomes divided by the total number of possible outcomes, assuming all outcomes are equally likely

Probability Terms

  • p = probability that an event happens
  • q = probability that the event fails to happen; p + q = 1
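
A small worked example of p and q, assuming a fair six-sided die (an example chosen here, not taken from the lesson) and the event "roll greater than 4":

```python
from fractions import Fraction

def probability(favorable, total):
    """Favorable outcomes over total outcomes, assuming equally likely outcomes."""
    return Fraction(favorable, total)

outcomes = [1, 2, 3, 4, 5, 6]                  # fair die (assumed)
event = [o for o in outcomes if o > 4]         # favorable outcomes: 5 and 6

p = probability(len(event), len(outcomes))     # 1/3
q = 1 - p                                      # 2/3, probability the event fails

print(p, q, p + q)
```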

Experiment Definition

  • An experiment is a process whose outcome cannot be predicted with certainty
  • The sample space is the set of all possible experiment outcomes
  • Every outcome in a sample space is an element

Sets

  • Complement: Subset of all elements of the sample space that are not in set A
  • Intersection: Event containing all elements common to sets A and B
  • Disjoint: Sets A and B have no elements in common
  • Union: Event that contains all elements in set A or set B or both
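
These set operations map directly onto Python's built-in `set` type. The sample space and the events A, B, and C below are illustrative choices based on a single die roll.

```python
# Sample space for one roll of a six-sided die (assumed example).
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # even roll
B = {4, 5, 6}      # roll greater than 3
C = {1, 3}         # shares no elements with A

complement_A = S - A           # {1, 3, 5}
intersection = A & B           # {4, 6}
union = A | B                  # {2, 4, 5, 6}
are_disjoint = A.isdisjoint(C) # True

print(complement_A, intersection, union, are_disjoint)
```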

Venn Diagrams

  • Venn diagrams graphically display probabilities of overlapping events

Combining Operations

  • If Operation 1 can be done in n1 ways and, for each of these, Operation 2 can be done in n2 ways, then the two operations can be performed together in n1 × n2 ways

Permutation

  • A permutation is an arrangement of all or part of a set of objects
  • The number of permutations counts the ways to arrange a subset in a specific order

Distinct Permutations

  • The number of permutations of r distinct objects taken from n
  • is given by the formula n!/(n-r)!

Cyclic Permutation

  • The number of permutations of n objects arranged in a circle is N = (n-1)!

Combination Distinct

  • The number of combinations of n distinct objects taken r at a time is n!/(r!(n-r)!)
  • Used to determine the ways to choose items without order
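
The counting rules above can be checked with Python's `math` module (`math.perm` and `math.comb` need Python 3.8 or later). The values n = 5, r = 2, n1 = 3, and n2 = 4 are arbitrary examples.

```python
import math

n, r = 5, 2

# Multiplication rule: n1 ways for the first operation, n2 for the second.
n1, n2 = 3, 4
combined_ways = n1 * n2                       # 12

all_arrangements = math.factorial(n)          # n! = 120
permutations = math.perm(n, r)                # n!/(n-r)! = 20
circular = math.factorial(n - 1)              # (n-1)! = 24
combinations = math.comb(n, r)                # n!/(r!(n-r)!) = 10

# Cross-check the library calls against the formulas in the notes.
assert permutations == math.factorial(n) // math.factorial(n - r)
assert combinations == math.factorial(n) // (math.factorial(r) * math.factorial(n - r))
```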

Probability

  • Quantitative measure of how likely an event is to occur

Probability Axiom

  • For any event A, the probability is never negative: P(A) ≥ 0

Probability Axiom 2

  • P(S) = 1: the probabilities over the whole sample space S always sum to 1

Probability Axiom 3

  • If the events in a collection (even a countably infinite one) do not intersect, the probability of their union is the sum of their individual probabilities

Equally Likely Outcomes

  • When all N outcomes in the sample space are equally likely, each outcome has probability 1/N, so an event A containing n outcomes has probability n/N

Independent Events

  • Events are independent when the probability of one is not affected by whether the other occurs

Exclusive Events

  • Mutually exclusive events cannot happen at the same time

Conditional Probability

  • Conditional probability is the probability of an event computed over a restricted part of the sample space, given that another event has occurred
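
To illustrate conditioning as a restriction of the sample space, the sketch below uses one roll of a fair die (an assumed example) with A = "even roll" and B = "roll greater than 3". The helper `prob` is a hypothetical name.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # fair die (assumed)
A = {2, 4, 6}                       # even roll
B = {4, 5, 6}                       # roll greater than 3

def prob(event, space):
    """Probability of `event` when `space` is the set of possible outcomes."""
    return Fraction(len(event & space), len(space))

p_A = prob(A, sample_space)         # 1/2 over the full sample space
p_A_given_B = prob(A, B)            # 2/3 once the space is restricted to B

# The same value follows from P(A and B) / P(B).
assert p_A_given_B == prob(A & B, sample_space) / prob(B, sample_space)
```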

Independence

  • Under independence, the probabilities of two events remain the same regardless of whether the other occurs
  • General addition rule: P(A or B) = P(A) + P(B) - P(A and B)

Multiplication Rule

  • This rule states that the probability of independent events occurring together is the product of their probabilities: P(A and B) = P(A) × P(B)
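
Both the addition rule and the multiplication rule can be verified by enumerating outcomes. The example assumes two fair dice, with A = "first die is even" and B = "second die shows 6"; these events are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice (assumed example).
space = list(product(range(1, 7), repeat=2))

def prob(condition):
    """Fraction of outcomes in the sample space that satisfy `condition`."""
    return Fraction(sum(1 for o in space if condition(o)), len(space))

p_A = prob(lambda o: o[0] % 2 == 0)                         # 1/2
p_B = prob(lambda o: o[1] == 6)                             # 1/6
p_A_and_B = prob(lambda o: o[0] % 2 == 0 and o[1] == 6)     # 1/12
p_A_or_B = prob(lambda o: o[0] % 2 == 0 or o[1] == 6)       # 7/12

# Multiplication rule for independent events.
assert p_A_and_B == p_A * p_B
# Addition rule.
assert p_A_or_B == p_A + p_B - p_A_and_B
```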

Random Variables

  • A random variable is a function on the sample space, denoted by a capital letter

Discrete Variable

  • A discrete random variable takes a finite or countably infinite number of values

Probability Function

  • A probability function gives the probability of each value of a random variable, which is always between 0 and 1

Distributions

  • Discrete probability distribution: lists each value of a discrete random variable together with its probability

Expectation

  • The expected value is the probability-weighted mean of a random variable; the variance measures how much the values are spread out from that mean

Standard Deviation

  • The standard deviation is the square root of the variance
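
A short sketch of expectation, variance, and standard deviation for a discrete random variable. The distribution used is a fair six-sided die, which is an assumed example rather than one from the lesson.

```python
from fractions import Fraction
from math import sqrt

# Probability function of a fair die: each face has probability 1/6 (assumed).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Expectation: probability-weighted mean of the values.
expectation = sum(x * p for x, p in pmf.items())                     # 7/2

# Variance: probability-weighted squared distance from the mean.
variance = sum((x - expectation) ** 2 * p for x, p in pmf.items())   # 35/12

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(expectation, variance, round(std_dev, 4))
```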

Binomial

  • The binomial distribution describes the number of occurrences (successes) across trials that all share the same fixed probability

Distribution Constraints

  • There are n identical trials
  • Each trial has only two outcomes
  • The probability of success is consistent across trials
  • Trials are independent: the outcome of one does not affect the others
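
Under these constraints, the probability of exactly k successes in n trials is C(n, k) p^k (1-p)^(n-k). The sketch below applies this standard formula to an assumed example of n = 4 fair coin flips; the function name is hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 4, 0.5   # four flips of a fair coin (assumed example)
distribution = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(distribution)        # probabilities for 0, 1, 2, 3, 4 successes
print(sum(distribution))   # the probabilities sum to 1
```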

Hypothesis Testing: Ronald A. Fisher

  • Ronald A. Fisher was a knighted English statistician, biologist, geneticist, and eugenicist

Jerzy Neyman

  • Polish mathematician and co-developer of the Neyman-Pearson theory

Karl Pearson

  • Karl Pearson was an English mathematician and statistician

Egon S Pearson

  • Karl Pearson's son and a co-developer of the Neyman-Pearson theory

Hypotheses

  • A test of hypothesis includes a null hypothesis, an alternative hypothesis, test data, etc.

Types of Hypotheses

  • Null hypothesis: the currently accepted value for a parameter
  • Alternative hypothesis: also known as the research hypothesis

Hypothesis Testing

  • The outcome is either rejecting or failing to reject the null hypothesis; the null hypothesis is rejected when the test value falls in the critical region

One-Tailed Tests

  • The critical region lies on one side of the distribution, in whichever direction is being tested

Tails of tests

  • In the null hypothesis, the parameter is set equal to a specific value: μ = k

Test statistic

  • The test statistic is the value used to decide whether to reject or fail to reject the null hypothesis
  • Its value is computed from sample data, and the associated probability is calculated using the chosen test
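
As an illustration of a test statistic and a critical region, here is a minimal sketch of a right-tailed, one-sample z-test. The z-test is one possible choice of test, not one the notes prescribe; it assumes the population standard deviation is known, and every number below (mu0, sigma, n, sample_mean, alpha) is hypothetical. `statistics.NormalDist` needs Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 100.0          # value claimed by the null hypothesis (mu = k)
sigma = 15.0         # population standard deviation, assumed known
n = 36               # sample size
sample_mean = 105.0  # mean computed from the sample data
alpha = 0.05         # significance level

# Test statistic computed from the sample data.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Right-tailed test: the critical region is z > critical_value.
critical_value = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

reject_null = z > critical_value
print(z, round(critical_value, 3), round(p_value, 4), reject_null)
```

With these made-up numbers the test statistic falls in the critical region, so the null hypothesis would be rejected; with a smaller sample mean the conclusion would instead be "fail to reject".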
