Survey Design and Data Privacy Issues

Questions and Answers

What type of error is indicated when respondents are unable to provide accurate answers due to misunderstandings?

  • Measurement Error
  • Adjustment Error
  • Processing Error
  • Response Error (correct)

What does a high standard deviation in survey statistics indicate?

  • Data is clustered around the mean
  • Data points are all identical
  • Responses vary widely from the mean (correct)
  • Responses are homogenous
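
To make the standard deviation question concrete, here is a minimal sketch using Python's standard library. The two lists of 1-5 ratings are made up for illustration and are not data from the lesson; they share the same mean but differ in spread.

```python
from statistics import mean, pstdev

# Hypothetical 1-5 ratings with the same mean but different spread.
clustered = [3, 3, 3, 3, 3, 3]
spread_out = [1, 5, 1, 5, 1, 5]


def describe(responses):
    """Return (mean, population standard deviation) of the responses."""
    return mean(responses), pstdev(responses)


clustered_mean, clustered_sd = describe(clustered)
spread_mean, spread_sd = describe(spread_out)

print("clustered:", clustered_mean, clustered_sd)
print("spread out:", spread_mean, spread_sd)
```

Both lists average 3, so the mean alone cannot tell them apart; only the standard deviation shows that the second set of responses varies widely around the mean.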

Which group is incorrectly represented if only iPhone users are included in a survey?

  • Tablet users
  • Android users
  • Desktop users
  • All mobile users (correct)

What aspect fails if the technology used for a survey is malfunctioning?

  • Response Accuracy (correct)

How can the adjustment error in survey responses be identified?

  • Through post-survey corrections made by analysts (correct)

What is the implication of a poor sampling frame in a survey?

  • Bias and lack of generalizability (correct)

In survey design, what does validity assess?

  • The accuracy of the measurement of concepts (correct)

What is a major concern when evaluating if steps represent physical activity?

  • Representative sampling of the general population (correct)

What is a major privacy concern related to sensor data?

  • Data streams might be intercepted by unauthorized parties (correct)

How does the level of privacy concern affect participation willingness?

  • Higher privacy concerns correlate with lower willingness to participate (correct)

What is a potential risk associated with connecting multiple streams of data?

  • It can lead to re-identification of previously anonymous users (correct)

What was the total sample size for the hypothetical willingness to share sensor data from the LISS Panel?

  • 2,678 participants (correct)

Which of the following methods does NOT collect big data?

  • Offline transactions (correct)

What aspect of digital trace data is introduced in the content?

  • Design considerations for data collection (correct)

Which type of data is specifically mentioned as being at risk of unauthorized access?

  • Sensor data (correct)

Increased willingness to share sensor data is likely correlated with which of the following?

  • Lower perceived risks associated with data sharing (correct)

What characterizes non-probability sampling in survey research?

  • Relies on volunteers from specific demographics to participate in surveys (correct)

Which sampling method involves ensuring specific characteristics are proportionally represented?

  • Quota sampling (correct)

What is a key feature of the Era of Expansion in survey research?

  • The introduction of computer-assisted telephone surveys (correct)

Which of the following is an example of digital trace data?

  • Data obtained from web browsing history (correct)

What is a potential issue with recruitment via targeted advertisements for surveys?

  • It often involves participants who self-select to join (correct)

What is a primary characteristic of probability samples?

  • Allow for easy inferences to the general population (correct)

How can digital trace data be collected from survey participants?

  • By requesting participants to donate their web data (correct)

Which term best describes a database of potential respondents who agree to participate in future surveys?

  • Online non-probability panels (correct)

Which of the following statements is true about non-probability samples?

  • Selection bias is likely due to self-selection (correct)

What is meant by 'river sampling' in surveys?

  • Inviting website visitors to complete immediate surveys through pop-up windows (correct)

What is a disadvantage of digital trace data compared to surveys?

  • Limited number of covariates (correct)

Which of the following is a common issue related to surveys?

  • High measurement errors from social desirability (correct)

What is an example of an advantage of probability samples?

  • They allow for design-based inferences (correct)

Why might non-probability samples be considered more convenient?

  • They can be conducted quickly and affordably (correct)

How does the collection approach of digital trace data differ from surveys?

  • Digital trace data is collected for purposes other than research (correct)

What can be a consequence of falling response rates in probability samples?

  • Higher costs of data collection (correct)

What is a potential source of bias when sampling respondents who only downloaded a mobile app?

  • Coverage error (correct)

What does nonresponse error refer to in the context of surveys?

  • Not receiving responses from selected participants (correct)

Which error arises from malfunctioning devices during data collection?

  • Measurement error (correct)

What limitation might arise from only collecting data from iPhone users in a survey?

  • Limited demographic representation (correct)

Adjustments made after data collection to correct for known biases are referred to as what?

  • Post-survey adjustments, such as weighting; the errors these adjustments can introduce are called adjustment errors (correct)

What characterizes adjustment errors in survey research?

  • They are errors introduced by post-survey adjustments that are meant to improve the accuracy of survey statistics (correct)

Why is it important to understand the characteristics of respondents when conducting surveys?

  • It helps mitigate potential biases (correct)

Which technique can enhance the reliability of survey data collection?

  • Incorporating data from sensors and apps (correct)

What was the primary concern of the researcher in Case 1?

  • Understanding the percentage of individuals worried about climate change (correct)

What percentage of young people did the researcher conclude were suffering from social isolation in Case 2?

  • 23% (correct)

What was identified as a significant issue with using social media data as a replacement for official statistics?

  • Alignment between social media and traditional indexes degraded after 2011 (correct)

What do some researchers argue about social media data's ability to replace survey data?

  • It is only fit for limited conditions (correct)

What are some potential reasons for inferential 'failures' in social media data analysis?

  • Measurement and selection problems (correct)

What did research by Conrad et al. (2021) determine about the relationship between social media data and traditional indexes?

  • It was more than a chance occurrence (correct)

What element was noted as having a strong potential impact on research results in the analysis of social media data?

  • Micro-decisions in the analysis (correct)

What can be inferred about the reliability of social media data based on the findings presented?

  • It should be cautiously analyzed and can sometimes fail (correct)

Flashcards

Era of Invention

The period from 1930 to 1960, marked by the development of survey methodologies, mainly relying on face-to-face interviews and mail surveys.

Era of Expansion

The period from 1960 to 1990, in which survey research evolved with new technologies like random digit dialing (RDD) and computer-assisted telephone surveys.

"Designed Data" Supplemented by "Organic Data"

The period from 1990 to the present, in which survey research shifted to a mix of traditional and digital methods.

Data Donation

A type of data collection where respondents voluntarily share their digital traces, like browsing history or app usage, for research purposes.

Digital Trace Data

Data collected automatically through digital footprints, like website visits, app interactions, or social media interactions.

Quota Sampling

A non-probability sampling method where specific characteristics of the population are intentionally represented in the sample, ensuring proportionality.

River Sampling

A non-probability sampling method where respondents are recruited directly from the pool of individuals visiting a specific website.

Recruitment via Targeted Advertisement

A non-probability sampling technique where researchers utilize various social media platforms to reach and recruit participants.

Digital Trace Data in Surveys

Data collected from sources like apps and web browsers during a survey, providing insights into user behavior and preferences.

Post-Survey Data Collection

A model that uses data collected after a survey to gather information about user behavior and preferences.

Designing for Digital Trace Data

The design of surveys and data collection methods to intentionally gather digital trace data.

Introducing Design to Digital Trace Data

The process of integrating digital trace data into existing research methods to gain deeper insights.

Privacy Concerns with Sensor Data

Individuals may have concerns about potential risks associated with sharing sensor data, such as data interception, re-identification, and misuse for discriminatory purposes.

Privacy Concerns and Participation

The level of concern about privacy affects individuals' willingness to share sensor data. Higher privacy concerns generally lead to lower participation rates in data collection.

Hypothetical Willingness to Share Sensor Data

A hypothetical scenario examining the willingness of participants to share sensor data, such as location or video recordings.

Social Media as a Replacement for Official Statistics

Using social media data like tweets or Instagram posts to estimate trends or opinions in a larger population, like the percentage of people worried about climate change.

Probability Sample Survey

A method for collecting data in which every individual in the population has a known, nonzero chance of being selected for the study. This supports inference from the sample to the population.

Non-Probability Sample Survey

Collecting data without a well-defined probability of selecting individuals, often leading to biased results.

Measurement Problems

Issues with how information is measured or collected in a survey, potentially leading to inaccurate results. For example, a survey question may be confusing or biased.

Selection Problems

Problems related to how participants are chosen for a study, which can lead to an unrepresentative sample. For instance, surveys on social media might only reach specific demographics.

Inferential Reliability

The ability of a study's results to accurately predict or reflect the real world, meaning they are not just a fluke.

Replicating Studies

The process of repeating and refining a research study to confirm earlier findings and improve their reliability.

Degradation of the Relationship Between Social Media and Traditional Indexes

The tendency for the relationship between social media data and traditional statistics to weaken or change over time.

Sampling

The process of choosing a subset of individuals from a larger population to represent the whole group.

Sample

A group of individuals selected from a larger population to study.

Population

The entire group of individuals that a researcher is interested in studying.

Randomization

The use of a chance mechanism to select individuals for the sample, so that selection does not depend on their characteristics. In simple random sampling, every individual in the population has an equal chance of being selected.

Probability Samples

Samples drawn with a technique in which every individual in the population has a known, nonzero probability of being selected.

Non-Probability Samples

Samples drawn with a technique in which the probability of each individual being selected is unknown, and may be zero for parts of the population.

Selection Bias

A bias that occurs when the sample is not representative of the population due to how individuals were selected.

Self-Selection Bias

A bias that arises when individuals decide for themselves whether to participate, so that participants (for example, people with strong opinions) differ systematically from non-participants.

Digital Trace Data Collection

The process of collecting data from digital traces, like browsing history, app usage, or social media interactions.

Privacy Concerns with Digital Trace Data

People may worry about their privacy when sharing their digital traces for research. This can affect their willingness to participate.

Hypothetical Willingness to Share Digital Trace Data

Asking people if they would be willing to share their digital traces for research, even if they don't actually have to.

Era of Sensors and Apps

The period where new technologies like smartphones and the internet opened up new possibilities for collecting data through digital traces.

Measurement Error

A type of survey error that occurs when the recorded value differs from the true value, for example because a question is flawed or a measuring device malfunctions.

Sampling Error

A type of survey error that arises because only a sample, rather than the whole population, is observed, so that estimates vary from one sample to another.

Response Error

A type of survey error that occurs when respondents provide inaccurate or misleading information, either intentionally or unintentionally.

Nonresponse Error

A type of survey error that occurs when selected individuals cannot be reached, refuse to participate, or drop out before completion, leading to incomplete data and to bias if they differ from those who respond.

Survey Instrument Error

A type of survey error that occurs when the survey instrument itself is flawed, leading to inaccurate or unreliable data. This can happen due to unclear wording, leading questions, or biased response options.

Validity Error

A type of survey error that occurs when a measure does not capture the concept it is intended to measure, so the data collected do not answer the research question being asked.

Processing Error

A type of survey error that occurs when responses are handled incorrectly after collection, for example during coding, data entry, or editing, leading to inconsistencies in the data.

Sampling Frame Error

A type of survey error that occurs when the sampling frame used to select respondents doesn't accurately represent the target population, leading to bias.

Study Notes

Introduction to Survey Research

  • Survey research has evolved through three eras: Invention, Expansion, and the current era, in which "designed data" is supplemented by "organic data."
  • The Era of Invention (1930-1960) focused on area probability sampling and face-to-face/mail surveys.
  • The Era of Expansion (1960-1990) brought new technologies such as random digit dialing (RDD) and computer-assisted telephone surveys.
  • The current era (1990-present) saw the rise of non-probability sampling, computer-assisted online surveys, and the integration of big data sources.

Non-Probability Sample Surveys

  • Online non-probability panels are databases of potential respondents who've stated they'll cooperate for future data collection.
  • Quota sampling involves representing specific population characteristics proportionally in the sample.
  • River sampling is used to invite website visitors to immediate surveys.
  • Recruitment can occur via targeted advertisements on social media platforms like Facebook and Twitter.
  • A potential problem with non-probability samples is that some parts of the population may be systematically excluded because participation is voluntary.
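
The quota sampling idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the volunteer pool, the `age_group` field, the quota sizes, and the `quota_sample` helper are all hypothetical and do not come from the lesson.

```python
import random


def quota_sample(volunteers, quotas, key, seed=0):
    """Fill each quota cell from the volunteers, taken in random order."""
    rng = random.Random(seed)
    pool = list(volunteers)
    rng.shuffle(pool)
    filled = {cell: [] for cell in quotas}
    for person in pool:
        cell = person[key]
        # Accept the volunteer only while their cell still has room.
        if cell in filled and len(filled[cell]) < quotas[cell]:
            filled[cell].append(person)
    return filled


# Hypothetical volunteer pool that is heavily skewed toward younger people.
volunteers = (
    [{"id": i, "age_group": "18-34"} for i in range(70)]
    + [{"id": 70 + i, "age_group": "35-64"} for i in range(25)]
    + [{"id": 95 + i, "age_group": "65+"} for i in range(5)]
)

# Hypothetical population shares of 30% / 50% / 20%, for a sample of 20.
quotas = {"18-34": 6, "35-64": 10, "65+": 4}

sample = quota_sample(volunteers, quotas, key="age_group")
counts = {cell: len(people) for cell, people in sample.items()}
print(counts)
```

The sample now matches the assumed population shares on age, but everyone in it is still a volunteer, so systematic exclusion within each age group remains possible.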

Digital Trace Data

  • Digital trace data are records that people generate through their online activities.
  • Sources include social media, web browsing history, applications, and geographic location data.
  • The data can be collected directly from the web through APIs or web scraping.
  • Survey participants can also donate data, such as web browsing history, records from specific apps, and activity data (e.g., Strava).
  • Designed big data and smart surveys are a further source: the data are collected within or after a survey, through apps and browsers.
  • Donated data can be obtained through data download packages (DDPs).

Introducing "Design" to Digital Trace Data

  • Example workflow for data collected with sensors: a population register feeds into a sample selection process.
  • Sampled respondents who agree to participate give consent; their sensor data are then captured and processed.
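
The workflow above can be sketched as a small pipeline. Everything here is an assumption made for illustration: the `Person` record, the consent flag, the simple random sample, and the placeholder `read_sensor` function are hypothetical and are not the design used in any particular study.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Person:
    person_id: int
    consents: bool  # hypothetical flag: agrees to sensor data collection
    sensor_readings: list = field(default_factory=list)


def draw_sample(register, n, seed=0):
    """Simple random sample without replacement from the population register."""
    return random.Random(seed).sample(register, n)


def collect_sensor_data(sample, read_sensor):
    """Capture sensor data only for sampled people who gave consent."""
    participants = [p for p in sample if p.consents]
    for p in participants:
        p.sensor_readings.append(read_sensor(p))
    return participants


# Hypothetical register of 1,000 people; one in three declines consent.
register = [Person(i, consents=(i % 3 != 0)) for i in range(1000)]
sample = draw_sample(register, n=100)

# Placeholder reading; a real study would query a device or an app here.
participants = collect_sensor_data(sample, read_sensor=lambda p: {"steps": 0})
print(len(sample), len(participants))
```

Keeping the sampling step separate from the consent step makes it easy to see how many sampled people are lost before any sensor data exist.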

Privacy Concerns

  • Potential risks related to sensor data include interception by unauthorized parties.
  • Connecting multiple data streams can potentially re-identify previously anonymous users.
  • Information from sensor data can be used to negatively impact an individual's credit, employment, or insurability.
  • Higher privacy concerns tend to correlate with decreased willingness to participate in studies.
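
The re-identification risk from connecting data streams can be illustrated with a toy linkage. The records, field names, and the `link` helper below are invented for this sketch; the point is only that two sources sharing a few quasi-identifiers can single out a person even when the sensor stream holds no names.

```python
# Hypothetical "anonymous" sensor stream: no names, only quasi-identifiers.
sensor_stream = [
    {"device": "a1", "postcode": "1011", "birth_year": 1985, "avg_daily_steps": 4200},
    {"device": "b2", "postcode": "1011", "birth_year": 1990, "avg_daily_steps": 9100},
    {"device": "c3", "postcode": "3512", "birth_year": 1985, "avg_daily_steps": 7600},
]

# Hypothetical second source with names and the same quasi-identifiers.
named_records = [
    {"name": "Person A", "postcode": "1011", "birth_year": 1985},
    {"name": "Person B", "postcode": "3512", "birth_year": 1985},
    {"name": "Person C", "postcode": "3512", "birth_year": 1985},
]


def link(stream, records, keys):
    """Map device -> name when exactly one named record matches on all keys."""
    matches = {}
    for row in stream:
        candidates = [r for r in records if all(r[k] == row[k] for k in keys)]
        if len(candidates) == 1:
            matches[row["device"]] = candidates[0]["name"]
    return matches


reidentified = link(sensor_stream, named_records, keys=("postcode", "birth_year"))
print(reidentified)
```

Only the device with a unique combination of postcode and birth year is re-identified; the device with no match and the one with two candidates stay ambiguous, which is why coarser identifiers lower the risk.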

Hypothetical Willingness from the LISS Panel

  • Data from the LISS Panel, covering 2,678 Dutch smartphone users, show that hypothetical willingness to share sensor data varies across locations, videos, photos of one's house, and photos of oneself.
  • Data from the CBS Consent Survey, covering approximately 1,883 Dutch smartphone and tablet users, likewise show that willingness differs by data type: respondents are more willing to share GPS data than photos of their homes or videos.

Examples of New Forms of Data

  • Examples include smartphone sensors such as NFC, Bluetooth, thermometers, Wi-Fi, GPS, cellular networks, fingerprint sensors, barometers, accelerometers, pedometers, gyroscopes, and cameras.

Examples of Use of Sensor Data

  • SurveyMotion: a JavaScript-based tool that measures total acceleration.
  • Completion behaviour: fitness tasks.
  • Wearables: a wrist-worn GENEActiv and an Axivity AX3 worn at the upper thigh, measuring total physical activity.
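
The lesson does not give a formula for "total acceleration." A common convention, assumed here, is the Euclidean norm of the three accelerometer axes. The function name and the readings below are hypothetical.

```python
import math


def total_acceleration(ax, ay, az):
    """Euclidean norm of the x, y, and z accelerometer readings (m/s^2)."""
    return math.sqrt(ax**2 + ay**2 + az**2)


# Hypothetical readings: a device at rest, then two moments of movement.
readings = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 2.0, 2.0)]
totals = [total_acceleration(*r) for r in readings]
print(totals)
```

Collapsing three axes into one number makes the measure independent of how the device is held, at the cost of losing the direction of movement.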

More on Sensors

  • Several organizations are using cameras to scan receipts to learn about consumer spending patterns.
  • Some work focuses on linking sensor data to administrative data, such as tax records, for research purposes.

Examples of App Data

  • The Tabi app (with Statistics Netherlands) focuses on travel mode and travel history.

Some Examples of Studies Using Sensors

  • A table presents several studies that use sensors to investigate topics such as social networks, spatial segregation, urbanicity, and mobility.

Some Examples of Studies Collecting Biomeasures

  • A table summarises studies collecting specific biomeasures, such as blood and saliva, from particular geographical areas, and associated methodologies.

Some Examples of Studies Using Linkage to Administrative Data

  • A table summarises examples of research studies using linkages with administrative data.

Questions Concerning Study Conclusions

  • Case 1: Researchers analyze tweets to gauge how worried Germans are about climate change.
  • Case 2: Researchers advertise a survey on social isolation on Instagram, aiming to estimate levels of social isolation from the resulting sample.

Social Media Use as a Replacement for Official Statistics

  • Initially, social media data looked like a promising alternative to official statistics because they aligned with existing metrics.
  • However, as social media usage patterns changed, the correlation with traditional metrics degraded. Job loss trends in social media data subsequently failed to match official unemployment figures.
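
One way to watch for this kind of degradation is a rolling correlation between the social media index and the official series. The sketch below uses made-up numbers, not the series discussed in the lesson, and the window length is an arbitrary choice.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rolling_correlation(x, y, window):
    """Correlation within each consecutive window of the two series."""
    return [
        pearson(x[i:i + window], y[i:i + window])
        for i in range(len(x) - window + 1)
    ]


# Made-up series: the social index tracks the official one, then drifts away.
official = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
social = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1, 5, 7, 4, 6, 3, 5]

rolling = rolling_correlation(official, social, window=6)
print([round(r, 2) for r in rolling])
```

A correlation that is high in early windows and falls in later ones is a warning that a relationship estimated on past data may no longer hold.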

Concerns About Using Social Media as a Basis for Study

  • Micro-decisions during the analysis may strongly affect the results of the study.
  • Social media data may not be appropriate for replacing survey data, except under a limited set of highly controlled conditions.

Comparing Two Types of Online Survey Samples

  • Opt-in samples, used in online surveys, are significantly less accurate than probability-based panel samples, which use random sampling methods.

Probability vs Non-Probability Sample Surveys

  • Probability samples are preferred for generalizable inferences but are often expensive and time-consuming.
  • Non-probability samples are usually more affordable, timely, and convenient but might not allow for generalization.
  • Non-probability data come from diverse sources, including surveys and digital traces.
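
The trade-off above can be illustrated with a small simulation. All numbers are assumptions chosen for the sketch: a population in which about 30% hold some opinion, a simple random sample of 2,000, and an opt-in process in which people holding the opinion are three times as likely to volunteer.

```python
import random

rng = random.Random(42)

# Hypothetical population of 100,000; roughly 30% hold the opinion (True).
population = [rng.random() < 0.30 for _ in range(100_000)]
true_share = sum(population) / len(population)

# Probability sample: simple random sample, equal chance for everyone.
srs = rng.sample(population, 2_000)
srs_estimate = sum(srs) / len(srs)

# Non-probability sample: people opt in, and those holding the opinion
# are assumed to be three times as likely to volunteer (6% vs. 2%).
volunteers = [x for x in population if rng.random() < (0.06 if x else 0.02)]
volunteer_estimate = sum(volunteers) / len(volunteers)

print(f"true share:         {true_share:.3f}")
print(f"random sample:      {srs_estimate:.3f}")
print(f"volunteer sample:   {volunteer_estimate:.3f} (n={len(volunteers)})")
```

The volunteer sample is larger than the random sample yet further from the truth, which shows that sample size does not compensate for self-selection.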

Probability Surveys vs Digital Trace Data

  • Designed data, such as probability surveys, are collected specifically for research purposes.
  • In probability surveys, researchers control the content and can collect a large number of covariates; organic data from digital traces are collected for other purposes, often with little researcher control.

What can we do if none of the sources are perfect?

  • Combining the strengths of multiple data sources can improve inference while mitigating the biases of any single source.
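
One simple form of combining sources is post-stratification: an opt-in sample supplies the answers, and a trusted benchmark (for example, a population register) supplies the group shares. The sample, the shares, and the `poststratify` helper below are hypothetical and only sketch the idea.

```python
def poststratify(sample, population_shares, group_key, value_key):
    """Weight each group's sample mean by its known population share."""
    groups = {}
    for row in sample:
        groups.setdefault(row[group_key], []).append(row[value_key])
    missing = set(population_shares) - set(groups)
    if missing:
        raise ValueError(f"no respondents in groups: {sorted(missing)}")
    return sum(
        share * sum(groups[g]) / len(groups[g])
        for g, share in population_shares.items()
    )


# Hypothetical opt-in sample: 8 young and 2 old respondents.
sample = (
    [{"age": "young", "worried": 1}] * 6
    + [{"age": "young", "worried": 0}] * 2
    + [{"age": "old", "worried": 1}] * 1
    + [{"age": "old", "worried": 0}] * 1
)

# Hypothetical benchmark shares taken from a second, trusted source.
population_shares = {"young": 0.4, "old": 0.6}

unweighted = sum(r["worried"] for r in sample) / len(sample)
weighted = poststratify(sample, population_shares, "age", "worried")
print(round(unweighted, 2), round(weighted, 2))
```

The adjustment corrects only the age imbalance. If volunteers differ from non-volunteers within an age group, that bias remains, and the adjustment itself can add error.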

Two Principles of Data Integration

  • Data integration (DI) is context dependent: it depends on the purpose of the integration and on the data in question.
  • DI can be viewed as a puzzle that requires aligning the sources on dimensions such as timeliness, coverage, quality, and size.

Additional Reading

  • This section provides further sources of scholarly material that support the themes and examples discussed in the presentation.
