EPITA Data Privacy by Design Course

Questions and Answers

Which of the following best describes qualitative data?

  • Distinct, separate values.
  • Description that refers to the quality of something. (correct)
  • Description of something in numbers.
  • Categories or groups without inherent order.

Quantitative data is used to describe the qualities of something, like color or texture.

False (B)

A data system involves collecting the weights of individuals. Which type of data would '65.7 kilograms' represent?

  • Continuous data (correct)
  • Discrete data
  • Categorical data
  • Qualitative data

What is the primary characteristic of unstructured data?

It requires structuring to be processed effectively. (A)

Structured data is organized in a way that makes it difficult for machines to extract information.

False (B)

Which of the following formats is an example of structured data?

CSV (Comma Separated Values) (A)

What is the focus of 'Privacy as contextual integrity'?

Ensuring appropriate information flows conform with contextual norms. (A)

According to Westin (1970), data privacy involves the claim of individuals to determine when, how, and to what extent information about them is communicated to ______.

others

Which legal framework emphasizes transparency, purpose, proportionality, and accountability in data handling?

GDPR (A)

According to ECHR Art 8, users do not have the right to respect for private and family life, home, and correspondence.

False (B)

Which principle of Privacy by Design (PbD) emphasizes anticipating and preventing privacy issues before they occur?

Proactive not Reactive (B)

Which of the following aligns with the "Respect for user privacy" principle in Privacy by Design (PbD)?

Designing systems that are user-centric (A)

Match each Privacy by Design (PbD) principle with its description:

  • Proactive not Reactive = Anticipating and preventing privacy issues
  • Privacy as the default setting = Automatically protecting data without user intervention
  • Privacy Embedded into Design = Integrating privacy into the design process itself

Which GDPR article explicitly defines data protection by design and by default?

Article 25 (C)

According to GDPR's Article 25, it is optional for organizations to implement data-protection principles into their processes.

False (B)

Which of the following strategies falls under the concept of 'Data Privacy by Design and by Default'?

Minimize Data Linkability (D)

What is the main goal of LINDDUN?

Systematic threat assessment methodology (A)

What is the primary purpose of creating a Data Flow Diagram (DFD) in the LINDDUN methodology?

To model the system (B)

In the LINDDUN methodology, what do threat trees help to identify?

Threats related to privacy (A)

Within the LINDDUN framework, the ability to link two or more pieces of information related to an individual is known as ______.

linkability

Detectability, in the context of LINDDUN threat categories, refers to ensuring someone cannot deny the validity of their actions or transactions.

False (B)

What does 'Unawareness' refer to as a threat category in LINDDUN?

Lack of awareness regarding data rights (C)

________ is known as failing to comply with data privacy regulations.

Non-compliance

What inference can be drawn from household electricity consumption data collected by smart energy meters?

Sensitive personal attributes. (A)

Match each electricity smart metering domain with its characteristic:

  • User domain = Under control of the user, e.g., user devices
  • Service domain = Outside the control of the user, e.g., backend systems

Which data should be collected in the service domain?

Only collect necessary data (C)

What is the purpose of threat modeling?

Systematically thinking about negative scenarios

Which of the following techniques can be used for threat modeling?

All of the above (D)

In the context of data distribution in the architecture of a smart metering system, what does 'data sharing' refer to?

Sharing data with third-party entities (B)

What is the first thing to consider in addressing the threats/risks?

Keeping data out of the service domain (D)

Clear user consent can be collected using pre-ticked boxes.

False (B)

At what levels of a project should privacy considerations be taken into account?

All levels

Match activities to the stage of the project:

  • Classify entities in domain = 1
  • Identify necessary data = 2
  • Distribute data in architecture = 3
  • Select appropriate technologies = 4

Flashcards

What is Data?

Facts and statistics collected for reference or analysis.

What is Qualitative data?

Description that refers to the quality of something (e.g., color, texture, feel of an item, ...)

What is Quantitative data?

Description of something in numbers (e.g., number, size, price of an item, ...).

What is Categorical data?

Data representing categories or groups without any inherent order (e.g., used/unused, yes/no).

What is Discrete data?

Consists of distinct, separate values (e.g., number of books on a shelf).

What is Continuous data?

Data that can take any value within a given range (e.g., height or weight).

What is Unstructured data?

Data that lacks a fixed underlying structure and is difficult for machines to parse.

What is Structured data?

Data organized with a fixed underlying structure that is easy for machines to parse.

What is Data privacy?

The claim of individuals to determine when, how, and to what extent information about them is revealed.

What is privacy as contextual integrity?

Appropriate information flows that conform with contextual information norms.

What are GDPR's data protection principles?

Transparency, purpose, proportionality, and accountability as outlined in the GDPR.

What do users expect from companies regarding data requests?

Users expect companies to request only the data needed to deliver the service.

What do users want to know about data handling?

Users want to know who accesses their data, how, and why.

What is the user control of personal data?

In short, users expect to stay in control of their personal data.

What does Article 25 of the GDPR state?

Implement data-protection principles to meet regulatory requirements & protect data subjects.

What is the overarching goal of Data Privacy by Design?

Minimizing privacy risks and trust assumptions on other parties

What are the six strategies to minimize data privacy risks?

Collection, Disclosure, Linkability, Centralization, Replication, Retention

What does LINDDUN 'Elicit threats' involve?

Mapping threats to the DFD and identifying threats using threat trees.

What is involved in LINDDUN 'Manage threats'?

Prioritize with a DPO and mitigate using a taxonomy of PETs.

What is Linkability in LINDDUN?

The ability to link two or more pieces of information related to an individual.

What is Identifiability in LINDDUN?

The ability to identify an individual from a dataset.

What is Non-repudiation in LINDDUN?

Ensuring someone cannot deny the validity of their actions or transactions.

What is Detectability in LINDDUN?

The possibility to detect the presence of data or actions without their explicit presence.

What is Disclosure of information in LINDDUN?

Accidental or unintended disclosure of sensitive information.

What is Unawareness in LINDDUN?

Lack of awareness regarding the rights of individuals or the sensitivity of data.

What is Non-compliance in LINDDUN?

Failing to comply with data privacy regulations.

What are the privacy risks with Smart energy meters?

Inference of sensitive personal attributes from household consumption.

What is Threat modeling?

Techniques to think systematically about negative privacy scenarios.

What should you do to 'Identify necessary data'?

Assess data sensitivity, minimize collection, anonymize where possible.

Study Notes

  • École Pour l'Informatique et les Techniques Avancées – EPITA is running a master's program in May-June 2024.
  • The instructor for the "Data Privacy by Design" course is M. Salman Nadeem.
  • The class uses a Creative Commons Attribution 4.0 International License.

Course Schedule

  • 25/05/2024 (G1) and 01/06/2024 (G2): Introduction and DPbD fundamentals with case studies (3 hours).
  • 07/06/2024 (G2, G1): Data privacy risks, Crypto. Package, Data masking (Anonymization vs Pseudonymisation) (3 hours).
  • 22/06/2024 (G2, G1): Privacy Enhancing Technologies (PETs), DPbD, and General Data Protection Regulation (GDPR) (3 hours).
  • 28/06/2024 (G2, G1): Recap and Conclusion (3 hours).
  • 12/07/2024 (G2, G1): Final evaluation (3 hours). Total course duration is 15 hours.

Grading Criteria

  • Class participation (attendance & reactivity): 10%.
  • Class activities (quizzes): 30%.
  • Final project (individual report & group presentation): 60%.

Notes & Collaboration

  • The course uses an MS Teams channel called 'Data Privacy by Design – Spring 2024' for collaboration; the join code is 'jp6i9h1'.
  • The channel publishes announcements, provides course slides/material, and manages assignments/projects.
  • A course mindmap is available for better organization and refreshing of course topics.

Lecture 1 Outline

  • Data & its types.
  • Data privacy.
  • Data Privacy by Design (PbD) principles.
  • Data PbD goal, strategies & methodology.
  • Case study.
  • Quiz.
  • The course involves threat modeling, data masking, GDPR compliance, and a final project.

Data

  • Data are facts and statistics collected for reference or analysis (Oxford dictionary definition).
  • Data is all around us.
  • Data transforms into Information, Knowledge, and Wisdom, depicted in the DIKW pyramid.

Types of Data

  • Qualitative data: Description referring to the quality of something.
  • Quantitative data: Description in numbers.
  • Categorical data: Categories or groups without inherent order.
  • Discrete data: Distinct, separate values (e.g., count of items).
  • Continuous data: Any value within a range (e.g., measurements).
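
As a rough illustration of these categories, the sketch below tags a few example values in Python. The record fields and the `describe` helper are hypothetical and not part of the course material; a value's Python type is only a crude hint, and it cannot tell a categorical label apart from free-text qualitative data.

```python
# Hypothetical example record; field names are illustrative only.
item = {
    "color": "red",         # qualitative: describes a quality
    "condition": "used",    # categorical: used/unused, no inherent order
    "books_on_shelf": 12,   # quantitative, discrete: distinct countable values
    "weight_kg": 65.7,      # quantitative, continuous: any value within a range
}

def describe(value):
    """Crude heuristic: guess the kind of data from the Python type."""
    if isinstance(value, bool):
        return "categorical"
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    # Strings may be qualitative descriptions or categorical labels.
    return "qualitative"

for name, value in item.items():
    print(name, "->", describe(value))
```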

Unstructured vs Structured Data

  • Unstructured data is easy for humans to understand but not for machines, lacking a fixed underlying structure.
  • PDFs and scanned images are examples of unstructured data.
  • Structured data is machine-readable; CSV is one example.
  • Many other structured, machine-readable formats exist; see https://opendatahandbook.org/guide/en/appendices/file-formats
  • JSON is another example of a structured data format.
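
To make "machine-readable" concrete, here is a minimal sketch that parses the same kind of record from CSV and from JSON using only the Python standard library. The meter readings and field names are made up for illustration.

```python
import csv
import io
import json

# Hypothetical meter readings, used only for illustration.
csv_text = (
    "meter_id,timestamp,kwh\n"
    "m-001,2024-05-25T10:00,0.42\n"
    "m-001,2024-05-25T10:30,0.38\n"
)
json_text = '[{"meter_id": "m-001", "timestamp": "2024-05-25T10:00", "kwh": 0.42}]'

# CSV: the header row names the columns, so every row parses into a dict.
# All CSV values arrive as strings and must be converted explicitly.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# JSON: nesting and basic value types (numbers, strings) are preserved.
json_rows = json.loads(json_text)

total_kwh = sum(float(row["kwh"]) for row in csv_rows)
print(round(total_kwh, 2))
print(json_rows[0]["kwh"])
```

A scanned image or PDF of the same readings would need OCR or manual structuring before either of these steps could run.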

Data Privacy Definitions

  • Data privacy is the claim of individuals to determine when, how, and to what extent their information is communicated to others (Westin, 1970).
  • "Privacy as contextual integrity" involves appropriate information flows conforming with contextual information norms (Nissenbaum, 2004).
  • Legal frameworks include GDPR (transparency, purpose, proportionality, accountability) and ECHR Art 8 (respect for private and family life).

Privacy by Design (PbD) Principles

  • Proactive not Reactive; Preventive not remedial.
  • Privacy as the default setting.
  • Privacy Embedded into Design.
  • Full functionality – Positive-Sum, not Zero-sum.
  • End-to-end security – Full lifecycle protection.
  • Visibility and transparency – keep it open.
  • Respect for user privacy – keep it user-centric.

An Obligation

  • Users expect transparency regarding the use and access of their data, handled with care and security.
  • Organizations need to meet both legal requirements and users' expectations.
  • GDPR Article 25 mandates data protection by design and by default.

Data Privacy by Design & by Default

  • Minimizing privacy risks and trust assumptions placed on other entities/parties.
  • Six strategies: Minimize Collection, Minimize Disclosure, Minimize Linkability, Minimize Centralization, Minimize Replication, Minimize Retention.
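
Three of these strategies (collection, linkability, retention) can be sketched in a few lines. This is an illustrative sketch only: the record fields, the 30-day retention period, and the demo key are all assumptions. A keyed pseudonym reduces linkability to the real identity but does not remove it, since records sharing a pseudonym can still be linked to each other.

```python
import hashlib
import hmac
from datetime import datetime, timedelta

# Hypothetical raw record; field names are assumptions for illustration.
raw = {
    "name": "Alice Example",
    "email": "alice@example.org",
    "birth_date": "1990-04-12",
    "kwh": 0.42,
    "recorded_at": datetime(2024, 5, 25, 10, 0),
}

NEEDED_FIELDS = {"kwh", "recorded_at"}    # minimize collection
RETENTION = timedelta(days=30)            # minimize retention (assumed period)
SECRET_KEY = b"demo-key-not-for-production"

def pseudonym(identifier: str) -> str:
    """Keyed hash, so the raw identifier is not stored with the data."""
    digest = hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]

def minimize(record: dict) -> dict:
    """Keep only the needed fields and replace the identity with a pseudonym."""
    kept = {k: v for k, v in record.items() if k in NEEDED_FIELDS}
    kept["subject"] = pseudonym(record["email"])
    return kept

def expired(record: dict, now: datetime) -> bool:
    """True when the record is older than the retention period."""
    return now - record["recorded_at"] > RETENTION

stored = minimize(raw)
print(sorted(stored))
```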

LINDDUN

  • LINDDUN (https://linddun.org/) is a systematic threat assessment methodology.
  • The threat analysis process includes modeling the system, eliciting threats, and managing threats.

LINDDUN Threat Categories

  • Linkability: The ability to link pieces of information related to an individual.
  • Identifiability: The ability to identify an individual from a dataset.
  • Non-repudiation: Ensuring actions or transactions cannot be denied.
  • Detectability: The possibility to detect the presence of data or actions.
  • Disclosure of information: Accidental or unintended release of sensitive information.
  • Unawareness: Lack of awareness regarding the rights of individuals or the availability/sensitivity of data.
  • Non-compliance: Failing to comply with data privacy regulations.
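
One possible way to record elicited threats against these seven categories is a small threat register. The sketch below is not part of LINDDUN itself; the `Finding` class, the DFD element names, and the example findings are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Threat(Enum):
    LINKABILITY = "linkability"
    IDENTIFIABILITY = "identifiability"
    NON_REPUDIATION = "non-repudiation"
    DETECTABILITY = "detectability"
    DISCLOSURE_OF_INFORMATION = "disclosure of information"
    UNAWARENESS = "unawareness"
    NON_COMPLIANCE = "non-compliance"

@dataclass
class Finding:
    element: str    # DFD element the threat applies to (hypothetical names)
    threat: Threat
    note: str

findings = [
    Finding("consumption data store", Threat.LINKABILITY,
            "readings can be linked to one household"),
    Finding("billing data flow", Threat.DISCLOSURE_OF_INFORMATION,
            "readings sent without transport encryption"),
    Finding("customer portal", Threat.UNAWARENESS,
            "users are not told which data is collected"),
]

def by_category(items):
    """Group findings per category, keeping empty categories visible."""
    grouped = {t: [] for t in Threat}
    for finding in items:
        grouped[finding.threat].append(finding)
    return grouped

# The first letters of the category names spell the methodology's name.
acronym = "".join(t.name[0] for t in Threat)
print(acronym)
```

Keeping empty categories in the grouping makes it easy to spot categories that have not been examined yet.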

Case study 1: Electricity Smart Metering System

  • Smart energy meters record household consumption every 30 minutes.
  • Privacy risks involve the inference of sensitive personal attributes from consumption data.
  • Requirements: accurate billing, aggregate statistics, and fraud/tampering detection.

Case Study 1 Iterations

  • Starting assumptions: Functionality defined.
  • Explicit activities: Actionable steps.
  • Use an agile approach with multiple iterations.

Electricity Smart Metering

  • The user domain elements are trusted.
  • The service domain is non-trusted.

Data Identification

  • User domain: Personal, billing, and consumption data.
  • Service domain: Personal, billing, consumption, and transaction logs.

Data Distribution Analysis

  • In the electricity smart metering system, data is distributed across the user and service domains.

Threat Modeling & the Data Lifecycle

  • Threat modeling is a way of thinking systematically about negative scenarios.
  • Techniques like STRIDE, STRIPED, and LINDDUN are used.
  • Risk analysis uses a likelihood vs. impact assessment.
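
A likelihood vs. impact assessment is often turned into a simple score for ranking threats. The sketch below assumes a three-level scale and a multiplicative score; neither is prescribed by the course notes, and the example threats and their ratings are invented.

```python
# Assumed 1-3 scale; the notes do not prescribe a specific scale.
LEVELS = {"low": 1, "medium": 2, "high": 3}

def risk_score(likelihood: str, impact: str) -> int:
    """Score a threat as likelihood times impact (assumed scoring rule)."""
    return LEVELS[likelihood] * LEVELS[impact]

# (description, likelihood, impact): hypothetical ratings for illustration.
threats = [
    ("Log files kept longer than needed", "medium", "low"),
    ("Inference of household habits from readings", "high", "high"),
    ("Readings intercepted in transit", "medium", "high"),
]

# Highest-risk threats first, so they are addressed first.
ranked = sorted(threats, key=lambda t: risk_score(t[1], t[2]), reverse=True)
for name, likelihood, impact in ranked:
    print(risk_score(likelihood, impact), name)
```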

Technological Solutions

  • Address the main threats/risks: keep data out of the service domain, use transport encryption, process data with obfuscation, and apply privacy-enhancing technologies.
  • Apply the six strategies (minimize collection, disclosure, linkability, centralization, replication, and retention).
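
As one illustration of keeping data out of the service domain, the bill can be computed in the user domain so that only the total leaves it. This is a simplified sketch: the `Reading` class, the two-rate tariff, and the message format are assumptions, and it omits any mechanism for the provider to verify the total, which the fraud/tampering requirement would call for in practice.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    slot: int     # index of the 30-minute interval within the billing period
    kwh: float

def tariff(slot: int) -> float:
    """Hypothetical two-rate tariff in EUR/kWh: cheaper before 06:00."""
    slot_of_day = slot % 48    # 48 half-hour slots per day
    return 0.15 if slot_of_day < 12 else 0.25

def bill_in_user_domain(readings) -> float:
    """Runs on the meter or a user device; fine-grained readings stay local."""
    return round(sum(r.kwh * tariff(r.slot) for r in readings), 2)

readings = [Reading(0, 0.2), Reading(20, 0.4), Reading(40, 0.4)]

# Only this single aggregate is sent to the service domain.
message_to_service = {
    "period": "2024-06",
    "amount_eur": bill_in_user_domain(readings),
}
print(message_to_service)
```

The service domain still gets what it needs for accurate billing, while the half-hourly profile that enables inference about the household never leaves the user domain.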

Methodology Freedom

  • No fixed methodology is prescribed.
  • Take privacy considerations into account and perform risk management.
  • Assert data subject rights and integrate appropriate controls.
  • Focus on raising transparency of service/product.

Project Kick-off

  • Form groups (3-5 members).
  • Select a project.
  • The individual report includes a dataset explanation, a system reference diagram, and threat modeling using the LINDDUN categories.
  • The assignment deadline can be found in the "Assignment" section on Teams.
