
Lecture_05.pdf


Full Transcript


Law
ISEC411: Privacy & Anonymity
Course Instructor: Dr. Hanane LAMAAZI

Outline
- Review: HIPAA privacy rule
- Attribute classification
- K-anonymity definition

Topic 21: Data Privacy

WHAT IS A GOOD LAW?
- Consent is a priority: if we can obtain consent from the subject, this is the most ethical approach.
- If consent is not feasible, then anonymity. Meaning: make the data anonymized (identity is hidden).
- No need for consent in that case?

HIPAA privacy rule
- HIPAA is the Health Insurance Portability and Accountability Act.
- Adopted in 1996.
- Restricts the use and disclosure of protected health information (PHI) that pertains to a patient, consumer, or healthcare services.
- Applies to covered entities (hospitals, nursing homes, daycare centers).
- Defines 3 standards for data sharing.

HIPAA privacy rule
The HIPAA Privacy Rule allows secondary uses of data under one of 3 standards:
- Identified patient data (consent): if patients consent to the data use.
- Limited data set:
  - Removal of 16 attributes (the direct identifiers)
  - Recipient signs a data use contract
- De-identified data (anonymization):
  - Option 1: Safe Harbor (18 attributes*)
  - Option 2: Expert Determination
* The 18 attributes of HIPAA are referred to as “Protected health information” or PHI.

Malin et al.

Expert Determination (statistical principle)

Classification of Attributes
- We already know one type.
- How can we classify the rest?

Classification of Attributes

Direct identifiers (key identifiers, unique identifiers, PII): name, address, cell phone
- Can uniquely identify an individual directly.
- Always removed before release.

Quasi-identifier: 5-digit ZIP code, birth date, gender
- A set of attributes that, when taken together, can potentially be linked with external information to re-identify entities.

Classification of Attributes (Cont’d)

Sensitive attribute: medical record, credit card number, etc.
- It is the information that subjects would like to keep confidential.
- Assumption: not known to the attacker.
- Not used in the re-identification.
- Always released directly.
- These attributes are what the researchers need.
- It depends on the requirement.

Example 1

EID  Name  Profession  Marital status  Gender  Work hours  Hypertension  Salary
123  X     lawyer      D               M       35          n             40k
456  Y     teacher     D               F       40          y             50k
789  Z     teacher     D               M       35          n             100k
134  W     doctor      M               M       35          y             250k
145  P     unemployed  M               F       50          n             5k
167  Q     nurse       S               M       40          n             50k

- Table type?
- Which attributes are the direct identifiers, the QIs, and the sensitive attributes?
- How would you classify salary?
- Give an example of another attribute value that is hard to classify: Down syndrome.

K-anonymity
Expert determination

Motivating example 1 (dataset of all people taking ISEC 411)

Name   Age  Area    Marital status
John   21   Al Ain  married
Salem  22   Dubai   single
Jana   21   Dubai   married
Joe    21   Al Ain  single
Marie  21   Al Ain  single

Assume I have a database (above) of all students taking ISEC 411 for fall 2018, containing their age, location, and marital status.

Which of the following statements can be released?
- A 21-year-old student taking ISEC411 has family issues.
- A 21-year-old student, living in Al Ain and taking ISEC411, has family issues.
- A 21-year-old student, married and taking ISEC411, has family issues.
- A 21-year-old student, married, living in Al Ain, and taking ISEC411, has family issues.

Motivating example 1 (dataset of all people taking ISEC 411)

Name   Age  Area    Marital status
John   21   Al Ain  married
Salem  22   Dubai   single
Jana   21   Dubai   married
Joe    21   Al Ain  single
Marie  21   Al Ain  single

Assume I have a database (above) of all students taking ISEC 411 for fall 2023, containing their age, location, and marital status.

How many members of the population correspond to each statement?
- A 21-year-old student, married and taking ISEC411, has family issues. (John and Jana)
- A 21-year-old student, married, living in Al Ain, and taking ISEC411, has family issues. (John)

K-Anonymity
- Sweeney came up with a formal protection model named k-anonymity.

What is K-Anonymity?
- A released dataset satisfies k-anonymity if the information for each person contained in it cannot be distinguished from at least k-1 individuals whose information also appears in the release.
- What does this mean?

K-Anonymity
Consider the database below with birth year, gender, and disease.

Birth year  Gender  Disease
1995        M       cancer
1996        M       flu
1995        M       flu
1996        F       cancer
1996        F       cancer
1996        F       cancer
1996        F       cancer
1996        F       cancer
1998        F       flu
1998        F       flu
1998        F       flu
1996        M       cancer

1. What are the QIs?
2. How many unique combinations of QIs do we have?
3. What is the count of the records in each combination?
4. If I know that John was born in 1995 and is in the database, how many people correspond to John?
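
The four questions above can be worked through with a short Python sketch. It is only an illustration, not part of the lecture: it assumes that birth year and gender are the quasi-identifiers and that disease is the sensitive attribute, and the names `RECORDS`, `qi_group_sizes`, `k_of`, and `candidates` are hypothetical helpers introduced here.

```python
from collections import Counter

# Records copied from the slide: (birth_year, gender, disease).
RECORDS = [
    (1995, "M", "cancer"),
    (1996, "M", "flu"),
    (1995, "M", "flu"),
    (1996, "F", "cancer"),
    (1996, "F", "cancer"),
    (1996, "F", "cancer"),
    (1996, "F", "cancer"),
    (1996, "F", "cancer"),
    (1998, "F", "flu"),
    (1998, "F", "flu"),
    (1998, "F", "flu"),
    (1996, "M", "cancer"),
]

# Assumption: birth year (column 0) and gender (column 1) are the
# quasi-identifiers; disease (column 2) is the sensitive attribute.
QI_COLUMNS = (0, 1)


def qi_group_sizes(records, qi_columns=QI_COLUMNS):
    """Count how many records share each quasi-identifier combination."""
    return Counter(tuple(rec[i] for i in qi_columns) for rec in records)


def k_of(records, qi_columns=QI_COLUMNS):
    """Largest k for which the table is k-anonymous (the smallest group size)."""
    return min(qi_group_sizes(records, qi_columns).values())


def candidates(records, birth_year):
    """Records an attacker cannot tell apart when only the birth year is known."""
    return [rec for rec in records if rec[0] == birth_year]


groups = qi_group_sizes(RECORDS)
for combo, size in sorted(groups.items()):
    print(combo, size)
print("k =", k_of(RECORDS))
print("records matching birth year 1995:", len(candidates(RECORDS, 1995)))
```

Under these assumptions there are four QI combinations with 2, 2, 5, and 3 records, so the table is 2-anonymous, and knowing only that John was born in 1995 narrows him down to two records.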
