A RIGHT TO REASONABLE INFERENCES: RE-THINKING DATA PROTECTION LAW IN THE AGE OF BIG DATA AND AI

Sandra Wachter* & Brent Mittelstadt**

* Corresponding author. E-mail: [email protected]. Oxford Internet Institute, University of Oxford, 1 St. Giles, Oxford, OX1 3JS, UK; the Alan Turing Institute, British Library, 96 Euston Road, London, NW1 2DB, UK.
** Oxford Internet Institute, University of Oxford, 1 St. Giles, Oxford, OX1 3JS, UK; the Alan Turing Institute, British Library, 96 Euston Road, London, NW1 2DB, UK. The authors would like to thank Prof. Viktor Mayer-Schönberger and Dr. Christopher Russell for their incredibly detailed and thoughtful feedback that has immensely improved the quality of this work. The authors would also like to thank Dr. Alessandro Spina, Prof. Manfred Stelzer, Prof. Lee Bygrave, and Dr. Patrick Allo for their insightful and considerate comments from which this Article greatly benefitted.

Big Data analytics and artificial intelligence (AI) draw non-intuitive and unverifiable inferences and predictions about the behaviors, preferences, and private lives of individuals. These inferences draw on highly diverse and feature-rich data of unpredictable value, and create new opportunities for discriminatory, biased, and invasive decision-making. Data protection law is meant to protect people's privacy, identity, reputation, and autonomy, but is currently failing to protect data subjects from the novel risks of inferential analytics. The legal status of inferences is heavily disputed in legal scholarship, and marked by inconsistencies and contradictions within and between the views of the Article 29 Working Party and the European Court of Justice (ECJ). This Article shows that individuals are granted little control or oversight over how their personal data is used to draw inferences about them. Compared to other types of personal data, inferences are effectively "economy class" personal data in the General Data Protection Regulation (GDPR). Data subjects' rights to know about (Articles 13–15), rectify (Article 16), delete (Article 17), object to (Article 21), or port (Article 20) personal data are significantly curtailed for inferences. The GDPR also provides insufficient protection against sensitive inferences (Article 9) or remedies to challenge inferences or important decisions based on them (Article 22(3)). This situation is not accidental. In standing jurisprudence the ECJ has consistently restricted the remit of data protection law to assessing the legitimacy of input personal data undergoing processing, and to rectify, block, or erase it. Critically, the ECJ has likewise made clear that data protection law is not intended to ensure the accuracy of decisions and decision-making processes involving personal data, or to make these processes fully transparent. Current policy proposals addressing privacy protection (the ePrivacy Regulation and the EU Digital Content Directive) and Europe's new Copyright Directive and Trade Secrets Directive also fail to close the GDPR's accountability gaps concerning inferences. This Article argues that a new data protection right, the "right to reasonable inferences," is needed to help close the accountability gap currently posed by "high risk inferences," meaning inferences drawn from Big Data analytics that damage privacy or reputation, or have low verifiability in the sense of being predictive or opinion-based while being used in important decisions.
This right would require ex-ante justification to be given by the data controller to establish whether an inference is reasonable. This disclosure would address (1) why certain data form a normatively acceptable basis from which to draw inferences; (2) why these inferences are relevant and normatively acceptable for the chosen processing purpose or type of automated decision; and (3) whether the data and methods used to draw the inferences are accurate and statistically reliable. The ex-ante justification is bolstered by an additional ex-post mechanism enabling unreasonable inferences to be challenged.

I. Introduction
II. From Explanations to Reasonable Inferences
   A. The Novel Risks of Inferential Analytics and a Right to Reasonable Inferences
III. Are Inferences Personal Data?
   A. Three-Step Model
   B. Subjectivity and Verifiability
IV. Jurisprudence of the European Court of Justice
   A. Joined Cases C-141/12 and C-372/12: YS and M and S
      1. Inferences as Personal Data
      2. Remit of Data Protection Law
   B. Case C-434/16: Nowak
      1. Inferences as Personal Data
      2. Remit of Data Protection Law
   C. Lessons from Jurisprudence of the ECJ
V. Protection Against Inferences Under Data Protection Law
   A. The Right to Know About Inferences
   B. The Right to Rectify Inferences
   C. The Rights to Object to and Delete Inferences
   D. Protections Against Sensitive Inferences
      1. Can Inferences Be Sensitive Personal Data?
      2. Intentionality and Reliability
   E. The Right to Contest Decisions Based on Inferences
VI. Re-Aligning the Remit of Data Protection Law in the Age of Big Data: A Right to Reasonable Inferences
   A. Justification to Establish Acceptability, Relevance, and Reliability
   B. Contestation of Unreasonable Inferences
VII. Barriers to a Right to Reasonable Inferences: IP Law and Trade Secrets
   A. Algorithmic Models and Statistical Purposes in the GDPR
   B. Algorithmic Models and the EU's Copyright Directive
   C. Algorithmic Models and Outcomes and Intellectual Property Law
   D. Algorithmic Models and Outcomes and Trade Secrets
VIII. Conclusion and Recommendations
   A. Re-Define the Remit of Data Protection Law
   B. Focus on How Data is Evaluated, Not Just Collected
   C. Do Not Focus Only on the Identifiability of Data Subjects
   D. Justify Data Sources and Intended Inferences Prior to Deployment of Inferential Analytics at Scale
   E. Give Data Subjects the Ability to Challenge Unreasonable Inferences

I. INTRODUCTION

Big Data analytics and artificial intelligence ("AI") draw non-intuitive and unverifiable inferences and predictions about the behaviors, preferences, and private lives of individuals. These inferences draw on highly diverse and feature-rich data of unpredictable value and create new opportunities for discriminatory, biased, and privacy-invasive profiling and decision-making.1 Inferential analytics methods are used to infer user preferences, sensitive attributes (e.g., race, gender, sexual orientation), and opinions (e.g., political stances), or to predict behaviors (e.g., to serve advertisements). These methods can be used to nudge or manipulate us, or to make important decisions (e.g., loan or employment decisions) about us. The intuitive link between actions and perceptions is being eroded, leading to a loss of control over identity and how individuals are perceived by others. Concerns about algorithmic accountability are often actually concerns about the way in which these technologies draw privacy-invasive and non-verifiable inferences that cannot be predicted, understood, or refuted.

1 See Brent Daniel Mittelstadt, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter & Luciano Floridi, The Ethics of Algorithms: Mapping the Debate, BIG DATA & SOC'Y, July–Dec. 2016, at 1–2.

Data protection law is meant to protect people's privacy, identity, reputation, and autonomy, but it is currently failing to protect data subjects from the novel risks of inferential analytics. The broad concept of personal data in Europe could be interpreted to include inferences, predictions, and assumptions that refer to or impact an individual. If seen as personal data, individuals would be granted numerous rights under data protection law. However, the legal status of inferences is heavily disputed in legal scholarship, and marked by inconsistencies and contradictions within and between the views of the Article 29 Working Party2 and the European Court of Justice. It is crucial to note, however, that the question of whether inferences are personal data is not the most important one. The underlying problem goes much deeper and relates to the tension of whether individuals have rights, control, and recourse concerning how they are seen by others.

2 It is worth noting that as of the implementation of the General Data Protection Regulation ("GDPR") on May 25, 2018, the Article 29 Working Party has ceased to exist and has been succeeded by the European Data Protection Board ("EDPB"). See European Data Prot. Bd., The European Data Protection Board, Endorsement 1/2018 (May 25, 2018), https://edpb.europa.eu/sites/edpb/files/files/news/endorsement_of_wp29_documents.pdf [https://perma.cc/8H9A-RQR3]. One of the first acts of the EDPB was to adopt the positions and papers drafted by the Article 29 Working Party pertaining to the GDPR. For a full list of adopted documents, see id. Only one set of guidelines produced by the EDPB between May 25, 2018 and April 2019 are relevant to the topics addressed herein. See European Data Prot.
Bd., Guidelines 2/2019 on the Processing of Personal Data Under Article 6(1)(b) GDPR in the Context of the Provision of Online Services to Data Subjects (Apr. 8, 2019), https://privacyblogfullservice.huntonwilliamsblogs.com/wpcontent/uploads/sites/28/2019/04/edpb_draft_guidelines-art_6-1-bfinal_public_consultation_version_en.pdf [https://perma.cc/R3GD-J75Y]. This Article therefore continues to focus on the opinions, guidelines, and working papers of the Article 29 Working Party, which remain a key source of interpretation for the GDPR and the preceding 1995 Data Protection Directive and have proven influential in standing jurisprudence of the European Court of Justice pertaining to data protection law. It is of course likely that in the future the EDPB will adopt additional positions in support of or contradictory to the views of the Article 29 Working Party, which may be relevant to the analysis carried out here.

This Article will show that individuals are granted little control and oversight over how their personal data is used to draw inferences about them. Compared to other types of personal data, inferences are effectively "economy class" personal data in the General Data Protection Regulation ("GDPR"). Data subjects' rights to know about (Art. 13–15), rectify (Art. 16), delete (Art. 17), object to (Art. 21), or port (Art. 20) personal data are significantly curtailed when it comes to inferences, often requiring a greater balance with the controller's interests (e.g., trade secrets or intellectual property) than would otherwise be the case. Similarly, the GDPR provides insufficient protection against sensitive inferences (Art. 9) or remedies to challenge inferences or important decisions based on them (Art. 22(3)). This situation is not accidental. In standing jurisprudence, the European Court of Justice ("ECJ")3 and the Advocate General ("AG")4 have consistently restricted the remit of data protection law to assessing the legitimacy of the input stage of personal data processing, including rectification and erasure of inputs, and objecting to undesired processing.5 Critically, the ECJ has likewise made clear that data protection law is not intended to ensure the accuracy of decisions and decision-making processes involving personal data, or to make these processes fully transparent. In short, data subjects have control over how their personal data is collected and processed, but very little control over how it is evaluated. The ECJ makes clear that if the data subject wishes to challenge their evaluation, recourse must be sought through sectoral laws applicable to specific cases, not data protection law.6

3 See Case C–28/08 P, European Comm'n v. Bavarian Lager Co., 2010 E.C.R. I–6055, ¶¶ 49–50; Case C–434/16, Peter Nowak v. Data Prot. Comm'r, 2017 E.C.R. I-994, ¶¶ 54–55; Joined Cases C–141 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2014 E.C.R. I-2081, ¶¶ 45–47.
4 Case C–434/16, Peter Nowak v. Data Prot. Comm'r, 2017 E.C.R. I-582, ¶¶ 54–58; Joined Cases C–141 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2013 E.C.R. I-838, ¶¶ 32, 54–60.
5 See, e.g., Case C–553/07, College van burgemeester en wethouders van Rotterdam v. M.E.E. Rijkeboer, 2009 E.C.R. I-293, ¶¶ 48–52.

Conflict looms on the horizon in Europe that will further weaken the protection afforded to data subjects against inferences.
Current policy proposals addressing privacy protection—the ePrivacy Regulation and the EU Digital Content Directive—fail to close the GDPR’s accountability gaps concerning inferences. At the same time, the GDPR and Europe’s new Copyright Directive aim to facilitate data mining, knowledge discovery, and Big Data analytics by limiting data subjects’ rights over personal data. And lastly, the new Trade Secrets Directive provides extensive protection of commercial interests attached to the outputs of these processes (e.g., models, algorithms and inferences). This Article argues that a new data protection right, the “right to reasonable inferences,” is needed to help close the accountability gap currently posed by “high-risk inferences,” meaning inferences drawn through Big Data analytics that are privacy-invasive or reputation-damaging, or have low verifiability in the sense of being predictive or opinion-based while being used for important decisions.7 In cases where algorithms draw “high-risk inferences” about individuals, this right would require the data controller to provide ex-ante justification to establish that the inference to be drawn is See supra note 3. “Important” in this context refers to the existence of “legal or similarly significant effects” resulting from a given decision. This notion is derived from Article 22(1) of the GDPR regarding automated decision-making. See Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation), 2016 O.J. (L119) art. 22(1). The precise scope of “legal or similarly significant effects” remains unclear in practice, though it will be clarified as the GDPR matures via legal commentary, national implementation, and jurisprudence. See generally Sandra Wachter, Normative Challenges of Identification in the Internet of Things: Privacy, Profiling, Discrimination, and the GDPR, 34 COMPUTER L. & SECURITY REV. 436 (2018); Sandra Wachter, The GDPR and the Internet of Things: A Three-Step Transparency Model, 10 LAW INNOVATION & TECH. 266 (2018). 6 7 No. 2:494] A RIGHT TO REASONABLE INFERENCES 501 reasonable. This disclosure would address (1) why certain data form a normatively acceptable basis from which to draw inferences; (2) why these inferences are relevant and normatively acceptable for the chosen processing purpose or type of automated decision; and (3) whether the data and methods used to draw the inferences are accurate and statistically reliable. The ex-ante justification would be bolstered by an additional ex-post mechanism enabling unreasonable inferences to be challenged. A right to reasonable inferences must, however, be reconciled with EU jurisprudence and counterbalanced with intellectual property (“IP”) and trade secrets law, as well as with freedom of expression8 and Article 16 of the EU Charter of Fundamental Rights9—the freedom to conduct a business. Part II first examines gaps in current work on algorithmic accountability before reviewing the novel risks of Big Data analytics and algorithmic decision-making that necessitate the introduction of a right to reasonable inferences. For such a right to be feasible under data protection law, inferences must be shown to be personal data. Part III reviews the position of the Article 29 Working Party on the legal status of inferences. 
Part IV then contrasts this with jurisprudence of the European Court of Justice, which paints a more restrictive picture of the scope of personal data and the remit of data protection law. Part V then assesses the current legal protection granted to inferences under European data protection laws. With the legal status and limited protection granted to inferences established, Part VI then describes the aims and scope of the proposed “right to reasonable inferences.” Part VII then examines barriers likely to be encountered in the implementation of the proposed right, drawing from data protection law, as well as IP law and the 8 See JORIS VAN HOBOKEN, SEARCH ENGINE FREEDOM: ON THE IMPLICATIONS OF THE RIGHT TO FREEDOM OF EXPRESSION FOR THE LEGAL GOVERNANCE OF WEB SEARCH ENGINES 316–32 (2012); see also JORIS VAN HOBOKEN, THE PROPOSED RIGHT TO BE FORGOTTEN SEEN FROM THE PERSPECTIVE OF OUR RIGHT TO REMEMBER (2013). 9 Charter of Fundamental Rights of the European Union, 2000 O.J. (C364) 1. 502 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 new EU Trade Secrets Directive. In Part VIII, the Article concludes with recommendations on how to re-define the remit of data protection law to better guard against the novel risks of Big Data and AI. In the same way that it was necessary to create a “right to be forgotten” in a Big Data world,10 it is now necessary to create a “right on how to be seen.” II. FROM EXPLANATIONS TO REASONABLE INFERENCES Recent years have seen a flurry of work addressing explainability as a means to achieve accountability in algorithmic decision-making systems.11 This work has taken many forms, including calls for regulation,12 development of 10 See generally VAN HOBOKEN, THE PROPOSED RIGHT TO BE FORGOTTEN, supra note 8; see also VIKTOR MAYER-SCHÖNBERGER, DELETE: THE VIRTUE OF FORGETTING IN THE DIGITAL AGE (2009). 11 See, e.g., FRANK PASQUALE, THE BLACK BOX SOCIETY: THE SECRET ALGORITHMS THAT CONTROL MONEY AND INFORMATION (2015); Joshua A. Kroll et al., Accountable Algorithms, 165 U. PA. L. REV. 633 (2017); Tim Miller, Explanation in Artificial Intelligence: Insights from the Social Sciences, ARTIFICIAL INTELLIGENCE, Feb. 2019, at 1; Brent Mittelstadt, Chris Russell & Sandra Wachter, Explaining Explanations in AI, in FAT* ‘19: CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY (FAT* ’19), JANUARY 29–31, 2019, ATLANTA, GA, USA 279 (2019); S. C. Olhede & P.J. Wolfe, The Growing Ubiquity of Algorithms in Society: Implications, Impacts and Innovations, PHIL. TRANSACTIONS ROYAL SOC’Y A, Aug. 6, 2018, at 8; Sandra Wachter, Brent Mittelstadt & Luciano Floridi, Why a Right to Explanation of Automated Decision-Making Does Not Exist in the General Data Protection Regulation, 7 INT’L DATA PRIVACY L. 76 (2017); Sandra Wachter, Brent Mittelstadt & Chris Russell, Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR, 31 HARV. J.L. & TECH. 841 (2018); Finale Doshi-Velez & Mason Kortz, Accountability of AI Under the Law: The Role of Explanation (Berkman Klein Ctr. Working Grp. on Explanation and the Law Working Paper, 2017); see also Jenna Burrell, How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms, BIG DATA & SOC’Y, Jan.–June 2016, at 1 (describing sources of algorithmic opacity). 12 See, e.g., Marion Oswald, Algorithm-Assisted Decision-Making in the Public Sector: Framing the Issues Using Administrative Law Rules Governing Discretionary Power, PHIL. TRANSACTIONS ROYAL SOC’Y A, Aug. 6, No. 
2:494] A RIGHT TO REASONABLE INFERENCES 503 technical methods of explanation13 and auditing mechanisms,14 and setting of standards for algorithmic accountability in public and private institutions.15 These diverse streams of work are essential in the quest to increase AI accountability and fortunately have made much progress in legal, ethical, policy, and technical terms. Yet each is still united by a common blind spot: a legal or ethical basis is required to justify demands for explanations and determine their required content.16 As a result, much of the prior work on methods, standards, and other scholarship around explanations will be valuable in an academic or developmental sense, but will fail to actually help the intended beneficiaries of algorithmic accountability: people affected by algorithmic decisions. Unfortunately, there is little reason to assume that organizations will voluntarily offer full explanations covering the process, justification for, and accuracy of algorithmic decision-making unless obliged to do so. These systems are often highly complex, involve (sensitive) personal data, and 2018, at 1, 3; Andrew Tutt, An FDA for Algorithms, 69 ADMIN. L. REV. 83 (2017). 13 See, e.g., Mittelstadt et al., supra note 11. 14 See, e.g., Brent Mittelstadt, Auditing for Transparency in Content Personalization Systems, 10 INT’L J. COMM. 4991 (2016); Pauline T. Kim, Essay, Auditing Algorithms for Discrimination, 166 U. PA. L. REV. ONLINE 189 (2017). 15 See, e.g., European Parliament Resolution of 16 February 2017 with Recommendations to the Commission on Civil Law Rules on Robotics (2015/2013(INL)), EUR. PARL. DOC. P8_TA(2017)0051, http://www.europarl.europa.eu/sides/getDoc.do?pubRef=//EP//NONSGML+TA+P8-TA-2017-0051+0+DOC+PDF+V0//EN [https://perma.cc/9H5H-W2UE]; NAT’L SCI. & TECH. COUNCIL COMM. ON TECH., EXEC. OFFICE OF THE PRESIDENT OF THE UNITED STATES, PREPARING FOR THE FUTURE OF ARTIFICIAL INTELLIGENCE 30–34 (2016); HOUSE OF COMMONS SCI. & TECH. COMM., HC 351, ALGORITHMS IN DECISION-MAKING 24–31, 39–40 (2018) (UK); Corinne Cath, Sandra Wachter, Brent Mittelstadt, Mariarosaria Taddeo & Luciano Floridi, Artificial Intelligence and the ‘Good Society’: The US, EU, and UK Approach, 24 SCI. & ENGINEERING ETHICS 505 (2018). 16 For an exploration of norms around explanation, see Doshi-Velez et al., supra note 11, at 3–6. 504 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 use methods and models considered to be trade secrets. Providing explanations thus imposes additional costs and risks for the organization. Where a general legal or ethical justification for explanations of algorithmic decisions does not exist,17 requests will require alternative grounds to be successful.18 This Article refers to these potential grounds for demanding information about an automated decision-making process as legal or ethical “decision-making standards.” Such standards define certain procedures that must be followed in particular decision-making processes and can be enshrined in individual rights, sectoral laws, or other regulatory instruments. Decision-making standards are not typically embedded in an absolute right that would require the full decision-making procedure to be disclosed; it remains, for example, within the private autonomy of the employer to make hiring decisions. Rather, decision-making standards provide grounds to demand limited explanations detailing the steps of a decisionmaking process necessary to determine whether the procedures in question were followed. 
So, for example, a job applicant may have a right to certain standards being followed within that procedure, such as not basing the hiring decision on a protected attribute (e.g., ethnicity) because doing so would constitute discrimination. Nonetheless, granting explanations is only one possible way forward in making algorithmic decision-making accountable. Explanations can provide an effective ex-post remedy, but an explanation can be rendered only after a 17 The GDPR’s right to explanation, even if legally binding, would be limited to decision-making based solely on automated processing with legal or similarly significant effects. These conditions significantly limit its potential applicability. See Wachter, Mittelstadt & Floridi, supra note 11, at 78; see also Article 29 Data Prot. Working Party, Guidelines on Automated Individual Decision-Making and Profiling for the Purposes of Regulation 2016/679, 17/EN, WP251rev.01, at 19 (Feb. 6, 2018), http://ec.europa.eu/newsroom/article29/document.cfm?doc_id=49826 (on file with the Columbia Business Law Review). 18 Doshi-Velez et al., supra note 11, at 4, for example, suggest that demands for explanation will not be justified unless accompanied by recourse for harm suffered. No. 2:494] A RIGHT TO REASONABLE INFERENCES 505 decision has been made.19 An explanation might inform the individual about the outcome or decision and about underlying assumptions, predictions, or inferences that led to it. It would not, however, ensure that the decision, assumption, prediction, or inference is justified.20 In short, explanations of a decision do not equal justification of an inference or decision. Therefore, if the justification of algorithmic decisions is at the heart of calls for algorithmic accountability and explainability, governance requires both effective ex-ante and ex-post remedies. Individual-level rights are required that would grant data subjects the ability to manage how privacy-invasive inferences are drawn, and to seek redress against unreasonable inferences when they are created or used to make important decisions. A. The Novel Risks of Inferential Analytics and a Right to Reasonable Inferences The following Sections explain how European law is not equipped to protect individuals against the novel risks brought on by automated decision-making driven by inferential analytics. This Article argues that a new right—a right to reasonable inferences—might help to close the accountability gap currently posed by these technologies in Europe.21 To explain why this new right is essential, it is first necessary to establish the source of risks in Big Data analytics and algorithmic decision-making systems. Automated decision-making, profiling, and related machine-learning techniques pose new opportunities for privacy-invasive, discriminatory, and biased decision-making based on See generally Wachter, Mittelstadt & Floridi, supra note 11. See Miller, supra note 11, at 8; see also Mireille Hildebrandt, Primitives of Legal Protection in the Era of Data-Driven Platforms, 2 GEO. L. TECH. REV. 252, 271 (2018). 21 See Wachter, Normative Challenges of Identification in the Internet of Things, supra note 7, at 448; Wachter, The GDPR and the Internet of Things, supra note 7, at 267–71. 19 20 506 COLUMBIA BUSINESS LAW REVIEW [Vol. 
2019 inferential analytics.22 Modern data analytics has access to unprecedented volumes and varieties of linked-up data to assess the behaviors, preferences, and private lives of individuals.23 Inferences can be used to nudge and manipulate us. The range of potential victims of these harms is diversified by the focus in modern data analytics on finding small but meaningful links between individuals,24 and constructing group profiles from personal, third-party, and anonymized data.25 Numerous applications of Big Data analytics to draw potentially troubling inferences about individuals and groups have emerged in recent years.26 Major internet platforms are behind many of the highest profile examples: Facebook may be able to infer sexual orientation—via online behavior27 or 22 See Mittelstadt et al., supra note 1, at 7–10. See generally Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 CALIF. L. REV. 671 (2016). 23 See generally VIKTOR MAYER-SCHÖNBERGER & KENNETH CUKIER, BIG DATA: A REVOLUTION THAT WILL TRANSFORM HOW WE LIVE, WORK, AND THINK (2013). See also Brent Daniel Mittelstadt & Luciano Floridi, The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts, 22 SCI. & ENGINEERING ETHICS 303, 304–06 (2016); Tal Z. Zarsky, Understanding Discrimination in the Scored Society, 89 WASH. L. REV. 1375 (2014). 24 See Danielle Keats Citron & Frank Pasquale, The Scored Society: Due Process for Automated Predictions, 89 WASH. L. REV. 1, 2–4 (2014); Peter Grindrod, Beyond Privacy and Exposure: Ethical Issues Within Citizen-Facing Analytics, PHIL. TRANSACTIONS ROYAL SOC’Y A, Dec. 28, 2016, at 10–12. 25 See Alessandro Mantelero, From Group Privacy to Collective Privacy: Towards a New Dimension of Privacy and Data Protection in the Big Data Era, in GROUP PRIVACY: NEW CHALLENGES OF DATA TECHNOLOGIES 139, 145 (Linnet Taylor, Luciano Floridi & Bart van der Sloot eds., 2017); Brent Mittelstadt, From Individual to Group Privacy in Big Data Analytics, 30 PHIL. & TECH. 475, 476 (2017). 26 See, e.g., Christopher Kuner, Fred H. Cate, Christopher Millard & Dan Jerker B. Svantesson, The Challenge of “Big Data” for Data Protection, 2 INT’L DATA PRIVACY L. 47 (2012). 27 See José González Cabañas, Ángel Cuevas & Rubén Cuevas, Facebook Use of Sensitive Data for Advertising in Europe (Feb. 14, 2018) (unpublished manuscript), https://arxiv.org/abs/1802.05030 [https://perma. cc/V2C8-FY3W]. No. 2:494] A RIGHT TO REASONABLE INFERENCES 507 based on friends28—and other protected attributes (e.g., race),29 political opinions30 and sadness and anxiety31 – all of these inferences are used for targeted advertising. Facebook can also infer imminent suicide attempts,32 while third parties have used Facebook data to infer socioeconomic status33 and stances on abortion.34 Insurers are starting to use social media data to set premiums, 35 which is troublesome because research suggests that a person’s social network can 28 Carter Jernigan & Behram F.T. Mistree, Gaydar: Facebook Friendships Expose Sexual Orientation, FIRSTMONDAY.ORG (Oct. 5, 2009), https://firstmonday.org/ojs//index.php/fm/article/view/2611 [https://perma.cc/AMK2-QB8U]. 29 Annalee Newitz, Facebook’s Ad Platform Now Guesses at Your Race Based on Your Behavior, ARS TECHNICA (Mar. 18, 2016), https://arstechnica.com/information-technology/2016/03/facebooks-adplatform-now-guesses-at-your-race-based-on-your-behavior/ [https://perma.cc/H6SB-MSAE]. 30 Jeremy B. Merrill, Liberal, Moderate or Conservative? See How Facebook Labels You, N.Y. 
TIMES (Aug. 23, 2016), https://www.nytimes.com/2016/08/24/us/politics/facebook-ads-politics.html [https://perma.cc/QNU7-YCBZ]. 31 Michael Reilly, Is Facebook Targeting Ads at Sad Teens?, MIT TECH. REV. (May 1, 2017), https://www.technologyreview.com/s/604307/isfacebook-targeting-ads-at-sad-teens/ (on file with the Columbia Business Law Review). 32 Josh Constine, Facebook Rolls Out AI to Detect Suicidal Posts Before They’re Reported, TECHCRUNCH (Nov. 27, 2017), https://techcrunch.com/2017/11/27/facebook-ai-suicideprevention/?guccounter=1 [https://perma.cc/QF62-WJEH]. 33 See Astra Taylor & Jathan Sadowski, How Companies Turn Your Facebook Activity into a Credit Score, THE NATION (May 27, 2015), https://www.thenation.com/article/how-companies-turn-your-facebookactivity-credit-score/ [https://perma.cc/V4V5-7H55]. 34 See Sharona Coutts, Anti-Choice Groups Use Smartphone Surveillance to Target ‘Abortion-Minded Women’ During Clinic Visits, REWIRE (May 25, 2016), https://rewire.news/article/2016/05/25/anti-choicegroups-deploy-smartphone-surveillance-target-abortion-minded-womenclinic-visits/ [https://perma.cc/VE5A-D5S9]. 35 Leslie Scism, New York Insurers Can Evaluate Your Social Media Use—If They Can Prove Why It’s Needed, WALL ST. J. (Jan. 30, 2019), https://www.wsj.com/articles/new-york-insurers-can-evaluate-your-socialmedia-useif-they-can-prove-why-its-needed-11548856802 (on file with the Columbia Business Law Review). 508 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 be used to draw acute and intimate inferences about one’s personality.36 Tendencies to depression can be inferred through Facebook37 and Twitter38 usage; Google has attempted to predict flu outbreaks39 as well as other diseases and their outcomes40; and Microsoft can likewise predict Parkinson’s disease 41 and Alzheimer’s disease 42 from search engine interactions. Amazon’s Alexa might be able to infer health status based on speech patterns.43 Other recent 36 See Kristen M Altenburger & Johan Ugander, Monophily in Social Networks Introduces Similarity among Friends-of-Friends, NATURE HUMAN BEHAVIOUR, Apr. 2018, at 284. 37 See Megan A. Moreno et al., Feeling Bad on Facebook: Depression Disclosures by College Students on a Social Networking Site, 28 DEPRESSION & ANXIETY 447 (2011). 38 See Moin Nadeem, Mike Horn, Glen Coppersmith & Sandip Sen, Identifying Depression on Twitter (July 25, 2016) (unpublished manuscript), https://arxiv.org/abs/1607.07384 [https://perma.cc/SKB6WT6K]. 39 Donald R. Olson, Kevin J. Konty, Marc Paladini, Cecile Viboud & Lone Simonsen, Reassessing Google Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative Epidemiological Study at Three Geographic Scales, PLOS COMPUTATIONAL BIOLOGY, Oct. 2013, at 1. 40 See Anthony Cuthbertson, Google AI Can Predict When People Will Die with ‘95 Per Cent Accuracy’, INDEPENDENT (June 19, 2018), https://www.independent.co.uk/life-style/gadgets-and-tech/news/google-aipredict-when-die-death-date-medical-brain-deepmind-a8405826.html [https://perma.cc/D7RR-Y2M4]; Alvin Rajkomar et al., Scalable and Accurate Deep Learning with Electronic Health Records, NPJ DIGITAL MED., May 8, 2018, at 2–4. 41 See Ryen W. White, P. Murali Doraiswamy & Eric Horvitz, Detecting Neurodegenerative Disorders from Web Search Signals, NPJ DIGITAL MED., Apr. 23, 2018, at 1, 3; Liron Allerhand, Brit Youngmann, Elad Yom-Tov & David Arkadir, Detecting Parkinson’s Disease from Interactions with a Search Engine: Is Expert Knowledge Sufficient? 
1 (May 3, 2018) (unpublished manuscript), https://arxiv.org/abs/1805.01138 [https://perma.cc/SF5A-4VTW]. 42 See White, Doraiswamy & Horvitz, supra note 41. 43 James Cook, Amazon Patents New Alexa Feature That Knows When You’re Ill and Offers You Medicine, TELEGRAPH (Oct. 9, 2018), https://www.telegraph.co.uk/technology/2018/10/09/amazon-patents-newalexa-feature-knows-offers-medicine/ [https://perma.cc/V346-HFWE]. No. 2:494] A RIGHT TO REASONABLE INFERENCES 509 potentially invasive applications 44 include Target’s prediction of pregnancy in customers,45 researchers inferring levels of user satisfaction with search results using mouse tracking,46 and, finally, China’s far-reaching social credit scoring system.47 None of these applications can claim to generate inferences or predictions with absolute certainty, and in several cases, they have suffered highly visible failures (e.g. Google Flu Trends).48 Many are likewise used solely for targeted advertising. Justification for these invasive uses of personal data is crucial from an ethical49 as well as legal 50 viewpoint to 44 For an interesting overview of applications that infer sensitive information, see Christopher Burr, Nello Cristianini & James Ladyman, An Analysis of the Interaction Between Intelligent Software Agents and Human Users, 28 MINDS & MACHINES 735 (2018). 45 See Charles Duhigg, How Companies Learn Your Secrets, N.Y. TIMES (Feb. 16, 2012), https://www.nytimes.com/2012/02/19/magazine/shoppinghabits.html [https://perma.cc/7Y84-6MWW]; MAYER-SCHÖNBERGER & CUKIER, supra note 23, at 57–58. 46 Ye Chen, Yiqun Liu, Min Zhang & Shaoping Ma, User Satisfaction Prediction with Mouse Movement Information in Heterogeneous Search Environment, 29 IEEE TRANSACTIONS ON KNOWLEDGE & DATA ENGINEERING 2470 (2017). 47 Simon Denyer, China’s Plan to Organize Its Society Relies on ‘Big Data’ to Rate Everyone, WASH. POST (Oct. 22, 2016), https://www.washingtonpost.com/world/asia_pacific/chinas-plan-toorganize-its-whole-society-around-big-data-a-rating-foreveryone/2016/10/20/1cd0dd9c-9516-11e6-ae9d-0030ac1899cd_story.html [https://perma.cc/Z3KP-KK2T]. For a discussion of the challenges of regulating uses of non-traditional data, such as data generated by Internet of Things devices, for credit and similar decisions, see Scott R. Peppet, Regulating the Internet of Things: First Steps Toward Managing Discrimination, Privacy, Security and Consent, 93 TEX. L. REV. 85 (2014). 48 See David Lazer, Ryan Kennedy, Gary King & Alessandro Vespignani, The Parable of Google Flu: Traps in Big Data Analysis, 343 SCIENCE 1203 (2014). 49 For ethical approaches to AI accountability and justification, see Reuben Binns, Algorithmic Accountability and Public Reason, 31 PHIL. & TECH. 543, 548–52 (2018); Hildebrandt, supra note 20. 50 Viktor Mayer-Schönberger & Yann Padova, Regime Change? Enabling Big Data Through Europe’s New Data Protection Regulation, 17 COLUM. SCI. & TECH. L. REV. 315, 332 (2016) (considering moving away from consent-based data protection to governance of fair and ethical data uses); 510 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 avoid inferential analytics that are privacy-invasive or damaging to reputation, particularly when these inferences are poorly verifiable or affected individuals receive no benefit. It is thus increasingly common to deploy inferential analytics at scale, based solely on the ability to do so and the perceived accuracy of the method or a belief that efficiency or revenue will improve. 
From the perspective of the individual, the potential value and insightfulness of data generated while using digital technologies is often opaque. Counterintuitive and unpredictable inferences can be drawn by data controllers, without individuals ever being aware, 51 thus posing risks to privacy52 and identity,53 data protection, reputation,54 and informational self-determination.55 As Tene and Polonetsky argue, “[i]n a big data world, what calls for scrutiny is often not the accuracy of the raw data but rather the accuracy of the see also Alessandro Mantelero, The Future of Consumer Data Protection in the E.U. Re-Thinking the “Notice and Consent” Paradigm in the New Era of Predictive Analytics, 30 COMPUTER L. & SECURITY REV. 643, 653–55 (2014). 51 See Mittelstadt & Floridi, supra note 23, at 312–13; Andrew D. Selbst & Solon Barocas, The Intuitive Appeal of Explainable Machines, 87 FORDHAM L. REV. 1085 (2018). 52 Paul Ohm, The Fourth Amendment in a World Without Privacy, 81 MISS. L.J. 1309, 1316–18 (2012); see also Pauline T. Kim, Data-Driven Discrimination at Work, 58 WM. & MARY L. REV. 857 (2017). 53 Luciano Floridi, The Informational Nature of Personal Identity, 21 MINDS & MACHINES 549, 550 (2011); Mittelstadt, supra note 25, at 476. 54 Sandra Wachter, Privacy: Primus Inter Pares―Privacy as a Precondition for Self-Development, Personal Fulfilment and the Free Enjoyment of Fundamental Human Rights (Jan. 22, 2017) (unpublished manuscript), https://papers.ssrn.com/abstract=2903514 [https://perma.cc/R LB3-G6SC]. 55 Urteil des Ersten Senats vom BVerfG [Volkszählungsurteil’], ’15, Dezember 1983, 1 BvR 209/83 (Ger.), https://openjur.de/u/268440.html [https://perma.cc/DRS7-HNRZ]; Judgement of German Constitutional Court, BVerfG · Urteil vom 15. Dezember 1983 · Az. 1 BvR 209/83, 1 BvR 484/83, 1 BvR 420/83, 1 BvR 362/83, 1 BvR 269/83, 1 BvR 440/83 (Volkszählungsurteil). For a critical voice on this subject see Jan Klabbers, The Right to Be Taken Seriously: Self-Determination in International Law, 28 HUM. RTS. Q. 186 (2006). No. 2:494] A RIGHT TO REASONABLE INFERENCES 511 inferences drawn from the data.”56 The Article 29 Working Party has recognised a similar challenge, arguing that, “[m]ore often than not, it is not the information collected in itself that is sensitive, but rather, the inferences that are drawn from it and the way in which those inferences are drawn, that could give cause for concern.”57 The European Data Protection Supervisor (EDPS) has likewise expressed concern over the privacy risks of inferences and the need for governance.58 Similarly, NGOs and activist groups are aware of these concerns and have recently submitted numerous complaints to fight for more clarity on the legal and ethical acceptability of inferential analytics.59 The unpredictability of the analytics behind automated decision-making and profiling can itself be harmful to individuals. As noted in jurisprudence of the European Court of Human Rights (“ECHR”)60, the use of untraditional data 56 Omer Tene & Jules Polonetsky, Big Data for All: Privacy and User Control in the Age of Analytics, 11 NW. J. TECH. & INTELL. PROP. 239, 270 (2013) (emphasis in original). 57 Article 29 Data Prot. Working Party, Opinion 03/2013 on Purpose Limitation, at 47, 00569/13/EN, WP203 (Apr. 2, 2013), https://ec.europa.eu/justice/article-29/documentation/opinionrecommendation/files/2013/wp203_en.pdf [https://perma.cc/X6PC-825X]. 58 See European Data Prot. 
Supervisor, EDPS Opinion on Online Manipulation and Personal Data at 5, 8–16, Opinion 3/2018 (Mar. 19, 2018), https://edps.europa.eu/sites/edp/files/publication/18-0319_online_manipulation_en.pdf [https://perma.cc/3KJ6-VSUD]. 59 Johnny Ryan, Regulatory Complaint Concerning Massive, Web-Wide Data Breach by Google and Other “Ad Tech” Companies Under Europe’s GDPR, BRAVE (Sept. 12, 2018), https://www.brave.com/blog/ adtech-data-breach-complaint/ [https://perma.cc/3DFW-JZTX]; Our Complaints against Acxiom, Criteo, Equifax, Experian, Oracle, Quantcast, Tapad, PRIVACY INT’L (Nov. 8, 2018), http://privacyinternational.org/ advocacy-briefing/2426/our-complaints-against-acxiom-criteo-equifaxexperian-oracle-quantcast-tapad (on file with the Columbia Business Law Review); Privacy International Files Complaints Against Seven Companies for Wide-Scale and Systematic Infringements of Data Protection Law, PRIVACY INT’L (Nov. 8, 2018), http://privacyinternational.org/pressrelease/2424/privacy-international-files-complaints-against-sevencompanies-wide-scale-and (on file with the Columbia Business Law Review). 60 For an overview on the jurisprudence on the right of privacy of the ECHR to 2017, see Council of Europe, Case Law of the European Court of 512 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 sources to make unpredictable and counterintuitive inferences about people can impact on the freedom of expression, the right to privacy and identity,61 and selfdetermination of individuals.62 The ECHR has a longstanding tradition of linking the right to personality to the right of privacy.63 This link suggests that, to remain in control of their identity in the face of uncertainty, data subjects may alter their behavior (e.g. self-censorship) when using digital technologies.64 Such chilling effects linked to automated decision-making and profiling undermine self-determination and freedom of expression and thus warrant more control over the inferences that can be drawn about an individual. Without greater control, inferences can operate as “an autonomy trap.”65 Therefore, there is also a public and collective interest in the protection of privacy.66 Human Rights Concerning the Protection of Personal Data, T-PD(2017)23 (2017), https://rm.coe.int/case-law-on-data-protection/1680766992 [https://perma.cc/H4F2-9WVZ]. 61 For an in-depth discussion on identity and profiling, see PROFILING THE EUROPEAN CITIZEN (Mireille Hildebrandt & Serge Gutwirth eds., 2008); Antoinette Rouvroy, Privacy, Data Protection, and the Unprecedented Challenges of Ambient Intelligence, 2 STUD. ETHICS, L., & TECH. 1, 3–4 (2008). 62 Nora Ni Loideain, Surveillance of Communications Data and Article 8 of the European Convention on Human Rights, in RELOADING DATA PROTECTION 183, 199–200, 202–03 (Serge Gutwirth, Ronald Leenes & Paul De Hert eds., 2014); Wachter, supra note 54, at 5. 63 See generally Wachter, supra note 54. For a critical view on guidelines of the Council of Europe’s new privacy guidelines, see Alessandro Mantelero, Regulating Big Data. The Guidelines of the Council of Europe in the Context of the European Data Protection Framework, 33 COMP. L. & SECURITY REV. 584 (2017). 64 PEN AMERICA, CHILLING EFFECTS: NSA SURVEILLANCE DRIVES U.S. WRITERS TO SELF-CENSOR 3–4 (2013); Jonathon W. Penney, Chilling Effects: Online Surveillance and Wikipedia Use (Sept. 8, 2016), https://papers.ssrn.com/abstract=2769645 [https://perma.cc/FGW8WMVP]. 65 Tal Z. 
Zarsky, “Mine Your Own Business!”: Making the Case for the Implications of the Data Mining of Personal Information in the Forum of Public Opinion, 5 YALE J.L. & TECH. 1, 35 (2002–03). 66 See generally Priscilla M. Regan, Privacy as a Common Good in the Digital World, 5 INFO., COMM’N. & SOC’Y 382 (2002). No. 2:494] A RIGHT TO REASONABLE INFERENCES 513 The tendency in mature information societies 67 to create, share, sell, and retain data, profiles, and other information about individuals presents additional challenges. Persistent records can be created through inferential analytics, consisting of unpredictable and potentially troubling inferences revealing information and predictions about private life, behaviors, and preferences that would otherwise remain private.68 Compared to prior human and bureaucratic decision-making, the troubling change posed by the widespread deployment of Big Data analytics is that the profile or information “at the basis of the choice architecture offered” to individuals need not be held and used by a single third-party for a specific purpose, but rather “persists over time, travels with the person between systems and affects future opportunities and treatment at the hands of others.”69 These tendencies contribute to the solidification of identity and reputation, undermining the individual’s right “to be allowed to experiment with one’s own life, to start again, without having records that mummify one’s personal identity forever.”70 Inferential analytics thus pose substantial and novel risks not only to identity, but to reputation and the choices offered to an individual by data-driven services. While the potential harms of inferences have been recognized by European legal scholars and policy-makers, data protection law and its procedural approach have not yet caught up. Data subjects receive little help in coming to terms with the informativeness of the data they provide to controllers, who are generally not legally obligated to disclose or justify their criteria and methods used to draw inferences and make decisions based upon them. 71 Rather, the default procedural approach in European data protection law to 67 Luciano Floridi, Mature Information Societies—a Matter of Expectations, 29 PHIL. & TECH. 1, 1 (2016). 68 See generally Mittelstadt & Floridi, supra note 23. 69 Mittelstadt, supra note 25, at 482. 70 Luciano Floridi, Four Challenges for a Theory of Informational Privacy, 8 ETHICS & INFO. TECH. 109, 112 (2006). 71 See infra Part IV. See generally Tene & Polonetsky, supra note 56 (arguing that decision-making criteria of companies should be disclosed). 514 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 protect the privacy of individuals is to grant oversight and control over how personal data is collected and processed. In other words, data protection law focuses primarily on mechanisms to manage the input side of processing. As will be explained below,72 the few mechanisms in European data protection law that address the outputs of processing, including inferred and derived data, profiles, and decisions, are far weaker. In the age of Big Data analytics, a myopic focus on input data in data protection law is troubling. The outputs of processing pose risks to individuals, yet data subjects are granted far less control over how these outputs are produced and used. Currently, individuals are not guaranteed awareness of potentially problematic decision-making and will often lack a legal basis to examine the decision-making process for problems in the first place. 
This situation is a result of the uncertain legal status of inferences and the scope of applicable control mechanisms in data protection law. Transparency and consent mechanisms designed to manage input data are no longer sufficient; rather, the spread of inferential Big Data analytics requires a reaction in data protection law, by which meaningful control and choice over inferences and profiles are granted to data subjects.73 As Judge Posner eloquently argues, "A seldom-remarked corollary to a right to misrepresent one's character is that others have a legitimate interest in unmasking the deception."74 This Article argues that the introduction of a right to reasonable inferences is precisely the type of reaction required.

72 See infra Parts IV, V.
73 See Serge Gutwirth & Paul De Hert, Regulating Profiling in a Democratic Constitutional State, in PROFILING THE EUROPEAN CITIZEN 271 (Mireille Hildebrandt & Serge Gutwirth eds., 2008); see also Ronald Leenes, Addressing the Obscurity of Data Clouds, in PROFILING THE EUROPEAN CITIZEN 293 (Mireille Hildebrandt & Serge Gutwirth eds., 2008) (also discussing the need for transparent decision-making processes).
74 Richard A. Posner, The Right of Privacy, 12 GA. L. REV. 393, 395 (1978).

III. ARE INFERENCES PERSONAL DATA?

To grant data subjects broadly applicable, non-sectoral rights over their inferences under data protection law, inferences must be seen as personal data. This Part defines inferences as information relating to an identified or identifiable natural person created through deduction or reasoning rather than mere observation or collection from the data subject. The types of inferences discussed here are "high risk inferences," which are created or used by data controllers or third parties, are privacy-invasive or harmful to reputation—or have a high likelihood of being so in the future—or have low verifiability in the sense of being predictive or opinion-based while being used for important decisions.75

Several distinctions between "types" of personal data relevant to the legal status of inferences are evident in the GDPR itself as well as guidance issued by the Article 29 Working Party. Article 4 of the GDPR defines personal data as "any information relating to an identified or identifiable natural person."76 Article 9(1) of the GDPR makes a further distinction between normal or non-sensitive personal data, and "special categories" of personal data that pertain to "racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person's sex life or sexual orientation[.]"77 Sensitive personal data incurs additional restrictions on processing under Article 9(2–4).78 If inferences are personal data, this distinction between sensitive and non-sensitive types, and the higher standard of protection afforded to the former, will also apply.

75 See supra Section II.A; see also infra Part VI.
76 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation), 2016 O.J. (L119) art. 4.
77 Id. at art. 9(1).
78 Id. at art. 9(2–4).
The Article 29 Working Party further distinguishes between provided and observed data on the one hand, and derived and inferred data on the other.79 Provided data includes any data that the data subject has directly provided to the data controller, for example the user’s name or email address.80 Observed data is also “provided by” the data subject, but indirectly or passively, including things such as location data, clicking activity, or unique aspects of a person’s behavior such as handwriting, keystrokes, or a particular way of walking or speaking.81 In contrast, derived (e.g. country of residency derived from the subject’s postcode) and inferred data (e.g. credit score, outcome of a health assessment, results of a personalization or recommendation process) are not “provided by” the data subject actively or passively, but rather created by a data controller or third party from data provided by the data subject and, in some cases, other background data.82 The Article 29 Working Party’s guidelines on data portability provide examples of personal data derived from non-traditional sources, such as data produced “from the observation of [a user’s] behaviour,” including clicking or browsing behavior and the inferences drawn from it.83 Additionally, their guidelines on profiling and automated decision-making argue that “profiling... works [by] creating derived or inferred data about individuals – ‘new’ personal 79 Art. 29 Data Prot. Working Party, supra note 17, at 8; Article 29 Data Prot. Working Party, Guidelines on the Right to Data Portability, 16/EN, WP242rev.01, at 9–11 (Dec. 13, 2016), https://ec.europa.eu/newsroom/document.cfm?doc_id=44099 (on file with the Columbia Business Law Review). 80 Id. at 9. 81 Article 29 Data Prot. Working Party, Opinion 4/2007 on the Concept of Personal Data, 01248/07/EN WP136, at 8 (June 20, 2007) http://ec.europa.eu/justice/article-29/documentation/opinionrecommendation/files/2007/wp136_en.pdf (on file with the Columbia Business Law Review). 82 Article 29 Data Prot. Working Party, supra note 79, at 10–11. 83 See id at 10, 10 n.20, 21. Note that inferences are not covered by Article 20, but rather by Article 15. No. 2:494] A RIGHT TO REASONABLE INFERENCES 517 data that has not been provided directly by the data subjects themselves.”84 Clearly, if inferences can be considered personal data, they are of the latter type: derived or inferred.85 A. Three-Step Model To determine whether data is “personal data,” the Article 29 Working Party86 has proposed a three-step model. According to this model, the content, purpose, or result 87 of the data (processing) must relate to an identifiable person either directly or indirectly.88 This approach allows for nonpersonal data to be transformed into personal data through linkage to an identified individual.89 For example, the value of a house can become personal data used to assess individuals, such as the amount of their tax obligations.90 Due to technical affordances, some commentators have argued 84 See supra note 17, at 9; see also note 79, at 9–10 (referring to “observed data” such as “activity logs, history of website usage or search activities”). 85 See Martin Abrams, The Origins of Personal Data and its Implications for Governance (Nov. 24, 2014) (unpublished manuscript), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2510927 [https://perma.cc/9YZ5-FT96] (discussing the differences between derived and inferred data). 86 See generally Article 29 Data Prot. 
Working Party supra note 81; for an overview of EU jurisprudence on the definition of personal data, see Nadezhda Purtova, The Law of Everything. Broad Concept of Personal Data and Future of EU Data Protection Law, 10 LAW INNOVATION & TECH 40 (2018). 87 See Article 29 Data Prot. Working Party, supra note 79, at 10 (defining purpose as “to evaluate, treat in a certain way or influence the status or behaviour of an individual”). 88 See id. at 11. 89 For an excellent overview of the concept of personal data, see Douwe Korff, Data Protection Laws in the EU: The Difficulties in Meeting the Challenges Posed by Global Social and Technical Developments (Eur. Comm’n. Directorate-General Justice, Freedom & Sec., Working Paper No. 2, 2010), http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1638949 [https://perma.cc/8JUL-5S6L]. 90 See Article 29 Data Prot. Working Party, supra note 81, at 9. 518 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 that it is difficult to locate data that cannot potentially be transformed into personal data. 91 The third step of the model, ‘result’, is key to the legal status of inferences.92 The Article 29 Working Party argues that data being “likely to have an impact on a certain person’s rights and interests”93 is sufficient for it to be treated as personal data. In practice, this means that even if the data does not directly describe an identifiable person (“content”), or is not “used or... likely to be used... [to] evaluate, treat in a certain way or influence the status or behaviour”94 of the person (“purpose”), it can still be classified as “personal data” based on its potential impact on an identifiable person’s rights and interests (“result”).95 Information that is not directly readable from the data collected, but rather derived or inferred from it, can thus also be considered personal data. This conclusion is further supported by the usage of the term “any information” in Article 4(1) of the GDPR; identical language was used to define “personal data” in the 1995 Data Protection Directive (95/46/EC), which the Article 29 Working Party has previously taken as evidence of legislators’ intent to have a very wide definition of “personal data”. 96 They argue that personal data includes ‘subjective’ “information, opinions, or assessments” 97 relating to an identified or identifiable natural person in terms of content, purpose, or result. Further, such information does not need to be “true or proven.”98 This position is implicitly supported by the Article 29 Working Party granting rights to data subjects “to access 91 Stefan Ernst, Begriffsbestimmungen, in, DATENSCHUTZGRUNDVERORDNUNG BUNDESDATENSCHUTZGESETZ (Boris Paal & Daniel A Pauly eds., 2018). 92 See Korff, supra note 89, at 52–53 (arguing that profiles, understood as bundles of inferences and assumptions, should be treated as personal data). 93 See Article 29 Data Prot. Working Party, supra note 81, at 11. 94 Id. at 10. 95 Id. at 10–11. 96 Id. at 4. 97 Id. at 6. 98 Id. No. 2:494] A RIGHT TO REASONABLE INFERENCES 519 that information and to challenge it through appropriate remedies,”99 for example by providing additional comments.100 Several other guidelines issued by the Working Party similarly argue that certain individual rights apply to inferred and derived data, which by definition means these must be personal data.101 B. 
Subjectivity and Verifiability Inferences are often precisely these types of subjective and non-verifiable “information, opinions, or assessments” 102 created by a third party through more than mere observation of the data subject. Several examples of such subjective or non-verifiable personal data are provided by the Article 29 99 Id. 100 Id. at 6 n.5. 101 See Guidelines, supra note 17, at 17–18. Guidelines clarifies that the rights to rectification, erasure, and restriction of processing apply to inferred and derived data. Id.; see also Article 29 Data Prot. Working Party, supra note 81, at 11. Here, following the text of Article 20(1) of the GDPR, they clarify that the right to data portability covers only data “provided by” the data subject: “a personalisation or recommendation process, by user categorisation or profiling are data which are derived or inferred from the personal data provided by the data subject, and are not covered by the right to data portability.” Derived and inferred data thus do not fall within the scope of data portability. In practice, this means that Art. 20 only covers data provided by the data subject or observed by the controller but not the profile itself or other inferred and derived data. This could be taken to suggest that derived and inferred data are not a type of personal data on the basis that an individual data protection right (Art. 20), which by definition applies to personal data, does not apply to these types of data. This interpretation is incorrect. Footnote 20 accompanying the preceding quote clarifies that although Art. 20 does not apply, Art. 15 and 22 still apply to inferred and derived data. By definition, for these other Articles to apply, the data being processed needs to be personal data. The Guidelines therefore endorse classifying inferred and derived data as personal data, albeit indirectly. These limits on data portability are sensible, as the right is designed as a competition tool, not a data privacy tool. See also Paul De Hert, Vagelis Papakonstantinou, Gianclaudio Malgieri, Laurent Beslay & Ignacio Sanchez, The Right to Data Portability in the GDPR: Towards User-Centric Interoperability of Digital Services, 34 COMPUTER L. & SECURITY REV. 193 (2018). 102 See Article 29 Data Prot. Working Party, supra note 81, at 6. 520 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 Working Party. Concerning subjectivity, examples of subjective assessments are provided for several sectors: in banking, “assessment of the reliability of borrowers (“Titius is a reliable borrower”); in insurance (“Titius is not expected to die soon”) or in employment (“Titius is a good worker and merits promotion”).” 103 Such subjective third-party assessments can be considered a type of inference, as the assessment involves inferring a non-observed characteristic or subjective opinion of the subject from data already held.104 Concerning non-verifiability, a second example is provided of a child’s drawing depicting her family and her mood towards them.105 Such a drawing, although created by the child, can allow for information about the behaviors of the child’s parents to be inferred. As a result, the drawing itself, and any information about her parents’ behavior inferred from it, is classified as the parents’ personal data. Such inferences are not necessarily verifiable, and are subjective due to interpretation being required to derive information about the parents’ behaviors.
106 Each of these examples shows that the Article 29 Working Party believes opinions and assessments, understood here as inferences, do not need to be objective or verifiable to be considered personal data. Several legal commentators have reached similar conclusions. Ernst, for example, argues that predictions and inferences about a data subject constitute personal data irrespective of their timeframe or whether they address the past, present, or future.107 By definition, predictions cannot be verified at the time they are made, but can nonetheless describe an identified or identifiable person. Klabunde similarly believes that assumptions and Id. For a discussion of opinions and assessments being classified as personal data under EU data protection law, see generally Korff, supra note 89. 105 See Article 29 Data Prot. Working Party, supra note 81, at 8. As such, the child’s parents can exercise their right of access in relation to the drawing. Id. 106 Id. 107 Ernst, supra note 91, at 14–18. 103 104 No. 2:494] A RIGHT TO REASONABLE INFERENCES 521 assessments are also personal data, irrespective of whether they are accurate or verifiable.108 IV. JURISPRUDENCE OF THE EUROPEAN COURT OF JUSTICE While the legally non-binding guidelines of the Article 29 Working Party clearly endorse the view that inferences are personal data, the legally binding jurisprudence of the European Court of Justice (ECJ) is less generous in its interpretation. Even though the ECJ also believes in a broad interpretation of the concept of personal data, the Court has historically held a more restricted view of the scope of “personal data” and applicable rights. 109 Two recent cases (YS. and M. and S.110, and Nowak111) are particularly relevant to determining the legal status of inferences and the remit of data protection law more broadly. A. Joined Cases C-141/12 and C-372/12: YS and M and S YS and M and S addressed whether an applicant has a right to access the legal analysis (or “information about the assessment and application” 112) underlying a decision of legal residency. The ECJ’s judgement 113 and the associated opinion 108 Achim Klabunde, Begriffsbestimmungen, in DATENSCHUTZGRUNDVERORDNUNG BUNDESDATENSCHUTZGESETZ 7–8 (Eugen Ehmann & Martin Selmayr eds., 2017). 109 For an in-depth overview of the ECJ’s concept of personal data, see Case C-101/01 Lindqvist E.C.R. I-12971, ¶ 24; Joined Cases C465/00, C-138/01 and C-139/01 Österreichischer Rundfunk and Others E.C.R. I-4989, ¶ 64; Case C-73/07 Satakunnan Markkinapörssi and Satamedia E.C.R. I-9831, ¶¶ 35, 37; Case C-524/06 Huber E.C.R. I-9705, ¶ 43; and Case C-553/07 Rijkeboer E.C.R. I-3889, ¶ 62. 110 See supra notes 3–4 and accompanying text. 111 See supra notes 3–4 and accompanying text. 112 Joined Cases C–141 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2014 E.C.R. I- 2081, ¶ 40. 113 For in-depth analyses of the judgment, see Evelien Brouwer & Frederik Zuiderveen Borgesius, Access to Personal Data and the Right to Good Governance During Asylum Procedures after the Cjeu’s YS. and M. and 522 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 of the Advocate General114 in this case suggest a troubling direction of travel for the protection of data subjects for three reasons: (1) the limited scope of personal data; (2) the limited rights of access and rectification; and (3) the view that data protection law does not aim to ensure accurate or lawful decision-making, and thus does not govern how inferences are drawn in decision-making processes. 1. 
Inferences as Personal Data The ECJ ruled “that the data relating to the applicant for a residence permit contained in the minute [a document containing the reasoning of the case officer] and, where relevant, the data in the legal analysis contained in the minute are ‘personal data’ within the meaning of that provision, whereas, by contrast, that analysis cannot in itself be so classified.”115 This ruling indicates that only the personal data contained or used within the legal analysis, but not the analysis itself, is personal data subject to protection under the 1995 Data Protection Directive. Specifically, the ECJ noted that only the “name, date of birth, nationality, gender, ethnicity, religion and language of the applicant,”116 or only data that is “about” the data subject are personal data.117 This judgement is interesting because historically the Court has been predominantly asked to rule on the legal status of observations or verifiable data (e.g. “facts” about a person), not assessments or non-verifiable data.118 Examples S. Judgment, 17 EUR. J. MIGRATION & L. 259 (2015); Xavier Tracol, Back to Basics: The European Court of Justice Further Defined the Concept of Personal Data and the Scope of the Right of Data Subjects to Access It, 31 COMPUTER L. & SECURITY REV. 112 (2015); see also Purtova, supra note 86. 114 See generally supra note 4. 115 See Joined Cases C–141 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2014 E.C.R. I- 2081, ¶ 48. 116 Id. ¶ 38. 117 See Purtova, supra note 86, at 28. 118 Of course, one must keep in mind that the Court can only rule on the cases referred to it, and thus the Court has no power to take views that fall outside the cases it considers. No. 2:494] A RIGHT TO REASONABLE INFERENCES 523 of personal data named in prior judgements include “telephone [numbers], and information about his/her working conditions or hobbies,”119 “the surname and given name of certain natural persons whose income exceeds certain thresholds” as well as “their earned and unearned income,”120 “IP addresses,”121 “fingerprints,”122 “record of working time... and... rest periods,”123 “data... collected by... private detectives,”124 “image of a person recorded by a camera,”125 “tax data,”126 and “press releases.”127 In contrast, in YS and M and S, the ECJ addressed whether legal analysis can be considered personal data. This determination is incredibly relevant for the legal status of inferences. A legal analysis is comparable to an analysis of personal data where new data is derived or inferred. Such analysis can consist of multiple inferences connected to an identified or identifiable individual (i.e. assessment of how the 119 Case C-101/01, Criminal Proceedings Against Bodil Lindqvist, 2003 E.C.R. I-12992. 120 Case C-73/07, Tietosuojavaltuutettu v. Satakunnan Markkinapörssi Oy & Satamedia Oy, 2008 E.C.R. I-09831. 121 Case C-70/10, Scarlet Extended SA v. Société Belge des Auteurs, Compositeurs et Éditeurs SCRL (SABAM), 2011 E.C.R. I-12006; Case C‑582/14, Patrick Breyer v. Bundesrepublik Deutschland, 2016 E.C.R. I-779 (stating that “all the information enabling the identification” does not need to be in the “hands of one person”). 122 Case C-291/12, Michael Schwarz v. Stadt Bochum, 2013 E.C.R. I670. 123 Case C-342/12, Worten–Equipamentos para o Lar SA v. Autoridade para as Condições de Trabalho (ACT), 2013 E.C.R. I-355. 124 Case C‑473/12, Institut professionnel des agents immobiliers (IPI) v. Geoffrey Englebert, 2013 E.C.R. I-715. 
125 Case C-212/13, František Ryneš v. Úřad pro Ochranu Osobních údajů, 2014 E.C.R. I-2428, ¶ 22. 126 Case C-201/14, Smaranda Bara and Others v. Preedintele Casei Naionale de Asigurări de Sănătate, 2015 E.C.R. I-638, ¶ 29. 127 LARAINE LAUDATI, EUROPEAN ANTI-FRAUD OFFICE, SUMMARIES OF EU COURT DECISIONS RELATING TO DATA PROTECTION 2000–2015, at 32 (2016), https://ec.europa.eu/anti-fraud/sites/antifraud/files/caselaw_2001_ 2015_en.pdf [https://perma.cc/DLF9-XBMG] (discussing Case T-259/03, Kalliopi Nikolaou v. Comm’n of the European Communities, 2007 E.C.R. I254). 524 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 law applies to a case), leading to a final opinion, result, or inference (i.e. the applicant does not meet the required standards of residency), and followed by a decision or action (i.e. denial of legal residency). Three issues arose: (1) is the legal analysis, and the inferences drawn within it, personal data, (2) are the final opinions, results, or inferences about an identifiable individual resulting from the analysis personal data; and (3) is the consequent decision or action personal data? The ECJ’s judgement makes clear that the first question must be answered in the negative, meaning the analysis and constituent inferences are not considered personal data.128 The ECJ, as opposed to the AG, does not distinguish between the legal analysis and the resulting opinions, results, or inferences created in the processing.129 As a result, no answer is provided to the second question. Finally, the ECJ does not address the third question. An alternative view, potentially inspired by the AG’s distinction between medical analysis and results,130 could be that the analysis is not equivalent to inferences, but rather the reasoning or logic that leads to the inference. First, it must be noted that this distinction only appears in a footnote in the opinion131 and was not taken up by the ECJ in this case or in the Nowak case.132 Second, the reasoning leading to an inference might be better conceived as a cognitive process, while the analysis is regarded as the recorded output of the reasoning. It is difficult to imagine the reasoning or logic in a “legal analysis” not involving the creation of inferences about the applicant’s case. Even if one wishes to argue that this is not the case, meaning the legal analysis is merely the reasoning leading to inferences, the outcome of this Article’s argument would not change as the problems remain the same. Regardless of how broadly one defines “inference,” the rights 128 Cases C-141/12 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2014 E.C.R. I-2081, ¶ 39, 48. 129 Cases C-141/12 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2013 E.C.R. I-838, at ¶ 49 n.40. 130 Id. 131 Id. 132 See infra Section IV.B.1. No. 2:494] A RIGHT TO REASONABLE INFERENCES 525 granted over inferred or derived personal data are very limited.133 The main concern addressed by this Article remains the limited rights, control, and recourse given to individuals over inferences, or how they are analyzed and assessed by third parties. In this regard the judgement followed the opinion of the Advocate General (AG).134 The AG defines legal analysis as “the legal classification of facts relating to an identified or identifiable person... 
and their assessment against the background of the applicable law,” 135 or “the reasoning underlying the resolution of a question of law.” 136 Based on this definition, legal analysis cannot be considered personal data, as she argues that “only information relating to facts about an individual can be personal data,” 137 and thus a “legal analysis is not itself personal data.” 138 To unpack the distinction between facts (as personal data) and analysis, the AG used the example of information describing a person’s weight. Allowing that “facts” can be described in “objective” (e.g. kilos) or “subjective” (e.g. “underweight,” “obese”) terms,139 she argued that “the steps of reasoning by which the conclusion is reached that a person is ‘underweight’ or ‘obese’ are not facts, any more than legal analysis is.” 140 As a result, legal analysis, and more broadly “the steps of reasoning by which [a] conclusion is reached”141 about an individual, cannot be considered personal data.142 133 See infra Sections IV.A.2, IV.B.2, and Part V. 134 Cases C-141/12 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2013 E.C.R. I-838. 135 Id. ¶ 54. 136 Id. ¶ 59. 137 Id. ¶ 56. 138 Id. ¶ 61. 139 For a discussion of objective and subjective communication of facts, see id. ¶ 57. 140 Id. ¶ 58. 141 Id. ¶ 58. 142 Id. ¶¶ 58–59. 526 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 The distinction made here between describing a person as underweight or obese and “the steps of reasoning by which the conclusion is reached”143 is important for answering the second question. Elsewhere in the opinion, the AG suggests that it is unhelpful “to distinguish between ‘objective’ facts and ‘subjective’ analysis,” 144 as “[f]acts can be expressed in different forms, some of which will result from assessing whatever is identifiable.” 145 Assessments themselves, insofar as they can be considered a subjective expression of a fact, may therefore be considered personal data. Supporting this, the AG admits that she cannot “exclude the possibility that assessments and opinions may sometimes fall to be classified as [personal] data.” 146 In this example, the AG clearly distinguishes between facts or outputs of an assessment process (i.e. an “assessment” or “opinion”), and the process itself (i.e. the “reasoning”).147 The positions taken by the ECJ and AG in YS and M and S appear to be at odds with the view of the Article 29 Working Party.148 According to their three-step model, personal data is not limited to data about an identified or identifiable individual. Rather, data that has the purpose of assessing the data subject, or that results in an effect on the data subject, must also be considered personal data. In her opinion, the AG even refers to the Article 29 Working Party’s guidelines on the concept of personal data (which she notes are not legally binding). She explains that the Article 29 Working Party document only attributes personal data status to “results of a medical analysis,” 149 but leaves open how the analysis or reasoning leading to the assessment should be classified. 143 Id. ¶ 58. 144 Id. ¶ 57. 145 Id. 146 Id. 147 See id. ¶¶ 57–59 (“However, the steps of reasoning by which the conclusion is reached that a person is ‘underweight’ or ‘obese’ are not facts, any more than legal analysis is,” and “[t]he explanation itself is not information relating to an identified or identifiable person.”). 148 See supra Part III. 149 Id. ¶ 49, n.40 (emphasis in original). No.
2:494] A RIGHT TO REASONABLE INFERENCES 527 Interestingly enough, the AG also leaves open how results of the analysis (the second question) should be classified, even though it seems highly unlikely that the outputs of analysis underlying a residency decision (i.e. inferences about the application) and the decision itself are not considered personal data. The AG’s definition of personal data as “facts about an individual,”150 and the irrelevance of whether such facts are stated in objective or subjective terms, suggests that she views verifiability as a necessary component of personal data. A troubling sort of test for personal data based upon verifiability can be inferred, wherein assessments and opinions can be classified as personal data only if they meet some unnamed threshold, or are sufficiently based upon verifiable facts to be considered a “subjective statement” of these facts. Where this threshold lies remains unclear. 2. Remit of Data Protection Law Another troubling aspect of the ruling is the position taken by the ECJ on the remit of data protection law. The ECJ argued that the purpose of data protection law is not to assess the accuracy of decision-making processes involving personal data. On this basis, the applicants’ requests for access were denied, as their intention was to assess the accuracy of an assessment of personal data. The ECJ argued that, rather than data protection law, other laws applicable to the specific case should be consulted to assess whether the decision-making procedure is accurate. Specifically, the ECJ stated that: In contrast to the data relating to the applicant for a residence permit which is in the minute and which may constitute the factual basis of the legal analysis contained therein, such an analysis... is not in itself liable to be the subject of a check of its accuracy by that applicant and a rectification under Article 12(b) of Directive 95/46... extending the right of access of the applicant for a residence permit to that legal 150 Id. ¶ 56. 528 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 analysis would not in fact serve the directive’s purpose of guaranteeing the protection of the applicant’s right to privacy with regard to the processing of data relating to him, but would serve the purpose of guaranteeing him a right of access to administrative documents, which is not however covered by Directive 95/46.151 YS and M and S is not the first time that the ECJ has claimed that data protection law (when personal data is processed by community institutions and bodies), and the right of access in particular, is not designed to provide access to or facilitate assessments of the accuracy of decision-making processes.152 In European Commission v. Bavarian Lager, the ECJ ruled that:... when examining the relationship between Regulations Nos 1049/2001 and 45/2001 for the purpose of applying the exception under Article 4(1)(b) of Regulation No 1049/2001 to the case in point, it must be borne in mind that those regulations have different objectives. The first is designed to ensure the greatest possible transparency of the decision-making process of the public authorities and the information on which they base their decisions. It is thus designed to facilitate as far as possible the exercise of the right of access to documents, and to promote good administrative practices.
The second is designed to ensure the protection of the freedoms and fundamental rights of individuals, particularly their private life, in the handling of personal data. 153 In YS and M and S, the ECJ referred to Bavarian Lager and explained the overall aim, remit, and purpose of data protection law: Regulation No 45/2001 is not designed to ensure the greatest possible transparency of the decision-making 151 Joined Cases C-141/12 & 372/12, YS v. Minister voor Immigratie, Integratie en Asiel, 2014 E.C.R. I-2081, ¶¶ 45–46. 152 See Case C-28/08 P, European Comm’n v. Bavarian Lager, 2010 E.C.R. I-6055. 153 Id. ¶ 49. No. 2:494] A RIGHT TO REASONABLE INFERENCES 529 process of the public authorities and to promote good administrative practices by facilitating the exercise of the right of access to documents. That finding applies equally to Directive 95/46, which, in essence, has the same objective as Regulation No 45/2001. 154 Thus, data protection law in general, and the right of access in particular, are not designed to provide full transparency in decision-making involving personal data, or to guarantee “good administrative practices.” 155 These particular limits on the right of access are not one-off. In College van burgemeester en wethouders van Rotterdam v. M. E. E. Rijkeboer, the ECJ ruled that the right of access is limited to providing information regarding the scope of data undergoing processing (which is necessary to rectify or erase this data), to verify the lawfulness of processing, or to object to processing.156 They covered similar territory in YS and M and S, arguing that full access to personal data does not need to be granted under the right of access. 157 Rather, as the ECJ held in YS and M and S, “it is sufficient that the applicant be in possession of a full summary of those data in an intelligible form, that is to say a form which allows that applicant to become aware of those data and to check that they are accurate and processed in compliance with that directive[.]”158 The AG, like the ECJ, views the remit of data protection law in a very limited way. She views legal analysis as not falling “within the sphere of an individual’s right to privacy,”159 and cannot see a “reason to assume that that individual is himself uniquely qualified to verify and rectify it 154 Joined Cases C-141/12 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en, 2014 E.C.R. I-2081, ¶ 47. 155 Id. 156 Case C-553/07, College van burgemeester en wethouders van Rotterdam v. M. E. E. Rijkeboer, 2009 E.C.R. I-3889, ¶¶ 51–52. 157 Joined Cases C-141/12 & 372/12, YS v. Minister voor Immigratie, Integratie en Asiel, 2014 E.C.R. I-2081, ¶ 44. 158 Id. ¶ (70)2. 159 Cases C-141/12 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en, 2013 E.C.R. I-838, ¶ 60. 530 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 and ask that it be erased or blocked.” 160 She does admit that data subjects have a valid interest in “knowing exactly what circumstances were relevant to the decision taken,”161 but believes this interest does not fall under the scope of data protection law because it does not “cover opinions and other measures taken during the preparation and investigation” of a case.162 Instead, review of “the decision for which... legal analysis was prepared” 163 should be left to a relevant “independent judicial authority.”164 Data subjects are thus seen to have a valid interest in the accuracy of decisions taken about them, but lack an equivalent right of review.
This is a very troubling view and relates to the discussion above of legal and ethical decision-making standards.165 First, a legal analysis contains the (interim) inferences, assumptions or opinions underlying final inferences and subsequent decisions. Excluding access and review of such analysis from the scope of data protection law means data subjects are unable to assess how potentially highly impactful inferences and decisions are made about them,166 unless relevant sectoral laws allow them to do so. Second, requiring only a summary of personal data undergoing processing to be shared with the data subject via the right of access severely limits the data subject’s ability to Id. Id. ¶ 36. 162 Id. ¶ 32. 163 Id. ¶ 60. 164 Id. 165 See supra Part II. 166 See Douwe Korff, The Proposed General Data Protection Regulation: Suggested Amendments to the Definition of Personal Data, EU LAW ANALYSIS (Oct. 15, 2014), http://eulawanalysis.blogspot.com/2014/10/theproposed-general-data-protection.html [https://perma.cc/SRY9-JDW8]; Robert Madge, Five Loopholes in the GDPR, MEDIUM (Aug. 27, 2017), https://medium.com/mydata/five-loopholes-in-the-gdpr-367443c4248b [https://perma.cc/L8EM-8YPM]; Steve Peers, Data Protection Rights and Administrative Proceedings, EU LAW ANALYSIS (Jul. 17, 2014), http://eulawanalysis.blogspot.com/2014/07/data-protection-rights-and.html [https://perma.cc/69YU-8U9H]. 160 161 No. 2:494] A RIGHT TO REASONABLE INFERENCES 531 assess lawfulness of data processing and the accuracy of their personal data used to make the decision. Third, the limited remit of data protection law is alarming. It might be the case that generally applicable decision-making standards exist in the public sector based on democratic legitimacy,167 but comparable broadly applicable standards are less likely to govern the private sector. Even though the decision-making autonomy of private entities is bound by certain laws (e.g. anti-discrimination law), companies are less likely than the public sector to have legally binding procedures or rules they need to follow when making decisions. The spread of Big Data analytics and the resulting increase in the capacity of data controllers to infer information about the private lives of individuals, modify and solidify their identity, and affect their reputation, suggest that a higher level of protection is required than has previously been the case for human and bureaucratic decision-making. Thus, according to the ECJ, when a private company draws inferences from collected data or makes decisions based on them, even if the final inferences or decisions are seen as personal data, data subjects are unable to rectify them under data protection law. Data subjects also lack access to the reasoning underlying the decisions, which is not considered personal data, as well as means to rectify the analysis under data protection law. B. Case C-434/16: Nowak The ECJ’s view in YS and M and S seems to be partly at odds with its later ruling in Peter Nowak v. Data Protection Commissioner168 in December 2017. In the case, an exam candidate (Mr. Nowak) requested to exercise his right of access and “correction” in relation to his marked exam script.169 As with YS and M and S, the case centered on the See generally De Hert & Gutwirth, supra note 73, at 271, 276–77. Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I994, ¶ 60; see also Purtova, supra note 86, at 66–67. 169 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I582, ¶¶ 9–13. 
167 168 532 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 question of whether opinions and assessments, in this case an exam script and the comments of an assessor, constitute personal data. 1. Inferences as Personal Data The ECJ determined that both the exam script and comments of the assessor are the candidate’s personal data. In making this determination, the ECJ referred to a broad definition of personal data, which includes data “in the form of opinions and assessments, provided that it ‘relates’ to the data subject.”170 Specifically, the Court determined that an opinion or assessment that is “linked to a particular person” by “reason of its content, purpose or effect” counts as personal data.171 Both the answers provided by the candidate and the comments made by an assessor on the exam script were deemed personal data on this basis. 172 The ECJ argued that the assessment, comments and evaluation of the candidate can have an “effect” on him and his private life, and are thus his personal data.173 It is worth noting, however, that exam questions were not considered the candidate’s personal data.174 The AG held a similar view, arguing that “the personal data incorporated in an examination script is not confined to the examination result, the mark achieved or even points scored for certain parts of an examination. That marking merely summarises the examination performance, which is recorded in detail in the examination script itself.”175 The ECJ also considered whether the interests of other parties can influence the classification of data as personal data. They responded in the negative, arguing that the fact 170 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I994, ¶ 34. 171 Id. ¶¶ 34–35. 172 Id. ¶¶ 42, 44. 173 Id. 174 Id. ¶ 58. 175 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I582, ¶ 27. No. 2:494] A RIGHT TO REASONABLE INFERENCES 533 that the assessment of the assessor also constitutes his or her personal data cannot block classification of the assessment as the candidate’s personal data. 176 Further, both the ECJ and AG argued that the fact that certain rights like the right of access or rectification might be exercised due to the classification of the exam answers and the comments as personal data is, in fact, irrelevant to making such a classification, even if their exercise would otherwise be thought undesirable.177 The status of personal data should therefore not be denied based on the data subject potentially exercising the right of rectification in an unintended way (i.e. correcting answers after the fact). 2. Remit of Data Protection Law While the ECJ acknowledged in Nowak that opinions and assessments can be personal data, they did however note that the ability to fully exercise relevant individual data protection rights does not automatically follow from this classification. Rather, the ECJ argued that the scope of the rights attached to personal data have to be interpreted teleologically, with reference to both the aims of data protection law and the purpose for which the data was collected and processed. 178 In other words, the scope of data protection rights must be interpreted contextually, or with reference to the specific purposes for which data was collected, and the broader aims of data protection law. This means that the reason for which this data is collected defines the data protection rights. 
In this context, the data subject has asked to be assessed, and the situation is therefore inherently antagonistic: the data subject cannot rectify how they are being assessed, apart from ensuring that their input data was complete. 176 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I994, ¶ 44. 177 Id. ¶ 46; Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I-582, ¶¶ 31, 34. 178 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I994, ¶ 53. 534 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 For an exam script, the rights of access and rectification should not result in the candidate being allowed to correct answers a posteriori.179 A sensible use of the right of rectification in this context allows the candidate to discover whether by mistake, the examination scripts were mixed up in such a way that the answers of another candidate were ascribed to the candidate concerned, or that some of the cover sheets containing the answers of that candidate are lost, so that those answers are incomplete, or that any comments made by an examiner do not accurately record the examiner’s evaluation of the answers of the candidate concerned.180 Thus, the right of rectification was not taken to cover the content of the assessor’s comments, which can be understood as a type of inference about the candidate’s performance based on his answers.181 The AG’s opinion aligned closely with the ECJ on the teleological interpretation of data protection rights. The AG argued that allowing the candidate to rectify answers after 179 Id. ¶¶ 51–52. 180 Id. ¶ 54. 181 Id. ¶ 56 (“In so far as written answers submitted by a candidate at a professional examination and any comments made by an examiner with respect to those answers are therefore liable to be checked for, in particular, their accuracy and the need for their retention, within the meaning of Article 6(1)(d) and (e) of Directive 95/46, and may be subject to rectification or erasure, under Article 12(b) of the directive, the Court must hold that to give a candidate a right of access to those answers and to those comments, under Article 12(a) of that directive, serves the purpose of that directive of guaranteeing the protection of that candidate’s right to privacy with regard to the processing of data relating to him (see, a contrario, judgment of 17 July 2014, YS and Others, C‑141/12 and C‑372/12, EU:C:2014:2081, paragraphs 45 and 46)[.]”). This could give the impression that the assessment also falls under the right of rectification. However, considering the examples provided for a sensible use of rectification, see Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I-994, ¶ 45, and the general goal of data protection—assessing the lawfulness of data processing—it is inconceivable that the right to rectification would also apply to the comments of the assessor. No. 2:494] A RIGHT TO REASONABLE INFERENCES 535 completing the exam would be nonsensical, as the purpose for which the data was collected was to evaluate the candidate’s performance.182 Rather, to be sensible, the right to rectification must be limited to assessments of whether the “script inaccurately or incompletely recorded the examination performance of the data subject. For example... [if] the script of another examination candidate had been ascribed to the data subject, which could be shown by means of, inter alia, the handwriting, or if parts of the script had been lost.”183 While the AG acknowledged that assessments (i.e.
the assessor’s comments) can be personal data, 184 she remained dubious about the applicability of “a right of rectification, erasure or blocking of inaccurate data, under data protection legislation, in relation to corrections made by the examiner.”185 This narrower view is based on the AG’s doubt “that comments made on the script could in fact refer to another script or not reflect the examiner’s opinion,”186 as “[i]t is precisely that opinion that the comments are meant to record.”187 Rectification would therefore be inappropriate, as “such comments would not be wrong or in need of correction even if the evaluation recorded in them were not objectively justified.”188 Here, the AG again indicates that the remit of data protection law is not to assess the justification behind an assessment or decision, in this case the mark on an exam script. In contrast to the right to rectification, the ECJ acknowledged that the right of access must be granted 182 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I582, ¶ 35. 183 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I994, ¶ 36. 184 It is interesting to note the AG even points to the similarities between legal analysis and comments, and points towards the tension between interpretations in YS and M and S and Nowak, but ultimately refuses to address it. See id. ¶¶ 58–59. 185 Id. ¶ 54. 186 Id. 187 Id. 188 Id. 536 COLUMBIA BUSINESS LAW REVIEW [Vol. 2019 “irrespective of whether that candidate does or does not also have such a right of access under the national legislation applicable to the examination procedure.”189 The ECJ did, however, explain that the right of access can be restricted by Member State laws or when the rights and freedoms of others are concerned.190 This caveat reflects the ECJ’s belief that the actual protection afforded by the right of access (and by extension, other data protection rights) must be determined contextually.191 These limitations on the rights of rectification and access align with several of the ECJ’s prior decisions, which state that the remit of data protection law is not to ensure the accuracy of decision-making processes.192 Other data protection rights not involved in the case were also addressed in the ECJ’s judgement. The right of erasure was determined to be applicable to examination answers and the examiner’s comments after an appropriate period of time.193 The ECJ also explained that the candidate might have an interest in this data not “being sent to third parties, or published, without his permission.”194 In short, in Nowak the ECJ and AG seemingly broadened the scope of personal data to include opinions and assessments but followed their previous opinions in that only limited rights are granted over assessments (e.g. opinions, 189 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I994, ¶ 56. 190 Id. ¶¶ 60–61. 191 Id. ¶¶ 60–61. Specifically, the ECJ suggests that “Member States may adopt legislative measures to restrict the scope of the obligations and rights provided for in, inter alia, Article 6(1) and Article 12 of that directive, when such a restriction constitutes a necessary measure to safeguard the rights and freedoms of others.” Id. ¶ 60. The scope of rights is thus subject to restriction on the basis of purpose- or case-specific risks to the rights and freedoms of others. 192 See Joined Cases C-141/12 & 372/12, YS, M and S v. Minister voor Immigratie, Integratie en Asiel, 2014 E.C.R. I-2081, ¶¶ 45–47; Case C-28/08 P, European Comm’n v.
Bavarian Lager, 2010 E.C.R. I-06055, ¶ 49. 193 Case C-434/16, Peter Nowak v. Data Prot. Comm’r, 2017 E.C.R. I994, ¶ 55. 194 Id. ¶ 50. No. 2:494] A RIGHT TO REASONABLE INFERENCES 537 inferences). Further, data protection law was not seen as aiming to evaluate whether these assumptions are accurate. Data subjects lack a right to rectify the comments (interim inferences), the results of exams (final inferences), or exam questions.195 Rather, other applicable laws and remedies need to be consulted, for example through examination procedures.196 Finally, the remit of data protection law was again limited to discovering the scope of data being processed and assessing whether the processing is lawful. Assessment of the accuracy of inferential analytics and decision-making processes remains outside its scope.197 Because the rights in the GDPR must be interpreted teleologically, it is not unthinkable that future jurisprudence will grant the right to rectification in relation to the content of assessments and inferences. However, in many cases people will request an assessment (e.g. to obtain employment, insurance, or a loan). In such cases the aim of processing will be to evaluate the person, which is often an inherently antagonistic situation in which a right to rectify one’s assessment would defeat the purpose or telos of the assessment. Paired with