Information Technology and the Law PDF

Document Details

Uploaded by InvincibleSuprematism

Università di Bologna

Tags

information technology law information society technology

Summary

This document explores the characteristics of the information society and the role of information and communication technologies in shaping social structures. It touches upon topics like network society, big data, and AI, ultimately aiming to understand how law needs to adapt to this ever-evolving digital landscape.

Full Transcript

TOPIC EXPLANATION NOTES

Information society

The information society is characterized by the shift from the industrial society to a society where information and communication technologies (ICTs) play a key role.

Characteristics (Manuel Castells):
1. Information is the raw material: a self-feeding system in which information is used as input in order to produce output
2. Pervasive effects of new technologies: the infosphere* impacts different aspects of individual and social life
3. Interconnection: creation of a network society
4. Flexibility: the ways in which information is processed, used, and distributed, as well as social interactions, can be reprogrammed
5. Convergence: different tools and methods converge

Information technology has led to the network society, where interactions are not limited by geographical distance.
- "Esse est percipi": social reality is represented in ICT systems, leading to a shift where humans become supervisors of information processing rather than operators.

Notes: The rise of information and communication technologies (ICTs) has led to a profound shift from industrial society to what is now known as the information society. This transformation is characterized by the central role that information plays in shaping social and economic structures. In the information society, information is not merely a resource; it serves as the raw material that fuels innovation and progress. New technologies have a pervasive impact, influencing every aspect of individual and social life. The interconnected nature of these technologies facilitates communication and collaboration on a global scale, blurring geographical boundaries. Moreover, the information society is marked by flexibility and constant evolution as technological systems adapt and reconfigure to accommodate new advancements. The convergence of different technologies further amplifies this dynamic, leading to integrated systems where distinct fields, like informatics and telecommunications, blend seamlessly.
- As a consequence, there is (1) an increase in available information; (2) automation of all activities

The Role of Dynamics:
- New opportunities and challenges for individuals
- The emergence of big data and artificial intelligence
- Concerns about privacy, autonomy, and inequality

The information society also has to do with the "Internet of Things" (IoT), in which physical objects will be connected to the Internet and be able to identify themselves to other devices. An infosphere is created: an environment where information is created and processed. Because of this new environment, law is called to adapt.

Network society, network effect, and monopolies

The network effect is a socioeconomic phenomenon in which the value of a network increases more than proportionally as more users join.
- Larger network > larger utility > larger value
- The value of the network is almost proportional to the square of the number of connected users of the system

Notes: An example of the network effect is the spread of MS Windows: more MS Windows users > more software producers create products for MS Windows > more MS Windows users. The risk with the network effect is that it hinders the growth of better products or technological solutions.

Notes: Information asymmetry example: the "market for lemons", where sellers of low-quality products may exploit the information asymmetry to deceive buyers, leading to a scenario where inferior goods dominate the market.

(1) Encourages rapid product adoption
(2) Can lead to monopolies
- The EU has started to apply competition law to address these issues
- As a network grows, it becomes increasingly advantageous for users to adopt the dominant product or service, even if it is not objectively the best
- Network effects may lead to the creation of horizontal monopolies, in which whoever dominates the market for a product tends to use that power to extend control to contiguous products.
To limit these tendencies, software interoperability might be promoted, ensuring that users of the respective applications can easily exchange data.

(3) Can make users feel "trapped"

Long tail tendencies: reducing the cost of distribution allows sellers to offer a wide range of products (even those for which demand is low)
- The internet allows sellers to offer a wider range of products
- Reduces the cost of distribution
- Gives an advantage to online marketplaces over traditional stores

Notes: At the same time, the ICT market also exhibits countervailing trends. Online marketplaces like Amazon, enabled by the reduced cost of online distribution, can offer a vast array of products, known as the "long tail". This increased product diversity benefits consumers by providing more choices and empowers niche producers to reach a wider audience.

Decreased cost of physical capital: the cost of physical capital is much lower > those who have the skills required to offer an IT service or good can participate in the information economy, widening the market
- Peer production
- Open source distribution of goods

(4) Information asymmetry
- Producers often know more about products than consumers;
- Risk of market distortions

Social reality, data processes and legal protection

The pervasive use of ICT systems has profound implications for social reality. George Berkeley's statement "esse est percipi" encapsulates a key concept: the representation of reality within information systems. Social reality, encompassing events, facts, and their outcomes, is increasingly represented in digital form within ICT systems. This shift has led to a change in the human role in data processes. From being "operators" directly involved in data processing, humans have transitioned into "supervisors," overseeing the automated systems that handle the bulk of data operations. This new dynamic necessitates legal protection to ensure individual rights and prevent potential harm.
It becomes crucial to guarantee that individuals have control over their digital representations stored in these systems and that automated decisions respect fundamental legal principles, including privacy, dignity, and non-discrimination.

The demands for legal protection are: (1) maintaining human control over digital representations stored in ICT systems; (2) respect for fundamental legal principles; (3) safety and security of the data; (4) prevention of digital crimes; (5) ensuring citizens' access to public data; (6) ensuring the authenticity and integrity of digital documents

Big data and AI

The information society has been further transformed by the rise of big data and artificial intelligence (AI).

Big Data, characterized by its immense volume and the speed at which it is processed, coupled with advancements in AI, allows for unprecedented data analysis and decision-making capabilities. However, these powerful technologies also raise concerns. Privacy and autonomy are threatened as vast amounts of personal data are collected and analyzed, often without individuals' full awareness or consent. The potential for algorithmic bias in AI systems exacerbates existing inequalities, impacting access to opportunities and resources. The dissemination of misinformation and the increasing difficulty in discerning truth from falsehood, particularly with the rise of synthetically generated content, further complicate the landscape, leading to what has been termed the "Synthetic Society".

Notes: Surveillance Capitalism: people are subject to manipulation and are deprived of control over their future and of a space in which to develop their personality (e.g. personalized ads; data used to influence thoughts). This ties in with the critique of algorithmic bias: if the input data is biased, the output can be expected to be biased as well.

Legal informatics

Legal informatics focuses on the use of IT in law, aiming to improve efficiency and promote legal values.
The topics it includes are: (1) improved access to legal sources, (2) development of legal information systems, (3) development of legal drafting systems, (4) computer forensics, (5) systems of legal training, (6) studies on legal-knowledge modelling and legal reasoning, (7) development of systems for legal determinations, (8) systems for legal planning simulation.

It operates in different contexts:
1. Legislative informatics
2. Administrative informatics
3. Judicial informatics
4. Informatics for the legal professions

Digital law

Digital law focuses on legal issues related to the development and use of computers. It covers topics such as:
1. Intellectual property
2. Data protection
3. Electronic documents and digital signatures
4. Virtual identity and presence
5. E-commerce
6. E-government
7. Computer crimes
8. IT and fundamental rights

Computers

Hardware
- Hardware refers to the physical components of a computer system that can be touched, handled, and manipulated. These components include the central processing unit (CPU), memory devices such as RAM and hard drives, input devices like keyboards and mice, and output devices such as monitors and printers. Hardware serves as the tangible foundation upon which software operates, making it essential for the computer’s functionality.
- The CPU, often referred to as the “brain” of the computer, executes instructions and processes data. Other key hardware components, like the motherboard, power supply, and cooling systems, ensure the seamless operation of the entire system.

Notes: Euclid described one of the earliest known algorithms more than 2,000 years ago. An algorithm is a step-by-step description of the set of operations to be performed to accomplish a particular task.

Notes: Non-programmable machines: the Pascaline and the Stepped Reckoner could only execute one operation at a time and could not be programmed to execute an entire combination of operations.
Collectively, these parts enable users to perform computational tasks, interact with software, and manage data effectively.

Software
- Software is the intangible set of instructions, programs, and data that tells a computer how to perform specific tasks. It bridges the gap between users and hardware, enabling interaction and functionality. There are two main categories of software: system software and application software.
- System software includes operating systems (e.g., Windows, macOS, Linux) and utilities that manage hardware and basic functions. Application software, on the other hand, consists of programs designed for user-specific tasks, such as word processing, web browsing, and gaming. Both types work in tandem to provide a functional computing experience. Without software, hardware would be inert and incapable of executing any meaningful operation.

Notes: Programmable machines: Jacquard > Babbage > Ada Lovelace > Hollerith Census Tabulator
- Jacquard: punch cards to control weave patterns;
- Babbage: mechanical machine for calculating logarithms, with control flow and integrated memory. Only designed, not physically built
- Lady Augusta Ada Lovelace: wrote what is often regarded as the first algorithm intended to be executed by a machine
- Hollerith Census Tabulator: captured and processed data by reading holes on special paper punch cards

Notes: Modern computers are:
- Digital
- Electronic
- Programmable
- Universal

Precursors of computers
- Before the advent of modern computers, various mechanical and analog devices laid the foundation for computational technology. These precursors include tools such as the abacus, which was one of the earliest instruments for arithmetic calculations, dating back to ancient civilizations.
- In the 17th and 18th centuries, inventors like Blaise Pascal and Gottfried Wilhelm Leibniz designed mechanical calculators capable of basic mathematical operations. These devices represented significant steps toward automated computation.
Charles Babbage’s design of the Analytical Engine in the 19th century marked a major leap forward. Although it was never completed, it was the first conceptual machine to include features like conditional branching and memory, elements seen in today’s computers.

Programmable machines
- Programmable machines represent the evolutionary step from mechanical calculators to fully functional computers. The concept involves devices that can execute a sequence of instructions or programs to perform complex tasks. Early examples include the Jacquard loom, developed in 1804, which used punched cards to control the weaving patterns of textiles. This innovative use of instructions became a precursor to computer programming.
- Later developments, such as Alan Turing’s theoretical concept of the Turing machine in 1936, further advanced the notion of programmability. Turing’s ideas laid the groundwork for the architecture of modern computers, emphasizing the ability of a machine to execute any computable function given the appropriate program. This foundation enabled the creation of digital computers that could be reprogrammed for a variety of applications, revolutionizing technology and society.

Universal Turing Machine

A hypothetical machine that could process any algorithm expressed as a program, applying it to data provided as input to the machine.
- All universal machines are equivalent;
- Every universal machine is capable of executing every algorithm;
- Every universal machine is also able to execute the algorithm that determines the functioning of every other universal machine

One machine could do it all: every universal machine is able to emulate every other universal machine.

Limits:
- The machine can solve any problem as long as the procedure for solving it can be represented as an algorithm
- Halting problem: it is mathematically impossible to program a computer so that it can determine whether any program Q will stop or run forever when applied to data D.
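The idea of one machine that runs any program supplied to it as data can be illustrated with a small simulator. This is only a sketch under assumptions that are not in the notes: the names `run`, `Program`, `INVERT_BITS` and `APPEND_ONE`, the state names `start` and `halt`, and the `max_steps` budget are all hypothetical. The budget reflects the halting problem mentioned above: the simulator cannot decide in general whether a program will stop, so it simply gives up after a fixed number of steps.

```python
from typing import Dict, Tuple

BLANK = "_"

# A program is a transition table (hypothetical encoding):
# (state, symbol read) -> (next state, symbol to write, head movement)
Program = Dict[Tuple[str, str], Tuple[str, str, int]]


def run(program: Program, tape_input: str, start: str = "start",
        halt: str = "halt", max_steps: int = 10_000) -> str:
    """Apply the given program to the given data and return the final tape."""
    tape = dict(enumerate(tape_input))
    state, head, steps = start, 0, 0
    while state != halt:
        if steps >= max_steps:
            # We cannot know whether the program would ever halt.
            raise RuntimeError("step budget exhausted")
        symbol = tape.get(head, BLANK)
        state, written, move = program[(state, symbol)]
        tape[head] = written
        head += move
        steps += 1
    return "".join(tape[i] for i in sorted(tape)).strip(BLANK)


# Two different "programs" executed by the same machine.
INVERT_BITS: Program = {
    ("start", "0"): ("start", "1", 1),
    ("start", "1"): ("start", "0", 1),
    ("start", BLANK): ("halt", BLANK, 0),
}

APPEND_ONE: Program = {
    ("start", "1"): ("start", "1", 1),
    ("start", BLANK): ("halt", "1", 0),
}

if __name__ == "__main__":
    print(run(INVERT_BITS, "1011"))
    print(run(APPEND_ONE, "111"))
```

The same `run` function executes both tables without modification, which is the sense in which a single machine can carry out different algorithms when they are expressed as programs.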
As a consequence:
(1) It is impossible to develop a program that instructs a computer to verify the correctness of other programs under all possible conditions and all possible sets of inputs;
(2) There are implications for legal liability in software development: it is impossible to guarantee that a piece of software will be absolutely error-free. Liability may be attached to the software developer or the user

Von Neumann Architecture

The Von Neumann architecture is a digital computer architecture described in 1945 by John von Neumann. This architecture consists of the following key components:

(1) A central processing unit (CPU), which includes
- Control unit (CU): the CU identifies the instruction to be executed and retrieves the relevant data.
- Arithmetic and logic unit (ALU): the ALU executes the instruction.
- Modern computers are built around the central processing unit (CPU) described by the Von Neumann architecture. The CPU consists of a combination of switches that allow electric current to flow or be blocked, and different technologies have been adopted over the years to develop this grid of switches.

(2) A central memory (internal memory), which stores both instructions and data.
- This memory is a read-write, random-access memory (RAM).
- The central memory is usually volatile, meaning it requires power to retain the stored instructions and data.
- When the power is interrupted, the content is lost.

(3) Input and output devices, which allow the computer to communicate with the user or other computers.

The Von Neumann architecture is a stored-program digital computer architecture. This means that instructions are held in memory alongside the data, so the computer can fetch them quickly, execute them sequentially, and execute conditional branches.
- In order to be executed, programs usually need to be copied from the mass memory to the central memory.

Moore's laws

Moore's First Law is an observation about the evolution of computing hardware.
It states that the number of transistors in a dense integrated circuit doubles approximately every two years. This exponential improvement has dramatically enhanced the effect of digital electronics in nearly every segment of the world economy.
- The law was first stated in 1965 by Gordon E. Moore, co-founder of Intel Corp.

Moore's second law states that, while the cost of computer power to the consumer falls, the cost that manufacturers have to bear to keep pace with Moore's first law follows the opposite trend: research & development, manufacturing, and test costs have increased steadily with each new generation of chips.
- The global microprocessor market is dominated by a limited number of companies, including Intel Corp, which accounts for 80% of the market of CPUs for personal computers. Moore's second law explains why few new enterprises have the resources needed to design new microprocessors.

Notes: There is a trend allowing computers to become smaller, more affordable and faster. However, there is an increase in overall production costs.

Personal Computer (PC) and Networks

Notes: Napster case: Napster was held liable for contributory infringement and vicarious infringement of US record companies' copyrights; the system was stopped. Peppermint case: copyright infringements via P2P networks.

(1) Man-Machine-Environment interaction
- Automated system where sensors collect information from the environment > the information is processed by the computer > the computer instructs effectors (devices capable of producing physical results) to execute operations.
- This type of architecture is often used in robotics.
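The sensor > computer > effector chain described under man-machine-environment interaction can be sketched as a small control loop. Everything concrete here is an assumption made for illustration: the thermostat scenario, the `Heater` effector, the `decide` and `control_loop` functions, and the 20-degree target are hypothetical and do not come from the notes.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Heater:
    """Hypothetical effector: a device that produces a physical result."""
    on: bool = False
    log: List[str] = field(default_factory=list)

    def apply(self, command: str) -> None:
        self.on = (command == "heat_on")
        self.log.append(command)


def decide(temperature: float, target: float = 20.0) -> str:
    """Processing step: turn a sensor reading into an effector command."""
    return "heat_on" if temperature < target else "heat_off"


def control_loop(readings: Iterable[float], effector: Heater,
                 target: float = 20.0) -> None:
    """Sensor readings come in, the computer decides, the effector acts."""
    for temperature in readings:          # information from the environment
        command = decide(temperature, target)  # processing by the computer
        effector.apply(command)           # instruction sent to the effector


if __name__ == "__main__":
    heater = Heater()
    control_loop([18.0, 19.5, 20.0, 21.0], heater)
    print(heater.log)
```

No human operator appears inside the loop; a person would only supervise it, for example by changing the target value.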
(2) Client-server and peer-to-peer

Client-server
- Several computers (servers) are used specifically to provide services to individual users' PCs (clients)
- Input and output devices receive and transmit information only to human users

Peer-to-peer (P2P)
- Every computer acts as both a client and a server
- e.g. the Skype communication platform, file-sharing systems
- Pros: they enable users with limited resources to use the Internet and contribute to its development even if they cannot afford servers or pay for their services; more efficient use of resources
- Cons: risk of illegal sharing of content protected by intellectual property law > Napster case and Peppermint case

Notes: Peppermint (Peppermint case) was a German record company. Pirate Bay: joint criminal and civil prosecution case in Sweden for copyright infringement through a website containing information for illegally accessing copyrighted materials over the P2P system BitTorrent.

Cloud computing

A model in which large groups of online servers are networked to provide access to software, data, services, and platforms.
- The user does not use hardware and software resources on a PC, but rather accesses online resources (the cloud) provided by the network of servers.
- Cloud computing services are typically provided by large companies, such as Google, Microsoft, Apple, and Amazon. These companies offer a range of online services such as email, file storage, and online applications.

Cloud computing has several advantages:
- It reduces costs: users do not have to buy, maintain, and update expensive hardware and software.
- It provides efficiency and flexibility: powerful PCs are often underused, while the cloud can provide resources to users in such a way as to automatically adjust to their needs.
- It offers better backup and security than individual PCs.

However, cloud computing also has some risks and downsides:
- The cloud-computing market is controlled by a few large enterprises, which may lead to oligopoly.
- It creates privacy and data protection risks, as the gathering of large amounts of information creates new risks, such as the possibility of hackers attacking the cloud service.
- Users can experience a loss of control over the system, since the companies that manage the cloud have complete control and power, which could be used against users and limit their freedom.

Computers as digital programmable machines

Analogical and digital representations
- An analogical representation is a way of presenting information using a continuous physical quantity. For example, in analog recordings, the magnetic signal on a tape fluctuates continuously to represent sound.
- A digital representation, on the other hand, converts continuous quantities, like sound, into digital quantities (numbers). These can be represented using decimal or binary numbers > Computers use digital representations

Advantages of digital representation over analogical representation include:
(1) Precision: Digital representations can be reproduced with absolute accuracy. In contrast, analogical representations lose quality each time they are copied.
(2) Durability: Digital formats do not degrade over time, whereas analog recordings are prone to quality loss due to wear.
(3) Compatibility with computers: Digital information, being in the form of numbers, can be directly processed by computers.

Notes: (1) The AND connector
The AND connector combines two propositions and is represented by the word AND.
- The proposition “A AND B” is true only if both A and B are true; otherwise, it is false.
- The truth table for the AND connector shows that the output is 1 only when both inputs are 1.
- A special electronic circuit can be designed to mimic the AND operator, where the output is a high voltage (1) only if both inputs are high voltage.

Notes: (2) The OR connector
The OR connector combines two propositions and is represented by the word OR.
- The proposition “A OR B” is true if at least one of A or B is true.
While digital representation may seem less precise than analogical representation, the “steps” between digital values are so small that, in practical applications, they provide a sufficiently accurate approximation of the original phenomenon.

The binary system
- Modern computers use the binary system to store and process information. This system operates with only two digits: 0 and 1. Each position in a binary number represents a power of 2.
- In contrast, the decimal system uses 10 digits (0 through 9), and each position represents a power of 10.
- The smallest unit of information in the binary system is the bit (binary digit). A sequence of 8 bits forms a byte.
- Computers can represent all types of information, including text, sounds, and images, using the binary system. For instance, text characters can be represented as binary numbers using the ASCII code.

It is easier to develop electrical components that work with only 2 alternative states (on, off > 1 and 0) rather than 10 states.

Notes: (2) The OR connector (continued)
- The proposition “A OR B” is false only when both A and B are false.
- The truth table for the OR connector shows that the output is 1 if either one or both inputs are 1.
- A special electronic circuit can be built to mimic the OR operator, where the output is a high voltage (1) if either of the inputs is a high voltage.

Notes: (3) The NOT connector
The NOT connector negates a proposition and is represented by the word NOT.
- The proposition “NOT A” is true if A is false, and it is false if A is true.
- The truth table for the NOT connector shows that the output is the opposite of the input: if the input is 1, the output is 0, and if the input is 0, the output is 1.
- A special electronic circuit can be designed to mimic the NOT operator, where the output will be a low voltage (0) if the input is a high voltage (1), and vice versa.
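The truth tables of the AND, OR and NOT connectors described in the notes can be generated with a few lines of code. This is a minimal sketch: the function names `AND`, `OR`, `NOT` and `truth_table` are chosen here for illustration, with 1 standing for true (high voltage) and 0 for false (low voltage).

```python
from itertools import product
from typing import Callable, List, Tuple


def AND(a: int, b: int) -> int:
    """1 only when both inputs are 1."""
    return a & b


def OR(a: int, b: int) -> int:
    """1 when at least one input is 1."""
    return a | b


def NOT(a: int) -> int:
    """The opposite of the input."""
    return 1 - a


def truth_table(gate: Callable[..., int],
                arity: int) -> List[Tuple[Tuple[int, ...], int]]:
    """List every combination of 0/1 inputs together with the gate's output."""
    return [(inputs, gate(*inputs))
            for inputs in product((0, 1), repeat=arity)]


if __name__ == "__main__":
    for name, gate, arity in (("AND", AND, 2), ("OR", OR, 2), ("NOT", NOT, 1)):
        print(name)
        for inputs, output in truth_table(gate, arity):
            print("  ", inputs, "->", output)
```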
- The binary system is a positional system: the value of each digit depends on its position

Boolean Algebra and Computers
- Boolean algebra is a system of logic that evaluates propositions using two values: TRUE and FALSE, which are represented by 1 and 0, respectively.
- Boolean algebra employs logical operators such as “and”, “or”, and “not” to combine propositions. Logic gates, such as the AND gate, OR gate, and NOT gate, are created by applying Boolean algebra rules.
- These gates operate on electrical signals, using different voltage levels: true (1) is represented by a high voltage, and false (0) is represented by a low voltage.
- By combining AND, OR, and NOT operators, more complex logical formulas and devices can be created.

Code as law

Software and hardware architecture can regulate behavior in cyberspace by defining what is possible and influencing user actions.
- Unlike traditional legal and social norms, this form of regulation presents new challenges regarding power, fairness, and accountability in the digital age.

Notes: Cyberspace regulation (according to Lessig)
1. Law
2. Social norms
3. Market
4. Code: most important; understood as (a) programming code; (b) the set of computational processes every user has to interact with

Notes: Virtual rules constrain the area of what is "virtually possible"; therefore, they indirectly regulate human behavior.

How Code Functions as Regulation
1. Virtual Rules: Code creates “virtual rules” that govern the behaviors and properties of computational processes. These rules indirectly influence human behavior by determining what is possible or impossible within the digital space.
2. Enabling and Disabling Actions: Code enables or restricts specific actions. For example, it can allow actions such as downloading files or sending emails, or prevent them entirely, shaping user behavior through these limitations.
3. Information Control: Code controls what information is accessible to users and how it is presented.
It can either reveal or obscure certain options or data, thereby guiding user behavior and decision-making.
4. Indirect Influence: Unlike legal or social norms, virtual rules influence human behavior indirectly through the design of the digital environment. Code shapes what actions are possible, thereby regulating user behavior through its design.

The Power of Virtual Rules
1. Effectiveness over Legal Rules: Virtual rules can be more effective than legal rules because they can make certain actions impossible. For instance, digital rights management (DRM) systems use code to prevent the unauthorized copying of copyrighted content.
2. Substitution for Legal Rules: In some cases, virtual rules can replace legal rules. When code makes a certain action technically impossible, it acts as a substitute for legal restrictions.
3. Surveillance and Identification: Virtual rules can facilitate surveillance, tracking, and user identification, raising concerns about privacy and control over personal data.
4. Lack of User Influence: A significant concern is that virtual rules, often applied by automated systems, typically give users no influence over their application. Moreover, users may not even be aware of these rules or their effects.

Comparison to Legal Rules
- Legal Rules: These are created through political processes, public debate, and judicial review, allowing citizens to challenge laws in court.
- Virtual Rules: Typically the result of private decisions made by developers, virtual rules are not subject to public debate or judicial oversight. They are applied without considering fairness or equality; they are usually created to protect the interest of a single party, typically the developer of the IT system. The other parties can only abide by them or suggest that they be revised; there is no means of challenging them in court, as there is with legal rules
- Complementary Regulation: Legal rules and virtual rules should complement one another.
Legal rules should step in when virtual rules are insufficient or unjust, ensuring balance and fairness in regulation.

Implications
- Representation of Interests: The rise of “code as law” underscores the importance of ensuring that a wide range of interests are represented in decision-making, so that technical choices are both effective and fair.
- Ethical and Legal Considerations: It is crucial to recognize the power of code in shaping behavior and to consider the ethical and legal implications of this form of regulation.

Programs and algorithms

Algorithms
- Algorithms are sets of instructions that define a sequence of operations to solve a specific problem or class of problems. They are based on mathematical concepts and can be executed by computers automatically, without requiring additional instructions.
- Definition: a finite set of instructions that specify a sequence of operations to be carried out in order to solve a specific problem or class of problems.

Notes: Programmers write the code (source code) that is to be translated into machine language. An informal formulation of the algorithm, written before the code, is usually called pseudocode.

Key Aspects of Algorithms:
1. Input: The data needed for the algorithm to work.
2. Output: The result of the algorithm’s process.
3. Procedure: The specific steps taken to achieve the output from the input.
4. Finiteness: An algorithm must finish after a finite number of steps.
5. Generality: An algorithm should solve not just one problem but a class of related problems.
6. Non-ambiguity: Every step in the algorithm should be clearly defined, with only one interpretation.
7. Repeatability (Determinism): Given the same input, the algorithm’s output must always be the same.

Representation of Algorithms:
- Algorithms can be expressed in natural language or a programming language, with the latter being more suitable for execution on computers.
- An algorithm is considered correct if, for any given input, it always produces the correct result.
If it fails to do so, it is deemed incorrect.

Programs:
- Programs are the concrete expression of an algorithm, written in a programming language that computers can understand. A program is an executable version of an algorithm.
1. Execution: Programs are executed by computers, which follow the provided instructions.
2. Programming Languages: These languages enable programmers to write instructions for computers. They range from high-level languages (more human-readable) to low-level machine languages.
- Machine Language: The binary code (0s and 1s) directly understood by computers.
- Assembler: A tool that converts human-readable instructions into machine language.
- High-Level Languages: More accessible for humans to write and understand, like Basic, Pascal, and JavaScript.
- Source Code: The text written by the programmer in a high-level language.
- Object Code: The compiled version of the source code, in machine language, ready for execution.
3. Compilation: The process of converting source code into object code without immediate execution. This version is more challenging to reverse-engineer.
4. Interpretation: The process of translating source code into machine language and executing it immediately.

Relationship between Algorithms and Programs:
- Algorithms provide the logic and instructions for solving a problem, while programs are the actual implementation of these instructions in a computer-readable form.
- An algorithm can be implemented in various ways and languages, resulting in different programs that may solve the same problem.
- Algorithms are the theoretical foundation, whereas programs are the practical realization of these ideas.
- The economic utility of software is based on its ability to convey information efficiently.
- Algorithms are abstract steps designed to solve a problem, while programs are their tangible, executable implementations in a language that computers can understand and run.

Algorithms Examples and errors
1.
Chocolate Cookies Recipe: A real-world example where the input (ingredients) and output (cookies) are defined, demonstrating a basic algorithm for human execution.
2. Multiplication Algorithm: A simple algorithm that multiplies two integers to produce their product.
3. Sequential Search Algorithm: Searches for a phone number in a list by checking each name in sequence, with an updated version that handles cases where the name is not found.
4. Binary Search Algorithm: A more efficient search algorithm for sorted lists that divides the list in half repeatedly until the item is found or shown not to be present.
5. Euclid’s Algorithm: A mathematical algorithm to find the greatest common divisor (GCD) of two numbers, implemented in the Basic programming language.

The correctness of an algorithm

(1) Correctness Criteria:
- An algorithm is correct if it consistently produces the correct result for all possible inputs.
- If an algorithm fails to find the correct answer or provides an incorrect result, it is considered incorrect.

(2) Consequences of Errors:
- Programming errors can lead to catastrophic consequences, especially in complex systems where software is critical. These errors may result in unexpected or harmful behavior.
- Not all software defects can be detected during development and validation, making software failure an inherent risk.
- It is essential for operators to have the ability to intervene manually during a software failure, especially for high-priority tasks.

Efficiency and computational complexity

Computational Complexity: Efficiency is evaluated by the computational complexity of an algorithm. This refers to how the time (or other resources) required to execute the algorithm increases as the size of the input grows.

(1) Low Computational Complexity:
- Algorithms with low computational complexity experience a minimal increase in processing time as the input size grows.
- Binary search is an example: doubling the size of the phone book adds only about one more step, so the increase in execution time is minimal.
(2) Average Computational Complexity:
- Algorithms with average computational complexity experience a proportional increase in execution time as the input size grows.
- Sequential search is an example: doubling the size of the phone book leads to a proportional increase in execution time.
(3) High Computational Complexity:
- Algorithms with high computational complexity experience a more-than-proportional increase in execution time as the input grows. These algorithms can become inefficient for large inputs.
From algorithms to programming languages
From machine language to assembler
- Machine language is the binary code (0s and 1s) directly understood by a computer, but it is challenging to write complex programs in it due to its low-level nature and hardware dependence.
- Assembler is a more readable language that uses words and decimal numbers to represent machine instructions. An assembler program translates this language into machine code.
- Each instruction contains the operation to be executed and the operand, i.e. the address of the memory cell containing the data on which the operation should be executed.
Note: Machine language is the basic binary code, assembler is a more readable notation that is translated into machine code, and high-level languages are human-readable and independent of hardware. Source code can be translated into object code via interpretation (line-by-line) or compilation (whole program at once).
High level languages compared to machine language
- Designed to be easier for humans to read and understand, closer to human language, and independent of hardware specifics.
- Examples of high-level languages include Basic, Pascal, and JavaScript, which are translated into machine code by the computer.
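The two phone-book searches compared in the efficiency discussion can also be written as a short program in a high-level language. What follows is only a minimal sketch in Python (a language not named in these notes); the function names, the made-up phone book, and the step counter are assumptions introduced for illustration.

```python
def sequential_search(names, target):
    """Check every entry in turn; return (index or None, entries examined)."""
    examined = 0
    for index, name in enumerate(names):
        examined += 1
        if name == target:
            return index, examined
    return None, examined  # name not found


def binary_search(sorted_names, target):
    """Repeatedly halve a sorted list; return (index or None, entries examined)."""
    low, high = 0, len(sorted_names) - 1
    examined = 0
    while low <= high:
        middle = (low + high) // 2
        examined += 1
        if sorted_names[middle] == target:
            return middle, examined
        if sorted_names[middle] < target:
            low = middle + 1   # target can only be in the upper half
        else:
            high = middle - 1  # target can only be in the lower half
    return None, examined  # name not found


def make_phone_book(size):
    """Hypothetical sorted phone book: name0000000, name0000001, ..."""
    return [f"name{i:07d}" for i in range(size)]


for size in (1_000, 2_000):
    book = make_phone_book(size)
    missing = "zzz"  # sorts after every entry, so neither search can stop early
    _, sequential_steps = sequential_search(book, missing)
    _, binary_steps = binary_search(book, missing)
    print(size, sequential_steps, binary_steps)
```

When the book doubles in size, the sequential search examines twice as many entries, while the binary search examines at most one more. That is the difference between "average" and "low" computational complexity described above.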
From source code to object code: the programmer writes the source code > the computer translates it into object code > execution
- Source code is the human-readable program written in a high-level language.
- Object code is the machine-readable version of the source code, which the computer executes.
- Translation is a necessary step, without which the computer cannot execute the program.
There are two means of translation of source code > object code
(1) Interpretation
- The computer reads and translates the source code line by line, executing each instruction immediately. This process happens every time the program runs.
- Source code > reads the next statement of source code > translation of statement into set of machine-language instructions > immediately executes the instructions
- Slower execution but great for initial development and testing (piece by piece translation-execution)
(2) Compilation
- The entire source code is translated into machine code at once, creating an object code file that is executed directly by the computer.
- Source code > translation into object code > translation saved in a new file: the new file contains the complete and permanent version of the program in machine language
- Compilation makes execution faster and is suitable for distributing compiled code without exposing the source code.
- The programmer can distribute the object code obtained by compilation without releasing the source code
- Decompilation: complex reverse-engineering method from object code to source code; does not ensure that a complete result can be obtained
Data and conceptualisation
Representation of data in a computer system requires conceptualisation (= defining the entities and relationships that reflect aspects of reality).
Note: A good digital representation requires the identification of relevant classes for the domain and their characterisation with relevant attributes.
A key attribute identifies one and only one of the instances of a class.
- Data minimization, necessity and adequacy are required: EU and Italian data protection rules ban the storing of unnecessary attributes.
- Conceptualisation involves making decisions on what kinds of entities to include (e.g., “subscriber” in a telephone book) and how to structure their data, such as linking a subscriber with a surname and phone number.
The Semiotic Triangle
In natural language, linguistic signs (words) refer to entities, representing their intension (concept) and extension (real-world objects).
- The semiotic triangle illustrates the relationship between:
1. The linguistic sign (e.g., the word “judge”).
2. The intension (concept of “a public official appointed to decide cases”).
3. The extension (e.g., real-world instances like “Judge Brown” and “Judge Harris”).
Classes, instances, attributes, relationships
Classes are general concepts or schemas from which individual instances can be derived, while attributes are characteristics of these instances.
- Example: “lawyer” is a class, with attributes like “name”, “surname”, and “tax code”. An instance of the class could be a specific lawyer whose surname is “Rossi”.
- In IT: class is the blueprint; instances are examples built from the blueprint with essential attributes.
Relationships between classes help structure systems, like linking “lawyer” with “client” via the relationship “follows”.
- A well-designed digital system identifies relevant classes and attributes, ensuring proper identification of instances.
- Storing unnecessary attributes can violate data protection regulations.
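The “lawyer” / “client” example above can be sketched in code. This is only an illustration in Python: the choice of dataclasses, the use of the tax code as key attribute for both classes, and all the values are assumptions made here, not something prescribed by the notes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Client:
    """Class "client"; each instance is one specific client."""
    tax_code: str  # key attribute: identifies one and only one instance
    surname: str


@dataclass
class Lawyer:
    """Class "lawyer" with the attributes named in the text."""
    tax_code: str  # key attribute
    name: str
    surname: str
    # Relationship "follows": the clients this lawyer follows.
    follows: List[Client] = field(default_factory=list)


# Instances: concrete examples built from the class blueprints (made-up values).
bianchi = Client(tax_code="BNC-0001", surname="Bianchi")
rossi = Lawyer(tax_code="RSS-0001", name="Maria", surname="Rossi")
rossi.follows.append(bianchi)

# The key attribute lets the system pick out exactly one instance.
lawyers_by_tax_code = {rossi.tax_code: rossi}
print(lawyers_by_tax_code["RSS-0001"].surname)
print([client.surname for client in rossi.follows])
```

Only the attributes needed to identify and describe each instance are stored, in line with the data minimization point made above.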
Files and file formats
- File: collection of data identified only by the file name
- Archives (potentially stored and indexed in a database)
Note: Common file formats:
- Text files
- MS Word doc files
- RTF (Rich Text Format)
- HTML
- LaTeX
Types of Files:
(1) Structured Files - archives: these contain predictable information
- Organized according to a specific schema or structure, containing instances of a class; e.g. a structured file about lawyers contains multiple lawyer records.
- Each instance (record) consists of fields (e.g. student number, name, and surname).
- Techniques for defining field boundaries include:
  - Fixed length: Pre-defined field lengths.
  - Separators: Special characters marking field separation.
  - Tags: Explicit opening and closing tags for each field.
(2) Unstructured Text Files:
- Sequences of words, with additional formatting information (e.g., font size, color, position).
- Formatting data may include the document’s start and end points.
File Formats:
- Information can be stored in various ways using different file formats, each with specific rules on binary conversion (e.g., ASCII), semantic value, and typographic formatting.
- Some file formats may only be readable by the software used to create them.
Databases and Information Retrieval Systems
Databases
Databases are organized collections of data designed for storing structured information.
- They are often managed by Database Management Systems (DBMSs) = software tools that help users store, manage, and analyze data.
Information retrieval systems
An Information Retrieval System is a system used to search and retrieve specific textual data from large collections of unstructured text.
- In the legal field, many legal sources (such as case law, statutes, and legal articles) are stored in unstructured text documents, making information retrieval systems particularly useful.
- These systems use an indexing system to help locate relevant text.
The indexing system has two main parts: the indexer and the search engine.
(1) The indexer scans through text files, identifies key words, and adds these words to a special file called the inverted file. The inverted file is essentially an alphabetical index of all words in the text, along with the document IDs in which each word appears.
(2) The search engine uses this inverted file to quickly locate which documents contain the search terms. It retrieves those documents for the user.
Queries and performance of information retrieval
- Users can query an information retrieval system using combinations of words and logical operators. These include AND, OR, and NOT.
(1) AND: This operator is used for conjunctive queries.
- The search results will only show documents that contain all the specified words.
- For example, a query with “law” AND “court” will only return documents that have both terms.
(2) OR: This operator is used for disjunctive queries.
- The search will return documents that contain at least one of the specified words. For example, “law” OR “court” will return documents that contain either or both of the terms.
(3) NOT: This operator is used for negative queries, meaning it excludes documents containing a certain word.
The performance of an information retrieval system is typically measured using two criteria: recall and precision.
(1) Recall: how many relevant documents were retrieved compared to how many relevant documents exist in total.
- High recall means the system retrieves most of the relevant documents.
(2) Precision: how many of the retrieved documents are actually relevant.
- High precision means that most of the retrieved documents are useful and not irrelevant.
Additional factors in performance:
(1) Silence is the failure to retrieve relevant documents (missing information).
(2) Noise refers to retrieving irrelevant documents (irrelevant information).
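The indexer, the logical operators, and the two performance measures can be illustrated with a small sketch. It is written in Python under stated assumptions: the three sample documents, the set of "relevant" documents, and all function names are invented here for illustration, and a real information retrieval system would be considerably more elaborate.

```python
import re

# Hypothetical collection of unstructured texts, keyed by document ID.
documents = {
    1: "The court applied the law on contracts.",
    2: "A new law on data protection was approved.",
    3: "The court heard the witness.",
}


def build_inverted_file(docs):
    """Indexer: map every word to the set of document IDs containing it."""
    inverted = {}
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            inverted.setdefault(word, set()).add(doc_id)
    return inverted


inverted_file = build_inverted_file(documents)


def lookup(word):
    """Search engine primitive: documents containing one word."""
    return inverted_file.get(word.lower(), set())


law_and_court = lookup("law") & lookup("court")  # AND: both words
law_or_court = lookup("law") | lookup("court")   # OR: at least one word
law_not_court = lookup("law") - lookup("court")  # NOT: exclude a word


def recall(retrieved, relevant):
    """Share of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def precision(retrieved, relevant):
    """Share of the retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


relevant = {1, 2}  # assumed: the documents a user interested in "law" wants
for label, retrieved in (("AND", law_and_court), ("OR", law_or_court)):
    print(label, sorted(retrieved), recall(retrieved, relevant), precision(retrieved, relevant))
```

In this toy collection the AND query misses a relevant document (silence, lower recall), while the OR query also returns an irrelevant one (noise, lower precision).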
Markup languages
Markup languages are used to annotate (electronic) documents, including textual fragments and other elements of the document.
- They use tags to identify the different elements and can add metadata (data about data) to the original text, without interfering with the original content.
- Markup languages can be used to print, categorize, structure, extract, and process textual content.
Categories of Markup Languages
(1) Proprietary or non-proprietary:
- Proprietary markup is defined by one or more software companies; the company holds the copyright on the markup language.
- Non-proprietary (or public) markup is defined by international standards bodies.
(2) Readable or non-readable:
- Usually, markup is readable by humans, but there are also non-readable markup standards.
(3) Procedural or declarative*:
- Procedural markup includes instructions on how to process the text, such as alignment, font, and size.
- Declarative markup labels text according to its function within the text structure or meaning, without specifying how it should be visualized or processed.
*Most important distinction
(4) Strict or metalanguages:
- Strict markup languages have a fixed set of tags that cannot be changed.
- Metalanguages allow the user to define their own custom markup language.
HTML language
- HTML (Hyper-Text Markup Language) is a markup language used for the development of web pages.
- It is a procedural markup language in which tags are used to express the formatting of text. Most instructions are related to the graphical representation of the text.
- HTML is a strict markup language, as the tags that can be used are pre-defined.
XML - Extensible Markup Language
- XML is a meta-language used across various applications and information systems: a universal language.
- Tags are used to express the semantic function of different fragments of the text.
- It is independent of specific technologies or applications.
- Information Description vs.
Representation: XML separates information description from its representation, unlike HTML, which specifies how text should be displayed.
Features of XML Documents:
1. Searchable, readable, and modifiable.
2. Hierarchical structure: XML is rigorous but extensible.
3. Tags represent elements related to the function of the text, not its appearance.
Customization:
- XML allows users to create custom sets of tags to annotate various types of documents (e.g., laws, case law, contracts).
Use in Legal Systems:
- XML improves system functionality, such as enhancing text search capabilities in legal contexts.
DTD and XML standards
A Document Type Definition (DTD) is a separate document that defines the rules an XML document must follow to be considered valid.
- It specifies which elements and attributes can be used, which elements are required, how often they can appear, and the type of content allowed within them.
- The DTD ensures uniform annotation across documents, such as legal texts, and prevents interoperability issues by standardizing tag usage.
- In theory, a DTD allows any user to define their own XML vocabulary.
In the legal domain, XML is widely adopted, with standards like Akoma Ntoso used for digitalizing parliamentary acts.
- Legal editors, enhanced by metadata, assist specialists in properly marking up documents and automating parts of the process.
Legal ontologies
A legal ontology is a formal representation of legal knowledge, providing a complete and detailed description of a specific domain, such as law. It includes not only the concepts of the domain but also their properties and relations.
- A legal ontology enables the computer to process and retrieve legal information. For example:
1. Describing the correlation between “right” and “obligation”.
2. Defining that “A has an obligation towards B to do C” is the same as “B has the right that A does C”.
3.
Describing connections between concepts like “contract” and “party” and linking them to “obligation” and “right”.
Functions of Legal Ontologies according to Sartor
1. Provide a pre-defined set of terms for exchanging information between users and systems.
2. Supply knowledge to systems to infer information relevant to user requests.
3. Classify, filter, and order information.
Challenges in Legal Ontologies
- Legal ontologies are more complex than those in other fields due to the abstract, mental, and social concepts they must address.
- For example, the concept of “intention” requires reference to general concepts beyond the legal domain.
Ontology Development Methods
- Various methods and languages are used to develop ontologies, with OWL (Web Ontology Language) being one of the most important.
- The level of abstraction in an ontology depends on its representation dimension, with three main categories:
(1) Foundational Ontologies
- These describe general concepts (basic structure) applicable across all domains, aiming to represent the world’s categorical structure.
- Example: DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering).
(2) Core Legal Ontologies
- Mediate between foundational and domain ontologies.
- They include concepts like agent, role, intention, document, norm, right, and responsibility.
- Example: the LRI-Core ontology.
(3) Domain Ontologies
- These represent specific domains or parts of the world, focusing on the particular meanings of concepts applied to those domains.
Examples of Legal Ontologies
- The LRI-Core Ontology is a core legal ontology based on DOLCE.
- The LKIF-Core Ontology is a domain ontology that provides common terminology for legal information exchange.
- An ontology for copyright law would define concepts like “author”, “work of mind”, and “intellectual property rights”, specifying their properties and relations.
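The correlation between “right” and “obligation” mentioned earlier in this section can be made concrete with a small sketch. It is written in plain Python rather than in OWL, and every class, field, and relation name below is an assumption introduced for illustration; it does not reproduce LRI-Core, LKIF-Core, or any other real ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obligation:
    bearer: str       # A: the party who must act
    beneficiary: str  # B: the party towards whom the obligation is owed
    action: str       # C: what must be done


@dataclass(frozen=True)
class Right:
    holder: str   # B: the party who holds the right
    against: str  # A: the party who must act
    action: str   # C: what must be done


def right_from_obligation(obligation):
    """'A has an obligation towards B to do C' -> 'B has the right that A does C'."""
    return Right(holder=obligation.beneficiary,
                 against=obligation.bearer,
                 action=obligation.action)


def obligation_from_right(right):
    """The same correlation read in the opposite direction."""
    return Obligation(bearer=right.against,
                      beneficiary=right.holder,
                      action=right.action)


# Connections between concepts as (subject, relation, object) triples.
# The relation names are illustrative only.
CONCEPT_RELATIONS = [
    ("contract", "has", "party"),
    ("party", "bears", "obligation"),
    ("party", "holds", "right"),
    ("obligation", "correlates_with", "right"),
]


def related_concepts(concept):
    """Concepts directly linked from the given concept."""
    return sorted({obj for subj, _, obj in CONCEPT_RELATIONS if subj == concept})


duty = Obligation(bearer="A", beneficiary="B", action="deliver the goods")
claim = right_from_obligation(duty)
print(claim)
print(obligation_from_right(claim) == duty)
print(related_concepts("party"))
```

Because the correlation is stored as a rule rather than as text, a system given only the obligation can infer the corresponding right, which is the kind of inference the second function of legal ontologies refers to.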
XML and Legal Ontologies
- By using formal ontologies with XML markup, it is possible to link text fragments within legal documents to the concepts defined in the ontology, enhancing the document’s semantic understanding and processing.
Cryptography
The word cryptography comes from Greek and means “hidden writing”.
- Cryptography is the study of how to encode readable text into an unreadable format, ensuring that only authorized parties can decrypt and read it.
- It is used to maintain the confidentiality of documents, verify their integrity, and confirm the authenticity of digital identities and transactions.
Symmetric and asymmetric cryptographic systems
(1) Symmetric Cryptographic Systems
- These systems use the same key to both encrypt and decrypt messages.
- The key must be agreed upon and exchanged between the sender and recipient before the message is transmitted (one shared key).
- Example: Caesar’s Cipher is a symmetric system where each letter is shifted three positions forward in the alphabet.
(2) Asymmetric Cryptographic Systems
- These systems use two related keys: a public key, which is available to everyone, and a private key, which is kept secret by the recipient.
- A message encrypted with the public key can only be decrypted using the corresponding private key.
- A message encrypted with the private key can only be decrypted using the corresponding public key.
- Asymmetric cryptography forms the basis of technologies such as the digital signature.
Digital signature
A digital signature is based on asymmetric cryptography and is used to ensure the authenticity and integrity of digital documents.
- Digital signatures are linked to the signed document, becoming part of it, but are not physically attached to it.
Note: A hash is a short fixed-length value or key that represents the original document or message.
- It is generated by a mathematical function that takes the complete text message as input and produces as output a fixed-size value.
- The hash value always depends on the message: it is almost impossible for two different messages to produce the same hash value.
- The hash can be considered a sort of “abstract” representation of the document; even a very long document (e.g., hundreds of pages) will always generate a very short hash that is unique in practice.
- Two documents differing by just one character will generate two completely different hash codes.
A digital signature can be verified using a publicly known verification algorithm.
Process
Creating a Digital Signature
(1) Generate a Hash Code:
- The sender generates a hash code from the document using a hash function.
(2) Encrypt the Hash Code:
- The hash code is encrypted using the sender’s private key, producing the digital signature.
(3) Send the Document:
- The document and the encrypted hash are sent to the recipient.
Verifying a Digital Signature
(1) Receive the Document:
- The recipient receives the digitally signed document.
(2) Generate a New Hash Code:
- The recipient uses the hash function to generate a new hash code from the document.
(3) Decrypt the Encrypted Hash:
- The hash originally encrypted by the sender is decrypted by the recipient using the sender’s public key.
(4) Compare the Hashes:
- The recipient compares the two hashes. If they are equal, the signature is valid, and the document has not been changed. If the hashes are not equal, the document is either false or altered.
EU and Italian law on electronic signatures
In Italy, digital and electronic signatures are regulated by the “Digital Administration Code” (CAD). At EU level, the eIDAS Regulation defines three types of electronic signature:
1. Electronic Signature
2. Advanced Electronic Signature
3. Qualified Electronic Signature
The digital signature is a particular type of advanced electronic signature based on a qualified certificate and on a system of correlated cryptographic keys, one public and one private.
In the Italian legal system, all three types of signatures have the same legal value as a hand-written signature, except in a few cases related to real estate transactions, where a digital signature is required (art. 21, c.2, CAD).
Advanced electronic signature
(1) Uniquely Linked to the Signatory: the signature must be uniquely associated with the person signing.
(2) Capable of Identifying the Signatory: it must be possible to identify the signatory from the signature.
(3) Controlled by the Signatory: the signature must be created using electronic signature creation data that only the signatory can control.
(4) Linked to Data: it must be linked to the signed data in such a way that any subsequent change in the data is detectable.
Qualified electronic signature
A qualified electronic signature is an advanced electronic signature that meets additional requirements:
(5) It is created by a qualified electronic signature creation device.
(6) It is based on a qualified certificate for electronic signatures.
Cryptography and blockchain
Blockchain is an application of asymmetric cryptography that combines the following elements:
1. Asymmetric Cryptography
2. Peer-to-Peer Networking
3. Protocols for Digital Interactions Without a Trusted Third Party
This combination creates a system that facilitates secure digital relationships without the need for a trusted intermediary.
Applications of Blockchain Technology
1. Cryptocurrencies (e.g., Bitcoin): blockchain technology ensures the security of exchanges within cryptocurrencies like Bitcoin by using its decentralized, secure platform.
2. Smart Contracts: computer programs that automate and express the content and execution of a contractual agreement, allowing for self-executing contracts based on predefined terms.
Big data
Big Data refers to the dramatic increase in the quantity of data available in digital form, surpassing the data generated by humans.
Note: Reasons behind the increase in data leading to "big data":
- Increased availability of data
- Availability of communication power
- Increase in storage capacities
- Increase of processing power
Big data is commonly described by three characteristics:
1. Volume: Large quantities of data are generated.
2. Velocity: Data is produced and processed at high speeds.
3. Variety: Data comes in diverse formats, including structured data, text, images, and audio.
Big data is further characterized by the availability of communication power and the increase in storage capacities.
Potential and Challenges:
- Big data offers significant potential in fields such as healthcare, markets, and understanding human interactions.
- However, it also raises challenges concerning privacy, data protection, fairness, and the potential for discrimination.
Data Analysis: involves extracting useful information and value from data. The process typically includes:
1. Collecting data
2. Storing data
3. Analyzing data
4. Extracting valuable insights
Real-time processing has incentivized the collection of vast datasets, often supported by new tools utilizing Artificial Intelligence.
Algorithmic decision-making
Note: Data Mining applies machine learning algorithms to discover patterns in large datasets. It automates the identification of useful patterns, creating a “model” that enables tasks such as:
- Classifying entities or activities of interest;
- Estimating the value of unobserved variables;
- Predicting future outcomes.
Algorithmic Decision-Making: increasingly enabled by ICT systems, leveraging various approaches:
(1) Classical procedural software programs:
- Pro: Fast and integrated with databases
- Con: difficult to update and fix; no justification for the decision is provided by the system; difficulty anticipating individual outputs
- e.g.
used in Public Administration to deliver fines, provide benefits, calculate taxes
(2) Systems based on man-made rules: expert systems
- Pro: Easy to build and update; ensures consistency and speed
- Con: May be limited by engineers’ knowledge and inflexible to new rules; unable to provide a margin of discretion or appreciation in the decision; tendency to reduce users’ sense of responsibility
- e.g. supporting business activities and processes
(3) Machine learning systems:
- Pro: Easier to build and maintain; less dependent on engineers’ specific knowledge
- Con: Automatically derived rules can lack transparency and justification
Legal challenges of data mining and algorithmic decision making
Data fundamentalism: tendency to believe that the correlation assessed by the algorithm implies causality, and that the analysis carried out with data mining techniques on large sets of data always provides an objective view of reality.
Note: We are already living in an algorithmic society, one organized around social and economic decision making by algorithms, robots, and artificial intelligence agents, who make decisions and in some cases carry them out.
The challenges are:
1. Discrimination: Algorithms trained on biased historical data can perpetuate and amplify existing prejudices, resulting in unfair outcomes. Input tends to influence the output; if the input is biased, so will be the output.
2. Transparency and Accountability: Automated systems must be comprehensible to users, allowing for decision review and correction. Transparency allows citizens to understand how the system takes decisions.
3. Data Protection: Algorithmic decision-making necessitates the collection of personal data, posing risks to privacy and concerns about profiling.
Internet
Components of the Internet:
1. Physical Infrastructure: Includes optical fibers, telephone lines, radio bridges, Wi-Fi, and satellites that transmit data encoded as bits.
Note: ARPANET's goals:
1. Develop a network larger than existing ones.
2.
Optimize scarce resources, directing requests towards computers that were available at each time.
2. Transmission Management: Specialized computers govern the flow, addressing, and operation of data across the network.
3. Shared Protocols: The TCP/IP protocol suite underpins data transfer, with TCP handling transmission and reception, and IP handling the addressing of packets.
5. Devices: From mega-computers to smartphones, all connected devices form the backbone of the network.
6. Virtual Entities: Websites, games, and virtual environments represent digital spaces created by interconnected computing processes.
7. Users: Individuals and organizations utilize the Internet for various purposes.
8. Governance: Institutions oversee protocol development, address allocation, and coordinate global operations.
Origins of the Internet
1. 1960s Beginnings: Computer networks were limited to research centers, connecting expensive machines.
2. ARPANET Development:
- Initiated by ARPA with the vision of a “Galactic Network.”
- 1969: First packet-switching network message sent from UCLA to Stanford.
- Goals: Expand networks and enable resource sharing by routing tasks to available computers.
Internet architecture
Distributed Architecture:
- Each node processed information and maintained multiple connections, ensuring flexibility and resilience.
- Each node is capable of processing information, and is connected to other nodes with multiple direct and indirect links.
Packet-Switching:
- Messages were divided into packets for efficient transmission. This replaced the less flexible circuit-switching method.
- In a packet-switched network, the message to be transmitted from one node to another node is first broken into packets (= smaller pieces) > packets are routed across a network path that can be modified as needed during the transmission > packets reach the recipient node > packets are reassembled in the correct order to form the original message
- In order to reach the recipient, packets need to carry with them additional information concerning the recipient, the sender, and also the position of each single packet in relation to the original message
In a circuit-switched network (prior to ARPANET):
- Messages are sent without being split and can follow only a fixed network path > no way to reroute the transmission on a different path in case of issues along the line;
- If a line is busy with a message transmission, it cannot be used for any other task until the transmission is completed;
- If part of the transmission is lost or incorrect, the sender must retransmit the entire message (in packet-switching, it is possible to retransmit only the missing or incorrect packet)
TCP and IP: Foundations of Internet Communication
The TCP/IP protocol suite is the cornerstone of how devices communicate over the Internet. Each protocol plays a distinct yet complementary role:
(1) IP (Internet Protocol):
- Manages the addressing of packets across the network: it addresses packets
- Identifies devices using IP addresses, which in IPv4 are 32-bit binary numbers.
- Static IPs: Fixed addresses assigned to devices in a network.
- Dynamic IPs: Temporarily assigned by Internet Service Providers.
(2) TCP (Transmission Control Protocol):
- Ensures the reliable exchange of packets between devices: it ensures a reliable delivery of packets
- Maintains the correct order of data packets during transmission.
(3) How TCP and IP Work Together:
- The TCP/IP protocols are adopted by existing networks in order to connect to the Internet
- Data is divided into packets or datagrams, each with headers containing the sender’s and recipient’s IP addresses.
- IP routes packets to the intended destination.
- TCP ensures packets are sent, received, and reassembled accurately.
- Headers (IP and TCP) include critical information for proper delivery and assembly of the data.
Together, TCP ensures reliability, and IP provides the addressing and routing, forming the foundation for seamless Internet communication.
Internet Service Providers (ISP)
- Access to the Internet is provided to users by ISPs
- Several networks developed with the client-server architecture are interconnected by means of different hardware devices and of connection services provided by special computers, or groups of computers (routers)
Internet layers and protocol stack
Internet layers
(1) Application layer:
- The top layer where user data is created.
- Provides services to applications like web browsers, chat apps, and email clients.
- Examples: HTTP, FTP, SMTP, DNS.
- Adds application-specific headers to data.
(2) Transport layer:
- Ensures reliable communication between applications.
- Manages data transfer, error correction, and flow control.
- Protocols: TCP (reliable, ordered transmission) and UDP (faster, less reliable).
- Adds TCP/UDP headers with sequence and acknowledgment numbers.
(3) Network layer:
- Handles packet routing and addressing across networks.
- Uses IP for addressing and forwarding packets.
- Key feature: Unique IP addresses identify devices.
- Adds IP headers with source and destination IPs.
(4) Data link layer:
- Establishes a reliable link between devices on the same network.
- Adds MAC addresses and error-checking information.
- Splits data into frames for transmission over physical media.
(5) Physical layer:
- Deals with the actual transmission of raw bits over a medium (e.g., fiber optics, radio waves).
- Defines hardware standards, signaling, and physical connections.
Protocol stack
(1) Encapsulation: Each layer adds its own header (and sometimes footer) to the data as it passes through the stack. These headers contain the information needed for the respective layer’s function, ensuring smooth communication.
(2) Abstraction: Layers operate independently, focusing on their specific tasks without interacting with the internal data of other layers. This modular design allows flexibility and scalability in network operations.
(3) Interdependence: The la
