Bioinformatics Lecture Summary PDF
Document Details
Saurav Ranjan Barik
Summary
This document is a lecture summary on bioinformatics, covering the evolution of computers, healthcare informatics, and related topics. It also discusses the importance of health informatics and how data is used in healthcare settings.
Full Transcript
Lecture 1:
What is informatics? Informatics is the discipline focused on the acquisition, storage and use of information in a specific setting or domain. Hence “Medical informatics” -> “Health informatics”: Health informatics is the discipline concerned with the management of healthcare data and information through the application of computers and other information technologies.
What is “HIT”? Health information technology (HIT or health IT) is defined as the application of computers and technology in healthcare settings.
Numbering systems: Binary, Decimal, Octal, Hexadecimal (a worked conversion sketch follows at the end of this lecture's notes).
Ancient numbering systems: Roman numerals, Hebrew numerals, Indian numerals, Greek numerals, Phoenician numerals, Chinese rod numerals, Ge'ez numerals, Armenian numerals, Khmer numerals, Thai numerals, Abjad numerals, Eastern Arabic numerals, Western Arabic numerals.
Evolution of computers: digital calculators. First generation computers: the electron tube; the first “triode” was invented in 1906 by Lee de Forest. Period: 1940-1954. 1946, USA - the ENIAC (Electronic Numerical Integrator and Computer). Second generation computers: period 1955-1965; transistor-based technology, the transistor having been invented in 1948. Third generation computers: microprocessors. Period: 1970 - present day. High integration technologies (LSI, VLSI - very large scale integration); high emphasis on software development.
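The numbering systems listed above differ only in base: the same value has a different digit string in each. A minimal Python sketch (illustrative only, not part of the lecture) showing one value in all four bases:

    value = 157
    print(bin(value))     # '0b10011101'  (binary, base 2)
    print(oct(value))     # '0o235'       (octal, base 8)
    print(value)          # 157           (decimal, base 10)
    print(hex(value))     # '0x9d'        (hexadecimal, base 16)
    print(int("9d", 16))  # parse hexadecimal back to decimal: 157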
Lecture 2:
History: Computers: four generations, starting from 1946. 1949 - German Society for Medical Documentation, Computer Science and Statistics. 1960 - the term “Informatique Médicale” appears in France. 1960s - MEDLINE and MEDLARS were created to organise the world's medical literature. 1970s - artificial intelligence (AI) projects in medicine. 1970s - the Internet. 1990s - the WWW (World Wide Web). 1970s - the EHR; 1991 - formally recommended in the USA. 1996 - mobile technology. 2003 - Human Genome Project completed.
Motivating factors for HIT adoption: Increase the efficiency of healthcare (improve productivity). Improve the quality (patient outcomes) of healthcare. Reduce healthcare costs. Improve healthcare access (with technologies such as telemedicine). Improve communication, coordination and continuity of care. Improve medical education for clinicians and patients. Standardisation of medical care. The natural diffusion of technology also exerts an influence: Wi-Fi, mobile technologies, voice recognition, digital imaging, wearable devices, 3D printing.
Key users of HIT: Public health (government agencies). Purchaser (employer, government). Payor (insurance company, government). Provider (doctors, nurses, clerks, admin, management, etc.). Patients and relatives.
Barriers to HIT adoption: Inadequate time ◦Busy clinicians complain that they don't have enough time to read or learn about new technologies. ◦2016 - “for every hour physicians provide direct clinical face time to patients, nearly 2 additional hours is spent on EHR and desk work within the clinic day”. Inadequate information ◦Clinicians need information, not data. ◦Current HIT systems are data rich, but information poor. Inadequate expertise and workforce ◦Widespread HIT adoption and implementation will require education of all healthcare workers. ◦Educational offerings will need to be expanded at universities and colleges. ◦HIT vendors are looking for applicants with both IT and clinical experience, in addition to good people skills and project management experience. Inadequate cost and return-on-investment data ◦An often-cited barrier is the mismatch between the costs and benefits of HIT. ◦The clinicians bear the costs (and do the extra work), whereas hospitals, insurers and government reap the benefits. Behavioural change - the technology adoption life cycle ◦Clinicians will dread widespread implementation of anything new unless they feel certain it will make their lives or the lives of their patients better. ◦Selecting clinical champions and conducting intensive training are critical to implementation success.
Future trends: More patient-centric medical care and associated technologies. Mobile technologies will continue to be an important medical platform for patients and clinicians. Expect more AI in medicine (AIM) to retrospectively and prospectively interpret medical data. Precision medicine - integration and analysis of a variety of phenotypical and genotypical information. Interoperability issues - APIs and new data standards.
Types of computers:
Supercomputer: used for a wide range of computationally intensive tasks in various fields ◦Quantum mechanics ◦Weather forecasting ◦Climate research ◦Oil and gas exploration ◦Molecular modelling ◦Physical simulations ◦Cryptanalysis ◦Etc.
Mainframe computers: Very large size. High level of reliability. Processing power and storage. Parallel processing is a key feature. Capability to host multiple operating systems. Many users connected at the same time.
Server: Serves many users. Handles many transactions across a network. “Server farms”.
Desktop PC: “Wintel” - mostly Intel-based processors running the Windows operating system; Apple Macintosh computers running Apple's macOS.
Laptop PC: Sufficient processing and storage capabilities for day-to-day use. High portability.
Mobile devices - tablets: Lower processing and storage capabilities than laptops. Ultra-high portability. Apple iOS, Google Android. “mHealth”. A big challenge is to keep mobile computers easy to use and to create innovative applications with them, while securing their data and networks.
Mobile devices - smartphones: Handheld computers. Ultra-high portability. Connected to cellular wireless networks. Large impact in developing countries, where wireless networks are being deployed, bypassing wired computer networks.
Mobile and wearable devices and gadgets: Collect data from sensors. Usually connected to a smartphone.
Moore's Law: the observation that the number of transistors in a dense integrated circuit doubles about every 2 years.
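A one-line consequence of the doubling rule above, sketched in Python (the starting count is an assumption for illustration, not a figure from the lecture): doubling every 2 years multiplies a count by 2^(years/2).

    start_count = 100_000_000                    # assumed transistor count in year 0
    years = 20
    projected = start_count * 2 ** (years / 2)   # ten doublings over 20 years
    print(f"{projected:,.0f}")                   # 102,400,000,000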
Mitigating factors: the increase in computing power is counterbalanced by an increase in functionality, resulting in a slower increase in the overall computing experience. E.g. GUIs (graphical user interfaces); “code bloat” - programmers have so much power at their disposal that they have less incentive to be efficient in the programs that they write; “feature creep” - software producers want to add more features to compete with their rivals, even if some of those features are minimally used.

Lecture 3:
Hardware architecture of computers: Hardware = physical. Software = non-physical. Components: CPU (Central Processing Unit). Cooler.
Memory: ◦Short-term memory: RAM (Random Access Memory) ‣ Used for intermediate tasks and data processing. ‣ Allows random access for fast operation, hence the name “Random Access Memory”. ‣ Data in RAM is volatile, disappearing when the device turns off. ‣ RAM speed is measured by latency, the time it takes to retrieve or modify data. ◦Long-term memory - stores data permanently and exists in three main forms: ‣ Magnetic storage: stores data as magnetic patterns on spinning disks (hard drives). Advantages: cost-effective and large capacity. Disadvantages: high latency and sensitivity to heat. ‣ Optical storage: encodes data as light/dark dots on reflective discs (e.g. DVD, Blu-ray). Advantages: inexpensive and removable. Disadvantages: slow latency and limited storage capacity. ‣ Solid state drives (SSD): use floating-gate transistors to trap/remove electrical charges. Advantages: fastest and most durable, with no moving parts. Disadvantages: expensive and vulnerable to degradation from repeated writes. ◦“Work” memory: RAM. ◦“Storage” memory: HDDs (hard disk drives) or SSDs (solid state drives).
Other components: CD/DVD unit (no longer present in most modern desktop and portable computers). Motherboard. Graphics card. Power source.
What is software? Instructions that “tell” the computer what to do. Enables the user to instruct the computer to do specific tasks. AKA “computer programs”. Computer programs consist of algorithms (the instructions on what to do) and data structures (the data, in a structured form, on which the algorithms can operate).
The operating system (OS): System software that manages computer hardware and software resources and provides common services for computer programs. The OS acts as an intermediary between programs and the computer hardware, for functions such as input and output and memory allocation. Users can interact directly with the OS through a user interface such as a command line or a graphical user interface (GUI). Definition: the master program managing the interaction between software and hardware. Basic functions of an operating system: ◦Program installer - loads new software into the computer's memory. ◦Resource management - decides when a program runs and how it accesses hardware. ◦Device control - manages input and output devices for software. ◦Multi-tasking - shares CPU time by quickly switching between programs. How it works: ◦It works by using software commands in binary (1s and 0s). ◦It translates electrical signals to control hardware.
What is an “API”? APPLICATION PROGRAMMING INTERFACE. ◦Computer programs interact with each other via APIs. ◦Data structures + instructions.
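The “data structures + instructions” formula above can be made concrete. A minimal Python sketch (the record shape and function name are invented for illustration, not from the lecture): one program exposes an agreed data structure and an operation on it, and another program calls it without knowing its internals.

    # the "API": an agreed data structure plus an instruction that operates on it
    def classify_bp(reading: dict) -> str:
        """Return a crude blood-pressure category for a reading
        shaped like {"systolic": int, "diastolic": int}."""
        if reading["systolic"] >= 140 or reading["diastolic"] >= 90:
            return "high"
        return "normal"

    # a calling program only needs the agreed structure and the function name
    print(classify_bp({"systolic": 150, "diastolic": 85}))  # 'high'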
What is OS virtualisation? The use of software to allow system hardware to run multiple instances of different operating systems concurrently, called “virtual machines”. It allows the user to run different applications requiring different operating systems on one computer.
Types of programming languages:
1. Low-level languages: A. Machine code: a. Binary code directly understood by computers. b. Hardware-specific and hard for humans to understand. B. Assembly language: a. Uses symbols and letters (e.g. ADD, SUB) for improved readability. b. Requires a translation program (an assembler) to convert it to machine language. c. Used in applications that require optimisation or precise timing.
2. High-level languages: A. Closer to human language (e.g. English and mathematical symbols). B. Examples: C++, Python, Java. C. Advantages: a. Easier to learn and use (logical and human-readable). b. Portable across multiple hardware systems (not hardware-specific). D. Disadvantages: a. Must be translated into machine code.
Programming and execution languages - translation methods:
Compiler: ◦Converts the entire program to machine code before execution. ◦Produces reusable compiled files (e.g. .exe). ◦Faster execution, because the translation happens only once. ◦Keeps the source code hidden, for your own protection. ◦Examples: C, C++.
Interpreter: ◦Translates and executes the code line by line. ◦Allows rapid testing and flexibility. ◦Does not produce reusable compiled files. ◦Examples: Python, JavaScript.
Open source vs. proprietary software: Proprietary - cost of acquisition; maintained and updated by the producer. Open source - free, but still needs resources to manage and use.

Lecture 4:
What is a network? A computer network or data network is a digital telecommunications network which allows nodes to share resources. In computer networks, networked computing devices exchange data with each other using a data link. The connections between nodes are established using either cable media or wireless media.
Main characteristics of a computer network: Connectivity - a node can communicate with any other node. Reliability - data transfer without data loss. Scalability - the network can grow by adding new nodes. Modularity - compatibility between the different pieces of hardware included in the network.
Network hardware vs. network services: Hardware: PCs, servers, routers, switches, cables, fibre optics, wires, radio transmitters, etc. Services: e-mail, file sharing, instant messaging, video streaming, audio streaming, searching, etc.
PAN vs. LAN vs. WAN: Wide Area Networks (WANs) cross city, state or national borders. The internet could be considered a WAN, and is often used to connect LANs together. Virtual Private Networks (VPNs): a VPN extends a private network across a public network and enables users to send and receive data across shared or public networks as if their computing devices were directly connected to the private network. Authentication and overall security are key elements of setting up remote access to someone else's computer network. LAN vs. WAN: LAN + services = intranetwork; WAN + services = internetwork.
The Internet - what is it? Who invented the internet? Vinton Gray Cerf (born June 23, 1943) is an American internet pioneer, recognised as one of “the fathers of the internet”, sharing this title with TCP/IP co-inventor Bob Kahn. Robert Elliot Kahn (born December 23, 1938) is an American electrical engineer who, along with Vint Cerf, invented the Transmission Control Protocol (TCP) and the Internet Protocol (IP).
Bandwidth: transmission capacity, measured by bitrate. Bitrate: the number of bits per second a system can transmit. Latency: the time it takes for a bit to travel from sender to receiver.
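Putting the three definitions above together: the time to move a file is roughly its size divided by the bitrate, plus the latency. A back-of-the-envelope Python sketch (the file size and link figures are assumptions for illustration):

    file_bits = 25 * 8 * 1_000_000   # a 25 MB file, expressed in bits
    bitrate = 100_000_000            # a 100 Mbit/s link
    latency = 0.020                  # 20 ms one-way latency, in seconds
    transfer_time = file_bits / bitrate + latency
    print(f"{transfer_time:.3f} s")  # 200 Mbit / 100 Mbit/s + 0.020 = 2.020 s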
Wi-Fi uses radio waves of different frequencies to transmit data. The internet is a design philosophy; an address on the internet is just a number that's unique to each device on the network. The internet runs on protocols: IP = Internet Protocol.
How does information travel over the internet? Information travels over the internet by being broken into small pieces called packets. These packets are sent from your device to the destination using a network of routers and switches. Each packet may take a different route, but they are reassembled in the correct order when they reach their destination. This process ensures efficient and reliable delivery of data.
What does DNS stand for and how does it work? DNS (Domain Name System) is a system that translates human-readable domain names (like www.example.com) into IP addresses (like 192.168.1.1) that computers use to identify each other on the Internet. It acts like a phonebook for the web. How DNS works (summary): 1. User request: when you type a URL into your browser, your device sends a request to resolve the domain name into an IP address. 2. DNS resolver: this request is sent to a DNS resolver (usually provided by your ISP or a public DNS service like Google DNS). The resolver acts as a middleman. 3. Search for the IP: Cache check: the resolver first checks its cache to see if it already knows the IP address. Recursive query: if not in the cache, the resolver queries a sequence of DNS servers: ◦Root servers: direct the resolver to the correct top-level domain (TLD) server (e.g. .com, .org). ◦TLD servers: point the resolver to the authoritative DNS server for the specific domain. ◦Authoritative server: provides the IP address for the domain. 4. Response: the resolver returns the IP address to your device. 5. Connection: your browser uses the IP address to connect to the server hosting the website. 6. Caching: the IP address is cached by your device and the resolver for faster future access. (A minimal lookup-and-request sketch follows at the end of this passage.)
What is DNS spoofing? DNS spoofing (or DNS cache poisoning) is a cyberattack where fake DNS information is inserted into a DNS server's cache. This tricks users into being redirected to malicious websites instead of the legitimate ones they intended to visit, often to steal sensitive data like passwords or financial information.
What is a router? A router is a device that connects different networks and directs data (like emails or web pages) between them, ensuring it reaches the right destination.
What is TCP? TCP (Transmission Control Protocol) is a communication method that ensures data is sent and received reliably over the internet by breaking it into packets, confirming delivery, and reassembling it at the destination.
What is a web browser? A web browser is software that allows you to access and interact with websites on the internet. It retrieves web pages, displays text, images, and videos, and lets you navigate between sites. Examples include Google Chrome, Firefox, and Safari.
What is a URL (Uniform Resource Locator)? A URL is the address used to access a specific resource on the internet, like a webpage, image, or file. For example, https://www.example.com is a URL. It tells the browser where to find the resource.
What is HTTP (Hypertext Transfer Protocol)? HTTP is a system that allows computers, like your web browser and a website's server, to communicate and exchange information. It works by sending requests (e.g. to load a webpage) from your browser to the server, and the server responds with the requested data, such as text, images, or videos.
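The DNS and HTTP steps described above can be reproduced with Python's standard library. A minimal sketch (example.com is just a stand-in host):

    import socket
    import urllib.request

    # DNS: resolve a name to an IP address (what the resolver chain returns)
    ip = socket.gethostbyname("example.com")
    print(ip)

    # HTTP(S): request the page from the server the name resolved to
    with urllib.request.urlopen("https://example.com") as response:
        print(response.status)       # e.g. 200
        body = response.read()       # the HTML the server returned
        print(len(body), "bytes")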
What is HTML (HyperText Markup Language)? HTML is the standard language used to create and structure content on the web. It uses tags to define elements like headings, paragraphs, links, images, and more, which browsers interpret to display a webpage. In short, HTML is the building block of web pages.
What is a cookie? A cookie is a small piece of data that a website stores on your device when you visit it. Cookies help websites remember things about you, like your preferences, login information, or items in a shopping cart, so they can provide a more personalised experience.
What is SSL? SSL (Secure Sockets Layer) is a technology that encrypts the connection between your web browser and a website. It ensures that the data exchanged, like passwords or credit card details, is secure and cannot be read by hackers. SSL has now largely been replaced by its updated version, TLS (Transport Layer Security), but the term “SSL” is still commonly used. Websites using SSL/TLS have “https://” in their URL and often show a padlock icon in the browser.
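What the padlock represents can be inspected directly. A short sketch using Python's standard ssl module (again with example.com as a stand-in host):

    import socket
    import ssl

    context = ssl.create_default_context()  # verifies certificates by default
    with socket.create_connection(("example.com", 443)) as sock:
        with context.wrap_socket(sock, server_hostname="example.com") as tls:
            print(tls.version())                 # negotiated protocol, e.g. 'TLSv1.3'
            print(tls.getpeercert()["subject"])  # who the certificate was issued to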
Lecture 5:
World Wide Web (Web): ◦A system for accessing information stored on the internet, used for everyday activities such as checking the weather, socialising, shopping, etc. ◦It is accessible from devices such as computers, mobile phones, or even cars.
The distinction between the Web and the Internet: ◦Internet: a global network that connects computers to exchange information. ◦Web: the most common use of the internet, representing a collection of interconnected servers and websites.
What is a web host? A web host is a service that stores websites and makes them available on the internet. What is a web server? A web server is a computer or software that stores website files and delivers them to users over the internet when they request them through a browser: a computer always connected to the Internet, specifically designed to store information and share it.
1. Web server: A. Computers permanently connected to the Internet, designed to store and distribute information. 2. Web hosting: A. Companies that provide server space for websites. B. Anyone can set up a web server with the necessary equipment and knowledge. 3. Internet Service Provider (ISP): A. A company that provides users with access to the internet. ISPs are essential for connecting personal devices, such as computers, phones or tablets, to the global Internet network. 4. Web registrar: A. An accredited company that sells and manages domain name registrations on the Internet. These providers allow users to reserve a unique domain name, such as www.example.com, and associate it with a website.
What are data centres? Data centres are facilities that house many computers and servers used to store, manage, and process large amounts of data for businesses, websites, and online services.
How is information organised in the WWW? Information on the World Wide Web (WWW) is organised using websites, which are made up of web pages connected by links (hyperlinks). These pages are stored on servers and can be accessed through unique addresses (URLs).
Who invented the WWW? Sir Timothy John Berners-Lee (born 8 June 1955) is an English engineer and computer scientist, currently a professor of computer science at the University of Oxford, UK.
The “Semantic Web”: Find and interpret the data, or create a common framework for data sharing. Data will need to be tagged with metadata tags (data that describes data), known as “linked data”. The World Wide Web Consortium (W3C) has promoted the notion of the Resource Description Framework (RDF) as the means to describe documents and images. Sir Timothy Berners-Lee now promotes the concept of linked data as part of RDF.
Future trends - the Internet of Things (IoT): IoT - a huge network of devices connected through the Internet. Many “smart” devices available to the average consumer/patient can be connected to the Internet. “Smart homes” and other appliances have a variety of sensors that could all be connected or linked using the Internet. Sensor data: activity sensors, accelerometers, medical sensors like EEG, EMG, ECG, BP, temperature, etc. Health IoT: usually consists of sensors, a handheld device and a network server. This infrastructure could be shared using an API. The possibility of a connected home, hospital or city. Would generate huge amounts of data. Potential for AI and analytical algorithms. Risks: the security of the network and of the generated data.
Data vs. Information vs. Knowledge vs. Wisdom:
Data: Data are symbols representing observations about the world. There is no meaning associated with data (‘5’ could represent five fingers, five minutes, or have no real meaning at all). Computers store, manage, process and transmit data accurately and rapidly.
Information: Information is meaningful data, or facts from which conclusions can be drawn by humans or computers. For example, five fingers has meaning in that it is the number of fingers on a normal human hand.
Knowledge: Knowledge is information that is justifiably considered to be true. For example, an elevated fasting blood sugar level suggests an increased likelihood of diabetes mellitus. “Knowledge systems”.
Wisdom: Wisdom is the critical use of knowledge to make intelligent decisions and to work through situations of signal versus noise. For example, a rising blood sugar can indicate diabetes but also other, secondary causes of hyperglycaemia.
Data, Information, Knowledge and Wisdom hierarchy: Not all data is meaningful; thus there is more data than information, knowledge or wisdom produced. Health Information Technology (HIT) provides the tools to generate information from data, which humans (clinicians and researchers) can turn into knowledge and wisdom.

Lecture 6:
Representation of DATA in computers: The conceptual model contains only the parts of the physical world that are relevant to the computation. ◦Everything that is not in the conceptual model is excluded from the computation and assumed to be irrelevant. The computational model contains variables that characterise the system being studied. The contents of the computational model can be manipulated using formal methods. Formal methods are methods that manipulate data using systematic rules that depend only on form, not content (meaning). ◦Thus only a human can ensure that the input and output of a formal method (e.g. a computer program) correctly capture and preserve meaning. If the formal method does not violate the rules of the physical world, one can apply the method to solve problems in the real world. When the real world, the conceptual model and the computational model do not match, for example when a critical constraint was left out of the conceptual model, the answers obtained from the computer are not useful.
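A tiny Python illustration of "form, not content" (the numbers and their supposed readings are invented): a formal operation behaves identically whether the values mean anything or not; only the human supplies the meaning.

    fasting_glucose = [5.2, 7.9, 6.1]   # could be mmol/L readings...
    lottery_draws = [5.2, 7.9, 6.1]     # ...or meaningless numbers
    print(sorted(fasting_glucose))      # the program cannot tell the difference:
    print(sorted(lottery_draws))        # [5.2, 6.1, 7.9] both times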
Computers DO NOT represent meaning. They input, store, process and output zeros (off) and ones (on). These bits have NO intrinsic meaning; they can represent anything or nothing at all (e.g. random sequences).
Data types in computers: Bits within computers are aggregated into a variety of data types: ◦Integers, such as 32767, 15 and -20. ◦Floating-point numbers (or floats), such as 3.14159, -12.014 and 14.01. ◦Characters, such as “a” and “z”. ◦(Character) strings, such as “hello” or “ball”. These data types DO NOT define meaning. A computer does not “know” whether 3.14159 is a random number or the ratio of the circumference of a circle to its diameter (known as Pi). Data can be aggregated into a variety of file formats; these formats specify the way data are organised within the file.
File formats in computers: File extension: a string of characters attached to a filename, usually preceded by a full stop and indicating the format of the file. File extensions are used by the operating system to identify which apps are associated with which file types: in other words, which app opens when you double-click the file. Neither data types nor file formats define the MEANING of the data, except for the purpose of storage or display on a computer. For example, photographs can be stored in JPEG files, yet nothing about the file format helps us recognise the subject of the photograph.
Information Technology (IT) and Computer Science: concentrate on technology, including computing systems composed of hardware and software, as well as the algorithms implemented in such systems. ◦E.g. developing algorithms to search or sort data more efficiently (what is being sorted or searched is irrelevant).
Informatics: addresses information and knowledge. It studies the representation, processing, and communication of information in natural and engineered systems; it has computational, cognitive and social aspects. The central notion is the transformation of information. Informatics deals with the MEANING of data. To an informatician, computers are tools for manipulating information. “Information retrieval” vs. “data retrieval” ◦E.g. finding documents that describe the relationship between aspirin and heart attack (myocardial infarction).
Quantification of data: Can we measure the capacity of a data storage device? Can we measure the capacity of a data transfer channel? Solution: we measure the quantity of data solely based on the properties of the system of representation, without taking into account the content (meaning) of the data. The elementary unit of data is one bit (b). A group of 8 bits = 1 byte (B). 1000 bytes strung together make a kilobyte (kB).
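A small Python sketch of the data types and units just described (the values are arbitrary):

    an_integer = 32767
    a_float = 3.14159            # nothing marks this as Pi; the meaning is ours
    a_character = "a"
    a_string = "hello"

    text = a_string.encode("utf-8")
    print(len(text), "bytes =", len(text) * 8, "bits")        # 5 bytes = 40 bits
    print(an_integer.bit_length(), "bits needed for 32767")   # 15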
Converting data to information to knowledge: We live in the real world, which contains physical objects (e.g. an aspirin tablet), people (e.g. John Smith took an aspirin tablet) and other concepts. To do useful computation in this context, one must segregate some part of the physical world and create models.
Data to information: Goal: to give “meaning” to data. Usually accomplished by the use of dictionaries (ICD-10-CM, SNOMED-CT) or ontologies. When data is transmitted across different platforms, it is of critical importance that the meaning is preserved. Interoperability requires consistency of interpretation in the context of a particular task or set of tasks.
Information to knowledge: In the clinical world, most available knowledge is best described as justified (i.e. evidence exists that it is true), rather than proven fact (i.e. it must be true). This is an important distinction from traditional hard sciences such as physics or mathematics. Clinical data warehouses (CDWs) are often the basis for attempts to turn clinical information into knowledge.
Clinical data warehouses (CDWs): a database system that collects, integrates and stores clinical data from a variety of sources, including electronic health records, radiology and other information systems. Structured and unstructured data. ETL = Extract, Transform and Load. Meta-data: data that describes other data (e.g. the notation that a data item is an ICD-10-CM term). CDWs are not updated in real time; daily or weekly updates are typical. Designed to support analytics: ◦Simple analytics - summary statistics such as counts, means, medians and standard deviations. ◦More sophisticated analytics include associations (e.g. does A co-occur with B?) and similarity determinations (e.g. is A similar to B?). Designed to support queries about groups of patients: ◦EHRs are designed for efficient real-time updating and retrieval of individual data. ◦CDWs are designed to support queries about groups of patients (e.g. ‘retrieve all women who are 40 years old or older who have not had a mammogram in the past year’). ◦CDWs are also used to identify trends in the data (e.g. ‘did screening mammograms detect breast cancer at an early stage?’).
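A minimal sketch of the example cohort query above, using the third-party pandas library over a toy extract (the table and column names are assumptions for illustration, not a real CDW schema):

    import pandas as pd

    # toy CDW extract
    patients = pd.DataFrame({
        "patient_id": [1, 2, 3, 4],
        "sex": ["F", "F", "M", "F"],
        "age": [52, 44, 63, 38],
        "last_mammogram": pd.to_datetime(["2024-01-10", None, None, None]),
    })

    cutoff = pd.Timestamp.today() - pd.DateOffset(years=1)
    cohort = patients[
        (patients["sex"] == "F")
        & (patients["age"] >= 40)
        & (patients["last_mammogram"].isna() | (patients["last_mammogram"] < cutoff))
    ]
    print(cohort["patient_id"].tolist())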
Information to knowledge - methods: Meta-data. NLP (Natural Language Processing). Analytics. Concept extraction. Classification. AI (Artificial Intelligence). Cognitive computing. Etc.
Future trends: Fundamental challenge: there is a mismatch between what HIT can represent (data) and the concepts relevant to health care (data + meaning). HIT must augment human cognition and abilities. Recognise the complementary strengths of humans and computers (humans are good at constructing and processing meaning; computers, at data). Open challenge: defining scenarios in which HIT is beneficial, with all relevant parameters, and demonstrating that using HIT is reliably beneficial in these scenarios.

Lecture 7:
How does search work? Does a search engine look for the results over the Web in real time? No, search engines retrieve results from their pre-built index, not directly from the web in real time.
What is a ‘spider’? A spider, also known as a web crawler, is a program used by search engines to automatically browse the internet, discover web pages, and collect information to index them for search results.
What is a search index? What does it contain? A search index is a database used by search engines to store and organise information from web pages. It contains keywords, content, metadata, and links to help quickly retrieve relevant results for a search query.
What is the main challenge of a search engine in delivering results? The main challenge is delivering accurate and relevant results from vast amounts of data while filtering out spam and low-quality content.
What is a ranking algorithm? A ranking algorithm is a set of rules used by search engines to determine the order of search results based on relevance, quality, and other factors.
What other information do search engines use to deliver results, apart from what is explicitly provided in a search? Search engines use factors like user location, search history, device type, and trending topics to deliver more relevant results.
What is “machine learning”? Machine learning is a type of artificial intelligence that enables computers to learn from data and improve performance without being explicitly programmed.
Is the search engine responsible for the reliability of the information returned in the results? No, a search engine is not directly responsible for the reliability of the information; it only indexes and ranks content from the web. Users must evaluate the credibility of sources.
Information retrieval (IR) tends to focus on knowledge-based information, which is information based on scientific research, in distinction to patient-specific information that is generated in the care of patients. Knowledge-based information is typically subdivided into two categories: Primary knowledge-based information (aka primary literature) = original research that appears in journals, books, reports, and other sources. Secondary knowledge-based information consists of the writings that review, condense, and/or synthesise the primary literature.
Approaches used to index knowledge-based content: 1. Manual indexing - human indexers, usually using a controlled terminology, assign indexing terms and attributes to documents, often following a specific protocol. 2. Automated indexing - computers make the indexing assignments, usually limited to breaking out each word in the document (or part of the document) as an indexing term.
Controlled terminologies: A controlled terminology contains a set of terms that can be applied to a task, such as indexing. When the terminology defines the terms, it is usually called a vocabulary. When it contains variants or synonyms of terms, it is also called a thesaurus. A thesaurus contains relationships between terms, which typically fall into three categories: Hierarchical - terms that are broader or narrower. Synonym - terms that are synonyms, allowing the indexer or searcher to express a concept in different words. Related - terms that are not synonymous or hierarchical but are somehow otherwise related.
Future trends: Research is taking place in several areas related to IR, including: Information extraction and text mining - usually through the use of NLP. Summarisation - providing automated extracts or abstracts summarising the content of longer documents. Question answering - going beyond retrieval of documents to providing actual answers to questions (e.g. IBM's Watson, medical chatbots, etc.).
Google search operators: I. Basic search operators. II. Advanced search operators.
Basic search operators: Using a wildcard = *. Searching for a phrase = “ ”. Examples: ◦child behaviour ◦child* behaviour ◦“child* behaviour”. Boolean operators: ◦AND ◦OR ◦NOT. Pseudo-boolean operators: ◦+ ◦-. Grouping the terms of a search: ( ). Examples: ◦“Da vinci” -painting -restaurant ◦-“Da vinci” +(surgery* OR robot*).
Advanced search operators: ◦site: ◦inurl: ◦intitle: , allintitle: ◦inanchor: , allinanchor: ◦intext: , allintext: ◦filetype: ◦link:
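A few combined queries illustrating how the operators above compose (illustrative only; exact syntax support varies by engine and over time):

    "health informatics" AND (EHR OR "electronic health record")
    site:who.int intitle:"patient safety" filetype:pdf
    telemedicine -veterinary intext:"randomised controlled trial"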
Lecture 8:
What is ENCRYPTION? Encryption is the process of converting data into a secure code to protect it from unauthorised access.
What is DECRYPTION? Decryption is the process of converting encrypted data back into its original form to make it readable.
What is an encryption key? An encryption key is a string of characters used to encode or decode data, ensuring secure communication of data. ◦What is a 10-digit key? ‣ A 10-digit key is a security code consisting of 10 numerical digits, used for encryption, authentication, or access control. ◦What is the length of the encryption keys regularly used today? ‣ Encryption key lengths today typically range from 128 to 4096 bits, depending on the method used. AES (Advanced Encryption Standard) uses 128-256 bits, RSA (Rivest-Shamir-Adleman) uses 2048-4096 bits, and ECC (Elliptic Curve Cryptography) uses 256-521 bits.
What is a PUBLIC key? What is a PRIVATE key? A PUBLIC key is a key that is shared openly and used to encrypt data or verify digital signatures. A PRIVATE key is a secret key known only to the owner, used to decrypt data or create digital signatures. Together, they form a key pair used in encryption and security systems like SSL/TLS and cryptocurrencies.
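A minimal key-pair sketch using the third-party Python cryptography package (the plaintext is an invented example): data encrypted with the shared public key can only be read with the owner's private key.

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()   # safe to share openly

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    ciphertext = public_key.encrypt(b"lab result: HbA1c 6.8%", oaep)
    print(private_key.decrypt(ciphertext, oaep))  # only the private key can read it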
Cyber security: Cyber security strategy: 1. Network VPN hardware layer (network and entry points). 2. Firewall (guarding access). 3. Employee security (awareness training). 4. Internal defence (on the servers and workstations). 5. Compliance (regulatory: HIPAA, PCI, SCADA, etc.). 6. Forensics (real-time situational awareness).
What do CYBER CRIMINALS take advantage of? Cybercriminals take advantage of weak passwords, outdated software, phishing attacks, unsecured networks, human errors, and system vulnerabilities to steal data, commit fraud, or disrupt services.
What is a VIRUS? How do we “get” a VIRUS? What harm can a VIRUS do? A virus is a type of malicious software that spreads by attaching itself to files or programs and can harm or disrupt a system. You can get a virus by downloading infected files, opening malicious email attachments, visiting unsafe websites, or using infected USB drives. A virus can steal data, slow down your device, corrupt files, crash systems, and spread to other devices, causing security risks and financial loss.
What is a DDoS (Distributed Denial of Service) attack? A DDoS attack overloads a website or network with too much traffic, making it slow or completely unavailable. Cybercriminals use many infected devices to carry out the attack. Can we unwillingly participate in a DDoS? Yes: if your device is infected with malware, hackers can use it as part of a botnet to launch a DDoS attack without your knowledge.
What is a PHISHING scam? A phishing scam is a trick where cybercriminals pretend to be a trusted source (like a bank or company) to steal personal information, usually through fake emails, messages, or websites.
What is the most probable cause of a computer being hacked? The most probable cause of a computer being hacked is weak or stolen passwords.
What can we do to protect ourselves in an online world? Use strong, unique passwords, enable two-factor authentication, keep software updated, avoid suspicious links, and be cautious with personal information.
Basic security principles in healthcare: Electronic health records + personal health records + health information exchanges = monumental challenges in the coordination of protection for that data. Data classification policy: a data classification program should be broken down by security objective and by the potential impact if that data is compromised. Security objectives: Confidentiality, Integrity, and Availability. ◦Confidentiality - prevention of data loss. Encryption, access control, and secure authentication are typical controls used to protect confidentiality. ◦Availability refers to system and network accessibility, and often focuses on power loss or network connectivity outages. ◦Integrity describes the trustworthiness and permanence of data: an assurance that the lab results or personal medical history of a patient are not modifiable by unauthorised entities or corrupted by a poorly designed process. Database best practices, data loss solutions, and data backup and archival tools are implemented to prevent data manipulation, corruption, or loss. Organisations also use audit logging for access to systems and databases as a means of protecting the integrity of the data.
Defence in depth: A set of policies and procedures around data privacy and security: New employees are given a criminal background check. Any new employee is given a badge with their photo and the ability to access only some areas. Each employee has their own user name and password to access the entity's electronic resources. Access is given only to the systems that the employee needs to access (“minimum necessary”). The computer that the employee uses is encrypted, has patches automatically pushed to the device, and runs anti-virus software. The email system the employee accesses has anti-virus installed on the server, has patches updated by the technology team, and is encrypted. The network that the email system and computers operate on is located behind a firewall, an intrusion prevention system, and switches and routers which employ access control lists (ACLs) to limit the types of traffic allowed on the network.
◦Threat actors: ‣ Insiders ‣ Hacktivists ‣ Organised crime ‣ Nation states (or hackers sponsored by nation states). ◦Types of attacks - social engineering: ‣ Phishing (email) ‣ Shoulder surfing ‣ Tailgating ‣ The promise of free hardware. ◦Types of attacks - other types: ‣ Denial of Service (DoS) / DDoS ‣ Brute-force attack ‣ Doxing.
Tools used to protect healthcare privacy and security: Client protection: patching, anti-virus, encryption. Application and database protection: strong authentication (password, token, biometric), privileged account management (PAM), backup and continuity of operations, vulnerability analysis and penetration testing. Border defences: firewalls, intrusion detection, web filtering, virtual local area networks (VLANs), encryption in transit. Administrative controls: policy, contracting. Physical controls: proximity cards, RFID (radio-frequency identification), cameras.

Lecture 9:
Electronic Health Records - brief history: For decades: paper records. Steps towards digitisation, along with the evolution of computers. Earliest forms: basic (administrative) tasks. Internet era = transformative change: better hardware and software and widespread internet access enabled the development of more sophisticated EHR systems. Role of government policies. Present: EHRs are not just digital records! Shift of focus: interoperability, patient engagement, predictive healthcare.
Electronic Health Records - definition: !! There is no universally accepted definition of an EHR. As more functionality is added, the definition will need to be broadened. One definition: an EHR is the systematised collection of patient and population electronically stored health information in a digital format. These records can be shared across different healthcare settings.
ELECTRONIC HEALTH RECORD KEY COMPONENTS: Electronic patient encounter. Referral management feature. Electronic retrieval of lab and x-ray reports. Retrieval of prior encounters and medication history. Computerised physician order entry (CPOE). Clinical decision support systems (CDSS). Secure messaging (e-mail or text messaging) for communication between patients and office staff. Integration with a picture archiving and communication system (PACS). Practice management software, scheduling software and patient portals. A problem summary list that is customisable and includes the major aspects of care. Public health reporting and tracking. The ability to input or access information via a smartphone or tablet PC. Remote access from the office, hospital or home. Electronic prescribing. Etc.
Electronic Health Records - CHALLENGES: Financial barriers. Physician resistance. Loss of productivity. Workflow changes. Reduced physician-patient interaction. Usability issues. Integration issues. Quality reporting issues. Lack of interoperability. Privacy concerns. Legal aspects. Inadequate proof of benefit.
Electronic Health Records - future trends: Integrated and real-time analytics. Increasing standardisation. Enhancing interoperability. More integration between hospital EHRs and “smart” medical devices (IoT). Remote patient monitoring. Apps for EHRs, analogous to smartphone apps.

Lecture 10:
Clinical Decision Support Systems - definition: Clinical decision support (CDS) provides clinicians, staff, patients or other individuals with knowledge and person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health care. “Clinical decision support systems link health observations with health knowledge to influence health choices by clinicians for improved health care.” (Robert Hayward, Centre for Health Evidence)
The Five Rights of CDS: 1. The right information (what): should be based on the highest level of evidence possible and adequately referenced. There should be good internal and external validity. 2. To the right person (who): the person who is making the clinical decision, be that the physician, the patient or some other member of the healthcare team. 3. In the right format (how): should the information appear as part of an alert, reminder, infobutton or order set? That depends on the issue and the clinical setting. 4. Through the right channel (where): should the information be available as an EHR alert, a text message, an email alert, etc.? 5. At the right time (when) in the workflow: new information, particularly in the format of an alert, should appear early in the order-entry process so clinicians are aware of an issue before they complete, e.g., an electronic prescription. (A toy alert-rule sketch follows below.)
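A toy Python sketch of a knowledge-based CDS rule firing at order entry (the drug pair, message, and function name are invented for illustration; real systems draw on curated drug-interaction knowledge bases):

    # a tiny "knowledge base": drug pairs that should trigger an alert
    INTERACTIONS = {frozenset({"warfarin", "aspirin"}): "increased bleeding risk"}

    def check_order(new_drug: str, current_meds: list[str]) -> list[str]:
        """Return alert messages for a new order against current medications."""
        alerts = []
        for med in current_meds:
            reason = INTERACTIONS.get(frozenset({new_drug.lower(), med.lower()}))
            if reason:
                alerts.append(f"ALERT: {new_drug} + {med}: {reason}")
        return alerts

    print(check_order("aspirin", ["Warfarin", "metformin"]))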
CDS benefits and goals: Improvement in patient safety: ◦medication alerts ◦improved ordering. Improvement in patient care: ◦improved patient outcomes ◦better chronic disease management ◦alerts for critical lab values, drug interactions and allergies ◦improved quality-adjusted life years (QALYs) ◦improved diagnostic accuracy. Reduction in healthcare costs: ◦fewer duplicate lab tests and images ◦fewer unnecessary tests ordered ◦avoidance of Medicare penalties for readmission for certain conditions ◦fewer medical errors ◦increased use of generic drugs ◦reduced malpractice ◦better utilisation of blood products. Dissemination of expert knowledge: ◦sharing best evidence ◦education of all staff, students and patients. Management of complex clinical issues: ◦use of clinical practice guidelines, smart forms and order sets ◦interdisciplinary sharing of information ◦case management. Monitoring clinical details: ◦reminders for preventive services ◦tracking of diseases and referrals. Improvement of population health: ◦identification of high-cost/high-needs patients ◦mass customised messaging. Management of administrative complexity: ◦supports coding, authorisation, referrals and care management. Support for clinical research: ◦the ability to identify prospective research subjects.
A. Knowledge-based systems: 1. Knowledge acquisition - includes expert-based knowledge or data-based knowledge. A. Expert-based: may come from clinical practice guidelines and from clinical expertise. B. Data-based: may come from models built on data from outside the institution or from data mining within the institution. 2. Knowledge representation: A. A knowledge base (evidence-based information). B. An inference engine (software to integrate the knowledge with patient-specific data). C. A means to communicate the information to the end user, such as a pop-up alert in the EHR. 3. Knowledge maintenance: A. There is a need to keep knowledge up to date, from the level of the program through the committees in charge, and to track changes. B. This has proven challenging for most organisations.
B. Non-knowledge-based systems: When the knowledge representation is a model derived from data mining, the system is called a non-knowledge-based CDS. Data mining methods may involve artificial intelligence (AI) (e.g. neural networks, machine learning) or more traditional statistical methods, like linear or logistic regression (a minimal sketch follows at the end of this passage). AI machine learning has also been used for “pattern recognition”, which has become relatively routine in medical diagnostic devices (interpreting images, electrocardiograms, etc.). ◦E.g. Google DeepMind, IBM Watson. CHALLENGES: ◦data quality and volume ◦interpretability and transparency ◦continual learning and adaptation.
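A minimal sketch of the logistic-regression route to a non-knowledge-based model, using the third-party scikit-learn library (the features, labels and values are fabricated toy data for illustration):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # toy training data: [age, fasting glucose mmol/L] -> diabetes yes/no
    X = np.array([[34, 4.9], [51, 7.9], [62, 8.4], [28, 4.6], [47, 6.9], [58, 7.6]])
    y = np.array([0, 1, 1, 0, 0, 1])

    model = LogisticRegression().fit(X, y)
    new_patient = np.array([[55, 7.4]])
    print(model.predict_proba(new_patient)[0, 1])  # estimated probability of the positive class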
Lecture 11:
Definitions in medical imaging: Medical Imaging Informatics (MII) is the study and application of processes of information and communications technology for the acquisition, manipulation, analysis and distribution of medical image data. Picture Archiving and Communication Systems (PACS) is a medical imaging technology which provides economical storage of, and convenient access to, images from multiple modalities. Biomedical Imaging Informatics (BII) is a discipline that focuses on improving patient outcomes through the effective use of images and imaging-derived information in research and clinical care. It tends to be broader than MII and includes radiology, pathology, dermatology, and ophthalmology. It is a sub-field of biomedical informatics and can include cellular and molecular imaging.
The Digital Imaging and Communications in Medicine (DICOM) standard is intended for the transport of images. DICOM defines how images, and the metadata associated with those images, are stored and moved between various electronic devices, including information systems. There are two overall parts to a DICOM message: the header and the actual image data. DICOM message = header + image data. The header data contains information about the patient, the type of image, and how it was captured. It also contains information about the structure and compression of the image. The image data usually represents high-resolution grayscale images. Conventional image files also have a header; a JPEG header, however, contains only “technical” metadata about the file. The DICOM standard also contains a networking protocol which details a specific procedure for DICOM devices to connect. (A minimal header-reading sketch follows at the end of this lecture's notes.)
PACS key components: Digital acquisition devices: the devices that are the sources of the images - CT, MRI, digital angiography, fluoroscopy, mammography, etc. The network: ties the PACS components together; the pathway for image transmission from the scanners to the image archive, and from there to the radiologist at a reading station. Database server: a high-speed and robust central computer to process information; this answers the request of the reading radiologist to provide the images at his/her workstation. Archival server: responsible for storing images. There is a separate backup, usually off-site, to prevent data loss in a disaster situation. Radiology information system (RIS): the system that maintains patient demographics, scheduling, billing information and interpretations. Workstation or soft-copy display: contains the software and hardware to access the PACS; this is where the radiologist reviews the imaging study and dictates his/her diagnostic report. Teleradiology: the ability to remotely view images at a location distant from the site of origin.
Medical imaging on the internet and on mobile technology: Having patient images present only within a single health care system limits what would otherwise be a potentially widely available resource. Web-based technology provides on-demand, online access to electronic images regardless of the location of patient records, reports and images. DICOM images are not browser compatible - that is to say, DICOM images cannot be viewed using a standard internet browser, as JPEG, GIF, PNG and other file formats can. ◦One solution to this problem is for the browser to serve as a link to a server, which can open and display the images, and then stream them to the viewer. In this instance, client software must be present on the viewing computer to allow this functionality. ◦An alternative type of system enables direct viewing in the browser without client software, enabling its use on any computer with internet functionality. This type of solution is known as a “zero-footprint” web viewer: DICOM images are pre-converted to GIF files, which are then embedded in a webpage.
DICOM and mobile technology: In 2011, the FDA approved the first primary diagnostic radiology application for mobile devices. The performance evaluation reviewed by the FDA consisted of tests for measured luminance, image quality (resolution), and noise, referenced against international standards and guidelines.
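A minimal sketch of the header/image-data split using the third-party pydicom package (the file path is hypothetical):

    import pydicom

    ds = pydicom.dcmread("study/slice_001.dcm")   # hypothetical local DICOM file
    # header: patient and acquisition metadata
    print(ds.PatientName, ds.Modality, ds.Rows, ds.Columns)
    # image data: the pixel payload (decoding requires numpy)
    pixels = ds.pixel_array
    print(pixels.shape)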
Limitations of mobile imaging: Images will likely have lower resolution compared to a dedicated workstation. Mobile programs may not permit report generation or editing. Comparing old and new images side by side is generally not possible.

Lecture 12:
Introduction to Evidence-Based Medicine: “Patients should receive care based on the best available scientific knowledge. Care should not vary illogically from clinician to clinician or from place to place” (Crossing the Quality Chasm, the Institute of Medicine, USA). Every effort should be made to find the best answers, and these answers should be standardised and shared among clinicians. Standardisation implies that clinical practice should be consistent with the best available evidence that would apply to most patients.
Definition of Evidence-Based Medicine: EBM is “a systematic approach to clinical problem solving which allows the integration of the best available research evidence with clinical expertise and patient values” (Centre for EBM, USA).
Importance of Evidence-Based Medicine - reasons for using EBM resources and tools in clinical practice: Current methods of keeping medically or educationally up to date do not work. Translation of research into practice is often very slow. Lack of time and the volume of published material result in information overload. The pharmaceutical and medical device industries bombard clinicians and patients every day, often with misleading or biased information. A lot of what is considered the “standard of care” in everyday practice has yet to be challenged and could be wrong.
Traditional methods for gaining medical knowledge: Continuing Medical Education (CME). Clinical Practice Guidelines (CPGs). Expert advice. Reading the medical literature.
EBM steps to answering clinical questions: The physician sees a patient and generates a well-constructed clinical question, using the PICO method (a small sketch follows below): 1. P (Patient, Population, or Problem) - who is the patient or population? What is the condition or problem of interest? 2. I (Intervention or Exposure) - what is the intervention, treatment, or exposure being considered? 3. C (Comparison) - is there an alternative intervention or control group for comparison? (Not always necessary.) 4. O (Outcome) - what are the expected results or effects of the intervention? Then: seek the best evidence for that question via an EBM resource or PubMed. Critically appraise that evidence by examining its internal and external validity and the potential impact of the intervention. Apply the evidence to your patient, considering the patient's values, preferences and circumstances.
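A minimal sketch of a PICO question as a Python data structure (the clinical content is an invented example):

    from dataclasses import dataclass

    @dataclass
    class PICOQuestion:
        patient: str        # P: patient, population or problem
        intervention: str   # I: intervention or exposure
        comparison: str     # C: comparator (not always necessary)
        outcome: str        # O: expected outcome

    question = PICOQuestion(
        patient="adults with type 2 diabetes",
        intervention="metformin",
        comparison="sulfonylurea",
        outcome="HbA1c reduction at 12 months",
    )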
Evidence appraisal: validity, results and applicability: Internal validity: is the study believable? Low internal validity results from poor design, or from biases or errors in selecting patients, measuring outcomes, conducting the study, or analysing the data. Internal validity: common sources of research bias. Results: should be assessed in terms of the magnitude of the treatment effect and its precision (narrower confidence intervals or statistically significant results indicate higher precision). Applicability, also called external validity: indicates that the results reported in the study can be generalised to the patients of interest.
The evidence pyramid: Case reports/case series - collections of reports on the treatment of individual patients, without control groups. Case-control studies - retrospectively study patients having a specific condition and compare them with people who do not have the condition. Cohort studies - prospectively evaluate and follow patients who have a specific exposure or receive a particular treatment over time, and compare them with another group that is similar but has not been affected by the exposure being studied. Randomised controlled trials (RCTs) - subjects are randomly assigned to a treatment group or to a control group that receives placebo or no treatment; RCTs are often “double-blinded”, meaning that neither the investigators nor the subjects know who received the active medication and who received a placebo. Systematic reviews - protocol-driven, comprehensive, reproducible literature searches that aim at answering a focused question; multiple RCTs are evaluated to answer a specific question; usually conducted by several different researchers to reduce selection bias. Meta-analyses - the quantitative summary of systematic reviews, taking the systematic review a step further by using statistical techniques to combine the results of several studies as if they were one large single study. Compared to individual studies, meta-analyses include a larger number of events, leading to more precise (i.e. statistically significant) findings, and apply to a wider range of patients, because the inclusion criteria of systematic reviews encompass the criteria of all the included studies. (A minimal pooling sketch follows at the end of this lecture's notes.)
Clinical Practice Guidelines (CPGs) - definition: CPGs are “statements that include recommendations intended to optimise patient care that are informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options” (Institute of Medicine, USA).
Developing Clinical Practice Guidelines: The process starts with a panel of content and methodology experts commissioned by a professional organisation (methodology experts are experts in evidence-based medicine, epidemiology, statistics, cost analysis, etc.). The panel refines the question, usually in PICO format. A systematic literature search and evidence synthesis is done. Evidence is graded, and recommendations are negotiated. Voting is often needed to build consensus.
Appraisal and validity of Clinical Practice Guidelines: Multiple tools have been developed to appraise CPGs and determine their validity. These usually evaluate: the process of developing the CPG; the quality and rigour of the recommendations; the clarity of the presentation of the recommendations. Some attributes of a good-quality CPG: Evidence-based, preferably linked to systematic reviews of the literature. Considers all relevant patient groups and management options. Considers patient outcomes (as opposed to surrogate outcomes). Updated frequently. Clarity and transparency in describing the process of CPG development (e.g. voting, etc.). Clarity and transparency in describing the conflicts of interest of the guidance panel. Addresses patients' values and preferences. Level of evidence and strength of recommendation are given. A simple summary or algorithm that is easy to understand. Available in multiple formats (print, online, smartphone, etc.) and in multiple locations. Compatibility with existing practices. Simplifies, rather than complicates, decision making.
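How meta-analysis "combines the results of several studies" can be shown with the simplest pooling rule, fixed-effect inverse-variance weighting: each study is weighted by 1/variance, so more precise studies count for more. A Python sketch with invented numbers:

    # illustrative per-study effect estimates (e.g. log risk ratios) and variances
    effects = [0.42, 0.35, 0.50]
    variances = [0.04, 0.09, 0.02]

    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5   # pooled standard error shrinks as studies accumulate
    print(round(pooled, 3), round(pooled_se, 3))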
Clinical Practice Guidelines (CPG):
Definition of Clinical Practice Guidelines:
CPGs are "statements that include recommendations intended to optimise patient care that are informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options" (Institute of Medicine, USA).
Developing Clinical Practice Guidelines:
The process starts with a panel of content and methodology experts commissioned by a professional organisation (methodology experts are experts in evidence-based medicine, epidemiology, statistics, cost analysis, etc.)
The panel refines the question, usually in PICO format
A systematic literature search and evidence synthesis is done
Evidence is graded, and recommendations are negotiated
Voting is often needed to build consensus
Appraisal and Validity of Clinical Practice Guidelines:
Multiple tools have been developed to appraise CPGs and determine their validity. These usually evaluate:
The process of developing the CPG
The quality and rigour of the recommendations
The clarity of the presentation of the recommendations
Some attributes of a good quality CPG:
Evidence-based, preferably linked to systematic reviews of the literature.
Considers all relevant patient groups and management options.
Considers patient outcomes (as opposed to surrogate outcomes).
Updated frequently.
Clarity and transparency in describing the process of CPG development (e.g., voting, etc.)
Clarity and transparency in describing the conflicts of interest of the guideline panel.
Addresses patients' values and preferences.
Level of evidence and strength of recommendation are given.
Simple summary or algorithm that is easy to understand.
Available in multiple formats (print, online, smartphone, etc.) and in multiple locations.
Compatibility with existing practices.
Simplifies, does not complicate, decision making.
Lecture 13:
Terminology of analytics:
"The extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions."
"The systematic use of data and related business insights developed through applied analytical disciplines (e.g. statistical, contextual, quantitative, predictive, cognitive, other [including emerging] models) to drive fact-based decision making for planning, management, measurement and learning."
Levels:
Adams and Klein define three levels of the application of analytics, and their attributes:
Descriptive — standard types of reporting that describe current situations and problems
Predictive — simulation and modelling techniques that identify trends and portend outcomes of actions taken
Prescriptive — optimising clinical, financial, and other outcomes
Big Data — 5 Vs:
Volume: massive amounts of data are being generated each minute
Velocity: data is being generated so rapidly that it often needs to be analysed without first placing it in a database
Variety: roughly 80% of data in existence is unstructured, so it won't fit neatly into a database
Veracity: current data can be "messy", with missing values and other challenges
Value: the capability to extract meaning from data
ANI: Artificial Narrow Intelligence
Artificial Narrow Intelligence, also known as weak AI, is the only type of AI that exists in our world today. Narrow AI is goal oriented, is programmed to perform a single task, and is very intelligent in completing the specific task that it is programmed to do. Some examples of ANI are Siri, the autopilot in an airplane, chatbots, self-driving cars, etc.
AGI: Artificial General Intelligence
Artificial General Intelligence, also referred to as strong AI, is a concept in which machines exhibit human intelligence. Strong AI would make it possible to build machines that can think, strategise and perform multiple tasks under uncertain conditions.
ASI: Artificial Super Intelligence
Artificial Super Intelligence is a hypothetical AI in which machines would be capable of exhibiting intelligence that surpasses that of the brightest humans.
◦A "Terminator"-like scenario is sometimes invoked in discussions of ASI.
ChatGPT:
Challenges to data analytics:
Data may be inaccurate or incomplete
Data may be transformed in ways that undermine its meaning (e.g., coding driven by billing priorities)
Data may also incompletely adhere to well-known standards, which makes combining it from different sources more difficult
Patients receive care at different locations, and the data from such care might not be readily available
A large part of clinical data is "locked" in free text
Ethical issues
Path forward for analytics:
Adherence to best practices for the use of data standards and interoperability
Processes to evaluate the availability, completeness, quality and transformability of data
Toolkits and pipelines to manage data and its attributes
Challenges and metrics for assessing the "research grade" of operational data
Standardised reporting methods for operational data and its attributes
Adaptation of "best evidence" approaches to the use of operational data
Appropriate use of informatics expertise to assist with optimal use of operational data and to develop published guidelines for doing so
A research agenda to determine biases inherent in operational data and to assess informatics approaches to improve the data.
The "best evidence" approach to clinical data (see the sketch after this list):
Ask an answerable question — can the question be answered by the data we have?
Find the best evidence — in this case, the best evidence is the EHR data needed to answer the question
Critically appraise the evidence — does the data answer the question? Are there confounders?
Apply it to the patient situation — can the data be applied to this setting?
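A toy illustration of the first three steps, assuming the answerable question "do diabetic patients on metformin have lower HbA1c in our EHR extract?". The table, column names, and values below are invented for the example; a real analysis would run against the institution's data warehouse and account for confounders.

```python
import pandas as pd

# Invented EHR extract; column names are placeholders, not a real schema.
visits = pd.DataFrame({
    "patient_id":   [1, 1, 2, 3, 3, 4],
    "diagnosis":    ["diabetes", "diabetes", "hypertension",
                     "diabetes", "diabetes", "asthma"],
    "hba1c":        [8.1, 7.4, None, 9.0, 8.2, None],
    "on_metformin": [True, True, False, False, True, False],
})

# Find the best evidence: the subset of EHR data that answers the question.
diabetics = visits[visits["diagnosis"] == "diabetes"]

# Descriptive analytics: summarise the current situation.
print(diabetics.groupby("on_metformin")["hba1c"].mean())

# Critically appraise the evidence: how complete is the data?
# Missingness is a common limitation of operational EHR data.
print(visits["hba1c"].isna().mean())  # fraction of visits lacking an HbA1c
```

This sits at the descriptive level of the Adams and Klein hierarchy; predictive and prescriptive analytics would build models on top of such extracts rather than merely summarising them.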
Lecture 14:
Introduction to Consumer Health Informatics:
Consumer health informatics (CHI) is the area of health informatics focused on the interaction of consumers, patients, and others with health information systems and applications.
Emergence:
Widespread availability of the Internet and online information resources
The consumer movement, which aimed to empower those who were ill (patients) and not yet ill (consumers) with information to maintain and improve their health, as well as engage in the treatment of their disease.
Definition of Consumer Health Informatics:
CHI is "the field devoted to informatics from multiple consumer or patient views". These include:
Patient-focused informatics
Health literacy
Consumer education
The focus is on information structures and processes that empower consumers to manage their own health (e.g. health information literacy, consumer-friendly language, personal health records, and Internet-based strategies and resources).
Consumer Health Informatics:
analyses consumers' needs for information
studies and implements methods for making information accessible to consumers
models and integrates consumers' preferences into health information systems
Consumer Health Informatics — satellite terms:
mHealth — stands for "mobile health" and typically refers to the use of smartphones and other mobile devices
digital health — related to the integration of genomics and digital devices
participatory medicine — "a movement in which patients and health professionals actively collaborate and encourage one another as full partners in healthcare"
Patient engagement — systems and functions which allow the patient to actively engage in the healthcare process
Personal Health Records:
Definition: an electronic application through which individuals can access, manage, and share their health information, and that of others for whom they are authorised, in a private, secure, and confidential environment.
Principles:
A lifelong resource
Individuals owning and managing their information
Information which comes from healthcare providers and the individuals themselves
Information maintained in a secure, private environment
The individual determines the rights of access to the information
The PHR does not replace the legal record of the healthcare provider
Types of PHR:
1. The tethered PHR
A. Is an extension of the healthcare provider's EHR
B. Provides access to some (not all) of the information for the individual
C. Allows communication with the provider
D. May allow the patient to add information into the record
E. Also known as a "patient portal"
2. The standalone PHR
A. Is an isolated application
B. Does not take information from other sources
C. Only contains the information that the patient enters into it
D. May be on a mobile device or a website
3. The interconnected or integrated PHR
A. A separate application, but with the ability to interact with one or more providers' EHRs (see the sketch after this list)
B. Allows operations such as accessing test results, scheduling, data collection, etc.
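As a sketch of what the "interact with providers' EHRs" capability of an integrated PHR might look like, the snippet below pulls laboratory results over a FHIR REST interface, a widely used standard for such integrations. The base URL, patient ID, and token are placeholders; a real integrated PHR would obtain authorisation through a mechanism such as SMART on FHIR.

```python
import requests

# Hypothetical endpoint and identifiers; replace with the provider's real
# FHIR base URL and an OAuth2 access token in an actual integration.
FHIR_BASE = "https://ehr.example.org/fhir"  # assumption: provider exposes FHIR
PATIENT_ID = "12345"                        # placeholder patient identifier
TOKEN = "..."                               # placeholder access token

# Request the patient's laboratory results (FHIR Observation resources).
resp = requests.get(
    f"{FHIR_BASE}/Observation",
    params={"patient": PATIENT_ID, "category": "laboratory"},
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
resp.raise_for_status()

# FHIR search results come back as a Bundle; each entry holds one resource.
for entry in resp.json().get("entry", []):
    obs = entry["resource"]
    test = obs.get("code", {}).get("text", "unknown test")
    value = obs.get("valueQuantity", {})
    print(test, value.get("value"), value.get("unit"))
```

The same pattern generalises to the other operations the lecture lists (scheduling, data collection), each mapped to the corresponding FHIR resource type.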
Policy issues related to PHRs:
The Markle Foundation advocates that:
The PHR should be controlled by the individual, who decides what can be accessed, by whom, and for how long.
The PHR should contain information from one's lifetime and from all providers.
It should be accessible anywhere, at any time, and be private and secure.
Information should be transparent (whoever entered or viewed information should be captured and viewable).
There should be easy exchange with other health information systems and professionals.
The American College of Physicians advocates that:
Physicians should only be responsible for reading and acting on the tethered portion of the PHR, and should not be obligated to read or act on the non-tethered part.
The physician should be compensated for time spent interacting with the PHR.
PHRs and Clinical Notes:
Should patients have access to their clinical notes?
"OpenNotes" — an initiative aiming to provide patients with access to the entirety of their medical record, including clinical notes.
Studies show that patients with OpenNotes:
◦Felt they were in more control of their care.
◦Showed increased adherence to medication.
◦Expressed privacy concerns (25% of the patients).
◦Felt confusion, worry, or offence at what was in the notes (1-6% of the patients).
Studies show that clinicians who take care of patients using OpenNotes:
◦Did not perceive any increase in time spent outside of visits.
◦Did not report changing the content of their notes.
◦Did not report needing more time to write notes.
Almost all patients and physicians who started using the system continued doing so after the initial study ended.
PHRs and Data Ownership:
Should patients own their medical data?
An increasingly advocated view goes beyond patients viewing their data and notes, to actually giving them ownership and control of their data (see the sketch below).
Challenges generated by this view:
◦Patients would have the right to control access to data for chosen healthcare professionals, institutions, and researchers.
◦A transition from vendor-centric to patient-centric data storage would be needed.
◦EHR systems would need to be redesigned to pull data from, and push data back to, the patient's cloud-based store.
◦A model for how this approach would be paid for is needed.
◦Standardisation issues.
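A minimal sketch of what individual-controlled access could look like in code, assuming a simple consent model with time-limited grants and an audit log (echoing the Markle principles of individual control and transparency). All class and field names here are invented for illustration; real patient-centric stores would also need authentication, encryption, and standardised data formats.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ConsentGrant:
    grantee: str      # chosen professional, institution, or researcher
    categories: set   # e.g. {"labs", "notes", "medications"}
    expires: date     # "for how long": access is time-limited

@dataclass
class PatientRecordStore:
    owner: str
    grants: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)  # transparency: who viewed what

    def grant(self, g: ConsentGrant) -> None:
        """The individual decides what can be accessed, by whom, for how long."""
        self.grants.append(g)

    def can_access(self, who: str, category: str, today: date) -> bool:
        """Check access against the patient's active grants and log the attempt."""
        allowed = any(
            g.grantee == who and category in g.categories and today <= g.expires
            for g in self.grants
        )
        self.audit_log.append((today, who, category, allowed))
        return allowed

store = PatientRecordStore(owner="patient-001")
store.grant(ConsentGrant("dr-smith", {"labs", "notes"}, date(2026, 12, 31)))
print(store.can_access("dr-smith", "labs", date(2026, 6, 1)))     # True
print(store.can_access("dr-smith", "billing", date(2026, 6, 1)))  # False
```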