AI Fundamentals PDF
Document Details
Uploaded by PleasurableLivermorium1883
Arteveldehogeschool
Summary
This document provides a basic introduction to Artificial Intelligence (AI), covering fundamental concepts like algorithms, data, and hardware. It touches upon instruction-based programming and machine learning.
Full Transcript
AI Fundamentals
===============

**The Building Blocks**

To go through its cycles, an AI system requires three large building blocks:

- Algorithms
- Data
- Hardware

**Instruction Based**

Writing instructions by hand can be quite challenging for a human. It involves coding line by line: if situation A arises, the system must execute action B. This process is time-consuming because every case has to be spelled out. The upside is that the resulting rules are clear and transparent.

**Machine Learning**

Computers learn from data and adapt through experience, without a programmer defining the rules. This allows them to respond effectively to previously unseen situations. One drawback is that AI systems are not always transparent in their decision-making. Most of what we call AI today is actually based on Machine Learning (ML).

**Husky vs Wolf**

The husky vs. wolf case is a famous example of how biased training data can lead to flawed machine learning outcomes. Researchers trained an AI to distinguish between images of wolves and huskies, but the model often misclassified the animals. On investigation, it turned out that the AI had learned to associate wolves with snowy backgrounds, because many of the wolf images in the training set contained snow while the husky images did not.

**Skin Cancer**

In the AI skin cancer detection case, researchers developed a model to identify cancerous skin lesions from images, but it was later discovered that the AI was making predictions based on irrelevant visual cues. Instead of learning to detect cancer from the characteristics of the lesions themselves, the AI often associated cancer with the presence of medical instruments such as rulers, which doctors commonly include in images of serious cases.
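
The husky and ruler cases share one mechanism: the model picks whichever attribute separates the training labels best, even when that attribute is irrelevant. The sketch below is a toy illustration only. The six training rows, the two attributes (`pointed_snout`, `snow`) and the helper names are all invented for this example and do not come from the original studies. A one-attribute classifier ends up relying on `snow`, so a husky photographed in snow is labelled a wolf.

```python
# Toy illustration of a model learning a shortcut. All data is invented.
# Each image is reduced to two hand-made attributes:
#   pointed_snout: the attribute we would like the model to use
#   snow:          a background attribute that should be irrelevant

TRAIN = [
    # (pointed_snout, snow, label)
    (1, 1, "wolf"),
    (1, 1, "wolf"),
    (0, 1, "wolf"),   # noisy snout attribute
    (0, 0, "husky"),
    (0, 0, "husky"),
    (1, 0, "husky"),  # noisy snout attribute
]

FEATURES = {"pointed_snout": 0, "snow": 1}


def accuracy_of(feature_index, rows):
    """Accuracy of the rule: attribute == 1 means wolf, otherwise husky."""
    hits = sum(
        ("wolf" if row[feature_index] == 1 else "husky") == row[2]
        for row in rows
    )
    return hits / len(rows)


def fit_stump(rows):
    """Pick the single attribute that best separates the training labels."""
    return max(FEATURES, key=lambda name: accuracy_of(FEATURES[name], rows))


def predict(feature_name, pointed_snout, snow):
    """Classify one image using only the chosen attribute."""
    value = (pointed_snout, snow)[FEATURES[feature_name]]
    return "wolf" if value == 1 else "husky"


chosen = fit_stump(TRAIN)
print("attribute the model relies on:", chosen)

# A husky photographed in the snow: blunt snout, snowy background.
print("husky in snow is classified as:",
      predict(chosen, pointed_snout=0, snow=1))
```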

**I'm Fine**

In November 2018, Dong Mingzhu, chairwoman of Gree Electric Appliances, had her face mistakenly displayed on a public screen in Ningbo, China, as part of a traffic police system that uses facial recognition to identify jaywalkers. The AI system captured her image from an advertisement on the side of a moving bus and mistook it for a jaywalker in the street. The incident highlighted a flaw in AI-driven surveillance: the technology can misinterpret data, leading to false accusations.

**Narrow AI**

Dedicated to assisting with or taking over specific tasks.

**AGI, General Intelligence**

Takes knowledge from one domain and transfers it to another domain.

**Super AI**

Machines that are an order of magnitude smarter than humans.

**Human Centered AI**

Humans are lazy by design, and that laziness is a driving force for innovation. We need to be aware that:

- AI is a computer system
- AI is not a religion
- AI is best seen as a third arm or a second brain

We also need more transparency from AI systems.

**So, what is an algorithm?**

An algorithm is a set of instructions for solving a problem or accomplishing a task.

**Self-learning algorithms**

Sometimes the set of rules becomes extremely long and complex, or the problem is simply impossible to solve via rules:

- Learning how to talk
- Recognising a cat in a photo

We use AI to mimic our brains: it makes automatic connections between input and output, drawing on examples.

  --------- --------- --------- --------
  Input 1   Input 2   Input 3   Output
  2         4         5         3
  5         2         8         2
  2         2         1         3
  3         3         5         ?
  --------- --------- --------- --------

With the last row filled in:

  --------- --------- --------- --------
  Input 1   Input 2   Input 3   Output
  2         4         5         3
  5         2         8         2
  2         2         1         3
  3         3         5         4
  --------- --------- --------- --------

We recognize a pattern and can state that output = (input 1 x input 2) - input 3.

- Input 1, 2 and 3 are the attributes
- More complex algorithms require more attributes

A variable gives a weight to an attribute. Compared to a recipe, the attributes are the ingredients, and the variables are the weights that determine how much of each ingredient we need.

**Neural Networks**

Neural networks take their inspiration from the neurons in our brains. They are a popular type of AI algorithm that can adapt and evolve their internal structure based on data. This adaptability makes them powerful tools for solving complex problems in artificial intelligence.

![A computer screen shot of a diagram](media/image2.jpeg)

![A computer screen shot of a diagram](media/image4.jpeg)

It might sound a bit abstract, but let's dive into it. Did you know that YouTube employs a sophisticated system with 30 layers to suggest new videos based on your preferences? Facebook's DeepFace, on the other hand, uses a single network that is nine layers deep and has more than 120 million connection weights. It was trained on a dataset of four million images uploaded by Facebook users and reaches about 97% accuracy, which is what makes automatic face tagging possible.
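
The "attributes and weights" idea from the table in the Self-learning algorithms section can be made concrete. The sketch below is a minimal illustration, not how a real neural network is built. It assumes two hand-picked attributes (the product of inputs 1 and 2, and input 3) and learns one weight for each by plain gradient descent on the three complete rows. The function names, learning rate and number of steps are illustrative choices. If training converges as expected, the weights approach 1 and -1, matching output = (input 1 x input 2) - input 3, and the prediction for the row (3, 3, 5) approaches 4.

```python
# Rows of the table with a known output: (input 1, input 2, input 3, output).
ROWS = [
    (2, 4, 5, 3),
    (5, 2, 8, 2),
    (2, 2, 1, 3),
]


def attributes(in1, in2, in3):
    """Assumed attributes: the product of inputs 1 and 2, and input 3."""
    return (in1 * in2, in3)


def predict(weights, in1, in2, in3):
    """Weighted sum of the attributes."""
    return sum(w * a for w, a in zip(weights, attributes(in1, in2, in3)))


def train(rows, learning_rate=0.005, steps=5000):
    """Plain gradient descent on the mean squared error."""
    weights = [0.0, 0.0]
    for _ in range(steps):
        gradient = [0.0, 0.0]
        for in1, in2, in3, target in rows:
            # Compare the predicted value with the actual value.
            error = predict(weights, in1, in2, in3) - target
            for i, a in enumerate(attributes(in1, in2, in3)):
                gradient[i] += 2 * error * a / len(rows)
        # Nudge each weight against its gradient.
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


weights = train(ROWS)
print("learned weights:", [round(w, 3) for w in weights])
print("prediction for (3, 3, 5):", round(predict(weights, 3, 3, 5), 3))
```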

**AI Model**

When you train an algorithm on a collection of data, you get a so-called AI model.

Training an algorithm is like baking a cake. The algorithm is the recipe, and the data are the ingredients. Just as different ingredients lead to a different cake, different data results in a different model.

During the training phase, the system learns the best connection pattern between input and output data. With each iteration, it fine-tunes the weights based on the comparison between predicted and actual values. Training continues until the system reaches the desired level of accuracy. Next, we evaluate the model by testing its predictions on data it was not trained on but for which we know the correct answers. If the model performs well, it moves to the operational phase, ready to be deployed. Otherwise, we go back to the drawing board for a review.

When the model is 97% accurate on the training data but drops to 50% on unseen data, we call this overfitting. When the model is only about 50% accurate on the training data and does no better on unseen data, we call this underfitting, like a student randomly guessing an answer without really understanding.

**Supervised Learning**

When the algorithm is trained on a labeled dataset, it learns what the correct output should be. It then tries to apply this knowledge to new examples it hasn't encountered before. This can involve classification, such as distinguishing between spam and non-spam, or regression, where the output is a continuous value such as a price. Supervised learning is generally more accurate than unsupervised methods, but labeling the data requires a significant amount of effort.

**Unsupervised Learning**

In data analysis, algorithms work without human input to uncover hidden patterns.
Clustering groups similar examples, such as identifying customers as big spenders or window shoppers. Association looks for connections between variables, such as customers who purchase one item also buying another. The algorithms identify patterns and group data without any human interference, focusing on pattern recognition rather than on making predictions.

**Reinforcement Learning**

Imagine having a goal in mind but not knowing the best way to reach it. It's like learning to ride a bike: you feel good when you keep pedaling, and if you fall, it's a bummer. The algorithm compares each new move with what it already knows. It doesn't rely on examples, but learns through practice and experience. AlphaGo was trained in large part using this method.

![A computer screen shot of a computer](media/image6.jpeg)

**Five Dimensions**

In the past decade, five aspects of data have grown in importance, all beginning with the letter 'V':

- Volume: the quantity of data
- Variety: the various data types
- Velocity: how quickly data is accessible
- Value: the importance of data
- Veracity: the reliability of data

**Variety**

Structured data can be stored in a database or table. It is organised in columns and rows, similar to the way spreadsheet software like Excel organises data.

Unstructured data cannot be stored in a traditional row-column database, e.g. photos, videos, sound files or large texts. It has no fixed data model.

Semi-structured data sits somewhere between the two, e.g. photos with metadata: information baked into the file, such as location.

**Velocity**

How fast the system gets new data and how it handles it.

Batch processing: decisions are made by the system based on data that becomes available days, months, or even years later.
For example, a car salesman decides how to approach you based on data collected over the years.

Near real-time processing: data is processed when events happen, but the system does not react immediately. For instance, if you browse items in an online store without making a purchase, you might receive an email the next day.

Real-time processing: data is processed instantly, allowing the system to respond to events as they happen (e.g., Apple Vision Pro, self-driving cars).

**Value & Veracity**

Value: data storage comes with a price tag. It's important to choose carefully what to store and to retain the data that matters.

Veracity: high-quality data is crucial for high-quality algorithms. Poor data leads to biased AI systems.

**Gathering Data**

Consumer data sources:

- Smartphones
- Smart phone speakers
- Surveillance cameras
- Internet of Things (IoT)

We need data to create a consumer profile for the best consumer experience.

![A computer screen with a diagram](media/image8.jpeg)

**GDPR**

The GDPR (General Data Protection Regulation) lays down the rules for handling and protecting the personal information of individuals in the European Union. Responsibility for data management should be shared between businesses and individuals. It's important to know that you can request that your data be deleted.

"Feel free to accept cookies to enhance your browsing experience."

**Energy Consumption**

AI consumes energy in the training phase and in the operational phase.
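
The GPT-4 training estimate in this section can be reproduced step by step. The sketch below uses only the figures quoted here (25 000 GPUs, 3125 servers, 6.5 per server, 100 days, 19 000 metric tons of CO2). It reads the per-server figure as a continuous power draw of 6.5 kW, so energy is hours multiplied by power, then multiplied by the number of servers. The GPUs-per-server value and the emission factor are back-calculated from those figures rather than taken from the text, and the variable names are illustrative.

```python
# Figures quoted in this section.
GPUS = 25_000
SERVERS = 3_125
SERVER_POWER_KW = 6.5        # assumption: read as a continuous draw of 6.5 kW
TRAINING_DAYS = 100          # upper end of the quoted 90 to 100 day range
QUOTED_CO2_TONNES = 19_000

# Derived value, not stated in the text: GPUs hosted per server.
gpus_per_server = GPUS // SERVERS

# Energy = time x power, first per server, then for the whole cluster.
hours = TRAINING_DAYS * 24
energy_per_server_kwh = hours * SERVER_POWER_KW
total_energy_kwh = energy_per_server_kwh * SERVERS

# Emission factor implied by the quoted CO2 figure (back-calculated,
# not taken from an external source).
implied_kg_co2_per_kwh = QUOTED_CO2_TONNES * 1000 / total_energy_kwh

print("GPUs per server:", gpus_per_server)
print("hours per server:", hours)
print("kWh per server:", energy_per_server_kwh)
print("total kWh:", total_energy_kwh)
print("implied kg CO2 per kWh:", round(implied_kg_co2_per_kwh, 2))
```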
GPT-4 was trained for 90 to 100 days on 25 000 Nvidia GPUs:

- 3125 physical computers (servers) are needed
- A server draws 6.5 kW, i.e. 6.5 kWh per hour
- 100 days \* 24 hours = 2400 hours per server
- 2400 hours \* 6.5 kW = 15 600 kWh per server
- 15 600 kWh \* 3125 servers = 48 750 000 kWh
- This is around 19 000 metric tons of CO2 (in the USA)
- Comparable to around 890 000 trees or 5000 households (estimate)

Share of CO2 emissions caused by AI/IoT systems:

- Now: 2%
- 2030: 20%

![A computer screen with text and images](media/image10.jpeg)

**Optimize Models**

Optimize models to run on local devices, such as Gemini Nano.

**Microchips**

When placing electronics, microchip manufacturers work with measurements as small as nanometers. One nanometer is 0.0000001 centimeters, which shows how far we are pushing the boundaries of what is physically possible.

**AI Accelerators**

GPU (Graphics Processing Unit)

- Used for video games
- Faster and more performant, but not energy-efficient for training AI
- Place the calculations and memory in the same place

**Slaughterbots**

The technology behind this is real:

- Flying automatically
- Face recognition

But... "Simply deactivate the battery or wait until the drone runs out of memory"? Researchers are working on a battery that recharges itself by extracting CO2 from the air. This would create a device that no longer runs out of battery.

**Fundamental Principles**

- Ethics by design
- Security by design
- Fairness by design
- Explainability by design

**Asimov's Laws**

Asimov defined three laws of robotics, which still influence thinking about AI and ethics. These laws were first set out in 1942 and are a direct inspiration for sci-fi movies like Terminator, Blade Runner, etc.

- First Law: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- Second Law: A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
- Third Law: A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

The robot is therefore at the service of humankind and must not harm it.

- Zeroth Law (added later by Asimov): A robot may not harm humanity, or, by inaction, allow humanity to come to harm.