Summary

This document provides an overview of Artificial Intelligence (AI) concepts. It covers various aspects, including logical reasoning, intelligent agents, and applications like robotics and health. Keywords include AI, computer vision, and robotics.

Full Transcript

-​ Genomic Diagnosis (genetic susceptibility to Intelligence: Logical Reasoning, inference of missing diseases, finding genes that can resist certain information, understanding and abstraction, problem solving, diseases) learning from experience, decision makin...

-​ Genomic Diagnosis (genetic susceptibility to Intelligence: Logical Reasoning, inference of missing diseases, finding genes that can resist certain information, understanding and abstraction, problem solving, diseases) learning from experience, decision making, planning, creativity, -​ Drug design and development consciousness and self-awareness (no single definition) -​ Operating theatre robotics -​ Community data analysis (outbreaks of epidemics Intelligence in humans and animals: process ambiguous and pandemics, superbug resistance, societal tendencies noisy information, manage lack of observability, handle – obesity, diabetes, etc – treatment efficacy. uncertainty, solve problems that are too complex for conventional algorithmms, assimilate new knowledge, adapt to Applications: Games a dynamic environment, compete for reward against adversaries. -​ Simulated adversaries (board games, strategy games, combat ai) What is AI? Oxford: The theory and development of computer -​ Non-player characters (interaction with player, systems to be able to perform tasks normally requiring human establishing dialogue, interaction with simulated intelligence such as visual perception, speech recognition, environment) decision-making, and translation between languages. -​ Virtual and augmented reality (processing sensor data, detecting user intent, detection of physical Narrow AI: (more advancements) environment) -​ Solve a puzzle - OTHER: Personal assistance, education, military, search -​ Play a game of Chess & rescue, hostile environments (space, underwater, -​ Recognize speech, faces volcanoes, mining), economics and finance (risk, portfolio -​ Identify objects in images management, trading), transportation and logistics, -​ correlate symptoms astronomy, weather forecasting, climate monitoring, -​ control vehicles agriculture. -​ Need reconfiguration, or a new algorithm to solve a different task Intelligent Agents: An autonomous entity which acts, directing its activity towards achieving goals, upon an using observation General AI (Artificial General Intelligence) through sensors and consequent actuators. Anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. -​ Apply intelligent system to any problem -​ Assimilate concepts -​ Sentience -​ Consciousness -​ self-awareness Main AI research areas: reasoning, learning, problem solving, perception. Applications: Robotics -​ Industrial (assembly lines, packaging and warehouse management) -​ Autonomous Systems (self-driving cards, drones, space rovers and vehicles) -​ Domestic Robots (cleaning, customer service, assistants/companions e.g to elderly) Applications: Industrial Automation -​ Intelligent Control (power plants, smart grids, manufacturing, high risk operations) -​ Defect detection and prevention (early fault detection, quality control, problem diagnosis, removal of human error) -​ Safety and security (pre-empt disaster situations, detect when humans are in danger) -​ Optimization (resource usage, supply chain management, just-in time systems) Applications: Health Percept Sequence = the complete history of all the data the -​ Interpretation of Patient data (x-ray, mri, ct scan agent received from its sensors. Maintaining a complete history images, eeg and ecg wave analysis) is not always feasible, and enumerating all permutations of possible percept sequences is practically impossible. 1 Rational Agents: Rationality depends on -​ The performance measure that defines the success criterion -​ The agent’s prior knowledge of the environment -​ The actions that the agent can perform -​ The agent’s percept sequence to date. OBSERVABILITY: Not all necessary information to make an optimal decision might be available to the agent’s sensors. This makes the environment partially observable. E.G Chess player can see the whole board = fully observable Poker player only seeing their own cards = partially observable Autonomous car can only sense what is immediately coming up on the road = partially Maintaining historical information in memory (e.g. counting cards) often helps an agent to make better decisions under partial observability. STOCHASTICITY: The outcome of an agent’s actions are not necessarily fully predictable An action is deterministic if the effect of applying it always results in the same outcome. An action is stochastic if the effect of applying it can result in different outcomes. Throwing a dice = stochastic, making a chess move = deterministic. DISCRETE: if there is a finite number of action choices and a finite number of states to represent it. (chess game) CONTINUOUS: if the space of possible states or actions that could be applied may be infinite (game of tennis, autonomous car) ADVERSARIAL: if the agent is competing against other agents, possible humans, to achieve its objective. E.g playing chess, pokeer. An autonomous vehicle or predicting the weather is benign. AI To manage uncertainty through computation techniques. It comes from sources like -​ Exploring all permutations being too complex or impossible -​ Environment is partially observable -​ Outcomes of actions can be stochastic -​ Environment is adversarial or multi-agent -​ Sensor inaccuracy and noise -​ Dynamic environments ​ 2 Challenges: Fully vs Partially observable, deterministic vs stochastic, discrete vs continuous, static vs dynamic, single vs multi agent, competitive vs cooperative. Algorithmic complexity – measure of how much computational resources the algorithm needs in proportion to the size of its input. Complexity is typically represented through big O notation. Approximation of an algorithm’s growth rate in terms of its input size. E.g searching linearly through a list of n items has complexity O (n) Time complexity – will the algorithm take a long time to complete as n grows Space complexity – will the algorithm need more memory to run as n grows Uninformed (Blind) Search Expand each state to find all its successors Keep expanding until the desired state is found The order with which states are expanded can impact greatly the time it takes to find a solution. It might never be found at all. Different strategies determine in which order the states are expanded. -​ Breadth first search -​ Depth first -​ Variants of above SEARCH Powerful technique to solve AI problems, Breadth First Search formulated as a Directed Graph (set of nodes, set of directional edges connecting nodes). The intelligent agent searches -​ Expand all nodes at same level before proceeding to through the graph to find the solution. expand successors -​ Shallowest unexpanded node is selected for Solving problems using search:​ expansion State space search:​ -​ New discovered nodes are added to the back of the Nodes – possible states​ queue Edges – state transitions​ -​ Guaranteed to find the shallowest path, not Solution – path from start node to the​ necessarily optimal. desired end node -​ (Shallowest path is optimal only if the cost is a non-decreasing function of the depth of the node. E.G Automated planning. Given (initial state, goal condition, -​ Attractive to use bc most commonly the depth possible actions)​ corresponds to cost, even if not necessarily optimal. System must find the right action sequence to achieve the goal -​ However, its performance is heavily dependent on condition. the branching factor of the problem (how many successors each node generates) -​ Logical Numeric, or Temporal Constraints -​ Time complexity (if each node generates b -​ Partial observability and uncertainty (likelihood that successors, solution is at depth d, complexity is conditions change during the process/until) O(b^d) -​ Space complexity is also O(b^d). So is unexpanded E.G Constraint Solving. Search for an “arrangement” that successors in queue. Explored set if doing graph satisfies all constraints.​ search is O(b^d-1). E.G Optimization. Solution must -​ Exponential complexity. -​ achieve the goal -​ satisfy all constraints -​ minimize/maximize some cost (utility) function, e.g maximize production, minimize cost, etc 3 Depth first search A heuristic is a criteria to help us decide which course of action to take if we have incomplete information. Humans use -​ Expand the first successor, and its first successor heuristics all the time when making decisions. Like in search, it recursively, until a node with no successors is found minimises the chances of going down the wrong path and -​ Then back track one level upwards and repeat having to back track our life choices. -​ No optimality guarantees, a solution w a shorter path could be left unexplored The heuristic function h(n) estimates the cost of the cheapest -​ Time complexity is O(b^m) where m is maximum path from the state at node n to a goal state. Heuristics that depth. incorporate the domain knowledge of the problem being solved -​ Space complexity can be lower; if doing tree search can be very powerful to guide a search algorithm. its O(bm), if doing graph search its O(b^m) -​ Can keep expanding one branch infinitely. Greedy Best-first search Uniform-cost search -​ Expands the node that Is estimated to be closest to the goal. Evaluation function f(n) = h(n) -​ Variant of breadth first that takes into account -​ Greedy bc at each step it tries to get as close to the different costs for each edge. Uses a priorty queue, goal as it can. Ignores the cost to reach the node from least cumulative cost first rather than shallowest first. the initial node. -​ Greedy best-first search is complete in graph search Depth-limited search (maintaining a list of explored states to avoid cycles) as long as the search space is finite. Otherwise it -​ variant of depth first search, but sets a limit, l, to the suffers from the same problem as depth first search depth (to avoid expanding infinitely). Search is not -​ The worst case time and space complexity is still incomplete, it can miss the solution if it’s at a depth 0(b^m) but if it is using an informative heuristic d>l. function this is rarely the case -​ Does not guarantee optimality. Iterative-Deepening Depth First Search A* Search -​ start with l = 0, and gradually increment the depth limit -​ Very popular best-first strategy -​ combines the advantages of breadth first search -​ Incorporates both the cost to reach the node n from (shallowest solution is found first) and those of depth the initial node and also the estimate to reach the goal first search (lower space complexity) from node n -​ with each iteration, the same tree is generated -​ Evaluation function f(n) = g(n) + h(n) again, incurring time -​ Node with the lowest estimated total cost to reach the goal is explored first -​ g(n) is adjusted if a cheaper path to n is discovered Hybrid solutions are also possible -​ v good strategy if coupled with a good heuristic function: -​ eg starting w breadth first search until starting to -​ Complete (guaranteed to find the solution if it exists) approach memory limits, and switching to -​ Optimal if the heuristic is admissible and iterative-deepening depth first search afterwards. consistent(in case of graph search) Informed (best-first) search Variants: -​ try to determine attractiveness of a node n, when -​ Iterative deepening A* (IDA*)– Perform a depth-first compared to the others that are still to be expanded. search until the total cost 𝑓 𝑛 = 𝑔 𝑛 + ℎ 𝑛 exceeds a -​ Use some evaluation function f(n) which produces a given threshold. Less space complexity, with repeated cost estimate from the state information, and the goal evaluation trade-off. it is trying to reach -​ Weighted A* - Giving a weight to the heuristic -​ Unexplored nodes are maintained in a priority component of the evaluation function:​ queue, ordered by f(n), lowest first) 𝒇 𝒏 = 𝒈 𝒏 + 𝜺𝒉 𝒏. If 𝜀 > 1, bias is towards states that are estimated closer to the goal. Trades off optimality Evaluation function f(n) can involve multiple components, for speed. Empirically shown to be significantly faster typically: than A* in a number of domains. -​ The actual cost g(n) to reach the node n from the Constraint satisfaction and optimisation: initial state -​ The estimated cost h(n) to reach a goal node from -​ Path to reach the goal is irrelevant, goal node is node n feasible solution in itself -​ Initial state -> g(n) -> current node n -> h(n) -> goal -​ Solution space search tends to be used node -​ Start node is a candidate solution, not necessarily feasible The cost estimation function h(n) is known as the heuristic -​ Each successor node is also a candidate solution function. -​ Search through variants of the initial solution to find a feasible one, or a higher quality one 4 -​ Quality is measured in terms of some objective plateaus. Typically the tabu list contains solutions that function (minimise cost/maximise utility. were already seen over the last n iterations, (minimising the chances of revisiting the same Local (neighbourhood) search solution space). -​ Particle Swarm Optimization - Starts with a population -​ Aka meta-heuristics “swarm” of candidate solutions (particles), which are -​ Explore the neighbourhood of the node (solution) updated in every iteration. Particles “move” according and choose the next node to explore to a “velocity” calculated from the difference between -​ Typically use minimal amount of memory their fitness and the best fitness value. -​ Often effective in large, possibly infinite/continuous -​ Evolutionary Algorithms Several algorithms inspired search spaces by nature have been proposed such as Ant Colony -​ Useful for optimization problems (Modify the current Optimization (ACO), Artificial Bee Colony (ABC), solution a tiny bit to see if its objective function can be Bees Algorithm, Hunting Search, Firefly Algorithm, improved) Cuckoo Search etc. -​ Susceptible to getting stuck in local maxima/minima (Introduce random jumps/perturbations, explore the Knowledge and Reasoning space from multiple starting points) Knowledge representation is a challenge in AI: -​ Representing the state of the environment (efficiently). -​ Inferring new knowledge from the state information. Hill climbing (greedy local search -​ The actions that the agent can do in the environment, including Any pre-conditions that must be true to allow -​ Simplest local search algorithm it to take place, AND How it is expected to affect the -​ Choose the successor with the best value and repeat environment. until no other successors have a better value -​ How do we expect the environment to change as time -​ Will get stuck in a local maximum goes by. Are there other agents at play -​ Incomplete (if “best” solution is not actually feasible, -​ Possibilities about information we can’t observe. Do the algorithm does not know how to look for we have a probabilistic distribution over the alternative solutions in the search space) possibilities? -​ Simulated Annealing Knowledge-Based Agent: maintains an updated knowledge base and uses an inference engine to deduce and update -​ Combines hill climbing with a random walk to get a knowledge, and choose the best action. chance to escape local minima/maxima -​ 1 randomly choose a successor, instead of best -​ 2 if successor is better, accept it -​ 3 if not better, accept with some probability less than 1, determined according to how worse the successor is and how long the algorithm has been searching for Local beam search -​ Parallelises the search by looing at population instead of just 1 solutioni -​ 1 start with a population of k randomly generated Knowledge base = a collection of statements (sentences) in nodes, (k>1) some knowledge representation language. -​ 2 generate all successors of all k nodes -​ Axioms are sentences that were not derived from -​ Select a population of k best successors other sentences. -​ Repeat from step 1 until the solution is found/cannot -​ TELL operation adds new sentences to the be improved knowledge base -​ Stochastic beam search = a variant which selects the -​ ASK operation queries the knowledge base k successors with a probability based on their -​ Inference rules derive new sentences from other ‘goodness” sentences. ▪ Can be applied both during TELL and during ASK. Genetic algorithms Logic In AI we use (formal) logic to represent and reason about -​ Starts from population of k random nodes knowledge. -​ 1 each individual is evaluated using a fitness function -​ Syntax – how well formed sentences look like. -​ Probability of an individual to reproduce depends on -​ Semantics – the meanings of sentences, i.e. the truth its fitness of the sentence -​ A crossover operator applies to pairs of individuals -​ Logic defines the relationship between different selected for reproduction to obtain a new individual sentences in every model -​ A random mutation with a small independent -​ A model is an abstraction of a possible state of the probability is applied environment. -​ Entailment – if 𝛼 is true, then 𝛽 is true, in every model. Other local search variants Logical Inference -​ Tabu Search – A variant of simulated annealing that Inference is the process by which conclusions are reached keeps a “tabu list” (solutions to avoid) to help escape based on some known evidence and reasoning. 5 Some desirable properties of an inference algorithm: ▪ Sound is used in conjunction with implies, ⇒, since we want to target (truth-preserving) – derives only entailed sentences It does a specific category of objects. not make things up that are not true. ▪ Complete – can derive The Existential Quantifier, ∃, pronounced there exists, any sentence that is entailed. It can prove all the true introduces a variable, such that the sentence is true for at least sentences that can be proved. one object. Nested Quantifiers We can also nest quantifiers and introduce Propositional Logic, a simple but powerful logic multiple variables ▪It consists of a set of atomic sentences (atoms), 𝑃 ▪ Each atom evaluates to true or false. ▪Some operators (logical FOL Satisfiability and Validity connectives) which can be used to create complex sentences. Some examples of satisfiability and validity in FOL: ▪ We can also use parenthesis to connect complex sentences together. ⊤ - tautology, always true. ⊥ - falsity, always false. ¬ - negation of a sentence ∧ − conjunction of two sentences - true if both are true ∨ - disjunction of two sentences - true if any one is true ⇒ - implication (implies) - true as long as when the premise (antecedent), is true, the conclusion (consequent), is also true. ⇔ - equivalence (bidirectional) - true if both true or both false Ontologies Engineering How to create a general and flexible representation for objects, A satisfiable sentence is true in at least one model of the relationships, properties, actions, time, measures, events etc. environment. Can we organise objects into categories? An unsatisfiable sentence is one which cannot ever be true. We can see a category as a set of its members. Properties of a A valid sentence is true in all models of the environment. category can be inferred to hold for all its members. Can we compose objects of other objects? Used for: How can we work with numbers? ▪ Normalising sentences (NNF, CNF, DNF) ▪ Systematically Functions that map to numeric symbols. determining satisfiability and validity. ▪ Inferring sentences. Relationships between numeric symbols. Symbolic AI Solver, does not handle objects and their Can we extend these concepts to time, measures, events etc.? characteristics and the relationships between them, cannot Categories model aggregates and set membership. Organising objects into categories is an important aspect of knowledge representation. Subcategories define inheritance First-Order Logic: can represent objects in the environment, relationships and create a taxonomy, or taxonomic hierarchy, functions on these objects, and relationships between which can be used in inference. objects.model of enviro includes: We also can define membership of a (sub)category by properties. ▪A set of objects. ▪ Every model has at least one object. ▪ One FOL allows us to build hierarchies in different ways: ​ can refer to a specific object with a constant symbol. ▪A set of Categories as relationships: 𝐹𝑟𝑢𝑖𝑡 𝐴𝑝𝑝𝑙𝑒 Categories as objects: functions that maps from an object to another object. ▪A set of 𝑀𝑒𝑚𝑏𝑒𝑟 𝐴𝑝𝑝𝑙𝑒, 𝐹𝑟𝑢𝑖𝑡 Subcategories as objects: 𝑆𝑢𝑏𝑠𝑒𝑡 𝐹𝑟𝑢𝑖𝑡, relations over the objects (can also be over no objects) 𝐹𝑜𝑜d We also need to capture transitive relationships. ​ A term is a logical expression that refers to an object. ▪ We should be able to infer 𝑀𝑒𝑚𝑏𝑒𝑟 𝐴𝑝𝑝𝑙𝑒, 𝐹𝑜𝑜𝑑 automatically. -​ Constant symbols are terms that refer directly to an ▪ This can be achieved by asserting that: ∀𝑠1, 𝑠2 𝑆𝑢𝑏𝑠𝑒𝑡 𝑠1, 𝑠2 object. ▪ John, Mary, A, 2 ⟺ (∀𝑥 𝑀𝑒𝑚𝑏𝑒𝑟 𝑥, 𝑠1 ⇒ 𝑀𝑒𝑚𝑏𝑒𝑟 𝑥, 𝑠2 ) -​ Functions refer to an object which is a property of other objects: ▪ LeftLeg(John) ▪ SonOf(John, Mary) -​ Variables refer to placeholders of objects (used in Intersection and Union in FOL quantifiers) ▪ 𝑥, 𝑦 ▪ Variables can be arguments of An element is part of the intersection between two categories functions: LeftLeg(𝑥) (sets): ∀𝑥, 𝑠1, 𝑠2 𝐼𝑛𝑡𝑒𝑟𝑠𝑒𝑐𝑡𝑖𝑜𝑛 𝑥, 𝑠1, 𝑠2 ⇔ 𝑀𝑒𝑚𝑏𝑒𝑟(𝑥, 𝑠1) ∧ A sentence is a statement over object relations that is a true 𝑀𝑒𝑚𝑏𝑒𝑟(𝑥, 𝑠2) Similarly, we can say that it is part of the union fact. between two sets: ∀𝑥, 𝑠1, 𝑠2 𝑈𝑛𝑖𝑜𝑛 𝑥, 𝑠1, 𝑠2 ⇔ 𝑀𝑒𝑚𝑏𝑒𝑟(𝑥, 𝑠1) -​ Atomic sentences ▪ Vowel(A) ▪ Married(John, Mary) ∨ 𝑀𝑒𝑚𝑏𝑒𝑟(𝑥, 𝑠2) -​ Equality is a special atomic sentence denoting the terms of both sides of the equality operator refer to Composition the same object. ▪ 2 = 2 ▪ SonOf(John, Mary) = David Objects can be part of another object. This concept is widely ▪ Inequality, ≠, is an abbreviation. 𝐽𝑜ℎ𝑛 ≠ 𝑀𝑎𝑟𝑦 means known as Composition. ¬(𝐽𝑜ℎ𝑛 = 𝑀𝑎𝑟𝑦) ▪Engine is part of a car. -​ The same operators (connectives) of Propositional ▪Mouth is part of one’s face. Just like categories, composition Logic can be used to combine FOL atomic sentences can also form a hierarchy: into complex sentences. ▪Birkirkara is part of Malta. ▪Malta is part of Europe. ▪Europe is -​ ¬,∧,∨, ⇒,⇔ part of Earth. The Universal Quantifier, ∀, pronounced for all, introduces a Composition in FOL placeholder variable (typically lowercase) such that the We can create another special relation in FOL to represent sentence is true for all objects.Typically, the universal quantifier composition: ▪Birkirkara is part of Malta: 𝑃𝑎𝑟𝑡𝑂𝑓 𝐵𝑖𝑟𝑘𝑖𝑟𝑘𝑎𝑟𝑎, 𝑀𝑎𝑙𝑡𝑎 ▪Malta is part of Europe: 𝑃𝑎𝑟𝑡𝑂𝑓 𝑀𝑎𝑙𝑡𝑎, 𝐸𝑢𝑟𝑜𝑝𝑒 ▪Europe is 6 part of Earth: 𝑃𝑎𝑟𝑡𝑂𝑓 𝐸𝑢𝑟𝑜𝑝𝑒, 𝐸𝑎𝑟𝑡ℎ We can also make it DOMAIN SPECIFIC PLANNING transitive (like we did for subsets): ∀𝑥, 𝑦, 𝑧 𝑃𝑎𝑟𝑡𝑂𝑓 𝑥, 𝑦 ∧ 𝑃𝑎𝑟𝑡𝑂𝑓 §Dedicated knowledge representation and techniques. 𝑦, 𝑧 ⇒ 𝑃𝑎𝑟𝑡𝑂𝑓(𝑥, 𝑧) §Solution is fine-tuned to the specific rules and assumptions of the problem domain. Numbers in FOLcan be representedas constant symbols. §System is designed for only one application. ▪However, we might want to also define some numeric §Example: § Path/Route planning § Puzzle Solvers properties. For example, we can define the natural numbers (positive integers) including 0 as follows: ▪𝑁𝑎𝑡𝑁𝑢𝑚(0) – 0 is a DOMAIN INDEPENDENT natural number ▪∀𝑛 𝑁𝑎𝑡𝑁𝑢𝑚 𝑛 ⇒ 𝑁𝑎𝑡𝑁𝑢𝑚(𝑆 𝑛 ) – each natural §Generic knowledge representation and search techniques. number has some successor, where 𝑆 𝑛 is a function that maps §Solution is generic, but tries to infer or deduce rules from a symbol to its successor ▪∀𝑛 ¬(0 = 𝑆 𝑛 ) – 0 is not the domain model. successor of any number ▪∀𝑚, 𝑛 𝑚 ≠ 𝑛 ⇒ 𝑆 𝑚 ≠ 𝑆 𝑛 §Reusable across multiple domains. We can even define operators such as addition etc. §Dynamically adaptive to changes to the domain model or the rules of the environment. Further Extensions: of enviro §Focus of a significant part of A.I. Planning research. ▪Measurements: 𝐿𝑒𝑛𝑔𝑡ℎ 𝑅𝑢𝑙𝑒𝑟 = 𝐶𝑒𝑛𝑡𝑖𝑚𝑒𝑡𝑒𝑟𝑠 30 ▪Events: 𝑇(𝐷𝑎𝑦𝑙𝑖𝑔ℎ𝑡, 6) – from 6 onwards there will be daylight Solving a Planning Problem ▪Processes: 𝐻𝑎𝑝𝑝𝑒𝑛𝑠(𝐴𝑖𝑟𝑐𝑜𝑛𝑑𝑖𝑡𝑖𝑜𝑛𝑖𝑛𝑔, 8,16 ) – Airconditioning §Model as a state-transition system, enumerating the different will be on between 8 and 16. states the system may be in as a result of applying actions in ▪Time Intervals: 𝑂𝑣𝑒𝑟𝑙𝑎𝑝𝑠 𝑅𝑒𝑖𝑔𝑛𝑂𝑓 𝐺𝑒𝑜𝑟𝑔𝑒𝑉𝐼 , 𝑅𝑒𝑖𝑔𝑛𝑂𝑓 different orders. 𝐸𝑙𝑖𝑧𝑎𝑏𝑒𝑡ℎ𝐼 §Typically solved using some flavour of combinatorial search: State-space Search: Find a state that satisfies the goal Alternative Systems for Knowledge Representation condition and extract the path from the initial state to the goal -​ Semantic Networks ▪ Objects and categories are state. Plan-space Search: Find a valid plan from a graph of nodes, and relationships are edges. ▪ Inference by partial plans (subset of actions, partially defined ordering, etc.) traversing nodes, depending on the type of edge. §Most state-of-the-art planners make use of state-space -​ Description Logics ▪ Formal language for constructing search with: § Heuristics (to improve search guidance) § and combining categories. Pruning Techniques (to reduce the number of states and branching factor) Planning is: §Plan-space search had some success too, mostly using -Devising a strategy to achieve some desired goal state. local-search techniques. -Choosing actions to maximise the probability of obtaining some outcome. SATISFYING PLANNING -Determining the right sequence of actions to achieve some §Finding a solution to the problem is enough. §Easily desired condition without violating constraints. -Thinking about verifiable, by validating the plan. §Less complex the prerequisites and consequences of an action. computationally (any plan will do). §Plan cost/quality can be -An indicator of high intelligence? (humans, apes, improved post-hoc by continuing search for better plans. chimpanzees, crows) OPTIMAL PLANNING Planning: Identifying the tasks or actions that need to take §The best solution is required. §Often difficult to verify place to achieve your objective. optimality § How do you know there isn’t a better plan? Scheduling: Choosing the right time when such actions should §Computationally much more complex, unless the problem has take place. very specific properties § A* search guarantees optimality if the problem has an admissible (and reasonably informative) The two are very related (a plan without a schedule is not heuristic. really actionable), which causes confusion. Resource allocation spans across both planning and scheduling, depending on whether it affects the viability of the A Classical Planning Task: plan itself.​ A planning task, Π = ⟨si, A, g⟩, where § si! is the initial state, g is the goal condition, Automated Planning aka AI Planning: The ability for an We want to have the result state after applying our plan to intelligent system to make autonomous decisions that lead to a satisfy this goal. sequence of actions to achieve its goals. A is the set of possible actions with each action a ∈ A having -​ a precondition, pre(a), that needs to be satisfied for Brings in cross-disciplinary research, such as: § Knowledge the action to be applicable, and Representation § Combinatorial/Heuristic Search § Constraint -​ an effect, eff(a), that transitions the system to another Programming § Discrete Optimization § Operations Research state. (Linear/Mixed Integer Programming) § Machine Learning (Reinforcement Learning) A Plan: The action sequence Π = (a1 … an) is a valid plan for a Real world applications: planning task Π -> (i subscript) §Industrial Automation in Oil & Gas drilling operations -​ S0 ⊨ pre(a1) §On-board autonomy in Space & Interplanetary Systems -​ Si ⊨ pre(ai + 1), where si = eff (ai)(si-1), i > 0 §Automation of Unmanned Underwater Vehicles §Autonomous -​ Sn ⊨ g Control of Smart Grid power systems §Logistics automation The ⊨ relation means that the state on the left satisfies the and optimisation §Autonomous driving condition on the right. 7 STRIPS Representation for Classical Planning (Named after the Stanford Research Institute Problem Solve automated planner.) The most basic form of representation. -​ Propositional STRIPS: § Facts can be true or false. § State is represented as the set of true facts. § Closed world assumption (omitted facts are assumed false). § Actions have preconditions, add effects (true) and delete effects (false). § Preconditions and Goals are a set of facts that must be true. Planning as State-Space Search: The representation of a §Does the world state look like what we expect it to be planning problem can map to a search problem. according to plan? §Define the search space as a subset of the state space of the §Do we need to replan or can we perform some kind of plan problem. repair? §Nodes correspond to environment states. §Edges (arcs) correspond to the state transitions (applicable Contingent Planning: what if the effects of actions are actions). non-deterministic? §A path in the search space from the initial node to a goal -​ Plan ahead for different possible outcomes of an node corresponds to the plan. action. § Forward state-space algorithms start from the initial node -​ Enrich plans with conditional statements that depend and search until they find a state that satisfies the goal. on the evaluation of a logical statement in real time. § Domain-independent heuristics are a very active area of The plan dictates that sensing activity needs to take research. place. State-Space Plan Extraction §The solution (plan) of a planning problem is a path from the initial state, s0, to a goal state, sg ⊨ g. §The plan, Π = a1, a2, … , a3 , can be extracted from the node of sg by traversing back the parent, putting each action in a stack (LIFO). Conformant Planning §A belief state represents a set of world states. E.g Not sure if the door is locked or not, Location of robot is uncertain. §Find a plan that leads to the goal irrespective of the uncertainty in our belief state, Assume door is locked and get the key anyway, Move robot up and left until we are sure it is in Planning and Acting the corner, this way the location is known in the state. - Planning looks ahead into the future and finds a plan that §Can also be extended to non-deterministic actions. §The achieves a goal or maximises some utility. However things do agent needs to find a plan in the belief space. Conformant not always go “according to plan” … §Stochasticity – Do the Planning is much harder in terms of both time and space actions always have a deterministic outcome? complexity § Will “accelerate” always move the car forward in the same way? Time, Resources, and Exogenous Events §Other agents – Can they interfere with the plan? Are they -​ Classical Planning (STRIPS) can only model facts competitive agents (i.e. their goal to actually stop us from that are true or false. Real-world Planning Tasks achieving our goal)? often involve more sophisticated requirements. §Partial Observability – State information we don’t know about. -​ Temporal Constraints (Temporal Planning) We need Traffic jam on a route we originally thought was the fastest. to arrive to the airport before boarding time of our §Incorrect knowledge – Issues with the data that might affect flight. our decisions Maps not up to date, Wrong GPS coordinates. -​ Numeric Resources (Numeric Planning) We need to have enough fuel to reach our destination. Interleaving Planning with Execution -​ Relationships between numeric properties and time. Distance travelled as a function of time and speed. -​ Processes and events that take place in the environment. Leaving the water tap open and doing no action means that the bucket will overflow and the room will get flooded. Temporal, Numeric and Hybrid Planning allow for richer modelling at the expense of computational complexity 8 PROBABILITY: one of the most imp tool used in AI to handle uncertainty ▪Partial Observability Which facts are most likely given the data I am observing? ▪Stochasticity What is the most likely outcome of some action? What is the probability distribution of the different states I could be in? Maintaining a belief state of all possible world states can become very impractical. (Recall Contingent Planning) Belief state is large and complex. All possible conditions will make the plan arbitrarily large, to even handle unlikely contingencies. What if there is no plan that is guaranteed to achieve the goal? ▪ Reaching the airport on time is contingent on the fact that the car will not break down, it is not involved in an accident, there are no traffic jams … ▪ Choose the actions that are most likely to lead to the goal. Probability for Interference Probability: The sum of probabilities of all possible outcomes of an event must always be 1. The sum of the probabilities of mutually exclusive outcomes conditioned on the same prior is 1. The probability of a dependent variable must take into account the probability of all priors. Joint Probabilities of independent events are multiplied. Probabilities of alternative mutually exclusive outcomes are added. 9 Bayes Networks: Compact representations of the probability distributions over graphs of random variables. Probabilistic Graphical Models that are used in a lot of fields, such as: ▪Pattern Observations and detecting anomalies Manufacturing and Industrial settings, spam filtering. ▪Diagnostics System troubleshooting, health, tracing cause and effect. ▪Predictions Forecasting most probable future state (weather, finance) ▪Decision Making Identifying actions that lead to the most favourable probable outcome (robotics, autonomous vehicles, decision support systems) Bayes Networks are also known as Bayesian Networks, Belief Networks, Decision Networks. Bayes Networks in AI Bayes Networks are also the basic building blocks of more advanced AI techniques such as: ▪Causal Networks ▪Hidden Markov Models ▪Markov Decision Processes ▪Partially Observable Markov Decision Processes ▪Particle Filters ▪Kalman Filters Inference with Bayes Networks Variables are categorised into 3 types. Query Variables – what we want to know, Evidence Variables – what we can observe, Hidden Variables – they have some kind of influence on our model 10 We typically want to find one of the following: ▪Prediction: Determine some future event from data being The probability distribution of our query variables given the observed. E.g. predict if the stock market price will go up, evidence. The most likely explanation to the evidence. What predict if it is going to rain tomorrow ▪Recommendation values should the query variables take to maximise the joint Determine what a user’s interests are depending on similarities probability. with others. E.g. music or movie recommendations ▪Anomaly Detection Determine whether some input data follows the expected trend. E.g. detecting faults in plant machinery Other Inference Techniques: ▪Inference by Enumeration can get very computationally expensive. ▪Variable Elimination is a technique to group variables together to reduce the number of variables in the network. ▪Approximate Inference Techniques: Use simulation or sampling to approximate the joint probabilities. Markov Chain Monte Carlo methods generate samples of a random variable at the specified known probability distribution. MACHINE LEARNING: A sub-field of AI about discovering models from data. Makes heavy use of statistics Applications of Machine Learning ▪Customer purchase patterns Product recommendations ▪Robotics Extracting information from noisy sensors. Machine vision. ▪Natural Language Processing, Machine language translation, Extracting meaning, Speech recognition. ▪Health and Biomedical Engineering, Processing of EEG and ECG data., Analysis of X-Ray and MRI imaging. ▪Fintech, Stock market patterns for trading. The system learns: ▪Model parameters Probability distributions for certain events. E.g. If it is cloudy in December, what is the likelihood of rain? ▪Structure Relationships between variables. E.g. Air Pressure is correlated with the weather. ▪Hidden Concepts Identifiers of certain clusters or groups. E.g. Customers who have certain tastes of music. What are we trying to achieve? ▪Classification: Determine whether some input data set belongs to some class or not. E.g. determining whether an email is spam, whether an image contains a cat ▪Regression: Determine a continuous value from some input value. E.g. estimate the house price given the location and floor size 11 Issues with linear regression: -​ very fast to train and works well if there is a clear linear correlation between the input variables and the target values. -​ If the data has a lot of noise (outliers) they will steer the regression line away from the real trend-line. -​ If the data is non-linear, linear regression will have poor predictions. 12 Some other supervised learning techniques: ▪Naïve Bayes Classifiers ▪Support Vector Machines ▪Artificial Neural Networks ▪ Many variants such as Recurrent Networks, Convolution Networks etc. ▪Nearest Neighbour (kNN) ▪Decision Trees and Random Forests 13 REINFORCEMENT LEARNING: a machine learning technique to determine which decisions or actions deliver most reward. In a sense, it is very related to Planning. The agent learns by exploring and observing the effects of its actions on the environment. Training from the real environment. Training from historical data. Training from simulations. The objective of Reinforcement Learning is to find a policy Π, that recommends the actions that maximise some reward. Planning: A planning task can be represented as a state transition system. Actions transition from one state to the other.We search through the state transition system to find a path from the initial state to some goal state. If the number of states is finite, it is also called a Finite State Machine or Finite State Automaton. So far we have only talked about problems that are deterministic and fully observable. If the number of states is finite, it is also called a Finite State Machine or Finite State Automaton. So far we have only talked about problems that are deterministic and fully observable. Planning under uncertainty: 14 15 MULTI-AGENT SYSTEMS: Single Agent vs Multi-Agent The techniques we discussed so far tackle the various characteristics of AI problems from the point of view of a single intelligent agent. -​ Computational Complexity -​ Stochasticity -​ Partial Observability -​ Unknown Environments Games: Single-player = one agent is trying to win the game or maximise reward Multi-player = multiple agents are trying win the game or maximise reward. Games can also be stochastic, like rolling dice, partially observable e.g cards hidden. Zero-sum games: the gain of some participants is EQUAL to the loss of others. Summing total utility of all participants is 0. 16 GAME THEORY: studies strategies to apply in different kinds of situations. -​ Sequential (turn-based) vs Simultaneous Games. -​ Co-operative vs non-cooperative games. ▪ -​ Perfect vs Imperfect Information (partial observability). -​ Symmetric (participants have same goal) vs Asymmetric Games. -​ Zero-sum vs Non-zero-sum Games. It can also be applied to more serious situations (auctions, business, politics, war & defence) 17 18 ROBOTICS: Although Robotics and AI are separate fields, they have always been associated together. -​ In fiction, AI has always been embodied in robotics. -​ The term “robotics” was coined by science fiction author Isaac Asimov in his novels, which often have moral and ethical AI dilemmas in their plots. Online State estimation: Infers the most likely state from measurements. Filters are algorithms that estimate a belief state, a probability distribution over the possible real-world states. A bayes filters is used to estimate the belief state, xt, given -​ The most recent measurements from sensors. zt -​ The most recent applied action, at -​ The previous belief state of the system x t-1 If the Markov property is assumed, then we dont care about belief states prior to x t-1. 19 20 Planning in robotics: Planning to perceive: §Planning in Robotics is typically layered into different levels of Explicit sensing actions could be introduced as part of the plan abstraction. in order to gather information from the environment. §At a higher abstraction level, planning can be planned in a -​ A sensing action could include the configuration of the quasi- deterministic fashion. Plan the sequence of towns we sensor and how to process its data. (For example: need to visit to reach our destination. Plan the tasks that need execute an object recognition algorithm on the image to be performed in the right order. read from a camera.) §At a lower abstraction level, plans are more short term, and -​ These kinds of actions are usually very important in need to be more robust: robotic systems. -​ Continuous values. Symbol Anchoring: -​ Partial observability -​ The perceived information needs to be mapped to the -​ Stochasticity objects and facts described in the planning model. -​ Dynamic environments -​ Anchoring is the process of maintaining the -​ Multi-agent correspondence of the symbolic state of an object to the sensor data that refers to the same physical object. (how does the system recognise two similar objects from 1 another, pattern recognition and object signatures, interference from known information about the objects, re-establishing of anchors after occlusion) Supervision and Monitoring: -​ A lot of things can go wrong during execution, and blindly executing a plan will inevitably lead to frequent failures. -​ Monitoring is in charge of 1. Detecting discrepancies between predicted states and observations 2. Diagnosing possible causes. 3. Triggering recovery activities -​ The platform itself also needs to be monitored for malfunction. Are we observing an incoherent state because things didn’t go as planned, or because the sensor is faulty? Planning with Uncertainty: §Conformant Planning - Find a plan that is successful for all the possibilities in our belief state. §Contingent Planning - Generate a decision tree that determines which part of the plan to execute. - Interleave sensing actions to determine which branch of the tree to take. §Markov Decision Processes (MDP) - Model stochastic actions with their probabilities (if known). - Find a policy to reach the goal. §Partially Observable Markov Decision Processes (POMDP) MDPs with Belief States. §Plan Supervision and Monitoring - Determine if the plan is still valid, and recover (repair / replan) accordingly. 21 PRINCIPLES OF COMPUTER VISION -​ Object detection: Computer vision… how cam we capture the world around us, Image processing techniques do not require annotated images, how can we make it easier for machines to capture visual data, restricted for complex scenarios, occlusion, illumination, how can the machines make sense of this data, how will they shadows. learn? -​ Deep learning methods CV: the automatic extraction, analysis, interpretation of images Supervised, unsupervised learning or videos. Cv converts photos and videos into numerical arrays, enabling ML algorithms to draw inferences, make Libraries for ML and DL predictions, and even generate new images based on user -​ Open CV< scikit-image, numpy, pytorch. defined inputs. Helps machines make sense out of received -​ Specialised: detectron2 for object detection, visual data. mediapipe for real time ML solutions, simple cv: cv interface, kornia: differentiable cv, fastAI deep Applications: learning applications -​ Preview of a digital image -​ Image data base query Object detection applications:​ -​ Mri CV gives the machines the sense of sight - it allows them to -​ Processing scanned images “see” and explore the world thanks to ML and DL algorithms. -​ Processing birds eye view images This powerful technology has quickly found applications across multiple industries, becoming an indispensable part of technological development and digital transformation. E.g: - manufacturing sector -​ [product assembly automation -​ Defect Detection -​ 3D Vision System -​ Computer vision-guided Die Cutting -​ Predictive Maintenance -​ Safety and Security Standards -​ Barcode Analysis -​ Inventory Management E.g - Transportation -​ Detecting Traffic and Traffic Signs -​ Pedestrian Detection -​ Traffic Flow Analysis -​ Parking Management -​ License Plate Recognition -​ Road Condition Management -​ Automatic Traffic Incident Detection Stages of computer vision: -​ Driver Monitoring -​ Acquisition -​ Pre-processing -​ Low-level processing E.g - healthcare -​ High-level processing -​ Improved Medical Imaging Better Diagnostic -​ Decision making Applications Cancer Screening Surgery Assistance Research and Identifying Trends Retention 1.​ Acquisition Management in Clinical Trials Training Injury The process of acquiring images (2D, 3D). An engineering Prevention discipline focusing on automating/digitising the human vision system. E.g - agriculture Camera parameters: ​ -​ Quality Inspection of Agricultural Food Products intrinsic parameters = focal length, principal point, lens Image-based Plant Disease and Pests Detection distortion. ​ Weed Detection Soil Sampling and Mapping with Extrinsic parameters: rotation, translation (relative to other Drones Livestock Management Ecient Yield cameras or original position) Analysis Grading and Sorting of Crops Phenotyping Indoor Farming -​ Image digitisation A numeric representation of an image on a 2D grid. Each E.g retail element is referred to as a pixel and its value represents the -​ Visual Search for Enhanced Customer Experience shade or colour of that segment. Product Recommendations AR-based Try before you Buy Fitting Rooms with Magic Mirrors -​ 2D Images to 3D Scenes Automating Categorisation Improved Search How can a machine understand the complexity of an enviro, Accuracy Better Inventory Management which objects are where? 22 E.g - sports and fitness No frequencey information -​ Real time Action Management Ecient Ball Tracking More brittle and slower than statistical approaches (Tennis and other sport) Training and Development Often more precise than statistical approaches Analytics Prevention of Life Threatening Situations Error analysis is usually easier than for statistical approaches Emerging trends in CV: Market Statistics for CV Size and growth Emerging Job Roles - Computer Vision Engineer - ML Statistical: Operations Specialist - AI Research Scientist - Edge Supervised or non-supervised Computing Developer - Data Generation Specialist - CV Rules acquired from large size corpora Quality Assurance Engineer Not much linguistic expertise required Robust and quick Required future skills: Requires large size (annotated) corpora Technical: * Deep learning frameworks, Edge computing, * Error analysis is often difficult Data generation * Model optimisation Non-technical: Ethics understanding * Privacy regulations * Crossdomain knowledge * System design Resources and corpora: -​ Disk space becomes cheap Future trends: -​ ML readable text becomes common -​ US funding emphasises large scale evaluation on “real data” -​ A computational thesaurus developed by psycholinguists 1990s Wordnet -​ British National Corpus 1994 -​ WWW used as corpus Syntax: phrases/ sentences -​ Syntactic analysers and surface realisers assign a syntactic structure to a string/semantic representation on the basis of a grammar -​ Syntax tree -​ Applications: construct meaning, construct grammatical sentence, perform machine translation -​ John sat down -> sit(j,d) The core strength of computer vision is the high accuracy with Semantics: meaning which it can replace human vision if and when trained rightly. Lexical semantics = ,meaning pf words Ambiguity! Disambiguation vs paraphrasing NATURAL LANGUAGE PROCESSING Pragmatics: Knowledge about the kind of actions that speakers intend by their use of sentences (request, statement, question BIRTH OF NLP -> speech act analysis) Turing’s (1936) model of algorithmic computation Discourse McCulloch-Pitts neuron (McCulloch and Pitts, 1943) a simplified model of the neuron as a kind of computing element NLP: (propositional logic) Documents -> language detection -> pre-processing -> Kleene (1951) and (1956) finite automata and regular modelling -> output expressions Chomsky (1956) finite state machines as a way to Heuristics-based: characterize a grammar Rule-based system : Thesaurus Dictionaries First machine speech recognizers (early 1950s). 1952, Bell Regular Expressions Lab, statistical system that could recognize any of the 10 digits Context-free Grammars (CFG) from a single speaker (Davis et al., 1952) -​ Regex MACHINE TRANSLATION: -​ CFGs ( S for subjc, V for verb etc) Major attempts in US and USSR Russian to English and reverse ML STEPS: Weaver (1955) states When I look at an article in Russian, I 1. Extract features from the text say “This is really written in English, but it has been coded in 2. Use the features to train a model some strange symbols, I will now proceed to decode it”. 3. Evaluate & Improve the model George Town University, Washington system: Translated E.g Naïve Bayes Support Vector Machines Hidden Markov sample texts in 1954 Models The ALPAC report (1964) Assessed research results of groups working on MTs Concluded: MT not possible in near DL: future Funding should cease for MT ! Modern reincarnation of the Artificial Neural Networks A collection of mathematical models that combined together APPROACHES: are giving us unprecedented results Each unit is not particularly intelligent on its own but it interact Symbolic: with other units, intelligence emerges dataset of 10 million labels, it usually match or exceeds Based on hand written rules human performance Requires linguistic expertise Inspired by what we know of the biological brain 23 demonstrated lower accuracy rates for certain ethnicities and DL refers to genders -​ The nested representation of the data -​ The depth of the computations required 2.​ Transparency & Explainability -​ The number of times a representation is updated Modern AI systems, especially deep learning models, often -​ The depth of the concept graph operate as "black boxes." This raises concerns about: Understanding how decisions are made Ability to audit and DL tackles problems by: verify AI systems Right to explanation for aected individuals Uses representations built on other simpler representations Accountability when things go wrong Builds complex concepts out of simpler ones Uses a multilayer perceptron 3.​ Privacy & Data Rights Computer learns the right representation AI systems require vast amounts of data, raising questions about: Data collection and consent Personal information protection Right to be forgotten Data sovereignty and cross-border data flows. 4.​ Accountability & Responsibility When AI systems make mistakes or cause harm, who's responsible? Software developers? Companies deploying the AI? Users of the system? The AI itself? Healthcare AI Benefits: AI powered diagnostic tools Benefits: early detection of diseases, reduced human error Risks: Over-reliance on AI, privacy concerns, bias in training data UNIVERSALITY MEANS THAT A NETWORK CAN FIT TO Autonomous Vehicles: The self-driving car industry continues ANY TRAINING DATA YOU GIVE IT DOES NOT MEAN IT to grapple with ethical decisions: The trolley problem in real WILL INTERPOLATE TO NEW DATA POINTS IN A life: How should vehicles prioritise dierent lives? Recent REASONABLE WAY! statistics show autonomous vehicles are generally safer than human drivers But: Public trust remains a signicant challenge Transformers: Legal frameworks are still evolving to address liability issues Translation, summarisation, chatbot, question answering Criminal Justics: The use of AI in criminal justice systems has Large language model: revealed signicant ethical challenges: Risk assessment tools A neural network trained to generate human-like text showing racial bias Facial recognition in law enforcement Learns to predict words in a sequence raising privacy concerns Questions about due process when Utilises vast datasets and billions of parameters AI systems inuence judicial decisions Transformer-based architecture Handles a range of tasks: summarisation, translation, text Current regulatory frameworks: generation -​ The EU AI Act (2024): The world's rst comprehensive AI regulation: Risk-based approach to AI regulation Strict rules for high-risk AI applications Transparency Word -> word embeddings -> dimensionality reduction -> requirements Penalties for non-compliance visualisations of word embeddings in 2D Implementation timeline and requirements -​ US AI Bill of rights blueprint Handles, classes, etc and other relationships -​ Chinas AI governance framework -​ National AI strategies and their ethical components Mapping multiple kinds of data: learn mappings between data in a single representation, like Emerging ethical challenges: english and chinese words that have similar meanings so -​ LLMs: Issues with misinformation and hallucination appear close together in the map Copyright and intellectual property concerns Impact on creative industries Educational integrity concerns Deep learning can go further, mapping images -​ AI generated content: Deepfakes and synthetic media Authentication challenges Impact on trust in digital ChatGPT: learn from users, validate with them, use media Creative rights and attribution reinforcement -​ Environmental impact: Energy consumption of AI systems Carbon footprint of training large models AI ETHICS: Sustainability considerations in AI development 1.​ Fairness & Algorithmic Bias Practical GUIDELINES for ethical AI development: AI systems can perpetuate and amplify existing societal 1.​ Design principles, privacy fairness transparency biases. Recent examples: In 2023, multiple studies found that security by design large language models showed gender and racial biases in 2.​ Implementation strategies: regular bias testing, their outputs AI recruitment tools have shown bias against diverse development teams, stakeholder consultation, certain demographic groups Facial recognition systems have impact assessments 24

Use Quizgecko on...
Browser
Browser