Intelligent Agents
Summary
This document is a transcript of lecture slides on intelligent agents. It covers the definition of an intelligent agent, the agent function, rationality, PEAS problem specifications, environment types, and the main agent types, with examples ranging from a simple vacuum-cleaner agent and thermostats to a modern vacuum robot.
Full Transcript
[Image: "Robot at the British Library Science Fiction ..."]

Outline
- What is an intelligent agent?
- Rationality
- PEAS (Performance measure, Environment, Actuators, Sensors)
- Environment types
- Agent types

What is an Agent?
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.
- Control theory: a closed-loop control system (= feedback control system) is a set of mechanical or electronic devices that automatically regulates a process variable to a desired state or set point without human interaction. Here the agent is called a controller.
- Softbot: the agent is a software program that runs on a host device.

Agent Function and Agent Program
The agent function maps from the set of all possible percept sequences to the set of actions, formulated as an abstract mathematical function: a = f(p), where p is the percept sequence and a is the chosen action.
The agent program is a concrete implementation of this function for a given physical system.
Agent = architecture (hardware: sensors, memory, computational power) + agent program (the implementation of f).

Example: Vacuum-cleaner World
Percepts: location and status (the most recent percept), e.g., [A, Dirty]
Actions: Left, Right, Suck, NoOp
Agent function (as a table):
    Percept sequence                    Action
    [A, Clean]                          Right
    [A, Dirty]                          Suck
    ...
    [A, Clean], [B, Clean]              Left
    [A, Clean], [B, Clean], [A, ...]    ...
Problem: this table can become infinitely large!
Implemented agent program:
    function Vacuum-Agent([location, status]) returns an action
      if status = Dirty then return Suck
      else if location = A then return Right
      else if location = B then return Left
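The agent program above is small enough to run directly. Below is a minimal runnable sketch in Python; the two-cell world and the simulation loop are illustrative assumptions added for this example, not part of the original slides.

    # Reflex agent program for the vacuum-cleaner world (a = f(p)).
    def vacuum_agent(percept):
        location, status = percept
        if status == "Dirty":
            return "Suck"
        elif location == "A":
            return "Right"
        else:  # location == "B"
            return "Left"

    # Tiny percept-action loop in an assumed two-cell environment.
    world = {"A": "Dirty", "B": "Dirty"}
    location = "A"
    for _ in range(5):
        percept = (location, world[location])
        action = vacuum_agent(percept)
        print(percept, "->", action)
        if action == "Suck":
            world[location] = "Clean"
        else:
            location = "B" if action == "Right" else "A"

Note how the program stays a few lines long while the equivalent percept-sequence table grows without bound.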
Rational Agents: What is Good Behavior?
Foundation:
- Consequentialism: evaluate behavior by its consequences.
- Utilitarianism: maximize happiness and well-being.
Definition of a rational agent: "For each possible percept sequence, a rational agent should select an action that maximizes its expected performance measure, given the evidence provided by the percept sequence and the agent's built-in knowledge."
- Performance measure: an objective criterion for the success of an agent's behavior (often called a utility function or reward function).
- Expectation: the outcome averaged over all possible situations that may arise.
Rule: pick the action that maximizes the expected utility.

Rational Agents
The rule "pick the action that maximizes the expected utility" means:
- Rationality is an ideal; it implies that no one can build a better agent.
- Rationality ≠ omniscience: rational agents can make mistakes if percepts and knowledge do not suffice to make a good decision.
- Rationality ≠ perfection: rational agents maximize expected outcomes, not actual outcomes.
- It is rational to explore and learn, i.e., to use percepts to supplement prior knowledge and become autonomous.
- Rationality is often bounded by available memory, computational power, available sensors, etc.

Example: Vacuum-cleaner World (revisited)
Percepts, actions, agent function, and agent program are as above. What could be a performance measure?

Problem Specification: PEAS
- Performance measure: defines utility and what is rational.
- Environment: the components and the rules of how actions affect the environment.
- Actuators: define the available actions.
- Sensors: define the percepts.

Example: Automated Taxi Driver
- Performance measure: safe, fast, legal, comfortable trip; maximize profits.
- Environment: roads, other traffic, pedestrians, customers.
- Actuators: steering wheel, accelerator, brake, signal, horn.
- Sensors: cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard.

Example: Spam Filter
- Performance measure: accuracy, minimizing false positives and false negatives.
- Environment: a user's email account, email server.
- Actuators: mark as spam, delete, etc.
- Sensors: incoming messages, other information about the user's account.

Environment Types
- Fully observable vs. partially observable: With full observability, the agent's sensors give it access to the complete state of the environment; the agent can "see" the whole environment. With partial observability, the agent cannot see all aspects of the environment, e.g., it can't see through walls.
- Deterministic vs. stochastic: In a deterministic environment, changes are completely determined by the current state of the environment and the agent's action. In a stochastic environment, changes cannot be determined from the current state and the action (there is some randomness). Strategic: the environment is stochastic and adversarial; it chooses actions strategically to harm the agent, e.g., a game where the other player is modeled as part of the environment.
- Known vs. unknown: In a known environment, the agent knows the rules of the environment and can predict the outcome of actions. In an unknown environment, the agent cannot predict the outcome of actions.
- Static vs. dynamic: A static environment does not change while the agent is deliberating; a dynamic environment does. Semidynamic: the environment is static, but the agent's performance score depends on how fast it acts.
- Discrete vs. continuous: A discrete environment provides a fixed number of distinct percepts, actions, and environment states. Continuous percepts, actions, state variables, or time lead to an infinite state, percept, or action space. Time can also evolve in a discrete or continuous fashion.
- Episodic vs. sequential: An episode is a self-contained sequence of actions; in an episodic environment, the agent's choice of action in one episode does not affect the next episodes (the agent does the same task repeatedly). In a sequential environment, actions now affect the outcomes later; e.g., learning makes problems sequential.
- Single-agent vs. multi-agent: A single agent operates by itself in an environment; in a multi-agent environment, agents cooperate or compete.

Examples of Different Environments
                 | Vacuum cleaner world | Chess with a clock        | Scrabble                               | Taxi driving
    Observable   | Partially            | Fully                     | Partially                              | Partially
    Determ.      | Deterministic        | Deterministic + strategic | Stochastic game mechanics + strategic* | Stochastic
    Episodic     | Episodic             | Sequential                | Sequential                             | Episodic?
    Static       | Static               | Semidynamic               | Static                                 | Dynamic
    Discrete     | Discrete             | Discrete                  | Discrete                               | Continuous
    Agents       | Single               | Multi*                    | Multi*                                 | Multi*
* Can be modeled as a single-agent problem with the other agent(s) as part of the environment.
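Tying the rational-agent rule back to stochastic environments: once the agent has a probabilistic model of what each action can lead to, "pick the action that maximizes the expected utility" becomes a short computation. A minimal sketch follows; the outcome probabilities and utility values are made-up illustration numbers, not from the slides.

    # Expected-utility action selection under an assumed outcome model.
    # Each action maps to (probability, utility) pairs for its possible outcomes.
    outcomes = {
        "Suck":  [(0.9, 10), (0.1, -1)],
        "Right": [(1.0, -1)],
        "NoOp":  [(1.0, 0)],
    }

    def expected_utility(action):
        # Expectation: outcome utility averaged over all possible situations.
        return sum(p * u for p, u in outcomes[action])

    best = max(outcomes, key=expected_utility)
    print(best, expected_utility(best))   # Suck 8.9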
Designing a Rational Agent
Remember the definition of a rational agent: "For each possible percept sequence, a rational agent should select an action that maximizes its expected performance measure, given the evidence provided by the percept sequence and the agent's built-in knowledge."
The agent function a = f(p) runs on the hardware (the "brain") as an event loop:
- Read the sensors (the current percept).
- Remember the percept sequence.
- Ask the agent function for the action to execute, using the percept sequence and the built-in knowledge.
- Execute the action.
- Assess the performance measure.
Important: everything outside the agent function represents the environment. This includes the physical robot, its sensors, and its actuators.

Hierarchy of Agent Types (from simple to sophisticated):
- Simple reflex agents
- Model-based reflex agents
- Goal-based agents
- Utility-based agents

Simple Reflex Agent
Uses only built-in knowledge in the form of rules that select an action based only on the current percept: a = f(p). This is typically very fast! The agent needs no memory and ignores all past percepts. The agent does not know about the performance measure, but well-designed rules can lead to good performance. The interaction is a sequence of percepts and actions. Example: a simple vacuum cleaner that uses rules based on its current sensor input.

Model-based Reflex Agent
Maintains a state variable to keep track of aspects of the environment that cannot currently be observed; i.e., it has memory and knows how the environment reacts to actions (the transition function). The state is updated using the percept, s' = T(s, a), and the action is chosen based on percept and state, a = f(p, s). There is now more information available for the rules to make better decisions. Example: a vacuum cleaner that remembers where it has already cleaned.

State Representation
States help to keep track of the environment and the agent in the environment; this is often also called the system state. The representation can be:
- Atomic: just a label for a black box, e.g., A, B.
- Factored: a set of attribute values, e.g., [location = left, status = clean, temperature = 75 deg. F]. The variables describing the system state are called "fluents"; actions cause transitions between states.
We often construct atomic labels from factored information. E.g., if the agent's state is the coordinates x = 7 and y = 3, then the atomic state label could be the string "(7, 3)". With the atomic representation, we can only check whether two labels are the same. With the factored representation, we can reason more, e.g., calculate the distance between states. The set of all possible states is called the state space; this set is typically very large!

Transition Function
The environment is modeled as a discrete dynamical system. [Figure: state diagram for the vacuum-cleaner world.] States change because of
(a) the system dynamics of the environment (the environment evolves by itself), and
(b) the actions of the agent.
Both types of changes are represented by the transition function, written as T: S × A → S or s' = T(s, a), where S is the set of states, A is the set of available actions, a is an action, s is the current state, and s' is the next state.
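A model-based reflex agent can be sketched by combining the factored state representation with a state update from each percept, following s' = T(s, a) and a = f(p, s). The fluents and decision rules below are illustrative assumptions; the slides only give the general scheme.

    # Model-based reflex agent: keeps a factored state updated from percepts.
    class ModelBasedVacuumAgent:
        def __init__(self):
            # Fluents: remembered status of each cell ("Unknown" until seen).
            self.state = {"A": "Unknown", "B": "Unknown"}

        def __call__(self, percept):          # a = f(p, s)
            location, status = percept
            self.state[location] = status     # state update from the percept
            if status == "Dirty":
                return "Suck"
            if all(v == "Clean" for v in self.state.values()):
                return "NoOp"                 # memory avoids pointless moves
            return "Right" if location == "A" else "Left"

    agent = ModelBasedVacuumAgent()
    print(agent(("A", "Dirty")))   # Suck
    print(agent(("A", "Clean")))   # Right (B is still unknown)

Unlike the simple reflex agent, this agent can stop (NoOp) once its memory tells it that every cell is clean.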
Old-school vs. Smart Thermostat
Old-school thermostat:
- Percepts: the current temperature, sensed by a bimetal spring that opens and closes contacts.
- States: no states (it only reacts to the current percept).
- Setting: set the target temperature; change the temperature when you are too cold/warm.
Smart thermostat: many sensors, internet connectivity, memory.
- Percepts: temperature (deg. F); setting (cool, off, heat); contact (open, closed); someone walking by; someone changes the temperature; from the internet: outside temperature, weather report, energy curtailment, day & time, schedule, ...
- States (factored): estimated time to cool the house; someone home?; how long till someone is coming home?; ...

Goal-based Agent
The agent has the task to reach a defined goal state and is then finished. The agent needs to move towards the goal. A special type is the planning agent, which uses search algorithms to plan a sequence of actions that leads to the goal. Performance measure: the cost to reach the goal. The agent chooses its actions as

    a = argmin over plans (a_0, ..., a_T) of [ sum_{t=0..T} c_t ]  such that s_T ∈ S_goal,

i.e., it minimizes the sum of the costs of a planned sequence of actions that leads to a goal state. The interaction is a sequence of percepts, actions, and costs. Example: solving a puzzle. What action gets me closer to the solution?

Utility-based Agent
The agent uses a utility function to evaluate the desirability of each possible state, typically expressed as the reward of being in a state. It chooses actions to stay in desirable states. Performance measure: the discounted sum of expected utility over time,

    a = argmax_{a ∈ A} E[ sum_{t=0..∞} gamma^t r_t ],

i.e., utility is the expected future discounted reward. Techniques: Markov decision processes, reinforcement learning. The interaction is a sequence of percepts, actions, and rewards. Example: an autonomous Mars rover prefers states where its battery is not critically low.

Agents that Learn
The learning element modifies the agent program (reflex-based, goal-based, or utility-based) to improve its performance: it assesses how the agent is currently performing, updates how the performance element chooses actions, and can generate actions for exploration.
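The goal-based agent's objective, minimizing the summed cost of a plan that ends in a goal state, can be made concrete in the vacuum-cleaner world. With unit action costs, breadth-first search over the transition function T finds a minimum-cost plan; the (location, status of A, status of B) state encoding is an assumption chosen for this sketch.

    from collections import deque

    # Planning by search over s' = T(s, a) with unit action costs,
    # so breadth-first search returns a minimum-cost plan.
    ACTIONS = ["Left", "Right", "Suck"]

    def T(state, action):
        loc, a, b = state
        if action == "Suck":
            return (loc, "Clean", b) if loc == "A" else (loc, a, "Clean")
        if action == "Left":
            return ("A", a, b)
        return ("B", a, b)                       # "Right"

    def plan(start):
        """Return a minimum-length action sequence to an all-clean goal state."""
        frontier = deque([(start, [])])
        visited = {start}
        while frontier:
            state, actions = frontier.popleft()
            if state[1] == state[2] == "Clean":
                return actions                   # goal state reached
            for action in ACTIONS:
                nxt = T(state, action)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, actions + [action]))

    print(plan(("A", "Dirty", "Dirty")))   # ['Suck', 'Right', 'Suck']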
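For the utility-based agent, the performance measure is the discounted sum of rewards; for a concrete reward sequence it is a one-liner. The discount factor and the rewards below are made-up illustration numbers.

    # Discounted return: sum over t of gamma^t * r_t.
    def discounted_return(rewards, gamma=0.9):
        return sum(gamma**t * r for t, r in enumerate(rewards))

    print(discounted_return([0, 0, 10]))   # a reward of 10 two steps away ≈ 8.1 now

The discount factor gamma < 1 makes near-term rewards worth more than distant ones, which keeps the infinite sum finite.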
Example: Modern Vacuum Robot
Features: control via app, cleaning modes, navigation, mapping, boundary blockers.
Source: https://www.techhive.com/article/3269782/best-robot-vacuum-cleaners.html

PEAS Description of a Modern Robot Vacuum
- Performance measure: time to clean 95%; does it get stuck?
- Environment: rooms, obstacles, dirt, people/pets, ...
- Actuators: wheels, brushes, blower, sound, communication with server/app.
- Sensors: bumper, cameras/dirt sensor, laser, motor sensor (overheating), cliff detection, home base locator.

Intelligent Systems as Sets of Agents: Self-driving Car
- High-level planning:
  - Utility-based agents: make sure the passenger has a pleasant drive (not too much sudden braking = utility). It should learn!
  - Goal-based agents: plan the route to the destination.
- Model-based reflex agents: remember where every other car is and calculate where they will be in the next few seconds.
- Low-level planning:
  - Simple reflex agents: react quickly to unforeseen issues, like a child running in front of the car.

AI Areas
Intelligent agents inspire the research areas of modern AI:
- Search for a goal (e.g., navigation).
- Optimize functions (e.g., utility).
- Stay within given constraints (constraint satisfaction problems; e.g., reach the goal without running out of power).
- Deal with uncertainty (e.g., the current traffic on the road).
- Sensing (e.g., natural language processing, vision).
- Learn a good agent program from data and improve over time (machine learning).

What You Should Know
- What an agent function is and how it interacts with the environment.
- What states are and what the transition function is.
- How environments differ in terms of observability, uncertainty (stochastic behavior), and whether the transition function is known.
- How to identify different types of agents.