CS620 Handouts (Lessons 131 to 258) by Muhammad Anees

FinalTerm Preparation (Lessons 131 to 258)
Modeling and Simulation (CS620)

Lesson 131: Spatial Environments

Spatial environments in agent-based models generally come in two variants:
 discrete spaces, and
 continuous spaces.

Continuous Spaces: In a mathematical representation, a continuous space is one in which, between any pair of points, there exists another point.

Discrete Spaces: In a discrete space, by contrast, each point has neighboring points, but there exist pairs of points with no other points between them, so that each point is separated from every other point.

ABM Implementation
When implemented in an ABM, all continuous spaces must be implemented as approximations: continuous spaces are represented as discrete spaces in which the distances between points are very small. Note that both discrete and continuous spaces can be either finite or infinite.

Discrete Spaces
The most common discrete spaces used in ABM are lattice graphs (sometimes also referred to as mesh graphs or grid graphs), environments in which every location is connected to other locations in a regular grid.

Toroidal Square Lattice
For instance, every location in a toroidal square lattice has a neighboring location up, down, to the left, and to the right. As mentioned, the most common representation of the environment in NetLogo is patches, which are located on a 2D lattice underlying the world of the ABM (see figure 5.9 for a colorful pattern of patches whose code is simply ASK PATCHES [ SET PCOLOR PXCOR * PYCOR ]). This uniform connectivity makes lattices different from network-based environments. The two most common types of lattices are:
 square lattices, and
 hexagonal lattices.

Square Lattices
The square lattice is the most common type of ABM environment. A square lattice is composed of many little squares, akin to the grid paper used in mathematics classrooms. There are two classical types of neighborhoods on a square lattice:
 the von Neumann neighborhood, consisting of the four neighbors located in the cardinal directions (see figure 5.10a); and
 the Moore neighborhood, comprising the eight adjacent cells (see figure 5.10b).

Von Neumann Neighborhood
A von Neumann neighborhood (named after John von Neumann, a pioneer of cellular automata theory, among other things) of radius 1 is a lattice in which each cell has four neighbors: up, down, left, and right.

Moore Neighborhood
A Moore neighborhood (named after Edward F. Moore, another pioneer of cellular automata theory) of radius 1 is a lattice in which each cell has eight neighbors in the eight directions that touch either a side or a corner: up, down, left, right, up-left, up-right, down-left, and down-right. In general, a Moore neighborhood gives a better approximation to movement in a plane, and since many ABMs model phenomena in which planar movement is common, it is often the preferred modeling choice for discrete motion (see the sketch below).
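In NetLogo, these two neighborhood types correspond directly to the built-in patch reporters NEIGHBORS4 (von Neumann) and NEIGHBORS (Moore). A minimal sketch that colors both neighborhoods of the center patch (the procedure name and coloring scheme are illustrative, not from the handout):

to show-neighborhoods
  clear-all
  ask patch 0 0 [
    set pcolor white
    ask neighbors  [ set pcolor yellow ]  ;; Moore: all 8 surrounding patches
    ask neighbors4 [ set pcolor red ]     ;; von Neumann: the 4 cardinal ones
  ]                                       ;; (painted last, overriding yellow)
end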
Hex Lattices
A hex lattice has some advantages over square lattices. The center of a cell in a square grid is farther from the centers of some adjacent cells (the diagonally adjacent ones) than from others. In a hex lattice, by contrast, the distance between the center of a cell and the centers of all adjacent cells is the same. Moreover, hexagons are the polygons with the most edges that still tile the plane, and for some applications this makes them the best polygons to use. Both of these differences (equidistance between centers and the number of edges) mean that hex lattices more closely approximate a continuous plane than square lattices do. But because square lattices match a Cartesian coordinate system more closely, a square lattice is a simpler structure to work with; even when a hexagonal lattice would be superior, many ABMs and ABM toolkits nevertheless employ square lattices. However, with a little effort, any modern ABM environment can simulate a hexagonal lattice within a square lattice environment. For instance, you can see hex lattice environments in the NetLogo Code Examples: the Hex Cells example and the Hex Turtles example (see figures 5.11 and 5.12). In the Hex Cells example, each patch in the world maintains a set of six neighbors, with the agents located on each patch having a hexagonal shape. In other words, the world is still rectangular, but we have defined a new set of neighbors for each patch. In the Hex Turtles example, the turtles have an arrow shape and start with headings that are evenly divisible by 60 degrees; when they move, they move only along these same 60-degree angles.

Continuous Spaces
 In a continuous space, there is no notion of a cell or a discrete region within the space. Instead, agents are located at points in the space. These points can be as small as the resolution of the number representation allows.
 In a continuous space, agents can move smoothly through the space, passing over any points between their origin and destination, whereas in a discrete space an agent moves directly from the center of one cell to another.
 Because computers are discrete machines, it is impossible to represent a continuous space exactly. We can, however, represent it at a very fine level of resolution.
 In other words, all ABMs that use continuous spaces are actually using a very detailed discrete space, but the resolution is usually high enough to suffice for most purposes.

Boundary Conditions
One other factor that comes into play when working with spatial environments is how to deal with boundaries, an issue for hex and square lattices as much as for continuous spaces. If an agent reaches a border on the far left side of the world and wants to go farther left, what happens?

Three Standard Approaches
There are three standard approaches to this question, referred to as topologies of the environment: (1) the agent reappears on the far right side of the lattice (toroidal topology); (2) it cannot go any farther left (bounded topology); or (3) it can keep going left forever (infinite-plane topology).

Toroidal Topology
A toroidal topology is one in which all of the edges are connected to another edge in a regular manner. In a rectangular lattice, the left side of the world is connected to the right side, and the top of the world is connected to the bottom. Thus, when agents move off the world in one direction, they reappear on the opposite side. This is sometimes called wrapping, because the agents wrap off one side of the world and onto another. In general, using a toroidal topology means that the modeler can ignore boundary conditions, which usually makes model development easier. If the world is nontoroidal, then the modeler has to develop special rules for what to do when an agent encounters a boundary; in a spatial model, this decision is often one of whether the agent should turn around or simply take a step backward. Indeed, the most commonly used environment for an ABM is a square toroidal lattice.
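A minimal sketch of wrapping behavior (assuming the default NetLogo world settings, in which both horizontal and vertical wrapping are enabled; the procedure name is illustrative):

to demo-wrapping
  clear-all
  create-turtles 1 [
    setxy max-pxcor 0   ;; start at the right edge of the world
    set heading 90      ;; face right
    forward 1           ;; step past the edge...
    show xcor           ;; ...and reappear at min-pxcor: the turtle has wrapped
  ]
end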
Bounded Topology
A bounded topology is one in which agents are not allowed to move beyond the edges of the world. This topology is a more realistic representation of some environments. For example, if you are modeling agricultural practices, it is unrealistic for a farmer to keep driving the tractor east and end up back on the west side of the field. Using a torus environment may affect the amount of fuel required to plow the fields, so a bounded topology might be a better choice, depending on the questions the model is trying to address. It is also possible for some of the limits of the world to be bounded while others wrap. For instance, in the Traffic Basic model, where the cars drive only from left to right, the top and bottom of the world are bounded.

Infinite-Plane Topology
Finally, an infinite-plane topology is one with no bounds at all: agents can keep moving in any direction forever. In practice, this is done by starting with a smaller world and expanding the world whenever agents move beyond its edges. Infinite planes can be useful if the agents truly need to move in a much larger world. While some ABM toolkits provide built-in support for infinite-plane topologies, NetLogo does not. However, it is possible to work around this limitation by giving each turtle a separate pair of x and y coordinates alongside the built-in ones. Then, when an agent moves off the side of the world, we can hide the turtle and keep updating this additional set of coordinates until the agent moves back onto the view. In most cases, however, a toroidal or bounded topology will be the more appropriate (and simpler) choice to implement.

Lesson 132: Network-Based Environments

Link
A link is defined by the two ends it connects, which are frequently referred to as nodes. We will use the network/node/link vocabulary throughout this module. In NetLogo, links are their own agent type.

Lattice Networks
In fact, lattice graphs can also be viewed as lattice networks, with the property that each position in the network looks exactly like every other position. However, ABM environments usually do not implement lattice environments as networks, for both conceptual and efficiency reasons. Additionally, using patches as the default topology allows for either discrete or continuous representations of the space, whereas a network is always discrete.

Random Networks
Network-based environments have been found useful in studying a wide variety of phenomena, such as the spread of disease or rumors, the formation of social groups, the structure of organizations, and even the structure of proteins. Several network topologies are commonly used in ABM. Besides the regular networks described earlier, the three most common are random, scale-free, and small-world. In random networks, each individual is randomly connected to other individuals. These networks are created by randomly adding links between agents in the system. For example, if you had a model of agents moving around a large room and connected every agent in the room to another agent based on which agent had the next largest last two digits of their social security number, you would probably create a random network. We show one simple method for creating a random network below; similar code appears in the Random Network model in the chapter 5 subfolder of the IABM Textbook folder of the NetLogo models library.
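A minimal NetLogo sketch of one such method (a reconstruction rather than the textbook's verbatim code; NUM-NODES and WIRING-PROBABILITY are assumed interface sliders):

to setup-random-network
  clear-all
  create-turtles num-nodes [
    setxy random-xcor random-ycor   ;; scatter the nodes across the world
  ]
  ;; consider each pair of turtles exactly once,
  ;; and link the pair with a fixed probability
  ask turtles [
    ask turtles with [who > [who] of myself] [
      if random-float 1.0 < wiring-probability [
        create-link-with myself
      ]
    ]
  ]
end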
Scale-Free Networks
Scale-free networks have the property that any subnetwork of the global network has the same properties as the global network. A common way to create this type of network is to add new nodes and links to the system such that existing nodes with a large number of links are more likely to receive new connections (Barabási, 2002). This technique is sometimes called preferential attachment, since nodes with more connections are attached to preferentially. This method of network creation tends to produce networks with central nodes that have many radiating links; because of the resemblance to a bicycle wheel, this structure is sometimes also called hub-and-spoke. Many real-world networks, such as the Internet, electricity grids, and airline routes, have properties similar to a scale-free network.

Small-World Networks
The final standard network topology is the small-world network. Small-world networks are made up of dense clusters of highly interconnected nodes joined by a few long-distance links between them. Because of these long-distance links, it does not take many links for information to travel between any two random nodes in the network. Small-world networks are sometimes created by starting with regular networks, like the 2D lattices described before, and then randomly rewiring some of the connections to create large jumps between occasional agents.

Ways to Characterize Networks
There are many ways to characterize networks. Two commonly used measures are:
 average path length, and
 clustering coefficient.

Average Path Length
The average path length is the average of all the pairwise distances between nodes in a network. In other words, we measure the distance between every pair of nodes in the network and then average the results. Average path length characterizes how far the nodes are from each other in a network.

Clustering Coefficient of a Network
The clustering coefficient of a network is the average fraction of a node's immediate neighbors that are also neighbors of the node's other neighbors. In other words, it is a measure of the fraction of my friends who are also friends with each other. In networks with a high average clustering coefficient, any two neighboring nodes tend to share many of their neighbors in common, while in networks with a low average clustering coefficient there is generally little overlap between the neighborhoods of neighboring nodes. NetLogo also includes a special extension, the network (nw) extension, for creating, analyzing, and working with networks; a short sketch of both measures follows.
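A minimal sketch computing both measures with the nw extension (assuming the extension's nw:set-context, nw:mean-path-length, and nw:clustering-coefficient primitives, as named in current NetLogo distributions):

extensions [ nw ]

to report-network-measures
  ;; tell the nw extension which agents form the network
  nw:set-context turtles links
  ;; average of all pairwise distances (reports false if disconnected)
  show nw:mean-path-length
  ;; average, over all nodes, of each node's local clustering coefficient
  show mean [ nw:clustering-coefficient ] of turtles
end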
Lesson 133: Special Environments

The two methods we have demonstrated for defining the environment, two-dimensional (2D) grid-based (i.e., lattice) and network-based, are both instances of "interaction topologies." An interaction topology describes the paths along which agents can communicate and interact in a model. Besides the two interaction topologies we have looked at so far, there are several other standard topologies to consider. Two of the most interesting involve the use of 3D worlds and Geographic Information Systems (GIS). 3D worlds allow agents to move in a third dimension in addition to the two dimensions of traditional ABMs. GIS formats enable the importation of layers of real-world geographical data into ABMs. We will discuss each of these in turn.

3D Worlds
3D environments enable model developers to explore complex systems that are irreducibly bound up with a third dimension, while sometimes also increasing the apparent physical realism of their models. There is a version of NetLogo called NetLogo 3D (a separate application in the NetLogo folder) that allows modelers to explore ABMs in three dimensions. Many implementations of classic ABMs have been developed for this environment in the NetLogo 3D models library (Wilensky, 2000). For instance, there is a three-dimensional version of the Percolation model (see figure 5.16), as well as a three-dimensional version of flocking, the Flocking 3D model.

Working with 3D Worlds
Working with 3D worlds is not much different from working with 2D worlds, although a 3D world does require additional data and commands. For instance, agents now have a z-coordinate, and new commands are needed to manipulate this new degree of freedom. In 3D, the orientation of an agent can no longer be described by its heading alone; we must also use pitch and roll. If you think of an agent as an airplane, the pitch of the agent is how far from horizontal the nose of the airplane is pointing: if the nose points straight up, the pitch is 90 degrees (see figure 5.17). Continuing the airplane metaphor, the roll of an agent is how far the wings are from horizontal: if the wings point up and down, the roll is 90 degrees (see figure 5.18). In many 3D systems, heading goes by a different name, yaw, but NetLogo uses HEADING in both 2D and 3D to keep things consistent.

To see the difference between writing 3D models and 2D models, consider the PERCOLATE procedure in each version of the Percolation model. In the 2D version, the code has each patch look at the patches that are below, left, and right of it. In 3D Percolation (Wilensky & Rand, 2006), the only difference is that instead of percolating a line of patches at a time, we percolate a square of patches. As a result, each patch must ask four patches below it (in a cross shape) to percolate, not just two as in the 2D version.

GIS-Based Environments
Geographic Information Systems (GIS) are environments that record large amounts of data related to physical locations in the world. GISs are widely used by environmental scientists, urban planners, park managers, transportation engineers, and many others, and they help to organize data and make decisions about large areas of land. Using GIS, we can index all the information about a particular subject or phenomenon by its location in the physical world. Moreover, GIS researchers have developed analysis tools that let them quickly examine the patterns in this data and its spatial distribution.

GIS Tools and Techniques
GIS tools and techniques thus allow for a more in-depth exploration of the pattern of a complex system. Agents moving on a GIS terrain may be constrained to interact on that terrain; GIS can therefore serve as an interaction topology. Where does ABM fit into this picture? GIS can provide an environment for an ABM to operate in. Since an ABM encompasses a rich model of the process of a complex system, it is a natural match for GIS, which has a rich model of pattern.
By allowing an ABM to examine and manipulate GIS data, we can build models with a richer description of the complex system we are examining. GIS thus enables modelers to construct more realistic and elaborate models of complex phenomena.

Lesson 134: Interactions

Now that we have discussed both agents and the environments in which they exist, we can look at how agents and environments interact. There are five basic classes of interactions in ABMs:
 agent-self
 environment-self
 agent-agent
 environment-environment
 agent-environment
We will discuss each in turn, along with examples of these common interactions (hedged sketches of several of them appear below).

Agent-Self Interactions
Agents do not always need to interact with other agents or the environment; in fact, a lot of agent interaction happens within the agent. Most of the examples of advanced cognition discussed in the Agent section involve the agent interacting with itself: the agent considers its current state and decides what to do. One classic type of agent-self interaction is birth. Birth events are typical in ABMs: one agent creates another agent. In a birth routine, the agent considers its own state and, based on this state, decides whether or not to give birth to a new agent; it then manipulates its state, lowering its energy and creating the new agent. This is the typical way of having agents reproduce.

Environment-Self Interactions
Environment-self interactions occur when areas of the environment alter or change themselves, for instance by changing their internal state variables as a result of calculations. The classic example of an environment-self interaction is grass regrowth: each patch examines its own state and increments the amount of grass it has, but if it has too much grass it is set back to the maximum value it can contain.

Agent-Agent Interactions
Interactions between two or more agents are usually the most important type of action within agent-based models. We saw a canonical example of agent-agent interaction in the Wolf Sheep Predation model when the wolves consume the sheep: one agent consumes another agent and takes its resources, and the wolf always eats the sheep. However, it is also possible to add competition or flight to this model, where the wolf gets only a chance of eating the sheep and the sheep gets a chance to flee. Competition is another example of agent-agent interaction.

Communication
A final example of agent-agent interaction is communication. Agents can share information about their own state as well as the state of the world around them. This type of interaction allows agents to gain information to which they might not have direct access.

Environment-Environment Interactions
Interactions between different parts of the environment are probably the least commonly used type of interaction in agent-based models. However, there are some common uses of environment-environment interactions; one of them is diffusion. In the Ants model discussed in chapter 1, the ants place a pheromone in the environment, which is then diffused throughout the world via an environment-environment interaction invoked in the main GO procedure.
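Before turning to agent-environment interactions, here are hedged sketches of three of the interaction types above, in the spirit of the textbook models but not their verbatim code (energy, birth-threshold, grass, max-grass, chemical, and diffusion-rate are assumed variables or sliders):

turtles-own [ energy ]
patches-own [ grass chemical ]
globals [ birth-threshold max-grass diffusion-rate ]  ;; sliders in a real model

;; agent-self: a birth routine; the agent inspects its own state,
;; pays an energy cost, and creates a new agent
to reproduce   ;; turtle procedure
  if energy > birth-threshold [
    set energy energy / 2
    hatch 1 [ rt random 360 fd 1 ]
  ]
end

;; environment-self: grass regrowth, capped at a maximum
to regrow-grass
  ask patches [
    set grass grass + 1
    if grass > max-grass [ set grass max-grass ]
  ]
end

;; environment-environment: pheromone diffusion, called from GO as in the Ants model
to diffuse-pheromone
  diffuse chemical (diffusion-rate / 100)
end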
Agent-Environment Interactions
Agent-environment interactions occur when an agent manipulates or examines part of the world in which it exists, or when the environment in some way alters or observes the agent. A common type of agent-environment interaction involves agents observing the environment. The Ants model demonstrates this kind of interaction when the ants examine the environment to look for food and to sense pheromone. In the Ants model, the patches contain food and chemical, so the code first checks whether there is any food on the current patch. If there is, the ant picks up the food, turns around to head back to the nest, and the procedure stops. Otherwise, the ant checks whether there is chemical nearby and, if so, turns to follow the chemical.

Agent Movement
Another common type of agent-environment interaction is agent movement. In some ways, movement is simply an agent-self interaction, since it alters only the current agent's state. In the Ants model, the ants move around by "wiggling": turning a small random amount to the left and right and stepping forward.

Lesson 135: Observer/User Interface

Now that we have talked about the agents, the environments, and the interactions that occur between agents and environmental attributes, we can discuss who controls the running of the model. The observer is a high-level agent responsible for ensuring that the model runs and proceeds according to the steps developed by the model author. The observer issues commands to agents and the environment, telling them to manipulate their data or to take certain actions. Most of the control that model developers have over an ABM is mediated through the observer. The observer is a special agent, however: it does not have many properties, though it can access global properties just as any agent or patch can.

Properties Specific to the Observer
The only properties that one could consider specific to the observer are those relating to the perspective from which the modeled world is viewed. For instance, in NetLogo the view may be centered on a specific agent, or focus a highlight on a certain agent, using the FOLLOW, WATCH, or RIDE commands (see figure 5.21).

User Input and Model Output
We have already made use of many of the standard ways to interface with an ABM when we extended models in chapter 3 and built our first model in chapter 4, but it is worth recapping some of them here. ABMs require a control interface or a parameter group that allows the user to set up different parameters and settings for the ABM. The most common control mechanism is a button, which executes one or more commands within the model; a forever button continues to execute its commands until the user presses the button again.

Command Center
A second way that the user can request actions within the ABM is via the command center (along with the mini command centers within agent monitors). The command center is a very useful feature of NetLogo, as it allows the user to interactively test commands and to manipulate agents and the environment.

Sliders, Switches, and Input Boxes
Sliders enable the model user to select a particular value from a range of numerical values; for instance, a slider could range from 0 to 50 by increments of 0.1, or from 1 to 1,000 by increments of 1. In the Code tab, the value of a slider is accessed as if it were a global variable. Switches enable the user to turn various elements of a model on or off; in the Code tab they are also accessed as global variables, but they hold Boolean values. Choosers enable a model user to select a choice from a predefined drop-down menu that the modeler has created.
These, too, are accessed as global variables in the Code tab, but their values are strings corresponding to the various choices in the chooser. Finally, input boxes are more free-form, allowing the user to enter text that the model can use.

Monitors, Plots, and Notes
As for output controls, monitors display the value of a global variable or calculation, updated several times a second. They have no history, but they show the user the current state of the system. Plots provide traditional 2D graphs that let the user observe the change of an output variable over time. Output boxes enable the modeler to create free-form text output to send to the user. Finally, notes enable the modeler to place text information on the Interface tab (for example, to give a model user directions on how to use the model). Unlike monitors, the text in notes is unchanging (unless you manually edit it).

Visualization
Visualization is the part of model design concerned with how to present the data contained in the model in a visual way. Creating cognitively efficient and aesthetic visualizations can make it much easier for model authors and users to understand the model. Though there is a long history of work on how to present data in static images (Bertin, 1967; Tufte, 1983, 1996), there is much less work on how to represent data in real-time dynamic situations. However, attempts have been made to take current static guidelines and apply them to dynamic visualizations (Kornhauser, Rand & Wilensky, 2007). In general, three guidelines should be kept in mind when designing the visualization of an ABM: simplify, explain, and emphasize.

Guidelines for Designing Visualization

Simplify the visualization
Make the visualization as simple as possible, so that anything that does not present additional usable information (or that is irrelevant to the current point being explained) is eliminated. This prevents the model user from being distracted by unnecessary "graph clutter" (Tufte, 1983).

Explain the components
If an aspect of the visualization is not immediately obvious, there should be some quick way to determine what it is illustrating, such as a legend or description. Without clear and direct descriptions of what is going on, the model user may misinterpret what the model author is attempting to portray. If a model is to be useful, anyone viewing it must be able to understand easily what it is saying.

Emphasize the main point
Model visualizations are themselves simplifications of all the possible data that a model could present to the model user. A model visualization should therefore emphasize the main points and interactions that the model author wants to explore and, in turn, communicate to end users. By exaggerating certain aspects of the visualization, the author can draw attention to these key results.

NetLogo Ethnocentrism Model
Every NetLogo agent has a shape and a color, and by selecting appropriate shapes and colors we can highlight some agents while backgrounding others. For instance, to simplify the visualization, if the agents in our model are truly homogeneous, like the ants, we might make all the agents the same color, as in the Ants model. To explain the visualization, we might change their shape to indicate something about their properties.
For instance, in the Ethnocentrism model, based on Hammond and Axelrod's model (2003), agents employing the same strategy have the same shape, even if they are different colors.

Batch vs. Interactive
When you open a blank NetLogo model, it starts with a command center showing the "observer>" prompt. This window allows you to manipulate NetLogo interactively. If you open a model, you still have to press SETUP and GO to make the model work. Moreover, with most models, even while they are running, you can manipulate sliders and settings to see how the new parameters affect the performance of the model in the middle of a run. This kind of spur-of-the-moment control is called interactive running, because the user can interact with the model as it runs. In contrast, another type of user running of a model is called batch running. With batch running, instead of controlling the model directly, the user writes a script to run the model many times, usually with different seeds for the pseudo-random number generator and with different parameter sets.

Lesson 136: Schedule

The schedule is a description of the order in which the model operates. Different ABM toolkits have more or less explicit representations of the schedule. In NetLogo, there is no single identifiable object that can be called "the schedule." Rather, the schedule is the order of events that occur within the model, which depends on the sequence of buttons the user pushes and the code/procedures those buttons run. We will first discuss the common SETUP/GO idiom employed in almost all agent-based models, and then move on to some of the subtler issues concerning scheduling in ABMs.

SETUP and GO
First, there is usually an initialization procedure that creates the agents, initializes the environment, and readies the user interface. In NetLogo this procedure is usually called SETUP, and it executes whenever a user presses the SETUP button on a NetLogo model. The SETUP routine usually starts by clearing away all the agents and data related to the previous run of the model; it then examines how the user has manipulated the various variables controlled by the user interface, creating new agents and data to reflect the new run. The other main part of the schedule is what is often called the main loop, or in NetLogo, the GO procedure. The GO procedure describes what happens in one time unit (or tick) of the model: usually the agents are told what to do, the environment changes if necessary, and the user interface updates to reflect what has happened. Traffic Basic follows exactly this SETUP/GO idiom; a skeleton in its spirit is sketched below.
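A hedged skeleton of the SETUP/GO idiom (not Traffic Basic's verbatim code; number-of-cars is an assumed slider):

to setup
  clear-all                          ;; discard the previous run
  create-turtles number-of-cars [
    set shape "car"
    setxy random-xcor 0              ;; place the cars along the road
    set heading 90                   ;; all cars drive to the right
  ]
  reset-ticks
end

to go
  ask turtles [
    fd 0.1                           ;; each car moves; a fuller model
  ]                                  ;; would adjust speed to the car ahead
  tick                               ;; advance the clock and update the interface
end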
Asynchronous vs. Synchronous Updates
If a model uses an asynchronous update schedule, then when agents change their state, the new state is seen immediately by other agents. In a synchronous update schedule, changes made to an agent are not seen by other agents until the next clock tick; that is, all agents update simultaneously.

Sequential vs. Parallel Actions
Within the realm of asynchronous updating, agents can act either sequentially or in parallel. Sequential action means only one agent acts at a time, while parallel actions are those in which all agents act independently. In NetLogo (versions 4.0 and later), sequential action is the standard behavior for agents. For agents to act truly in parallel, you would need parallel hardware, so that the actions of each agent would be carried out by a separate processor. However, there is an intermediate solution: simulated concurrency uses one processor to simulate many agents acting in parallel.

Lesson 137: Types of Measurements

Heretofore, we have examined ABMs, modified them, built them from scratch, and analyzed their behavior. In this module, we will learn to employ ABM to produce new and interesting results about the domain we are investigating. What kinds of results can ABMs produce? There are many different ways of examining and analyzing ABM data. Choosing just one of these techniques can be limiting; it is therefore important to know the advantages and disadvantages of a variety of tools and techniques. It is often useful to consider your analysis methods before building the ABM, so that you can design output that is conducive to your analysis.

Modeling the Spread of Disease
If someone catches a cold and is coughing up a storm, he might infect others. Those he comes into contact with, such as friends, co-workers, and even strangers, may catch the cold. If a cold virus infects someone, that person might spread the disease to five other people (six now infected) before recovering. In turn, those five people might each spread the cold to five more people (thirty-one now infected), and those twenty-five people might each spread the cold to five additional people (a hundred and fifty-six now infected). In fact, the rate of infection initially rises exponentially. However, since the infection count grows so quickly, any population will eventually reach the limit of the number of people who can be infected. For instance, imagine that the 156 people mentioned above all work for the same company of 200 individuals. It is impossible for each newly infected person to infect five new people, since only 44 uninfected individuals remain, and thus the number of infected people tails off because there is no one left to infect. As we have described it so far, this simple model assumes that each person infects the same number of people, which is manifestly not the case in real contexts. As a person moves through their workspace, they might happen to see few people in one day, whereas another individual might see many. Also, our initial description assumes that if one person infects five people and another person infects five, there will be no overlap; in reality, there is likely to be substantial overlap. Thus, the spread of disease in a workplace is not as straightforward as our initial description suggests.

Suppose we are interested in understanding the spread of disease and want to build an ABM of such a spread. How should we go about it? First, we need agents that keep track of whether they are infected with a cold or not. Additionally, these agents need a location in space and the ability to move. Finally, we need the ability to initialize the model by infecting a group of individuals. That is exactly how the NetLogo model we will discuss in this module behaves (see figure 6.1): individuals move around randomly on a landscape and infect other individuals whenever they come into contact with them (a minimal sketch of such a model appears at the end of this lesson's notes). Though this model is simple, it exhibits interesting and complex behavior. For instance, what happens if we increase the number of people in the model?
Does the disease spread more quickly through the population, or does it take longer because there are more people? Let us run the model at population sizes of 50, 100, 150, and 200 and examine the results. We will keep the size of the world constant, so that as we increase the number of individuals we also increase the population density. Along the way we will write down the time at which the entire population becomes infected (see table 6.1). Based on these results, we conclude that as the population density increases, the time to full infection decreases dramatically.

Suppose we show this data to a friend of ours and she does not believe it. She believes that the time to 100 percent infection should grow linearly with the population. She looks over the code and determines that it seems to match the description (a process called verification, which we will discuss in the next chapter). After that, she runs the model and collects the same data we did (a form of replication). Her data is in table 6.2. These results do not support our friend's prediction that the time to 100 percent infection grows as the population increases, but, on the other hand, they are quite different from the results we originally collected. In fact, the time to 100 percent infection for populations of 150 and 200 increases in our friend's data, seemingly contradicting our original results. Moreover, if we run the model several more times, we might again get different results. We need some methods for determining whether there are trends in the data.

The data is inconsistent because most ABMs employ randomness in their algorithms; that is, the code makes use of a random number generator. How the agents move around the landscape is not specifically determined, but is instead the result of several calls to the random number generator at each time step that determine the actions of each particular agent. Moreover, these random decisions occur at least once per agent per time step. Clearly, then, one set of runs is not enough to characterize the behavior of this model. Suppose, then, that we collect additional data for a set of population densities across ten different model runs, as in table 6.3. Though most of these runs look more like our original results than our friend's, it can be difficult to see clear trends and to analyze the results overall. Thus, to describe these patterns of behavior, it makes sense to turn to some statistics.
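A minimal sketch of the kind of model described above (assumed names throughout; population is an assumed slider, and the textbook's Spread of Disease model is more elaborate):

turtles-own [ infected? ]

to setup
  clear-all
  create-turtles population [
    setxy random-xcor random-ycor
    set infected? false
  ]
  ask n-of 5 turtles [ set infected? true ]   ;; seed the infection
  reset-ticks
end

to go
  if all? turtles [ infected? ] [ stop ]      ;; ticks at stop = time to full infection
  ask turtles [
    rt random 40                              ;; wiggle...
    lt random 40
    fd 1                                      ;; ...and move
    if any? other turtles-here with [ infected? ] [
      set infected? true                      ;; contact on a shared patch infects
    ]
  ]
  tick
end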
Lesson 138: Statistical Analysis of ABM

Here, we will look at the different types of measurements in an ABM.
 Statistical analysis of an ABM means moving beyond raw data.
 Statistical results are the most common way of looking at any kind of scientific data.
 The general methodology behind descriptive statistics is to provide numerical measures that summarize a large data set.
 These measures describe the data set in such a way that it is not necessary to examine every single value.
 We have learned about the types of measurements used in the analysis of an ABM.
 Credits: Uri Wilensky book.

Lesson 139: Modeling the Spread of Disease in Detail

Here we will model the spread of disease in detail.
 If a cold virus infects someone, that person might spread the disease to five other people (six now infected) before they recover.
 In fact, the rate of infection initially rises exponentially.
 Suppose we are interested in understanding the spread of disease and want to build an ABM of such a spread. How should we go about doing it?
 First, we need agents that keep track of whether they are infected with a cold or not.
 We need the ability to initialize the model by infecting a group of individuals.
 Individuals move around randomly on a landscape and infect other individuals whenever they come into contact with them.
 We conclude that as the population density increases, the time to full infection decreases dramatically.
 In the beginning, when the first person becomes infected, if there are not many other people around, the person has no one to infect, and thus the infection count increases slowly.
 However, if there are many people around, there will be plenty of infection opportunities.
 Moreover, at the end of the run, when there are only one or two uninfected agents, they are more likely to run into someone with an infection if the population count is high.
 This is true despite the fact that the total number of people that need to be infected increases.
 To describe these patterns of behavior, it makes sense to turn to some statistics.
 Credits: Uri Wilensky.

Lesson 140: Descriptive Statistics

Here we will learn about the general methodology behind descriptive statistics: providing numerical measures that summarize a large data set and describe it in such a way that it is not necessary to examine every single value.
 Suppose we are interested in determining whether a coin is fair (i.e., as likely to turn up heads when flipped as tails). We can conduct a series of experiments where we flip the coin and observe the results.
 It is much easier to look at means and standard deviations than to examine a long series of raw data (e.g., for the sequence HHHHTTHTTT, the observed probability of heads is 0.5, and the expected outcome for ten trials of a fair coin is five heads, with a standard deviation of 1.58).
 To apply this technique to our Spread of Disease model in more depth, we can create summary statistics.
 We see that the mean time to 100 percent infection declines as the population density increases.
 Another interesting result is that as the population density goes up, the standard deviation goes down. This means the data is less varied.
 These results seem to confirm our original hypothesis that as population density increases, the mean time to full infection declines.
 Within ABM, statistical analysis is a common method of confirming or rejecting hypotheses.
 ABMs create large amounts of data (the Spread of Disease model is just a small example), and if we can summarize that data, we can examine large amounts of output efficiently.
 Most ABM toolkits give you the basic ability to carry out simple statistical analysis within the package itself (e.g., in NetLogo there are MEAN and STANDARD-DEVIATION primitives; see the sketch below). Thus, while the model is running, the ABM itself can generate summary statistics.
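A minimal sketch of in-model summary statistics using NetLogo's MEAN and STANDARD-DEVIATION primitives (run-results is an assumed global list holding the times to full infection collected over repeated runs):

globals [ run-results ]   ;; times to 100 percent infection, one per run

to summarize-runs
  ;; e.g., run-results might hold [ 150 143 161 139 155 ]
  show mean run-results
  show standard-deviation run-results
end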
 The NetLogo R extension can be used to conduct analyses with the R statistical package.
 Credits: Uri Wilensky.

Lesson 141: Batch Experiments with BehaviorSpace

 When you are trying to collect statistical results from an ABM, you should run the model multiple times and collect different results at different points.
 Here we will see how ABM toolkits provide a way to collect the data from these runs automatically.
 In NetLogo there is a tool for this called BehaviorSpace. ABM toolkits are often full-featured programming languages, allowing you to write your own tools for creating experiments that produce the data sets you want to analyze.
 These tools automatically run the model multiple times with different settings and collect the results in an easy-to-use format, such as the CSV files mentioned earlier.
 For the Spread of Disease model, we might call the experiment "population density".
 The SETUP and GO fields of a BehaviorSpace experiment allow you to specify any additional NetLogo code needed to make the model start and go.
 "Stop condition" allows you to specify a special stop condition for each run, and "Final commands" allows you to insert any commands you want executed between runs of the model.
 In general, in ABM it is important to carry out multiple runs of your experiments so that you can determine whether a result is truly a pattern or just a one-time occurrence.
 One common way is to start by manually running your model multiple times, but to get a better sense of the results it is usually much easier to use a batch experiment tool.
 We have illustrated the BehaviorSpace tool, the batch experiment tool for NetLogo.
 Credits: Uri Wilensky.

Lesson 142: Graphs and Plots

In NetLogo, creating simple graphs is easy and will often suffice for simple data analysis. With complex data sets, however, designing a useful and immediately informative graph can be challenging, and it is the subject of an extensive body of literature.
 With a suitable graph we can quickly see how the data is distributed and how it changes with population density.
 Such a figure might not be easier to understand than a small table, but if there were one hundred data points rather than ten, a figure like this could be very helpful.
 Many ABM toolkits include capabilities for continually updating graphs and charts while a model runs, enabling you to see the progress of the model over time.
 For instance, in the Spread of Disease model there is a graph that illustrates the change in the fraction of infected agents as time proceeds (see the sketch below).
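A minimal sketch of such a time-series plot, assuming a plot widget named "Infection" on the Interface tab and the infected? turtle variable from the earlier sketch:

to update-infection-plot
  set-current-plot "Infection"
  plot count turtles with [ infected? ] / count turtles
end

Calling this once per tick from GO appends one point per time step to the plot.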
 This is one example of how we can use time series to help understand the behavior of a model.
Summary: We have seen the use of graphs (plots) in simulation software.
 Credits: Uri Wilensky.

Lesson 143: Networks within ABM

Here, we consider the analysis of networks within ABM.
 Interactions need not occur in physical space; they can also occur across social networks. E.g., some diseases spread only through certain kinds of social networks. If we set the chooser to "network", we can explore this further.
 The property that determines how many connections each individual has to other individuals in the network is known as "connections-per-node".
 A well-known property of random graphs is that the average number of individuals infected grows substantially once connections-per-node exceeds 1.0, at which point a giant component forms in the network.
 The average path length, which measures the distance between any two nodes in the network, affects the spread of disease in the network.
 Another relevant property is the clustering coefficient.
 Various properties and tools exist to measure and analyze the wide variety of metrics associated with social network analysis (SNA).
 To summarize, each of these network properties can be analyzed for its effect on the spread of disease. Reports and toolkits like UCINet can be used for further examination.
 Credits: Uri Wilensky book.

Lesson 144: Agent-to-Environment and Environment-to-Agent Transmission

In this section we look at agent-to-environment and environment-to-agent transmission.
 The key areas to understand are:
o the Spread of Disease model as a testbed, and
o the effect of environmental interaction.
 The path below any infected agent becomes yellow.
 We can change the rate of DISEASE-DECAY from 0 to 10, in steps of a single time interval.
 A long DISEASE-DECAY might make a negligible difference compared with a small DISEASE-DECAY.
 Investigating with BehaviorSpace, we notice that a powerful aspect of ABMs is that they also show us the pattern of infection, leaving long, stringy patterns of environmental infection.
 To summarize, we have seen how environmental data and ABMs work together.
 Credits: Uri Wilensky book.

Lesson 145: Correctness of a Model

Here, we consider the "correctness of a model": output relating to the issue of concern must be accurate.
 Model accuracy is evaluated through verification, validation, and replication.
 Model validation: determining whether the implemented model corresponds to, and explains, some phenomenon in the real world.
 Model verification: determining whether the implemented model corresponds to the target conceptual model, i.e., making sure the model has been implemented correctly.
 Model replication: the implementation, by one researcher or group of researchers, of a conceptual model previously implemented by someone else.
 A single set of results from a model that corresponds to the real world is not sufficient; multiple runs are often needed to confirm that a model is accurate.
 Verification, validation, and replication collectively underpin the correctness, and thus the utility, of a model.
 Credits: Uri Wilensky book.

Lesson 146: Verification

Here, we consider the role of verification in the correctness of a model.
 In larger models, the code can be difficult to understand as it evolves over time.
 Verification ensures the elimination of "bugs" from the code.
 The process of debugging becomes more difficult for complex models; our goal, however, is to keep the process easy and simple.
 Build the model simply to begin with; it will be easier to verify. Expand the complexity of the model as necessary. This incremental approach makes it easy to verify the additional components.
 Even if all the components are verified, it is still possible that the system as a whole is not: complications may arise from the interactions between model components.
 In this section we have examined the issue of verification in the context of a simple ABM.
 Credits: Uri Wilensky book.

Lesson 147: Communication

Here, we consider the need for communication in ensuring the correctness of a model.
 Sometimes a team of people builds a model, and team members other than the model author actually implement it. Verification then becomes critical.
 Communication is essential to ensure that the implemented model is correct.
 E.g., in the voting model, political scientists differentiate between Moore and von Neumann neighborhoods, between small-world networks and lattices, and between hexagonal and rectangular grids.
 In an ideal situation, the model author and the implementer are the same person, which averts this sort of communication error; when they are not, there is often room for human error and misunderstanding.
 In the past it was difficult to be both an expert model implementer and a model author; however, low-threshold ABM languages such as NetLogo are narrowing the gap between author and implementer.
 We have seen how communication plays an important role in the correctness of a model, and how it can fill the gap between the implementer and the author of the model.
 Credits: Uri Wilensky book.

Lesson 148: Describing Conceptual Models

Here, we consider how describing conceptual models plays a role in analyzing the correctness of a model.
 Implementing the voting model, we may realize only later that we never understood the idea completely. This could surface, for example, when we talk with a political scientist.
 A description of how we plan to implement the model is a specific type of document.
 We and the political scientist should have the same "conceptual model" in mind.
 We can describe the model in more formal terms, including a pictorial description of the model using flowcharts.
 Subsequently, we can convert the flowcharts into pseudo-code. The goal of pseudo-code is to serve as a midway point between natural language and a programming language.
 Other methods include UML and choosing an implementation language that itself reads like pseudo-code, such as NetLogo.
 We have learned about the processes involved in describing a conceptual model, including flowcharts, pseudo-code, UML, and NetLogo.
 Credits: Uri Wilensky book.

Lesson 149: Verification Testing

Here, we discuss how verification testing can be used to establish the correctness of a model.
 When implementing the design into code, we need to follow core ABM design principles.
 This verification technique is a form of unit testing: we can modify the code without disrupting previously verified code.
 We have seen how an incremental approach makes it easier to implement the model correctly.
 We are then able to write our code in NetLogo for verification purposes.
 Credits: Uri Wilensky book.

Lesson 150: Beyond Verification

Here, we will learn about going beyond verification in agent-based modeling.
 Sometimes the results produced are not what the implementers and authors hypothesized; results of modeling can therefore end up causing further confusion for the scientists.
 In the voting model, jagged edges can occur due to ties in votes.
 The model can be designed so that voters do not change their votes when their neighbors are tied, controlled by a switch called CHANGE-VOTE-IF-TIED?.
 A second switch, AWARD-CLOSE-CALLS-TO-LOSER?, awards narrow majorities to the losing side.
 With both switches on, we get a different outcome.
 We have seen the details of how to go beyond verification.
 Credits: Uri Wilensky book.

Lesson 151: Sensitivity Analysis and Robustness

Here, we discuss sensitivity analysis and robustness.
 We create a parameter to test the hypothesis that the initial balance leans in one direction.
 Using BehaviorSpace, we can run an experiment varying this parameter from 25 to 75 percent in increments.
 Two stop conditions are used:
o if no voter switched votes in the last step, the model stops; and
o the model stops after one hundred steps have executed.
 Past research methodologies for sensitivity analysis include Active Nonlinear Testing.
 We have seen the details of sensitivity analysis and robustness.
 Credits: Uri Wilensky book.

Lesson 152: Verification Benefits and Issues

Here, we examine benefits and issues related to verification.
 We need to develop an understanding of the causes of unexpected outcomes and the impact of small changes.
 The implemented model needs to correspond well with the conceptual model.
 This all depends on the model's low-level rules as well as an understanding of the mechanisms involved; a bug in the code can produce surprising results.
 Understanding the operation of the model is therefore quite essential. It is also important to note that the verification process is not binary.
 We have discussed verification benefits and issues.
 Credits: Uri Wilensky book.

Lesson 153: Validation

We will now discuss the validation of agent-based models.
 Validation concerns the correspondence between the implemented model and reality.
 The various topics related to validation include:
o the two axes of validation,
o macrovalidation,
o face validation, and
o the Flocking model,
o a classical agent-based model.
 Here, we have introduced the validation of agent-based models.
 Credits: Uri Wilensky book.

Lesson 154: Macrovalidation and Microvalidation

Here, we consider the difference between macrovalidation and microvalidation and examine their roles in the validation of a model.
 ABMs are built from agents, so we can directly compare those agents with the ones that exist in the real world; for instance, we can ask whether the agents in the Flocking model have properties similar to real birds.
 There are some limitations, e.g., real birds fly in three dimensions, but our bird agents move in two dimensions only. However, such limitations do not make the model invalid.
 We can also build the Flocking model in three dimensions and examine the resulting flocks to see whether they are relevantly different from the 2D flocks.
 The other major avenue of validation is to investigate the relationship between the global properties of the model and the flocking patterns of real birds, a process called macrovalidation.
 By showing that our model corresponds to the macro-level phenomenon, we further confirm that our model is descriptive of real-world systems.
 Macrovalidation tells you whether you have captured the important parts of the system as a whole, whereas microvalidation tells you whether you have captured the important parts of each agent's individual behavior.
 We now understand how macrovalidation and microvalidation compare with each other.
 Credits: Uri Wilensky book.

Lesson 155: Face Validation vs. Empirical Validation

Here we discuss how face validation differs from empirical validation.
 Face validation is the process of showing that the mechanisms and properties of the model look like the mechanisms and properties of the real world.
 Empirical validation means making sure that the data the model generates corresponds to similar patterns of data in the real world.
 Face validity can exist at both the micro and macro levels: determining whether the birds in the model correspond to real birds is face microvalidity, while determining whether the flocks correspond to the appearance of real flocks is face macrovalidity.
 Empirical validation sets a higher bar: data produced by the model must correspond to empirical data derived from the real world, using measures and numerical data generated both by the model and by the actual phenomenon.
 Inputs and outputs in the real world are often poorly defined, and it is hard to isolate and measure parameters of a real-world phenomenon.
 The process of finding the parameters and initial conditions that cause the model to match real-world data is called calibration.
 These four types of validation (micro-face, macro-face, micro-empirical, and macro-empirical) characterize the majority of validation efforts carried out.
 Credits: Uri Wilensky book.

Lesson 156: Why Validate a Model?

Here we consider why it is important to validate a model.
 A valid model can be useful for extracting general principles about the world.
 Changing the mechanisms and parameters can often help predict what might occur in the real world.
 A model is not simply valid or invalid: a model can be said to be more or less valid based upon how closely it has been compared to the real process it is modeling.
 A model is never inherently valid; its validity comes from the context of what it is being used for.
 Validation requires that something in the model corresponds to something in reality.
 The user compares his observations with the real world and uses this comparison as the basis of his validation.
 Here we have considered the benefits of validation and the questions it raises.
 Credits: Uri Wilensky book.

Lesson 157: Replication

Here, we consider replication.
 Model replication is the implementation, by one researcher or group of researchers, of a conceptual model previously implemented by someone else.
 Scientists must publish the details of how an experiment was conducted; subsequent teams of scientists can then carry out the experiment themselves to ascertain whether they obtain the same results.
 Replicating a physical experiment strengthens the original results by comparing both the experimental setup and the ensuing results.
 Replicating a computational model serves this same purpose: it increases our confidence that the new implementation of the model yields the same results as the original.
 A classic example is the Ethnocentrism model, which includes a mechanism for inheritance of strategies. The model suggests that "ethnocentric" behavior can evolve under a wide variety of conditions, even when there are no native "ethnocentrics".
 Here we have covered the basic concepts of replication and how it contributes to the correctness of a model.
 Credits: Uri Wilensky book.

Lesson 158: Replication of Computational Models: Dimensions and Standards

Replication
 Replication refers to the creation of a new implementation of a conceptual model based on a previous implementation.
Dimensions
 An original model and a replicated model can differ across at least six dimensions:
o time,
o hardware,
o languages,
o toolkits,
o algorithms, and
o authors.
Successful Replication
 A successful replication is one in which the replicators are able to establish that the replicated model produces outputs sufficiently similar to the outputs of the original.
Replication Standard
 The criterion by which a replication's success is judged is called the replication standard.
 Different replication standards exist for different levels of similarity between model outputs.
Categories of Replication Standard
 There are three categories of replication standard:
o numerical identity,
o distributional equivalence, and
o relational alignment.
Data
 ABMs usually produce large amounts of data, much of which is usually irrelevant.
 Data that are central to the conceptual model should be measured and tested during replication.
Summary
 Here we have covered the dimensions along which replications of a conceptual model can differ (time, hardware, languages, toolkits, algorithms, and authors), as well as the replication standard and its categories.
 Credits: Uri Wilensky book.

Lesson 159: Benefits of Replication

Overview
 We are going to see the benefits of replication in this section.
Benefits of Replication
 Replication advances scientific knowledge.
 Model verification: distinct implementations producing the same results increase confidence in the conceptual model, and corrections are made if any difference is found between the implemented model and the replication.
 Model validation: replication prompts a check of the correspondence between the outputs and the real world, validating a model when its outputs are closer to real-world data, and sometimes reevaluating the original mapping.
 Shared understanding: replication encourages the creation of sets of terms, idioms, and best practices with which researchers can communicate about models, and researchers invest in and become interested in a model through replication.
 A culture of replication is thereby fostered.
Summary
 Here, we understood the benefits of replication in modeling.
 Credits: Uri Wilensky book.

Lesson160:
Overview
 Here, we will learn about recommendations for model replicators.
Recommendations for model replicators
 Choose a replication standard (RS) that provides the level of precision needed to establish the hypothesized regularity.
 Examples: "numerical identity", "distributional equivalence", and "relational alignment".
 Assess how detailed the description of the conceptual model in the original paper is, and whether to communicate with the authors of the original model.
 It can be beneficial to delay contact with the original authors until after a first attempt to recreate the original model.
 Differences between the public conceptual model and an implementation can be interesting and can result in new discoveries.
 Become familiar with the toolkit in which the original model was written; this gives a better understanding of the original model, and the replicator comes to understand its subtler workings.
 Deliberately decide on an implementation strategy.
 Obtaining the source code of the original model is effective for illuminating discrepancies between the two model implementations.
 Beware of groupthink: unconsciously adopting some of the practices of the original model developer.
Summary
 We have understood the recommendations for model replicators in modeling.
 Credits: Uri Wilensky book.

Lesson161:
Overview
 Here, we will understand the recommendations for model authors.
Recommendations for model authors
 The research paper should specify the model well.
 Even the complete source code of the model may not be sufficient on its own.
 Model authors should make their source code publicly available, or at least provide "pseudo code".
 Consider to what extent the model developer presents a sensitivity analysis.
 Small modifications to the original model can affect results drastically; these sensitive differences should be published by the authors.
 The original author may not be the original implementer, in which case the implementation may not be veridical to the conceptual model.
 Model authors should implement their own models using "low-threshold" languages and toolkits.
 Authors should examine their conceptual models through the lens of a potential model replicator.
Summary
 We have understood the recommendations for model authors in modeling.
 Credits: Uri Wilensky book

Lesson162:
Overview
 Here, we will understand the basics of complex adaptive systems (CAS).
Complex Adaptive Systems
 Complexity
 Adaptation
 Systems
 The systems approach is an old concept.
 The approach rests on the assumption that breaking a complex concept down into simple, easy-to-understand units helps in better understanding the complexity.
 Ludwig von Bertalanffy first proposed it as 'General System Theory'.
 A system is a delineated part of the universe which is distinguished from the rest by an imaginary boundary. (NECSI)
 To characterize a system, we consider the properties of the system, the properties of the universe excluding the system which affect the system, and the interactions/relationships between the two.
 An adaptive system (or a complex adaptive system, CAS) is a system that changes its behavior in response to its environment.
 The adaptive change that occurs is often relevant to achieving a goal or objective. (NECSI)
 Complexity is: the (minimal) length of a description of the system, or the (minimal) amount of time it takes to create the system.
 Here the length of a description is measured in units of information.
Summary
 We have learnt about the basic ideas of Complex Adaptive Systems.
 Credits: Uri Wilensky Book.

Lesson163:
Overview
 Here, we will understand the historical perspective of CAS.
History
 Started with the concepts of General Systems Theory.
 Biologists focused primarily on individual cells.
 Social scientists focused on connections.
 John Holland and cas/CAS.
Summary
 We have learnt about the historical perspective of CAS.
 Credits: Niazi & Hussain book.

Lecture164:
Overview
 Here, we will understand the basic ideas of complexity.
Complexity
 Disambiguation from computational complexity.
 Nonlinearity.
 Large number of variables.
 Chaos theory.
 The term "chaos" is popularly used to refer to disorder or confusion.
 In science, chaos is an important conceptual paradox that has a precise mathematical meaning: a chaotic system is a deterministic system that is difficult to predict.
 A deterministic system is defined as one whose state at one time completely determines its state for all future times.
 So what does it mean for a chaotic system to be difficult to predict?
 Butterfly effect: tiny differences in initial conditions grow into large differences later (see the numerical sketch below).
Summary
 We have learnt about the basics of complexity in CAS.
 Credits: Niazi and Hussain book.
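To make "deterministic yet difficult to predict" concrete, here is a minimal Python sketch (an illustration added here, not from the handouts; the starting values are arbitrary) iterating the classic logistic map x := r*x*(1-x) in its chaotic regime (r = 4). Two trajectories that start a mere 1e-9 apart diverge completely within a few dozen steps:

```python
# Logistic map: a deterministic system that is chaotic for r = 4.
r = 4.0
x1, x2 = 0.2, 0.2 + 1e-9  # two nearly identical initial conditions

for step in range(1, 51):
    x1 = r * x1 * (1 - x1)
    x2 = r * x2 * (1 - x2)
    if step % 10 == 0:
        print(f"step {step:2d}: x1 = {x1:.6f}, x2 = {x2:.6f}, |diff| = {abs(x1 - x2):.2e}")

# The state at one time completely determines all future states, yet the
# 1e-9 perturbation keeps growing until the two trajectories are unrelated:
# the butterfly effect.
```

The system is fully deterministic (no randomness anywhere), yet any imprecision in measuring the initial state makes long-run prediction practically impossible.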
Lecture165:
Overview
 Here, we will understand adaptation in CAS.
Adaptation
 Adaptive changes in interactions:
 Who changes.
 What changes.
 When the changes occur.
 How the changes occur.
Summary
 We have learnt about adaptation in CAS.
 Credits: Niazi and Hussain book.

Lecture166:
Overview
 Here, we will understand the systems approach.
Systems Approach
 Concept of a system and its boundaries.
 Many systems can be correlated. Example: life/humans/society.
 Whole vs. parts.
 Reductionist vs. holistic approach.
Summary
 We have learnt about the systems perspective of things.
 Credits: Niazi and Hussain book.

Lecture167:
Overview
 Here, we will understand the problems encountered in the modeling of CAS.
Modeling of CAS
 Cognitive Agent-based Computing (CABC) Framework.
Summary
 We have learnt about the modeling of CAS using the CABC framework.
 Credits: Niazi and Hussain book.

Lecture168:
Overview
 Here, we will understand the agent-based approach to modeling.
Agent-based Approach
 Agent: something that acts.
 There is a concept of an agent in domains as distinct as:
o AI/CS/Robotics
o Sociology (Individual)
o Biology
o Ecology
o Etc.
 The CABC framework has 3 levels of agent-based modeling:
 Exploratory Agent-based Modeling (EABM).
 Descriptive Agent-based Modeling (DREAM).
 Validated Agent-based Modeling (VABM), based on the Virtual Overlay Multiagent System (VOMAS).
Summary
 We have learnt about various approaches for modeling using agents and the CABC framework.
 Credits: Niazi and Hussain book.

Lecture169:
Overview
 Here, we will understand complex networks/graphs and modeling using CABC Level 1.
Complex Network-based Modeling
 What is a graph? A graph and a network are the same entity; different domains use different terms (a network is essentially a weighted graph).
 Different aspects of modeling using networks: labels; arcs/lines; directed/undirected edges; centrality-based analysis; ego networks; sentiment graphs; community detection.
 Key benefits: very well-defined; lots of free tools; lots of community support; well-defined mathematical models.
 Cons: difficult to figure out; difficulty in graph mining (lack of expertise in the community); missing data can be very difficult to handle; requires complete data; complexity in deciding data boundaries.
Summary
 We have learnt about complex-network modeling using the CABC framework.
 Credits: Niazi and Hussain book.

Lecture170:
Overview
 Here, we will understand how to put modeling and simulation to use employing the CABC framework.
Cognitive Agent-based Computing Framework
 Data and models: how much data is needed?
 Why choose agent-based modeling?
 Why choose complex network modeling?
 When to choose EABM?
 When to choose DREAM?
 When to choose VOMAS?
Summary
 We have learnt about putting the CABC framework to work.
 Credits: Niazi and Hussain book.

Lecture171:
Overview
 Here, we will learn about useful statistical models.
Useful Statistical Models
 Statistical models are useful in application areas such as:
 Queueing systems.
 Inventory and supply-chain systems.
 Reliability and maintainability.
 Limited data.
 Other distributions.
Queueing Systems
 Simulation is widely used to solve waiting-line problems.
 In the queueing examples, interarrival-time and service-time patterns were given, and both were probabilistic.
 However, it is also possible to have constant interarrival times and constant service times.
 "Arrivals" can occur for many reasons, e.g., machine breakdowns.
 The exponential distribution is used to simulate completely random service times.
 For special cases, the truncated normal distribution can be utilized.
Inventory and Supply-chain Systems
 Three random variables:
 Number of units demanded per order or per time period.
 Time between demands.
 Lead time.
 Simple inventory systems assume constant demand and a lead time that is constant or zero.
 Demand and lead-time distributions chosen in inventory-theory texts for mathematical tractability could be invalid for a real problem.
 A variety of demand patterns can be fit by the geometric, Poisson, and negative binomial distributions.
Reliability and Maintainability
 Time to failure is modeled with many distributions.
 When failures occur purely at random, the exponential distribution is used.
 The gamma distribution is required, for example, when modeling standby redundancy.
Limited Data
 Three distributions are useful when data are limited:
 Uniform distribution.
 Triangular distribution.
 Beta distribution.
 The uniform distribution is just a special case of the beta distribution.
Summary
 Here, we discussed useful statistical models: queueing systems, inventory and supply-chain systems, limited data, and reliability and maintainability.
 Credits: Jerry Banks book.

Lecture172:
Overview
 Here, we will learn about the Bernoulli distribution.
Bernoulli Distribution
 A Bernoulli trial is one of the simplest experiments you can do in statistics and probability: it has one of two possible outcomes.
 The Bernoulli distribution is a discrete probability distribution for a Bernoulli trial, whose outcome is either a success or a failure.
 Let us consider an example which consists of n trials, each of which can be a success or a failure:
 Xj = 1 if the jth experiment is a success;
 Xj = 0 if it is a failure (see the sampling sketch below).
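A minimal Python sketch of simulating Bernoulli trials and estimating p from the sample (illustrative only; the value p = 0.3 is an assumption, not from the handouts):

```python
import random

p = 0.3      # assumed success probability (illustrative)
n = 10_000   # number of independent trials

# X_j = 1 with probability p (success), 0 with probability 1 - p (failure).
trials = [1 if random.random() < p else 0 for _ in range(n)]

estimate = sum(trials) / n
print(f"estimated p = {estimate:.3f}  (true p = {p})")
# E(X) = p and V(X) = p(1 - p), so the sample mean converges to p.
```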
 There are n trials, and each trial has only two outcomes, success or failure.
 Success happens with probability p and failure with probability q = 1 - p.
 X is a discrete random variable: it takes the value 1 in case of success and 0 in case of failure.
 The expected value of the random variable X is calculated as E(X) = 1*p + 0*(1 - p) = p.
 The variance of the random variable X is calculated as V(X) = E(X^2) - [E(X)]^2 = p - p^2 = p(1 - p).
Summary
 We have understood the Bernoulli distribution.
 Credits: Jerry Banks book.

Lecture173:
Overview
 Here, we will learn about the binomial distribution.
Binomial Distribution
 The binomial distribution is the discrete probability distribution, with parameters n and p, of the number of successes in a sequence of n independent experiments.
 Each experiment has its own Boolean-valued outcome, success or failure.
 The number of successes X in n Bernoulli trials is a random variable with a binomial distribution.
 The probability of one specific outcome, with all successes (S) in the first x trials followed by all failures (F) in the remaining n - x trials, is p^x * q^(n-x), where q = 1 - p.
 Since there are n!/(x!(n - x)!) arrangements with the required number of successes and failures, the probability mass function is p(x) = [n!/(x!(n - x)!)] * p^x * q^(n-x), for x = 0, 1, ..., n.
 This representation can also be used to calculate the mean and variance of the binomial distribution: X is the sum of n independent Bernoulli random variables. Then:
 Expected value: E(X) = n*p, where n is the total number of experiments and p is the probability of success in each experiment.
 Variance: V(X) = n*p*q, where q = 1 - p is the probability of failure in each experiment.
Summary
 We have understood the binomial distribution.
 Credits: Jerry Banks book.

Lecture174:
Overview
 Here, we will learn about the Poisson distribution.
Poisson Distribution
 A discrete frequency distribution that gives the probability of a number of independent events occurring in a fixed interval of time is known as the Poisson distribution.
 Its probability mass function is p(x) = e^(-α) * α^x / x!, for x = 0, 1, 2, 3, ...
 e ≈ 2.71828.
 α is the mean number of successes in the given time interval or region of space, with α > 0.
 An important property of the Poisson distribution is that its mean and variance are both equal to α: E(X) = V(X) = α.
 The cumulative distribution function is F(x) = Σ (i = 0 to x) e^(-α) * α^i / i!.
Summary
 We have understood the Poisson distribution.
 Credits: Jerry Banks book.

Lecture175:
Overview
 Here, we will understand the "Uniform Distribution", also known as the "rectangular distribution".
Uniform Distribution
 The uniform distribution is a continuous distribution.
 It takes values within a specified range [a, b], with pdf f(x) = 1/(b - a) for a ≤ x ≤ b and 0 otherwise.
 Random-variate generation for the uniform distribution starts from a random number R distributed uniformly on [0, 1].
 Consider a random variable X distributed uniformly on the interval [a, b]. The generator is given by X = a + (b - a)R.
 To derive this generator via the inverse-transform technique:
 First, compute the cdf of the variable X.
 Second, set F(X) = R on the range of X.
 Third, solve the equation F(X) = R for X in terms of R.
 Step 1:
o The cdf is F(x) = 0 for x < a; F(x) = (x - a)/(b - a) for a ≤ x ≤ b; and F(x) = 1 for x > b (so F takes only the values 0 and 1 outside [a, b]).
 Step 2:
o Set F(X) = (X - a)/(b - a) = R.
 Step 3:
o Solve for X in terms of R, which proves X = a + (b - a)R (see the sketch below).
Summary
 Here, we understood the fundamentals of the uniform distribution.
 Credits: Discrete Event System Simulation, Jerry Banks.
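A minimal Python sketch of this uniform-variate generator (the interval [a, b] = [2, 5] is an arbitrary illustration):

```python
import random

def uniform_variate(a, b):
    """Inverse-transform generator: X = a + (b - a) * R, with R ~ U[0, 1]."""
    R = random.random()
    return a + (b - a) * R

samples = [uniform_variate(2.0, 5.0) for _ in range(5)]
print([round(x, 3) for x in samples])  # all values fall in [2, 5]
```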
Lecture176:
Overview
 Here, we will understand the "Exponential Distribution".
Exponential Distribution
 The exponential distribution is a continuous-valued probability distribution.
 It takes positive real values, and its parameter λ controls the shape of the distribution.
 The probability density function is f(x) = λ*e^(-λx) for x ≥ 0 and f(x) = 0 for x < 0; equivalently, f(x) = λ*e^(-λx)*H(x), where H is the right-continuous Heaviside step function.
 Here λ is called the rate parameter: the number of occurrences per time unit.
 The cumulative distribution function is F(x) = 1 - e^(-λx) for x ≥ 0 and F(x) = 0 for x < 0; this too can be written using the Heaviside step function.
 The inverse-transform technique is used when the form of the cdf is so simple that its inverse F^(-1) can be computed easily. For the exponential distribution, the technique consists of the four steps defined next.
 Step 1:
o Compute the cdf of the given random variable X: F(x) = 1 - e^(-λx), x ≥ 0.
 Step 2:
o Set F(X) = R on the range of X, which becomes 1 - e^(-λX) = R.
 Step 3:
o Solve the equation F(X) = R for X in terms of R, as follows: X = -(1/λ) ln(1 - R).
o Here X = F^(-1)(R) is called a random-variate generator.
 Step 4:
o Generate uniform random numbers R1, R2, R3, ... and compute the desired random variates Xi = -(1/λ) ln(1 - Ri).
o For the exponential case, one may replace 1 - Ri with Ri, giving Xi = -(1/λ) ln(Ri), because both Ri and 1 - Ri are uniformly distributed on [0, 1].
Summary
 We understood the "Exponential Distribution".
 Credits: Discrete Event System Simulation, Jerry Banks.

Lecture177:
Overview
 Here, we will discuss the "Triangular Distribution".
Triangular Distribution
 A triangular distribution is a continuous probability distribution with a probability density function shaped like a triangle, defined by:
 the minimum value a,
 the maximum value b, and
 the peak (mode) value c,
 where a ≤ c ≤ b (see the sketch below, which generates both exponential and triangular variates).
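A minimal Python sketch tying the last two lectures together: inverse-transform generators for the exponential distribution (as derived above) and for a triangular distribution (the triangular inverse formula is the standard one, though it is not derived in the handouts; all parameter values are illustrative):

```python
import math
import random

def exponential_variate(lam):
    """Inverse transform: X = -(1/lam) * ln(1 - R); 1 - R lies in (0, 1],
    which avoids ln(0) since random.random() is in [0, 1)."""
    return -math.log(1.0 - random.random()) / lam

def triangular_variate(a, b, c):
    """Inverse transform for a triangular distribution with min a, max b, mode c."""
    R = random.random()
    if R <= (c - a) / (b - a):
        return a + math.sqrt(R * (b - a) * (c - a))
    return b - math.sqrt((1 - R) * (b - a) * (b - c))

random.seed(42)
exp_samples = [exponential_variate(lam=2.0) for _ in range(100_000)]
tri_samples = [triangular_variate(1.0, 10.0, 4.0) for _ in range(100_000)]

print(f"exponential mean ~ {sum(exp_samples)/len(exp_samples):.3f} (theory: 1/λ = 0.5)")
print(f"triangular  mean ~ {sum(tri_samples)/len(tri_samples):.3f} (theory: (a+b+c)/3 = 5.0)")
```

The sample means should come out close to the theoretical values, which is a quick sanity check that each generator implements its cdf correctly.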
