Risk Management and Hazard Control Process PDF
Summary
This document describes the risk management and hazard control process, explaining the components of a safety management system, including management leadership, employee involvement, work-site analysis, hazard prevention and control, and safety and health training. It provides examples and a framework for implementing an effective safety program.
Full Transcript
# Domain 2: Safety Management Systems

## Risk Management and Hazard Control Process

### The Safety Management System

The effectiveness of any safety intervention can be tied to two main aspects of the overall safety program:

1. The existence of a safety management system
2. An organizational culture that is supportive of the safety efforts.

The **Occupational Safety and Health Administration (OSHA)** defines a safety management system as comprising four areas, all of which are necessary for a safety and health program to be effective in meeting its goals and objectives. The components of the safety and health management system include:

- Management leadership and employee involvement
- Work-site analysis
- Hazard prevention and control activities
- Safety and health training.

### Management Leadership and Employee Involvement

Without management leadership for safety, a safety program can be almost guaranteed to be ineffective. Through their actions, members of senior management display the importance that safety plays in an organization. Including safety performance as part of the overall organizational goals is one way management conveys this importance. If safety is not perceived by the employees to be important to management, then it will almost certainly not be seen as being important by the workers. Where management has placed safety on a par with other functions, they must be genuinely committed to following through or employees will not abide by company policies.

Getting employees involved in the development and implementation of safety program tasks increases the chances that the programs will be accepted and followed by the employees.

### Work-Site Analysis

Work-site analysis involves the identification of hazards with the goal of correcting hazardous conditions before an accident occurs.
Tools to consider as part of the work-site analysis include:

- Conducting property hazard assessments
- Environmental audits
- Accident investigations
- Job hazard analyses.

Additionally, analyzing accident data can also be very helpful.

- **Proactive safety programs** are implemented with the goal of preventing potential accidents and the losses from those accidents before they occur.
- **Reactive safety programs**, on the other hand, focus their attention on activities aimed specifically at the causal factors attributed to accidents and losses that have already occurred.

### Hazard Prevention and Control

Hazard prevention and control includes those program components designed to:

1. Prevent accidents from occurring,
2. Minimize their severity should an accident occur.

Examples of programs aimed at hazard prevention and control include:

- Preventive maintenance programs
- Emergency preparedness

A recognized hierarchy for hazard control is:

- Elimination
- Substitution
- Engineering
- Warning
- Administrative action
- The use of personal protective equipment (PPE).

### Safety and Health Training

The fourth component of a safety management system is **safety and health training**. The training should ensure that employees at all levels of the organization are aware of safety and health policies and procedures that may impact them. Additionally, task-specific safety and health training should be provided to employees with unique exposures to hazards on the job.

To evaluate the **safety and health management system**, OSHA has developed, as part of their outreach programs, an evaluation tool referred to as the **Safety and Health Program Assessment Worksheet (OSHA Form 33)**. As part of the assessment, consultants review an employer's existing safety and health management program to identify elements considered adequate and elements that need development or improvement.
To assist employers in meeting their training obligations, OSHA published the **Training Requirements in OSHA Standards and Training Guidelines**. This document provides employers with guidance on:

- How to identify training needs
- How to develop a training program
- How to evaluate the effectiveness of the program.

### Components of a Comprehensive Safety and Health Program

The following seven components have been identified as necessary elements for a comprehensive safety and health program:

1. Hazard anticipation and detection programs, including:
   - Hazard surveys
   - Self-inspections
   - Accident investigations.
2. Hazard prevention and control measures, including:
   - The use of engineering controls
   - Personal protective equipment
   - Emergency response plans
   - Adequate medical care for employees.
3. Planning and evaluation programs, including:
   - Data collection and analysis methods
   - Development of safety goals and objectives
   - A review of the overall safety and health management system.
4. Administration and supervision activities, including:
   - Coordination of safety and health program activities
   - Accountability mechanisms
   - Safety responsibilities being communicated to those who must perform the duties.
5. Safety and health training, encompassing:
   - New employee orientations
   - Supervisor safety training
   - Management safety training.
6. Management leadership and commitment, which are vital to the success of any safety program. Performance measures include:
   - Adequate resource allocation for safety
   - Top management involvement in the planning and evaluation of safety performance.
7. Employee participation, which effective safety performance requires in all areas of:
   - Planning
   - Evaluation
   - Implementation of safety program tasks.

   Employee involvement can take many forms, including:
   - Employees involved in the decision-making process for safety
   - Employees participating in the detection and control of hazards.
## Occupational Safety and Health Management System Cycle

The **American National Standard for Occupational Health and Safety Management Systems (ANSI/AIHA Z10)** defines an occupational health and safety management system cycle as:

1. An initial planning process and the implementation of the management system
2. A process for checking the performance of these activities and taking appropriate corrective actions
3. A management review of the system for suitability, adequacy, and effectiveness against its policy and the ANSI standard.

Elements of this cycle include:

- The plan-do-check-act cycle
- Management leadership
- Employee participation
- Management planning activities
- System implementation
- Checking and corrective action
- Management review.

The purpose of this cycle is to ensure that continuous improvement activities are systematically incorporated into the organization's management functions, resulting in a coordinated effort to continually improve safety performance.

## The Safety Culture

An organization's culture consists of its values, beliefs, legends, rituals, mission, goals, performance measures, and sense of responsibility to its employees, customers, and community, all of which are translated into a system of expected behavior. The **safety culture of an organization defines how the organization values and perceives safety in the workplace**. This safety culture plays an important role in determining the success of safety and health activities. If management promotes a culture in which safety is perceived as not being important to the organization, then the employees will perceive safety as something that is not important. It is the organization's culture that determines whether the safety program as a whole will be effective.

An assessment of the safety culture should include asking questions such as the following:

- Is there a strong safety culture established with no tolerance for unsafe practices?
- Is the cultural goal zero injuries?
- Are health and safety procedures followed all the time?
- Is there a vision of a safe work environment, and do all employees share in it?
- Do employees value safe behavior, themselves, and their continued well-being?
- Is the management style and culture nonautocratic with a win-win atmosphere?
- Is there a trusting relationship between management and employees?
- Do employees believe that safety is a company priority?

In organizations with a strong safety culture, the following characteristics exist:

- Executives and managers visibly support safety with no contradictory decisions, and they accept full accountability.
- Employees are involved with safety and their views are sought and acted upon.
- Supervisors' actions support safety, including recognizing and appreciating safe work practices and behaviors.

It is accepted in the safety profession that there is a relationship between an organization's culture and safety performance and that the organization's culture can be measured and managed. Methods used to measure the safety culture in an organization include:

- Employee surveys directed at their perceptions about management leadership for safety
- Reinforcement by management to report hazards
- Employee attitudes and perceptions about safety
- How employees view the management and supervision of safety
- Whether they feel there is a genuine commitment for safety.

### Measuring Safety Effectiveness

In occupational safety and health, the need for a particular intervention can be determined by legislation in which a regulation stipulates that a particular safety and health activity be provided in addition to other areas not regulated by standards, such as ergonomics. It can also be determined by analysis and investigation. For example, an analysis of the work site and loss data may indicate the need to prevent back injuries. Once the intervention is in place, many times a more difficult question presents itself: "Is the safety intervention working?"
The methods safety professionals use to answer this question vary widely. Some companies count the number of people injured at the end of the year, and others may use a continuous improvement process. As with any intervention in the workplace, an organization must determine if the activities implemented are effective in meeting the organization's goals and objectives. Safety activities are no different than any other business activities. Over the years, it has become more commonplace for the safety professional to tie safety activities to results in an effort to show how improving safety activities equates to improving business operations.

### Historically

Historically, the effectiveness of a safety activity has been measured in terms of:

- The number of accidents incurred
- The organization's OSHA recordable incidence rates
- The dollars spent on accidents
- The costs for insurance coverage.

There is no one way to measure safety and health program effectiveness; rather, a systems approach is necessary. These multiple methods for measuring safety performance include an approach in which leading, trailing, and current indicators are used.

As methods for continual improvement evolved in the workplace along with statistical process control, the safety profession has slowly moved toward some of these methods now routinely used in other aspects of the organization's management structure. Safety managers are increasingly held accountable for their activities and must show management how their activities positively impact the organization.

With an increasingly global economy, and international standards becoming the framework by which management practices are designed and monitored, safety practices have evolved to systems approaches for continual improvement. An organization must accurately and validly assess where it is in terms of its safety performance, decide where it would like to be, and determine what needs to be done to get there.
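
As an illustration of one of the historical measures listed above, the OSHA recordable incidence rate is conventionally normalized to 100 full-time workers by scaling to 200,000 hours (100 employees working 40 hours a week for 50 weeks). The sketch below assumes that convention; the function name and the sample figures are hypothetical and are not taken from this text.

```python
def incidence_rate(recordable_cases: int, hours_worked: float) -> float:
    """Recordable cases per 100 full-time workers per year.

    Assumes the conventional 200,000-hour base
    (100 workers x 40 hours/week x 50 weeks).
    """
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return recordable_cases * 200_000 / hours_worked


# Hypothetical facility: 6 recordable cases over 400,000 hours worked.
rate = incidence_rate(6, 400_000)
print(rate)
```

Because the rate is computed only after injuries have been recorded, it is a trailing measure and is best read alongside other indicators.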
In recent years, much research has been conducted to evaluate the effectiveness of interventions designed to improve safety performance. Through modeling techniques and statistical analysis, it is possible to optimize the effects of the safety and health interventions by decreasing injury rates and property damage with less costly programs.

## Valid Measurements

Measuring safety performance is a critical step in the safety performance improvement process. The purpose of safety performance measurement is to determine if the goals and objectives have been met. The measures selected to monitor and evaluate safety performance must be valid and reliable.

Valid performance measures are measures that are true indicators of performance. There must be a relationship between what is being measured and safety performance. Because follow-up action is planned and implemented based on the outcomes of the performance measures, it is only logical that the corrective actions are also valid means for improving performance.

For example, a safety manager determined that an indicator of the number of cumulative trauma disorder (CTD) injuries reported was the number of employees that successfully completed CTD injury prevention training. Using this measure, the safety manager tracked the number of employees trained each month and the number of CTD injuries reported. The safety manager found that as the number of employees in the facility trained on CTD injuries increased, so did the number of CTDs that were reported, indicating that the training was unsuccessful in reducing the number of injuries. What the safety manager failed to take into account was the fact that the training also included early symptom reporting procedures and information about the early symptoms of CTD injuries.
Thus, using the completion of CTD training as an indicator of CTD injury prevention may not be considered a valid measure because the training introduced a confounding factor: the early reporting of CTD symptoms. Confounding variables in research are variables whose individual effects upon an outcome cannot be readily separated and measured. In some cases, statistical procedures may be used to control for this confounding.

Another important trait of any measure used to evaluate safety performance is reliability. The reliability of a performance measure is the consistency of results obtained through the measurement. This consistency means that the same results are obtained when the measurement is taken multiple times. A measurement used to describe the number of CTD injuries suffered must be well-defined to ensure the reliability of the data collected. An unreliable measure can yield different numbers when measured by different people.

Data must first be shown to be reliable before it can be evaluated for validity. Otherwise stated, unreliable data is always considered invalid. Reliable data may or may not be valid.

Reliability of data can be statistically evaluated using a variety of techniques. Two examples of these methods are the test-retest method and the split-half method. In the test-retest method, one performance indicator is measured multiple times. If all the measurements are highly correlated, meaning the same results are obtained over the multiple trials, then the measurement technique and data can be shown to be reliable. The split-half method is commonly used with tests and survey instruments. With the split-half method, the items are randomly distributed throughout the instrument. If the items are consistently measuring the same outcome, one would expect to find a strong correlation when comparing the first half of the responses to the second half.
## Leading Indicators, Trailing Indicators, and Current Indicators

The effectiveness of a safety activity should be measured via three indicators:

1. Leading indicators
2. Current indicators
3. Trailing indicators.

Trailing indicators are the most common measures used by safety professionals. Trailing indicators are those measures that indicate the results of an intervention strategy after the fact. Examples of trailing indicators include:

- Lost-workday rates
- The number of injuries over a period of time
- The losses incurred by the organization.

Some reasons why trailing indicators are so widely used to measure safety performance include:

- The availability of data to make such measurements
- The influence of OSHA's recordkeeping guidelines
- The use of various OSHA rates and measures of safety performance in the United States.

One major downside of using trailing indicators is that they are measuring unwanted events after the fact, thus providing no means for implementing improvement strategies to impact their outcomes.

**Current indicators** measure the current status of the organization's safety performance. An example of a current indicator is the number of safety audits conducted up to a particular point in time. A positive outcome from using current indicators is that as soon as the measure is obtained, action can be taken immediately to improve the measure and thus improve safety performance.

**Leading indicators** are those measures that are correlated to future safety performance. For example, participation in safety training may be an indicator as to whether employees suffer back injuries on the job. Measuring the number of workers trained at a point in time may be indicative of the number of back injuries expected in the future. As with current indicators, leading indicators provide the safety manager with information that can be acted upon today to get positive safety performance results in the future.
A key to using current and leading indicators is that these measures must be directly correlated to safety performance. Without this relationship, a safety manager may find that activities taken to improve safety performance based on uncorrelated measures will have no effect on safety performance.

Safety performance should not be measured using only one or two performance measures. Instead, it should be measured with a variety of leading, current, and trailing indicators that have been shown to be correlated to safety performance in the workplace. When selecting these performance measures, keep in mind that the data needed for the performance measure should be valid, reliable, and readily available.

When using multiple measures, the data's main effects and interactive effects become important when interpreting the results. Main effects are the variables examined separately in order to determine their role in influencing the outcome measure. For example, a safety manager wishes to determine the influence the age of the worker and the number of training sessions attended have on the number of injuries reported over a given period. The age of the worker and the number of training sessions attended can be considered the main effects. Next, the safety manager wishes to determine the combined influence that age and the number of training sessions attended have on the number of injuries reported. When examining the two variables simultaneously, the safety manager is assessing the interactive effects of the two variables.

## Benchmarking

Benchmarking, measurement, and evaluation are all essential for program success. The benchmarking process establishes a standard that the company has determined signifies successful performance. Benchmarking is a technique for measuring an organization's products, services, and operations against those of its competitors, resulting in a search for best practices that will lead to superior performance.
**Benchmarking safety performance** entails:

1. Identifying similar organizations with outstanding safety performance
2. Identifying key aspects of their activities that make them stand out.

Benchmarking is more than taking another organization's safety programs and copying them. Much research is necessary to be able to identify those aspects of safety activities that result in superior performance, and much work is required to tailor them so similar outcomes can be duplicated in another organization. Meaningful benchmarks are typically set using successful performance results from similar industries and other facilities.

The **benchmarking process** can be completed in six steps:

1. Survey programs
2. Identify solutions
3. Choose organizational priorities
4. Develop a plan
5. Implement the plan
6. Follow up

### Surveying

The first part of benchmarking is surveying front-running programs or organizations. This step is the most crucial in the entire process. Identifying who the best organizations are and what they are doing to generate exemplary safety performance is critical in establishing benchmarks and program priorities.

### Identifying Solutions

The second part of benchmarking is identifying the complementary solutions used by the target organization or program. As stated previously, the benchmarking process is not merely copying what other successful organizations are doing, but incorporating their programs into your organization in a manner that fits the organization structure and goals.

### Prioritizing

Part three of benchmarking is prioritizing growth opportunities from a list of complementary solutions. The purpose of prioritization is to determine which program changes will provide the organization with the largest improvement in business and safety performance.

### Planning

The next part of the process involves developing a plan to achieve the goals. Incorporating changes in an organization will take time and careful planning.
The programs identified as being crucial for success must be tailored to the organization.

### Implementing

Implement the plan. Adequate personnel and resources must be made available to ensure the benchmarking plan is carried out. Inadequate resources in the implementation phase, a lack of commitment, and a lack of motivation to continue implementing the benchmarking plan will produce poor results.

### Following Up

Benchmarking is a dynamic process. Follow-up activities include monitoring to ensure the changes are meeting the needs of the organization. Just because they were found to be successful in one organization does not necessarily mean they will achieve the same results in another. Follow-up may involve modifying the activities or identifying new ones to achieve the desired safety performance.

## Example: Benchmarking a Safety Measure

An organization was experiencing ever-increasing workers' compensation costs due to employee back injuries. To control these costs, the safety director decided to apply a benchmarking approach. First, companies that had been recognized by the industry as leaders in safety were identified and invited to participate in benchmarking focus groups. In these focus group meetings, the activities that were being used to control workers' compensation costs were identified and prioritized in terms of their effectiveness. Following the focus group meetings, the safety director developed a plan to tailor the activities to best meet the needs of his facility and implement them. Using a continuous improvement approach, results from the cost-control activities were measured and further interventions implemented based on the measurable results.

## Quality Control

**Quality control** is a universal management process for conducting operations in order to:

1. Provide stability
2. Prevent adverse change
3. Maintain the status quo.
**Process control** is about maintaining variation in a process at a level where the only variation present is random and the process is stable and, therefore, predictable. The major distinction between the two is that **quality control focuses on outputs** and **process control focuses on inputs**.

**Continuous improvement** is about improving the efficiency and effectiveness of products, processes, and systems that are under control. Continuous improvement efforts are directed toward both the inputs and outputs.

Safety performance, like other aspects of business, can be managed using the tools and techniques found in quality control. **Quality safety performance begins with proper planning and the development of performance goals and objectives**. Methods for measuring this performance, commonly referred to as safety metrics, are then developed to define measures that are indicative of acceptable performance. Data are collected and analyzed, making comparisons against the established levels of acceptable performance. When gaps are identified between acceptable performance levels and actual performance levels, action is warranted. This quality control process is known as the continuous improvement process.

## Continuous Improvement

During the 1990s, the use of continuous improvement processes increased dramatically in the business world. In the context of ISO 9000, there is no difference between **continuous improvement** and **continual improvement**. Improvement that is continuous has no periods of stability; it is attainable all the time. Although the rate of change may vary, improvement does not stop. In reality, there are periods of stability between periods of change; therefore, continual improvement is a better term to describe the phenomenon.
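
The idea of a process that is under control, with only random variation present, can be made concrete with a simple control-limit check. The sketch below is an assumption-laden illustration rather than a method prescribed by this text: it places limits at the baseline mean plus or minus three sample standard deviations, which is one common convention, and the function names and monthly counts are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline):
    """Return (lower, centre, upper) limits from a stable baseline period.

    Assumes limits at the mean +/- 3 sample standard deviations.
    """
    centre = mean(baseline)
    sigma = stdev(baseline)
    return centre - 3 * sigma, centre, centre + 3 * sigma


def out_of_control(values, lower, upper):
    """Indexes of observations that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical monthly first-aid case counts.
baseline = [4, 5, 6, 5, 4, 6, 5, 5]
lower, centre, upper = control_limits(baseline)
flagged = out_of_control([5, 6, 12, 4], lower, upper)
print(flagged)
```

A flagged point signals a special cause worth investigating. Removing that cause returns the process to its normal level, which is a precondition for improvement rather than improvement itself.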
**Continuous improvement** is the process of establishing performance measures with a desired goal, implementing an intervention designed to meet that goal, measuring the performance, and implementing change in the intervention until the desired goal is met. Simply putting out fires is not improvement of the process, and neither is the discovery and removal of a special cause detected by a point out of control. This only puts the process back to where it should have been in the first place.

If there is no status quo (no normal level), action needs to be taken to establish a normal level, that is, bring operations under control. When a process is in control, data measurements are consistent without wild fluctuations and wide ranges. One can only improve what is already under control. Bringing operations under control is not improvement.

## Plan-Do-Check-Act

In their quest for safety performance improvement, a variety of industries have adopted the **Plan-Do-Check-Act (PDCA) cycle**. This cycle has also been incorporated into standards such as ISO 9001:2008, the ISO 14000 family of standards, and ANSI/AIHA Z10-2005. Various professionals in the United States and Japan have been associated with the early evolution of the PDCA cycle, including Deming, Shewhart, and Mizuno. This continuous improvement approach has been the cornerstone of a variety of approaches to both safety and process control.

First, **safety activities and performance goals** are defined and prioritized in the **Plan phase**. Next, these **safety activities are implemented in the Do phase**. This is followed by **measurement of the results of the activities** and **comparisons of the results with the planned or desired outcomes in the Check phase**. In the **Act phase**, if desired performance levels are not attained, then changes in the activities may be warranted to achieve the desired outcomes.
If the performance goals are successfully met, then modifications to the planned outcomes can be made so that further improvements in performance can be planned. By repeating this cycle and planning for better safety performance with each successive time through the process, **continuous improvement** can be successfully planned, implemented, measured, and achieved.

The first edition of the ISO 9000 standards sees PDCA management as compatible with the contemporary concept that all work is accomplished by a process. Safety management practices also lend themselves well to a PDCA process in which safety activities are planned and then implemented, outcomes are evaluated, and interventions are acted upon to close any gaps between the desired performance and the actual performance.

## Identification Methods

- **How to Recognize Hazards**
- **Energy Sources**
- **Nonenergy-Related Hazards**
- **Failures**
- **Helpful Hints for Recognizing Hazards**
- **Exploring Causes and Effects**
- **Cause-and-Effect Principles**
- **Root-cause Analysis**
- **Principles of Variation**
- **Benchmarking**
- **Selection of Control Methods**
- **Controlling Hazards**
- **Hierarchy of Controls**
- **Design Solutions: Making the Right Decision**
- **Implementing Effective Controls**
- **Implementing Risk Reduction Methodologies**
- **Step 1: Engaging Organizational Leadership**
- **Step 2: Using Business Language to Implement Risk Reduction Processes**
- **Step 3: Identifying and Assessing Risk**
- **Step 4: Mitigating Risk**
- **Monitor and Reevaluate**
- **Defining Exposure Assessment**
- **Exposure Assessment Strategies**
- **Quantitative Exposure Assessment Methodologies**
- **Impacts of Uncertainty and Data Quality**
- **Role within OSH Programs**
- **Uses and Limitations of Monitoring Equipment**
- **Acceptable Risks and Occupational Limits**
- **Methods for Determining Exposure Levels**
- **Personal Sampling**

## How to Recognize Hazards

There is no one rule for hazard recognition.
Hazards come in many different forms. They can result from:

- Design deficiencies
- Human error
- Environmental factors, and so on.

This section gives several general guidelines that help us recognize hazards. Hazard recognition is important early on in the project and in selecting appropriate hazard recognition techniques to fit the types of hazards or concerns that one desires to control.

## Energy Sources

Most hazards are related to an inadvertent or uncontrolled release of energy. An explosion causes shock waves and kinetic energy of shrapnel, which can cause injury or damage. Uncontrolled electrical energy can cause injury or death and property damage. So the first task is to identify all energy sources. Then one needs to determine what targets (people, objects, environment, etc.) could be injured or damaged by any release of that energy. Look for barriers to harmful energy flow between the source and target. Barriers may be physical (design measures) or administrative (policies and procedures). Figure 1 illustrates the relationship of the energy source, barrier, and target and shows how barriers can prevent injuries from energy sources.

Energy may be of one or more of the following types:

- Mechanical: rotating machinery
- Potential: raised crane load, coiled spring; resulting in kinetic energy (falling or flying objects)
- Pressure (pneumatic, hydraulic, acoustic): stored pressure that could be released
- Thermal: high or low temperatures
- Chemical: toxic or corrosive chemicals, reactants
- Electrical: energy that is used to operate equipment
- Radioactive: energy that is used to provide power.

Many systems will contain several energy sources. A high-temperature furnace contains both thermal and electrical energy. Also consider interaction among many energy sources. For instance, a tank of gasoline is next to a house in a lightning-prone area. There are two energy sources - electrical energy from the lightning and chemical energy in the gasoline.
If the two interact in the right combination, this could cause a fire or explosion. The **targets** are the house and its occupants. A hazard statement for this situation should be written in terms of the energy and targets. Barriers that prevent energy transfer to the targets are the hazard controls. The hazard could be described as follows:

- Hazard: Leaking gasoline tank next to house in lightning-prone area
- Cause: Careless placement of gasoline tank
- Effect: Death or injury and damage from fire or explosion

## Nonenergy-Related Hazards

Examining energy sources enables one to identify most of the serious hazards, but by restricting the process only to energy, one could miss other serious hazards. Such hazards can involve interruptions to normal life functions - for example, asphyxiation, smoke inhalation, or disease - or problems that result from human factors. To find such hazards, look for the following:

- Presence of inert gas
- Potential for loss of breathable atmosphere
- Potential for disease
- Human interfaces with the system
- Software hazards.

Software in and of itself is not hazardous. However, it can become hazardous when it interfaces with hardware. Software that controls hardware functions may command an undesired event or condition. Software that monitors hardware may fail to sense or properly process a hazardous condition. There are various methods for analyzing software. It is important that hardware designers identify all software interfaces with the hardware and bring them to the attention of the software designers and analysts. This includes subroutine controls for weapons systems, traffic lights and traffic control, communications systems, cancer irradiation equipment, and so on.

## Failures

Many hazards do not result from failures. However, some do. Consider two cases:

1. The failure of safety-critical subsystems - life support, fire alarm or sprinkler, and so on - can cause a hazardous situation.
Look at the organization's system to see if it contains any such subsystems. If it does, first determine the probability that these systems will fail, and then take measures to reduce that probability, if necessary.
2. Component failures can cause hazards or mishaps. For instance, failure of a hydrogen line coupling can cause hydrogen to leak into a room and create a potential for explosion. Examine the system to see if any such failures are possible. This can be done in the course of the safety analysis, or one can consult a failure modes and effects analysis, if it is available.

## Helpful Hints for Recognizing Hazards

Using the team approach is best. In most cases, no one person can be familiar with all aspects of system design and operation and the related hazards. At a minimum, a knowledgeable engineer should analyze the system first and give the results to a safety engineer for review.

Consult experts frequently. Experience is invaluable for many reasons, not least that one can learn from past accidents and avoid repeating them. Inquire about past accidents associated with the organization's system.

Look at all operating modes. Often, a system is scrutinized only while it is in normal operating mode. Problems often occur not in normal operation but at startup or shutdown. Other examples of operating modes include transport, delivery, installation, checkout, emergency startup and shutdown, and maintenance.

The following is a list of proven methods for finding hazards:

- Use intuitive engineering sense.
- Examine similar facilities or systems.
- Examine system specifications and expectations.
- Review codes, regulations, and consensus standards.
- Interview current or intended system users or operators.
- Review system safety studies from other similar systems.
- Review historical documents: mishap files, near-miss reports, OSHA-recordable injury rates, National Safety Council data, manufacturers' reliability analyses, and so on.
- Consider external influences, such as local weather, environment, or personnel tendencies.
- Brainstorm: mentally develop credible problems and play "what if?" games: "What could go wrong?" "What is the worst possible thing that could happen?"
- Consider all the energy sources (pressure, motion, chemical, biological, radiation, electrical, gravity, and heat and cold). What's necessary to keep these sources under control? What happens if they get out of control?
- Use checklists such as the hazard category list presented in the following section.

## Exploring Causes and Effects

A review of causes and effects is needed to evaluate incident causation models. To be considered a cause, the event must precede the incident in time (Figure 2). The event (factor or condition) may be a necessary or sufficient condition for the effect; however, rarely is one event both the necessary and the sufficient condition. In most cases, it is enough to identify the events, factors, or conditions that increase the probability of the effect.

Furthermore, an argument can be made that causes and effects are one and the same. The difference is primarily how they are perceived in time. When starting with an effect of consequence, one wants to prevent it from occurring. When someone asks why it occurred, a cause is found; but if one asks why again, what was just a cause becomes an effect.

Consider an analogy from construction safety. In the example, a fall is a cause when viewed as the precursor to the injury. However, the fall is also an effect: viewed as a result rather than a precursor, it is the effect of the floor opening. One could also add "slipped" as the cause of the fall and the effect of the floor opening if one wanted to include more causes in the example. As shown later, there are always causes in between causes, and logic can help in finding the optimum sequence.
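The dual role of each event in the construction example can be made concrete with a small sketch. It assumes the chain (floor opening, slipped, fall, injury) is stored as a list ordered in time; the names `chain`, `roles`, and `ask_why` are illustrative only and do not come from any standard method.

```python
# Hypothetical cause chain from the construction example, ordered in time.
chain = ["floor opening", "slipped", "fall", "injury"]


def roles(chain, event):
    """Return (cause_of, effect_of) for an event in a linear cause chain.

    cause_of  : the next event, which this event helps produce (or None)
    effect_of : the previous event, which produced this event (or None)
    """
    i = chain.index(event)
    cause_of = chain[i + 1] if i + 1 < len(chain) else None
    effect_of = chain[i - 1] if i > 0 else None
    return cause_of, effect_of


def ask_why(chain, effect):
    """Walk backward from an effect, asking "why?" at each step."""
    i = chain.index(effect)
    return list(reversed(chain[:i]))


print(roles(chain, "fall"))      # ('injury', 'slipped')
print(ask_why(chain, "injury"))  # ['fall', 'slipped', 'floor opening']
```

Every interior event has both a `cause_of` and an `effect_of` entry, which is the sense in which a cause and an effect are the same event seen from different points in time. A real incident rarely reduces to a single linear list, so this is a simplification.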
In essence, people have their own perceptions of a correct alignment based on their individual knowledge of the specific causal relationships. Individual perspectives are important, but to eliminate the resulting bias, one should work with others who may perceive a cause or effect differently, or more deeply if they have a greater understanding of the causal relationships. For example, we know we have a cold when we ache and cough, whereas a doctor knows we have a cold when he or she can observe a virus on a microscope slide. The effect is the same, but the knowledge of the causes differs significantly depending on perception and knowledge.

By understanding that a cause and an effect are one and the same thing, merely viewed from different perspectives, one begins to see how they are part of a continuous set of causes that has no beginning or end. In observing the structure of the cause chain created by asking why, one is drawn to a linear path of causes. When one keeps asking why, the chain of causes seems endless. The answer to the initial why is a function of one's perspective. If one is the person responsible for valve maintenance in this example, he or she may choose to look at the leaky valve, or possibly the seal failure. If one is the safety professional, the primary interest would be preventing the injury, so he or she would probably focus on the injury when looking for causes.

## Cause-and-Effect Principles

- All undesirable events are caused to happen.
- These events are the result of design deficiencies, human errors, equipment malfunctions, and other elements.
- The root cause(s) of an event can be determined by analyzing cause-and-effect relationships.
- Because undesirable events are caused to happen, they are actually effects created by additional cause(s).

Determining the root cause is a process for systematically detecting and analyzing all the possible causes of an accident.
Root-cause determination relies on internal logic and reasoning skills to derive conclusions. Figure 3 outlines cause-and-effect definitions.

## Root-Cause Analysis

To understand incident causation completely, one must evaluate the event, define the problem, and identify the cause. This three-step process allows the safety professional to look at the system, including its activities and processes, and determine the hazards present in the system and the resulting causes of nonconformance and errors. The safety professional must clearly identify and describe the loss incident he or she is attempting to solve. This provides focus for a root-cause analysis by determining the who, what, when, where, how, and why specific to the undesirable event, thus defining the loss incident. During a systematic attempt to define the loss, a safety professional, or anyone responsible for determining the causes of an incident, may uncover multiple problems that can be addressed to prevent recurrence.

To identify potential incident causes:

- Explore the undesirable events and situations inherent in the system
- Determine the deviation(s) from any one requirement or expectation
- Evaluate the primary effect necessary for the situation to occur.

It is tempting to prejudge a probable cause or sequence based upon first observations or identified factors. This often results in an incomplete or erroneous conclusion. Root causes are those which, when corrected, would bring about positive results. Therefore, it is important that management understands its role with respect to the root causes of incidents.

## Principles of Variation

There are two kinds of accidents in the context of system operation. The distinction between them is in the type of cause involved.

- Type 1. The outcome is from common causes of variation.
- Type 2. The outcome is from a special cause.
This distinction is important to a solid understanding of incident causation; focusing on the wrong type of cause will allow the same system outcomes to recur. In incident prevention, it is frequently important to locate, estimate, and control the major sources of variation.

Of all the devices for analysis of data, perhaps the most valuable is the simple graph. Since understanding comes as a result of information properly communicated, the mere existence of information is not enough. Whether the objective is the control of a manufacturing plant, care of a patient, or a construction-site incident evaluation, it is important not only that appropriate information be collected, but also that this information be fed back in a readily understood form to those responsible for taking action. To be informative, data must be displayed so that present and past experience can be readily compared, and concomitant variation in two or more impinging responses can be simultaneously considered. Plotting data appropriately is never a futile effort, because it supports the formal analysis used to prevent incidents. Evaluating data also frequently reveals unexpected characteristics that might otherwise be overlooked.

The run chart is a simple statistical tool that involves plotting results over a period of time. With the results of a statistically significant number of trial runs in hand, the result of each trial is plotted on a chart. This produces the basic run chart, a plot of test results over a series of tests.

Walter Shewhart, a mathematician and scientist, studied variation for the telephone industry during the 1920s. Shewhart determined the average number of faults (variations) for the entire group of trials. The value was charted as a line around which the individual plots would be located. By differentiating between desirable and undesirable variations, one could improve the system.
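The run chart and its average line can be sketched in a few lines of code. The fault counts below are made-up illustrative data, and the function names are hypothetical. The sketch also computes a band of three standard deviations on either side of the average, the convention behind Shewhart's control limits. Using the plain sample standard deviation is a simplifying assumption; formal control charts often estimate the spread differently.

```python
from statistics import mean, stdev

# Hypothetical fault counts from ten trial runs (illustrative data only).
faults = [4, 6, 5, 7, 3, 5, 6, 4, 12, 5]


def run_chart_summary(values, sigmas=3.0):
    """Compute a center line and simple sigma-based limits for a run chart."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    upper = center + sigmas * spread
    lower = max(0.0, center - sigmas * spread)  # fault counts cannot be negative
    return center, lower, upper


def text_run_chart(values, center):
    """Render one line per trial: a bar of '#' plus a marker if above the center line."""
    lines = []
    for trial, value in enumerate(values, start=1):
        marker = " <- above average" if value > center else ""
        lines.append(f"trial {trial:2d} | {'#' * value}{marker}")
    return lines


center, lower, upper = run_chart_summary(faults)
print(f"center line = {center:.2f}, limits = [{lower:.2f}, {upper:.2f}]")
print("\n".join(text_run_chart(faults, center)))
```

Trials marked as above average are candidates for asking which factor produced the higher count. A point outside the computed band is the usual signal to look for a special cause rather than ordinary system variation.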
The objective would be to identify the factor that caused a condition above the average for the system. Next, the factor(s) would be eliminated or minimized to improve performance. The goal would be continuous improvement. Then, looking at Figure 4, the more widely spread points would lie closer to line 2, indicating improved performance. The dispersion of the data about the average, or x̄ line, is the basis of deviation. Three standard deviations on either side of x̄ include 99.7 percent of the values of a normal population. These became Shewhart's upper and lower control limits, limits that result from the system itself. A system operating within those limits is in normal operation for the condition it is in. The random variations occurring within these limits