Chapter 1: Introduction to Fault Diagnosis PDF

# Chapter 1: Introduction to Fault Diagnosis

## 1.1 Introduction

Every industrial company strives for optimal performance of its installations. Fault diagnosis, a "health check" of an industrial process, can take place before or after a failure. Following a failure, the urgent need is to repair or replace the broken part. Understanding the causes of the failure then makes it possible to investigate further, considering all the mechanisms and processes involved, in order to prevent its recurrence.

The industrial field is rich in diagnostic methods, often referred to as "industrial system reliability techniques." They focus on studying reliability, availability, maintainability, and safety.

## 1.2 Basic Concepts

### 1.2.1 Fault

A fault is a malfunction or anomaly affecting a system or component. It covers both partial and progressive degradations. A fault can be anticipated, and it includes situations where the system continues to operate but with reduced performance or irregularities. A fault can also be temporary or intermittent before it leads to a complete breakdown.

* **Example**: A sensor providing inaccurate readings is experiencing a fault.

### 1.2.2 Defect

A defect is an imperfection or anomaly in an object, product, system, or machine that does not meet the intended standards or specifications. A defect does not necessarily lead to an immediate malfunction, but it can cause one if left uncorrected.

### 1.2.3 Failure

A failure is an unexpected stoppage of operation. Failure is the consequence of a fault.

### 1.2.4 Reliability

Reliability is the ability of a system or component to perform its intended function under specified conditions for a given period. It is expressed as the probability that the system performs its intended function under given conditions during a specific time period.

"Time" can be measured in different ways:

* Number of cycles completed: for automatic machines.
* Distance traveled: for rolling stock.
* Tonnage produced: for production equipment.

A system is considered reliable if few breakdowns occur. Reliability applies to:

* **Repairable systems:** industrial and household equipment.
* **Non-repairable systems:** lamps and components meant to be disposed of.

The failure behavior of a system can be characterized by its failure rate **λ(t)**, also known as the hazard rate or fatality rate. For a small interval Δt, λ(t)·Δt is approximately the conditional probability that the system fails between time t and t + Δt, given that it survived until t.

### 1.2.5 Maintainability

Maintainability is the ability of a system to be maintained in, or restored to, a state where it can perform its intended function, provided maintenance is carried out under specified conditions using prescribed procedures and resources.

### 1.2.6 Availability

Availability is the probability that a system is functioning properly at a given time. Increasing availability involves minimizing downtime and reducing the time required to resolve problems. Availability, denoted D, is commonly expressed in terms of the mean time between failures (MTBF) and the mean time to repair (MTTR):

$$D = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$$

### 1.2.7 Safety

Safety is the system's ability to prevent catastrophic events from occurring.

## 1.3 Operational Reliability: A System's Ability to Perform Multiple Functions

Operational reliability encompasses four main components: reliability, maintainability, availability, and safety. Understanding these components is essential for implementing appropriate measures to prevent failures and to guarantee the system's intended functions.

- **Diagnosis:** The process of identifying, analyzing, and understanding malfunctions or anomalies in industrial systems. This can include investigating equipment, processes, machines, or production systems.
- **Prognosis:** The process of predicting the future evolution of an industrial system or component after identifying a problem or potential failure.
It involves anticipating the future state of a system based on diagnostic data in order to predict the occurrence of a failure or performance degradation.

### 1.3.1 Understanding Pathological Phenomena: A Strategic Challenge

#### 1.3.1.1 Failures: The Heart of Maintenance

Failures are to maintenance what human pathologies are to medicine: their reason for existence. Every failure results from a rational and explainable pathological mechanism with one or more identifiable causes. It is crucial to diagnose a failure before attempting a corrective repair. The repair should address the root cause so that the fault does not resurface. Without an accurate diagnosis, only temporary fixes are possible, and the issue recurs.

#### 1.3.1.2 Failures: A Source of Improvement!

Failures, despite their negative impact, can offer valuable knowledge. Managing failures effectively turns a negative event into an opportunity for progress, especially in maintenance and design.

* **Proactive Maintenance:** Focuses on preventing failures by learning from past experience and understanding the mechanisms behind them.
* **Building and Enhancing Operational Reliability:** Failures can be leveraged to optimize a system's availability by focusing on:
  1. **Post-Failure Analyses:** These analyze the causes of a failure to improve the availability of a system in production.
  2. **Predictive Analyses:** These anticipate failures during the design and development of a system, to ensure sufficient operational reliability before the system is even manufactured.

- **Understanding the Environment and Mechanisms of Failure:** A thorough investigation of a failure covers six factors:
  1. **Identification and Localization of the Failure:** Identifying it in the organization (by DT number, technician, and nature of fault), as well as its location in time (using counter readings and date and time).
Also, identifying its physical location (by barcode, module, or faulty component) and its functional location (which specific function is not working).
  2. **Detection, Manifestation, and Alarm:** Determining who detected the failure, when, and how, as well as the conditions and sensors involved. Also, understanding how the failure manifested itself (partial or complete, slow or rapid, constant or intermittent).
  3. **Preliminary Information:** Gathering details about the part, its origin, and its references. Also, noting environmental conditions before and during the event, service conditions, and the history of prior interventions on the part.
  4. **Symptoms:** Observing any unusual occurrences before the stoppage. Noting any changes in measurements, variations, perturbations, or abnormal outputs. Conducting visual examinations, inspections, and tests (destructive and non-destructive), as well as any chemical analyses.
  5. **Consequences:** Examining the impact on safety, downtime, product quality, and cost, and categorizing these consequences as minor, major, or critical.
  6. **Causes:** Identifying the root cause of the failure as either:
     * **Extrinsic:** accidents, shocks, overloads, misuse, operator error, failure to follow procedures, lack of safety precautions, unsuitable environment, secondary failures (cascade failures).
     * **Intrinsic:** material defects, design defects, manufacturing or assembly defects, installation defects, wear, abrasion, corrosion, surface fatigue, deformation, rupture, aging.

Documenting these factors after each failure fosters a systematic approach to managing failures. The information compiled should align with the company's existing system and provide a qualitative and quantitative database that complements existing data.

### 1.3.2 Failure Types

- **According to causes:**
  * **Intrinsic**: generated by the system itself under normal operating conditions.
The CEN (European Committee for Standardization) distinguishes between failures due to inadequate design, non-conformance to design or manufacturing procedures, and improper installation.
  * **Extrinsic**: caused by external events, such as improper use, accidental operation, or improper maintenance.
- **According to severity:**
  * **Partial**: the system still performs some of its functions, but its overall performance is reduced.
  * **Complete**: the system no longer performs any of its intended functions.
  * **Intermittent**: the system fails from time to time but works normally between failures.
  * **Permanent**: the system stays failed and cannot be recovered without repair or replacement.
- **According to how fast they develop:**
  * **Sudden**: the failure occurs unexpectedly and cannot be anticipated by any form of inspection or monitoring.
  * **Progressive**: the failure develops gradually over time, allowing for detection and intervention before complete failure.
  * **Catastrophic**: the failure occurs suddenly and results in a complete loss of function, often causing major damage or loss of life. This type of failure is often associated with sudden breakdowns or explosions.

A **catalectic failure** is both sudden and complete. A failure that is both progressive and partial is called a **degradation failure**.

### 1.3.3 Analyzing Failures: A Quantitative Approach

To manage failures effectively, companies need a clear understanding of how their systems behave. This requires collecting and analyzing performance data in a standardized format, which makes it possible to compare systems and to take a proactive approach to failure prevention. This data can be collected through:

* **Equipment Technical Data (DTE):** Provides background information about the system's design and operation.
* **Historical data:** Includes quantitative data, such as the number of breakdowns, time to repair, parts replaced, etc.
* **Qualitative data:** Captured through post-failure analyses, like the details described above.

Statistical analysis of this data yields key indicators:

* **Mean Time Between Failures (MTBF):** The average operating time between two consecutive failures.
* **Mean Time To Repair (MTTR):** The average time needed to restore the system's functionality.
* **Availability (D):** The percentage of time the system is functional.

## 1.4 Tools and Techniques for Monitoring Equipment

The following sections cover specific tools and techniques for monitoring equipment and diagnosing failures, and present the key methods used to safeguard systems in real-world industrial settings.

### 1.4.1 Sensors: The Foundation of Monitoring

Sensors, the primary component of any monitoring system, are devices that detect changes in their surrounding environment and output signals that can be interpreted by humans or control systems. Sensors provide the data required to diagnose failures.

**Types of Sensors:**

1. **Physical Parameters:** These sensors measure physical quantities such as temperature, pressure, and density, for different types of equipment.
2. **Spatial Parameters:** These sensors track locations, positions, and levels, which is essential for monitoring machinery and identifying inconsistencies.
3. **Anomalous Phenomena:** These sensors detect the presence of unwanted elements, such as flames, smoke, or specific substances.
4. **Kinematic Parameters:** These sensors track movement, such as speed and acceleration, which is crucial for detecting issues in machines and systems that move.
5. **Physico-Chemical Parameters:** These sensors measure chemical and electrical quantities, such as pH, conductivity, resistivity, voltage, and current, which matter for industrial processes involving chemical reactions.

### 1.4.2 Analyzing Signals to Detect Failures

Signals transmitted from sensor systems contain raw data.
To identify potential failures, signal processing techniques are used to analyze this data and extract insights. These include:

1. **Signal Preprocessing:** transforms raw signals into a format suitable for analysis, generally by filtering out noise.
2. **Time Domain Analysis:** examines signal variations over time, revealing trends and patterns.
3. **Frequency Domain Analysis:** decomposes signals into their constituent frequencies, helping to identify specific anomalies that might be invisible in the time domain.
4. **Statistical Analysis:** uses the mean, variance, and similar measures to identify significant deviations from normal behavior.

### 1.4.3 Conversion Techniques: Bridging the Gap Between Analog and Digital

The conversion of analog signals (continuous in both time and amplitude) to digital signals (discrete in both time and amplitude) is crucial for monitoring and fault diagnosis. This conversion is commonly performed by analog-to-digital converters (ADCs) in two steps:

* **Sampling:** The ADC captures the signal at regular intervals (the sampling period). The corresponding rate, the **sampling frequency**, determines which signal frequencies can be represented: it must be more than twice the highest frequency present in the signal to avoid aliasing (the Nyquist criterion).
* **Quantization:** The ADC assigns a discrete numerical value to each captured sample, based on a predefined set of levels determined by the converter's number of bits.

This digital representation is essential for:

1. **Digital Signal Processing (DSP):** analyzing and interpreting data using algorithms and calculations.
2. **Data Storage and Transmission:** storing data efficiently and transmitting it over digital networks.

## 1.5 Methods for Analysing and Interpreting Signals

- **Time Domain Analysis:** The simplest and most intuitive approach, this method analyzes raw signals directly for trends and patterns, focusing on:
  1. **Mean Value:** Indicates the average level of the signal. A sudden shift in the mean value might point to an issue.
  2. **Variance:** Measures the spread of data points around the mean. A high variance indicates inconsistent behavior that may be abnormal.
  3. **Root Mean Square (RMS):** Provides an effective value related to the energy content of the signal.
  4. **Kurtosis:** Indicates the degree of peakedness or flatness of the signal's amplitude distribution. High kurtosis suggests a distribution prone to extreme values.
  5. **Crest Factor:** The ratio between the signal's peak value and its RMS value.
  6. **Time Domain Techniques:** Include calculating peak values, durations of events, and time intervals between events, which is crucial for detecting sudden changes or trends.
- **Frequency Domain Analysis:** Uses the Fourier Transform to decompose signals into their individual frequencies, allowing a detailed study of their frequency content.
  1. **Spectral Analysis:** Detects abnormal frequencies or changes in the frequency components of the signal, highlighting deviations that may indicate faults.
  2. **Envelope Analysis:** Extracts the envelope (the outer shape) of a signal within a selected frequency band, exposing low-frequency repetitive patterns. This is particularly useful for detecting mechanical problems such as bearing wear.
- **Time-Frequency Analysis:** Combines time and frequency analysis to show how a signal's frequency content changes over time.

## 1.6 Conclusion

By mastering the techniques outlined in this chapter, companies can detect, analyze, and prevent failures in their industrial systems, leading to increased productivity, reduced downtime, and improved safety. From choosing the right sensors to applying signal analysis methods, these techniques are critical for ensuring the performance and reliability of industrial operations.
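As an illustration of the time-domain indicators listed in Section 1.5 (mean, variance, RMS, kurtosis, crest factor), the following is a minimal Python sketch using only the standard library. The function name `time_domain_features` and the two sample lists are hypothetical and chosen for illustration. The sketch assumes the population form of the variance and the non-excess form of kurtosis (fourth central moment divided by the squared variance); other conventions exist.

```python
import math


def time_domain_features(samples):
    """Return basic time-domain indicators for a list of numeric samples."""
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")

    mean = sum(samples) / n
    # Population variance: average squared deviation from the mean.
    variance = sum((x - mean) ** 2 for x in samples) / n
    # RMS: square root of the mean of the squared samples.
    rms = math.sqrt(sum(x * x for x in samples) / n)
    # Peak: largest absolute amplitude in the record.
    peak = max(abs(x) for x in samples)

    # Kurtosis (non-excess form): fourth central moment / variance squared.
    fourth_moment = sum((x - mean) ** 4 for x in samples) / n
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else float("nan")

    # Crest factor: peak value divided by RMS value.
    crest_factor = peak / rms if rms > 0 else float("nan")

    return {
        "mean": mean,
        "variance": variance,
        "rms": rms,
        "peak": peak,
        "kurtosis": kurtosis,
        "crest_factor": crest_factor,
    }


# Hypothetical samples: a regular alternating signal and one with a sharp impulse.
smooth = [1.0, -1.0, 1.0, -1.0]
impulsive = [0.0, 0.0, 0.0, 4.0]

smooth_features = time_domain_features(smooth)
impulsive_features = time_domain_features(impulsive)

print(smooth_features)
print(impulsive_features)
```

In this toy data the impulsive record has a higher crest factor and kurtosis than the regular one, which is the kind of difference these indicators are meant to expose. Real monitoring would apply the same calculations to much longer sensor records and compare them against a healthy baseline.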

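The indicators from Section 1.3.3 (MTBF, MTTR, and availability D) can likewise be sketched from a failure log. The example below is a rough illustration, not a prescribed method: the `FailureRecord` type, its field names, and the log values are hypothetical, MTBF is taken here as the mean operating time between failures, and availability is computed with the common steady-state relation D = MTBF / (MTBF + MTTR), which is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class FailureRecord:
    """One entry of a hypothetical failure log."""
    uptime_hours: float  # operating time before this failure occurred
    repair_hours: float  # time needed to restore the system afterwards


def mtbf(records):
    """Mean operating time between failures, in hours."""
    return sum(r.uptime_hours for r in records) / len(records)


def mttr(records):
    """Mean time to repair, in hours."""
    return sum(r.repair_hours for r in records) / len(records)


def availability(records):
    """Steady-state availability, assuming D = MTBF / (MTBF + MTTR)."""
    up = mtbf(records)
    down = mttr(records)
    return up / (up + down)


# Hypothetical log of three failures on one machine.
log = [
    FailureRecord(uptime_hours=120.0, repair_hours=4.0),
    FailureRecord(uptime_hours=200.0, repair_hours=6.0),
    FailureRecord(uptime_hours=160.0, repair_hours=2.0),
]

print(f"MTBF = {mtbf(log):.1f} h")
print(f"MTTR = {mttr(log):.1f} h")
print(f"Availability D = {availability(log):.3f}")
```

Keeping uptime and repair time in the same record means each failure contributes to both averages, so shortening repairs or lengthening the time between failures shows up directly as a change in D.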