The 4 dimensions of a failure according to CENELEC EN 50126

16/10/2020

The CENELEC EN 50126 standard, the reference framework for RAMS in the railway industry (RAMS Engineering), places strong emphasis on the concept, analysis, and management of "failure". That is to say, the integral and holistic management of the loss of a product's, system's, or installation's ability to function as required.

The concept of "failure" is described as an event in 4 dimensions, which we consider relevant when analysing and mitigating failures, typically in FMECA and risk analysis studies (PHA and Hazard Log). By dimensions, we mean different classifications of failures according to different categories which, used together, allow a failure to be fully defined. Below we describe the 4 categories or dimensions used to define a "failure".

What is the nature of the failure? Random failure or systematic failure

The nature of a failure refers to whether it is random or, conversely, systematic.

A random failure is an unforeseeable failure resulting from one or more of the possible degradation mechanisms of a system or installation. Typically, it is modelled with a constant failure rate following the exponential model, or with a progressively increasing failure rate following the Weibull model. This type of modelling makes it possible to define an MTBF (Mean Time Between Failures), i.e. the average time between occurrences of the failure. These failures are usually caused by unsuitable environmental conditions, degradation due to excessive stress, wear, or ageing of components and equipment.
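As a minimal sketch of the two models mentioned above, the following Python functions compute reliability and mean life for the exponential and Weibull cases. The function names and the numeric values in the usage lines are hypothetical, chosen only for illustration; they do not come from EN 50126.

```python
import math


def exponential_reliability(t, failure_rate):
    """Probability of surviving up to time t with a constant failure rate."""
    return math.exp(-failure_rate * t)


def exponential_mtbf(failure_rate):
    """With a constant failure rate, the MTBF is its reciprocal."""
    return 1.0 / failure_rate


def weibull_reliability(t, shape, scale):
    """Probability of surviving up to time t under a Weibull model."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate at time t; it increases with t when shape > 1."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_mean_life(shape, scale):
    """Mean time to failure of a Weibull distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)


# Hypothetical values: 1e-5 failures per hour, Weibull shape 2 and scale 1000 h.
print(exponential_mtbf(1e-5))
print(exponential_reliability(100_000, 1e-5))
print(weibull_hazard(1000, 2.0, 1000.0), weibull_hazard(2000, 2.0, 1000.0))
print(weibull_mean_life(2.0, 1000.0))
```

With a shape parameter of 1 the Weibull model reduces to the exponential model, which is why the constant-rate case is often treated as a special case of the Weibull one.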

A systematic failure invariably occurs under specific conditions of handling, storage, or use. One characteristic of systematic failures, as opposed to random failures, is that they can be reproduced by deliberately applying the same conditions that generated them. Systematic failures are considered to originate from the people and processes involved during the product life cycle, whether in the specification, design, manufacture, installation, operation or maintenance phase of the equipment, system, or facility. Examples include: missing requirements in the specification, incorrect or non-robust design, manufacturing quality deficiencies or problems, software errors, incorrect installation, incorrect maintenance, etc.

As can be deduced, knowing the nature of the failure is very important, since it allows us to identify the origin of the failure, how we should fix it, whether the failure was predictable or acceptable, whether we have a quality problem in our design or manufacturing phases, whether a problem in one of our components or equipment could generate a widespread failure across a large population of our equipment, etc.

Random failures are due to situations that can be statistically modelled or controlled in order to predict or estimate their probability of occurrence. Statistical or probabilistic treatment does not make sense for systematic failures.

What is the source of the failure?

The source of the failure must be classified into one of the following categories: internal, external during operation, or external during maintenance activities. That is to say:

Internal failure (which will be either systematic or random).
Source of failure imposed on the system during its operation. In other words, external disturbances due to conditions in the environment: under what conditions must the system carry out its tasks, and in what physical environment? Examples include intensity of service, human errors (human factor), poorly described procedures, poor logistics, and coexistence with existing equipment.
Source of failure imposed on the system during maintenance activities: human errors (human factor), poorly described procedures for both preventive and corrective maintenance, and poor logistics.

Does it affect safety or availability?

Clearly, one of the most relevant characteristics of a failure is whether it takes the product, system or installation out of service, whether it creates an unsafe situation, or whether it has no effect on either reliability or safety. As we can imagine, failures that affect safety must be mitigated. For failures that affect availability, however, an analysis must be carried out to see whether their level of effect means that the reliability, and therefore availability, levels agreed with the customer are not met. Failures affecting only reliability are therefore normally handled in terms of compliance with requirements, balanced against economic considerations.
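The availability check described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the common steady-state formula MTBF / (MTBF + MTTR), and the MTTR (mean time to restore), the function names, and the numeric values are hypothetical and not taken from the article or from a specific contract.

```python
def steady_state_availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to restore."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def meets_availability_target(mtbf_hours, mttr_hours, target):
    """True when the estimated availability reaches the agreed target."""
    return steady_state_availability(mtbf_hours, mttr_hours) >= target


# Hypothetical figures: MTBF of 10,000 h, MTTR of 4 h.
availability = steady_state_availability(10_000, 4)
print(availability)
print(meets_availability_target(10_000, 4, target=0.999))
print(meets_availability_target(10_000, 4, target=0.9999))
```

A failure mode that pushes the estimate below the agreed target would then call for design or maintenance changes, weighed against their cost.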

A typical example of a failure that affects neither safety nor availability is when an LED indicating that the system is working properly, or that power is being supplied to the equipment, stops working. A priori, this failure need not affect either reliability or availability. Even so, it should be corrected as soon as possible.

How severe is it?

Typically, four levels of severity are defined, focusing on the effects on people and the environment:

Catastrophic: Fatalities and/or multiple serious injuries. Extreme damage to the environment.
Critical: A single fatality and/or serious injury. Significant damage to the environment.
Marginal: Minor injuries. Minor damage to the environment.
Insignificant: Possible slight injury.
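One way to use the four dimensions together, for example as a row in an FMECA or Hazard Log, is a small record type. The sketch below is an assumption about how such a record might look, not a structure defined by EN 50126; all class and field names are hypothetical, and the rule in requires_mitigation only reflects the statement above that failures affecting safety must be mitigated.

```python
from dataclasses import dataclass
from enum import Enum


class Nature(Enum):
    RANDOM = "random"
    SYSTEMATIC = "systematic"


class Source(Enum):
    INTERNAL = "internal"
    EXTERNAL_OPERATION = "external, during operation"
    EXTERNAL_MAINTENANCE = "external, during maintenance"


class Effect(Enum):
    SAFETY = "safety"
    AVAILABILITY = "availability"
    NEITHER = "neither"


class Severity(Enum):
    CATASTROPHIC = 4
    CRITICAL = 3
    MARGINAL = 2
    INSIGNIFICANT = 1


@dataclass(frozen=True)
class FailureRecord:
    """A failure classified along the four dimensions (hypothetical layout)."""
    description: str
    nature: Nature
    source: Source
    effect: Effect
    severity: Severity

    def requires_mitigation(self) -> bool:
        # Failures that affect safety must always be mitigated.
        return self.effect is Effect.SAFETY


# The status LED example; treating it as a random, internal failure is an assumption.
led = FailureRecord(
    description="Status LED stops working",
    nature=Nature.RANDOM,
    source=Source.INTERNAL,
    effect=Effect.NEITHER,
    severity=Severity.INSIGNIFICANT,
)
print(led.requires_mitigation())
```

Keeping each dimension as its own enumeration makes it harder to record a failure with one of the four classifications missing.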

At Leedeo Engineering, we are specialists in the development of railway RAMS projects, applying the CENELEC standards EN 50126, EN 50129 and EN 50128 and EU Implementing Regulation 402/2013 with the Common Safety Methods (CSM-RA), and providing any level of support required for RAM and Safety tasks in the development and certification of safety products and applications.