Methods for evaluating the reliability of a system
RAMS Engineering offers different techniques for calculating the reliability of a product, system or installation. It is important to start this article by explaining that determining the reliability of a piece of equipment is usually not easy, because what we are ultimately trying to do is predict when that equipment will fail. To add a bit of humor to the matter: since when has it been easy to predict the future? Added to this, in our experience, is a starting point with scarce information, short experimentation times and little knowledge of the technology under analysis, which makes reliability evaluation difficult in every respect.
There are several ways to approach the calculation of the reliability or failure rate of a piece of equipment. In this article we present the strategies most used today by the reference sectors of RAMS Engineering, such as Railways and Aeronautics.
The first thing to explain is that reliability is not a deterministic parameter. Reliability must be understood as a probabilistic parameter that attempts to predict how a piece of equipment will fail in the future. Therefore, strictly speaking, reliability cannot be measured accurately and repeatably.
The basic and classical definition of reliability is the probability of providing a specified level of performance for a specified time in a specified environment.
In addition, in many cases the reliability assessment should aim to identify and understand the main contributors to the failure of our system, rather than to calculate exactly when our equipment will fail. This identification allows us to draw up prioritized action plans, supported by quantitative data, for improving the RAMS parameters or indicators of our system.
We also interpret reliability in two different ways, depending on whether we are talking about a single piece of equipment, such as a train or a railway installation, or about the reliability of a product population, where the product is manufactured or deployed in high volume. Let's look at an example. Imagine a product with a mean time between failures (MTBF) of 1,000,000 h and 1,000,000 of these products deployed in a facility. This high MTBF does not mean that each product will operate for 1,000,000 hours, that is, about 114 years, before failing. Rather, it means that, on average and statistically, one of these products will fail every hour at our facility (an MTBF of 1 million hours across a population of 1 million units). So what seemed like a very good reliability value is, in context, not so good.
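The population arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration that assumes a constant failure rate (the exponential model), an assumption the article does not state explicitly; the function names are hypothetical and chosen only for this sketch.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 h

def fleet_failures_per_hour(mtbf_hours: float, population: int) -> float:
    """Expected failures per hour across the whole population,
    assuming every unit has the same constant failure rate 1/MTBF."""
    return population / mtbf_hours

def survival_probability(mtbf_hours: float, mission_hours: float) -> float:
    """R(t) = exp(-t / MTBF): probability that a single unit survives
    mission_hours under the exponential (constant failure rate) model."""
    return math.exp(-mission_hours / mtbf_hours)

mtbf = 1_000_000.0  # hours, from the example in the text
units = 1_000_000   # deployed population, from the example in the text

print(mtbf / HOURS_PER_YEAR)                       # MTBF in years (about 114)
print(fleet_failures_per_hour(mtbf, units))        # 1.0 failure per hour
print(survival_probability(mtbf, HOURS_PER_YEAR))  # one unit, one year
```

Under the same assumption, a single unit is very likely to survive its first year, while the installation as a whole still sees roughly one failure every hour.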
When is it necessary to carry out a reliability assessment?
At Leedeo Engineering, we typically develop reliability calculations for a product or system to meet one of these customer goals:
- Compliance with the reliability or availability objectives required of the system, that is, demonstrating that the reliability or availability requirements are met.
- Selection of the best design. Evaluation of the best implementation of a solution when there are different alternatives for meeting a need.
- Product improvement prioritization. As noted above, knowing the points of unreliability of a system allows improvements to be prioritized on the basis of empirical data, setting aside intuition, hunches, or unclear and imprecise accounts of operating situations.
- Integrated Logistic Support. Spare parts planning, the warranty strategy offered to the customer who buys the product or system, and life cycle cost (LCC) calculations. This also includes forecasting the type of maintenance activities that will be required.
Method 1 for estimating reliability: analysis by similarity
Analysis by similarity uses reliable data (forgive the redundancy) from equipment already in operation to compare the newly designed equipment with its predecessor and thus estimate the reliability of the evolved product. As can be seen, this method is useful for product evolutions: products that are in service, with extensive accumulated experience, and that need to evolve into an improved version or one with new features. If we can identify which part of the design is the same as in the previous version and which part is different, we can apply the information from the previous product to the unchanged part of the design.
This method brings into play the concepts of lessons learned and return of experience. Many companies evolve their products and begin a reliability analysis without taking into account that perhaps 80% of the product has been "tested" in the previous product over the last 20 years. We consider it imperative, in this situation, to use the experience that previous installations and equipment have given us.
Analysis by similarity also shows the importance of block or modular design. If we design, and even manufacture, in modular blocks and reuse those modules in subsequent designs, we will have information about each block: how reliable it is and how it fails. If a new design does not start from a modular design, it is not straightforward to extract information on the reliability of its sub-components from other projects, designs or products. So, again, we have another argument in favor of a neat, organized, modular design with a step-by-step integration strategy, typical of a V-model engineering design.
Analysis by similarity must cover not only the elements specific to the equipment but also the elements external to it. These external conditions must also match if we are to leverage the experience of previous installations and services: temperature and thermal shocks, humidity, mechanical stress, duty cycles, etc.
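As a rough illustration of how field data from reused modules might be combined with estimates for new modules, here is a minimal Python sketch. It assumes constant failure rates, a series system (any module failure counts as a system failure) and an operating environment that matches the predecessor's. The `Module` class, the function names and all the numbers are hypothetical, not data from a real product.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Module:
    name: str
    reused: bool                # True if carried over unchanged from the predecessor
    field_failures: int = 0     # failures observed in service (reused modules only)
    field_hours: float = 0.0    # cumulative operating hours in service
    assumed_rate: float = 0.0   # failures/hour, engineering estimate for new modules

def module_failure_rate(m: Module) -> float:
    """Point estimate of the failure rate in failures per hour."""
    if m.reused:
        if m.field_hours <= 0:
            raise ValueError(f"no field hours recorded for {m.name}")
        return m.field_failures / m.field_hours
    return m.assumed_rate

def system_mtbf(modules: Iterable[Module]) -> float:
    """Series system with constant rates: rates add, MTBF = 1 / total rate."""
    total_rate = sum(module_failure_rate(m) for m in modules)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    return 1.0 / total_rate

# Hypothetical evolved product: two reused modules, one new module.
design = [
    Module("power_supply", reused=True, field_failures=12, field_hours=6_000_000),
    Module("controller", reused=True, field_failures=3, field_hours=6_000_000),
    Module("new_comms_board", reused=False, assumed_rate=2.5e-6),
]

for m in design:
    print(m.name, module_failure_rate(m))
print("estimated system MTBF (h):", system_mtbf(design))
```

The listing also makes visible which modules dominate the total failure rate, which is the kind of prioritization the assessment is meant to support.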
Method 2 for estimating reliability: durability analysis
Durability evaluation is used to estimate the time it takes for a system to fail by running tests, defined as the structured analysis of the response of an item of equipment to the stress resulting from operation, maintenance, transportation, storage and other activities throughout its life cycle.
This strategy usually relies on accelerated test methods, which increase the stress on the equipment under test so that the effective test time is a multiple of the real test time. The typical variables in this type of test are thermal, climatic, mechanical and electrical stresses. This is especially useful when we want to bring equipment to market or put it into service without testing it for a time at least equal to its expected lifetime, which is almost always the case. Because of time-to-market pressures, most industries cannot wait for a period equal to a product's expected lifetime before launching it.
Accelerated tests therefore allow us, within days or weeks, to subject the equipment to wear and use equivalent to years, and so to understand how it degrades and which sub-components or elements are the first to fail. This process allows iterative improvement of the system under analysis: once the weak point or points of the equipment are detected, they can be improved so that they are no longer weak points, and the accelerated test can be run again to detect the next weak point.
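To give a sense of how test time can be "multiplied" under thermal stress, the sketch below uses the Arrhenius model, a commonly used acceleration model for temperature-driven failure mechanisms. The article does not name a specific model, so the choice of Arrhenius is an assumption, and the activation energy and temperatures are purely illustrative. The function names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration_factor(ea_ev: float,
                                  t_use_c: float,
                                  t_stress_c: float) -> float:
    """Acceleration factor AF = exp((Ea / k) * (1/T_use - 1/T_stress)),
    with temperatures converted from degrees Celsius to kelvin."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K)
                    * (1.0 / t_use_k - 1.0 / t_stress_k))

def equivalent_field_hours(test_hours: float, acceleration_factor: float) -> float:
    """Field operating time represented by a given accelerated test time."""
    return test_hours * acceleration_factor

# Illustrative values only: 0.7 eV activation energy, 40 C in use, 85 C in test.
af = arrhenius_acceleration_factor(ea_ev=0.7, t_use_c=40.0, t_stress_c=85.0)
print("acceleration factor:", af)
print("field hours represented by 1,000 test hours:",
      equivalent_field_hours(1_000, af))
```

The result is only as good as the assumed activation energy, and the model applies only while the elevated stress triggers the same failure mechanisms that occur in normal operation.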
In addition to accelerated tests, it is also common to use a stepped (step-stress) test strategy. Stepped tests gradually increase the severity of the conditions in which the system works (again through thermal, climatic, mechanical and electrical stresses) while observing the response of the equipment under analysis. Step by step, the equipment is taken outside its operating range in order to analyze which failures occur, identify the weakest parts of the design and improve the reliability of the equipment by improving those parts.
Method 3 for estimating reliability: theoretical predictions
Theoretical reliability estimation is based on information typically provided by the manufacturers of the components we buy and use. In most cases these manufacturers have previously produced a reliability estimate based on Method 1 or Method 2. Method 3, theoretical prediction, therefore depends on one of the previous methods.
For this method, we will normally have to look up the MTTF or MTBF values of the components we buy in their data sheets. In other cases these values are not fixed but depend on the equipment's operating variables, and the supplier may present them as a mathematical equation that models reliability as a function of a variable.
For example, a leading manufacturer of capacitors publishes MTBF information for its components, as does a manufacturer of small stepper motors.
Once we have the information on the reliability of our components, we can carry out the theoretical analysis by applying any of the known strategies, typically a Reliability Block Diagram (RBD).
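A Reliability Block Diagram calculation can be sketched as follows. This is a minimal example that assumes independent blocks with constant failure rates, so that each block's reliability is R(t) = exp(-t / MTBF). The MTBF values and the architecture (one controller in series with two redundant power supplies) are hypothetical, as are the function names.

```python
import math
from typing import Iterable

def reliability(mtbf_hours: float, mission_hours: float) -> float:
    """Block reliability over the mission time, exponential model."""
    return math.exp(-mission_hours / mtbf_hours)

def series(blocks: Iterable[float]) -> float:
    """Series arrangement: all blocks must work, so reliabilities multiply."""
    return math.prod(blocks)

def parallel(blocks: Iterable[float]) -> float:
    """Active redundancy: the group fails only if every block fails."""
    return 1.0 - math.prod(1.0 - r for r in blocks)

mission = 8760.0  # one year of continuous operation, in hours

# Hypothetical data-sheet MTBF values.
r_controller = reliability(200_000.0, mission)
r_psu = reliability(100_000.0, mission)

# Controller in series with a redundant pair of power supplies.
r_system = series([r_controller, parallel([r_psu, r_psu])])

print("controller:", r_controller)
print("single PSU:", r_psu)
print("redundant PSU pair:", parallel([r_psu, r_psu]))
print("system:", r_system)
```

Comparing the single power supply with the redundant pair shows how an RBD lets alternative architectures be evaluated before committing to a design.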
Leedeo Engineering specializes in reliability calculations for products, systems and installations. With the best support software on the market, we have experience in the rail, aeronautical, automotive, defense and energy industries. Do not hesitate to contact us if you need to carry out the reliability calculation of any of your equipment or facilities.