Advanced RAM Analysis Methods: Markov Models



For complex systems used in the railway industry, the basic methods of reliability, availability and maintainability (RAM) analysis (FMECA, RBD, FTA, etc.) may not always be sufficient to support subsequent decision making. In this chapter we present one of the most widely used advanced techniques for modeling complex systems: Markov models. They are named in honor of Andrei Markov, a Russian mathematician born in 1856.

The Markov Models

When applied in the context of reliability and safety analysis, Markov models provide a quantitative technique for describing the temporal evolution of a system in terms of a set of discrete states and the transitions between them, under the assumption that the future state of the system does not depend on its state at any time in the past, but only on its present state. Markov models are particularly useful for analyzing redundant systems, as well as systems where the occurrence of system failures depends on the order in which individual component failures occur. That is, in a summarized and simplified way:

  • The probability of being in a specific state at a future time t depends only on the current state of the system, and not on the states that the system has occupied before.
  • It is a "memoryless" process, where the next state of the process depends only on the current state and not on the sequence of states that preceded it.
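
Stated a little more formally, and using notation that is introduced here only for illustration (X(t) for the state of the system at time t, and i, j for individual states), the property described in the two points above can be written as:

```latex
P\bigl(X(t+s) = j \mid X(t) = i,\; X(u) = x_u \ \text{for}\ 0 \le u < t\bigr)
  = P\bigl(X(t+s) = j \mid X(t) = i\bigr)
```

In words: once the present state i is known, the earlier history x_u adds no information about where the system will be s time units later.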

In general, a system can be described by means of a Markov process as long as the failure and repair times of its constituent components are independent and exponentially distributed (the exponential model, with its constant failure rate, is widely used in the railway industry).
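
The reason the exponential distribution fits the Markov assumption is that it has no memory: a component that has already survived some time is as likely to survive the next interval as a new one. The short sketch below checks this numerically. The failure rate and the time values are hypothetical figures chosen only for illustration.

```python
import math


def survival(rate: float, t: float) -> float:
    """P(T > t) for an exponentially distributed time T with the given rate."""
    return math.exp(-rate * t)


def conditional_survival(rate: float, age: float, t: float) -> float:
    """P(T > age + t | T > age): chance of surviving t more hours at a given age."""
    return survival(rate, age + t) / survival(rate, age)


FAILURE_RATE = 1e-4  # failures per hour; hypothetical value for illustration only

# Probability that a new component survives 1000 hours.
fresh = survival(FAILURE_RATE, 1000.0)
# Probability that a component already 5000 hours old survives 1000 more hours.
aged = conditional_survival(FAILURE_RATE, 5000.0, 1000.0)

# Apart from floating-point rounding, the two values coincide.
print(fresh, aged)
```

A distribution with wear-out behavior would not pass this check, which is why it cannot be used directly in a Markov model.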

Other types of distributions can be approximated with a Markov process; however, this requires an appropriate expansion of the state space, which significantly increases the number of states.

We normally recommend performing the following steps when developing a Markov model:

  • Defining the objectives of the analysis, for example, whether transient or stationary measures are of interest, and identifying the relevant properties of the system and its boundary conditions, for example, whether the system is repairable or non-repairable.
  • Verification that Markov modeling is adequate for the problem in question. This includes checking whether the random failure and repair times of system components can in fact be assumed independent and exponentially distributed.
  • Identification of the states required to describe the system for the purposes of the analysis, for example, the various ways in which a system may fail and the ways in which it can recover from a failed state; in addition, system failures due to common causes may require additional failure states.
  • Verification of whether the system states can be combined to simplify the model without affecting the results. This may be the case if the system exhibits certain symmetries: for example, it may be sufficient to distinguish between the number of identical components in operation, rather than distinguishing each of the individual functional components.
  • Establishment of the transition matrix based on the identified system states; evaluation of the model according to the objective of the analysis; and interpretation and documentation of the results.
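
As a minimal sketch of the last step, consider the simplest repairable case: a single component with two states, working (0) and failed (1). The example below sets up its transition rate matrix and evaluates both a transient and a stationary measure. The rates `LAM` and `MU` are hypothetical values, and the closed-form availability expression applies only to this two-state model with the component working at t = 0.

```python
import math


def generator_matrix(lam: float, mu: float) -> list[list[float]]:
    """Transition rate matrix Q for states 0 (working) and 1 (failed)."""
    return [
        [-lam, lam],  # working -> failed at rate lam
        [mu, -mu],    # failed -> working at rate mu
    ]


def availability(lam: float, mu: float, t: float) -> float:
    """P(state 0 at time t), starting in state 0 at t = 0 (closed-form solution)."""
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t)


LAM = 1e-3  # failures per hour (hypothetical)
MU = 1e-1   # repairs per hour (hypothetical)

Q = generator_matrix(LAM, MU)

# Transient measure: availability 10 hours after start-up.
transient = availability(LAM, MU, 10.0)
# Stationary measure: long-run availability, the limit of the expression above.
stationary = MU / (LAM + MU)

print(Q)
print(transient, stationary)
```

Each row of the matrix sums to zero, which is a quick consistency check worth applying to any transition rate matrix before evaluating the model.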

It is important to always keep in mind that a model is a way of abstracting the real world so that static and dynamic interrelationships are represented. With an appropriate model of a real-world situation, we should be able to predict certain outcomes or determine how the real world would behave if we implemented a particular alternative decision. Model building consists of identifying the important variables and relationships and then translating a perception of the real world into a model that is manageable, ideally also in computational terms.

Markov models are used to describe the deterioration and critical damage of railroad assets and to investigate the effects of an asset management strategy on a section of track. Other uses include modeling the reliability and availability of rolling stock under complex maintenance strategies or the reliability and safety of rail transport systems.

The strengths of Markov modeling include the ability to describe complex redundancies and multi-state dynamic systems, providing various transient or stationary measures as results. The supporting state transition diagrams provide a simple and intuitive means of visualizing and communicating the structure of the model.

Practical example: a 2oo3 system

Consider a system of three identical and independent channels, each with a failure rate λ, arranged in a 2oo3 (two-out-of-three) voting architecture, typical of a high-availability railway interlocking. Since the channels are identical, it is enough to define four states, numbered 0 to 3 according to the number of failed channels. The system functions correctly as long as it remains in state 0 or 1.

Knowing the failure rate λ and the recovery rate µ, we can calculate the state probabilities P0, P1, P2 and P3 and, from them, the probability that the system continues to function correctly (P0 + P1) and the probability that it stops working properly (P2 + P3).
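
The calculation can be sketched as follows for the stationary case. The article does not specify the repair policy, so this sketch assumes that state k means k failed channels, that repair is possible from every state, and that one channel is repaired at a time at rate µ. With independent repairs the repair rates would be µ, 2µ and 3µ instead. The numerical values of `LAM` and `MU` are hypothetical.

```python
def steady_state_2oo3(lam: float, mu: float) -> list[float]:
    """Stationary probabilities [P0, P1, P2, P3]; state k = k failed channels."""
    # Failure transitions k -> k+1: each of the (3 - k) working channels
    # fails at rate lam.
    up_rates = [3 * lam, 2 * lam, lam]
    # Repair transitions k+1 -> k: ASSUMPTION, one repair at a time at rate mu.
    down_rates = [mu, mu, mu]

    # Balance between neighboring states: P[k+1] * down_rates[k] = P[k] * up_rates[k].
    weights = [1.0]
    for up, down in zip(up_rates, down_rates):
        weights.append(weights[-1] * up / down)

    # Normalize so that the four probabilities sum to 1.
    total = sum(weights)
    return [w / total for w in weights]


LAM = 1e-4  # channel failure rate per hour (hypothetical)
MU = 1e-1   # repair rate per hour (hypothetical)

p0, p1, p2, p3 = steady_state_2oo3(LAM, MU)
p_working = p0 + p1  # at least two channels available
p_failed = p2 + p3   # fewer than two channels available

print(f"working: {p_working:.9f}, failed: {p_failed:.3e}")
```

Because transitions only occur between neighboring states, the stationary probabilities follow directly from the balance equations and no general linear solver is needed. A model with common cause failures or other cross-state transitions would require solving the full set of equations.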

At Leedeo Engineering, we are specialists in the development of RAMS projects, supporting RAM and Safety tasks at any level required, for both infrastructure and on-board equipment.
