
Using Markov Diagrams in BlockSim for Reliability Analysis



Named after the Russian mathematician Andrey Markov, Markov chains are used across a broad range of applications to represent a "memoryless" stochastic process. Such a process is a sequence of random variables that describes how a system evolves through various states. "Memoryless," also called the Markov property, means that the probability of being in a given state at the next step depends only on the current state and not on any information from earlier steps. This article presents a way of using Markov chains in BlockSim 10 (if this feature is supported by your license).

When Markov chains are used in reliability analysis, the process usually represents the various stages (states) that a system can be in at any given time. The states are connected via transitions that represent the probability, or rate, at which the system will move from one state to another during a step or a given time. A Markov chain that uses probabilities and steps is referred to as a discrete Markov chain, while one that uses rates and the time domain is referred to as a continuous Markov chain. In this article we will limit ourselves to discrete Markov chains.

In a discrete Markov chain, we have to define each possible state that the system can be in at any given time, and also the transition probabilities per step that link the states together. The steps can represent time, but they do not have to. Lastly, we must also define the initial state probabilities that give us the starting point(s) of the system.

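Mathematically, we can represent the initial state probabilities as a row vector, where $X_i$ is the initial probability of being in state $i$ and $m$ is the number of states:

$$X(0) = \begin{pmatrix} X_1 & X_2 & \cdots & X_m \end{pmatrix}$$

The transitions between the states can be represented by a matrix:

$$P = \begin{pmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{pmatrix}$$

where, for example, the term $P_{12}$ is the transition probability from state 1 to state 2. Each row of the matrix sums to 1, because from any state the system must either stay where it is or move to some other state.

Then if we want to know the probability of being in a particular state after $n$ steps, we can use the Chapman-Kolmogorov equation to arrive at the following equation:

$$X(n) = X(0) \cdot P^n$$

where $X(n)$ is the vector of state probabilities after $n$ steps, $X(0)$ is the vector of initial state probabilities and $P$ is the transition probability matrix. Using this methodology, we can find the point probability of being in a state at each step and from there also calculate the mean probability of being in a state over a certain number of steps.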
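As a rough illustration of the n-step calculation described above, the following Python sketch multiplies a row vector of state probabilities by a transition matrix once per step. The function names and the two-state chain are hypothetical and chosen only for illustration; they are not part of BlockSim and do not come from the example in this article.

```python
def step(x, P):
    """Advance one step: multiply row vector x by transition matrix P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]


def state_probabilities(x0, P, n_steps):
    """Return the list [X(0), X(1), ..., X(n_steps)] of state probability vectors."""
    history = [list(x0)]
    for _ in range(n_steps):
        history.append(step(history[-1], P))
    return history


# Hypothetical two-state chain (illustration only, not from the article).
# Row i holds the probabilities of moving from state i to each state.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]
x0 = [1.0, 0.0]  # start in state 1 with certainty

history = state_probabilities(x0, P, 3)
for n, x in enumerate(history):
    print(n, x)
```

Applying the one-step update repeatedly gives the same result as multiplying the initial vector by the matrix raised to the n-th power, and it also yields the point probabilities at every intermediate step.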


In BlockSim 10, we are doing an initial estimation analysis on the life cycle of a complex drilling system that starts off as brand new (100% initial probability in the full capacity state). The system has a probability to degrade into various states of capacity with time and can eventually enter a salvage state. There is also a probability of being returned to the as-good-as-new condition from each degraded state, except from the salvage state. The salvage state is considered to be a "sink," a state from which there are no transitions to any other state and therefore we have zero probability of leaving. We want to determine, on average, what percent of the time will be spent in each state over a 10-year period. To perform the analysis we will use a discrete Markov chain diagram. Our initial setup looks like this:

Initial Markov Diagram setup

We estimate the following probabilities per month to move between states:

  1. 1% chance to degrade from 100% to 80% capacity.
  2. 10% chance to be restored from 80% to 100% capacity.
  3. 3% chance to degrade from 80% to 60% capacity.
  4. 8% chance to be restored from 60% to 100% capacity.
  5. 6% chance to degrade from 60% to 40% capacity.
  6. 5% chance to be restored from 40% to 100% capacity.
  7. 8% chance to degrade from 40% capacity to salvage.

Based on these percentages, the final diagram that is ready for analysis looks like this:

Final Markov Diagram setup

Since our estimated probabilities are on a month scale, we will take each step of the analysis to be the equivalent of one month. This means that we will run our calculation for 120 steps. After we calculate the diagram, we can see that the transition probability matrix between the states looks like this (which we can easily use to verify our inputs):

Transition Probability Matrix
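For readers who want to cross-check the inputs outside of BlockSim, the matrix can be sketched in Python from the seven monthly probabilities listed earlier. This sketch assumes that the probability of staying in a state is whatever remains after subtracting the outgoing transitions, so that each row sums to 1. The names `STATES`, `TRANSITIONS` and `build_matrix` are hypothetical and are not BlockSim identifiers.

```python
# State order used for rows and columns (labels are illustrative).
STATES = ["100%", "80%", "60%", "40%", "Salvage"]

# Monthly transition probabilities taken from the list in the article.
TRANSITIONS = {
    ("100%", "80%"): 0.01,
    ("80%", "100%"): 0.10,
    ("80%", "60%"): 0.03,
    ("60%", "100%"): 0.08,
    ("60%", "40%"): 0.06,
    ("40%", "100%"): 0.05,
    ("40%", "Salvage"): 0.08,
}


def build_matrix(states, transitions):
    """Build a row-stochastic matrix; the diagonal gets the leftover probability."""
    index = {name: i for i, name in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for (src, dst), prob in transitions.items():
        P[index[src]][index[dst]] = prob
    for i in range(len(states)):
        # Assumption: probability of staying = 1 - sum of outgoing probabilities.
        P[i][i] = 1.0 - sum(P[i])
    return P


P = build_matrix(STATES, TRANSITIONS)
for name, row in zip(STATES, P):
    print(name, [round(p, 2) for p in row])
```

The salvage row ends up with a probability of 1 on its own diagonal entry, which is what makes it a sink.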

We can use the state point probability plot to see if our system has reached steady state within our time frame.

State Point Probability plot

In this example, because we have a "sink" state, we do not reach steady state, where all the probabilities have reached a constant value, but rather a pseudo-steady state where the probabilities are changing at a roughly constant rate.

Afterwards, we can check the results summary to determine the mean probabilities in each state and the point probabilities after 120 steps (10 years).

Results summary
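The point and mean probabilities can be approximated independently with a short Python sketch. It assumes the same leftover-probability convention for staying in a state, and it averages over steps 1 through 120; the averaging convention BlockSim uses may differ, so small discrepancies from the results summary are possible. All variable names are hypothetical.

```python
# Transition matrix in the order: 100%, 80%, 60%, 40%, Salvage.
# Diagonal entries are assumed to be the remainder of each row.
P = [
    [0.99, 0.01, 0.00, 0.00, 0.00],
    [0.10, 0.87, 0.03, 0.00, 0.00],
    [0.08, 0.00, 0.86, 0.06, 0.00],
    [0.05, 0.00, 0.00, 0.87, 0.08],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]
N_STATES = len(P)
N_STEPS = 120  # one step per month over 10 years

x = [1.0, 0.0, 0.0, 0.0, 0.0]  # brand new: 100% probability of full capacity
totals = [0.0] * N_STATES
salvage_history = []

for _ in range(N_STEPS):
    # One step of the chain: row vector times matrix.
    x = [sum(x[i] * P[i][j] for i in range(N_STATES)) for j in range(N_STATES)]
    totals = [t + p for t, p in zip(totals, x)]
    salvage_history.append(x[4])

point = x                              # point probabilities after 120 steps
mean = [t / N_STEPS for t in totals]   # mean probabilities over steps 1..120

print("point:", [round(p, 4) for p in point])
print("mean: ", [round(p, 4) for p in mean])
```

Because the salvage state only gains probability at each step, `salvage_history` never decreases, which is the behavior behind the pseudo-steady state seen in the plot.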


From the results we can conclude that the majority of the time (89.4%) our system should be running at 100% capacity, and that at the end of the 10-year period there is about a 5.3% chance that the system will have degraded to a point from which it cannot be restored (the salvage state).