
Using Markov Diagrams in BlockSim for Reliability Analysis
Invented by Russian mathematician Andrey Markov, Markov chains are used across a broad range of applications to represent a "memoryless" stochastic process. Such a process is made up of random variables that describe how the process evolves through various states. "Memoryless," also called the Markov property, means that the probability of being in a given state at the next step depends only on the information present in the current step, not on information from any earlier step. This article presents a way of using Markov chains in BlockSim 10 (if this feature is supported by your license).
When Markov chains are used in reliability analysis, the process usually represents the various stages (states) that a system can be in at any given time. The states are connected via transitions that represent the probability, or rate, at which the system will move from one state to another during a step or a given time. When probabilities and steps are used, the Markov chain is referred to as a discrete Markov chain, while a Markov chain that uses rates and the time domain is referred to as a continuous Markov chain. In this article we will limit ourselves to discrete Markov chains.
In a discrete Markov chain, we have to define each possible state that the system can be in at any given time, and also the transition probabilities per step that link the states together. The steps can represent time, but they do not have to. Lastly, we must also define the initial state probabilities that give us the starting point(s) of the system.
Mathematically, we can represent the initial state probabilities as a row vector X(0) over the m states of the system, such that X_{i} represents the initial probability of being in state i:

$$X(0) = \begin{bmatrix} X_{1} & X_{2} & \cdots & X_{m} \end{bmatrix}$$

The transitions between the states can be represented by a matrix P:

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}$$

where, for example, the term P_{12} is the transition probability from state 1 to state 2.
Then, if we want to know the probability of being in a particular state after n steps, we can use the Chapman-Kolmogorov equation to arrive at the following equation:

$$X(n) = X(0) \cdot P^{n}$$

where X(n) is the vector that represents the probability of being in each state after n steps. Using this methodology, we can find the point probability of being in a state at each step and from there also calculate the mean probability of being in a state over a certain number of steps.
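The n-step calculation described above amounts to multiplying the state probability vector by the transition matrix once per step. A minimal Python sketch of that idea follows; the function names `step` and `point_probabilities` and the two-state chain are hypothetical and chosen only for illustration, and they say nothing about how BlockSim performs the calculation internally.

```python
def step(x, P):
    """Advance a state probability row vector x by one step: x' = x P."""
    m = len(P)
    return [sum(x[i] * P[i][j] for i in range(m)) for j in range(m)]


def point_probabilities(x0, P, n):
    """Return the state probabilities after n steps by applying step() n times."""
    x = list(x0)
    for _ in range(n):
        x = step(x, P)
    return x


# Hypothetical two-state chain, for illustration only (not from the article).
P = [
    [0.9, 0.1],  # from state 1: stay with 0.9, move to state 2 with 0.1
    [0.5, 0.5],  # from state 2: return to state 1 with 0.5, stay with 0.5
]
x0 = [1.0, 0.0]  # start in state 1 with certainty

x2 = point_probabilities(x0, P, 2)
print(x2)  # approximately [0.86, 0.14]
```

Applying the one-step update repeatedly avoids forming the matrix power explicitly and yields the point probabilities at every intermediate step as a by-product.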
Example
In BlockSim 10, we are doing an initial estimation analysis on the life cycle of a complex drilling system that starts off as brand new (100% initial probability in the full capacity state). The system has a probability to degrade into various states of capacity with time and can eventually enter a salvage state. There is also a probability of being returned to the as-good-as-new condition from each degraded state, except from the salvage state. The salvage state is considered to be a "sink," a state from which there are no transitions to any other state and therefore we have zero probability of leaving. We want to determine, on average, what percent of the time will be spent in each state over a 10-year period. To perform the analysis we will use a discrete Markov chain diagram. Our initial setup looks like this:
We estimate the following probabilities per month to move between states:
 1% chance to degrade from 100% to 80% capacity.
 10% chance to be restored from 80% to 100% capacity.
 3% chance to degrade from 80% to 60% capacity.
 8% chance to be restored from 60% to 100% capacity.
 6% chance to degrade from 60% to 40% capacity.
 5% chance to be restored from 40% to 100% capacity.
 8% chance to degrade from 40% capacity to salvage.
Based on these percentages, the final diagram that is ready for analysis looks like this:
Since our estimated probabilities are on a month scale, we will take each step of the analysis to be the equivalent of one month. This means that we will run our calculation for 120 steps. After we calculate the diagram, we can see that the transition probability matrix between the states looks like this (which we can easily use to verify our inputs):
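As a way to cross-check the inputs outside the software, the transition probability matrix can be rebuilt from the monthly probabilities listed above. The sketch below is a hand-rolled reconstruction, not BlockSim output; it assumes that the probability of remaining in a state is whatever is left over after the listed transitions, and the state labels are illustrative names.

```python
states = ["100%", "80%", "60%", "40%", "Salvage"]

# Monthly transition probabilities as listed in the text: (from, to) -> probability.
transitions = {
    ("100%", "80%"): 0.01,
    ("80%", "100%"): 0.10,
    ("80%", "60%"): 0.03,
    ("60%", "100%"): 0.08,
    ("60%", "40%"): 0.06,
    ("40%", "100%"): 0.05,
    ("40%", "Salvage"): 0.08,
}

index = {name: k for k, name in enumerate(states)}
m = len(states)
P = [[0.0] * m for _ in range(m)]
for (src, dst), p in transitions.items():
    P[index[src]][index[dst]] = p

# Assumption: the probability of staying put is the remainder of each row,
# so that every row sums to 1. The sink state therefore stays put with 1.0.
for i in range(m):
    P[i][i] = 1.0 - sum(P[i][j] for j in range(m) if j != i)

for name, row in zip(states, P):
    print(f"{name:>8}: " + "  ".join(f"{p:.2f}" for p in row))
```

Each row should sum to 1, and the salvage row should have a 1 on the diagonal and zeros elsewhere, which is what makes it a sink.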
We can use the state point probability plot to see if our system has reached steady state within our time frame.
In this example, because we have a "sink" state, we do not reach steady state, where all the probabilities have reached a constant value, but rather a pseudo-steady state where the probabilities are changing at a roughly constant rate.
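The pseudo-steady behavior can also be seen numerically by tracking how much the salvage probability grows per step at different points in the run. The sketch below is an independent illustration rather than BlockSim output; the matrix is an assumed reconstruction from the monthly probabilities given earlier (with each state's remaining probability placed on the diagonal), and the sampled steps are arbitrary choices.

```python
# Transition matrix assumed from the monthly probabilities listed earlier;
# rows and columns are ordered 100%, 80%, 60%, 40%, Salvage.
P = [
    [0.99, 0.01, 0.00, 0.00, 0.00],
    [0.10, 0.87, 0.03, 0.00, 0.00],
    [0.08, 0.00, 0.86, 0.06, 0.00],
    [0.05, 0.00, 0.00, 0.87, 0.08],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]


def step(x, P):
    """Advance a state probability row vector x by one step: x' = x P."""
    m = len(P)
    return [sum(x[i] * P[i][j] for i in range(m)) for j in range(m)]


x = [1.0, 0.0, 0.0, 0.0, 0.0]  # brand new: 100% probability of full capacity
salvage = [x[4]]  # salvage[n] = probability of being in salvage after n steps
for _ in range(120):
    x = step(x, P)
    salvage.append(x[4])

# Per-step growth of the salvage probability at a few sample points.
growth = {n: salvage[n] - salvage[n - 1] for n in (12, 60, 100, 120)}
for n, g in growth.items():
    print(f"step {n:3d}: salvage probability grew by {g:.6f}")
```

If the per-step growth late in the run is nonzero but nearly the same from one sample to the next, the chain has not reached a true steady state, yet the probabilities are drifting at a roughly constant rate.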
Afterwards, we can check the results summary to determine the mean probabilities in each state and the point probabilities after 120 steps (10 years).
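For readers who want to reproduce this kind of summary by hand, the sketch below computes point probabilities after 120 steps and mean probabilities over the run. It is an approximation of the idea, not of BlockSim's implementation: the matrix is an assumed reconstruction from the monthly probabilities given earlier, and the mean is assumed to be a simple average of the point probabilities over steps 1 through 120, which may differ from the averaging convention the software uses.

```python
states = ["100%", "80%", "60%", "40%", "Salvage"]

# Transition matrix assumed from the monthly probabilities listed earlier;
# rows and columns follow the order of `states`.
P = [
    [0.99, 0.01, 0.00, 0.00, 0.00],
    [0.10, 0.87, 0.03, 0.00, 0.00],
    [0.08, 0.00, 0.86, 0.06, 0.00],
    [0.05, 0.00, 0.00, 0.87, 0.08],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]


def step(x, P):
    """Advance a state probability row vector x by one step: x' = x P."""
    m = len(P)
    return [sum(x[i] * P[i][j] for i in range(m)) for j in range(m)]


n_steps = 120  # one step per month over 10 years
x = [1.0, 0.0, 0.0, 0.0, 0.0]  # brand new system
history = []  # history[k] holds the point probabilities after k + 1 steps
for _ in range(n_steps):
    x = step(x, P)
    history.append(x)

point = history[-1]
# Assumption: mean probability = average of point probabilities over steps 1..120.
mean = [sum(h[j] for h in history) / n_steps for j in range(len(states))]

for name, pt, mn in zip(states, point, mean):
    print(f"{name:>8}: point after {n_steps} steps = {pt:.4f}, mean = {mn:.4f}")
```

If the assumptions above match the diagram, the printed values should land close to the figures in the results summary, with small differences possible depending on how the mean is defined.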
Conclusions
From the results we can conclude that the majority of the time (89.4%) our system should be running at 100% capacity and that after the 10-year period there is about a 5.3% chance that the system will degrade to a point from which it cannot be restored (the salvage state).