Risk assessment, risk management, and risk methods and models encompass a vast array of topics, see Figure 1. I have arbitrarily divided the topics covered here into categories: expected utility theory (EUT) versus socio-psychological methods are at the top level of Figure 1. Within the EUT category, there are many methods that are broadly classified as single-asset versus systems of assets. A single-asset risk assessment involves a single “target” such as a building, airport, or computer. A system risk assessment involves many assets connected together to form a system; the electrical power grid, the Internet, and the healthcare system are examples. This tutorial addresses each of these in turn, moving from top to bottom and right to left.

A variety of risk assessment and management processes have been proposed in the security literature. Indeed, there are many methods and tools to choose from. The risk methods and models outlined in Figure 1 are only a few of the many alternatives that are part of the general risk assessment process. Nonetheless, the risk methods surveyed here are representative of the models and methods at the heart of nearly all processes that combine the collection, analysis, mitigation, and evaluation steps of a robust risk assessment and management process. These methods and models are at the core of most scientifically sound processes in use today.

The father of modern risk assessment as it applies to homeland security was an MIT professor of nuclear engineering, Norman Rasmussen (1927-2003). He achieved notoriety in the 1970s by debating the safety (or lack of it) of nuclear power plants with Ralph Nader. The televised debate took place in 1976, three years before the Three Mile Island nuclear power plant meltdown.


Perhaps more important than the debate with Nader was the method of risk assessment employed by Rasmussen – now known as PRA (probabilistic risk analysis). His 1975 report defined risk as the expected loss due to a failure: risk = Pr(failure) × C(failure), where Pr(failure) is the likelihood of a reactor failing and C(failure) is the consequence of that failure. Rasmussen’s use of expected utility theory in PRA is easy to understand, but it can be difficult to apply in practice, especially if power plant operators cannot calculate Pr(failure) and C(failure). Where do Pr(failure) and C(failure) come from?
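Rasmussen's expected-loss definition is simple enough to express in a few lines. Here is a minimal sketch; the failure probability and consequence figures are hypothetical placeholders, not actual reactor data:

```python
def pra_risk(p_failure: float, consequence: float) -> float:
    """Rasmussen's PRA: expected loss = Pr(failure) * C(failure)."""
    return p_failure * consequence

# Hypothetical inputs for illustration: a 1-in-10,000 annual failure
# probability and a $3.4 billion consequence (damage plus cleanup).
annual_risk = pra_risk(1e-4, 3.4e9)
print(annual_risk)  # expected annual loss in dollars
```

The hard part, as the text notes, is not the multiplication but obtaining defensible inputs.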

For example, the Three Mile Island nuclear power plant meltdown (TMI) was supposed to be impossible. Thus, Pr(failure) was supposed to be zero. In hindsight, Pr(failure) is not zero, but how does an operator know this beforehand? If we use a priori analysis, we must know all of the ways failure can happen and all the ways it cannot. If we employ Laplace’s definition with S observed failures in T trials, then S = 0 and T = 0, so Pr(failure) = 50%! If we had believed this in 1970, TMI would never have been built. On the other hand, we now have historical data to support an a posteriori estimate based on three major nuclear power plant catastrophes over the past 60 years. [This is left as an exercise for the reader.]
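The 50% figure follows from Laplace's rule of succession, which estimates a probability as (S + 1)/(T + 2) after observing S occurrences in T trials; with no operating history at all it returns one half. A minimal sketch:

```python
def laplace_estimate(failures: int, trials: int) -> float:
    """Laplace's rule of succession: (S + 1) / (T + 2),
    where S is the number of observed failures in T trials."""
    return (failures + 1) / (trials + 2)

# With no data at all (S = 0, T = 0) the estimate is 50%:
print(laplace_estimate(0, 0))  # 0.5
```

The a posteriori estimate mentioned above can be computed the same way once a count of reactor-years of operation is chosen; that choice is the substance of the exercise.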

Estimating consequences is somewhat easier, but not straightforward. The Three Mile Island meltdown caused approximately $2.4 billion in property damages and $1 billion in cleanup costs. Although a number of studies were conducted to assess the health consequences on the surrounding region, the consequences of radiation exposure have never been fully quantified. Furthermore, the nuclear power industry suffered for decades following the incident. How does one put a dollar value on “loss of business”? This illustrates the difficulty of estimating consequences, even when the risk method is as simple as Rasmussen’s PRA.

Rasmussen addressed the problem of estimating Pr(failure) using an old engineering technique called fault tree analysis (FTA). Instead of attempting to calculate Pr(failure) directly, he decomposed the critical subsystems of each power plant into simpler components and then inserted them into a fault tree to determine how local failures perturb the entire power plant. The likelihood of each component failing, and its impact on the entire power plant, is modeled as a tree, see Figure 2.

For example, in Figure 2, failure of the entire plant occurs if one or more of the following threats (accidents) succeed:

POWER: Electrical power is cut off
COOLING: The reactor overheats due to lack of cooling (water pumps fail)
TSUNAMI: A tsunami swamps the plant, as happened at Fukushima Daiichi.

Note the OR logic in the fault tree. This means that the plant fails if any one of the three threats occurs, alone or in combination. Suppose the probability of a POWER outage is 10%, the probability of a COOLING failure is 5%, and the probability of a TSUNAMI is 1%. Assuming the faults are independent, the probability that the plant will fail due to a fault in at least one of these components is 1 − (1 − 0.10)(1 − 0.05)(1 − 0.01) = 15.36%. If the consequence is $10 million, then PRA risk is $1.536 million.
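The 15.36% figure is the standard OR-gate calculation for independent faults: one minus the probability that none of the inputs occurs. A sketch:

```python
def or_gate(probabilities):
    """Failure probability of an OR gate over independent faults:
    1 minus the probability that none of the inputs occurs."""
    p_none = 1.0
    for p in probabilities:
        p_none *= 1.0 - p
    return 1.0 - p_none

p_fail = or_gate([0.10, 0.05, 0.01])  # POWER, COOLING, TSUNAMI
print(p_fail)                          # approximately 0.15355
print(p_fail * 10_000_000)             # PRA risk, about $1.536 million
```

An AND gate (all faults required) would multiply the probabilities instead, which is why fault tree structure matters so much to the final estimate.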

Fault tree analysis has its strengths and weaknesses. It is simple to apply and understand, but it does not solve the problem of identifying the threats, calculating the probabilities that threats succeed, or estimating the consequences of failure. We must rely on other approaches to obtain these numbers. The good news is that tools like Model-Based Risk Analysis (MBRA) exist to perform all of the calculations, including automatic allocation of a fixed budget to reduce the probability of component failure and minimize risk. A user inputs probabilities, consequences, and risk-reducing costs, and MBRA returns a prescription for minimizing risk by optimally allocating funds to reduce each individual threat.

PRA is perhaps the simplest method of risk assessment for single assets such as power plants, bridges, and buildings. Physical assets are much easier to assess because we know the threats (bombs, hurricanes) and we generally know the consequences. For example, it is easy to calculate the consequences of a car bomb exploding 50 yards from a building or the impact of shutting down a power plant because of mechanical or environmental faults. But when human actors enter the threat picture, PRA must be modified to account for intent as well as capability.

Terrorism may be modeled in a number of ways. It can be considered a contagion, so the spread of terrorism behaves like a disease. It can also be considered an intelligent adversary completely devoid of chance. In the PRA model, threat is a measure of the (probabilistic) intent and capability of a malicious human actor and therefore is simply another probability, T. In its simplest formulation, T is the probability an attack will be attempted.

The success of a terrorist attack on a target depends on its vulnerability, which is defined as the probability an attack will succeed if attempted. A building is not very vulnerable to a knife attack. But it is highly vulnerable to a bomb attack. Thus, vulnerability is paired with the threat and target. When a human actor is involved, risk is defined in terms of a coupled Threat-Component-Vulnerability triad – a model that has been advocated and used in numerous risk methods.


When considering human intent, Rasmussen’s PRA model must be modified to account for the Threat-Component-Vulnerability triad. The most obvious model defines risk as the product of T, V, and C. Here, T is the probability that an attack will be attempted, V is the (conditional) probability that an attempt will succeed, and C is the consequence as before. In this modification of PRA, Pr(failure) is the product of two probabilities, TV, and risk is the product of all three triad elements.

Risk = T(attacked) × V(successful if attacked) × C(failure)

Or in simple algebraic terms, R = TVC.
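The factored definition is easy to compute once T, V, and C are in hand. A minimal sketch with invented numbers:

```python
def tvc_risk(threat: float, vulnerability: float, consequence: float) -> float:
    """R = T * V * C, where T is the probability an attack is attempted,
    V is the conditional probability the attempt succeeds, and C is the
    consequence of a successful attack."""
    return threat * vulnerability * consequence

# Hypothetical target: 20% chance of an attempted attack, 30% chance
# the attempt succeeds, $50 million consequence.
print(tvc_risk(0.20, 0.30, 50e6))  # about $3 million expected loss
```

Note that Pr(failure) in Rasmussen's original formulation corresponds to the product T × V here.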

The TVC model makes serious assumptions, however, which are the source of major criticism. Specifically, it assumes T and V are independent. That is, it assumes that terrorists are not influenced by a target’s vulnerability. A weaker target is as likely to be attacked as a more secure target. This assumption may not hold in all cases, and even when it does, how are T and V obtained?

One of the most successful applications of PRA outside of the nuclear power industry is MSRAM (Maritime Security Risk Assessment Method) -- the US Coast Guard’s method and tool for assessing port security. MSRAM is PRA on steroids. It incorporates tools for estimating T, V, and C utilizing an enhanced definition of PRA:


Risk = T × V × C

where T = INTENT × CAPABILITY,
INTENT is a measure of the propensity to attack,
CAPABILITY is a measure of the ability to attack successfully,
V is a measure of target weakness, and
C is the modified consequence, moderated by preventive measures.


In MSRAM, T is a combination of a terrorist’s intent and ability to carry out an attack. V is a measure of lack of prevention, lack of target hardening and mitigation, and lack of resiliency. Consequence, C, is actually a reduced consequence calculated by considering how much port authorities collaborate with sister law enforcement agencies, the port’s response capability, and other target-hardening factors.
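The exact MSRAM scales and weightings are not reproduced here; the sketch below only illustrates the structure just described, with invented 0-to-1 scores and a hypothetical mitigation factor standing in for the collaboration and response-capability adjustments:

```python
def msram_style_risk(intent, capability, vulnerability,
                     raw_consequence, mitigation):
    """Illustrative MSRAM-style score (not the official formula):
    T = INTENT * CAPABILITY, and C is the raw consequence reduced
    by response/collaboration/target-hardening factors.
    All inputs except raw_consequence are scores in [0, 1]."""
    threat = intent * capability
    consequence = raw_consequence * (1.0 - mitigation)
    return threat * vulnerability * consequence

# Hypothetical scenario: determined actor (0.7) with moderate
# capability (0.5), a fairly hard target (V = 0.4), $100M raw
# consequence, and a 25% consequence reduction from response capability.
print(msram_style_risk(0.7, 0.5, 0.4, 100e6, 0.25))
```

The point of the structure is that better response capability lowers the score even when threat and vulnerability are unchanged.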

MSRAM is scenario driven, meaning it takes a user through a scenario as it collects inputs. One scenario may describe a plot to block a port by ramming and submerging a large ship. Another scenario may describe an IED attack on key assets, etc.

MSRAM has been used to assess thousands of port assets, scoring each in terms of a Risk Index Number (RIN), which is a modified form of TVC. It is also part of more elaborate risk assessment tools used by the USCG to allocate resources such as people and ships.

PRA, MSRAM, and fault-tree analysis methodologies have many critics. Only the top three criticisms are considered here. First, there is no consideration of the cost involved in reducing risk. Most operators simply rank assets according to risk and then apply resources to the highest-ranked assets. This is famously non-optimal in general, because different targets cost different amounts to protect, and one asset may gain more risk reduction from each dollar of investment than another.

For example, the cost to harden the Golden Gate Bridge is much higher than the cost to harden a Google Internet server. Even if the risk to both Golden Gate Bridge and Google Internet server were identical, it is more rational to allocate more resources to the lower-cost asset, because overall risk reduction is greater when investing in the less-expensive asset.

One remedy to this non-optimality is to incorporate prevention (vulnerability-reducing) costs in the definition of risk. For example, suppose the cost to reduce vulnerability is E, and then define risk as TVC/E. Overall risk across two assets is T1V1C1/E1 + T2V2C2/E2. There may be vast differences in return on investment between the two terms, such that optimal risk reduction dictates that more resources be invested in one asset than the other.
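Under this definition, two assets with identical TVC risk but different hardening costs no longer look equally attractive. A sketch with invented numbers echoing the bridge-versus-server example:

```python
def cost_weighted_risk(t, v, c, e):
    """Risk per dollar of vulnerability-reduction cost: TVC / E."""
    return (t * v * c) / e

# Hypothetical assets with identical TVC risk but very different
# hardening costs ($50M for the bridge, $1M for the server):
bridge = cost_weighted_risk(0.1, 0.5, 100e6, 50e6)
server = cost_weighted_risk(0.1, 0.5, 100e6, 1e6)
print(bridge, server)  # the server offers 50x the return per dollar
```

Ranking by this ratio rather than by raw TVC is what steers funding toward the less-expensive asset.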

As an illustration, suppose the data on power plant risk is supplemented with the costs of reducing vulnerability by hardening the plant. To make the illustration even simpler, suppose the elimination costs, E, are all the same ($10 million each), but the power plant operators have only $10 million to spend on all three vulnerabilities. [Note that it would take $30 million to eliminate all three vulnerabilities.] Finally, the model shown in Figure 2 assumes an exponentially diminishing return on investment: initial investments reduce vulnerability more than additional investments do. How should the $10 million be invested?

Using the EUT definition of risk, and the assumption of exponential decrease in vulnerability reduction per dollar invested, risk minimization produces an optimal allocation as follows:

POWER: $6.8 million
COOLING: $3.2 million
TSUNAMI: $0.0 million

Why did vulnerability to TSUNAMI damage receive no funding? The answer: return on investment was far lower for TSUNAMI protection than the other two vulnerabilities. The best overall use of $10 million is often unsatisfying to operators, because leaving one asset unprotected is undesirable, but optimal allocation lowers overall risk more than any other kind of allocation.
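The allocation above comes from the model behind Figure 2; the exact parameters are not given in the text, so the sketch below invents them. It implements the stated assumption, vulnerability decaying exponentially with investment, and allocates the budget greedily in small increments, which is optimal when marginal returns diminish. With these made-up parameters TSUNAMI again receives nothing, though the dollar split differs slightly from the figures above:

```python
import math

def allocate(budget, assets, step=0.1):
    """Greedy budget allocation under diminishing returns.
    Each asset is (name, threat, v0, consequence, k): its
    vulnerability decays as v0 * exp(-k * invested), so each
    increment goes to the asset with the largest risk reduction."""
    spend = {name: 0.0 for name, *_ in assets}
    for _ in range(round(budget / step)):
        best, best_gain = None, 0.0
        for name, t, v0, c, k in assets:
            e = spend[name]
            gain = t * c * v0 * (math.exp(-k * e) - math.exp(-k * (e + step)))
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break
        spend[best] += step
    return spend

# Invented parameters (dollars in millions): threat probabilities from
# the fault-tree example, $10M consequence, slow decay rate k = 0.2.
assets = [("POWER",   0.10, 1.0, 10.0, 0.2),
          ("COOLING", 0.05, 1.0, 10.0, 0.2),
          ("TSUNAMI", 0.01, 1.0, 10.0, 0.2)]
print(allocate(10.0, assets))  # TSUNAMI gets $0 with these parameters
```

Because TSUNAMI's threat probability is so low, its marginal risk reduction never exceeds that of the other two components, so the greedy loop never funds it, mirroring the result in the text.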



The second major criticism of PRA concerns the placement of T on the right-hand side of the risk equation. Critics argue that T should be an output rather than an input of risk assessment. That is, threat should be a byproduct of risk assessment, because terrorists are more likely to attack weaker targets than stronger, better-protected ones. According to the critics of PRA, a rational terrorist will attack the most vulnerable target to maximize his or her expected utility. In other words, terrorists may perform their own risk assessment to determine the best return on their own investment. This line of reasoning is at the heart of the game-theoretic approaches described later.