Project management risk identification

In this tutorial you will learn:

How to perform training to learn the factor probabilities in the model
How to use inference to query the model and reason about probabilities
How to perform inference in the model editor or the Python SDK
How to interpret the results of inference

We will use a construction project management risk identification dataset (Qazi et al. 2016) and attempt to answer the following question:

Problem statement: Based on possible risks in the construction project management process, what is the probability of:

A decrease in the quality of work
Low market share/reputational issues
Time overruns
Cost overruns

The model file and sample data for this example are shown below:

Network variables and categories

For this network we will use the following variables:

Variable

Examining the network

Although the model is stored internally in Genius as a factor graph, it is often more intuitive to examine the Bayesian network. This network has the following structure:

Here are some things we can conclude from this network:

Delays/interruptions (R14) depend on conflicts with project stakeholders (R12), changes in project specifications (R11), and major design changes (R6).
Time overruns (O3) depend on delays/interruptions (R14), delays in design and regulatory approvals (R3), and delays in obtaining raw material (R7).
Cost overruns (O4) depend on major design changes (R6), unexpected events (R9), decrease in productivity (R13), and increase in raw material price (R10).
Low market share/reputational issues (O2) depend on time overruns (O3), decrease in quality of work (O1), and use of innovative technology (C2).
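The dependencies listed above can also be written down as a small data structure, which makes it easy to check which risks can influence an outcome indirectly. The following is a minimal sketch in plain Python, not part of the Genius SDK. It covers only the four dependencies stated above, not the full network, and the helper name `ancestors` is a hypothetical one chosen for illustration.

```python
# Parent sets copied from the four dependency statements above.
# This is only the part of the network described in the text.
parents = {
    "R14": ["R12", "R11", "R6"],
    "O3": ["R14", "R3", "R7"],
    "O4": ["R6", "R9", "R13", "R10"],
    "O2": ["O3", "O1", "C2"],
}


def ancestors(node, parents):
    """Return every variable that reaches `node` through the listed edges."""
    seen = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Follow the parents of this parent, if any are listed.
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("O2", parents)))
```

For example, major design changes (R6) do not point directly at time overruns (O3), but they appear among its ancestors because they act through delays/interruptions (R14).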

Model training (parameter learning)

We will assume that the probabilities in the Bayesian network are not available for this task and must be collected from the real world. In this case, we assume that past projects by the company can be used to create a dataset much like we did in the electric vehicle fire risk example.

Here we follow the steps for workflow #2 in the model editor. However, we could also use workflow #1 and use the Data to Model Wizard.

First, we must build the model in the model editing canvas using the Model Panel interface and set all the probabilities to uniform. This choice of probabilities is the default when we first create a model in the model editor. Now we are ready to perform training (parameter learning). To make this process quicker, the model with uniform probabilities is provided below in the model files and sample data section.

Select Model in the main menu.
Click Open to load a model from JSON and load the file projectmanagement_uniform.json into the model editing canvas.

You should see the model appear in the model editing canvas. If you inspect the properties of each factor you will see that the probabilities are set to uniform. The next step is to train the model to learn the parameters.
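To make "uniform" concrete, the sketch below builds a uniform conditional probability table for one factor in plain Python. It is an illustration only and does not reflect how Genius stores factors internally. It assumes every variable has the two states "YES" and "NO" (the labels used in the inference examples in this tutorial), and `uniform_cpt` is a hypothetical helper name.

```python
from itertools import product

# Assumed state labels, matching the evidence values used in this tutorial.
STATES = ["YES", "NO"]


def uniform_cpt(parent_names, states=STATES):
    """Map each parent configuration to a uniform distribution over states."""
    uniform = {state: 1.0 / len(states) for state in states}
    return {
        config: dict(uniform)
        for config in product(states, repeat=len(parent_names))
    }


# Time overruns (O3) has three parents: R14, R3 and R7.
cpt = uniform_cpt(["R14", "R3", "R7"])
print(len(cpt))                     # number of parent configurations
print(cpt[("YES", "YES", "YES")])   # distribution of O3 for one configuration
```

With three binary parents there are eight parent configurations, and before training each one assigns 0.5 to both states of the child.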

Select Model in the main menu.
Click Train.
Follow the prompts on the screen to upload the projectmanagement_2000.csv dataset for training.

If you inspect the probabilities for each factor you will see that the values have changed from uniform to new values determined by the imported data.

Using the Python SDK, we first load a version of the project management model that contains uniform probabilities:

agent.load_model_from_json(json_path="projectmanagement_uniform.json")

Assuming the dataset is stored in a file called projectmanagement_2000.csv, we can learn the model parameters as follows:

agent.learn(csv_path="projectmanagement_2000.csv")

If you inspect the factor probabilities, you will see that they are no longer uniform, which indicates that learning was successful.
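As a rough intuition for what parameter learning does, the sketch below estimates a conditional probability table by counting how often each child state occurs for each parent configuration (maximum likelihood estimation). This is a simplified stand-in and is not claimed to be the algorithm Genius uses. The rows are hypothetical and are not taken from projectmanagement_2000.csv, and the example uses only two parents for brevity.

```python
from collections import Counter, defaultdict

# Hypothetical records in the order (R3, R7, O3).
# These values are made up for illustration only.
rows = [
    ("YES", "YES", "YES"),
    ("YES", "YES", "YES"),
    ("YES", "YES", "NO"),
    ("NO", "NO", "NO"),
    ("NO", "NO", "NO"),
    ("NO", "NO", "YES"),
    ("NO", "NO", "NO"),
]


def estimate_cpt(rows):
    """Estimate P(child | parents) from relative frequencies in the data."""
    counts = defaultdict(Counter)
    for *parent_values, child_value in rows:
        counts[tuple(parent_values)][child_value] += 1

    cpt = {}
    for config, counter in counts.items():
        total = sum(counter.values())
        cpt[config] = {state: n / total for state, n in counter.items()}
    return cpt


cpt = estimate_cpt(rows)
print(cpt[("NO", "NO")])
```

Parent configurations that never occur in the data get no entry in this sketch, which is one reason practical learners usually combine counts with a prior such as the uniform starting point.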

Inference

Next, we use the network to answer our problem statement. Let's use the following fictitious scenario:

Suppose after we build the above network a new construction project begins. It is observed that the contractor lacks experience (R1), there are major design changes (R6), delays in obtaining raw materials (R7), changes in project specifications (R11), and conflicts with project stakeholders (R12). Now we wish to answer the problem statement - what is the probability of:

A decrease in the quality of work
Low market share/reputational issues
Time overruns
Cost overruns

In the model editor, first click on the inference tab in the information panel and select O1, O2, O3, or O4, depending on which outcome's probability we want to determine. Then under "what do you want to observe?" set the following variables to "yes": R1, R6, R7, R11, R12. Then click "Run".

The probability of a decrease in the quality of work (O1):

The probability of low market share/reputational issues (O2):

The probability of time overruns (O3):

The probability of cost overruns (O4):

Using the Python SDK, we first build the evidence for the scenario described above:

evidence = {"R1": "YES", "R6": "YES", "R7": "YES", "R11": "YES", "R12": "YES"}

Now we run inference for each of the variables we are interested in:

inference_result = agent.infer(variables="O1", evidence=evidence, verbose=True)
inference_result["probabilities"]

{'YES': 0.9804270462633452, 'NO': 0.019572953736654804}

inference_result = agent.infer(variables="O2", evidence=evidence, verbose=True)
inference_result["probabilities"]

{'YES': 0.712012055011406, 'NO': 0.287987944988594}

inference_result = agent.infer(variables="O3", evidence=evidence, verbose=True)
inference_result["probabilities"]

{'YES': 0.9500028623166014, 'NO': 0.04999713768339855}

inference_result = agent.infer(variables="O4", evidence=evidence, verbose=True)
inference_result["probabilities"]

{'YES': 0.9500028623166014, 'NO': 0.04999713768339855}
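The four calls above differ only in the queried variable, so they can be wrapped in a loop. The sketch below assumes `agent.infer` accepts the keyword arguments and returns the `"probabilities"` key exactly as shown in this tutorial. `query_outcomes` is a hypothetical helper, and `StubAgent` is a stand-in with placeholder values so the example runs without the SDK; replace it with the real agent in practice.

```python
def query_outcomes(agent, outcomes, evidence):
    """Run one inference call per outcome and collect the distributions."""
    results = {}
    for outcome in outcomes:
        result = agent.infer(variables=outcome, evidence=evidence, verbose=True)
        results[outcome] = result["probabilities"]
    return results


class StubAgent:
    """Stand-in for the real agent so this sketch runs on its own."""

    def infer(self, variables, evidence, verbose=False):
        # Placeholder values, not real inference results.
        return {"probabilities": {"YES": 0.5, "NO": 0.5}}


evidence = {"R1": "YES", "R6": "YES", "R7": "YES", "R11": "YES", "R12": "YES"}
results = query_outcomes(StubAgent(), ["O1", "O2", "O3", "O4"], evidence)

for outcome, probabilities in results.items():
    print(outcome, probabilities)
```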

Interpreting the results

The observed evidence for the project suggests a high risk of a decrease in the quality of work (about 98%), time overruns (about 95%), and cost overruns (about 95%). The probability of low market share/reputational issues (about 71%) is lower but still quite high.
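When comparing several outcomes, it can help to rank them by the probability of "YES". The sketch below uses the values reported in the inference section of this tutorial; the outcome labels and the `ranked` variable are only for illustration.

```python
# P(YES) for each outcome, copied from the inference results in this tutorial.
reported = {
    "O1 (decrease in quality of work)": 0.9804270462633452,
    "O2 (low market share/reputational issues)": 0.712012055011406,
    "O3 (time overruns)": 0.9500028623166014,
    "O4 (cost overruns)": 0.9500028623166014,
}

# Sort from most likely to least likely outcome.
ranked = sorted(reported.items(), key=lambda item: item[1], reverse=True)

for name, p_yes in ranked:
    print(f"{name}: {p_yes:.1%}")
```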


