Medical diagnosis

This tutorial is a simple example of medical diagnosis to help teach the concepts of Bayesian inference. Deployed medical diagnosis systems would almost certainly be more complex than this simple example and include many other diagnostic tests as well as the professional opinion of a medical practitioner to infer a patient's diagnosis. Furthermore, systems of this kind would likely need governmental approval and rigorous testing before deployment in a hospital setting.

In the Example: Medical diagnosis tutorial we learned how to build a medical diagnosis model from scratch. In this example you will learn:

How to use inference to query the model and reason about probabilities
How to perform inference in the model editor or the Python SDK
How to interpret the results of inference

As a reminder, the medical diagnosis model (Lauritzen and Spiegelhalter 1988) attempted to answer the following:

Problem statement: Let's suppose that a patient visits a clinic with shortness of breath (dyspnoea). Dyspnoea is a common symptom of tuberculosis, lung cancer, and bronchitis. If a patient presents with this symptom, which of the diseases might they have, if any?

This problem is a natural application of Bayesian inference. Assuming that the model is already available, we can ask some questions of interest to address the problem statement:

What is the probability that a patient has tuberculosis given that they show symptoms of dyspnoea and a negative X-ray? What if the X-ray were positive?
What is the probability that a patient has lung cancer given that they show symptoms of dyspnoea and a negative X-ray? What if the X-ray were positive?
What is the probability that a patient has bronchitis given that they show symptoms of dyspnoea and a negative X-ray? What if the X-ray were positive?
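
Each of these questions is a conditional probability query. As a sketch of what is being computed, the first question can be written out as follows, using the variable names tub, dysp, and xray that appear later in this tutorial. The remaining variables in the model are assumed to have been summed out of each term.

```latex
P(\text{tub}=\text{yes} \mid \text{dysp}=\text{yes}, \text{xray}=\text{no})
  = \frac{P(\text{tub}=\text{yes}, \text{dysp}=\text{yes}, \text{xray}=\text{no})}
         {\sum_{t \in \{\text{yes}, \text{no}\}} P(\text{tub}=t, \text{dysp}=\text{yes}, \text{xray}=\text{no})}
```

The other questions have the same form, with a different target variable or a different X-ray value.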

The model file and CSV for this example are provided below:

Querying the model

Below we will see how to perform probabilistic inference both in the model editor and the SDK. We then interpret the results.

To query the model, we first need to connect to the agent, load a JSON model file, and send the loaded model to the agent. We will use the lung cancer dataset JSON file, which can either be pasted into the box during loading or saved locally. Once this is done, we are ready to query the model.

The probability of having tuberculosis given dyspnoea and a negative X-ray.

To determine this, we use the Inference tab. Under the Inference section of this tab, the prompt is "what do you want to know?" It asks which variable's probability we want to infer. Here we are interested in the probability of having tuberculosis, so we select tub.

Next, we need to apply the evidence. What do we already know? We know the patient has dyspnoea (dysp=yes) and a negative X-ray (xray=no). We can input this evidence in the Inference tab:

After clicking we obtain the following results:

We can continue in this fashion for the other questions of interest.

The probability of having tuberculosis given dyspnoea and a positive X-ray.

The probability of having lung cancer given dyspnoea and a negative X-ray.

The probability of having lung cancer given dyspnoea and a positive X-ray.

The probability of having bronchitis given dyspnoea and a negative X-ray.

The probability of having bronchitis given dyspnoea and a positive X-ray.

To run the same queries with the Python SDK, we first need to connect to a Genius agent and load the model file. We will use the lung cancer dataset JSON file and save it locally. Assuming the GeniusAgent class is imported (omitting Genius agent connection details):

agent = GeniusAgent()
agent.load_model_from_json(json_path="asia.json")

To query the model with the Python SDK, we pass the variable whose probability we are interested in to the variables argument. We then pass a dictionary containing our evidence, that is, the observed values of other variables in the model, to the evidence argument.

The probability of having tuberculosis given dyspnoea and a negative X-ray.

variable_id = "tub"
evidence = {"dysp": "yes", "xray": "no"}

inference_result = agent.infer(variables=variable_id, evidence=evidence)
print(inference_result["probabilities"])

{'yes': 0.000449821453787264, 'no': 0.9995501785462128}

The probability of having tuberculosis given dyspnoea and a positive X-ray.

variable_id = "tub"
evidence = {"dysp": "yes", "xray": "yes"}

inference_result = agent.infer(variables=variable_id, evidence=evidence)
print(inference_result["probabilities"])

{'yes': 0.11393332539070083, 'no': 0.8860666746092991}

The probability of having lung cancer given dyspnoea and a negative X-ray.

variable_id = "lung"
evidence = {"dysp": "yes", "xray": "no"}

inference_result = agent.infer(variables=variable_id, evidence=evidence)
print(inference_result["probabilities"])

{'yes': 0.002452775210524516, 'no': 0.9975472247894754}

The probability of having lung cancer given dyspnoea and a positive X-ray.

variable_id = "lung"
evidence = {"dysp": "yes", "xray": "yes"}

inference_result = agent.infer(variables=variable_id, evidence=evidence)
print(inference_result["probabilities"])

{'yes': 0.6212527966776288, 'no': 0.3787472033223713}

The probability of having bronchitis given dyspnoea and a negative X-ray.

variable_id = "bronc"
evidence = {"dysp": "yes", "xray": "no"}

inference_result = agent.infer(variables=variable_id, evidence=evidence)
print(inference_result["probabilities"])

{'yes': 0.8633919827619309, 'no': 0.13660801723806912}

The probability of having bronchitis given dyspnoea and a positive X-ray.

variable_id = "bronc"
evidence = {"dysp": "yes", "xray": "yes"}

inference_result = agent.infer(variables=variable_id, evidence=evidence)
print(inference_result["probabilities"])

{'yes': 0.6818685384593828, 'no': 0.31813146154061717}
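
The six queries above differ only in the target variable and the X-ray value, so they can be collected in a single loop. The following is a minimal sketch, assuming agent.infer accepts the variables and evidence arguments and returns a dictionary with a "probabilities" key, as in the calls above. The run_queries helper is hypothetical and not part of the SDK.

```python
def run_queries(agent, targets=("tub", "lung", "bronc"), xray_values=("no", "yes")):
    """Return P(target=yes | dysp=yes, xray=value) for each target and X-ray value.

    `agent` is assumed to expose infer(variables=..., evidence=...) as used
    earlier in this tutorial; this helper itself is illustrative only.
    """
    results = {}
    for target in targets:
        for xray in xray_values:
            # Dyspnoea is always observed; only the X-ray result varies.
            evidence = {"dysp": "yes", "xray": xray}
            inference_result = agent.infer(variables=target, evidence=evidence)
            results[(target, xray)] = inference_result["probabilities"]["yes"]
    return results
```

With a connected agent, run_queries(agent) would return a dictionary keyed by (variable, X-ray value) pairs, which is convenient for comparing the three diseases side by side.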

Interpreting the results

These results indicate that if the patient has dyspnoea and a positive X-ray, then bronchitis is the most likely diagnosis (68%). However, the probability that they have lung cancer is also quite high (62%). If we were to also include the fact that the patient is a smoker, the probability of lung cancer increases to 71%.

The probability that a patient has tuberculosis, given that they have dyspnoea and a positive X-ray, is relatively low (11%) because the disease is quite rare in general. However, once we condition on the fact that the patient also traveled to Asia recently, the probability increases to 39%. This shows how the accumulation of relevant contextual data can dramatically change the results of an inference. It also underscores the importance of building a model that contains the relevant pieces of information for the problem at hand.
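
To see where the shift from 11% to 39% comes from, the posterior can be recomputed by brute-force enumeration over the joint distribution. The sketch below does not use the SDK. It assumes the commonly cited conditional probability tables for this network, which are not listed in this tutorial, and the names asia, smoke, and either (an intermediate "tuberculosis or lung cancer" node) are likewise assumptions about how the model is structured.

```python
from itertools import product

# Assumed CPTs (commonly cited values for this network, not taken from this tutorial).
P_ASIA = 0.01                              # P(asia=yes)
P_SMOKE = 0.5                              # P(smoke=yes)
P_TUB = {"yes": 0.05, "no": 0.01}          # P(tub=yes | asia)
P_LUNG = {"yes": 0.10, "no": 0.01}         # P(lung=yes | smoke)
P_BRONC = {"yes": 0.60, "no": 0.30}        # P(bronc=yes | smoke)
P_XRAY = {"yes": 0.98, "no": 0.05}         # P(xray=yes | either)
P_DYSP = {("yes", "yes"): 0.9, ("yes", "no"): 0.7,
          ("no", "yes"): 0.8, ("no", "no"): 0.1}  # P(dysp=yes | either, bronc)

VARIABLES = ("asia", "smoke", "tub", "lung", "bronc", "xray", "dysp")


def bern(p_yes, value):
    """Probability of `value` for a yes/no variable with P(yes) = p_yes."""
    return p_yes if value == "yes" else 1.0 - p_yes


def joint(a):
    """Joint probability of one full assignment `a` (dict of variable -> yes/no)."""
    either = "yes" if "yes" in (a["tub"], a["lung"]) else "no"
    return (bern(P_ASIA, a["asia"]) * bern(P_SMOKE, a["smoke"])
            * bern(P_TUB[a["asia"]], a["tub"])
            * bern(P_LUNG[a["smoke"]], a["lung"])
            * bern(P_BRONC[a["smoke"]], a["bronc"])
            * bern(P_XRAY[either], a["xray"])
            * bern(P_DYSP[(either, a["bronc"])], a["dysp"]))


def posterior(variable, evidence):
    """P(variable | evidence), summing the joint over all consistent assignments."""
    totals = {"yes": 0.0, "no": 0.0}
    for values in product(("yes", "no"), repeat=len(VARIABLES)):
        a = dict(zip(VARIABLES, values))
        if all(a[name] == value for name, value in evidence.items()):
            totals[a[variable]] += joint(a)
    normalizer = sum(totals.values())
    return {value: total / normalizer for value, total in totals.items()}


base = posterior("tub", {"dysp": "yes", "xray": "yes"})
with_travel = posterior("tub", {"dysp": "yes", "xray": "yes", "asia": "yes"})
print(f"{base['yes']:.4f} -> {with_travel['yes']:.4f}")  # 0.1139 -> 0.3917
```

Under these assumptions, the only change between the two queries is the prior on tuberculosis, which rises from about 1% to 5% once travel to Asia is observed. Enumeration is practical here only because the network has seven binary variables; larger models need a proper inference engine.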
