Creating a Bland-Altman plot is an essential technique used in statistics and medical research to visualize the difference between two sets of measurements. In this article, we will guide you through the process of creating a Bland-Altman plot using Python. We will explain the steps involved in creating the plot, the Python libraries needed, and provide an example for you to follow along.
Understanding the Bland-Altman Plot
Before we begin creating the plot, it is important to understand what the Bland-Altman plot represents. The plot is a simple graphical representation of the difference between two measurements against their average. The central horizontal line in the plot represents the mean difference between the two measurements, while two further horizontal lines, one above and one below it, represent the limits of agreement.
The mean difference is calculated by taking the difference between each pair of measurements and averaging those differences. The limits of agreement are calculated by taking the mean difference and adding or subtracting 1.96 standard deviations of the differences (often rounded to two). If the differences are approximately normally distributed, about 95% of them are expected to fall within this range.
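The calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a definitive implementation: the function name limits_of_agreement and the sample readings are made up for the example, and it assumes the sample standard deviation (ddof=1) together with the multiplier 1.96, the normal-distribution value that "two standard deviations" approximates.

```python
import numpy as np


def limits_of_agreement(a, b, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired measurements.

    Hypothetical helper for illustration; z=1.96 is the normal quantile
    that covers about 95% of values.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b
    mean_diff = diff.mean()
    sd_diff = diff.std(ddof=1)  # sample standard deviation of the differences
    return mean_diff, mean_diff - z * sd_diff, mean_diff + z * sd_diff


# Made-up paired readings from two methods
method_a = [10.2, 11.5, 9.8, 12.1, 10.9]
method_b = [10.0, 11.9, 9.5, 12.4, 10.6]

mean_diff, lower, upper = limits_of_agreement(method_a, method_b)
print(f"mean difference: {mean_diff:.3f}")
print(f"limits of agreement: {lower:.3f} to {upper:.3f}")
```

The limits are symmetric around the mean difference by construction, so a mean difference far from zero points to a systematic bias between the two methods, while wide limits point to poor agreement.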
Python Libraries Required
To create a Bland-Altman plot in Python, we will be using the following libraries:
- NumPy: for scientific computing with Python
- Matplotlib: for plotting and visualization
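- Pandas: for data manipulation and analysis (optional; the simulated example in this article does not need it, but it is handy for loading real measurements)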
You can install these libraries using pip or conda, depending on your preference.
Step 1: Create the Data
Here’s example code that creates the data for a Bland-Altman plot in Python:
import numpy as np
import matplotlib.pyplot as plt
# Generate two sets of measurements with some random noise
np.random.seed(42)
true_values = np.random.normal(10, 2, size=100)
measurement_1 = true_values + np.random.normal(0, 1, size=100)
measurement_2 = true_values + np.random.normal(0, 1, size=100)
# Calculate difference and average
diff = measurement_1 - measurement_2
avg = (measurement_1 + measurement_2) / 2
In this example, we generate two sets of measurements, measurement_1 and measurement_2, by adding random noise to a set of true values drawn from a normal distribution. We then calculate the difference between the two sets of measurements (diff) and their average (avg). These arrays are used to create the Bland-Altman plot in the next step.
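If your measurements live in a file rather than being simulated, Pandas can supply the same diff and avg arrays. The sketch below is only an illustration: the column names method_a and method_b and the inline CSV text are hypothetical stand-ins for your own data, read through io.StringIO so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path to pd.read_csv
csv_text = """method_a,method_b
10.2,10.0
11.5,11.9
9.8,9.5
12.1,12.4
"""

df = pd.read_csv(io.StringIO(csv_text))

# Drop rows where either measurement is missing so the pairs stay aligned
df = df.dropna(subset=["method_a", "method_b"])

# Same quantities as in the simulated example, as NumPy arrays
diff = (df["method_a"] - df["method_b"]).to_numpy()
avg = ((df["method_a"] + df["method_b"]) / 2).to_numpy()

print(df.assign(diff=diff, avg=avg))
```

Converting to NumPy arrays keeps the rest of the plotting code unchanged, since it only needs diff and avg.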
Step 2: Create the Bland-Altman Plot
Here’s the complete code:
import numpy as np
import matplotlib.pyplot as plt
# Generate two sets of measurements with some random noise
np.random.seed(42)
true_values = np.random.normal(10, 2, size=100)
measurement_1 = true_values + np.random.normal(0, 1, size=100)
measurement_2 = true_values + np.random.normal(0, 1, size=100)
# Calculate difference and average
diff = measurement_1 - measurement_2
avg = (measurement_1 + measurement_2) / 2
# Create Bland-Altman plot
plt.scatter(avg, diff, alpha=0.5)
plt.axhline(y=diff.mean(), color='gray', linestyle='--')
plt.xlabel('Average of two measurements')
plt.ylabel('Difference between two measurements')
plt.title('Bland-Altman Plot')
plt.show()
In this example, we use the scatter() function from matplotlib to create a scatter plot of the average of the two measurements (avg) against the difference between the two measurements (diff). We set the alpha parameter to 0.5 to make the points semi-transparent. We also add a horizontal line representing the mean difference using the axhline() function. Finally, we add labels and a title to the plot using the xlabel(), ylabel(), and title() functions.
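Output: [Figure: scatter plot of the difference between the two measurements against their average, with a dashed gray horizontal line at the mean difference.]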
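The plot above draws only the mean difference line. One way to add the limits of agreement is sketched below, assuming the usual definition of the mean difference plus or minus 1.96 standard deviations. The function name bland_altman_plot, the line colors, and the ddof=1 choice are illustrative assumptions, not part of the original example.

```python
import numpy as np
import matplotlib.pyplot as plt


def bland_altman_plot(m1, m2, ax=None):
    """Scatter the differences against the averages and mark the mean and limits.

    Hypothetical helper; returns the Axes and (mean, lower, upper).
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    diff = m1 - m2
    avg = (m1 + m2) / 2
    mean_diff = diff.mean()
    sd_diff = diff.std(ddof=1)  # sample standard deviation of the differences
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff

    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(avg, diff, alpha=0.5)
    ax.axhline(mean_diff, color="gray", linestyle="--", label="Mean difference")
    ax.axhline(upper, color="red", linestyle=":", label="Upper limit of agreement")
    ax.axhline(lower, color="red", linestyle=":", label="Lower limit of agreement")
    ax.set_xlabel("Average of two measurements")
    ax.set_ylabel("Difference between two measurements")
    ax.set_title("Bland-Altman Plot")
    ax.legend()
    return ax, (mean_diff, lower, upper)


# Same simulated data as in the example above
np.random.seed(42)
true_values = np.random.normal(10, 2, size=100)
measurement_1 = true_values + np.random.normal(0, 1, size=100)
measurement_2 = true_values + np.random.normal(0, 1, size=100)

ax, (mean_diff, lower, upper) = bland_altman_plot(measurement_1, measurement_2)
```

Call plt.show() afterwards to display the figure. Passing an existing Axes through the ax argument lets you place the plot inside a larger figure.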

Wrap up
To learn more about Bland-Altman plots, check out the MedCalc manual:
https://www.medcalc.org/manual/bland-altman-plot.php
Thanks for reading. Happy coding!