In this article, we will guide you through calculating and plotting a Cumulative Distribution Function (CDF) in Python. The CDF is an important statistical concept used in data analysis, and knowing how to calculate and plot it in Python is useful for any data scientist or analyst.

Before we dive into the technicalities, let’s first understand what it is.

## What is a Cumulative Distribution Function?

A Cumulative Distribution Function (CDF) is a function that maps each value x to the probability that a random variable takes a value less than or equal to x.

In other words, F(x) = P(X ≤ x). For a continuous random variable, the CDF is the integral of the probability density function (PDF) from negative infinity up to x. Its values always lie between 0 and 1, and it never decreases as x grows.
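To make the link between the PDF and the CDF concrete, here is a minimal sketch that integrates the standard normal PDF numerically and compares the result with the closed-form CDF. It uses only the standard library; the helper names `normal_pdf`, `normal_cdf`, and `integrate_pdf` are hypothetical and chosen for illustration, and the lower bound of -10 stands in for negative infinity.

```python
import math

def normal_pdf(t):
    """Standard normal probability density at t."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def integrate_pdf(x, lower=-10.0, steps=100_000):
    """Approximate the integral of the PDF from `lower` to x (trapezoidal rule)."""
    h = (x - lower) / steps
    total = 0.5 * (normal_pdf(lower) + normal_pdf(x))
    for i in range(1, steps):
        total += normal_pdf(lower + i * h)
    return total * h

approx = integrate_pdf(1.0)   # area under the PDF up to x = 1
exact = normal_cdf(1.0)       # closed-form CDF at x = 1
print(approx, exact)          # the two values should agree closely
```

Both numbers should come out close to 0.8413, which is the probability that a standard normal variable is less than or equal to 1.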

Now, let’s move on to the technical aspects of how to calculate and plot a CDF in Python.

## Example 1: CDF of Random Distribution in Python

To plot the CDF of randomly generated data in Python, you can use the `numpy` library to generate the data and the `matplotlib` library to plot it.

Here’s an example:

```python
import numpy as np
import matplotlib.pyplot as plt

# Generate some random data from a standard normal distribution
data = np.random.normal(size=1000)
# Sort the data in ascending order
sorted_data = np.sort(data)
# Generate evenly spaced percentiles (for reference only; not used in the plot)
percentiles = np.linspace(0, 100, len(sorted_data))
# Calculate the empirical CDF: the i-th sorted value gets height i / n
cdf = np.cumsum(np.ones_like(sorted_data)) / len(sorted_data)
# Plot the CDF
plt.plot(sorted_data, cdf)
plt.xlabel('Data')
plt.ylabel('CDF')
plt.show()
```

Output: the plot shows an S-shaped curve that rises from near 0 at the smallest value to 1 at the largest.

In this code, we first generate some random data using NumPy’s `normal()` function. We then sort the data in ascending order using `np.sort()`.

We generate evenly spaced percentiles using NumPy’s `linspace()` function, although the plot itself does not use them. We then use `np.cumsum()` to calculate the cumulative sum of a vector of ones the same length as the sorted data, and divide it by the length of the sorted data to get the empirical cumulative distribution function.

Finally, we plot the CDF using Matplotlib’s `plot()` function, and add labels to the axes before showing the plot with `show()`.
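The empirical CDF steps described above can also be wrapped in small reusable functions. The sketch below is one possible way to do it: `ecdf` and `ecdf_at` are hypothetical helper names, not part of NumPy. It uses `np.arange(1, n + 1) / n`, which gives the same heights as the cumulative sum of ones divided by `n`, and `np.searchsorted` to evaluate the empirical CDF at a single value.

```python
import numpy as np

def ecdf(data):
    """Return the sorted values and their empirical CDF heights."""
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    # The i-th smallest value gets height i / n, same as cumsum(ones) / n
    y = np.arange(1, n + 1) / n
    return x, y

def ecdf_at(data, value):
    """Fraction of observations less than or equal to `value`."""
    x = np.sort(np.asarray(data, dtype=float))
    # side="right" counts every element that is <= value
    return np.searchsorted(x, value, side="right") / x.size

sample = [3.0, 1.0, 2.0, 2.0]
x, y = ecdf(sample)
print(x)                     # sorted values: 1, 2, 2, 3
print(y)                     # heights: 0.25, 0.5, 0.75, 1.0
print(ecdf_at(sample, 2.0))  # 0.75, since three of four values are <= 2
```

The arrays returned by `ecdf` can be passed straight to `plt.plot(x, y)`, or to `plt.step(x, y, where='post')` if you prefer to show the staircase shape of an empirical CDF.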

## Example 2: CDF of Normal Distribution

To plot the CDF of a normal distribution in Python, you can use the `scipy.stats` module to calculate the values and the `matplotlib` library to plot them.

Here’s an example:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Define the mean and standard deviation of the normal distribution
mu, sigma = 0, 1
# Generate evenly spaced values for the x-axis
x = np.linspace(-5, 5, num=1000)
# Calculate the CDF values for the normal distribution
cdf = norm.cdf(x, mu, sigma)
# Plot the CDF
plt.plot(x, cdf)
plt.xlabel('Data')
plt.ylabel('CDF')
plt.show()
```

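Output: the plot shows a smooth S-shaped curve that rises from about 0 at x = -5 to about 1 at x = 5, passing through 0.5 at x = 0.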

In this code, we first define the mean and standard deviation of the normal distribution. We then generate evenly spaced values for the x-axis using `np.linspace()`.

Next, we use the `norm.cdf()` function from the `scipy.stats` module to calculate the CDF values for the normal distribution with the given mean and standard deviation.

Finally, we plot the CDF using Matplotlib’s `plot()` function, and add labels to the axes before showing the plot with `show()`.
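Beyond plotting, CDF values are handy for computing the probability that a variable falls inside an interval, since P(a < X ≤ b) = F(b) - F(a). The sketch below illustrates this with the standard library’s `statistics.NormalDist` instead of `scipy.stats.norm`, a swapped-in alternative chosen only to keep the example dependency-free. The helper `interval_probability` is a hypothetical name; with SciPy, the same calculation would be `norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma)`.

```python
from statistics import NormalDist

# Same parameters as the example above
mu, sigma = 0, 1
dist = NormalDist(mu, sigma)

def interval_probability(dist, a, b):
    """Probability that the variable lies in (a, b], using the CDF."""
    return dist.cdf(b) - dist.cdf(a)

# Probability of landing within one standard deviation of the mean
p = interval_probability(dist, -1, 1)
print(round(p, 4))  # 0.6827
```

This is the familiar rule that roughly 68% of a normal distribution lies within one standard deviation of its mean.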

## Wrap up

To learn more about the Matplotlib library, check out the official site:

https://matplotlib.org/

Thanks for reading. Happy coding!