Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. One of its best-known functions is distplot, which visualizes the distribution of a dataset so that analysts can see its shape, spread, and outliers at a glance. Note that distplot has been deprecated since seaborn 0.11 in favor of histplot and displot; in releases that still include it, it behaves as described here but emits a deprecation warning.

In this article, we will provide a comprehensive guide to understanding and effectively using Seaborn distplot for your data visualization needs.

What is a Seaborn Distplot?

A Seaborn distplot visualizes the distribution of a single variable. By default it combines a histogram with a kernel density estimate (KDE), a smooth curve that approximates the density of the data. It can also add a rug plot and, through the fit parameter, overlay a fitted parametric distribution.

Creating a Basic Seaborn Distplot

				
import seaborn as sns
import matplotlib.pyplot as plt

# Load example data
tips = sns.load_dataset("tips")

# Create a distplot showing only the histogram (KDE disabled)
sns.distplot(tips["total_bill"], kde=False, bins=20)

# Add labels and title
plt.xlabel("Total Bill")
plt.ylabel("Frequency")
plt.title("Distribution of Total Bills")

# Show plot
plt.show()

				
			

Output:

[Figure: histogram of total_bill with 20 bins]

In this example, we first load the “tips” dataset from seaborn’s example data. We then create a distplot using sns.distplot(), passing in the “total_bill” column of the tips dataset. We set kde=False to remove the kernel density estimate and bins=20 to specify the number of bins in the histogram.

Finally, we add labels and a title to the plot using plt.xlabel(), plt.ylabel(), and plt.title(), respectively, and show the plot using plt.show().

Creating Seaborn Distplot with Multiple Variations

By altering the parameters of the distplot() method, it is possible to generate entirely new views. These parameters can be modified to alter color, orientation, and more.

Here’s an example that uses plt.subplots() from matplotlib.pyplot to show four variations of distplot() with different parameters:

				
import seaborn as sns
import matplotlib.pyplot as plt

# Load example data
iris = sns.load_dataset("iris")

# Create a 2x2 grid of subplots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))

# Plot a distplot with the default parameters in the top left subplot
sns.distplot(iris["sepal_length"], ax=axs[0, 0])
axs[0, 0].set_title("Default distplot")

# Plot a distplot with a rug plot and a blue color in the top right subplot
sns.distplot(iris["sepal_length"], rug=True, color="blue", ax=axs[0, 1])
axs[0, 1].set_title("Distplot with rug plot")

# Plot only the kernel density estimate (histogram hidden) in green in the bottom left subplot
sns.distplot(iris["sepal_length"], hist=False, kde=True, color="green", ax=axs[1, 0])
axs[1, 0].set_title("Distplot with KDE")

# Plot a distplot with both a histogram and kernel density estimate and a red color in the bottom right subplot
sns.distplot(iris["sepal_length"], kde=True, hist=True, color="red", ax=axs[1, 1])
axs[1, 1].set_title("Distplot with both KDE and histogram")

# Increase the margin between the subplots
plt.subplots_adjust(hspace=0.4)

# Show the plot
plt.show()

				
			

Output: 

[Figure: 2×2 grid of distplot variations]

In this example, we create a 2×2 grid of subplots using plt.subplots(), with the figsize parameter specifying the size of the figure. We then plot four variations of distplot() using different parameters in each subplot.

In the top left subplot, we use the default parameters for distplot(), which draw a histogram with a KDE curve. In the top right subplot, we add a rug plot and change the color to blue. In the bottom left subplot, we hide the histogram so that only the kernel density estimate (KDE) is drawn, in green. In the bottom right subplot, we request both the histogram and the KDE explicitly, which matches the default apart from the red color.

Moreover, we add the line plt.subplots_adjust(hspace=0.4) after the last sns.distplot() command to increase the vertical space between the subplots. The hspace parameter sets the height of the gap between subplot rows as a fraction of the average subplot height, so a value of 0.4 reserves a gap equal to 40% of that height.

The result is a figure with more vertical space between the two rows that hold the four sns.distplot() plots. You can adjust the hspace parameter to your liking to increase or decrease the margin between subplots.

Finally, we set titles for each subplot using set_title() and show the plot using plt.show().

Customizing Seaborn Distplot

Seaborn distplot provides a lot of customization options to make our plot more informative and attractive. Let’s explore some of the important parameters of the distplot function:

KDE Plot

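The “kde” parameter controls whether a kernel density estimate (KDE) curve is drawn over the histogram. It defaults to True, so passing it explicitly mainly documents the intent; set kde=False to hide the curve.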

				
sns.distplot(data, kde=True)

				
			

This will add a KDE plot to the histogram, which represents the distribution of the data in a smooth curve.

Rug Plot

A rug plot can be added to the histogram using the “rug” parameter.

				
sns.distplot(data, rug=True)

				
			

This will add a rug plot along the x-axis, which marks each individual observation with a small vertical tick.

Histogram Bins

We can customize the number of bins in the histogram using the “bins” parameter.

				
sns.distplot(data, bins=20)

				
			

This will create a histogram with 20 bins.
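To make concrete what an integer bin count means, the sketch below computes 20 equal-width bins with NumPy's np.histogram. It illustrates the binning idea only and is not a claim about seaborn's internal code path; the `data` array is hypothetical synthetic data.

```python
import numpy as np

# Hypothetical sample of 500 values
rng = np.random.default_rng(seed=4)
data = rng.normal(size=500)

# 20 bins need 21 edges spanning the range from min(data) to max(data)
counts, edges = np.histogram(data, bins=20)
bin_width = edges[1] - edges[0]

print(len(counts), len(edges))  # 20 21
print(counts.sum())             # 500: every observation falls in some bin
print(bin_width)                # width shared by all bins
```

Fewer bins give a coarser picture, while more bins show finer detail at the cost of noisier bar heights.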

Color and Style

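We can change the color of the plot using the “color” parameter. distplot() has no “style” parameter; line styles are instead passed through keyword dictionaries such as “kde_kws”, which is forwarded to the KDE curve.

sns.distplot(data, color='red', kde_kws={'linestyle': '--'})

This will create a red histogram with a dashed KDE curve.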

Multiple Datasets

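We can plot multiple datasets on the same axes by calling distplot() once per dataset. distplot() has no “multiple” parameter; successive calls simply draw onto the current axes, and the “label” parameter names each dataset for the legend.

sns.distplot(data1, label='data1')
sns.distplot(data2, label='data2')
plt.legend()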

				
			

This will create two histograms on the same plot, each representing a different dataset.

Wrap up

To create a Seaborn distplot in Python, you need to import the necessary libraries, load the data you want to visualize, and create the distplot using the sns.distplot() function. You can customize the plot with parameters such as bins, kde, rug, and color.

Overall, Seaborn’s distplot function is a powerful tool for visualizing the distribution of data in Python. It allows users to easily explore and understand their data, and create professional-looking plots with minimal code.

To learn more about Seaborn distplot and the library, check out the official tutorial:
https://seaborn.pydata.org/tutorial.html


Thanks for reading. Happy coding!