Check if Heights Follow a Normal Distribution with US President Heights Data (Q-Q Plot and Histogram)

Mohd Saif Ali
2 min read · Apr 19, 2021


It is often said that heights and many other naturally occurring quantities follow a normal distribution. In this project, we use the heights of US presidents to check that claim. We will run two common visual checks, a Q-Q plot and a histogram, to see whether the relationship holds.

Source: https://www.deviantart.com/

I'll be using my favorite platform, Jovian, to host my code.

We start by importing the libraries that help us read and visualize the data before we perform any tests.

After loading the data into a variable called heights, we check that it has been loaded properly.
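The loading step could look roughly like the sketch below. The original notebook cell is not reproduced here, so the column name `height_cm`, the helper `load_heights`, and the three placeholder rows are all assumptions made for illustration; they are not the real dataset.

```python
import io

import pandas as pd

# Placeholder rows for illustration only; NOT the real presidents dataset.
# In practice, pass the path to your own CSV file instead.
SAMPLE_CSV = io.StringIO(
    "order,name,height_cm\n"
    "1,Person A,180\n"
    "2,Person B,175\n"
    "3,Person C,168\n"
)


def load_heights(source, column="height_cm"):
    """Read a CSV (path or file-like object) and return one column as floats."""
    data = pd.read_csv(source)
    return data[column].astype(float)


heights = load_heights(SAMPLE_CSV)

# Quick sanity checks that the data loaded as expected.
print(heights.head())
print("rows:", len(heights), "missing:", int(heights.isna().sum()))
```

Looking at the first few rows, the row count, and the number of missing values is usually enough to catch a wrong column name or a badly parsed file.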

Next, we plot the height data using the matplotlib library to check whether it has any outliers.
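One way to do this check is a box plot, optionally backed by the 1.5 × IQR rule. This is a sketch and not necessarily the plot used in the original notebook: the `heights` array is a randomly generated stand-in (assumed mean 180 cm, standard deviation 7 cm, 43 values), and `iqr_outliers` is a hypothetical helper.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in sample (assumed mean 180 cm, std 7 cm); replace with the real heights.
rng = np.random.default_rng(0)
heights = rng.normal(loc=180, scale=7, size=43)


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    mask = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return values[mask]


# Box plot: points drawn beyond the whiskers are candidate outliers.
fig, ax = plt.subplots()
ax.boxplot(heights)
ax.set_ylabel("height (cm)")
ax.set_title("Heights: box plot for spotting outliers")
fig.savefig("heights_boxplot.png")

print("flagged by the IQR rule:", iqr_outliers(heights))
```

In a notebook the figure is displayed inline, so the `savefig` call can be dropped.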

Now that we are sure the data is free from outliers, we run a Q-Q plot test against a normal distribution with the help of the scipy library. The same comparison can also be done by generating a normal sample with NumPy using random.normal(mean, std, size).
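Both routes can be sketched side by side. `scipy.stats.probplot` compares the sorted data with theoretical normal quantiles, while the NumPy route compares it with a sorted random normal sample. The `heights` array is again a generated stand-in, and matching the reference sample's mean and standard deviation to the data is an assumption of this sketch.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Stand-in sample (assumed mean 180 cm, std 7 cm); replace with the real heights.
rng = np.random.default_rng(0)
heights = rng.normal(loc=180, scale=7, size=43)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Option 1: scipy plots ordered data against theoretical normal quantiles
# and draws a least-squares line through the points.
(theoretical_q, ordered), (slope, intercept, r) = stats.probplot(
    heights, dist="norm", plot=ax1
)
ax1.set_title("Q-Q plot via scipy.stats.probplot")

# Option 2: compare sorted data with a sorted random normal sample of the
# same size, mean, and standard deviation.
reference = rng.normal(heights.mean(), heights.std(), size=heights.size)
ax2.scatter(np.sort(reference), np.sort(heights))
limits = [heights.min(), heights.max()]
ax2.plot(limits, limits, color="red")  # y = x reference line
ax2.set_xlabel("random normal sample (sorted)")
ax2.set_ylabel("heights (sorted)")
ax2.set_title("Q-Q plot via NumPy")

fig.savefig("heights_qq.png")
print("correlation of the probplot fit:", round(float(r), 3))
```

The second panel changes from run to run unless the random generator is seeded, which is one reason to prefer theoretical quantiles for small samples.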

With a Q-Q plot, we check whether the points fall along a straight line. If they do, the data and the reference distribution are of the same type, which in our case is the normal distribution. Although the data mostly follows the straight line, a few points don't lie on the red line. This suggests that the Q-Q plot may not be the best choice when we have little data, i.e. only 43 data points in our case. The lesson: when data is scarce, use another check as well, such as a histogram.

And indeed, the histogram of our data shows the bell curve that is commonly seen with normal distributions. To further convince you, I'll plot another histogram of a randomly generated normal distribution with NumPy.
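A minimal sketch of the two histograms follows. The `heights` array is a generated stand-in for the real data, and the bin counts and the reference sample size of 10,000 are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in sample (assumed mean 180 cm, std 7 cm); replace with the real heights.
rng = np.random.default_rng(0)
heights = rng.normal(loc=180, scale=7, size=43)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram of the observed data.
counts, edges, _ = ax1.hist(heights, bins=10)
ax1.set_xlabel("height (cm)")
ax1.set_title("Observed heights")

# Histogram of a large random normal sample with the same mean and std.
reference = rng.normal(heights.mean(), heights.std(), size=10_000)
ax2.hist(reference, bins=30)
ax2.set_xlabel("height (cm)")
ax2.set_title("Randomly generated normal sample")

fig.savefig("heights_hist.png")
```

With only 43 points the left panel will look rougher than the right one, so the comparison is about overall shape (one central peak, roughly symmetric tails) and not an exact match.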

Hence, we can confidently conclude that our data, i.e. the heights of US presidents, follows a normal distribution.


Mohd Saif Ali

Data Scientist Seeking Cure for lack of melatonin owing to the love of DATA