The Normal Distribution.

Bharadwaj Narayanam
Published in AlmaBetter
3 min read · Jul 18, 2021


Forget about the distribution. Just think, what is normal? Doing a 9–5 job is normal. Being able to walk is normal. Getting married is normal. Let’s get to the other side of the story. Someone starting up their own company is rare. Very few people are physically handicapped, and come on, who stays single these days? (Cries in the corner)

Photo by Samuel Regan-Asante on Unsplash

Ever wondered why it is normal? It is because most other people are doing it too.

You will have to learn to live with the normal distribution if you belong to the field of data science.

Let us get a bit technical now. As usual, I will start with an example.

Say, you are an average student like me and are really worried about the marks other students scored in your class. You just want to check if your marks are normal.

The full marks were 100 and you scored 73. You found out that the class average is 50, with a standard deviation of 23. Now, what is the standard deviation? It is a quantity that measures how much the members of a group differ from the mean. You can refer to this article to know about it in detail and compute it.
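To make this concrete, here is a minimal Python sketch of both ideas. The list of marks is made up purely for illustration (the article does not give the class data), and the helper name z_score is my own. The last part uses the article's numbers (score 73, mean 50, standard deviation 23) to see how far the score sits from the mean.

```python
import statistics

# Hypothetical marks for a small class (made up for illustration only).
marks = [20, 35, 50, 50, 65, 80]

mean = statistics.mean(marks)
# pstdev treats the list as the whole class (a population), not a sample.
std_dev = statistics.pstdev(marks)
print(mean, round(std_dev, 2))


def z_score(score, mean, std_dev):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / std_dev


# The article's numbers: score 73, class mean 50, standard deviation 23.
z = z_score(73, 50, 23)
print(z)  # 1.0
```

A score of 73 is exactly one standard deviation above the class mean.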

Look at the image below and try to understand it. It is a graph of the above example. In a normal distribution the mean and the mode are the same, so both sit at 50 here.

Let me tell you, a distribution curve peaks at its mode. In our case, the mean and the mode are both 50, so the curve peaks at 50.
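If you want to check the peak yourself, the sketch below evaluates the standard normal density formula at every whole mark from 0 to 100 and picks the tallest point. It assumes the class marks follow a normal curve with mean 50 and standard deviation 23 exactly, which real marks only approximate. The function name normal_pdf is my own.

```python
import math


def normal_pdf(x, mean, std_dev):
    """Height of the normal curve at x."""
    coeff = 1.0 / (std_dev * math.sqrt(2 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mean) / std_dev) ** 2)


MEAN, STD_DEV = 50, 23

# Height of the curve at every whole mark from 0 to 100.
heights = {mark: normal_pdf(mark, MEAN, STD_DEV) for mark in range(0, 101)}

# The mark with the tallest curve is the mode.
peak_mark = max(heights, key=heights.get)
print(peak_mark)  # 50
```

The curve is also symmetric around the peak: its height at 27 equals its height at 73.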

You know that you are normal. But with what confidence can you say that? Is there any measure of your confidence? It would be cool to measure our confidence, wouldn't it?

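We have something called a confidence interval. Given a certain range of marks, with what confidence can you say that a score lies in that range? For a normal distribution, about 68% of the values lie within one standard deviation of the mean and about 95% within two. In our example, that puts roughly 68% of the class between 27 and 73.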

This is measured by certain statistics, which are out of scope for now.
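Without going into the theory, here is a rough sketch of how the share of scores inside a range can be computed. It assumes the marks follow a normal curve with mean 50 and standard deviation 23 exactly. The helper names normal_cdf and share_between are my own, not from any library.

```python
import math


def normal_cdf(x, mean, std_dev):
    """Share of a normal distribution that lies at or below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std_dev * math.sqrt(2))))


def share_between(low, high, mean, std_dev):
    """Share of a normal distribution that lies between low and high."""
    return normal_cdf(high, mean, std_dev) - normal_cdf(low, mean, std_dev)


MEAN, STD_DEV = 50, 23

# One standard deviation on either side of the mean: 27 to 73.
within_one_sd = share_between(27, 73, MEAN, STD_DEV)
# Share of the class scoring 73 or less.
below_73 = normal_cdf(73, MEAN, STD_DEV)

print(round(within_one_sd, 4))
print(round(below_73, 4))
```

Under these assumptions, roughly two thirds of the class scores between 27 and 73, and a score of 73 beats about 84% of the class.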

Skewness

Now, what if the mean is 50, but the mode is somewhere around 62? The peak shifts to the right and a longer tail stretches out to the left, which makes it a negatively skewed (left-skewed) distribution. If the peak of the distribution is away from the Y-axis, it is negatively skewed. It is called so because its long tail points toward the lower values, while most of its values sit on the right. The positively skewed distribution is the exact opposite: the peak is close to the Y-axis and the long tail stretches to the right.
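The sign of the skew can be checked with a few lines of Python. The sketch below uses the common moment-based skewness formula (third central moment divided by the 1.5th power of the second) on a made-up list of marks whose mean is 50 and whose mode is 62. Both the data and the function name skewness are my own assumptions for illustration.

```python
import statistics


def skewness(values):
    """Moment-based skewness: negative for a long left tail, positive for a long right tail."""
    mean = statistics.mean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical marks: mean 50, mode 62, with a few low scores forming a left tail.
marks = [5, 20, 45, 58, 62, 62, 62, 62, 74]
print(statistics.mean(marks), statistics.mode(marks))
print(skewness(marks) < 0)  # True: negatively skewed

# Mirroring the marks around 50 flips the tail to the right.
mirrored = [100 - m for m in marks]
print(skewness(mirrored) > 0)  # True: positively skewed
```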

What happens if something is not normal? In most cases, it is considered an outlier and is either removed from the data or replaced.
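One simple way to spot such values is to flag anything that lies too many standard deviations away from the mean. The sketch below is only one of several possible rules. The cutoff of 2 standard deviations, the made-up marks, and the choice of the median as the replacement value are all assumptions of mine, not something the article prescribes.

```python
import statistics

# Hypothetical marks with one unusually high score.
marks = [48, 52, 50, 47, 53, 49, 51, 50, 98]

mean = statistics.mean(marks)
std_dev = statistics.pstdev(marks)
THRESHOLD = 2  # assumed cutoff, in standard deviations


def is_outlier(value):
    """True if the value lies more than THRESHOLD standard deviations from the mean."""
    return abs(value - mean) / std_dev > THRESHOLD


outliers = [m for m in marks if is_outlier(m)]
print(outliers)  # [98]

# Option 1: remove the outliers.
kept = [m for m in marks if not is_outlier(m)]

# Option 2: replace each outlier with the median of the original marks.
median = statistics.median(marks)
replaced = [median if is_outlier(m) else m for m in marks]
print(replaced)
```

Removing shrinks the data set, while replacing keeps its size. Which one is better depends on the data and the problem at hand.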

What do you think of yourself? Are you NORMAL?

Bharadwaj Narayanam
AlmaBetter

On a mission of writing 100 quality articles related to statistics and data science.