How to interpret a box plot?

I have some data with 5 categorical explanatory variables (concern, breath, weath, sleep, act) and 1 continuous response variable (tto). Each categorical explanatory variable has 5 levels, which show how strongly a person feels about it. Level 1 and level 5 represent the perfect and worst states, respectively.

I was advised to create a box plot to see the relationship between the explanatory variables and the response variable. The plot is given below. However, I do not know how to read a box plot. Can anyone please help me interpret it?



Interpreting a box plot (also called a box-and-whisker plot) rests on understanding that it is a graphical representation of a five-number summary: minimum, 1st quartile, median, 3rd quartile and maximum. The box runs from the 1st to the 3rd quartile, so it encompasses the middle 50% of the observations, and the line inside it marks the median. The whiskers (the vertical lines emanating from the top and bottom of the box) typically end at the minimum and maximum. However, where possible outliers exist (a common rule flags values lying more than 1.5 $\times$ the interquartile range beyond the quartiles), the whiskers stop at the most extreme values that are not flagged, and the flagged values are drawn as individual points, as is the case in your figure.
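To make that concrete, here is a minimal Python sketch of the numbers a box plot draws for a single group. The function name `box_plot_summary`, the sample values and the choice of the "inclusive" quartile method are my own assumptions for illustration; plotting software differs in how it computes quartiles, so your figure may use slightly different values.

```python
import statistics

def box_plot_summary(values, k=1.5):
    """Return the quantities a box plot with the 1.5 x IQR rule draws for one group."""
    data = sorted(values)
    # Quartiles: "inclusive" is one of several conventions (an assumption here).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    # Anything beyond the fences is drawn as an individual point.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1],
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Made-up values standing in for tto at one level (not the asker's data).
sample = [-0.2, 0.55, 0.6, 0.7, 0.75, 0.8, 0.85, 0.9, 1.0]
summary = box_plot_summary(sample)
for name, value in summary.items():
    print(name, value)
```

In this made-up sample the lowest value falls below the lower fence, so the lower whisker stops at the next value up and the lowest value would appear as a separate point, which is the pattern you see when a group has a long lower tail.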

It may be useful to look at histograms or density plots for specific categories of the data, as that may help you understand what the box plot is saying.
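As a rough way to do that without any plotting library, the sketch below prints a text histogram for one group. The helper `text_histogram`, the bin width and the sample values are hypothetical, chosen only to illustrate the idea; with real data you would run it once per level of each variable.

```python
import math
from collections import Counter

def text_histogram(values, bin_width=0.25):
    """Count values per bin, where a value's bin index is floor(value / bin_width)."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    lines = []
    for b in range(min(counts), max(counts) + 1):
        left_edge = b * bin_width
        # One '#' per observation in the bin; empty bins print no marks.
        lines.append(f"{left_edge:6.2f} | {'#' * counts.get(b, 0)}")
    return counts, lines

# Made-up values standing in for tto at one level (not the asker's data).
sample = [-0.2, 0.55, 0.6, 0.7, 0.75, 0.8, 0.85, 0.9, 1.0]
counts, lines = text_histogram(sample)
print("\n".join(lines))
```

A histogram like this shows the same features as the box plot in a different form: a cluster of values near the top of the range and a sparse tail stretching downwards.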

@Glen_b rightly indicates that left skew is evident and that the central tendency for the 5th level of strength of feeling is lower than for the others. It is difficult, however, to tell from the plot alone whether that difference is statistically significant.

Source : Link , Question Author : Günal , Answer Author : Nick Cox
