The `levels` parameter describes the cumulative probability mass that lies below a given threshold. The documentation explains it with an example:

> Number of contour levels or values to draw contours at. A vector argument must have increasing values in [0, 1]. Levels correspond to iso-proportions of the density: e.g., 20% of the probability mass will lie below the contour drawn for 0.2. Only relevant with bivariate data.
You can specify levels in two ways:
- Give the number of partitions you want in the probability mass as an integer (`levels=5` makes 4 contour lines that split the probability mass into 5 parts)
- Give the threshold for each contour explicitly as a vector
The thresholds refer to the probability mass outside each contour. So 0.2 means that 20% of the probability mass lies outside the contour drawn for 0.2. Playing around with the following code makes this clearer.
Both approaches are shown below for reference.
import seaborn as sns

geyser = sns.load_dataset("geyser")

# Levels as an integer: evenly spaced cuts in the probability mass
sns.kdeplot(
    data=geyser, x="waiting", y="duration", hue="kind",
    levels=5,
)

# Levels as explicitly listed cuts in the probability mass
sns.kdeplot(
    data=geyser, x="waiting", y="duration", hue="kind",
    levels=[0.3, 0.4, 0.8],
)
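To make the "iso-proportion" idea concrete, here is a minimal sketch of how a proportion such as 0.2 can be turned into a density value to draw a contour at. It is not seaborn's actual code: the helper `density_threshold` and the toy density (a standard bivariate normal on a regular grid) are assumptions made for illustration, and it uses only the standard library.

```python
import math


def density_threshold(density_values, proportion):
    """Return the density value below which roughly `proportion` of the total mass lies.

    Hypothetical helper for illustration, not part of seaborn.
    """
    ordered = sorted(density_values)  # lowest density first
    total = sum(ordered)
    cumulative = 0.0
    for value in ordered:
        cumulative += value
        if cumulative / total >= proportion:
            return value
    return ordered[-1]


# Toy density: a standard bivariate normal evaluated on a regular grid.
step = 0.1
grid = [i * step for i in range(-40, 41)]
density = [
    math.exp(-(x * x + y * y) / 2) / (2 * math.pi)
    for x in grid
    for y in grid
]

t20 = density_threshold(density, 0.2)  # contour height for level 0.2
t80 = density_threshold(density, 0.8)  # contour height for level 0.8

# Share of the total mass sitting at or below the 0.2 contour height,
# i.e. in the low-density region outside that contour.
mass_outside = sum(d for d in density if d <= t20) / sum(density)
```

Here `mass_outside` comes out close to 0.2, and `t80` is a higher density value than `t20`, so the 0.8 contour is the tighter, more central one.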
If `levels` is set to a single number, it is supposed to be the number of contour lines (or areas in case `fill=True`). When `levels` is an array, each of the entries defines a contour line; these numbers should be between 0 and 1 (close to 0 means almost all samples will fit into the contour; close to 1 means only the most central samples will fit into the contour). An array with one element will output exactly one contour line. – Witching