Show correlation values in pairplot
E

3

18

I have the following data:

prop_tenure  prop_12m  prop_6m  
0.00         0.00      0.00   
0.00         0.00      0.00   
0.06         0.06      0.10   
0.38         0.38      0.25   
0.61         0.61      0.66   
0.01         0.01      0.02   
0.10         0.10      0.12   
0.04         0.04      0.04   
0.22         0.22      0.22 

and I am drawing a pairplot like this:

sns.pairplot(data)
plt.show()

However, I would like to display the correlation coefficient for each pair of variables and, if possible, the skewness and kurtosis of each variable. How can I do that in seaborn?

Evalynevan answered 13/6, 2018 at 8:11 Comment(0)
G
46

As far as I'm aware, there is no out-of-the-box function to do this, so you'll have to create your own:

from scipy.stats import pearsonr
import matplotlib.pyplot as plt 

def corrfunc(x, y, ax=None, **kws):
    """Plot the correlation coefficient in the top left hand corner of a plot."""
    r, _ = pearsonr(x, y)
    ax = ax or plt.gca()
    ax.annotate(f'ρ = {r:.2f}', xy=(.1, .9), xycoords=ax.transAxes)

Example using your input:

import seaborn as sns; sns.set(style='white')
import pandas as pd

data = {'prop_tenure': [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22], 
        'prop_12m':    [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22], 
        'prop_6m':     [0.0, 0.0, 0.10, 0.25, 0.66, 0.02, 0.12, 0.04, 0.22]}

df = pd.DataFrame(data)

g = sns.pairplot(df)
g.map_lower(corrfunc)
plt.show()


Graphic answered 13/6, 2018 at 10:36 Comment(1)
Here is a nice solution for multiple hues: #43251521 - Hollyhock
M
1

Just to mention: in more recent versions of seaborn (>0.11.0), the answer above no longer works. Adding a hue=None parameter to the function signature makes it work again.

def corrfunc(x, y, hue=None, ax=None, **kws):
    """Plot the correlation coefficient in the top left hand corner of a plot."""
    r, _ = pearsonr(x, y)
    ax = ax or plt.gca()
    ax.annotate(f'ρ = {r:.2f}', xy=(.1, .9), xycoords=ax.transAxes)

For reference, see this issue: https://github.com/mwaskom/seaborn/issues/2307#issuecomment-702980853
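The difference between the two versions is only in the function signature, which you can inspect directly. This is a small sketch that assumes, following the linked issue, that seaborn looks at whether the mapped function declares a `hue` parameter. `corrfunc_old` and `corrfunc_new` are hypothetical names for empty stubs that mirror the two signatures.

```python
from inspect import signature


def corrfunc_old(x, y, ax=None, **kws):
    """Signature from the first answer: no explicit hue parameter."""


def corrfunc_new(x, y, hue=None, ax=None, **kws):
    """Signature from this answer: hue is an explicit keyword parameter."""


for func in (corrfunc_old, corrfunc_new):
    # **kws would swallow a hue keyword, but it does not count as a declared parameter.
    has_hue = "hue" in signature(func).parameters
    print(f"{func.__name__}: declares hue -> {has_hue}")
```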

Melioration answered 3/6, 2022 at 6:51 Comment(1)
The original code works fine for me using seaborn version 0.11.2 - Graphic
D
-1

If you want to show the correlation value for each hue level, I modified the above code as follows. Give it a like if you find it useful.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy.stats import pearsonr


def corrfunc(x, y, hue=None, ax=None, **kws):
    '''Plot the correlation coefficient in the top left hand corner of a plot.'''
    ax = ax or plt.gca()
    if hue is not None:
        # Per-hue correlations; note that this reads the module-level PairGrid `g`.
        hue_order = pd.unique(g.hue_vals)
        colors = sns.color_palette('tab10', len(hue_order))
        r_values = {}
        for name, group in x.groupby(g.hue_vals):
            # Keep only rows where both x and y are present.
            mask = (~group.isnull()) & (~y[group.index].isnull())
            if mask.sum() > 1:  # pearsonr needs at least two points
                r, _ = pearsonr(group[mask], y[group.index][mask])
                r_values[name] = r
        # One line per hue level, stacked downwards and colored like the palette.
        for i, name in enumerate(hue_order):
            if name in r_values:
                ax.annotate(f'{name}: ρ = {r_values[name]:.2f}', xy=(.02, .98-i*.05),
                            xycoords='axes fraction', ha='left', va='top',
                            color=colors[i], fontsize=10)
    else:
        mask = (~x.isnull()) & (~y.isnull())
        if mask.sum() > 1:
            r, _ = pearsonr(x[mask], y[mask])
            ax.annotate(f'ρ = {r:.2f}', xy=(.02, .98), xycoords='axes fraction',
                        ha='left', va='top', color='grey', fontsize=10)


penguins = sns.load_dataset('penguins')
g = sns.pairplot(penguins, hue='species', diag_kind='hist', kind='reg',
                 plot_kws={'line_kws': {'color': 'red'}})
g.map_lower(corrfunc, hue='species')
plt.show()
Demurrage answered 4/4, 2023 at 6:6 Comment(1)
Thank you for contributing to the Stack Overflow community. This may be a correct answer, but it’d be really useful to provide additional explanation of your code so developers can understand your reasoning. This is especially useful for new developers who aren’t as familiar with the syntax or struggling to understand the concepts. Would you kindly edit your answer to include additional details for the benefit of the community? - Weathered
