Plotting in sorted order using Plotnine

I have a dataframe that I am trying to plot with plotnine. I would like the data points to appear in sorted order along the x-axis. I tried sorting the dataframe before passing it to ggplot, but the order is disregarded. My data is below; I want to sort on the 'value' column.

       var1     var2     value  direction
0      PM25     PBAR  0.012001          1
1      PM25  DELTA_T  0.091262          1
2      PM25       RH  0.105857          1
3      PM25      WDV  0.119452          0
4      PM25     T10M  0.119506          0
5      PM25      T2M  0.129869          0
6      PM25     SRAD  0.134718          0
7      PM25      WSA  0.169000          0
8      PM25      WSM  0.174202          0
9      PM25      WSV  0.181596          0
10     PM25      SGT  0.263590          1

This is what my code looks like currently:

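import numpy as np
from plotnine import (ggplot, aes, geom_point, ylab, ggtitle,
                      theme_classic, scale_y_continuous)

tix = np.linspace(0,.3,10)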
corr = corr.sort_values(by='value').reset_index(drop = True)
p = ggplot(data = corr, mapping = aes(x='var2', y='value')) +\
  geom_point(mapping = aes(fill = 'direction')) + ylab('Correlation') + ggtitle('Correlation to PM25') +\
  theme_classic() +  scale_y_continuous(breaks = tix, limits = [0, .3])

print(p)

This produces the following plot:

[image: the resulting plot]

Wessling answered 22/6, 2020 at 4:9 Comment(2)
By default, ggplot2 treats any text variable on the x-axis as a factor and sets its levels in alphabetical order. To get the behavior you want, make that variable a factor with its levels in the order you provide, then recreate the plot. - Mixon
But as I hit submit, I realized I had missed that this is a Python implementation. The comment above describes what you would do in R; it may or may not still apply. - Mixon

You can do it in two ways:

  1. Make sure the variable mapped to the x-axis is a categorical and the categories are ordered correctly. Below I use the fact that pd.unique returns values in order of appearance.
# sort_values returns a new dataframe, so assign the result back
corr = corr.sort_values(by='value').reset_index(drop = True)
corr['var2'] = pd.Categorical(corr.var2, categories=pd.unique(corr.var2))
...
  2. Plotnine has an internal function, reorder (introduced in v0.7.0), which you can use inside an aes() call to order the values of one variable by the values of another. See the documentation for reorder at the bottom of the page.
# no need to sort values
p = ggplot(data = corr, mapping = aes(x='reorder(var2, value)', y='value')) +\
...
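
For reference, here is a minimal sketch of both ways using the data from the question. The shuffle step and the helper name make_plots are my own additions for illustration, and plotnine is imported inside the helper only so that the ordering logic can be checked without drawing anything. It assumes plotnine v0.7.0 or later for reorder.

```python
import pandas as pd

# Data copied from the question.
corr = pd.DataFrame({
    "var1": ["PM25"] * 11,
    "var2": ["PBAR", "DELTA_T", "RH", "WDV", "T10M", "T2M",
             "SRAD", "WSA", "WSM", "WSV", "SGT"],
    "value": [0.012001, 0.091262, 0.105857, 0.119452, 0.119506, 0.129869,
              0.134718, 0.169000, 0.174202, 0.181596, 0.263590],
    "direction": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1],
})

# Shuffle the rows (illustration only) so the sort below visibly matters.
corr = corr.sample(frac=1, random_state=0).reset_index(drop=True)

# Way 1: sort, then freeze the sorted order into the category levels.
# pd.unique returns values in order of appearance.
corr = corr.sort_values(by="value").reset_index(drop=True)
corr["var2"] = pd.Categorical(corr["var2"], categories=pd.unique(corr["var2"]))


def make_plots(df):
    """Hypothetical helper: build one plot per approach."""
    from plotnine import ggplot, aes, geom_point, ylab

    # Way 1: the x-axis follows the category order set above.
    by_category = (
        ggplot(df, aes(x="var2", y="value"))
        + geom_point(aes(fill="direction"))
        + ylab("Correlation")
    )

    # Way 2: start from plain strings and let reorder() do the ordering.
    plain = df.assign(var2=df["var2"].astype(str))
    by_reorder = (
        ggplot(plain, aes(x="reorder(var2, value)", y="value"))
        + geom_point(aes(fill="direction"))
        + ylab("Correlation")
    )
    return by_category, by_reorder
```

Calling make_plots(corr) should give two plots with the same x-axis order, PBAR first and SGT last.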
Stuck answered 22/6, 2020 at 20:22 Comment(1)
The link to the reorder documentation is dead; could you update it? The first answer doesn't actually sort anything (the lines don't have any effect), so I would like to try reorder. - Hoke

I couldn't get reorder() to work but I was able to use scale_x_discrete() to control the order. See https://mcmap.net/q/1769651/-plotnine-bar-plot-order-by-variable
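
I have not copied the linked answer here, so treat this as a rough sketch of the general idea rather than that answer's code: compute the order you want as a list and pass it to scale_x_discrete through its limits argument. The five rows are taken from the question, and the helper name make_plot is made up for illustration.

```python
import pandas as pd

# A few rows from the question, deliberately out of order.
corr = pd.DataFrame({
    "var2": ["WSV", "PBAR", "SGT", "RH", "T2M"],
    "value": [0.181596, 0.012001, 0.263590, 0.105857, 0.129869],
})

# The x-axis order we want: var2 labels sorted by their value.
order = corr.sort_values(by="value")["var2"].tolist()


def make_plot(df, limits):
    """Hypothetical helper: plot var2 against value with an explicit x order."""
    from plotnine import ggplot, aes, geom_point, scale_x_discrete

    return (
        ggplot(df, aes(x="var2", y="value"))
        + geom_point()
        + scale_x_discrete(limits=limits)
    )
```

With this approach the dataframe itself does not need to be sorted or converted to a categorical; make_plot(corr, order) takes the axis order from the list alone.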

Semipostal answered 25/8, 2020 at 12:34 Comment(1)
This worked for me. - Clayberg