Plot dataframe columns against each other

I have this DataFrame:

      CET    MaxTemp  MeanTemp MinTemp  MaxHumidity  MeanHumidity  MinHumidity  revenue     events
0  2016-11-17   11      9        7            100           85             63   385.943800    rain
1  2016-11-18   9       6        3             93           83             66  1074.160340    storm
2  2016-11-19   8       6        4             93           87             76  2980.857860    
3  2016-11-20   10      7        4             93           84             81  1919.723960    rain-thunderstorm
4  2016-11-21   14     10        7            100           89             77   884.279340
5  2016-11-22   13     10        7             93           79             63   869.071070
6  2016-11-23   11      8        5            100           91             82   760.289260    fog-rain
7  2016-11-24   9       7        4             93           80             66  2481.689270
8  2016-11-25   7       4        1             87           74             57  2745.990070
9  2016-11-26   7       3       -1            100           88             61  2273.413250    rain 
10 2016-11-27  10       7        4            100           81             66  2630.414900    fog

Where:

CET                  object
Mean TemperatureC     int64
Mean Humidity         int64
Events               object
revenue              object
dtype: object

I want to plot all the columns against each other to see how they vary over time. The x-axis should be the CET column and the y-axis should show the rest of the columns. How can I do that? I used:

import matplotlib.pyplot as plt

plt.figure()
df.plot(kind='line')
plt.xticks(rotation='vertical')
plt.yticks()
plt.show()

but I can only see Mean TemperatureC and Mean Humidity. Moreover, the x-axis shows the row number instead of the CET date values.

Taveras answered 4/1, 2017 at 14:14 Comment(7)
This will surely help you create multiple y-axes. - Serafina
See the "Multiple y-axes" portion of the linked page. - Serafina
I want to use the df columns as the x- and y-axes (instead of x=[1, 2, 3], y=[40, 50, 60]), but it gives me a KeyError. Why is that? - Taveras
Simply put, x must be the list of CET values and y must be the list of corresponding values from another column. - Serafina
How can I set x to a list of CET values? - Taveras
Make a list of the CET values as shown in your df, then assign it to x. - Serafina
You mean manually make a list of all my current CET values? The dataframe changes every day, so it is not very practical to do that for every single date and update it daily. - Taveras

As far as I remember, plot uses the index for the x values. Try:

df.set_index('CET').plot()

And you should make sure that all your columns have a numeric datatype.

Edit:

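import matplotlib.pyplot as plt

# Move CET into the index so it is used for the x-axis
df = df.set_index('CET')
# Only these columns can be converted to numbers
num_cols = ['MaxTemp',
            'MeanTemp',
            'MinTemp',
            'MaxHumidity',
            'MeanHumidity',
            'MinHumidity',
            'revenue']
df[num_cols] = df[num_cols].astype(float)
df[num_cols].plot()
# Label every row position with its CET value
plt.xticks(range(len(df.index)), df.index, rotation='vertical')
plt.show()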
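
A variation on the snippet above, offered only as a sketch: assuming CET holds ISO date strings and revenue arrived as strings (both suggested by the dtypes in the question), pd.to_datetime and pd.to_numeric can do the conversion, and select_dtypes leaves text columns such as events out of the plot. The sample frame is a shortened, hand-typed subset of the question's data, and the one-tick-per-day locator is my own choice, not something from the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import pandas as pd

# Hand-typed subset of the question's data (assumption: revenue is stored as text).
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MeanTemp": [9, 6, 6, 7],
    "MeanHumidity": [85, 83, 87, 84],
    "revenue": ["385.943800", "1074.160340", "2980.857860", "1919.723960"],
    "events": ["rain", "storm", "", "rain-thunderstorm"],
})

# Convert the date strings to datetimes and the revenue strings to floats.
df["CET"] = pd.to_datetime(df["CET"])
df["revenue"] = pd.to_numeric(df["revenue"], errors="coerce")

# Keep only numeric columns; the text column "events" is dropped from the plot.
numeric = df.set_index("CET").select_dtypes(include="number")

# x_compat=True makes pandas use plain matplotlib date handling,
# so matplotlib's date locators and formatters can be applied.
ax = numeric.plot(kind="line", x_compat=True)
ax.xaxis.set_major_locator(mdates.DayLocator())  # one tick per day
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
ax.figure.autofmt_xdate(rotation=90)
```

With a real datetime index the ticks stay correct when the frame gains new rows each day, which avoids rebuilding a label list by hand.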
Ice answered 4/1, 2017 at 14:25 Comment(10)
How can I show all values of the 'CET' column on the x-axis? It only shows a few of them as xticks, not all. - Taveras
Also, since all of my columns have to be numeric, how can I convert a column of type "object" to "int64"? I tried df = df.convert_objects(convert_numeric = True), but the "Events" column remains object. - Taveras
There is no 'events' column in your example data. But I updated my answer. - Ice
You're right, I updated the question to add the Events column. I tried your suggestion, but I get this error: ValueError: could not convert string to float: 'fog-rain' - Taveras
Of course you can't convert a string to a number. Which number would represent 'fog'? ;-) So when you convert the values, make sure you select only the numeric columns. - Ice
Is there a way to show how the revenue changes over time, depending on the event? - Taveras
You could add markers (vertical lines) where an event occurred. In this post someone had a similar problem: #21639227 - Ice
You should also consider adding multiple y-axes, since the range of the y-values differs a lot between columns (MinTemp [-1, 7], revenue 100+). - Ice
In this example, for trace1 they use x=[1, 2, 3], y=[40, 50, 60]. I want to use x=df['CET'], y=df['Mean TemperatureC'], but it hits a KeyError. Do you know why? - Taveras
Because with df.set_index('CET') you move CET from the columns to the index. You can also use df.plot(x='CET', y='Mean TemperatureC') before setting CET as the index. You can find many example plots here: pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html - Ice
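
The two suggestions from the comments above (a second y-axis for revenue, and vertical markers where an event occurred) could be combined roughly as follows. This is a sketch, not the commenters' code: the sample frame is a hand-typed subset of the question's data, CET is kept as strings so that the x positions are simply 0..n-1, and the marker styling is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hand-typed subset of the question's data; an empty string means "no event".
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MeanTemp": [9, 6, 6, 7],
    "revenue": [385.9438, 1074.16034, 2980.85786, 1919.72396],
    "events": ["rain", "storm", "", "rain-thunderstorm"],
})

# Temperature on the left axis, revenue on a secondary (right) axis.
ax = df.plot(x="CET", y=["MeanTemp", "revenue"], secondary_y=["revenue"])
ax.set_ylabel("MeanTemp")
ax.right_ax.set_ylabel("revenue")

# Draw a dotted vertical line at each row that has an event label.
marked = []
for pos, label in enumerate(df["events"]):
    if label:
        ax.axvline(pos, color="grey", linestyle=":")
        # x is in data coordinates, y in axes coordinates (0 = bottom, 1 = top)
        ax.text(pos, 0.98, label, rotation=90, va="top",
                transform=ax.get_xaxis_transform())
        marked.append(pos)
```

Because the index is non-numeric text, pandas places the rows at integer positions, which is why enumerate gives usable x coordinates for the markers here.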

The pandas plotting routines such as plot.line or plot.scatter accept column names for the x and y arguments:

E.g.

>>> lines = df.plot.line(x='pig', y='horse')
>>> ax1 = df.plot.scatter(x='length',
...                       y='width',
...                       c='DarkBlue')
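
Applied to the column names from the question, the same calls might look like the sketch below. The sample frame is a hand-typed subset of the question's data, and pairing MeanTemp with revenue in the scatter plot is only an illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hand-typed subset of the question's data.
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MeanTemp": [9, 6, 6, 7],
    "MeanHumidity": [85, 83, 87, 84],
    "revenue": [385.9438, 1074.16034, 2980.85786, 1919.72396],
})

# One line per y column, with CET along the x-axis.
line_ax = df.plot.line(x="CET", y=["MeanTemp", "MeanHumidity"])

# One numeric column against another.
scatter_ax = df.plot.scatter(x="MeanTemp", y="revenue", c="DarkBlue")
```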
Danner answered 6/1, 2021 at 20:22 Comment(0)

To plot columns against each other, you could use a pairplot.

Isaak answered 20/5, 2021 at 4:9 Comment(0)

Check out seaborn's pairplot and pandas' scatter_matrix.
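
As a minimal sketch of the pandas option, pandas.plotting.scatter_matrix draws every numeric column against every other one in a grid. The sample frame is a hand-typed subset of the question's data; seaborn's pairplot produces a similar grid if seaborn is installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
from pandas.plotting import scatter_matrix

# Hand-typed subset of the question's data (numeric columns only).
df = pd.DataFrame({
    "MeanTemp": [9, 6, 6, 7, 10, 10],
    "MeanHumidity": [85, 83, 87, 84, 89, 79],
    "revenue": [385.9438, 1074.16034, 2980.85786,
                1919.72396, 884.27934, 869.07107],
})

# Returns a 2-D array of Axes: scatter plots off the diagonal,
# histograms of each column on the diagonal.
axes = scatter_matrix(df, diagonal="hist", figsize=(6, 6))
```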

Beezer answered 22/1, 2022 at 7:30 Comment(0)
