Plot dataframe columns against each other

Asked 4/1, 2017 at 14:14 Answered 22/1, 2022 at 7:30

I have this df:

      CET    MaxTemp  MeanTemp MinTemp  MaxHumidity  MeanHumidity  MinHumidity  revenue     events
0  2016-11-17   11      9        7            100           85             63   385.943800    rain
1  2016-11-18   9       6        3             93           83             66  1074.160340    storm
2  2016-11-19   8       6        4             93           87             76  2980.857860    
3  2016-11-20   10      7        4             93           84             81  1919.723960    rain-thunderstorm
4  2016-11-21   14     10        7            100           89             77   884.279340
5  2016-11-22   13     10        7             93           79             63   869.071070
6  2016-11-23   11      8        5            100           91             82   760.289260    fog-rain
7  2016-11-24   9       7        4             93           80             66  2481.689270
8  2016-11-25   7       4        1             87           74             57  2745.990070
9  2016-11-26   7       3       -1            100           88             61  2273.413250    rain 
10 2016-11-27  10       7        4            100           81             66  2630.414900    fog

Where:

CET                  object
Mean TemperatureC     int64
Mean Humidity         int64
Events               object
revenue              object
dtype: object

I want to plot all the columns against each other to see how they vary over time. The x-axis should be the CET column and the y-axis should show the rest of the columns. How can I do that? I tried:

plt.figure();
df.plot(kind='line')
plt.xticks(rotation='vertical')
plt.yticks()
pylab.show()

But I can only see Mean TemperatureC and Mean Humidity. Moreover, the x-axis shows the row numbers rather than the CET date values.

Taveras answered 4/1, 2017 at 14:14 Comment(7)

This will surely help you create multiple y-axes. – Serafina 4/1, 2017 at 14:18

See the Multiple y-axes section on the linked page. – Serafina 4/1, 2017 at 15:9

I want to use the df columns as x and y-axis (instead of x=[1, 2, 3], y=[40, 50, 60] ) but it gives me a key error, why is that? – Taveras 4/1, 2017 at 15:11

Simply put, x must be a list of the CET values and y must be a list of the corresponding values from another column. – Serafina 4/1, 2017 at 15:23

How can I state x as a list of CET values? – Taveras 6/1, 2017 at 9:32

Make a list of the CET values as shown in your df, then assign it to x. – Serafina 6/1, 2017 at 9:34

You mean manually making a list of all my current CET values? The dataframe changes every day, so it is not practical to do this for every single date and update it daily. – Taveras 6/1, 2017 at 9:42

As far as I remember plot uses the index for the x values. Try:

df.set_index('CET').plot()

You should also make sure that all your columns have a numeric datatype.

Edit:

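import matplotlib.pyplot as plt

# Move CET to the index so it is used for the x-axis.
df = df.set_index('CET')
# Only these columns hold numbers; 'events' is text and is left out.
num_cols = ['MaxTemp',
            'MeanTemp',
            'MinTemp',
            'MaxHumidity',
            'MeanHumidity',
            'MinHumidity',
            'revenue']
df[num_cols] = df[num_cols].astype(float)
df[num_cols].plot()
# Label every row position with its CET value so all dates are shown
# (assumes CET is stored as strings, as in the question).
plt.xticks(range(len(df.index)), df.index, rotation='vertical')
plt.show()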
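If some of the numeric columns are stored as strings (revenue is listed as object in the question), a variant of the above is to parse the dates and coerce only the numeric columns. This is a minimal sketch, not the code from the answer: the small dataframe is a stand-in built from the first rows of the question, and the choice of columns in num_cols is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to see the window
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the question's dataframe (first four rows, fewer columns).
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MeanTemp": [9, 6, 6, 7],
    "MeanHumidity": [85, 83, 87, 84],
    "revenue": ["385.943800", "1074.160340", "2980.857860", "1919.723960"],
    "events": ["rain", "storm", None, "rain-thunderstorm"],
})

# Parse the dates and use them as the index, so the x-axis shows dates
# instead of row numbers.
df["CET"] = pd.to_datetime(df["CET"])
df = df.set_index("CET")

# Convert only the columns that are meant to be numeric; 'events' stays text.
# errors="coerce" turns anything unparseable into NaN instead of raising.
num_cols = ["MeanTemp", "MeanHumidity", "revenue"]
df[num_cols] = df[num_cols].apply(pd.to_numeric, errors="coerce")

ax = df[num_cols].plot(kind="line")
ax.tick_params(axis="x", labelrotation=90)
```

Selecting num_cols explicitly keeps the text column out of the conversion, which avoids the "could not convert string to float" error.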

Ice answered 4/1, 2017 at 14:25 Comment(10)

How can I show all values of 'CET' column on x-axis? It only shows a few as xticks, but not all of them – Taveras 4/1, 2017 at 14:28

And also, since all of my columns have to be numeric, how can I transform a type "object" to "int64"? I tried df = df.convert_objects(convert_numeric = True), but the "Events" column remains as object – Taveras 4/1, 2017 at 14:30

There is no 'events' in your example data. But I updated my answer. – Ice 4/1, 2017 at 14:34

You're right, I updated the question adding the Events column. I tried your suggestion, but I get this error ValueError: could not convert string to float: 'fog-rain' – Taveras 4/1, 2017 at 14:38

Of course you can't convert a string to a number. Which number would represent 'fog'? ;-) So when you convert the values, make sure you only select the numeric columns. – Ice 4/1, 2017 at 14:46

Is there a way to be able to represent how the revenue changes over time, depending on the event? – Taveras 4/1, 2017 at 14:47

You could add markers (vertical lines) where an event occurred. In this post someone had a similar problem: #21639227 – Ice 4/1, 2017 at 14:51
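A rough sketch of the vertical-line idea, using matplotlib's axvline. It is not taken from the linked post. The dataframe is a stand-in built from the first rows of the question, and it assumes that days without an event hold None/NaN rather than an empty string.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to see the window
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the question's dataframe (first five rows).
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20", "2016-11-21"],
    "revenue": [385.9438, 1074.16034, 2980.85786, 1919.72396, 884.27934],
    "events": ["rain", "storm", None, "rain-thunderstorm", None],
})

fig, ax = plt.subplots()

# Plot revenue against the row position; the dates are added as tick labels below.
positions = list(range(len(df)))
ax.plot(positions, df["revenue"], label="revenue")

# Draw one dashed vertical line per row that has an event, labelled with its name.
event_rows = df[df["events"].notna()]
for pos, name in zip(event_rows.index, event_rows["events"]):
    ax.axvline(pos, color="gray", linestyle="--")
    ax.annotate(name, xy=(pos, df["revenue"].max()), rotation=90, va="top")

ax.set_xticks(positions)
ax.set_xticklabels(df["CET"].tolist(), rotation=90)
ax.legend()
```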

You should also consider adding multiple y-axes, since the range of your y-values differs a lot between columns (MinTemp [-1,7], revenue 100+) – Ice 4/1, 2017 at 14:55
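One way to get a second y-axis without leaving pandas is the secondary_y argument of DataFrame.plot. A minimal sketch, assuming a cut-down version of the question's dataframe with only two value columns:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to see the window
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the question's dataframe (first four rows, two value columns).
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MeanTemp": [9, 6, 6, 7],
    "revenue": [385.9438, 1074.16034, 2980.85786, 1919.72396],
}).set_index("CET")

# MeanTemp is drawn on the left axis, revenue on a separate right axis,
# so the small temperature values are not flattened by the revenue scale.
ax = df.plot(secondary_y=["revenue"])
ax.set_ylabel("MeanTemp")
ax.right_ax.set_ylabel("revenue")
```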

In this example, for trace1 they use x=[1, 2, 3], y=[40, 50, 60]. I want to use x=df['CET'], y = df['Mean TemperatureC'] , but it hits a KeyError. Do you know why? – Taveras 4/1, 2017 at 15:8

Because with df.set_index('CET') you move CET from the columns to the index. You can also use df.plot(x='CET', y='Mean TemperatureC') before setting CET as the index. Here you can find a lot of example plots: pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html – Ice 4/1, 2017 at 15:53

The pandas plotting routines like plot.line or plot.scatter can take the column names for x and y arguments:

E.g.

>>> lines = df.plot.line(x='pig', y='horse')

>>> ax1 = df.plot.scatter(x='length',
...                       y='width',
...                       c='DarkBlue')
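Applied to the dataframe from the question, that could look like the sketch below. The data is a stand-in copied from the first rows of the question, and passing a list of column names as y is used to draw several lines at once.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to see the window
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the question's dataframe (first four rows).
df = pd.DataFrame({
    "CET": ["2016-11-17", "2016-11-18", "2016-11-19", "2016-11-20"],
    "MaxTemp": [11, 9, 8, 10],
    "MeanTemp": [9, 6, 6, 7],
    "MinTemp": [7, 3, 4, 4],
})

# CET stays a regular column here; x names it and y lists the columns to draw.
ax = df.plot.line(x="CET", y=["MaxTemp", "MeanTemp", "MinTemp"])
ax.tick_params(axis="x", labelrotation=90)
```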
