How to make grouper and axis the same length?

S

3

15

For my assignment I'm supposed to plot the tracks of 20 hurricanes on a map using matplotlib. However, when I run my code I get the error: AssertionError: Grouper and axis must be same length

Here's the code I have:

import numpy as np
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt 
from PIL import *
fig = plt.figure(figsize=(12,12))

ax = fig.add_axes([0.1,0.1,0.8,0.8])

m = Basemap(llcrnrlon=-100.,llcrnrlat=0.,urcrnrlon=-20.,urcrnrlat=57.,
        projection='lcc',lat_1=20.,lat_2=40.,lon_0=-60.,
        resolution ='l',area_thresh=1000.)

m.bluemarble()
m.drawcoastlines(linewidth=0.5)
m.drawcountries(linewidth=0.5)
m.drawstates(linewidth=0.5)

# Creates parallels and meridians

m.drawparallels(np.arange(10.,35.,5.),labels=[1,0,0,1])
m.drawmeridians(np.arange(-120.,-80.,5.),labels=[1,0,0,1])
m.drawmapboundary(fill_color='aqua')

# Opens data file

import pandas as pd
name = [ ]
df = pd.read_csv('louisianastormb.csv')
for name, group in df.groupby([name]):
    latitude = group.lat.values
    longitude = group.lon.values
    x,y = m(longitude, latitude)
    plt.plot(x,y,'y-',linewidth=2 )
    plt.xlabel('Longitude')
    plt.ylabel('Latitude')
    plt.title('20 Hurricanes with Landfall in Louisiana')

plt.savefig('20hurpaths.jpg', dpi=100)

Here's the full error output:

Traceback (most recent call last):
  File "/home/darealmzd/lstorms.py", line 31, in <module>
    for name, group in df.groupby([name]):
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 186, in groupby
    squeeze=squeeze)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 533, in groupby
    return klass(obj, by, **kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 197, in __init__
    level=level, sort=sort)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1325, in _get_grouper
    ping = Grouping(group_axis, gpr, name=name, level=level, sort=sort)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1129, in __init__
    self.grouper = _convert_grouper(index, grouper)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1350, in _convert_grouper
    raise AssertionError('Grouper and axis must be same length')
AssertionError: Grouper and axis must be same length

Soapberry answered 20/10, 2013 at 23:15 Comment(5)

You'll need to give some more details. What line does the error occur on? Is it the first time through your for loop? What's the full error output? – Oleson 21/10, 2013 at 3:2

@Oleson I just added the full error output. I'm having trouble with using group by to group the longitude and latitude values to plot the path of the storm. – Soapberry 21/10, 2013 at 3:10

It looks like name is empty, did you mean to do that? Probably need to have a column name there. – Chickenhearted 21/10, 2013 at 4:27

Also, can you add the result of print df.head(5) up there too? – Chickenhearted 21/10, 2013 at 4:27

@JeffTratner Ok thanks. Do the column names 'lat' and/or 'lon' have to be specified in the csv file? What I'm trying to do is put the longitude and latitude columns from the file into two separate lists so that I can plot them on the map. – Soapberry 21/10, 2013 at 4:46

C

18

The problem is that you're effectively grouping by a list containing an empty list ([[]]): you set name = [] earlier and then wrap it in another list when you call df.groupby([name]).

If you want to group on a single column (called 'HurricaneName'), you should do something like:

for name, group in df.groupby('HurricaneName'):

However, if you want to group on multiple columns, then you need to pass a list:

for name, group in df.groupby(['HurricaneName', 'Year']):

If you want to put it in a variable like you have, you can do it like this:

col_name = 'State'

for name, group in df.groupby([col_name]):
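
As a minimal sketch of the single-column case, assuming the CSV has the Year, Name, Latitude and Longitude columns described in the comments: the storm names and coordinates below are made up for illustration, and `tracks` is just a hypothetical helper dict, not something from the original script.

```python
import pandas as pd

# Made-up sample rows standing in for louisianastormb.csv (assumed columns).
df = pd.DataFrame({
    "Year": [2005, 2005, 2005, 2008, 2008],
    "Name": ["KATRINA", "KATRINA", "KATRINA", "GUSTAV", "GUSTAV"],
    "Latitude": [23.1, 25.9, 29.3, 22.4, 29.2],
    "Longitude": [-75.1, -80.3, -89.6, -83.0, -90.7],
})

# Group on the column *name* (a string), not on an empty list.
tracks = {}
for name, group in df.groupby("Name"):
    # One (longitudes, latitudes) pair of arrays per storm.
    tracks[name] = (group["Longitude"].values, group["Latitude"].values)

for name, (lon, lat) in tracks.items():
    print(name, len(lon), "points")
```

Each entry in `tracks` could then be projected and plotted one storm at a time, which is what the loop in the question is trying to do.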

Chickenhearted answered 21/10, 2013 at 4:33 Comment(6)

Do the column names 'lat' and/or 'lon' have to be specified in the csv file? What I'm trying to do is put the longitude and latitude columns from the file into two separate lists so that I can plot them on the map. – Soapberry 21/10, 2013 at 4:46

@Soapberry no - those are just dummy names, you just need to use whatever names are in the csv to do the grouping. Re-reading your question - it looks like you want to group on something else (maybe you mean to group on hurricane so you can plot change over time?) If that's the case, then you want to group on something like 'HurricaneName'. I'm going to edit my answer to reflect that. If this answers your original question (about error message), you can check the arrow under my answer to mark your question resolved. – Chickenhearted 21/10, 2013 at 4:49

The way my file is set up is the Year, Name, Latitude, Longitude at the beginning of the file with the data beneath it. So would

for name, group in df.groupby(['Name']): latitude = group.Latitude.values  group.Longitude.values x,y(latitude,longitude)

be correct? – Soapberry 21/10, 2013 at 5:12

yep, that seems correct, that will generate results grouped by 'Name' (as it sounds) – Chickenhearted 21/10, 2013 at 5:18

Ok. Thank you so much for your help. However now I'm getting an error when trying to plot the x,y coordinates. Could you please help

Traceback (most recent call last):
  File "/home/darealmzd/lstorms.py", line 42, in <module>
    x,y(latitude,longitude)
NameError: name 'x' is not defined

– Soapberry 21/10, 2013 at 5:35

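needs to be x, y = m(longitude, latitude) (note the assignment, and that the Basemap instance takes longitude first, as in your original code) – Chickenhearted 21/10, 2013 at 5:38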

R

30

ValueError: Grouper and axis must be same length

This can occur if you are using double brackets in the groupby argument.

(I posted this since it is the top result on Google).
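
Here is a small sketch of why the extra brackets matter, using a made-up three-row frame (the column names and the `labels` list are assumptions for illustration). As far as I can tell, a list nested inside the groupby list is treated as one label per row rather than as column names, so its length has to match the number of rows. The exception type has differed between pandas versions (AssertionError in the question, ValueError here), so the sketch catches both.

```python
import pandas as pd

# Made-up data: three rows, so any per-row grouper needs three labels.
df = pd.DataFrame({"Name": ["A", "A", "B"], "Latitude": [10.0, 11.0, 20.0]})

# Grouping by a column name works as expected.
ok = df.groupby("Name")["Latitude"].mean()

# A nested list of the SAME length as the frame is accepted:
# it is read as one group label per row.
labels = ["x", "x", "y"]
by_labels = df.groupby([labels]).ngroups

# A nested list of a DIFFERENT length triggers the error from the question.
caught = None
try:
    df.groupby([["x", "y"]])
except (ValueError, AssertionError) as err:
    caught = err

print(type(caught).__name__, caught)
```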

Ras answered 1/8, 2019 at 15:28 Comment(2)

This helped me, as I updated an index parameter for a pivot table and failed to remove the square brackets

data_pivot = pd.pivot_table(input_data,
                            values='id',
                            #old_index=['level1','level2', 'level3'],
                            index=columns,
                            columns=['data_type'],
                            fill_value=0,
                            aggfunc='count')

Where columns is a list of columns from the dataframe. – Haggle 29/10, 2019 at 1:0
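
A rough sketch of that pivot-table situation, with an invented `input_data` frame (the rows and the level1/level2 values are assumptions, only the column names come from the comment above). It shows the call working when `index` is the list itself, and failing when the list is wrapped in a second pair of brackets.

```python
import pandas as pd

# Invented data with the column names used in the comment above.
input_data = pd.DataFrame({
    "level1": ["A", "A", "B", "B"],
    "level2": ["x", "x", "y", "y"],
    "data_type": ["t1", "t2", "t1", "t1"],
    "id": [1, 2, 3, 4],
})

columns = ["level1", "level2"]  # already a list of column names

# Pass the list as-is: one index level per column name.
data_pivot = pd.pivot_table(input_data,
                            values="id",
                            index=columns,
                            columns=["data_type"],
                            fill_value=0,
                            aggfunc="count")
print(data_pivot)

# Wrapping it again ([columns]) hands pandas a 2-item list as a per-row
# grouper for a 4-row frame, which is where the length error comes from.
caught = None
try:
    pd.pivot_table(input_data, values="id", index=[columns],
                   columns=["data_type"], fill_value=0, aggfunc="count")
except (ValueError, AssertionError) as err:
    caught = err
print(type(caught).__name__, caught)
```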

I should promote a community project named "Translate Pandas (and Matplotlib) error messages". These are error messages written by developers for developers working on the same library, not for general developers. – Antiquity 19/2, 2023 at 7:44

D

0

Try iloc to make grouper equal to axis.

example:

sns.boxplot(x=df['pH-binned'].iloc[0:3], y=v_count, data=df)

In case axis=3.

Deliadelian answered 13/3, 2020 at 4:15 Comment(0)
