How to make grouper and axis the same length?
S

3

15

For my assignment I'm supposed to plot the tracks of 20 hurricanes on a map using matplotlib. However, when I run my code I get the error: AssertionError: Grouper and axis must be the same length

Here's the code I have:

import numpy as np
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt 
from PIL import *
fig = plt.figure(figsize=(12,12))

ax = fig.add_axes([0.1,0.1,0.8,0.8])

m = Basemap(llcrnrlon=-100.,llcrnrlat=0.,urcrnrlon=-20.,urcrnrlat=57.,
        projection='lcc',lat_1=20.,lat_2=40.,lon_0=-60.,
        resolution ='l',area_thresh=1000.)

m.bluemarble()
m.drawcoastlines(linewidth=0.5)
m.drawcountries(linewidth=0.5)
m.drawstates(linewidth=0.5)

# Creates parallels and meridians

m.drawparallels(np.arange(10.,35.,5.),labels=[1,0,0,1])
m.drawmeridians(np.arange(-120.,-80.,5.),labels=[1,0,0,1])
m.drawmapboundary(fill_color='aqua')

# Opens data file

import pandas as pd
name = [ ]
df = pd.read_csv('louisianastormb.csv')
for name, group in df.groupby([name]):
    latitude = group.lat.values
    longitude = group.lon.values
    x,y = m(longitude, latitude)
    plt.plot(x,y,'y-',linewidth=2 )
    plt.xlabel('Longitude')
    plt.ylabel('Latitude')
    plt.title('20 Hurricanes with Landfall in Louisiana')

plt.savefig('20hurpaths.jpg', dpi=100)

Here's the full error output:

 Traceback (most recent call last): 
File "/home/darealmzd/lstorms.py", line 31, in <module> 
for name, group in df.groupby([name]): 
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 186, in groupby 
squeeze=squeeze) 
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 533, in groupby 
return klass(obj, by, **kwds) 
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 197, in __init__ 
level=level, sort=sort) 
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1325, in _get_grouper 
ping = Grouping(group_axis, gpr, name=name, level=level, sort=sort) 
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1129, in __init__ 
self.grouper = _convert_grouper(index, grouper) 
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1350, in _convert_grouper 
raise AssertionError('Grouper and axis must be same length') 
AssertionError: Grouper and axis must be same length 
Soapberry answered 20/10, 2013 at 23:15 Comment(5)
You'll need to give some more details. What line does the error occur on? Is it the first time through your for loop? What's the full error output? – Oleson
@Oleson I just added the full error output. I'm having trouble using groupby to group the longitude and latitude values to plot the path of each storm. – Soapberry
It looks like name is empty, did you mean to do that? You probably need to have a column name there. – Chickenhearted
Also, can you add the result of print df.head(5) up there too? – Chickenhearted
@JeffTratner Ok thanks. Do the column names 'lat' and/or 'lon' have to be specified in the csv file? What I'm trying to do is put the longitude and latitude columns from the file into two separate lists so that I can plot them on the map. – Soapberry
C
18

The problem is that you're effectively grouping by a list containing an empty list ([[]]): you set name = [] earlier and then wrap it in another list in df.groupby([name]).
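To see the failure in isolation, here is a minimal sketch. The DataFrame is a made-up stand-in for the CSV (the storm names and values are assumptions, not the asker's data). Older pandas versions raise AssertionError here, while newer ones raise ValueError, so the sketch catches both.

```python
import pandas as pd

# Made-up stand-in for louisianastormb.csv; names and values are assumptions.
df = pd.DataFrame({
    "Name": ["StormA", "StormA", "StormB"],
    "lat": [25.0, 27.5, 24.0],
    "lon": [-90.0, -92.0, -85.0],
})

name = []  # as in the question

try:
    # The key is [[]]: a single grouper of length 0 for a frame with 3 rows.
    df.groupby([name])
    error_message = None
except (AssertionError, ValueError) as exc:
    # Older pandas raised AssertionError; newer releases raise ValueError.
    error_message = str(exc)

print(error_message)
```

The inner empty list is treated as an array of group labels, one per row, and its length (0) does not match the number of rows (3).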

If you want to group on a single column (called 'HurricaneName'), you should do something like:

for name, group in df.groupby('HurricaneName'):

However, if you want to group on multiple columns, then you need to pass a list:

for name, group in df.groupby(['HurricaneName', 'Year']):

If you want to put it in a variable like you have, you can do it like this:

col_name = 'State'

for name, group in df.groupby([col_name]):
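Putting this together, a rough sketch of the per-storm loop might look like the following. The column names Year, Name, Latitude and Longitude follow the file layout described in the comments; the rows themselves are invented for illustration. Basemap is left out so the snippet runs with pandas alone.

```python
import pandas as pd

# Invented sample rows; only the column layout comes from the discussion.
df = pd.DataFrame({
    "Year": [2001, 2001, 2002, 2002],
    "Name": ["StormA", "StormA", "StormB", "StormB"],
    "Latitude": [25.0, 27.5, 24.0, 26.0],
    "Longitude": [-90.0, -92.0, -85.0, -88.0],
})

tracks = {}
for name, group in df.groupby("Name"):  # one group per storm
    latitude = group["Latitude"].values
    longitude = group["Longitude"].values
    tracks[name] = (longitude, latitude)
    # With a Basemap instance m as in the question, the next step would be:
    #     x, y = m(longitude, latitude)
    #     plt.plot(x, y, 'y-', linewidth=2)

for storm, (lon, lat) in tracks.items():
    print(storm, len(lon), "points")
```

Passing the column name as a plain string keeps each group key a simple value such as 'StormA'; as far as I know, recent pandas versions return a one-element tuple as the key when a one-element list is passed instead.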
Chickenhearted answered 21/10, 2013 at 4:33 Comment(6)
Do the column names 'lat' and/or 'lon' have to be specified in the csv file? What I'm trying to do is put the longitude and latitude columns from the file into two separate lists so that I can plot them on the map. – Soapberry
@Soapberry no - those are just dummy names, you just need to use whatever names are in the csv to do the grouping. Re-reading your question - it looks like you want to group on something else (maybe you mean to group on hurricane so you can plot change over time?) If that's the case, then you want to group on something like 'HurricaneName'. I'm going to edit my answer to reflect that. If this answers your original question (about the error message), you can check the arrow under my answer to mark your question resolved. – Chickenhearted
The way my file is set up is the Year, Name, Latitude, Longitude at the beginning of the file with the data beneath it. So would for name, group in df.groupby(['Name']): latitude = group.Latitude.values group.Longitude.values x,y(latitude,longitude) be correct? – Soapberry
yep, that seems correct, that will generate results grouped by 'Name' (as it sounds) – Chickenhearted
Ok. Thank you so much for your help. However now I'm getting an error when trying to plot the x,y coordinates. Could you please help? Traceback (most recent call last): File "/home/darealmzd/lstorms.py", line 42, in <module> x,y(latitude,longitude) NameError: name 'x' is not defined – Soapberry
needs to be x, y = m(longitude, latitude) (an assignment, calling the Basemap instance m with longitude first, as in your original code) – Chickenhearted
R
30

ValueError: Grouper and axis must be same length

This can occur if you are using double brackets in the groupby argument.

(I posted this since it is the top result on Google).
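As a hedged illustration of the double-bracket case: if a variable already holds a list of column names, wrapping it in another pair of brackets makes pandas treat the inner list as an array of row labels. The DataFrame and the wind column below are made up for the example.

```python
import pandas as pd

# Made-up data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "Name": ["StormA", "StormA", "StormB", "StormB"],
    "Year": [2001, 2001, 2002, 2002],
    "wind": [60, 80, 70, 90],
})

columns = ["Name", "Year"]  # already a list of column names

try:
    # [columns] is [['Name', 'Year']]: one grouper of length 2 vs. 4 rows.
    df.groupby([columns])
    double_bracket_error = None
except (AssertionError, ValueError) as exc:
    double_bracket_error = str(exc)

# Passing the list itself groups by both columns as intended.
group_sizes = df.groupby(columns).size()

print(double_bracket_error)
print(group_sizes)
```

Note that if the inner list happened to have exactly as many entries as the frame has rows, the call would likely not fail at all and would instead group by those entries as labels, which is harder to notice.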

Ras answered 1/8, 2019 at 15:28 Comment(2)
This helped me, as I updated an index parameter for a pivot table and failed to remove the square brackets: data_pivot = pd.pivot_table(input_data, values='id', #old_index=['level1','level2', 'level3'], index=columns, columns=['data_type'], fill_value = 0, aggfunc='count') where columns is a list of columns from the dataframe. – Haggle
I should promote a community project named "Translate Pandas (and Matplotlib) error messages". These are error messages written by developers for developers working on the same library, not for general developers. – Antiquity
D
0

Try using iloc to slice the grouping column so that its length matches the other axis.

Example:

sns.boxplot(x=df['pH-binned'].iloc[0:3], y=v_count, data=df)

Here the slice 0:3 is for the case where the axis has length 3.
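The same length rule can be sketched with groupby directly, under the assumption that the group labels come from an external array rather than a column. The data below is invented; whether truncating rows with iloc is the right fix depends on why the lengths differ in the first place.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [1, 2, 3, 4, 5]})  # 5 rows (made-up data)
labels = np.array(["a", "b", "a"])             # only 3 labels

try:
    df.groupby(labels)  # 3 labels for 5 rows
    mismatch_error = None
except (AssertionError, ValueError) as exc:
    mismatch_error = str(exc)

# Slicing the frame to the first len(labels) rows makes the lengths agree.
matched = df.iloc[:len(labels)].groupby(labels)["value"].sum()

print(mismatch_error)
print(matched)
```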

Deliadelian answered 13/3, 2020 at 4:15 Comment(0)
