Using loc in pandas without discarding the outer levels

4

5

I have a dataframe like

import pandas as pd

df = pd.DataFrame({
    'level0': [0, 1, 2],
    'level1': ['a', 'b', 'b'],
    'level2': ['x', 'x', 'x'],
    'data': [0.12, 0.34, 0.45],
}).set_index(['level0', 'level1', 'level2'])

                      data
level0 level1 level2      
0      a      x       0.12
1      b      x       0.34
2      b      x       0.45

With level0, level1, and level2 as the index levels, I want to select the data at (2, 'b') but keep the first two levels of labels. If I do df.loc[(2, 'b')] the output is

        data
level2      
x       0.45

but my desired output is

                      data
level0 level1 level2      
2      b      x       0.45

How do I keep levels 0 and 1 while using loc? I could add these levels back afterwards, but that is slightly annoying, and I do this often enough to wonder whether there is a one-step solution.

Druggist answered 16/2, 2022 at 17:42 Comment(0)
4

You can use the output of MultiIndex.get_locs in iloc:

>>> df.iloc[df.index.get_locs((2, 'b'))]

                      data
level0 level1 level2      
2      b      x       0.45
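
To make the intermediate step visible, here is a minimal sketch on the question's frame. The helper name select_keep_levels is hypothetical (it is not part of pandas); it only wraps the two calls shown above.

```python
import pandas as pd

df = pd.DataFrame({
    'level0': [0, 1, 2],
    'level1': ['a', 'b', 'b'],
    'level2': ['x', 'x', 'x'],
    'data': [0.12, 0.34, 0.45],
}).set_index(['level0', 'level1', 'level2'])

# get_locs returns the integer positions of the rows matching the partial key
positions = df.index.get_locs((2, 'b'))


def select_keep_levels(frame, key):
    """Hypothetical helper: select rows by a partial key, keeping every index level."""
    return frame.iloc[frame.index.get_locs(key)]


result = select_keep_levels(df, (2, 'b'))
print(positions)
print(result)
```

Because iloc selects by position, the result keeps the full three-level index instead of dropping the levels named in the key.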
Toponym answered 16/2, 2022 at 18:0 Comment(0)
1

Another, potentially cleaner, solution is to use a slice with the same start and end points:

>>> df.loc[slice((2, 'b'), (2, 'b'))]

                      data
level0 level1 level2      
2      b      x       0.45

This is especially helpful if the name of your dataframe is long.

I've come across situations where you can leave off the second tuple, but a slice with only one bound extends to one end of the index, so it only works when the key happens to sit at that end. It's safer to include both tuples.
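
As a rough illustration of why both bounds matter, the sketch below compares a closed slice with two one-sided slices on the question's frame. It assumes the index is sorted, which label slicing on a MultiIndex generally expects; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'level0': [0, 1, 2],
    'level1': ['a', 'b', 'b'],
    'level2': ['x', 'x', 'x'],
    'data': [0.12, 0.34, 0.45],
}).set_index(['level0', 'level1', 'level2'])

key = (2, 'b')

# Closed slice: same start and end, so only rows under (2, 'b') are returned
closed = df.loc[key:key]          # equivalent to df.loc[slice(key, key)]

# Start only: everything from (1, 'b') to the end of the index
from_start = df.loc[(1, 'b'):]

# slice(x) with one argument means slice(None, x): everything up to (1, 'b')
up_to_stop = df.loc[slice((1, 'b'))]

print(closed)
print(len(from_start), len(up_to_stop))
```

Both one-sided slices pick up an extra row here, which is the failure mode the closed form avoids.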

Anarch answered 23/10, 2023 at 22:14 Comment(1)
I posted a new answer that you might like :) (Leuco)
1

You can use a list to keep the index. With a MultiIndex, that means a tuple of lists, one for each level.

>>> df.loc[([2], ['b']),]

                      data
level0 level1 level2      
2      b      x       0.45

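The trailing comma is required to avoid ambiguity: x[(a, b)] is equivalent to x[a, b], so without it the two lists could be taken as a row indexer and a column indexer rather than as one row key.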
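
If the bare trailing comma feels easy to miss, a column indexer of : can be spelled out instead. The sketch below, on the question's frame, assumes the two spellings behave the same and also shows that each list may hold several labels.

```python
import pandas as pd

df = pd.DataFrame({
    'level0': [0, 1, 2],
    'level1': ['a', 'b', 'b'],
    'level2': ['x', 'x', 'x'],
    'data': [0.12, 0.34, 0.45],
}).set_index(['level0', 'level1', 'level2'])

# Row key is a tuple of lists (one list per level); ':' selects all columns
explicit = df.loc[([2], ['b']), :]

# Same selection written with the bare trailing comma
trailing = df.loc[([2], ['b']),]

# Each list can hold more than one label
several = df.loc[([1, 2], ['b']), :]

print(explicit)
print(several)
```

Spelling out both axes also matches the advice in the pandas user guide to specify all axes in the .loc call when indexing a MultiIndex.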

For more info, see the User Guide: Advanced indexing with hierarchical index

Leuco answered 23/10, 2023 at 22:40 Comment(0)
0

A clean solution is to use the xs method with drop_level=False:

>>> df.xs((2,'b'), drop_level=False)

                      data
level0 level1 level2      
2      b      x       0.45
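
As a small extension, xs also accepts a level argument, so the same idea can target a level that is not the outermost one. This is a sketch on the question's frame; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'level0': [0, 1, 2],
    'level1': ['a', 'b', 'b'],
    'level2': ['x', 'x', 'x'],
    'data': [0.12, 0.34, 0.45],
}).set_index(['level0', 'level1', 'level2'])

# Partial key on the leading levels; drop_level=False keeps them in the result
by_prefix = df.xs((2, 'b'), drop_level=False)

# Key on an inner level, addressed by name
by_inner = df.xs('b', level='level1', drop_level=False)

print(by_prefix)
print(by_inner)
```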
Soosoochow answered 25/6 at 14:59 Comment(0)
