Check if Series has Values in Range
Asked Answered
P

5

7

I have a Pandas dataframe that has user information and also has a column for their permissions:

UserName    Permissions
John Doe             02
John Doe             11
 Example             09
 Example             08
   User3             11

I am trying to create a new column called User Class that is based on their Permissions (looking at all of the users permissions). If a user has all permissions <10, they are considered Admin. If a user has all permission >=10, they are considered User. However if they have permissions that are both <10 and >=10, then they will be coded as Admin/User. So my resulting output would be:

UserName    Permissions    User Class
John Doe             02    Admin/User
John Doe             11    Admin/User
 Example             09         Admin
 Example             08         Admin
   User3             11          User

What would be the best way to do this? My original idea was to do:

for UserName, User_df in df.groupby(by='UserName'):
    LT10 = (User_df['Permissions'] < 10).any()
    GTE10 = (User_df['Permissions'] >= 10).any()
    if (LT10 & GTE10):
        UserClass = 'Admin/User'
    elif LT10:
        UserClass = 'Admin'
    elif GTE10:
        UserClass = 'User'
    df.at[User_df.index, 'User Class'] = UserClass

However these seems very inefficient because df has ~800K records

Pasteur answered 3/9, 2024 at 18:32 Comment(0)
E
6

Another possible solution:

df['User Class'] = (
    df.groupby('UserName')['Permissions']
    .transform(lambda x: 'Admin' if (x < 10).all() else 
               'User' if (x >= 10).all() else 'Admin/User'))

Output:

   UserName  Permissions  User Class
0  John Doe            2  Admin/User
1  John Doe           11  Admin/User
2   Example            9       Admin
3   Example            8       Admin
4     User3           11        User
Exorcism answered 3/9, 2024 at 18:45 Comment(0)
P
5

Group by the username and use transform to compute the min/max permission values per group. Then use this to compute User Class using np.select:

import numpy as np
import pandas as pd

data = {
    "UserName": ["John Doe", "John Doe", "Example", "Example", "User3"],
    "Permissions": [2, 11, 9, 8, 11],
}

df = pd.DataFrame(data)

permissions = df.groupby("UserName")["Permissions"]
min_permission = permissions.transform("min")
max_permission = permissions.transform("max")

df["User Class"] = np.select(
    [
        (min_permission < 10) & (max_permission < 10),
        (min_permission >= 10) & (max_permission >= 10),
    ],
    ["Admin", "User"],
    default="Admin/User",
)

print(df)

Output:

   UserName  Permissions  User Class
0  John Doe            2  Admin/User
1  John Doe           11  Admin/User
2   Example            9       Admin
3   Example            8       Admin
4     User3           11        User
Playboy answered 3/9, 2024 at 18:41 Comment(0)
S
2

I would use pandas.cut to map the values to Admin/User and groupby.transform to combine the classes:

df['User Class'] = (
 pd.cut(df['Permissions'], bins=[0, 10, np.inf],
        labels=['Admin', 'User'], right=False)
   .groupby(df['UserName'])
   .transform(lambda x: '/'.join(sorted(x.unique())))
 )

Output:

   UserName  Permissions  User Class
0  John Doe            2  Admin/User
1  John Doe           11  Admin/User
2   Example            9       Admin
3   Example            8       Admin
4     User3           11        User
Spies answered 3/9, 2024 at 20:0 Comment(0)
E
0

This is most efficient for Huge datasets.

import pandas as pd
import numpy as np
# Sample data
data = {
    'UserName': ['Tabu', 'Tabu', 'Roopa', 'Donald', 'Goofy'],
    'Permissions': ['02', '11', '09', '08', '11']
}

df = pd.DataFrame(data)

# Convert 'Permissions' to integers
df['Permissions'] = df['Permissions'].astype(int)
print(df)


df['User_Class'] = np.where(
    df.groupby('UserName')['Permissions'].transform('max') >= 10, 
    np.where(df.groupby('UserName')['Permissions'].transform('min') < 10, 'Admin/User', 'User'),
    'Admin'
)
print(df)

'''
  UserName  Permissions  User_Class
0     Tabu            2  Admin/User
1     Tabu           11  Admin/User
2    Roopa            9       Admin
3   Donald            8       Admin
4    Goofy           11        User
'''
Eloyelreath answered 7/9, 2024 at 9:4 Comment(0)
J
0

Try this:

(pd.get_dummies(df['Permissions'].gt(10))
.groupby(df['UserName']).transform('any')
.mul([1,2])
.sum(axis=1)
.map(dict(enumerate(['neither','admin','user','admin/user']))))

Output:

0    admin/user
1    admin/user
2         admin
3         admin
4          user
Joe answered 8/9, 2024 at 2:20 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.