How to use OrdinalEncoder() to set custom order?

I have a column named "Owner_Type" in my used-car price prediction dataset. It has four unique values: ['First', 'Second', 'Third', 'Fourth']. The order that makes the most sense is First > Second > Third > Fourth, since the price decreases in that order. How can I pass this order to OrdinalEncoder()? Please help me, thank you!

Dubitable answered 9/5, 2022 at 11:5 Comment(1)
Please provide enough code so others can better understand or reproduce the problem. - Transonic

OrdinalEncoder has a categories parameter that accepts a list of arrays, one per feature column, each listing that column's categories in the desired order. Here is a code example:

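from sklearn.preprocessing import OrdinalEncoder

# One inner list per feature; a category's position in the list becomes its code.
enc = OrdinalEncoder(categories=[['first', 'second', 'third', 'fourth']])
X = [['third'], ['second'], ['first']]
# fit is still required, even though the categories are given explicitly.
enc.fit(X)
# Prints the codes 1, 0, 2, 3 as a column of floats.
print(enc.transform([['second'], ['first'], ['third'], ['fourth']]))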
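Applied to the column from the question, a minimal sketch might look like the following. The DataFrame and its rows are made up for illustration; only the column name Owner_Type and its four values come from the question. Categories are matched as exact strings, so the list has to use the same capitalization as the data.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical sample rows; only the column name and its values come from the question.
df = pd.DataFrame({"Owner_Type": ["Third", "First", "Second", "Fourth", "First"]})

# One inner list per encoded column, written in the desired order.
# The position in the list becomes the code: First -> 0, ..., Fourth -> 3.
owner_order = ["First", "Second", "Third", "Fourth"]
enc = OrdinalEncoder(categories=[owner_order])

# The encoder expects 2-D input, hence the double brackets around the column name.
df["Owner_Type_encoded"] = enc.fit_transform(df[["Owner_Type"]]).ravel()
print(df)
```

The encoded column holds floats by default; the dtype parameter of OrdinalEncoder can be set if integer codes are preferred.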
Erythritol answered 9/5, 2022 at 13:47 Comment(5)
Why do you have to fit the encoder to X if you are already supplying the categories? - Verticillate
If you call transform without calling fit first, you get the error "AttributeError: 'OrdinalEncoder' object has no attribute 'categories_'", so you must either call fit_transform or fit the encoder first and then call transform. - Erythritol
I see, you're totally right. It's a bit silly, though, since calling fit when the categories are already supplied does nothing other than create the attribute categories_. It seems that this attribute should be created automatically if categories are supplied. - Verticillate
I'm still struggling with the "correct" order for the categories parameter. In my opinion the order should be reversed, because "First" is more important. What do you think? OrdinalEncoder(categories=[['fourth', 'third', 'second', 'first']]) - Marva
Yes, you have a point. I think that when we train a machine learning model on this feature, it's better to try both orders and see which one helps the model generalize better. - Erythritol
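To make the comparison of the two orderings concrete, here is a small sketch that encodes the same values both ways. Reversing the category list simply mirrors the codes (code k becomes 3 - k), so the choice mainly decides whether a larger number means more or fewer previous owners. The sample values are invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

order = ["First", "Second", "Third", "Fourth"]
# Illustrative input: one row per category, as a single feature column.
X = np.array([["First"], ["Second"], ["Third"], ["Fourth"]])

# Same data, encoded once with the original order and once with it reversed.
ascending = OrdinalEncoder(categories=[order]).fit_transform(X)
descending = OrdinalEncoder(categories=[order[::-1]]).fit_transform(X)

print(ascending.ravel())   # codes when 'First' is listed first
print(descending.ravel())  # codes when 'Fourth' is listed first
```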
