Your question is not entirely clear.
But let's assume one of two scenarios:
- you are encoding the input features with target info in some strange way
- you are simply encoding the regression values and, for example, you have defined bucket values for the outcome, something like:
[0, 20] -> 1
[21, 40] -> 2
[41, 60] -> 3
[61, 80] -> 4
[81, 100] -> 5
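For reference, that kind of target bucketing is usually a simple binning step. A minimal sketch, assuming NumPy and the example bucket edges above (the target values are hypothetical):

import numpy as np

y = np.array([5.0, 33.0, 47.5, 72.0, 99.0])    # hypothetical continuous targets in [0, 100]
edges = [20, 40, 60, 80]                        # upper bounds of the first four buckets
y_encoded = np.digitize(y, edges, right=True) + 1
print(y_encoded)                                # [1 2 3 4 5]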
Answer
If you are encoding your features with the target value in some strange way, there is something wrong: you are basically introducing the information that you are trying to predict into the source, and that's a data leak. A model with this kind of set-up will not work on real data, because it's essentially cheating
If you are encoding your targets, which is quite common, there is nothing to change in your pipeline, because feature encoding and target encoding are two separate and independent steps
Usually there will be two encoder functions, one for the features and one for the target, and these functions will be independent. In the training setup, you'll have a g(x) encoder for the features (with x being the input feature matrix) and a t(y) function for encoding the target (with y being the target values)
When you do the training, you need both the encoded features and the encoded labels to calculate the errors and improve the model, so you'll do something like this:
model.fit(g(x_train), t(y_train)) # iterate: fit on g(x_train), calculate the loss against t(y_train) and update the model accordingly
When you do the prediction, you'll work with something like this:
y_pred = model.predict(g(x_test)) # predict on encoded, unseen data
Assuming the scenario in which you have used the example buckets above, y_pred will already be encoded, with values like [1, 2, 3, 4, 5].
So there is no need to apply the target encoding at prediction time: the target is the output of your prediction, and in no way should it be used as information, encoded in some form, among the training features.
The target should be used only in the loss function during training.
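To make this concrete, here is a minimal end-to-end sketch. It assumes scikit-learn and NumPy, uses StandardScaler as the g(x) feature encoder, the bucket mapping above as the t(y) target encoder, and a RandomForestClassifier as a placeholder model; all of these are illustrative choices, not the only way to do it:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# hypothetical data: 200 samples, 3 numeric features, continuous target in [0, 100]
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = rng.uniform(0, 100, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# g(x): feature encoder, fitted on the training features only
g = StandardScaler().fit(X_train)

# t(y): target encoder, the bucket mapping from the example above
def t(y_values):
    edges = [20, 40, 60, 80]
    return np.digitize(y_values, edges, right=True) + 1

model = RandomForestClassifier(random_state=0)
model.fit(g.transform(X_train), t(y_train))    # encoded features + encoded targets

y_pred = model.predict(g.transform(X_test))    # only the feature encoder is needed here
print(y_pred[:10])                             # already bucket labels, e.g. [3 1 5 ...]

Note that g is fitted on the training data only and then reused, unchanged, at prediction time.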
So, in summary:
- Training can have two independent encoders:
- g(x) for the x features
- t(y) for the y labels/target
and these encoders are independent (i.e. they are built without knowing anything about each other)
- Testing on unseen data uses only the feature encoder (already fitted during training); predicted bucket labels can then be mapped back to ranges if needed, as sketched below
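If you need the predictions back on the original scale of the buckets, you can invert the target encoding with a simple lookup. A tiny sketch, continuing the hypothetical bucket example:

bucket_ranges = {1: "[0, 20]", 2: "[21, 40]", 3: "[41, 60]", 4: "[61, 80]", 5: "[81, 100]"}
y_pred = [3, 1, 5, 2]                          # example predicted labels
print([bucket_ranges[p] for p in y_pred])      # ['[41, 60]', '[0, 20]', '[81, 100]', '[21, 40]']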