I would like to use my own loss function instead of the squared loss for the linear regression model in Spark MLlib. So far I can't find any part of the documentation that mentions whether it is even possible.
How to set a custom loss function in Spark MLlib
TL;DR: it is not easy to use a custom loss function, because you cannot simply pass a loss function to Spark's models. However, you can easily write a customized model yourself.
Long answer:
If you look at the code of LinearRegressionWithSGD, you will see:
class LinearRegressionWithSGD private[mllib] (
    private var stepSize: Double,
    private var numIterations: Int,
    private var regParam: Double,
    private var miniBatchFraction: Double)
  extends GeneralizedLinearAlgorithm[LinearRegressionModel] with Serializable {

  private val gradient = new LeastSquaresGradient() // the loss function
  private val updater = new SimpleUpdater()

  @Since("0.8.0")
  override val optimizer = new GradientDescent(gradient, updater) // the optimizer
    .setStepSize(stepSize)
    .setNumIterations(numIterations)
    .setRegParam(regParam)
    .setMiniBatchFraction(miniBatchFraction)
So, let's look at how the least-squares loss function is implemented here:
class LeastSquaresGradient extends Gradient {
  override def compute(data: Vector, label: Double, weights: Vector): (Vector, Double) = {
    val diff = dot(data, weights) - label
    val loss = diff * diff / 2.0
    val gradient = data.copy
    scal(diff, gradient)
    (gradient, loss)
  }

  override def compute(
      data: Vector,
      label: Double,
      weights: Vector,
      cumGradient: Vector): Double = {
    val diff = dot(data, weights) - label
    axpy(diff, data, cumGradient)
    diff * diff / 2.0
  }
}
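To see what the two compute methods above actually calculate, here is a minimal plain-Scala sketch of the same math, using Arrays instead of MLlib Vectors (the BLAS helpers dot and scal are replaced by explicit loops; leastSquaresCompute is an illustrative name, not a Spark API):

```scala
object LeastSquaresSketch {
  // Mirrors LeastSquaresGradient.compute:
  //   loss     = (w . x - y)^2 / 2
  //   gradient = (w . x - y) * x
  def leastSquaresCompute(data: Array[Double], label: Double,
                          weights: Array[Double]): (Array[Double], Double) = {
    val prediction = data.zip(weights).map { case (x, w) => x * w }.sum
    val diff = prediction - label        // dot(data, weights) - label
    val loss = diff * diff / 2.0         // squared loss
    val gradient = data.map(_ * diff)    // scal(diff, data.copy)
    (gradient, loss)
  }
}
```

For example, with data = (1, 2), weights = (3, 4), and label = 10, the prediction is 11, so diff = 1, the loss is 0.5, and the gradient is (1, 2).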
So, you can simply write your own class in the style of LeastSquaresGradient, implement both compute overloads for your loss, and plug it into a GradientDescent optimizer for a model like LinearRegressionWithSGD.
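As a concrete illustration of the math such a class would implement, here is a plain-Scala sketch of an absolute-error (L1) loss gradient: loss = |w . x - y| and gradient = sign(w . x - y) * x. It uses Arrays rather than MLlib Vectors so it stands alone; in real Spark code you would extend org.apache.spark.mllib.optimization.Gradient instead (AbsoluteErrorSketch is an illustrative name, not a Spark API):

```scala
object AbsoluteErrorSketch {
  // Shaped like Gradient.compute(data, label, weights): returns (gradient, loss).
  def compute(data: Array[Double], label: Double,
              weights: Array[Double]): (Array[Double], Double) = {
    val diff = data.zip(weights).map { case (x, w) => x * w }.sum - label
    val sign = if (diff >= 0) 1.0 else -1.0
    val gradient = data.map(_ * sign)  // subgradient of |diff| w.r.t. weights
    (gradient, math.abs(diff))
  }
}
```

Note that the loss only needs a (sub)gradient for SGD to work, which is why an L1 loss is fine here even though it is not differentiable at zero.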