Non-linear SVM is not available in Apache Spark

Does anyone know why non-linear SVMs have not been implemented in Apache Spark? I was reading this page: https://issues.apache.org/jira/browse/SPARK-4638

Look at the last comment. It says:

"Commenting here b/c of the recent dev list thread: Non-linear kernels for SVMs in Spark would be great to have. The main barriers are: Kernelized SVM training is hard to distribute. Naive methods require a lot of communication. To get this feature into Spark, we'd need to do proper background research and write up a good design. Other ML algorithms are arguably more in demand and still need improvements (as of the date of this comment). Tree ensembles are first-and-foremost in my mind."

The question is: Why is the kernelized SVM hard to distribute?

Non-linear SVMs often perform better than linear ones, particularly when the data is not linearly separable.
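One way to picture the "lot of communication" point from the quoted comment is the toy sketch below. It is plain Python, not Spark code, and every name in it (`linear_partial_gradient`, `gram_matrix`, the two partitions `p1` and `p2`) is made up for illustration. It assumes a hinge-loss linear SVM and an RBF kernel. A linear SVM sub-gradient is a sum of per-partition vectors of length d, so each worker only sends d numbers. A kernel matrix needs k(x_i, x_j) for every pair of points, including pairs that live on different partitions.

```python
import math

def rbf(x, z, gamma=0.5):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def linear_partial_gradient(w, partition, C=1.0):
    """Hinge-loss sub-gradient contribution of one partition.

    Uses only the rows stored locally plus the d-dimensional vector w.
    """
    g = [0.0] * len(w)
    for x, y in partition:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin < 1:  # point violates the margin
            for j, xj in enumerate(x):
                g[j] -= C * y * xj
    return g

def gram_matrix(points, gamma=0.5):
    """Full n x n kernel matrix: every point is paired with every other."""
    return [[rbf(a, b, gamma) for b in points] for a in points]

# Toy data split across two hypothetical partitions (features, label).
p1 = [((0.0, 0.0), -1), ((1.0, 1.0), 1)]
p2 = [((0.0, 1.0), 1), ((1.0, 0.0), -1)]
w = [0.0, 0.0]

# Linear case: each partition returns d numbers, the driver adds them up.
g1 = linear_partial_gradient(w, p1)
g2 = linear_partial_gradient(w, p2)
g = [a + b for a, b in zip(g1, g2)]

# Kernel case: the matrix needs pairs that span both partitions.
all_points = [x for x, _ in p1 + p2]
K = gram_matrix(all_points)
cross_partition_pairs = len(p1) * len(p2)
```

Real kernel SVM solvers are smarter than building the whole matrix up front, but they still need kernel values between points that may sit on different machines, which is the part that does not map cleanly onto a sum over partitions.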

Hophead answered 12/5, 2017 at 23:16 Comment(1)
I have the same question. Spark noob here: we have 2 billion+ records and want to use a non-linear kernel trick on the data. The only way out for now is Spark MLlib, so I would love to see such a feature.
Lungki
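For a sense of scale at 2 billion records, here is a rough back-of-the-envelope calculation. It assumes the naive approach of materialising the full kernel matrix with one 8-byte double per entry. Practical solvers avoid storing the whole matrix, so treat this as an upper bound on the naive method, not a statement about any particular implementation.

```python
n = 2_000_000_000        # number of records mentioned in the comment
bytes_per_entry = 8      # assumption: one 64-bit float per kernel value

entries = n * n                          # size of the full n x n kernel matrix
total_bytes = entries * bytes_per_entry
exabytes = total_bytes / 10**18          # decimal exabytes

print(f"{entries:.1e} entries, about {exabytes:.0f} EB")
```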
