Can I extract significance values for Logistic Regression coefficients in PySpark?


Asked 5/12, 2016 at 18:13 Answered 23/12, 2016 at 9:24

Solved apache-spark machine-learning pyspark logistic-regression significance

Is there a way to get the significance level of each coefficient we receive after we fit a logistic regression model on training data?

I tried to find a way but could not figure it out myself.

I think I could get the significance level of each feature by running a chi-squared test, but I am not sure whether the test can be run on all features together. My data is also numeric, so it remains a question whether the test would give the right result.

Right now I am doing the modeling with statsmodels and scikit-learn, but I would like to know how to get these results from PySpark ML or MLlib itself.

If anyone can shed some light on this, it would be helpful.

Masse answered 5/12, 2016 at 18:13 Comment(0)

I only use MLlib. I think that once you have trained a model, you can use the toPMML method to export it in PMML format (an XML file), and then parse the XML file to get the feature weights. Here is an example:

https://spark.apache.org/docs/2.0.2/mllib-pmml-model-export.html
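The parsing step might look roughly like the sketch below. It uses only the Python standard library. The `SAMPLE_PMML` string is a hand-written stand-in for an exported file, not real Spark output, and the field names and numbers in it are made up; `extract_weights` is a hypothetical helper name. It assumes the coefficients sit in `NumericPredictor` elements, as in a PMML regression model.

```python
import xml.etree.ElementTree as ET

# Hand-written illustrative PMML, NOT actual Spark output.
# Field names and values are invented for this sketch.
SAMPLE_PMML = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <RegressionModel functionName="classification" normalizationMethod="logit">
    <RegressionTable intercept="-0.5" targetCategory="1">
      <NumericPredictor name="field_0" coefficient="1.25"/>
      <NumericPredictor name="field_1" coefficient="-0.75"/>
    </RegressionTable>
    <RegressionTable intercept="0.0" targetCategory="0"/>
  </RegressionModel>
</PMML>"""


def extract_weights(pmml_text):
    """Return {feature name: coefficient} from a PMML regression model string."""
    root = ET.fromstring(pmml_text)
    weights = {}
    for elem in root.iter():
        # Tags are namespace-qualified ("{uri}NumericPredictor"), so strip the prefix.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "NumericPredictor":
            weights[elem.get("name")] = float(elem.get("coefficient"))
    return weights


weights = extract_weights(SAMPLE_PMML)
for name, value in sorted(weights.items()):
    print(name, value)
```

Note that this only recovers the weights; the PMML export does not carry standard errors or p-values.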

Hope that will help

Fenwick answered 23/12, 2016 at 9:24 Comment(2)

Thanks for this response. For the moment, I used a hybrid approach where I ELT'd the input data using pyspark and then did the modeling by switching over to statsmodels. That has worked for me. I certainly lost some of Spark's benefits, but it served my purpose. – Masse 29/12, 2016 at 14:53

This is a terrible answer. He asked for feature significance, not weights. – Palatal 21/8, 2019 at 20:34
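To illustrate the difference between a weight and its significance: a weight alone is not enough, you also need its standard error. Given both (for example from a statsmodels summary, which reports z statistics for a logit fit), a two-sided Wald p-value can be computed as in the minimal sketch below. The function name `wald_p_value` and the input numbers are hypothetical. It may also be worth checking whether `GeneralizedLinearRegression` with `family="binomial"` in your Spark version exposes p-values on its training summary; that is not shown here.

```python
import math


def wald_p_value(coefficient, std_error):
    """Two-sided Wald test of H0: coefficient == 0, using a normal approximation."""
    z = coefficient / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Made-up estimate and standard error, for illustration only.
z, p = wald_p_value(1.25, 0.5)
print("z = %.3f, p = %.4f" % (z, p))
```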
