scikit-learn

3

Solved

What is the best way to fit a quadratic polynomial to p-dimensional data and compute its gradient and Hessian matrix?

I have been trying to use the scikit-learn library to solve this problem. Roughly: from sklearn.preprocessing import PolynomialFeatures from sklearn.linear_model import LinearRegression # Make or ...

python scikit-learn linear-regression polynomials hessian-matrix

Giraffe asked 10/10, 2024 at 17:32

3

Keep pandas index while applying sklearn

I have a dataset which has a DateTime index and I'm using PCA from sklearn to reduce the number of dimensions. The following question bugs me - will PCA keep the order of the points in my series s...

pandas scikit-learn

Tour asked 1/2, 2017 at 13:50

4

Python : GridSearchCV taking too long to finish running

I'm attempting to do a grid search to optimize my model but it's taking far too long to execute. My total dataset is only about 15,000 observations with about 30-40 variables. I was successfully ab...

python machine-learning scikit-learn data-science cross-validation

Balanchine asked 3/5, 2022 at 14:51

11

Solved

How to resolve "cannot import name '_MissingValues' from 'sklearn.utils._param_validation'" issue when trying to import imblearn?

I am trying to import imblearn into my python notebook after installing the required modules. However, I am getting the following error: Additional info: I am using a virtual environment in Visual...

python python-3.x scikit-learn imblearn

Malvoisie asked 1/7, 2023 at 8:52

8

ValueError: With n_samples=0, test_size=0.2 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters

I wrote a text classification program. When I run the program it crashes with an error as seen in this screenshot: ValueError: With n_samples=0, test_size=0.2 and train_size=None, the resulting t...

python scikit-learn nlp

Passably asked 3/2, 2020 at 16:25

6

Solved

scikit-learn DBSCAN memory usage

UPDATED: In the end, the solution I opted to use for clustering my large dataset was one suggested by Anony-Mousse below. That is, using ELKI's DBSCAN implementation to do my clustering rather than...

python scikit-learn cluster-analysis data-mining dbscan

Derick asked 5/5, 2013 at 5:4

3

Solved

SageMaker failed to extract model data archive tar.gz for container when deploying

I am trying to deploy an existing Scikit-Learn model in Amazon SageMaker, that is, a model that wasn't trained on SageMaker but locally on my machine. On my local (Windows) machine I've saved my model a...

machine-learning scikit-learn deployment amazon-sagemaker

Tamelatameless asked 25/1, 2021 at 9:3

2

how to make RandomForestClassifier faster?

I am trying to implement the bag-of-words model from the Kaggle site with Twitter sentiment data that has around 1M rows. I already cleaned it, but in the last part, when I applied my feature vectors and sentim...

python-3.x machine-learning scikit-learn random-forest

Lustring asked 26/4, 2017 at 17:9

3

Solved

ValueError: The number of classes has to be greater than one (python)

When passing x,y in fit, I am getting the following error: Traceback (most recent call last): File "C:/Classify/classifier.py", line 95, in train_avg, test_avg, cms = train_model(X, y, "cep...

python-2.7 scikit-learn classification svm

Calycine asked 24/11, 2016 at 7:12

2

Solved

how to use sklearn when target variable is a proportion

There are standard ways of predicting proportions such as logistic regression (without thresholding) and beta regression. There have already been discussions about this: http://scikit-learn-genera...

python scikit-learn

Yovonnda asked 29/5, 2017 at 4:38

4

conditional sampling from multivariate kernel density estimate in python

One can create a multivariate kernel density estimate (KDE) with scikit-learn (https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KernelDensity.html#sklearn.neighbors.KernelDensity)...

python scikit-learn scipy

Dropwort asked 14/2, 2020 at 9:34

2

How can I get Gini Coefficient in sklearn

In the sklearn package, I would like to find the Gini coefficients for each feature on a class of paths, such as in the iris data. For example, Iris-virginica: petal length gini: 0.4, petal width gini: 0.4.

python scikit-learn gini

Levitical asked 13/7, 2017 at 11:42

3

Solved

Using hyphen/dash in python repository name and package name

I am trying to make my git repository pip-installable. In preparation for that I am restructuring the repo to follow the right conventions. My understanding from looking at other repositories is th...

python scikit-learn pip package pypi

Sang asked 8/2, 2019 at 17:7

4

Solved

scikit-learn clustering: predict(X) vs. fit_predict(X)

In scikit-learn, some clustering algorithms have both predict(X) and fit_predict(X) methods, like KMeans and MeanShift, while others only have the latter, like SpectralClustering. According to the ...

python-3.x machine-learning scikit-learn

Jonette asked 9/5, 2016 at 2:25

2

Solved

Sklearn ROC AUC Score : ValueError: y should be a 1d array, got an array of shape (15, 2) instead

I have this dataset with target LULUS; it's an imbalanced dataset. I'm trying to print the ROC AUC score for each fold of my data, but in every fold it always raises an error saying Valu...

python scikit-learn

Antwanantwerp asked 29/5, 2021 at 16:7

4

Implementing custom loss function in scikit learn

I want to implement a custom loss function in scikit learn. I use the following code snippet: def my_custom_loss_func(y_true,y_pred): diff3=max((abs(y_true-y_pred))*y_true) return diff3 score=m...

python machine-learning scikit-learn data-science gridsearchcv

Carmeliacarmelina asked 19/1, 2019 at 13:47

3

Solved

What is the difference between pipeline and make_pipeline in scikit-learn?

I got this from the sklearn webpage: Pipeline: Pipeline of transforms with a final estimator Make_pipeline: Construct a Pipeline from the given estimators. This is a shorthand for the Pipeline co...

python machine-learning scikit-learn pipeline

Harriettharrietta asked 20/11, 2016 at 18:56

4

Choosing random_state for sklearn algorithms

I understand that random_state is used in various sklearn algorithms to break ties between different predictors (trees) with the same metric value (for example, in GradientBoosting). But the document...

machine-learning scikit-learn random-forest

Zwiebel asked 29/9, 2014 at 10:38

3

Solved

module 'numpy' has no attribute 'dtype'

When importing sklearn datasets eg. from sklearn.datasets import fetch_mldata from sklearn.datasets import fetch_openml I get the error Traceback (most recent call last): File "numbers.py", l...

python numpy scikit-learn python-3.5 python-import

Thresher asked 11/3, 2019 at 19:19

2

Solved

Training Linear Models with MAE using sklearn in Python

I'm currently trying to train a linear model using sklearn in Python, not with mean squared error (MSE) as the error measure but with mean absolute error (MAE). I specifically need a linear model...

python scikit-learn data-science

Morpheus asked 17/5, 2018 at 13:31

5

Solved

Retrieve list of training features names from classifier

Is there a way to retrieve the list of feature names used for training of a classifier, once it has been trained with the fit method? I would like to get this information before applying to unseen ...

python pandas scikit-learn random-forest

Doolittle asked 8/11, 2016 at 11:6

4

How to specify a variable in pandas as ordinal/categorical?

I am trying to run some Machine learning algo on a dataset using scikit-learn. My dataset has some features which are like categories. Like one feature is A, which has values 1,2,3 specifying the q...

python pandas scikit-learn categorical-data nominal-data

Argosy asked 9/4, 2015 at 2:18

8

Solved

sklearn ImportError: cannot import name plot_roc_curve

I am trying to plot a Receiver Operating Characteristics (ROC) curve with cross validation, following the example provided in sklearn's documentation. However, the following import gives an ImportE...

python machine-learning scikit-learn roc

Salvo asked 20/2, 2020 at 13:44

2

Solved

Why is Random Forest with a single tree much better than a Decision Tree classifier?

I apply the decision tree classifier and the random forest classifier to my data with the following code: def decision_tree(train_X, train_Y, test_X, test_Y): clf = tree.DecisionTreeClassifier()...

python machine-learning scikit-learn random-forest decision-tree

Dugong asked 13/1, 2018 at 11:4

3

Remove highly correlated columns from a pandas dataframe

I have a dataframe named data whose correlation matrix I computed by using corr = data.corr(). If the correlation between two columns is greater than 0.75, I want to remove one of them from datafram...

python r pandas machine-learning scikit-learn

Squires asked 3/7, 2017 at 15:39
