scikit-learn Questions

20

I'm doing multiclass text classification in scikit-learn. I'm training a Multinomial Naive Bayes classifier on a dataset with hundreds of labels. Here's an extract from the Scikit Le...
Osteomalacia asked 23/9, 2016 at 13:45

4

I'm looking at this example from the scikit-learn documentation: http://scikit-learn.org/0.18/auto_examples/model_selection/plot_nested_cross_validation_iris.html It seems to me that cross-validation i...
Nealon asked 13/12, 2016 at 18:20

3

I want to use sklearn.compose.ColumnTransformer sequentially (not in parallel, so the second transformer should run only after the first) on intersecting lists of columns, in this way: log_t...
Calise asked 5/6, 2020 at 22:54

2

Solved

Some articles say that when we have only train and test sets, we first need to use fit_transform() to scale the training set and then only transform() for the test set, in order to prevent data leak...
Calfskin asked 12/11, 2019 at 16:54
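
The pattern this excerpt describes can be sketched as follows. This is a minimal illustration, not the asker's code: the random data, the 75/25 split, and the choice of StandardScaler are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Made-up data for illustration only (assumption: 100 rows, 3 features).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))

X_train, X_test = train_test_split(X, test_size=0.25, random_state=0)

scaler = StandardScaler()
# fit_transform learns the mean and standard deviation from the training rows only.
X_train_scaled = scaler.fit_transform(X_train)
# transform reuses those training statistics, so nothing is learned from the test rows.
X_test_scaled = scaler.transform(X_test)
```

Because the scaler is fitted on the training rows alone, the test set contributes nothing to the statistics used for scaling.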

3

I'm trying to wrap my head around the example of Nested vs. Non-Nested CV in Sklearn. I checked multiple answers but I am still confused by the example. To my knowledge, a nested CV aims to use a d...
Fatback asked 6/10, 2017 at 10:18

4

I'm working on a multi-label classification problem with 13 possible outputs in my neural network, using Keras, sklearn, etc. Each output can be an array like [0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0]....
Sightread asked 25/2, 2019 at 16:36

3

Solved

I understand that scaling means centering the data (mean = 0) and giving it unit variance (variance = 1). But what is the difference between preprocessing.scale(x) and preprocessing.StandardScaler() in sci...
Plebiscite asked 16/9, 2017 at 19:21
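
One way to see the practical difference is the sketch below. The toy array and variable names are assumptions for illustration: scale() is a one-shot function, while StandardScaler is an estimator object that stores the fitted statistics and can reapply them to new data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, scale

# Toy data, made up for this example.
X = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])

# scale() standardizes X in one call and keeps no state.
scaled_once = scale(X)

# StandardScaler stores mean_ and scale_ so the same transform can be reapplied.
scaler = StandardScaler().fit(X)
scaled_by_object = scaler.transform(X)

# New rows are standardized with the statistics learned from X.
X_new = np.array([[3.0, 6.0]])
new_scaled = scaler.transform(X_new)
```

On the data it was fitted on, the object produces the same values as the function; the stored statistics are what make it usable inside a pipeline or on a held-out set.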

2

The exact warning is ....\.venv\lib\site-packages\sklearn\base.py:329: UserWarning: Trying to unpickle estimator LinearRegression from version 0.24.1 when using version 1.0.2. This might lead to br...
Dibri asked 11/2, 2022 at 7:11

15

I tried to use pip to install sklearn, and I received the following error message: ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\Users\13434\AppD...
Beilul asked 31/1, 2021 at 15:36

6

Solved

I want to get feature names after I fit the pipeline. categorical_features = ['brand', 'category_name', 'sub_category'] categorical_transformer = Pipeline(steps=[ ('imputer', SimpleImputer(strateg...
Dansby asked 12/2, 2019 at 9:27

2

Solved

I would like to use cross validation to test/train my dataset and evaluate the performance of the logistic regression model on the entire dataset and not only on the test set (e.g. 25%). These co...
Observant asked 26/8, 2016 at 9:46

4

Solved

Is it possible to get classification report from cross_val_score through some workaround? I'm using nested cross-validation and I can get various scores here for a model, however, I would like to s...

2

I want to understand the difference between CART and Sklearn's DecisionTreeClassifier. Sklearn's documentation says that "scikit-learn uses an optimised version of the CART algorithm". However, I co...
Shanley asked 7/10, 2019 at 17:48

7

I want to plot a decision tree of a random forest, so I wrote the following code: clf = RandomForestClassifier(n_estimators=100) import pydotplus import six from sklearn import tree dotfile = six...
Blanche asked 20/10, 2016 at 12:56

4

I am running k-means clustering on a dataset with around 1 million items and around 100 attributes. I applied clustering for various k, and I want to evaluate the different groupings with the silho...
Cyrilcyrill asked 15/5, 2014 at 19:41

6

Solved

I'm a newbie to Machine Learning and trying to work through an error I'm getting using OneHotEncoder class. The error is: "Expected 2D array, got 1D array instead". So when I think of 1D arrays it'...
Dib asked 24/12, 2017 at 0:27
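
The error quoted in this excerpt typically appears when a 1D array is passed where scikit-learn expects a 2D array of shape (n_samples, n_features). A minimal sketch, assuming a made-up single column of string values rather than the asker's data:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# A single feature stored as a 1D array, shape (3,). Values are made up.
colors = np.array(["red", "green", "red"])

# reshape(-1, 1) turns it into one column with shape (3, 1),
# which is the 2D layout the encoder expects.
colors_2d = colors.reshape(-1, 1)

encoder = OneHotEncoder()
# The encoder returns a sparse matrix by default; toarray() makes it dense.
encoded = encoder.fit_transform(colors_2d).toarray()
```

Each row of `encoded` has one column per category found in the data.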

4

Solved

In a multilabel classification setting, sklearn.metrics.accuracy_score only computes the subset accuracy (3): i.e. the set of labels predicted for a sample must exactly match the corresponding set ...
Pragmatist asked 27/8, 2015 at 2:10
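
The subset-accuracy behaviour described in this excerpt can be seen in a small sketch. The label arrays are made up, and the comparison with 1 - hamming_loss is an addition for contrast, not necessarily the measure the asker was looking for.

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss

# Made-up multilabel indicator arrays: 3 samples, 3 labels each.
y_true = np.array([[0, 1, 1], [1, 0, 0], [1, 1, 0]])
y_pred = np.array([[0, 1, 1], [1, 0, 1], [0, 1, 0]])

# Subset accuracy: a sample counts only if every label matches (1 of 3 rows here).
subset_acc = accuracy_score(y_true, y_pred)

# Per-label agreement: fraction of individual labels predicted correctly (7 of 9 here).
per_label_acc = 1 - hamming_loss(y_true, y_pred)
```

The two numbers diverge because a single wrong label disqualifies the whole sample under subset accuracy, while the per-label figure only penalizes the label that was wrong.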

9

Solved

Given is a simple CSV file: A,B,C Hello,Hi,0 Hola,Bueno,1 Obviously the real dataset is far more complex than this, but this one reproduces the error. I'm attempting to build a random forest cla...
Bluh asked 21/5, 2015 at 21:51

8

Solved

I'm trying to replace a column of strings in a Pandas DataFrame with its one-hot encoded equivalent using Scikit-Learn's OneHotEncoder. My code below doesn't work: from sklearn.preproces...

4

Solved

How do I save the StandardScaler() model in Sklearn? I need to make a model operational and don't want to load the training data again and again for StandardScaler to learn from and then apply to new data o...
Pantalets asked 5/11, 2018 at 10:33
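
One common approach is to serialize the fitted scaler and load it back later. The sketch below uses Python's standard pickle module; the file name, temporary directory, and toy data are assumptions for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy training data, made up for this example.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = StandardScaler().fit(X_train)

# Hypothetical location; a temporary directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "scaler.pkl")
with open(path, "wb") as f:
    pickle.dump(scaler, f)

# Later, without the training data: load the scaler and apply it to new rows.
with open(path, "rb") as f:
    loaded = pickle.load(f)

X_new = np.array([[2.0, 20.0]])
X_new_scaled = loaded.transform(X_new)
```

A pickled estimator is best loaded with the same scikit-learn version that saved it, and pickle files should only be loaded from trusted sources.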

4

I have a custom kernel function, and I am using GridSearchCV function with SVC(kernel=my_kernel). my_kernel function takes a parameter k to tune, so I was wondering whether it's possible to config...
Bellyband asked 6/7, 2014 at 11:4

4

I am attempting to run the code below. from sklearn.metrics import plot_confusion_matrix And I am receiving the error below. --------------------------------------------------------------------------- Imp...
Reduction asked 19/9, 2020 at 10:4

11

I have recently uninstalled a nicely working copy of Enthought Canopy 32-bit and installed Canopy version 1.1.0 (64 bit). When I try to use sklearn to fit a model my kernel crashes, and I get the f...
Oratory asked 12/12, 2013 at 20:56

3

Scikit-learn 1.2.0 introduced a lot of changes, including pandas output support for all of the transformers, but how can I use it in a custom transformer? Here is my custom transformer whic...

4

I'm interested in using group lasso for a problem I have. Here is a link to the algorithm. I know R has a slick implementation, but I'm curious to see if Python has something similar. I think skle...
Lezlielg asked 2/4, 2017 at 0:31
