cross-validation Questions

3

How do you perform cross-validation in a deep neural network? I know that to perform cross-validation you train it on all folds except one and test it on the excluded fold. Then do this for k f...
Disk asked 10/6, 2017 at 16:39
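
A minimal sketch of the procedure this question describes (train on all folds except one, test on the held-out fold, repeat k times), written in plain Python. The helper name `k_fold_indices` and the sample counts are hypothetical, chosen only for illustration. For a neural network, the usual approach is to build a fresh, re-initialized model inside the loop so that no weights carry over between folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds (hypothetical helper)."""
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    # A new model would be created, trained on train_idx and scored on test_idx here.
    print(len(train_idx), len(test_idx))
```

The per-fold scores are then typically averaged to give one estimate of generalization performance.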

4

I'm attempting to do a grid search to optimize my model but it's taking far too long to execute. My total dataset is only about 15,000 observations with about 30-40 variables. I was successfully ab...

10

Solved

I'm trying to predict the next customer purchase for my job. I followed a guide, but when I tried to use the cross_val_score() function, it returned NaN values. Variables: ...
Sorely asked 11/2, 2020 at 15:36

4

I'm looking at this example from the scikit-learn documentation: http://scikit-learn.org/0.18/auto_examples/model_selection/plot_nested_cross_validation_iris.html It seems to me that cross-validation i...
Nealon asked 13/12, 2016 at 18:20

3

I'm trying to wrap my head around the example of Nested vs. Non-Nested CV in Sklearn. I checked multiple answers but I am still confused by the example. To my knowledge, a nested CV aims to use a d...
Fatback asked 6/10, 2017 at 10:18

2

Solved

I would like to use cross validation to test/train my dataset and evaluate the performance of the logistic regression model on the entire dataset and not only on the test set (e.g. 25%). These co...
Observant asked 26/8, 2016 at 9:46

4

Solved

Is it possible to get classification report from cross_val_score through some workaround? I'm using nested cross-validation and I can get various scores here for a model, however, I would like to s...

5

Solved

Does the cross_val_predict (see doc, v0.18) with k-fold method as shown in the code below calculate accuracy for each fold and average them finally or not? cv = KFold(len(labels), n_folds=20) clf...
Amari asked 4/1, 2017 at 7:57
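
A hedged sketch contrasting the two functions this question touches on. It uses the current `sklearn.model_selection` API rather than the v0.18 `KFold(len(labels), n_folds=20)` call in the excerpt; the iris data, the classifier and the fold count are assumptions made for illustration. `cross_val_predict` returns one out-of-fold prediction per sample and computes no score itself, whereas `cross_val_score` returns one score per fold.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, cross_val_predict, cross_val_score

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy value per fold, averaged afterwards.
fold_scores = cross_val_score(clf, X, y, cv=cv)
mean_of_folds = fold_scores.mean()

# One out-of-fold prediction per sample, scored once over the pooled predictions.
y_pred = cross_val_predict(clf, X, y, cv=cv)
pooled_accuracy = accuracy_score(y, y_pred)

print(mean_of_folds, pooled_accuracy)
```

With equally sized folds and accuracy as the metric, the two numbers coincide; with unequal folds or other metrics they can differ.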

1

Solved

I am using the sklearn version "1.4.dev0" to weight samples in the fitting and scoring process as described in this post and in this documentation. https://scikit-learn.org/dev/metadata_r...
Fabrizio asked 21/11, 2023 at 12:47

4

Solved

I'm trying to reproduce this GitHub project on my machine, on Topological Data Analysis (TDA). My steps: get best parameters from a cross-validation output; load my dataset; feature selection; extrac...
Sampling asked 15/1, 2021 at 20:17

3

Solved

The following code is used to do KFold validation, but I am unable to train the model as it is throwing the error ValueError: Error when checking target: expected dense_14 to have shape (7,) but got array...
Marcasite asked 26/2, 2019 at 17:19

4

I'm trying to use StratifiedKFold to create train/test/val splits for use in a non-sklearn machine learning workflow. So, the DataFrame needs to be split and then stay that way. I'm trying to do ...
Groome asked 20/7, 2017 at 17:54

2

I am trying to implement a cross-validation scheme on grouped data. I was hoping to use the GroupKFold method, but I keep getting an error. What am I doing wrong? The code (slightly different from ...
Explicate asked 1/11, 2016 at 23:6
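
For reference, a small sketch of how `GroupKFold` is commonly called. The toy arrays and group labels below are made up for illustration and are not the asker's data. The point it shows is that the group labels go to `split()` through the `groups` argument, and that no group ends up on both sides of a split.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Toy data: six samples belonging to three groups (illustrative values only).
X = np.arange(12).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
groups = np.array(["a", "a", "b", "b", "c", "c"])

gkf = GroupKFold(n_splits=3)
splits = list(gkf.split(X, y, groups=groups))

for train_idx, test_idx in splits:
    # Every group appears entirely in train or entirely in test, never both.
    print(sorted(set(groups[train_idx])), sorted(set(groups[test_idx])))
```

`n_splits` cannot exceed the number of distinct groups, which is one common source of errors with this splitter.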

0

I am using an XGBClassifier and trying to do a grid search in order to tune some parameters, and I get this warning: WARNING: ../src/learner.cc:1517: Empty dataset at worker: 0 whenever I launch the c...
Scald asked 9/2, 2023 at 12:33

2

I'm trying to use a Convolutional Neural Network (CNN) for image classification, and I want to use KFold cross-validation to train and test on the data. I'm new to this and I don't really understand how t...

2

Solved

I have an imbalanced dataset for a binary classification problem. I have built a Random Forest classifier and used k-fold cross-validation with 10 folds. kfold = model_selection.KFold(n_splits...

8

Solved

I'm tinkering with some cross-validation code from the PySpark documentation, and trying to get PySpark to tell me what model was selected: from pyspark.ml.classification import LogisticRegression...

2

Solved

With sklearn, when you create a new KFold object and shuffle is true, it'll produce different, newly randomized fold indices. However, every generator from a given KFold object gives the same ind...
Communicate asked 22/1, 2016 at 6:36
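
A short sketch of the reproducibility behaviour being discussed, using the current `sklearn.model_selection.KFold` API (the excerpt dates from 2016, when the older API was in use, so this is not necessarily what the asker ran). The array shape and seed are arbitrary. With an integer `random_state`, repeated calls to `split()` on the same object yield identical folds.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)

# A fixed integer seed makes the shuffled folds repeatable across split() calls.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
first_pass = [test_idx for _, test_idx in kf.split(X)]
second_pass = [test_idx for _, test_idx in kf.split(X)]

same = all(np.array_equal(a, b) for a, b in zip(first_pass, second_pass))
print(same)
```

To get a different shuffle per repetition, one option is to create a new `KFold` with a different seed each time.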

6

Solved

I am using sklearn for a multi-class classification task. I need to split alldata into train_set and test_set. I want to randomly take the same number of samples from each class. Actually, I am using this funct...
Bodine asked 18/2, 2016 at 4:13

2

Solved

I have trained a model in scikit-learn using Cross-Validation and Naive Bayes classifier. How can I persist this model to later run against new instances? Here is simply what I have, I can get the...
Confab asked 21/9, 2015 at 17:2
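
One possible approach, sketched under assumptions: the iris data, `GaussianNB` and the file name are placeholders, not the asker's setup. `cross_val_score` fits clones of the estimator and leaves the original unfitted, so the sketch fits a final model on all the data after cross-validation and saves that with `joblib`.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
model = GaussianNB()

# Cross-validation only estimates performance; it fits clones, not `model` itself.
scores = cross_val_score(model, X, y, cv=5)

# Fit the final model on all available data, then persist it to disk.
model.fit(X, y)
path = os.path.join(tempfile.mkdtemp(), "nb_model.joblib")  # placeholder location
joblib.dump(model, path)

# Later, load the saved model and predict on new instances.
restored = joblib.load(path)
print(restored.predict(X[:3]))
```

Saved models are generally only safe to load with the same scikit-learn version that wrote them.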

4

Solved

I've fit a Pipeline object with RandomizedSearchCV pipe_sgd = Pipeline([('scl', StandardScaler()), ('clf', SGDClassifier(n_jobs=-1))]) param_dist_sgd = {'clf__loss': ['log'], 'clf__penalty': [N...

2

Solved

I'm trying to use GridSearchCV for RandomForestRegressor, but always get ValueError: Found array with dim 100. Expected 500. Consider this toy example: import numpy as np from sklearn import ense...
Gustation asked 11/1, 2015 at 18:14

6

Solved

I have a dataset, which has previously been split into 3 sets: train, validation and test. These sets have to be used as given in order to compare the performance across different algorithms. I wo...
Robenarobenia asked 11/8, 2015 at 18:3

3

I was using StratifiedKFold from scikit-learn, but now I also need to account for "groups". There is a nice function, GroupKFold, but my data are very time-dependent. So, similarly to the help, i.e. number o...
Reception asked 26/11, 2016 at 14:52

1

from sklearn import datasets, linear_model from sklearn.model_selection import cross_val_predict iris = datasets.load_iris() X = iris.data[:150] y = iris.target[:150] lasso = linear_model.Las...
Earlie asked 18/3, 2021 at 7:12
