scikit-learn Questions
3
Solved
I have been trying to use the scikit-learn library to solve this problem. Roughly:
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
# Make or ...
Giraffe asked 10/10 at 17:32
3
I have a dataset which has a DateTime index and I'm using PCA from sklearn to reduce the number of dimensions.
The following question bugs me - will PCA keep the order of the points in my series s...
Tour asked 1/2, 2017 at 13:50
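A minimal sketch for the PCA question above, assuming the series lives in a pandas DataFrame with a DateTime index (the frame `df`, its columns, and the random values are invented for illustration). PCA projects each row independently, so row i of the output corresponds to row i of the input and the original index can be reattached.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical data: 6 daily observations of 4 variables.
rng = np.random.default_rng(0)
index = pd.date_range("2017-01-01", periods=6, freq="D")
df = pd.DataFrame(rng.normal(size=(6, 4)), index=index, columns=list("abcd"))

pca = PCA(n_components=2)
reduced = pca.fit_transform(df.to_numpy())

# Row order is preserved, so the DateTime index can be reattached as-is.
reduced_df = pd.DataFrame(reduced, index=df.index, columns=["pc1", "pc2"])

# Transforming one row on its own matches the corresponding output row.
row3_alone = pca.transform(df.to_numpy()[[3]])
same_row = np.allclose(row3_alone[0], reduced[3])
```

Note that PCA ignores the ordering entirely: shuffling the rows before fitting would give the same components, which may or may not be what a time-series analysis wants.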
4
I'm attempting to do a grid search to optimize my model but it's taking far too long to execute. My total dataset is only about 15,000 observations with about 30-40 variables. I was successfully ab...
Balanchine asked 3/5, 2022 at 14:51
11
Solved
I am trying to import imblearn into my python notebook after installing the required modules. However, I am getting the following error:
Additional info: I am using a virtual environment in Visual...
Malvoisie asked 1/7, 2023 at 8:52
8
I wrote a text classification program. When I run the program it crashes with an error as seen in this screenshot:
ValueError: With n_samples=0, test_size=0.2 and train_size=None, the resulting t...
Passably asked 3/2, 2020 at 16:25
6
Solved
UPDATED: In the end, the solution I opted to use for clustering my large dataset was one suggested by Anony-Mousse below. That is, using ELKI's DBSCAN implementation to do my clustering rather than...
Derick asked 5/5, 2013 at 5:4
3
Solved
I am trying to deploy an existing Scikit-Learn model in Amazon SageMaker, that is, a model that wasn't trained on SageMaker but locally on my machine.
On my local (windows) machine I've saved my model a...
Tamelatameless asked 25/1, 2021 at 9:3
2
I am trying to implement a bag-of-words model from the Kaggle site on Twitter sentiment data with around 1M rows. I have already cleaned it, but in the last part, when I applied my feature vectors and sentim...
Lustring asked 26/4, 2017 at 17:9
3
Solved
When passing x,y in fit, I am getting the following error:
Traceback (most recent call last):
File "C:/Classify/classifier.py", line 95, in
train_avg, test_avg, cms = train_model(X, y, "cep...
Calycine asked 24/11, 2016 at 7:12
2
Solved
There are standard ways of predicting proportions such as logistic regression (without thresholding) and beta regression. There have already been discussions about this:
http://scikit-learn-genera...
Yovonnda asked 29/5, 2017 at 4:38
4
One can create a multivariate kernel density estimate (KDE) with
scikitlearn (https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KernelDensity.html#sklearn.neighbors.KernelDensity)...
Dropwort asked 14/2, 2020 at 9:34
2
Using the sklearn package, I would like to find the Gini coefficients for each feature on a class of paths, such as in the iris data. For example, Iris-virginica: petal length gini 0.4, petal width gini 0.4.
Levitical asked 13/7, 2017 at 11:42
3
Solved
I am trying to make my git repository pip-installable. In preparation for that I am restructuring the repo to follow the right conventions. My understanding from looking at other repositories is th...
Sang asked 8/2, 2019 at 17:7
4
Solved
In scikit-learn, some clustering algorithms have both predict(X) and fit_predict(X) methods, like KMeans and MeanShift, while others only have the latter, like SpectralClustering. According to the ...
Jonette asked 9/5, 2016 at 2:25
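A small sketch of the difference the question above describes, on invented blob data. It only demonstrates which methods exist: KMeans can label unseen points through its fitted centroids, while SpectralClustering exposes no `predict` and labels only the points it was fitted on.

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_blobs

# Hypothetical toy data: 60 points around 3 centers.
X, _ = make_blobs(n_samples=60, centers=3, random_state=0)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
train_labels = kmeans.fit_predict(X[:50])  # fit and label the training points
new_labels = kmeans.predict(X[50:])        # label unseen points by nearest centroid

spectral = SpectralClustering(n_clusters=3, random_state=0)
spectral_labels = spectral.fit_predict(X)  # only the fitted points get labels

kmeans_has_predict = hasattr(kmeans, "predict")
spectral_has_predict = hasattr(spectral, "predict")
```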
2
Solved
I have this dataset with target LULUS; it's an imbalanced dataset. I'm trying to print the ROC AUC score for each fold of my data, but somehow every fold raises an error saying Valu...
Antwanantwerp asked 29/5, 2021 at 16:7
4
I want to implement a custom loss function in scikit learn. I use the following code snippet:
def my_custom_loss_func(y_true,y_pred):
    diff3=max((abs(y_true-y_pred))*y_true)
    return diff3
score=m...
Carmeliacarmelina asked 19/1, 2019 at 13:47
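The snippet above is cut off, so the following is only a sketch under the assumption that the goal is to wrap the loss with `make_scorer` for model evaluation. The data and the choice of `LinearRegression` are invented for illustration, and `np.max` replaces the built-in `max` so the function works on arrays.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score

def my_custom_loss_func(y_true, y_pred):
    # Largest absolute error, weighted by the true value.
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.max(np.abs(y_true - y_pred) * y_true))

# greater_is_better=False marks this as a loss, so the scorer returns its negative.
loss_scorer = make_scorer(my_custom_loss_func, greater_is_better=False)

# Hypothetical regression data with positive targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + 10.0 + rng.normal(scale=0.1, size=40)

scores = cross_val_score(LinearRegression(), X, y, cv=4, scoring=loss_scorer)
```

A scorer built this way changes how a model is evaluated or selected; it does not change the objective the estimator minimizes during `fit`.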
3
Solved
I got this from the sklearn webpage:
Pipeline: Pipeline of transforms with a final estimator
Make_pipeline: Construct a Pipeline from the given estimators. This is a shorthand for the Pipeline co...
Harriettharrietta asked 20/11, 2016 at 18:56
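To make the two definitions quoted above concrete, here is a short sketch (the step names `scale` and `clf` and the chosen estimators are arbitrary). The visible difference is naming: `Pipeline` takes explicit (name, estimator) pairs, while `make_pipeline` derives each name from the lowercased class name.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# Explicit step names chosen by the caller.
explicit = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Step names generated automatically from the estimator types.
auto = make_pipeline(StandardScaler(), LogisticRegression())

explicit_names = [name for name, _ in explicit.steps]
auto_names = [name for name, _ in auto.steps]
```

The names matter mostly when addressing parameters, for example `clf__C` in the first pipeline versus `logisticregression__C` in the second.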
4
I understand that random_state is used in various sklearn algorithms to break ties between different predictors (trees) with the same metric value (for example, in GradientBoosting). But the document...
Zwiebel asked 29/9, 2014 at 10:38
3
Solved
When importing sklearn datasets, e.g.
from sklearn.datasets import fetch_mldata
from sklearn.datasets import fetch_openml
I get the error
Traceback (most recent call last):
File "numbers.py", l...
Thresher asked 11/3, 2019 at 19:19
2
Solved
I'm currently trying to train a linear model using sklearn in Python, not with mean squared error (MSE) as the error measure but with mean absolute error (MAE). I specifically need a linear model...
Morpheus asked 17/5, 2018 at 13:31
5
Solved
Is there a way to retrieve the list of feature names used for training of a classifier, once it has been trained with the fit method? I would like to get this information before applying to unseen ...
Doolittle asked 8/11, 2016 at 11:6
4
I am trying to run some machine learning algorithms on a dataset using scikit-learn. My dataset has some features which are categorical. For example, one feature is A, which has values 1,2,3 specifying the q...
Argosy asked 9/4, 2015 at 2:18
8
Solved
I am trying to plot a Receiver Operating Characteristics (ROC) curve with cross validation, following the example provided in sklearn's documentation. However, the following import gives an ImportE...
Salvo asked 20/2, 2020 at 13:44
2
Solved
I apply the decision tree classifier and the random forest classifier to my data with the following code:
def decision_tree(train_X, train_Y, test_X, test_Y):
    clf = tree.DecisionTreeClassifier()...
Dugong asked 13/1, 2018 at 11:4
3
I have a dataframe named data whose correlation matrix I computed by using
corr = data.corr()
If the correlation between two columns is greater than 0.75, I want to remove one of them from datafram...
Squires asked 3/7, 2017 at 15:39
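A sketch of one way to do what the question above asks, on invented data. Two choices here are assumptions rather than part of the question: the absolute correlation is used, so strong negative correlations also count, and of each correlated pair the later column is the one dropped.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "b" is nearly a copy of "a", "c" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
data = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.1, size=200),
    "c": rng.normal(size=200),
})

corr = data.corr().abs()

# Keep only the strict upper triangle so each pair is inspected once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop any column correlated above 0.75 with an earlier column.
to_drop = [col for col in upper.columns if (upper[col] > 0.75).any()]
reduced = data.drop(columns=to_drop)
```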