data-science - 2

2

Solved

Extracting a Rust Polars dataframe value as a scalar value

I have the following code to find the mean of the ages in the dataframe. let df = df! [ "name" => ["panda", "polarbear", "seahorse"], "age" =>...

rust data-science rust-polars

Agosto asked 12/11, 2022 at 11:12

2

Missing 'Find in Selection' in VS Code when editing Jupyter Notebooks

The 'find in selection' button is missing from VSCode when working with Jupyter Notebooks. It slows down development so I would like to ask if anybody knows how to activate it? First image shows th...

python visual-studio-code jupyter-notebook data-science

Kirbykirch asked 4/10, 2021 at 9:45

2

Solved

Re-compose a Tensor after tensor factorization

I am trying to decompose a 3D matrix using the Python library scikit-tensor. I managed to decompose my tensor (with dimensions 100x50x5) into three matrices. My question is how can I compose the initia...

python math data-science scikits

Iatrogenic asked 28/9, 2016 at 12:58

5

Solved

Pandas Fillna of Multiple Columns with Mode of Each Column

Working with census data, I want to replace NaNs in two columns ("workclass" and "native-country") with the respective modes of those two columns. I can get the modes easily: mode = df.filter(["wo...

python pandas numpy data-science

Suffrage asked 18/3, 2017 at 4:42

5

Solved

How to do superscripts and subscripts in Jupyter Notebook?

I want to use numbers to indicate references in footnotes, so I was wondering how I can use superscripts and subscripts inside Jupyter Notebook.

python jupyter-notebook jupyter data-science

Herbalist asked 2/9, 2017 at 8:8

6

Solved

Pandas df.describe() - how do I extract values into Dataframe?

I am trying to do a naive Bayes and after loading some data into a dataframe in Pandas, the describe function captures the data I want. I'd like to capture the mean and std from each column of the ...

python pandas dataframe data-science

Permenter asked 27/1, 2019 at 22:45

7

Solved

Filter pandas dataframe by list [duplicate]

I have a dataframe that has a row called "Hybridization REF". I would like to filter so that I only get the data for the items that have the same label as one of the items in my lis...

python pandas numpy data-science

Whitebait asked 11/7, 2017 at 16:45

4

Cannot import name 'CRS' from 'pyproj' for using the osmnx library

I used a fresh Anaconda install to download and install all the required modules for the osmnx library, but I got the following error:

python anaconda data-science osmnx pyproj

Cheeky asked 9/1, 2020 at 6:14

1

What is the difference between Databricks and Spark?

I am trying to get a clear picture of how they are interconnected and whether the use of one always requires the use of the other. If you could give a non-technical definition or explanation of each of them,...

database apache-spark data-science azure-databricks

Dogbane asked 29/9, 2022 at 9:2

5

Solved

melt column by substring of the columns name in pandas (python)

I have a dataframe: subject A_target_word_gd A_target_word_fd B_target_word_gd B_target_word_fd subject_type 1 1 2 3 4 mild 2 11 12 13 14 moderate And I want to melt it to a dataframe that wi...

pandas dataframe data-science melt data-munging

Reword asked 1/1, 2020 at 7:44

2

Getting errors while installing Surprise package

I am using the command below to install the surprise package. I get error messages during installation that I am not able to understand. I need help to install this package successfully. pip ins...

machine-learning anaconda data-science

Snapper asked 12/1, 2021 at 7:10

4

Solved

What is sigma clipping? How do you know when to apply it?

I'm reading a book on Data Science for Python and the author applies a 'sigma-clipping operation' to remove outliers due to typos. However, the process isn't explained at all. What is sigma clipping?...

python pandas numpy statistics data-science

Glennglenna asked 14/8, 2017 at 3:16

2

Solved

Pandas: normalize values by group

I find it hard to explain with words what I want to achieve, so please don't judge me for showing a simple example instead. I have a table that looks like this: main_col some_metadata value ...

python pandas dataframe data-science data-wrangling

Sphenoid asked 27/9, 2022 at 13:51

3

Solved

How can repetitive rows of data be collected in a single row in pandas?

I have a dataset that contains NBA players' average statistics per game. Some players' statistics are repeated because they played for different teams during the season. For example: Player Pos Age...

python pandas dataframe data-science

Footling asked 15/8, 2021 at 15:24

2

Solved

How to handle category mismatch after onehotencoding from test data while predicting?

I'm sorry if the title of the question is not that clear, I could not sum up the problem in one line. Here are the simplified datasets for an explanation. Basically, the number of categories in t...

python machine-learning scikit-learn data-science

Tuber asked 13/12, 2017 at 6:11

4

How can I invoke AWS SageMaker endpoint to get inferences?

I want to get real time predictions using my machine learning model with the help of SageMaker. I want to directly get inferences on my website. How can I use the deployed model for predictions?

amazon-web-services machine-learning data-science amazon-sagemaker

Mellman asked 21/11, 2018 at 4:57

9

Solved

quantile normalization on pandas dataframe

Simply speaking, how to apply quantile normalization on a large Pandas dataframe (probably 2,000,000 rows) in Python? PS. I know that there is a package named rpy2 which could run R in subprocess,...

python deep-learning data-science

Mongol asked 21/6, 2016 at 5:1

2

Solved

Select only columns that have at most N unique values

I want to count the number of unique values in each column and select only those columns which have less than 32 unique values. I tried using df.filter(nunique<32) and df[[ c for df.column...

python pandas dataframe data-science

Costly asked 24/6, 2019 at 16:27

9

Plotly missing orca

I have a small problem when exporting a static chart using plotly. Plotly does not correctly recognize that I have orca installed, and I still get an error related to missing orca. I tried to change the or...

python plotly data-science orca

Plasia asked 20/10, 2019 at 14:10

4

Solved

Difference between Standard scaler and MinMaxScaler

What is the difference between MinMaxScaler() and StandardScaler()? mms = MinMaxScaler(feature_range = (0, 1)) (Used in a machine learning model) sc = StandardScaler() (In another machine learning ...

python python-3.x machine-learning scikit-learn data-science

Cubby asked 9/7, 2018 at 2:42

5

Solved

Is my python implementation of the Davies-Bouldin Index correct?

I'm trying to calculate the Davies-Bouldin Index in Python. Here are the steps the code below tries to reproduce. 5 Steps: For each cluster, compute Euclidean distances between each point to the c...

python statistics cluster-analysis metrics data-science

Trophozoite asked 30/12, 2017 at 18:8

2

Solved

Installing CUDA Windows 10

I am trying to install the CUDA toolkit in order to be able to use Thundersvm on my personal computer. However, I keep getting the following message in the GUI installer: "You already have a ne...

windows cuda data-science driver

Fessler asked 27/1, 2021 at 18:41

2

Solved

Jupyter Notebook ImportError: cannot import name 'example_var'

When I change/add a variable to my config.py file and then try to import it to my Jupyter Notebook I get: ImportError: cannot import name 'example_var' from 'config' config.py: example_var = 'exa...

python import jupyter-notebook data-science

Corum asked 20/1, 2021 at 17:8

1

How do I get misclassified instances and their indices for each fold cross validation in python?

from sklearn import datasets, linear_model from sklearn.model_selection import cross_val_predict iris = datasets.load_iris() X = iris.data[:150] y = iris.target[:150] lasso = linear_model.Las...

python-3.x data-science cross-validation

Earlie asked 18/3, 2021 at 7:12

4

Solved

How to install and use basemap on Google Colab?

I'm using a Google Colab notebook for a project that requires me to plot GPS coordinates on a map. I want to use basemap for this purpose. I tried to import it on the Colab notebook by using from mpl...

python matplotlib data-science data-analysis google-colaboratory

Ozonide asked 10/2, 2019 at 6:38
