data-analysis Questions
5
Solved
I've read the answers to this question and they are quite helpful, but I need help.
I have an example data set in R as follows:
x <- c(32,64,96,118,126,144,152.5,158)
y <- c(99.5,104.8,108.5...
Koel asked 29/9, 2010 at 14:24
5
Solved
I think an example will be much better than a loooong description :)
Let's assume we have an array of arrays:
("Server1", "Server_1", "Main Server", "192.168.0.3")
("Server_1", "VIP Server", "Main Ser...
Anallise asked 26/5, 2011 at 6:22
5
Solved
The DataFrame (input):
0 1.0 25.0
1 1.0 31.0
2 2.0 97.0
3 1.0 25.0
4 1.0 26.0
Output:
I want to get an array with indexes from 1 up to and including 97 that says how many times each index occurred in ...
Unmeasured asked 20/6, 2021 at 6:52
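A minimal sketch of one way to count occurrences for every index from 1 to 97. The excerpt is cut off, so this assumes the counts are wanted for the second data column (the one that reaches 97); the integer column labels and the use of `np.bincount` are illustrative choices, not something the question confirms.

```python
import numpy as np
import pandas as pd

# The five rows shown in the excerpt; columns get default labels 0 and 1.
df = pd.DataFrame(
    [[1.0, 25.0], [1.0, 31.0], [2.0, 97.0], [1.0, 25.0], [1.0, 26.0]]
)

# Assumption: the second column holds the values to be counted.
values = df[1].astype(int).to_numpy()

# bincount returns counts for 0..97; drop slot 0 so counts[i - 1] is the count of i.
counts = np.bincount(values, minlength=98)[1:98]

print(len(counts))
print(counts[24], counts[25], counts[30], counts[96])
```

Values that never occur simply get a count of 0, so the result always has 97 entries.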
2
I have a DataFrame like the one below:
df = pd.DataFrame([
("i", 1, 'GlIrbixGsmCL'),
("i", 1, 'GlIrbixGsmCL'),
("i", 1, '3IMR1UteQA'),
("c", 1, 'GlIrbixGsmCL'),
(...
Decamp asked 30/5, 2021 at 4:13
5
Solved
I have a dataframe like the one below, where the first column contains dates and the other columns contain data for those dates:
date k1-v1 k1-v2 k2-v1 k2-v2 k1k3-v1 k1k3-v2 k4-v1 k4-v2
0 2021-01-05 2.0 7.0 NaN NaN...
Hypethral asked 5/6, 2021 at 10:25
12
Solved
I will be analysing a vast amount of network-traffic data shortly, and will pre-process the data in order to analyse it. I have found that R and SPSS are among the most popular tools for stat...
Oshiro asked 24/9, 2010 at 12:54
3
Solved
I have the following data frame:
data = pd.DataFrame({'user_id' : ['a1', 'a1', 'a1', 'a2','a2','a2','a3','a3','a3'], 'product_id' : ['p1','p1','p2','p1','p1','p1','p2','p2','p3']})
product_id use...
Yockey asked 13/8, 2016 at 13:12
2
Let's say I have this data set and want to analyse the trends between male and female literacy across the rural and urban regions of every state. I need to set the index to Name,
which I can do as:
df...
Pedicular asked 1/12, 2020 at 2:7
3
Is it best to split your data into training and test sets before doing any exploratory data analysis, or do all exploration based solely on training data?
I'm working on my first full machine learn...
Indiraindirect asked 21/1, 2019 at 1:8
2
Solved
Hi, I am trying to get the name of the dataframe column that contains a specific word.
E.g.:
I have a dataframe,
NA good employee
Not available best employer
not required well manager
not eligible su...
Claypool asked 17/11, 2017 at 10:11
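One possible way to find which column contains a given word is sketched below. The excerpt shows no column names and is cut off, so the three column labels, the split of each row into three cells, and the search word are all hypothetical.

```python
import pandas as pd

# Hypothetical frame built from the three complete rows in the excerpt.
df = pd.DataFrame({
    "status": ["NA", "Not available", "not required"],
    "rating": ["good", "best", "well"],
    "role": ["employee", "employer", "manager"],
})

word = "manager"  # hypothetical search term

# Boolean frame: True wherever a cell contains the word (case-insensitive, literal match).
mask = df.apply(
    lambda col: col.astype(str).str.contains(word, case=False, regex=False)
)

# Keep the names of the columns that have at least one matching cell.
has_match = mask.any(axis=0)
names = has_match[has_match].index.tolist()

print(names)
```

Casting each column to `str` first keeps the check from failing on numeric or missing cells.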
1
Solved
I am trying to draw a bar graph using plotly.express, but I get this error:
All arguments should have the same length. The length of argument y
is 51, whereas the length of previously-processed arguments...
Alphanumeric asked 23/7, 2020 at 19:3
1
I tried to compare pandas.unique() and numpy.unique(), and I found that the latter actually outperforms the former.
I am not sure whether the advantage is linear ...
Minton asked 14/11, 2018 at 23:57
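Apart from speed, the two functions differ in what they return, which is worth keeping in mind when comparing them. The sketch below uses a small made-up array: `pandas.unique` keeps values in order of first appearance, while `numpy.unique` returns them sorted.

```python
import numpy as np
import pandas as pd

# Made-up sample with repeated values in unsorted order.
values = np.array([3, 1, 3, 2, 1])

# pandas.unique is hash-based and preserves order of first appearance.
by_appearance = pd.unique(values)

# numpy.unique returns the distinct values in sorted order.
sorted_unique = np.unique(values)

print(by_appearance.tolist())
print(sorted_unique.tolist())
```

Which one is faster can depend on the dtype and size of the input, so timing both on data shaped like your own (for example with `timeit`) is more reliable than a general rule.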
3
Solved
The problem: let us take the Titanic dataset from Kaggle.
I have a dataframe with columns "Pclass", "Sex" and "Age".
I need to fill NaN in column "Age" with the median for a certain group.
If it is a woman f...
Cementum asked 23/11, 2017 at 14:34
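A minimal sketch of filling missing ages with a per-group median, assuming the groups are defined by "Pclass" and "Sex" (the excerpt is cut off before it spells out the grouping). The six rows are invented for illustration and are not taken from the Kaggle data.

```python
import numpy as np
import pandas as pd

# Invented sample rows with the three columns named in the question.
df = pd.DataFrame({
    "Pclass": [1, 1, 1, 3, 3, 3],
    "Sex": ["female", "female", "female", "male", "male", "male"],
    "Age": [30.0, np.nan, 40.0, 20.0, 22.0, np.nan],
})

# transform("median") returns a Series aligned with df, holding each row's group median.
group_median = df.groupby(["Pclass", "Sex"])["Age"].transform("median")

# Only the NaN entries are replaced; existing ages are left untouched.
df["Age"] = df["Age"].fillna(group_median)

print(df["Age"].tolist())
```

A group whose ages are all missing would still come out as NaN, so a fallback such as the overall median may be needed on real data.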
2
Solved
In pandas, axis=0 represents rows and axis=1 represents columns.
Therefore, to get the sum of values in each row in pandas, df.sum(axis=0) is called.
But it returns the sum of values in each column and...
Momentum asked 9/5, 2020 at 1:25
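The behaviour described in the question can be reproduced with a tiny frame. In this sketch (the frame is made up), the axis argument names the dimension that gets collapsed, so `axis=0` runs down the rows and yields one total per column, while `axis=1` runs across the columns and yields one total per row.

```python
import pandas as pd

# Made-up 2x3 frame, used only to illustrate the axis convention.
df = pd.DataFrame({"a": [1, 2], "b": [10, 20], "c": [100, 200]})

# axis=0 collapses the rows: one total per column.
col_totals = df.sum(axis=0)

# axis=1 collapses the columns: one total per row.
row_totals = df.sum(axis=1)

print(col_totals.tolist())
print(row_totals.tolist())
```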
3
I am creating a dataframe by subsetting with the conditions below:
subset_df = df_eq.loc[(df_eq['place'].str.contains('Chile')) & (df_eq['mag'] > 7.5),['time','latitude','longitude','mag','place']...
Antiquated asked 29/7, 2016 at 15:4
3
Solved
I have noticed that when One Hot encoding is used on a particular data set (a matrix) and used as training data for learning algorithms, it gives significantly better results with respect to ...
Achieve asked 4/7, 2013 at 12:4
3
Solved
I'm writing a program in C++ that uses data from MATLAB involving cross-correlation.
I understand that when I do a correlation on 2 sets of data it gives me a single correlation coefficient number ...
Binette asked 8/6, 2011 at 16:9
4
Solved
I have GPS data of ice speed from three different GPS receivers. The data are in a pandas dataframe with an index of julian day (incremental from the start of 2009).
This is a subset of the data (...
Reginaldreginauld asked 28/11, 2012 at 10:43
1
Solved
I find violin plots very informative and useful, I use python library 'seaborn'.
However, when applied to positive values, they nearly always show negative values at the lower end. I find this real...
Hugmetight asked 8/1, 2020 at 15:50
3
Solved
Can we connect Spark with SQL Server? If so, how?
I am new to Spark. I want to connect the server to Spark and work directly from SQL Server instead of uploading a .txt or .csv file. Please help, Tha...
Melantha asked 17/1, 2018 at 7:12
2
I have been attempting to detect peaks in sinusoidal time-series data in real time, however I've had no success thus far. I cannot seem to find a real-time algorithm that works to detect peaks in s...
Incisure asked 2/1, 2020 at 3:11
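As a starting point only, here is a sketch of a very simple streaming detector that flags a sample as a peak once the next sample arrives and turns out to be no larger. It is a plain three-point local-maximum test, not an established real-time algorithm; the class name, the `min_height` parameter and the synthetic sine input are all assumptions made for illustration.

```python
import math


class StreamingPeakDetector:
    """Three-point local-maximum test applied one sample at a time."""

    def __init__(self, min_height=float("-inf")):
        self.min_height = min_height  # ignore peaks below this value
        self.prev = None              # sample at t-1
        self.prev2 = None             # sample at t-2
        self.index = -1               # index of the most recent sample

    def update(self, value):
        """Feed one sample; return (index, value) of a peak at t-1, else None."""
        self.index += 1
        peak = None
        if self.prev is not None and self.prev2 is not None:
            rising_then_not = self.prev2 < self.prev >= value
            if rising_then_not and self.prev >= self.min_height:
                peak = (self.index - 1, self.prev)
        self.prev2, self.prev = self.prev, value
        return peak


# Synthetic, noise-free sine wave with a period of 20 samples.
samples = [math.sin(2 * math.pi * t / 20) for t in range(60)]

detector = StreamingPeakDetector(min_height=0.5)
peaks = [p for p in (detector.update(s) for s in samples) if p is not None]

print([i for i, _ in peaks])
```

Each peak is reported one sample late, and noisy data would produce many spurious peaks with this test, so smoothing or a minimum distance between peaks would likely be needed in practice.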
1
Solved
This is my first question ever here; I hope I am doing this right.
I was working on the Titanic dataset, which is popular on Kaggle, following this tutorial if you want to check it: A Data Science Framework: To Achieve 9...
Azalea asked 15/12, 2019 at 17:39
1
Solved
I have a huge data set containing bacteria samples (4 types of bacteria) from 10 water resources from 2010 until 2019. Some values are missing, so we need to exclude them from the plot or analysis...
Hapsburg asked 23/11, 2019 at 13:35
3
Solved
Often, I want to run a cross validation on a dataset which contains some factor variables and after running for a while, the cross validation routine fails with the error: factor x has new levels Y...
Sherfield asked 13/11, 2013 at 6:30
0
I would like to know whether there is a framework like Python's pandas in Swift or Objective-C (preferably Swift) for an iOS application.
I would like to have a library that coul...
Figment asked 2/10, 2019 at 21:33