data-analysis Questions

3

I am new to data analysis. I am currently using seaborn 0.13.1 along with pandas 2.2.0 and I was messing around with the following code: import pandas as pd import seaborn as sns from matplotlib im...
Sheeree asked 25/1 at 19:13

41

Solved

I am analyzing the following data: Raw data (separated with spaces): 1 1 1.1 1 0.9 1 1 1.1 1 0.9 1 1.1 1 1 0.9 1 1 1.1 1 1 1 1 1.1 0.9 1 1.1 1 1 0.9 1 1.1 1 1 1.1 1 0.8 0.9 1 1.2 0.9 1 1 1.1 1.2 1...

5

Solved

Suppose I have a dataframe with columns a, b and c. I want to sort the dataframe by column b in ascending order, and by column c in descending order. How do I do this?
Terrel asked 17/6, 2013 at 6:28
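
One possible approach, sketched under the assumption that the dataframe is a pandas DataFrame: `sort_values` accepts a list of columns together with a matching list of `ascending` flags. The sample values below are made up for illustration.

```python
import pandas as pd

# Illustrative data only; the column names a, b, c come from the question.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 1, 2, 1],
    "c": [10, 20, 30, 40],
})

# Sort by b ascending, then by c descending within each value of b.
sorted_df = df.sort_values(by=["b", "c"], ascending=[True, False])
print(sorted_df)
```

The two lists are matched by position, so each column gets its own sort direction.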

6

Solved

How can I implement the case_when function of R in Python code? Here is the case_when function of R: https://www.rdocumentation.org/packages/dplyr/versions/0.7.8/topics/case_when as a minimu...
Anjaanjali asked 12/2, 2019 at 15:20
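
A rough sketch of one common way to get case_when-like behaviour, assuming NumPy and pandas are available: `numpy.select` takes a list of conditions and a list of choices and uses the first condition that matches, much like case_when. The `score` column, the thresholds and the labels are hypothetical, since the question's own example is cut off.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the original question's example is truncated.
df = pd.DataFrame({"score": [95, 72, 40]})

# Conditions are checked in order; the first True one wins for each row.
conditions = [df["score"] >= 90, df["score"] >= 60]
choices = ["high", "medium"]

# `default` plays the role of case_when's final catch-all branch.
df["grade"] = np.select(conditions, choices, default="low")
print(df)
```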

9

Solved

energy.loc['Republic of Korea'] I want to change the value of index from 'Republic of Korea' to 'South Korea'. But the dataframe is too large and it is not possible to change every index value. Ho...
Binnacle asked 4/11, 2016 at 16:50
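
A minimal sketch, assuming `energy` is a pandas DataFrame indexed by country name: `DataFrame.rename` with an `index` mapping changes only the labels listed and leaves the others alone. The `supply` column and the second row are placeholders, not taken from the question.

```python
import pandas as pd

# Placeholder frame; only the index label 'Republic of Korea' comes from the question.
energy = pd.DataFrame(
    {"supply": [1.0, 2.0]},
    index=["Republic of Korea", "Japan"],
)

# Map old index labels to new ones; unlisted labels are kept unchanged.
energy = energy.rename(index={"Republic of Korea": "South Korea"})
print(energy.loc["South Korea"])
```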

3

Solved

I would like to change the colors in this plot. It visualizes the data properly, but as you can see it isn't easy to read because all these colors are very similar (7 classes). Is there a simple way to do it? ...
Breathe asked 16/1, 2021 at 22:23

4

Solved

I've got a large table of data in an Excel spreadsheet that, essentially, can be considered to be a collection of values for individuals identified as belonging to various subpopulations: IndivID...
Penguin asked 18/11, 2012 at 14:32

3

Solved

Consider the following dataset: ig_5 <- data.frame( category = c("A", "B", "C", "D", "E", "F"), prop = c(0.1, 0.2, 0.15, 0.25, 0.05,...
Pierian asked 9/5, 2023 at 21:39

1

Solved

How do I do something like: from tqdm.notebook import tqdm from matplotlib import pyplot as plt from IPython import display import time import numpy as np xx = list() for i in tqdm(range(500)): ...
Keeton asked 2/4, 2021 at 5:46

3

Solved

Suppose I have a dataframe like so:

    a  b
    1  5
    1  7
    2  3
    1  3
    2  5

I want to sum up the values for b where a = 1, for example. This would give me 5 + 7 + 3 = 15. How do I do this in pandas?
Hola asked 30/1, 2015 at 12:48
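
A short sketch of two ways this could be done in pandas, using the data from the question: a boolean mask with `.loc` for a single value of a, or `groupby` to get the sums for every value of a at once.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [5, 7, 3, 3, 5]})

# Select the rows where a == 1, take column b, and sum it.
total = df.loc[df["a"] == 1, "b"].sum()
print(total)  # 15

# Sum of b for each distinct value of a.
per_group = df.groupby("a")["b"].sum()
print(per_group)
```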

5

Solved

When doing data analysis, I sometimes need to recode values to factors in order to carry out group analysis. I want to keep the order of the factor the same as the order of conversion specified in case_wh...
Brobdingnagian asked 30/3, 2018 at 10:05

13

Solved

I have different dataframes and need to merge them together based on the date column. If I only had two dataframes, I could use df1.merge(df2, on='date'); to do it with three dataframes, I use df1....
Extenuate asked 2/6, 2017 at 11:38
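
One way this is often handled, sketched with made-up frames: put the dataframes in a list and fold them together with `functools.reduce`, applying the same `merge(..., on='date')` step pairwise. The columns `x`, `y`, `z` and the dates are illustrative, and the default inner join is an assumption; pass `how=` to `merge` if another join type is needed.

```python
from functools import reduce

import pandas as pd

# Illustrative frames sharing a 'date' column.
df1 = pd.DataFrame({"date": ["2017-01-01", "2017-01-02"], "x": [1, 2]})
df2 = pd.DataFrame({"date": ["2017-01-01", "2017-01-02"], "y": [3, 4]})
df3 = pd.DataFrame({"date": ["2017-01-01", "2017-01-02"], "z": [5, 6]})

frames = [df1, df2, df3]

# Merge pairwise from left to right: ((df1 merge df2) merge df3).
merged = reduce(lambda left, right: left.merge(right, on="date"), frames)
print(merged)
```

The same call works unchanged for any number of frames in the list.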

1

I have a PDF file and I need to edit some text/values in the PDF. For example, in the PDF files that I have, BIRTHDAY DD/MM/YYYY is always N/A. I want to change it to whatever value I desire and the...
Demarcate asked 7/6, 2018 at 13:25

3

Solved

Following are 2 measures: SUMX ( ALL ( SALES ) , SALES[AMT] ) CALCULATE ( SUMX ( SALES, SALES[AMT] ), ALL (SALES) ) Similarly for the following 2 measures: SUMX ( FILTER ( SALES, SALES[QTY]>1 ...
Hayton asked 15/1, 2021 at 20:05

5

import pandas as pd numbers = {1,2,3,4,5} ser = pd.Series(numbers) print ser I wrote this code in Python for a pandas Series, but it's giving this "AttributeError: 'module' object has no attribu...
Presumption asked 14/5, 2015 at 1:33

3

I have two different dfs that I want to combine using: pd.concat([df1, df2], 1) The end result being a df with the date as the index and all of the cols. According to pandas documentation, this...
Theona asked 26/4, 2017 at 2:51

4

Solved

I'm using a Google Colab notebook for a project that requires me to plot GPS coordinates on a map. I want to use basemap for this purpose. I tried to import it on the Colab notebook by using from mpl...

2

Solved

In the Excel sheet, I have two columns with large numbers. But when I read the Excel file with read_excel() and display the dataframe, those two columns are printed in scientific format with expon...
Rhombic asked 31/7, 2016 at 23:08
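
A hedged sketch of one way to change how large floats are displayed, assuming the columns were read in as floats: pandas has a `display.float_format` option that controls how floats are rendered without altering the stored values. The column name `big` and its value are made up, and the frame is built directly here rather than read from an Excel file.

```python
import pandas as pd

# Made-up stand-in for a column of large numbers read from Excel.
df = pd.DataFrame({"big": [1.23456789e10]})

# Temporarily render floats with no decimals and no exponent.
with pd.option_context("display.float_format", "{:.0f}".format):
    text = df.to_string()

print(text)
```

Using `pd.set_option("display.float_format", ...)` instead would apply the same formatting for the rest of the session.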

1

I am using a titanic.csv dataset where I am trying to use ColumnTransformer and Pipeline, and while using pipe.predict(x_test) I am getting an error. Here is my code. titanic={'sex':['M','M','M','F','...
Obrian asked 2/4, 2022 at 7:49

6

Solved

I had the following data frame (the real data frame is much larger than this one): sale_user_id sale_product_id count 1 1 1 1 8 1 1 52 1 1 312 5 1 315 1 Then reshaped it to move the values in s...
Cordeiro asked 15/8, 2016 at 7:59

2

I have a temp DF that has the following data in it Quarter 2016Q3 146660510.0 2016Q4 123641451.0 2017Q1 125905843.0 2017Q2 129656327.0 2017Q3 126586708.0 2017Q4 116804168.0 2018Q1 118167263.0 2018Q...

5

I've been trying to practise what I've learned from this tutorial:(https://realpython.com/sentiment-analysis-python/) using PyCharm. And this line: textcat.add_label("pos") generated a w...
Disabuse asked 24/3, 2021 at 23:02

4

I replaced the missing values with NaN using the following lambda function: data = data.applymap(lambda x: np.nan if isinstance(x, basestring) and x.isspace() else x) where data is the dataframe I am ...
Spinelli asked 2/10, 2015 at 7:58

5

Solved

Given that I have the following two vectors: In [99]: time_index Out[99]: [1484942413, 1484942712, 1484943012, 1484943312, 1484943612, 1484943912, 1484944212, 1484944511, 1484944811, 148...
Croaker asked 21/1, 2017 at 14:24

4

Solved

I would like to add certain words to the default stopwords list used in wordcloud. Current code: all_text = " ".join(rev for rev in twitter_clean.text) stop_words = ["https", "co", "RT"] wordcloud...
Crash asked 1/1, 2019 at 17:20
