Selecting and importing only certain columns from Excel

I have an Excel file that contains many columns of strings, but I want to import only the columns that contain 'NGUYEN'.

I want to generate a single string from the columns in my Excel file that have 'NGUYEN' in them.

import pandas as pd
data = pd.read_excel("my_excel.xlsx", parse_cols='NGUYEN' in col for cols in my_excel.xlsx, skiprows=[0])
data = data.to_string()
print(data)


SyntaxError: invalid syntax

my_excel.xlsx

The output should be:

data =  'NGUYEN VIETNAM HANOIR HAIR PANTS BIKES CYCLING ORANGE GIRL TABLE DARLYN NGUYEN OMG LOL'   
Diazotize answered 15/10, 2017 at 5:24 Comment(1)
Have you tried defining cols first and then passing it as an argument to pd.read_excel? - Snigger
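
One way to read the comment above, as a rough sketch only: in current pandas the column-selection parameter of pd.read_excel is usecols, and it accepts a callable that is evaluated against each column name. The helper names header_matches and read_matching_columns are hypothetical, and this assumes 'NGUYEN' appears in the header row; it does not look at cell contents.

```python
import pandas as pd


def header_matches(needle: str):
    """Build a predicate for read_excel's `usecols` (hypothetical helper)."""
    # str(name) guards against non-string headers such as numbers.
    return lambda name: needle in str(name)


def read_matching_columns(path: str, needle: str) -> pd.DataFrame:
    """Read only the columns whose header contains `needle` (hypothetical helper)."""
    # usecols accepts a callable that is evaluated against each column name.
    return pd.read_excel(path, usecols=header_matches(needle))
```

It would be called as read_matching_columns("my_excel.xlsx", "NGUYEN"). If the match has to be made on cell values rather than headers, the file needs to be loaded first and filtered afterwards.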

I'm pretty sure this is what you are looking for. I tried to make it as simple and compact as possible; if you need help turning it into a more readable multi-line function, let me know!

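import pandas as pd

# Load the whole sheet; by default the first row becomes the column headers.
data = pd.read_excel("my_excel.xlsx")
# Join every cell of each column that has `string` as an exact cell value.
# dropna()/astype(str) skip empty cells and make non-string cells joinable.
getColumnsByContent = lambda string: ' '.join(' '.join(data[column].dropna().astype(str)) for column in data.columns if string in data[column].to_numpy())
print(getColumnsByContent('NGUYEN'))
print(getColumnsByContent('PANTS'))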
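
A more readable multi-line version might look like the sketch below. The function name columns_containing and the small in-memory frame are hypothetical stand-ins; the frame imitates what pd.read_excel("my_excel.xlsx", header=None) could return. It assumes an exact match on cell values and skips empty cells.

```python
import pandas as pd


def columns_containing(data: pd.DataFrame, target: str) -> str:
    """Join the cells of every column that holds `target` as an exact cell value."""
    parts = []
    for column in data.columns:
        # Drop empty cells and coerce the rest to str so join() cannot fail.
        cells = data[column].dropna().astype(str)
        if (cells == target).any():
            parts.append(" ".join(cells))
    return " ".join(parts)


# Hypothetical sample data standing in for the real spreadsheet.
example = pd.DataFrame({
    0: ["NGUYEN", "VIETNAM", "HANOI"],
    1: ["SMITH", "USA", None],
    2: ["DARLYN", "NGUYEN", "OMG"],
})
print(columns_containing(example, "NGUYEN"))  # NGUYEN VIETNAM HANOI DARLYN NGUYEN OMG
```

Note that pd.read_excel treats the first row as headers by default, so a 'NGUYEN' in that row is not searched as a cell unless the file is read with header=None.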
Deutschland answered 1/10, 2020 at 21:29 Comment(0)
