Get Excel column letter based on column header - Python
Asked Answered
H

1

5

This seems relatively straight forward, but I have yet to find a duplicate that answers my question, or a method with the needed functionality.

I have an Excel spreadsheet where each column that contains data has a unique header. I would like to use pandas to get the letter key of the column by passing this header string into a function.

For example, if I pass "Output Parameters" to the function, I would like to return "M":

enter image description here

The closest thing that I have found is the xlsxwriter.utility.xl_col_to_name(index) method outlined in the second answer in Convert spreadsheet number to column letter

This is very similar to what I am trying to do, however the column numbers within my sheets will not remain constant (unlike the headers, which will). This being said, a method that can return the column number based on the header would also work, as I would then be able to apply the above.

Does pandas or xlsxwriter have a method that can handle the above case?

Horripilate answered 29/9, 2021 at 7:33 Comment(7)
What is your use case? If you load a pandas dataframe in from the middle of a sheet, eg using usecols to select the columns and skiprows to omit some rows, would you want to return the oiriginal column letters? Or just treat the first column in the dataframe as column "A"Halfback
@Halfback Yes, the original column letters are needed. The point of the spreadsheet is to act as a 'build your own manufacturing calculator', and should thus be as customizable as possible (therefore, very little is assumed static, making omission difficult). The answer below is exactly what was needed.Horripilate
The selected answer doesn't do that... it predicts what the column letter is from a dataframe if the dataframe were saved to a new Excel file. If the dataframe is created from a spreadsheet loaded with usecols=["B:E"], xlsxwriter will identify the first col as "A" as it would be if the dataframe were written to a new spreadsheet.Halfback
@Halfback I'm not understanding the difference between the two in that case - could you please elaborate if you have time? I have implemented the below solution and the output is as expected. Perhaps my original question was worded poorly.Horripilate
It is possible to load data into a dataframe from the middle of a sheet. For example, a table may appear in a spreadsheet starting at D1, not A1, and spanning cols D:G. The usecols=D:G params would let you load in just those cols. If you then saved the df, or used xlsxwriter.utility.xl_col_to_name on the dataframe, it would return as a sheet with cols A:D. I was trying to clarify if you needed to identify the columns from the original spreadsheet.Halfback
I't' also worth noting that all the xlsxwriter utility function is doing it taking a column index number retrieved from the dataframe and converting it to a column name form: github.com/jmcnamara/XlsxWriter/blob/…Halfback
@Halfback I'm understanding now, thank you. So that doesn't have an impact on what I'm trying to do, as the original spreadsheet and the dataframe I am reading have the same column positions. Though, you have brought up a very important consideration for anybody who may stumble across this question in the future. Thank you for the fantastic clarification.Horripilate
R
8

You can try:

import xlsxwriter
col_no = df.columns.get_loc("col_name")
print(xlsxwriter.utility.xl_col_to_name(col_no))
Ran answered 29/9, 2021 at 8:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.