Is there a way to read Stata labels in Python?
L

3

7
import pandas as pd

df = pd.read_stata('file.dta')
for col in df.columns:
    name = col.lower()
    dtype = df[col].dtype  # renamed from "type" to avoid shadowing the built-in
    # label = ...

I need to get the label/description of each column in Python.

Longobard answered 28/6, 2017 at 18:8 Comment(0)
L
1

I got it working with this:

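import pandas as pd

reader = pd.io.stata.StataReader('file.dta')
header = reader.variable_labels()  # dict: variable name -> variable label
for name, label in header.items():
    pass  # use name and label here, e.g. write them out as a CSV row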
Longobard answered 28/6, 2017 at 19:22 Comment(2)
I was about to comment on the typo, but you fixed it. I am not sure what you were trying to do with the for loop, though, as "header" is already a dictionary. Btw, in retrospect I would have just posted my answer as a comment, but it got two quick upvotes so I decided to leave it. - Sepulcher
Yes, I was writing it to a CSV file row by row and then doing a few more manipulations with it. But yes, thanks for your input! :) - Longobard
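
For anyone after the same thing, the row-by-row CSV export mentioned in the comment above could look roughly like the sketch below. It is only an illustration: the `header` dict is a stand-in for what `reader.variable_labels()` returns, the column names `name` and `label` are made up, and an in-memory buffer is used instead of a real file.

```python
import csv
import io

# Stand-in for reader.variable_labels(); the entries are placeholder values.
header = {"x": "x label", "y": "y label"}

# In-memory buffer for the sketch; swap in
# open("labels.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "label"])  # hypothetical column titles
for name, label in header.items():
    writer.writerow([name, label])

csv_text = buffer.getvalue()
```

Since `header` is already a dictionary, iterating over `header.items()` yields each name and label pair directly, with no second lookup.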
M
9

In pandas 0.22, you can also get the labels by creating the iterator, i.e.:

import pandas as pd
itr = pd.read_stata('file.dta', iterator=True)
itr.variable_labels()

This will return a dictionary where the keys are variable names and the values are variable labels. I think this is easier to remember than pd.io.stata.StataReader.
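
To tie this back to the loop in the question, here is a minimal sketch that pairs each column with its dtype and label. The sample data, the temporary file, and the `summary` name are assumptions made only so the snippet runs on its own; with a real file you would skip the `to_stata` step and pass your own path.

```python
import os
import tempfile

import pandas as pd

# Write a tiny labelled .dta file so the sketch is runnable (made-up data).
sample = pd.DataFrame({"x": [1, 2], "y": [0.5, 1.5]})
path = os.path.join(tempfile.mkdtemp(), "file.dta")
sample.to_stata(
    path,
    write_index=False,
    variable_labels={"x": "x label", "y": "y label"},
)

# Open the file once, then pull both the labels and the data from the reader.
with pd.read_stata(path, iterator=True) as reader:
    labels = reader.variable_labels()  # dict: variable name -> label
    df = reader.read()                 # the full DataFrame

# One (name, dtype, label) tuple per column, as the question asks for.
summary = [
    (col.lower(), str(df[col].dtype), labels.get(col, ""))
    for col in df.columns
]
```

Using the reader as a context manager closes the file when the block ends. `labels.get(col, "")` is a defensive choice for columns that might be missing from the dictionary.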

Mali answered 8/1, 2018 at 20:11 Comment(0)
S
4

This will return a dictionary of labels:

>>> pd.io.stata.StataReader('file.dta').variable_labels()
{'x': 'x label', 'y': 'y label'}
Sepulcher answered 28/6, 2017 at 20:40 Comment(2)
"reader" was not defined in that answer, so it wasn't clear where it came from. From your answer it seems it comes from pd.io, which is something new for me. :) - Bicipital
Ah, yes, good point! Thanks! I presume it was just a typo (now fixed, btw), but I'm happy to have added something of value in any event. - Sepulcher
