Convert a dta file to csv without Stata software

M

12

83

Is there a way to convert a dta file to a csv?

I do not have a version of Stata installed on my computer, so I cannot do something like:

File --> "Save as csv"

Maxima answered 29/3, 2010 at 6:3 Comment(2)

I'm sure there is a way. If the format of the .DTA file is specified, it can become a simple programming exercise – Carpus 29/3, 2010 at 6:7

it's binary, I'm not sure how to get it out of there – Maxima 29/3, 2010 at 6:12

A

120

The Python data-analysis library pandas can read Stata files:

>>> import pandas as pd
>>> data = pd.read_stata('my_stata_file.dta')
>>> data.to_csv('my_stata_file.csv')

Amazing!
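For larger files, a chunked variant of the same idea might look like the sketch below. The helper name dta_to_csv and the default chunk size are made up for illustration; it relies on the chunksize argument of pd.read_stata, which returns an iterator of DataFrames, and on index=False to keep the pandas row index out of the CSV.

```python
import pandas as pd


def dta_to_csv(dta_path, csv_path, chunksize=50_000):
    """Stream a Stata file to CSV in chunks (hypothetical helper)."""
    first = True
    # With chunksize set, read_stata returns an iterator of DataFrames
    with pd.read_stata(dta_path, chunksize=chunksize) as reader:
        for chunk in reader:
            chunk.to_csv(
                csv_path,
                mode="w" if first else "a",  # overwrite once, then append
                header=first,                # write the header only once
                index=False,                 # skip the pandas row index
            )
            first = False
```

Calling dta_to_csv('my_stata_file.dta', 'my_stata_file.csv') should give the same columns as the three-liner above, minus the unnamed index column.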

Almsgiver answered 15/9, 2014 at 14:20 Comment(4)

Wow, I can't believe Pandas supports Stata :O – Phoebephoebus 5/3, 2018 at 14:44

This certainly worked for me. Very simple, can be done from command line, and totally free – Illusage 4/12, 2020 at 17:16

That didn't work exactly. With current pandas (2.2) it's just pd.read_stata pandas.pydata.org/pandas-docs/stable/reference/api/… – Tautonym 26/4 at 6:55

Thanks @Tautonym I've updated my answer (from 10 years ago!) – Almsgiver 7/5 at 16:50

U

61

You could try doing it through R:

For Stata <= 15 you can use the haven package to read the dataset and then simply write it to an external CSV file:

library(haven)
yourData = read_dta("path/to/file")
write.csv(yourData, file = "yourStataFile.csv")

Alternatively, visit the link pointed out by huntaub in a comment below.

For Stata <= 12 datasets, the foreign package can also be used:

library(foreign)
yourData <- read.dta("yourStataFile.dta")
write.csv(yourData, file = "yourStataFile.csv")

Ungava answered 5/5, 2010 at 21:48 Comment(3)

Note that this technique does not work if you are utilizing Stata 13 .dta files. You should utilize the techniques in this question. – Frau 31/1, 2015 at 15:9

@huntaub Thanks huntaub, updated answer to clarify that is 12 downwards. – Ungava 2/2, 2015 at 9:15

Note for the complete beginner: start with library(haven) – Abrahan 25/10, 2017 at 21:0

Q

7

You can do it in StatTransfer, R or perl (as mentioned by others), but StatTransfer costs $$$ and R/Perl have a learning curve.
There is a free, menu-driven stats program from AM Statistical Software that can open and convert Stata .dta from all versions of Stata, see:

http://am.air.org/

Quinquepartite answered 19/8, 2010 at 2:0 Comment(1)

BTW, here is Stata's breakdown of how a .dta file is structured, which could be useful for extracting data elements: stata.com/help.cgi?dta – Quinquepartite 19/8, 2010 at 14:31

B

6

I have not tried it, but if you know Perl you can use the Parse-Stata-DtaReader module to convert the file for you.

The module has a command-line tool dta2csv, which can "convert Stata 8 and Stata 10 .dta files to csv"

Beutner answered 29/3, 2010 at 6:43 Comment(0)

E

5

Another way of converting between pretty much any data format using R is with the rio package.

Install R from CRAN and open R
Install the rio package using install.packages("rio")

Load the rio library, then use the convert() function:

library("rio")
convert("my_file.dta", "my_file.csv")

This method allows you to convert between many formats (e.g., Stata, SPSS, SAS, CSV, etc.). It uses the file extension to infer format and load using the appropriate importing package. More info can be found on the R-project rio page.
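For readers who prefer Python, the "infer the format from the file extension" idea can be sketched roughly as follows. This is not how rio is implemented; the READERS and WRITERS tables and the convert function are hypothetical names, and only a handful of pandas readers and writers are wired in.

```python
from pathlib import Path

import pandas as pd

# Hypothetical dispatch tables keyed by file extension
READERS = {".dta": pd.read_stata, ".csv": pd.read_csv, ".sas7bdat": pd.read_sas}
WRITERS = {
    ".csv": lambda df, path: df.to_csv(path, index=False),
    ".dta": lambda df, path: df.to_stata(path, write_index=False),
}


def convert(src, dst):
    """Convert src to dst, inferring both formats from the extensions."""
    src_ext = Path(src).suffix.lower()
    dst_ext = Path(dst).suffix.lower()
    if src_ext not in READERS or dst_ext not in WRITERS:
        raise ValueError(f"unsupported conversion: {src_ext} -> {dst_ext}")
    df = READERS[src_ext](src)  # pick the reader from the source extension
    WRITERS[dst_ext](df, dst)   # pick the writer from the target extension
```

With this in place, convert("my_file.dta", "my_file.csv") mirrors the R call above.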

Easting answered 24/3, 2016 at 14:17 Comment(2)

I'm sure this works great for those already experienced with R, but for those who are not (like me), this will probably be frustrating. It took me an hour-plus of Googling and trial-and-error to figure out all the different packages you have to install before this actually works. – Economist 19/1, 2017 at 5:0

@KennyLJ I'm new to R and found this to be very easy. Just ran install.packages("rio") and was good to go. – Niigata 30/6, 2017 at 4:35

C

4

The R method works reliably and requires little knowledge of R. Note that conversion using the foreign package preserves the data but may introduce differences. For example, write.table adds R's row names as an extra first column, much like an inserted primary key, unless you pass row.names = FALSE.

From http://www.r-bloggers.com/using-r-for-stata-to-csv-conversion/ I recommend:

library(foreign)
write.table(read.dta(file.choose()), file=file.choose(), quote = FALSE, sep = ",")

Carlile answered 13/3, 2013 at 21:59 Comment(0)

T

3

In Python, one can use statsmodels.iolib.foreign.genfromdta to read Stata datasets. There is also a wrapper around this function, statsmodels.datasets.webuse, which reads a Stata file directly from the web.

However, both rely on pandas.io.stata.StataReader.data, which is now a legacy function and has been deprecated. The newer pandas.read_stata function should be used instead.

According to the source file of stata.py, as of version 0.23.0, the following are supported:

Stata data file versions:

104
105
108
111
113
114
115
117
118

Valid encodings:

ascii
us-ascii
latin-1
latin_1
iso-8859-1
iso8859-1
8859
cp819
latin
latin1
L1

As others have noted, the DataFrame.to_csv method can then be used to save the file to disk. A related function, numpy.savetxt, can also save the data as a text file.

EDIT:

The following details come from help dtaversion in Stata 15.1:

        Stata version     .dta file format
        ----------------------------------------
               1               102
            2, 3               103
               4               104
               5               105
               6               108
               7            110 and 111
            8, 9            112 and 113
          10, 11               114
              12               115
              13               117
              14 and 15        118 (# of variables <= 32,767)
              15               119 (# of variables > 32,767, Stata/MP only)
        ----------------------------------------
        file formats 103, 106, 107, 109, and 116
        were never used in any official release.
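If you just want to know which format a given file is in before picking a tool, a rough standard-library sketch is below. It assumes, based on my reading of the published .dta layout, that formats up to 115 store the format number in the first byte and that formats 117 and later begin with an XML-like `<release>` tag. The names DTA_FORMATS and dta_format are made up for illustration, and the mapping is simply copied from the table above.

```python
import re

# Format number -> Stata release(s), copied from the table above
DTA_FORMATS = {
    102: "1", 103: "2, 3", 104: "4", 105: "5", 108: "6",
    110: "7", 111: "7", 112: "8, 9", 113: "8, 9",
    114: "10, 11", 115: "12", 117: "13", 118: "14, 15", 119: "15",
}


def dta_format(header: bytes) -> int:
    """Return the .dta format number from the first bytes of a file."""
    # Assumption: formats 117+ open with <stata_dta><header><release>NNN</release>
    match = re.match(rb"<stata_dta><header><release>(\d{3})</release>", header)
    if match:
        return int(match.group(1))
    if not header:
        raise ValueError("empty header")
    # Assumption: older formats keep the format number in the very first byte
    return header[0]
```

To use it, open the file in binary mode, pass the first 64 bytes or so to dta_format, and look the result up in DTA_FORMATS.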

Tengdin answered 22/5, 2018 at 16:20 Comment(3)

I took the liberty of adding more information on dta versions. – Florio 22/5, 2018 at 16:28

Thanks. I was surprised to see that these details are literally buried in source code so i thought to post them on here for others. – Tengdin 22/5, 2018 at 16:38

They are not "literally buried in source code" but documented openly. – Florio 14/9, 2020 at 6:41

I

2

StatTransfer is a program that moves data easily between Stata, Excel (or csv), SAS, etc. It is very user friendly (requires no programming skills). See www.stattransfer.com

If you use the program just note that you will have to choose "ASCII/Text - Delimited" to work with .csv files rather than .xls

Isom answered 25/6, 2010 at 18:21 Comment(1)

This is paid but you can download to try out. – Federate 1/7, 2013 at 15:34

Y

2

Some answers mention SPSS and StatTransfer, but they are not free. R and Python (also mentioned above) are free alternatives. Personally, I would recommend Python, since I find its syntax much more intuitive than R's. A few lines of pandas are enough to read and export most of the commonly used data formats:

import pandas as pd

df = pd.read_stata('YourDataName.dta')

df.to_csv('YourDataName.csv')

Youngran answered 5/4, 2020 at 17:21 Comment(0)

L

0

SPSS can also read .dta files and export them to .csv, but it costs money. PSPP, a rougher open-source alternative to SPSS, might also be able to read and export .dta files.

Lodmilla answered 28/3, 2018 at 12:25 Comment(0)

W

0

PYTHON - CONVERT STATA FILES IN DIRECTORY TO CSV

import glob
import os
import pandas

path = r"{Path to Folder}"

# loop over every Stata file in the folder
for file in glob.glob(os.path.join(path, "*.dta")):
    # get the file path/name without the ".dta" extension
    file_name, file_extension = os.path.splitext(file)

    # read your data
    df = pandas.read_stata(file, convert_categoricals=False, convert_missing=True)

    # save the data and never think about stata again :)
    df.to_csv(file_name + '.csv')

Wallraff answered 16/7, 2021 at 9:9 Comment(0)

T

-11

For those who have Stata (even though the asker does not) you can use this:

outsheet produces a tab-delimited file by default, so you need to specify the comma option, as below:

outsheet [varlist] using file.csv, comma

Also, if you want to remove labels (which are included by default):

outsheet [varlist] using file.csv, comma nolabel

hat tip to:

http://www.ats.ucla.edu/stat/stata/faq/outsheet.htm

Theobald answered 2/10, 2013 at 2:40 Comment(0)
