I am using Stata to process some data, export it to a CSV file, and load it into Python with the pandas read_csv function.
The problem is that every step is very slow. Exporting from Stata to a CSV file takes ages (exporting in Stata's native dta format is much faster), and loading the data via read_csv is also very slow. Using the pandas read_stata function is even worse.
I wonder if there are any other options, such as exporting to a format other than CSV? My CSV dataset is approximately 6-7 GB.
Any help appreciated
Thanks
read_stata() is much faster starting with version 0.15.0 of pandas, so make sure you are up to date. – Mikkanen
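For what it's worth, here is a minimal sketch of reading the dta file directly with read_stata in chunks, which skips the CSV export entirely. The helper name load_dta_in_chunks, the default chunk size, and the idea of selecting a subset of columns are my own assumptions for illustration, not something from the question; whether this is actually faster on a 6-7 GB file depends on the pandas version and the data.

```python
import pandas as pd


def load_dta_in_chunks(path, chunksize=100_000, columns=None):
    """Read a Stata .dta file in pieces and return a single DataFrame.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # Passing chunksize makes read_stata return a reader that yields
    # DataFrames of at most `chunksize` rows instead of loading everything
    # at once. `columns` optionally restricts which variables are read.
    with pd.read_stata(path, chunksize=chunksize, columns=columns) as reader:
        chunks = [chunk for chunk in reader]
    # Stitch the pieces back together with a fresh 0..n-1 index.
    return pd.concat(chunks, ignore_index=True)
```

Note that concatenating all chunks still needs enough memory for the whole dataset. If that is a problem, the same loop could process or aggregate each chunk and discard it instead of collecting them in a list.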