How to open my files in data_folder with pandas using relative path? [duplicate]
C

18

75

I'm working with pandas and need to read some csv files, the structure is something like this:

folder/folder2/scripts_folder/script.py

folder/folder2/data_folder/data.csv

How can I open the data.csv file from the script in scripts_folder?

I've tried this:

absolute_path = os.path.abspath(os.path.dirname('data.csv'))

pandas.read_csv(absolute_path + '/data.csv')

I get this error:

File folder/folder2/data_folder/data.csv does not exist
Crumple answered 13/2, 2016 at 19:25 Comment(2)
C
96

Try

import pandas as pd
pd.read_csv("../data_folder/data.csv")
Capriccio answered 13/2, 2016 at 19:31 Comment(5)
Actually this was the best solution. - Punchy
Why do we not specify all parent folders? - Boris
And also by using os. - Boris
Thanks for the answer. Issue resolved when I used double quotes instead of single quotes. - Zymometer
This solution is OS dependent. If you intend to run your code on different operating systems, it will fail. Using os.path.dirname, as suggested by several other members, is a more robust solution. - Antecedence
H
49

Pandas resolves a relative path against the current working directory, which is the folder containing your Python file when you run the script from there. You can therefore move from that directory to where your data is located with '..'. For example:

pd.read_csv('../../../data_folder/data.csv')

will go 3 levels up and then into data_folder (assuming it's there), or

pd.read_csv('data_folder/data.csv')

assuming your data_folder is in the same directory as your .py file.
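The point that matters here is that a relative path is resolved against the working directory of the Python process, which is not always the folder that holds the script. A minimal sketch using only the standard library (the helper name resolve_against and the /folder/... locations are hypothetical) shows how the same relative string lands in different places depending on where the process was started:

```python
import os


def resolve_against(cwd, relative_path):
    """Return the path that `relative_path` names when the working
    directory is `cwd` (hypothetical helper, for illustration only)."""
    return os.path.normpath(os.path.join(cwd, relative_path))


# Started from scripts_folder: the path reaches data_folder as intended.
from_scripts = resolve_against("/folder/folder2/scripts_folder",
                               "../data_folder/data.csv")

# Started from folder2 (for example `python scripts_folder/script.py`):
# the same string now points one level too high.
from_folder2 = resolve_against("/folder/folder2", "../data_folder/data.csv")

print(from_scripts)
print(from_folder2)
```

If the two printed paths differ from what you expect, os.getcwd() is usually the first thing worth checking.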

Holocrine answered 25/4, 2017 at 1:44 Comment(1)
Thanks, this is the explanation I needed as a python / machine learning starter! - Palsy
C
19

You could use the __file__ attribute:

import os
import pandas as pd
df = pd.read_csv(os.path.join(os.path.dirname(__file__), "../data_folder/data.csv"))
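A slightly expanded sketch of the same idea, in case it helps to see the intermediate steps. The function name data_csv_path is hypothetical, and the script location is passed in as a parameter (you would normally pass __file__) so the function can be tried with any path:

```python
import os


def data_csv_path(script_file):
    """Build the path to data_folder/data.csv from the script's own folder."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    # normpath collapses the ".." so the path is easier to read in error messages.
    return os.path.normpath(
        os.path.join(script_dir, "..", "data_folder", "data.csv")
    )


# Hypothetical script location, used here instead of __file__.
csv_path = data_csv_path("/folder/folder2/scripts_folder/script.py")
print(csv_path)
```

The resulting string can be handed to pd.read_csv, and it does not depend on the directory the script was launched from.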
Chen answered 6/11, 2016 at 9:37 Comment(0)
V
14

For non-Windows users:

import pandas as pd
import os

os.chdir("../data_folder")
df = pd.read_csv("data.csv")

For Windows users:

import pandas as pd

df = pd.read_csv(r"C:\data_folder\data.csv")

The r prefix makes the string above a raw string, so the backslashes in the Windows path are kept literally instead of being read as escape sequences.
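To see why the raw-string prefix matters for Windows paths, here is a small sketch. The folder and file names are made up for the illustration and were chosen because they start with letters that form escape sequences:

```python
# Without the r prefix, \n and \t are read as a newline and a tab.
plain = "C:\new_folder\table.csv"

# With the r prefix, every backslash is kept as a literal character.
raw = r"C:\new_folder\table.csv"

print(repr(plain))
print(repr(raw))
print(raw.count("\\"), plain.count("\\"))
```

Forward slashes avoid the problem altogether, which is why many of the other answers use them even for Windows paths.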

Valentia answered 11/11, 2018 at 6:45 Comment(4)
No reason to import os. See @ksooklall's answer. - Aaronson
pandas automatically finds the CSV (or any other dataset file) relative to where your notebook is running; os.chdir() just changes the working directory to the location you want to read data from. If you store a large number of files in one folder, you then don't need to write the location each time you read a CSV file. - Valentia
That has side effects and it's bad practice to ignore those for some perceived convenience. I say perceived convenience because the other answer is shorter, more readable, and doesn't have the unnecessary dependency yours does. - Aaronson
For Windows, this just works: pd.read_csv("../data_folder/data.csv") - Nickels
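The side effect discussed in these comments is that os.chdir() changes the working directory for the rest of the process. If changing directory is still preferred, one way to contain it is a small context manager that restores the previous directory afterwards. This is only a sketch: working_directory is a hypothetical helper, and a temporary directory stands in for the data folder. Python 3.11 and later also provide contextlib.chdir for the same purpose.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the working directory, then switch back."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


before = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        # Relative paths such as "data.csv" resolve against tmp here.
        inside = os.getcwd()
after = os.getcwd()

print(before == after)
```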
F
5
# script.py
import os

# Directory containing this script: folder/folder2/scripts_folder
current_dir = os.path.abspath(os.path.dirname(__file__))

# Path to the CSV file, relative to the script's directory
csv_filename = os.path.join(current_dir, '../data_folder/data.csv')
Forefront answered 13/2, 2016 at 19:31 Comment(0)
S
5

Keeping things tidy with f-strings:

import os
import pandas as pd

data_files = '../data_folder/'
csv_name = 'data.csv'

pd.read_csv(f"{data_files}{csv_name}")
Spodumene answered 28/1, 2020 at 19:8 Comment(0)
S
3

Whether you call read_csv or pd.read_csv, pandas resolves a relative file name against the current working directory, which by default is the directory where the Python process was started. One option is to use the os module to chdir() into the data directory and read from there.

import pandas as pd 
import os
print(os.getcwd())
os.chdir("D:/01Coding/Python/data_sets/myowndata")
print(os.getcwd())
df = pd.read_csv('data.csv',nrows=10)
print(df.head())
Ser answered 28/1, 2020 at 19:37 Comment(0)
U
2

The question "Reading file using relative path in python project" answers this.

Basically, using Path from pathlib, you would do the following in script.py:

from pathlib import Path
import pandas as pd

path = Path(__file__).parent / "../data_folder/data.csv"
pd.read_csv(path)
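A variation on the pathlib approach, sketched under the assumption that data_folder sits next to the folder containing the script. The function name locate_data_csv is hypothetical, and the script location is a parameter (normally __file__) so the function can be tried with any path:

```python
from pathlib import Path


def locate_data_csv(script_file):
    """Return the absolute path of data_folder/data.csv, assumed to sit
    next to the folder that contains `script_file`."""
    script_dir = Path(script_file).resolve().parent
    return script_dir.parent / "data_folder" / "data.csv"


# Hypothetical script location, used here instead of __file__.
csv_path = locate_data_csv("/folder/folder2/scripts_folder/script.py")
print(csv_path)
print(csv_path.is_file())  # worth checking before calling pd.read_csv(csv_path)
```

Using .parent twice instead of a ".." segment keeps the path free of relative components, which tends to make "file does not exist" errors easier to read.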
Undue answered 20/12, 2020 at 21:29 Comment(0)
S
2

If you want to keep your code tidy, I would suggest assigning the path and the file name separately and then reading:

import pandas as pd

path = 'C:/Users/username/Documents/folder/'  # trailing slash so the two parts join correctly
file_name = 'file_name.xlsx'

file = pd.read_excel(f"{path}{file_name}")

Segregationist answered 30/11, 2021 at 16:35 Comment(0)
H
1
import pandas as pd
df = pd.read_csv('C:/data_folder/data.csv')
Histochemistry answered 17/9, 2019 at 1:50 Comment(1)
The provided answer was flagged for review as a Low Quality Post. Here are some guidelines for How do I write a good answer?. This provided answer is not correct, and code-only answers are not considered "good" answers. From review. The OP specifically asked for a relative path, but you have answered the question with an absolute path. - Dede
T
1

I was also looking for the relative path version, and this works OK. Note that in Spyder (Python 3.6) a Windows path such as C:\Users inside an ordinary docstring raises (unicode error) 'unicodeescape' codec can't decode bytes, reported at the closing triple quote, because \U starts a Unicode escape. The docstring below is therefore written as a raw string (r"""). Adjust the file names and location for your environment and check the indentation.

# -*- coding: utf-8 -*-
r"""
Created on Fri Jan 24 12:12:40 2020

Source: Read a .csv into pandas from F: drive on Windows 7

Demonstrates: Load a csv not in the CWD by specifying relative path - windows version

@author: Doug

From CWD C:\Users\Doug\.spyder-py3\Data Camp\pandas we will load file
C:/Users/Doug/.spyder-py3/Data Camp/Cleaning/g1803.csv
"""

import csv

trainData2 = []

# Relative path: one level up from the CWD, then into Cleaning
with open(r'../Cleaning/g1803.csv', 'r') as train2Csv:
    trainReader2 = csv.reader(train2Csv, delimiter=',', quotechar='"')
    for row in trainReader2:
        trainData2.append(row)

print(trainData2)
Thermionic answered 25/1, 2020 at 13:51 Comment(0)
K
1

You can always point to your home directory using ~ then you can refer to your data folder.

import pandas as pd
df = pd.read_csv("~/mydata/data.csv")

For your case, it should be like this

import pandas as pd
df = pd.read_csv("~/folder/folder2/data_folder/data.csv")

You can also set your data directory as a prefix

import pandas as pd
DATA_DIR = "~/folder/folder2/data_folder/"
df = pd.read_csv(DATA_DIR+"data.csv")

You can take advantage of f-strings as @nikos-tavoularis said

import pandas as pd
DATA_DIR = "~/folder/folder2/data_folder/"
FILE_NAME = "data.csv"
df = pd.read_csv(f"{DATA_DIR}{FILE_NAME}")
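The snippets above rely on the leading ~ being expanded to the home directory. It is a shell convention rather than a real directory name, so Python's built-in open() does not expand it. A minimal sketch of doing the expansion explicitly with the standard library, reusing the folder layout from the question:

```python
import os

tilde_path = "~/folder/folder2/data_folder/data.csv"

# expanduser replaces the leading "~" with the current user's home directory.
expanded = os.path.expanduser(tilde_path)
print(expanded)

# Paths without a leading "~" are returned unchanged.
print(os.path.expanduser("data_folder/data.csv"))
```

Expanding the path yourself also makes it usable with functions that do not understand ~.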
Kumar answered 17/7, 2020 at 20:6 Comment(0)
D
1

You can use . to represent the current working directory and .. for its parent.

import pandas as pd

# Linux
df = pd.read_csv("../data_folder/data.csv")
# Windows
df = pd.read_csv("..\\data_folder\\data.csv")
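Rather than writing one string per operating system, the separator can be left to the standard library. The sketch below builds the same relative path with os.path.join and, purely for illustration, renders it in both styles with pathlib's pure path classes. Note that Windows also accepts forward slashes, so "../data_folder/data.csv" generally works on both systems as it stands.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

parts = ("..", "data_folder", "data.csv")

# Uses whichever separator the current operating system expects.
relative = os.path.join(*parts)
print(relative)

# The same path rendered explicitly in each style.
print(PurePosixPath(*parts))
print(PureWindowsPath(*parts))
```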
Dael answered 18/3, 2022 at 6:10 Comment(0)
A
0

You can try with this.

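import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escapes
df = pd.read_csv(r"E:\working datasets\sales.csv")
print(df.head())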
Airfield answered 22/4, 2021 at 18:29 Comment(0)
G
0
import os

s_path = os.path.abspath(__file__)
# s_path = ".../folder/folder2/scripts_folder/script.py"
s_path = s_path.split(os.sep)
print(s_path)
# [..., 'folder', 'folder2', 'scripts_folder', 'script.py']

# Drop 'scripts_folder' and 'script.py', then append the data location
d_path = s_path[:-2] + ['data_folder', 'data.csv']
print(os.sep.join(d_path))
# .../folder/folder2/data_folder/data.csv
Gilgilba answered 4/7, 2021 at 13:2 Comment(0)
P
0

Try this: Open a new terminal window. Drag and drop the file (that you want Pandas to read) in that terminal window. This will return the full address of your file in a line. Copy and paste that line into read_csv command as shown here:

import pandas as pd
pd.read_csv("the path returned by terminal")

That's it.

Preter answered 24/5, 2022 at 5:30 Comment(0)
S
0

Just replace your "/" with ""

Southerly answered 25/1, 2023 at 5:43 Comment(2)
This should be a comment. You can also edit your question and add some additional information supporting your answer. - Proportional
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. From Review - Inexperienced
K
0

You could use os.path.join with the name of the file which you are trying to read and the path where this file is located. Here is the example below:

import pandas as pd
import os
path = '/home/user/d_directory/'
file = 'xyz.csv'
data = pd.read_csv(os.path.join(path, file))
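One practical reason to prefer os.path.join over string concatenation is that it copes with a directory written with or without a trailing slash. A short sketch, reusing the hypothetical directory from the answer above:

```python
import os

directory = "/home/user/d_directory"   # hypothetical location, no trailing slash
file_name = "xyz.csv"

joined = os.path.join(directory, file_name)
joined_with_slash = os.path.join(directory + "/", file_name)
concatenated = directory + file_name   # the separator is silently lost

print(joined)
print(joined_with_slash)
print(concatenated)
```

Both os.path.join calls point at the same file, while the plain concatenation produces a name that does not exist.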
Kendo answered 11/7, 2023 at 10:2 Comment(0)
