I have been having issues reading a CSV file into Jupyter Notebook. this is the code:
import pandas as pd
mpg = pd.read_csv('C:/Users/Ajibola/Documents/mpg.csv')
mpg.head()
And this is the error I got:
File "<ipython-input-138-844bace16611>", line 1
mpg = pd.read_csv('C:\Users\Ajibola\Documents\mpg.csv')
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
And after prefixing the PATH with r, I got the error:
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-140-a1289650ba91> in <module>
----> 1 mpg = pd.read_csv(r'C:\Users\Ajibola\Documents\mpg.csv')
2 mpg.head()
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
700 skip_blank_lines=skip_blank_lines)
701
--> 702 return _read(filepath_or_buffer, kwds)
703
704 parser_f.__name__ = name
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
427
428 # Create the parser.
--> 429 parser = TextFileReader(filepath_or_buffer, **kwds)
430
431 if chunksize or iterator:
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in __init__(self, f, engine, **kwds)
893 self.options['has_index_names'] = kwds['has_index_names']
894
--> 895 self._make_engine(self.engine)
896
897 def close(self):
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _make_engine(self, engine)
1120 def _make_engine(self, engine='c'):
1121 if engine == 'c':
-> 1122 self._engine = CParserWrapper(self.f, **self.options)
1123 else:
1124 if engine == 'python':
~\Anaconda3\lib\site-packages\pandas\io\parsers.py in __init__(self, src, **kwds)
1851 kwds['usecols'] = self.usecols
1852
-> 1853 self._reader = parsers.TextReader(src, **kwds)
1854 self.unnamed_cols = self._reader.unnamed_cols
1855
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._get_header()
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 0-1: invalid continuation byte
I've run through the community for related problems and answers but making no headway. An answer would be really appreciated.
pathlib
rather than strings for referencing filepaths. Another thought, it could be a weird character in your csv file, you might need to specify the encoding. You could try adding an argument likeencoding="latin1"
to yourread_csv
call, but you'd have to figure out which encoding was used to create the CSV. – Lott