How to fix syntax errors in postgresql .sql dump file when restoring with psql?

I have a PostgreSQL .sql dump file created by pg_dump on another Windows 10 box. I am trying to restore it on my Windows 10 laptop with "psql -U user -d database -1 -f filename.sql". I created the database, but when I run the restore command, psql reports an error after I enter my password:

psql:filename.sql:1:1: ERROR: syntax error at or near "ÿ_" LINE 1: ÿ_;

The file looks like plain ASCII (I see only two dashes on line one, and no 'y' with an umlaut anywhere). I ran file on the .sql file in Cygwin bash, and it reports:

Little-endian UTF-16 Unicode text, with very long lines, with CRLF, CR line terminators

I really don't want to recreate the database by hand. I am looking for any suggestions.

I tried psql with and without the '-1' option; no luck. I tried putting a ';' at the top of the sql file, which I found suggested somewhere; again no luck.

I did a psql -l on my postgresql installation and the encoding on all my databases (including the one to which I am trying to do the restore) shows UTF8.

There really is no code. It is just that I can't seem to restore this dump file because it errors out.

I think that captures my problem. The windows box that I got the dump from is not available to me now; so I'm just hoping there is a way to get around this problem. Recreating the database by hand table by table is something I would prefer to avoid.

Thanks--

Al

Zomba answered 8/8, 2019 at 17:16 Comment(1)
Try converting the dump to UTF-8: iconv -f UTF-16LE -t UTF-8 filename.sql -o filename_utf8.sql and restore from filename_utf8.sql.Tailrace
H
8

In my case, this exact thing happened because I was taking the dump from Windows PowerShell, which caused extra characters to be included in the dump file. Simply using Command Prompt to take the dump solved my problem.
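
As a rough illustration of what those extra characters probably are, the Python sketch below encodes a few dump-like lines the way a Windows PowerShell 5.x redirect is generally understood to write them: as UTF-16LE with a leading byte order mark. The helper name `encode_like_utf16_redirect` and the sample text are made up for this example; it does not call pg_dump or PowerShell.

```python
import codecs


def encode_like_utf16_redirect(text: str) -> bytes:
    """Return text as UTF-16LE bytes preceded by a byte order mark.

    Assumption: this mimics what a redirect in Windows PowerShell 5.x writes.
    """
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


# Hypothetical first lines of a plain-text pg_dump file.
sample = "--\n-- PostgreSQL database dump\n--\n"
raw = encode_like_utf16_redirect(sample)

# The first two bytes are the BOM (FF FE); read as Latin-1 they display as "ÿþ".
bom_as_latin1 = raw[:2].decode("latin-1")
print(repr(bom_as_latin1))

# Every ASCII character now takes two bytes, so the file roughly doubles in size.
print(len(sample.encode("ascii")), "bytes as ASCII vs", len(raw), "bytes as UTF-16")
```

This would be consistent with both the "ÿ" at line 1 in the question's error message and the file shrinking from about 134K to about 65K once it was repaired.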

Hindbrain answered 28/1, 2021 at 8:49 Comment(0)
S
1

I can only give you leads on how to debug the problem, because the cause is not immediately obvious.

First, there should be a line close to the beginning of the dump file that sets client_encoding. The dump file should be in that encoding.
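
One possible way to check that line without opening the dump in an editor is sketched below in Python. It assumes the dump contains a statement of the form `SET client_encoding = '...';`, as plain-text pg_dump output normally does; the function names and the 64 KiB read limit are choices made for this example only.

```python
import re
from typing import Optional

# Plain-text dumps normally contain a line such as:
#   SET client_encoding = 'UTF8';
_CLIENT_ENCODING = re.compile(
    r"^\s*SET\s+client_encoding\s*=\s*'([^']+)'\s*;",
    re.IGNORECASE | re.MULTILINE,
)


def find_client_encoding(dump_text: str) -> Optional[str]:
    """Return the encoding named in the first SET client_encoding line, if any."""
    match = _CLIENT_ENCODING.search(dump_text)
    return match.group(1) if match else None


def client_encoding_of_file(path: str, file_encoding: str = "utf-8",
                            max_bytes: int = 64 * 1024) -> Optional[str]:
    """Decode only the head of the file and look for the declaration there."""
    with open(path, "rb") as handle:
        head = handle.read(max_bytes)
    # errors="replace" keeps the check from failing on bytes that do not decode.
    return find_client_encoding(head.decode(file_encoding, errors="replace"))
```

If `client_encoding_of_file(path)` returns `None` but `client_encoding_of_file(path, "utf-16")` returns `UTF8`, the file on disk is most likely stored as UTF-16 even though the dump declares UTF8.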

I can see two possibilities:

  • The file got mangled during transfer. To test for that, calculate a checksum for both files and compare.

    Always use binary mode to transfer PostgreSQL dumps.

  • Some editor or other tool sneaked a BOM (byte order mark) into the file at the very beginning.

    That's my prime suspect, since the problem is at line 1.

    Use a hex editor or od (in Cygwin) to verify that. If this is the problem, simply replace the BOM with spaces.
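
For anyone without a hex editor or od at hand, both checks above can be approximated in Python. This is only a sketch: `detect_bom` and `sha256_of` are names invented here, and SHA-256 is just one reasonable choice of checksum.

```python
import codecs
import hashlib
from typing import Optional

# UTF-32LE must be tested before UTF-16LE because its BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def detect_bom(path: str) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in binary mode so both copies are compared byte for byte."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running `sha256_of` on the original and on the transferred copy and comparing the two strings covers the first bullet; `detect_bom` covers the second. A result of `UTF-16LE` would suggest the whole file is UTF-16, in which case removing only the BOM would not be enough.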

Slip answered 9/8, 2019 at 5:54 Comment(2)
Thanks! I got someone to help me unmangle the file, though I'm not sure how it was done. The bad file was about 134K and the corrected file that works is about 65K. A file on the corrected file now shows "ASCII text, with very long lines". I dumped the file and then copied it to a thumb drive. I didn't try looking at it with an editor until after it failed. I'm not sure what happened, but thank you for your thoughts. When I gain access to that other box again, I'll have to see if there is something odd about its postgres setup.Zomba
No, that is not connected to PostgreSQL. The file must have been converted to UTF-16LE accidentally. That is an encoding used by Windows, but not by PostgreSQL.Slip
N
0

To add to this answer: with PowerShell 7 there are no such encoding problems with pg_dump.

Noreen answered 14/5, 2023 at 18:46 Comment(3)
Answers should stand alone, without requiring the reader to go through other answers first. Please rectify.Eliathas
@user, I don't have enough reputation to do that.Noreen
You can always edit your own post; that does not require any reputation.Eliathas
P
0

When trying to import a SQL dump file, you might encounter errors related to character encoding. This can often happen if the dump was generated using Windows PowerShell, which might use a different default character encoding.

To address this, you can convert the SQL dump file to UTF-8 encoding using PowerShell:

  1. Open PowerShell in the directory containing your .sql file and run the following command. Make sure to replace FILENAME with the name of your SQL dump file:
Get-Content -Encoding Unicode FILENAME.sql | Set-Content -Encoding UTF8 FILENAME_utf8.sql

This command reads the content of your SQL file assuming it is UTF-16LE (which PowerShell calls Unicode) and then writes it to a new file in UTF-8.

  2. After converting the file to UTF-8, you can import it into your PostgreSQL database. Replace MYDATABASE with the name of your database and FILENAME_utf8.sql with the name of the newly generated file:
psql -U postgres -d MYDATABASE -f FILENAME_utf8.sql

The reason for this conversion is that Windows PowerShell might generate files in UTF-16 by default, while PostgreSQL expects them in UTF-8. The above steps ensure that you convert the file to the expected encoding before importing.
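
If PowerShell is not available on the machine doing the restore, the same conversion could be sketched in Python. This assumes the dump really is UTF-16; the function name `convert_utf16_dump_to_utf8` and the chunk size are choices made for this example.

```python
def convert_utf16_dump_to_utf8(src: str, dst: str,
                               chunk_chars: int = 1 << 20) -> None:
    """Rewrite a UTF-16 text file as UTF-8 without a byte order mark.

    Python's "utf-16" codec consumes a leading BOM and uses it to pick the
    byte order. newline="" leaves the existing line endings untouched.
    """
    with open(src, "r", encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        while True:
            chunk = fin.read(chunk_chars)
            if not chunk:
                break
            fout.write(chunk)
```

Calling `convert_utf16_dump_to_utf8("FILENAME.sql", "FILENAME_utf8.sql")` should leave a file that can be passed to psql with -f as in step 2. Reading in chunks keeps memory use modest for large dumps.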

Psych answered 4/10, 2023 at 1:40 Comment(0)
