Simple solution! Add a "\n".
SELECT encode(digest((x::text||E'\n')::bytea, 'sha1'), 'hex') FROM xtmp;
But the real problem is to get original file without cut the last "\n" (last EOL)... Lets see the function at my old test-kit:
INSERT INTO xtmp (x)
SELECT array_to_string(array_agg(x),E'\n')::xml FROM ttmp
;
It is here the "bug" (after workaround the COPY
that not load a full-file into one row one field).
The array_to_string()
not adds the last EOL, so concatenating by || E'\n'
fixed the bug.
NOTES
Checking other hypothesis and offering a good solution to the test-kit.
POSIX rule is not the problem ...
The end-of-line (EOL) is an obligation at POSIX filesystems (and non-binary mode), see this answer about EOL. We can imagine something like "string and file representations differ by an EOL"... We can check? it differs?
We can demonstrate by terminal that there are no "string vs file" problem, no strange addiction of EOL:
printf "<root/>" > original1.xml
printf "<root/>\n" > original2.xml
sha1sum original*.xml
printf "<root/>" | openssl sha1
printf "<root/>\n" | openssl sha1
results
062c3db8ce3458fc3ccaf2f930bf663d8ce31d7d original1.xml
a05d91cbf0902b0fe341c979e9fc18fc69813f55 original2.xml
(stdin)= 062c3db8ce3458fc3ccaf2f930bf663d8ce31d7d
(stdin)= a05d91cbf0902b0fe341c979e9fc18fc69813f55
So sha1sum
is not using an extra EOL, the string and the file are the some.
Now at SQL, the same conclusions:
SELECT encode(digest('<root/>'::bytea, 'sha1'), 'hex') ;
SELECT encode(digest(E'<root/>\n'::bytea, 'sha1'), 'hex') ;
results
062c3db8ce3458fc3ccaf2f930bf663d8ce31d7d
a05d91cbf0902b0fe341c979e9fc18fc69813f55
Solution for a better test-kit
The COPY
command is ugly for this simple load/save text procedure, lets use a direct getfile function instead:
CREATE FUNCTION getfile(p_file text) RETURNS text AS $$
with open(args[0],"r") as content_file:
content = content_file.read()
return content
$$ LANGUAGE PLpythonU;
SELECT encode(digest( getfile('/tmp/original1.xml') ::bytea, 'sha1'), 'hex') ;
SELECT encode(digest( getfile('/tmp/original2.xml') ::bytea, 'sha1'), 'hex') ;
results
062c3db8ce3458fc3ccaf2f930bf663d8ce31d7d
a05d91cbf0902b0fe341c979e9fc18fc69813f55
Perfect (!), no EOL problem now.