Get size of large object in PostgreSQL query?

6

22

I would like to obtain the byte size of a blob.

I am using Postgresql and would like to obtain the size using an SQL query. Something like this:

SELECT sizeof(field) FROM table;

Is this possible in Postgresql?

Update: I have read the postgresql manual and could not find an appropriate function to calculate the file size. Also, the blob is stored as a large object.

Disenable answered 16/4, 2012 at 5:57 Comment(2)
Please read the manual before posting such a question: postgresql.org/docs/current/static/functions.html - Pachydermatous
@DanielVérité: it does look to be a dupe, but in fairness, I couldn't find that question when I searched before posting my answer. Who calls them "lobjects", honestly? ;-) My function is uncannily like yours from that question, though in my defence, if I'd copied it I would have copied the error handling too! - Ringleader
24

Not that I've used large objects, but looking at the docs: http://www.postgresql.org/docs/current/interactive/lo-interfaces.html#LO-TELL

I think you have to use the same technique as some file system APIs require: seek to the end, then tell the position. PostgreSQL has SQL functions that appear to wrap the internal C functions. I couldn't find much documentation, but this worked:

CREATE OR REPLACE FUNCTION get_lo_size(oid) RETURNS bigint
VOLATILE STRICT
LANGUAGE 'plpgsql'
AS $$
DECLARE
    fd integer;
    sz bigint;
BEGIN
    -- Open the LO; N.B. it needs to be in a transaction otherwise it will close immediately.
    -- Luckily a function invocation makes its own transaction if necessary.
    -- The mode x'40000'::int corresponds to the PostgreSQL LO mode INV_READ = 0x40000.
    fd := lo_open($1, x'40000'::int);
    -- Seek to the end.  2 = SEEK_END.
    PERFORM lo_lseek(fd, 0, 2);
    -- Fetch the current file position; since we're at the end, this is the size.
    sz := lo_tell(fd);
    -- Remember to close it, since the function may be called as part of a larger transaction.
    PERFORM lo_close(fd);
    -- Return the size.
    RETURN sz;
END;
$$; 

Testing it:

-- Make a new LO, returns an OID e.g. 1234567
SELECT lo_create(0);

-- Populate it with data somehow
...

-- Get the length.
SELECT get_lo_size(1234567);
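
For a quick end-to-end check, a minimal sketch along these lines might help. It assumes PostgreSQL 9.4 or later (for lo_from_bytea), that get_lo_size from above has already been created, and that the OID 424242 is unused; that OID is an arbitrary, hypothetical value chosen only for illustration.

```sql
BEGIN;

-- Create a large object with a known 4-byte payload.
-- 424242 is a hypothetical, assumed-unused OID; pick any free one.
SELECT lo_from_bytea(424242, '\xdeadbeef'::bytea);

-- Should report 4, the number of bytes written above.
SELECT get_lo_size(424242);

-- Remove the test object again.
SELECT lo_unlink(424242);

COMMIT;
```

Using a fixed OID keeps the example free of client-side variables, at the cost of failing if that OID happens to be taken.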

The LO functionality seems designed to be used mostly through the client library or through low-level server programming, but at least some SQL-visible functions are provided for it, which makes the above possible. I ran SELECT proname FROM pg_proc WHERE proname LIKE 'lo%' to get myself started. Vague memories of C programming and a bit of research for the mode x'40000'::int and the SEEK_END = 2 value were needed for the rest!
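
For large objects over 2 GB, lo_tell can fail with "result out of range" because it returns a 32-bit integer. A possible variant, assuming PostgreSQL 9.3 or later where lo_lseek64 is available, relies on the seek call itself returning the new position. The name get_lo_size64 is made up for this sketch, and it has no more error handling than the original.

```sql
CREATE OR REPLACE FUNCTION get_lo_size64(lo_oid oid) RETURNS bigint
VOLATILE STRICT
LANGUAGE plpgsql
AS $$
DECLARE
    fd integer;
    sz bigint;
BEGIN
    -- Open the large object read-only (INV_READ = 0x40000).
    fd := lo_open(lo_oid, x'40000'::int);
    -- Seek to the end (2 = SEEK_END); the 64-bit seek returns the new
    -- position, which at the end of the object is its size in bytes.
    sz := lo_lseek64(fd, 0, 2);
    -- Close the descriptor in case we are inside a longer transaction.
    PERFORM lo_close(fd);
    RETURN sz;
END;
$$;
```

It is called the same way as the original, e.g. SELECT get_lo_size64(1234567);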

Ringleader answered 16/4, 2012 at 9:4 Comment(3)
To avoid "result out of range" errors and make it work with large objects larger than 2GB, use lo_lseek64 and lo_tell64. - Varicolored
How does the performance compare to the other answers? - Dinge
As lo_lseek64 already returns the current position, is there any need to execute lo_tell? - Lute
22

You could change your application to store the size when you create the large object. Otherwise you can use a query such as:

select sum(length(lo.data)) from pg_largeobject lo
where lo.loid=XXXXXX

You can also use the large object API functions, as suggested in a previous post. They work fine, but are an order of magnitude slower than the select method above.

Kamikamikaze answered 24/4, 2013 at 19:12 Comment(1)
Unfortunately, the pg_largeobject catalog has not been publicly readable since PostgreSQL 9.0: postgresql.org/docs/current/static/catalog-pg-largeobject.html - Varicolored
10
select pg_column_size(lo_get(lo_oid)) from table;

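Gives you the size in bytes (pg_column_size also counts the value's length header, so the result is 4 bytes larger than the data itself; see the comments).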

If you want pretty printing:

select pg_size_pretty(pg_column_size(lo_get(lo_oid))::numeric) from table;
Lankton answered 5/4, 2016 at 10:41 Comment(4)
Somehow it is always 4 bytes more than it actually is. Why? - Basseterre
@Basseterre the first four bytes of a bytea column are the size of the rest of it - Dinge
i.e. you should use octet_length (as per other answers), not pg_column_size - Dinge
This works, but looks like overkill: why should the DB read the complete BLOB (likely multiple hundreds of MB) only to throw it away after counting the length, especially if lo_lseek (to the end) just returns its position directly from the internally stored length? - Lute
6

Try length() or octet_length()

Intersexual answered 16/4, 2012 at 6:6 Comment(0)
6

This is my solution:

select
lo.loid,
pg_size_pretty(sum(octet_length(lo.data)))
from pg_largeobject lo
where lo.loid in (select pg_largeobject.loid from pg_largeobject)
group by lo.loid;
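
Since pg_largeobject is not publicly readable on PostgreSQL 9.0 and later, a non-superuser might try something like the sketch below instead. It assumes the get_lo_size function from the earlier answer has been created, that pg_size_pretty accepts a bigint (PostgreSQL 9.2 or later), and that the calling role is allowed to read every large object listed; otherwise lo_open inside the function will raise an error.

```sql
-- List every large object with its size, largest first.
-- pg_largeobject_metadata is publicly readable, unlike pg_largeobject.
SELECT loid, pg_size_pretty(size_bytes) AS size
FROM (
    SELECT m.oid AS loid,
           get_lo_size(m.oid) AS size_bytes   -- helper from the earlier answer
    FROM pg_largeobject_metadata AS m
) AS s
ORDER BY size_bytes DESC;
```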
Septuplet answered 31/1, 2019 at 16:5 Comment(0)
3

If the type of the blob column is oid

SELECT length(lo_get(blob_column)) FROM table;

lo_get ( loid oid [, offset bigint, length integer ] ) → bytea

Extracts the large object's contents, or a substring thereof.

https://www.postgresql.org/docs/current/lo-funcs.html

length ( bytea ) → integer

Returns the number of bytes in the binary string.

https://www.postgresql.org/docs/current/functions-binarystring.html

Eva answered 23/2, 2021 at 17:57 Comment(0)
