I have written a simple query against a table in a Postgres database that contains a column “manifest” of type json. Each cell contains a very long value and I am extracting the numerical value for “size”.
I need to create a new column (perhaps call it “size in MB”), and perform division against size. Specifically, I need to take the value of size and divide it by 1024, two times. I have tried every example I can find to no avail.
Here is my current query, and a small snippet of the result:
select customers.name, images.customer_id, images.captured_at, images.name, manifest -> 'size' as image_size from public.images inner join public.customers on customers.id = images.customer_id where (captured_at > current_date - interval '12 months') order BY captured_at desc
Name customer_id captured_at name image_size Acme Inc 12345 2022-05-31 Central HMA 628032520
The goal is to take image_size, divide by 1024 two times, and store the result in a new column called “size in MB”. So in this case, the result of the math would be 598.
Advertisement
Answer
To extract a scalar value (not another json) use the ->> operator (not ->).
That returns type text. So you also need to cast before doing math on the number:
(manifest ->> 'size')::bigint -- or numeric? see below!
See:
You could do integer division, which truncates, computing 598 like you projected:
SELECT 628032520 / 1024 / 1024 -- 598 -- truncated!
Casting to a floating point number or numeric avoids integer division and shows your sample value to be much closer to 599:
SELECT 628032520.0 / 2^20 -- 598.9384841918945313 -- precise
The numeric constant 123 resolves to type integer, while 123.0 (containing a point .) resolves to numeric.
And 1024 * 1024 = 1048576 = 2^20.
Maybe round? Or keep two fractional digits?
SELECT round(628032520.0 / 2^20) -- 599 -- rounded
, round(628032520.0 / 2^20, 2) -- 599.94
But consider the built-in function pg_size_pretty() instead which is made for this very purpose:
SELECT pg_size_pretty(628032520.0) -- 599 MB -- formatted text
, pg_size_pretty(232520.0) -- 227 kB
So:
SELECT c.name AS customer, i.customer_id, i.captured_at, i.name AS image
, pg_size_pretty((c.manifest ->> 'size')::numeric) AS image_size
FROM public.images i
JOIN public.customers c ON c.id = i.customer_id
WHERE captured_at > date_trunc('day', now()) - interval '12 months'
ORDER BY captured_at DESC;
Should give you:
customer customer_id captured_at image image_size Acme Inc 12345 2022-05-31 Central HMA 599 MB
I also fixed a couple other issues with your query.
Assuming captured_at is type timestamptz (like it probably should be), current_date would introduce a dependency on time zone setting of the session, which is a sneaky, unnecessary source of corner-case errors. See:
And “name” is not a good name.