How to get index of an array value in PostgreSQL?

I have a table called pins like this:

id (int) |      pin_codes (jsonb)
--------------------------------
1        |  [4000, 5000, 6000]
2        |  [8500, 8400, 8600]
3        |  [2700, 2300, 2980]

Now, I want the row with pin_code 8600 and with its array index. The output must be like this:

     pin_codes       |  index
------------------------------
[8500, 8400, 8600]   |   2

If I want the row with pin_code 2700, the output :

     pin_codes       |  index
------------------------------
[2700, 2300, 2980]   |   0

What I've tried so far:

SELECT pin_codes FROM pins WHERE pin_codes @> '[8600]'

It only returns the row with the wanted value. I don't know how to get the index of the value in the pin_codes array!

Any help would be greatly appreciated.

P.S:

I'm using PostgreSQL 10

Sigman answered 17/10, 2017 at 18:56 Comment(9)
You really need to normalize your data. SQL is not designed to parse data like that. If you can't parse and normalize the data, then your next best bet is to do substring matches and determine the index in the app layer (where parsing is generally MUCH easier).Conchoidal
Are you stuck with this data model, or can you redesign?Gunnel
@Gunnel Yes I can re-design it. Do you have better idea?Sigman
@HamedKamrava - SQL does not handle arrays well; it's better to store each value individually as its own row. Table called pins (pin_id, misc_info_columns). Table called pins_info (pin_id, pin_seq, pin_value). That's just an example designed for an unknown number of pins; it depends on what you are looking to use it for... the fun part of SQL is that there are usually 20 good solutions to any problem, but finding the optimal one for your use cases is the challenge. If you have a consistent number of values, table pin (pin_id, pin1, pin2, pin3) also works.Gunnel
@HamedKamrava - As you can see, the existing answers are all about unpacking the array into something SQL handles better.Gunnel
@Twelfth: arrays are actually quite powerful in Postgres. However putting arrays into a JSON makes things complicated.Burnham
@a_horse_with_no_name - good to know...that must be newer functionality in Postgres thenGunnel
@Twelfth: has been there for ages (at least since 7.x)Burnham
@a_horse_with_no_name The values are not always int; they may sometimes be strings.Sigman
27

If you were storing the array as a real array rather than as JSON, you could use array_position() to find the (first) index of a given element:

select array_position(array['one', 'two', 'three'], 'two') 

returns 2. Note that array_position() uses 1-based indexing, so subtract 1 to get the 0-based index shown in the question.

With some text mangling you can cast the JSON array into a text array:

select array_position(translate(pin_codes::text,'[]','{}')::text[], '8600')
from the_table;
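Putting the two pieces together, a sketch of a query producing the exact output shape the question asks for (assuming the pins table above; the `- 1` converts the 1-based array_position result to the question's 0-based index):

```sql
select pin_codes,
       array_position(translate(pin_codes::text, '[]', '{}')::text[], '8600') - 1 as index
from pins
where pin_codes @> '[8600]';
```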

This also allows you to use the ANY operator:

select *
from pins
where '8600' = any(translate(pin_codes::text,'[]','{}')::text[])

The contains operator @> expects arrays on both sides. You could use it to search for two pin codes at a time:

select *
from pins
where translate(pin_codes::text,'[]','{}')::text[] @> array['8600','8400']

Or use the overlaps operator && to find rows with any of multiple elements:

select *
from pins
where translate(pin_codes::text,'[]','{}')::text[] && array['8600','2700']

would return

id | pin_codes         
---+-------------------
 2 | [8500, 8400, 8600]
 3 | [2700, 2300, 2980]

If you do that a lot, it would be more efficient to store the pin_codes as text[] rather than JSON - then you can also index that column to make searches more efficient.
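A sketch of such a migration, assuming the table from the question (the new column and index names here are made up for illustration):

```sql
-- hypothetical migration: add a native text[] column, backfill it from the jsonb,
-- and put a GIN index on it so @> and && searches can use the index
alter table pins add column pin_codes_arr text[];
update pins set pin_codes_arr = translate(pin_codes::text, '[]', '{}')::text[];
create index pins_pin_codes_arr_idx on pins using gin (pin_codes_arr);

-- the earlier containment search then becomes index-assisted:
select * from pins where pin_codes_arr @> array['8600'];
```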

Burnham answered 17/10, 2017 at 19:40 Comment(1)
Please note that array_position is only available from Postgres version 9.5.Lorica
12

Use the function jsonb_array_elements_text() with the with ordinality clause.

with my_table(id, pin_codes) as (
values
    (1, '[4000, 5000, 6000]'::jsonb),
    (2, '[8500, 8400, 8600]'),
    (3, '[2700, 2300, 2980]')
)

select id, pin_codes, ordinality - 1 as index
from my_table, jsonb_array_elements_text(pin_codes) with ordinality
where value::int = 8600;

 id |     pin_codes      | index 
----+--------------------+-------
  2 | [8500, 8400, 8600] |     2
(1 row)
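Since the comments note the values may sometimes be strings, it's worth mentioning that jsonb_array_elements_text() already yields text, so you can drop the ::int cast and compare text directly - a sketch against the same sample data:

```sql
-- works whether the JSON elements are numbers or strings
select id, pin_codes, ordinality - 1 as index
from my_table, jsonb_array_elements_text(pin_codes) with ordinality
where value = '8600';
```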
Diondione answered 17/10, 2017 at 19:10 Comment(2)
Normalized tables surely will be more efficient. I'm not an orthodox however and think that it depends on circumstances. If the current solution with jsonb does not cause problems, stay with it. Or replace jsonb with integer[].Diondione
For year-2020 readers: this is the best answer - modern SQL uses with ordinality. It is also easy to rename the columns, e.g. SELECT * FROM UNNEST(x) WITH ORDINALITY t(x_i, i).Eastman
3

As has been pointed out previously, the array_position function is only available in Postgres 9.5 and greater.

Here is a custom function that achieves the same behavior, derived from nathansgreen at github.

-- The array_position function was added in Postgres 9.5.
-- For older versions, you can get the same behavior with this function.

create function array_position(arr ANYARRAY, elem ANYELEMENT, pos INTEGER default 1) returns INTEGER
language sql
as $BODY$
select row_number::INTEGER
from (
    select unnest, row_number() over ()
    from ( select unnest(arr) ) t0
) t1
    where row_number >= greatest(1, pos)
    and (case when elem is null then unnest is null else unnest = elem end)
limit 1;
$BODY$;

So in this specific case, after creating the function the following worked for me.

SELECT 
pin_codes,
array_position(pin_codes, 8600) AS index
FROM pins
WHERE array_position(pin_codes, 8600) IS NOT NULL;

Worth bearing in mind that it will only return the index of the first occurrence of 8600; you can use the pos argument to start the search at whichever position you like.
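A small sketch of the pos argument (the built-in Postgres 9.5+ array_position accepts the same optional third argument, the subscript at which to begin the search):

```sql
-- start the search at position 2, skipping the first occurrence
select array_position(array[8600, 100, 8600], 8600, 2);  -- returns 3
```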

Worcester answered 26/6, 2019 at 14:55 Comment(0)
1

If you insist on storing arrays, I'd defer to klin's answer.

As the alternative answer and an extension to my comment... don't store data in arrays. Normalize your data in advance and SQL will handle it significantly better. klin's answer is good, but may suffer in performance as it's outside of what SQL does best.

I'd break up the array before storing it. If the number of pin codes is known, then simply having the table pin_id, pin1, pin2, pin3, etc. is functional.

If the number of pins is unknown, a first table pin that stores the pin_id and any info columns related to that pin ID, plus a second table as pin_id, pin_seq, pin_value, is also functional (though you may need to pivot this later on to make sense of the data). In this case, select pin_seq where pin_value = 2700 would work.
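A minimal sketch of that two-table layout (the names pin and pin_info are illustrative, not from the original schema; pin_value is text because the comments note the codes are not always integers):

```sql
create table pin (
    pin_id serial primary key
    -- ... any info columns related to the pin
);

create table pin_info (
    pin_id    int references pin(pin_id),
    pin_seq   int,   -- 0-based position within the original array
    pin_value text,  -- text, since the codes may sometimes be strings
    primary key (pin_id, pin_seq)
);

-- finding the index then becomes a plain filter:
select pin_id, pin_seq from pin_info where pin_value = '2700';
```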

Gunnel answered 17/10, 2017 at 19:20 Comment(2)
The data is not stored as an array it's stored as a JSON - big difference in PostgresBurnham
I should probably delete this answer then.Gunnel
0

Try unnesting the string and assigning position numbers, as follows:

with dat as
(
  select 1 id, '8700, 5600, 2300' pins
  union all
  select 2 id, '2300, 1700, 1000' pins
)
select dat.*, t.rn - 1 as index
from
(
  -- with ordinality preserves the element order; a bare row_number() over
  -- (partition by id) with no order by does not guarantee it
  select id, trim(u.pin) pins, u.rn
  from dat, unnest(string_to_array(pins, ',')) with ordinality u(pin, rn)
) t
join dat on dat.id = t.id and t.pins = '2300'
Calefactory answered 17/10, 2017 at 19:8 Comment(0)
-1

In short: normalize your data structure, or don't do this in SQL. If you want the index of a sub-element given your current data structure, then do it in your application code (take the result, cast it to a list/array, and get the index).

Cursory answered 17/10, 2017 at 19:5 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.