How to create multiple sequences in one table?
Asked Answered
C

5

8

I have a table "receipts". I have columns customer_id (who had the receipt) and receipt_number. The receipt_number should start on 1 for each customer and be a sequence. This means that customer_id and receipt_number will be unique. How can I elegantly do this. Can I use the built-in sequeance functionality with CREATE SEQUENCE or similar? It seems like I would have to create a sequence for each customer, which of course is not an elegant solution.

EDIT: There must be a thread-safe and idiot-secure way to do this. It should be quite a simple/common need.

Clichy answered 5/10, 2012 at 12:17 Comment(4)
Do you really need to store that? Those unique numbers can easily be generated while retrieving the data.Wreckfish
@a_horse_with_no_name Yes, this needs to be auditable.Clichy
I'm not sure what you mean by "thread-safe". You need to use transactions to make a sequence of actions "atomic". Atomic transactions that are complete actions ought to be thread-safe unless I'm misunderstanding you.Faze
@WalterMitty You are right. That is a way to solve this. What I do not like about wrapping things in a transaction is that novice programmers sometimes do not understand the importance of transactions, so I consider them riskier than some built-in way.Clichy
D
4

SEQUENCE does not guarantee there are no gaps. For example, one transaction might generate a new number and then abort (due to a bug or a power failure or whatever...). The next transaction would then blindly get the next number, not the one that was "lost".

It would be best if your client application did not depend on "no gaps" assumption in the firs place. You could, however, minimize gaps like this:

  1. SELECT MAX(receipt_number) FROM receipts WHERE customer_id = :ci
  2. INSERT INTO receipts(customer_id, receipt_number) VALUES (:ci, aboveresult+1), or just insert 1 if step 1 returned NULL.
  3. If step 2 returned a PK violation*, retry from the beginning.

* Because a concurrent transaction has gone through the same process and committed.

As long as rows are just added and not deleted, this should prevent any gaps, even in a concurrent environment.


BTW, you can "condense" steps 1 and 2 like this:

INSERT INTO receipts (customer_id, receipt_number)
SELECT :ci, COALESCE(MAX(receipt_number), 0) + 1
FROM receipts
WHERE customer_id = :ci;

[SQL Fiddle]

The index underneath the PK {customer_id, receipt_number} should ensure that the SELECT part of this query is satisfied efficiently.

Deplorable answered 5/10, 2012 at 16:31 Comment(0)
L
3

You could use a trigger like this to update your column:

Table definition with unique constraint on customer_id, receipt_number:

CREATE TABLE receipts (id serial primary key, customer_id bigint, receipt_number bigint default 1);
CREATE UNIQUE INDEX receipts_idx ON receipts(customer_id, receipt_number);

Function to check for max receipt_number for the client, or 1 if no previous receipts

CREATE OR REPLACE FUNCTION get_receipt_number()  RETURNS TRIGGER AS $receipts$
  BEGIN
    -- This lock will block other transactions from doing anything to table until
    -- committed. This may not offer the best performance, but is threadsafe.
    LOCK TABLE receipts IN ACCESS EXCLUSIVE MODE;
    NEW.receipt_number = (SELECT CASE WHEN max(receipt_number) IS NULL THEN 1 ELSE max(receipt_number) + 1 END FROM receipts WHERE customer_id = new.customer_id);
    RETURN NEW;
  END;
$receipts$ LANGUAGE 'plpgsql';

Trigger to fire the function on each row insert:

CREATE TRIGGER rcpt_trigger 
   BEFORE INSERT ON receipts 
   FOR EACH ROW 
   EXECUTE PROCEDURE get_receipt_number();

Then, executing the following:

db=> insert into receipts (customer_id) VALUES (1);
INSERT 0 1
db=> insert into receipts (customer_id) VALUES (1);
INSERT 0 1
db=> insert into receipts (customer_id) VALUES (2);
INSERT 0 1
db=> insert into receipts (customer_id) VALUES (2);
INSERT 0 1
db=> insert into receipts (customer_id) VALUES (2);

should yield:

  id | customer_id | receipt_number 
 ----+-------------+----------------  
  14 |           1 |              1  
  15 |           1 |              2  
  16 |           2 |              1 
  17 |           2 |              2  
  18 |           2 |              3
Lifeanddeath answered 5/10, 2012 at 13:7 Comment(5)
This will produce incorrect results if multiple transactions are inserting into the receipts table.Wreckfish
That is true, but I think you should be able to make the operation safe by locking the table at the beginning of the function.Lifeanddeath
As a_horse_with_no_name mentioned, I am also worried that this will take care in order to be thread-safe. There must be a better way?Clichy
@David, I made a slight change to the code for you which will lock the table until the insert transaction has completed. You can test this by beginning a transaction in two different sessions and trying to insert a value in both (one will block until the other is committed). If under normal conditions, your transactions happen quickly - this should not be an issue, but if things are long running you will see blocking. You may also want to just consider having a separate table which you can update/select from that contains the company/receipt pairs and and use that to store your quasi-sequence.Lifeanddeath
@Lifeanddeath I tried your solution. It works as desired for single connection. In case of multiple connections, deadlocks happen.Galligaskins
C
1

enter image description here

-- next CustomerReceiptNo
select coalesce(max(CustomerReceiptNo), 0) + 1
from  Receipt
where CustomerId = specific_customer_id;

This is not thread-safe, so make sure to implement error handling if two separate threads try to create a new receipt for a given customer at the same time.


EDIT

There is more to thread-safety than just avoiding race-conditions. Suppose there are two separate threads creating a new receipt for the same customer at the same time. Should it happen? Is this normal, a bug, or security breach? Suppose a bank where two tellers are creating a new record for the same customer at the same time -- something is very wrong. If this is supposed to happen, you can use locks; if not, then some kind of error is in order.

Convexoconcave answered 5/10, 2012 at 12:50 Comment(0)
F
1

Why do receipt numbers begin with 1 for each customer? Is that part of the defined requirements?

The simplest way to get this done is to have the program that generates new receipts query the database for max(ReceiptNumber) where CustomerId = CurrentCustomerId and then add 1.

currentCustomerId is a program variable not a database value.

This is a little inelegant in that involves an extra search of the table. You will need to create your indexes carefully, in order to get one of the indexes to answer the question without a full table scan.

An alternative that's a little quicker at insert time is to create an extra column, called MaxReeceiptNumber, in the customer table. Increment that whennever you want to insert a new receipt.

Faze answered 5/10, 2012 at 12:51 Comment(1)
I am worried that this will take care in order to be thread-safeClichy
J
1

I would like to propose my solution to this problem - use +1 column on the customer table to store latest_receipt_id, and use incremental function next_receipt_id( customer_id ):

ALTER TABLE customers ADD COLUMN latest_receipt_id integer DEFAULT 1;

-- ensure customer_id, receipt_number pair uniqueness
CREATE UNIQUE INDEX customer_receipt_ids_pair_uniq_index ON receipts USING btree (customer_id, receipt_number);

-- sequence-like function for the next receipt id, 
-- will increment it on every execution
CREATE FUNCTION next_receipt_id( for_customer_id integer ) RETURNS integer
LANGUAGE plpgsql AS 
$$
DECLARE 
  result integer;
BEGIN  
  UPDATE customers SET latest_receipt_id = latest_receipt_id + 1 WHERE id = for_customer_id RETURNING latest_receipt_id INTO result;
  RETURN result;
END;
$$;

Then you can either use it in the receipt INSERT trigger:

-- somewhere inside trigger function, triggered on receipt INSERT 
NEW.receipt_number := next_receipt_id( NEW.customer_id );

OR inside your ORM (pseudocode):

# it does not matter when you assign the receipt_number, 
# it could be even in standalone update execution, just do it only once! 
receipt.update( 'receipt_number = next_receipt_id(customer_id)' )

Disregarding of any concurrency on inserts you will always have sequential ids.

Cheers!

Jody answered 25/6, 2020 at 16:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.