How should I implement an atomic sequence in Perl?

Asked 9/6, 2011 at 19:22 Answered 9/6, 2011 at 21:12

Solved perl parallel-processing atomic sequences

I have the following requirements:

1. The sequence is unique to a host (no shared incrementing is necessary).
2. The sequence must be monotonically increasing.
3. The sequence must be persistent across processes.
4. Incrementing the sequence must be atomic in the case of multiple processes working on it at the same time.
5. Most of the time the file will be updated and the new value read after update. But it should also be possible to read the current value without updating it.

I can hack together perl code that will do roughly this, but I'd like a more elegant solution.

Apomict answered 9/6, 2011 at 19:22 Comment(2)

I have hacked together perl code that roughly does this. Does storing the current sequence number in a file and accessing/updating it inside a flock wrapper count as elegant? – Multilingual 9/6, 2011 at 19:54

If I don't have to write the code myself, yes it does :) – Apomict 9/6, 2011 at 20:27

Store the sequence number in a file and use flock to make sure only one process can access it:

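use strict;
use warnings;
use Fcntl qw(:flock);   # symbolic LOCK_EX instead of the magic number 2

sub set {     # seed the sequence number file
    my ($file, $number) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print $fh $number;
    close $fh or die "Cannot close $file: $!";
}

sub get {     # returns the value as it was before the increment
    my $file = shift;
    my $incr = @_ ? shift : 1;   # get($f) is like get($f,1)
    # the separate lock file serializes the whole read-modify-write cycle
    open my $lock, '>>', "$file.lock" or die "Cannot open $file.lock: $!";
    flock($lock, LOCK_EX) or die "Cannot lock $file.lock: $!";
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my $seq = <$fh>;
    close $fh;
    chomp $seq;
    set($file, $seq+$incr) if $incr;   # update sequence number
    close $lock;                 # closing the handle releases the lock
    return $seq;
}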

You can call this as get($file,0) to retrieve the sequence number without changing it.

Multilingual answered 9/6, 2011 at 21:12 Comment(4)

This roughly matches what I've done. This will do nicely. Thanks! – Apomict 9/6, 2011 at 23:9

Why a separate lock file? Why not just open the file read/write ('+<') and flock it? – Amabel 9/6, 2011 at 23:35

cjm, see perl.plover.com/yak/hw-nylug/samples/slide020.html and the following slide. – Headspring 10/6, 2011 at 8:55

@daxim, I'm not sure that really applies here. The normal case is to read, increment, & write, all while the file remains locked. truncate isn't an issue, because the sequence is increasing; the new number will always be at least as long as the previous one. I'd expect a seek to be more efficient than a close & reopen (plus the extra open for the lockfile). – Amabel 10/6, 2011 at 18:56
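
For comparison, here is a minimal sketch of the single-file variant discussed in these comments: open the sequence file read/write with '+<', flock that same handle, then seek back and rewrite in place. The name next_seq, treating an empty file as 0, and the temp file in the usage lines are assumptions made for illustration, not part of the accepted answer. The truncate call is included only as a precaution; as noted above, it is not strictly needed when the number never gets shorter.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);
use File::Temp qw(tempfile);

# Hypothetical helper: returns the value before the increment,
# like get() in the answer; an increment of 0 reads without updating.
sub next_seq {
    my ($file, $incr) = @_;
    $incr = 1 unless defined $incr;

    # '+<' opens for read/write without truncating; the file must already exist.
    open my $fh, '+<', $file or die "Cannot open $file: $!";
    flock($fh, LOCK_EX)      or die "Cannot lock $file: $!";

    my $seq = <$fh>;
    $seq = 0 unless defined $seq;   # assumption: an empty file counts as 0
    chomp $seq;

    if ($incr) {
        seek($fh, 0, SEEK_SET) or die "Cannot seek in $file: $!";
        truncate($fh, 0)       or die "Cannot truncate $file: $!";
        print {$fh} $seq + $incr;
    }

    # close flushes the buffered write and then releases the lock
    close $fh or die "Cannot close $file: $!";
    return $seq;
}

# Usage: seed a temporary file with 41, then increment and peek.
my ($tmp, $file) = tempfile();
print {$tmp} 41;
close $tmp;
print next_seq($file), "\n";      # prints 41, file now holds 42
print next_seq($file, 0), "\n";   # prints 42, file unchanged
```

Unlike the lock-file version, this needs the sequence file to exist before the first call, so it still has to be seeded once.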

System time provides a monotonically increasing sequence, which addresses (2):

perl -MTime::HiRes=time -lwe "print time"

Until someone resets the clock ...

Persistence (3) and atomicity of incrementations (4) seem to require a locking database. Berkeley DB comes to mind. But you might be looking for something simpler, unless you're already using it anyway. Read without update (5) would be no problem. A monotonically increasing sequence (2) wouldn't be either.

I'm not sure what you mean by "unique to a host" and "shared increment" (1). If it is okay for sequence elements from different hosts to have the same value, then you can replicate the approach on every server. Otherwise you can have only one sequence, which the other hosts must access over the network.

Sherrie answered 9/6, 2011 at 19:51 Comment(2)

Good try, but if two processes hit the same library code within the same second, they could easily have the same value. Plus, it's volatile so the value you read has little to do with what it is the next time. (#5) – Huddersfield 9/6, 2011 at 20:0

@Huddersfield - System time doesn't work at all, that's clear. As I wrote, someone could reset the clock. Even if you could exclude such tinkering with the clock, which would make your sequence fail as a side effect, you would still have to back the time by something like Berkeley DB, which provides #3, #4 and #5. And possibly #1, but that needs clarification. Once you're there, you might also choose to pick something else as a sequence, probably just an integer. – Sherrie 9/6, 2011 at 20:14
