Atomic counter - redis vs postgres or other? [closed]
I need an atomic counter implementation in the cloud that produces a serial integer for concurrent connections. The business case behind it is a tracking server.

Requirements by priority:

  1. (MUST) Durable - once a client gets a number, no other client can ever get that same number. No duplicates.
  2. (MUST) Scalable - current load is 10K/second, with 1M/second expected in the future, from 200-1000 concurrent client connections. A scalability feature such as incrementing by 100 would help.
  3. (MUST) < ±15ms average latency (postgres/mysql/redis are great; HTTP-level latency like DynamoDB's is out of the question). This is just to filter out slow solutions.
  4. (nice to have) Increment by a chunk - a scalability feature where the client increments the counter by a chunk (e.g. 100) and manages the allocation in the application's memory.
  5. (nice to have) A fair price - < $150/month for 5K/second, with leaner pricing growth beyond that.
  6. (nice to have) HA (High Availability) - I can handle 0.01% failures, but durability is important: I must never hand out duplicate numbers.

My alternatives are:

  1. A postgres sequence: CREATE SEQUENCE serial CACHE 100; SELECT nextval('serial'); - $140/month for a Multi-AZ AWS RDS db.m3.medium. Not as fast as redis, but I think it averages < 7ms. The CACHE clause is a strong feature that should boost performance (both calls are sketched right after this list).
  2. Redis INCR with Redis Sentinel / ElastiCache Multi-AZ - cache.m3.medium Multi-AZ - $120/month. Durability is in question.
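For illustration, a minimal sketch of what each call looks like from the application, assuming psycopg2 and redis-py as clients; the endpoints, credentials, and key names are placeholders, not part of the question:

    # Alternative 1: Postgres sequence - one round trip per value,
    # even with CACHE (the cache lives in the server backend).
    import psycopg2
    import redis

    pg = psycopg2.connect(host="my-rds-endpoint", dbname="tracking",
                          user="app", password="secret")
    pg.autocommit = True
    with pg.cursor() as cur:
        cur.execute("CREATE SEQUENCE IF NOT EXISTS serial CACHE 100;")
        cur.execute("SELECT nextval('serial');")
        pg_id = cur.fetchone()[0]

    # Alternative 2: Redis INCR - also one round trip per value.
    r = redis.Redis(host="my-elasticache-endpoint", port=6379)
    redis_id = r.incr("tracking:counter")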

redis has INCRBY, while postgres only has the CACHE feature of sequences, which still requires a round trip to the DB for every value the client fetches.
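To make that difference concrete, here is a hedged sketch of the INCRBY chunk approach from requirement 4, again assuming redis-py; the key name and block size of 100 are just the values mentioned in the question:

    import redis

    BLOCK_SIZE = 100  # the chunk size suggested in requirement 4

    r = redis.Redis(host="my-elasticache-endpoint", port=6379)

    # One round trip reserves a whole block; even if this client crashes
    # and wastes the rest of its block, no other client can ever get
    # those numbers, so there may be gaps but never duplicates.
    upper = r.incrby("tracking:counter", BLOCK_SIZE)      # e.g. returns 100
    reserved = range(upper - BLOCK_SIZE + 1, upper + 1)   # ids 1..100

    for tracking_id in reserved:
        pass  # hand out ids from application memory, no further round trips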

Any input regarding those two alternatives, or others?

Bcd answered 23/10, 2016 at 1:42 Comment(7)
Does this need to be in a database? Can't you build your own "counter service" using synchronization primitives present in most programming platforms? - Sarad
It needs to be durable and persisted like a DB, and to have TCP-socket-like latency, so it is hard to get without a DB. In-memory won't work, and high performance, HA, zero-downtime deployment, and managing high concurrency on file writes without a DB are all hard; you end up building many DB features yourself, and I would rather buy them :) - Bcd
Maybe you should use GUIDs instead? - Sarad
I don't think a single instance of Redis or Postgres can process 1M/s requests. You need some distributed method to generate unique ids. Take a look at Twitter's Snowflake. It might help. - Luckily
You didn't specify that you have to store any data. If you just have to maintain the counter, you should really not do that counting in a db. As Dai already said, an (actually not very complicated) service is really the best way to go (apart from implementing it at the protocol stack). Databases are not the only things that understand locking/transactions; they rely on the underlying system to lock, and your service can use that too. On the pricing: good that it is just a nice-to-have. For 1M/s you can easily add one or two zeros. (But cost pressure might lead to other concepts, e.g. clients getting a range of 100(0).) - Experimentalize
Good points, everyone. Dai - GUIDs are something I have considered, but they don't match my use case. Solarflare - indeed, the < $150 is under the assumption of a 1000-id range. I have another scenario with a db.m3.medium postgres RDS that holds 1K/s, and if I implement the 1000-id range it seems I can easily scale x1000 :) - Bcd
for_slack - thank you for sharing Snowflake; it surely is a good alternative for future growth in traffic. - Bcd

I think you are overestimating the risk of a redis failure that leaves it unable to flush to disk and underestimating the risk of any RDBMS doing the same. The risk can be mitigated in both by syncing writes to disk.

In redis this means switching to AOF (Append Only File) mode, as described in the persistence documentation you already linked to.
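For reference, this is roughly how AOF can be switched on with redis-py; the host is a placeholder, and on a managed service such as ElastiCache this is typically configured through a parameter group rather than CONFIG SET:

    import redis

    r = redis.Redis(host="my-redis-host", port=6379)

    # Equivalent to "appendonly yes" in redis.conf.
    r.config_set("appendonly", "yes")

    # Durability/latency trade-off: "always" fsyncs every write,
    # "everysec" risks losing at most ~1 second of increments on a crash.
    r.config_set("appendfsync", "everysec")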

There is no need for any expiring-key trickery. The atomic behavior of incr and incrby is more than sufficient to ensure uniqueness and durability, especially when combined with AOF persistence.

Redis is just about perfect for this use case. It is fast enough and scalable, and it has been around a while now. There are no legitimate durability concerns that wouldn't also be concerns for PostgreSQL or MySQL.

As noted by @Solarflare, having applications grab blocks of IDs at a time is even more cost-effective and scalable. This can be accomplished in redis using incrby.
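A rough sketch of what that could look like on the application side, again assuming redis-py; the class, key name, and block size are illustrative, not something prescribed by this answer:

    import threading
    import redis

    class BlockIdAllocator:
        """Reserves ids from Redis in blocks and hands them out locally."""

        def __init__(self, client, key="tracking:counter", block_size=100):
            self.client = client
            self.key = key
            self.block_size = block_size
            self.next_id = 0
            self.upper = -1                 # nothing reserved yet
            self.lock = threading.Lock()

        def next(self):
            with self.lock:
                if self.next_id > self.upper:
                    # One INCRBY round trip reserves the next block of ids.
                    self.upper = self.client.incrby(self.key, self.block_size)
                    self.next_id = self.upper - self.block_size + 1
                value = self.next_id
                self.next_id += 1
                return value

    allocator = BlockIdAllocator(redis.Redis(host="my-elasticache-endpoint"))
    print(allocator.next())   # roughly one Redis round trip per 100 ids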

Fractious answered 25/10, 2016 at 6:55 Comment(4)
Carl, this is just the input I personally was looking for. I am going to accept this answer, and also mention for_slack and Solarflare for their contributions. - Bcd
Redis also seems to be a good candidate due to its scalability features, for this scenario and others. - Bcd
I also want to thank for_slack for mentioning distributed solutions for unique ids, like Twitter's Snowflake, and Solarflare for noting that my 100(0) range suggestion is actually a cost-effective alternative to the distributed solutions when scaling up. - Bcd
Made a small update to clarify how to do ranges in redis. - Fractious
