I have a CQL table (CQL 3, Cassandra 2.0.*) that looks something like this:
CREATE TABLE IF NOT EXISTS user_things (
    user_id bigint,
    thing_id bigint,
    created_at timeuuid,
    PRIMARY KEY (user_id, thing_id)
);
I want to do an insert like
INSERT INTO user_things (user_id, thing_id, created_at) VALUES (?, ?, now())
but only if the row doesn't exist.
I could do this in two synchronous statements (first a SELECT, followed by an INSERT if the SELECT didn't return a row) or I could use INSERT ... IF NOT EXISTS.
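To make the comparison concrete, here is a rough sketch of the two options as I understand them. The literal values 42 and 7 are placeholders I made up for illustration; in the real application they would be bound parameters.

```cql
-- Option 1: read, then write. Two round trips, and not atomic:
-- another client can insert the same row between the two statements.
SELECT thing_id FROM user_things WHERE user_id = 42 AND thing_id = 7;
-- ...and only if the SELECT returned no row:
INSERT INTO user_things (user_id, thing_id, created_at) VALUES (42, 7, now());

-- Option 2: a single conditional insert. The result set contains an
-- [applied] column, which is false if the row already existed.
INSERT INTO user_things (user_id, thing_id, created_at) VALUES (42, 7, now()) IF NOT EXISTS;
```

Note that in option 1 the plain INSERT is an upsert, so if another writer gets in first, created_at is silently overwritten rather than the insert being rejected.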
The CQL docs state "But please note that using IF NOT EXISTS will incur a non negligible performance cost (internally, Paxos will be used) so this should be used sparingly."
Has anybody benchmarked which approach performs better when lots of these operations are happening (say, hundreds per second)?
PRIMARY KEY ((user_id, thing_id)) vs PRIMARY KEY (user_id, thing_id) (with clustering columns)? – Franny