Cassandra batch statement - Execution order
I have a Cassandra batch statement that contains a delete and an insert for the same partition key, where the delete is the first statement and the insert is the second. How does the batch execute these statements? Is it in the same order in which we added them?

Elemental answered 19/5, 2015 at 6:4 Comment(0)

No, it does not execute them in the order specified. To force a particular execution order, you can add the USING TIMESTAMP clause. Check the docs for more information: http://docs.datastax.com/en/cql/3.1/cql/cql_reference/batch_r.html

How can a timestamp maintain the order of execution? For example, with the delete and insert above for the same partition key, the final result should be the inserted record. Is that possible by adding a timestamp?

Yes. I'll combine examples from the link above and the DELETE documentation to demonstrate, and start by creating a simple table called purchases with two fields:

CREATE TABLE purchases (user text PRIMARY KEY, balance bigint);

Next, I'll execute a batch with an INSERT and a DELETE. I'll do the DELETE last, but with an earlier timestamp than the INSERT:

BEGIN BATCH
  INSERT INTO purchases (user, balance) VALUES ('user1', -8) USING TIMESTAMP 1432043350384;
  DELETE FROM purchases USING TIMESTAMP 1432043345243 WHERE user='user1';
APPLY BATCH;

When I query for that user:

aploetz@cqlsh:stackoverflow2> SELECT user, balance, writetime(balance) FROM purchases WHERE user='user1';

 user  | balance | writetime(balance)
-------+---------+--------------------
 user1 |      -8 |      1432043350384

(1 rows)

As you can see, the INSERT persisted because it had the later timestamp. By contrast, if I had simply run the INSERT and DELETE (in that order) from the cqlsh prompt, the query would have returned nothing.

Billi answered 19/5, 2015 at 6:59 Comment(4)
In your BATCH example, say you omitted the timestamp, will Cassandra guarantee that the row for user1 is deleted? Which one wins if they have the same timestamp (in this case calculated by the server)? Yseult
Here we go. A tombstone will always take precedence over regular columns. Also described here. Yseult
@SotiriosDelimanolis I like the "beyond silly" comment in the first link :) Capitulary
These are the rules to resolve conflicts in a batch statement: 1. If the timestamps are different, pick the column with the largest timestamp (the value being a regular column or a tombstone). 2. If the timestamps are the same and one of the columns is a tombstone ('null'), pick the tombstone. 3. If the timestamps are the same and none of the columns are tombstones, pick the column with the largest value. Ref: issues.apache.org/jira/browse/… Jovanjove
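The three rules in the last comment can be sketched as a small Python model. This is only an illustration of the rules as stated there, not Cassandra's actual code: the `Cell` class and `reconcile` function are hypothetical names, a tombstone is represented as `None`, and comparing values as raw bytes is an assumption. It also ignores TTLs and the difference between row-level and cell-level deletes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one written column value."""
    value: Optional[bytes]  # None models a tombstone (the result of a DELETE)
    timestamp: int          # write timestamp, as set by USING TIMESTAMP


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the surviving cell according to the three rules quoted above."""
    # Rule 1: different timestamps, so the larger timestamp wins.
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Rule 2: same timestamp and one side is a tombstone, so the tombstone wins.
    if a.value is None:
        return a
    if b.value is None:
        return b
    # Rule 3: same timestamp, no tombstones, so the larger value wins
    # (byte-wise comparison is an assumption of this sketch).
    return a if a.value >= b.value else b


# The batch from the answer: INSERT with the later timestamp, DELETE with the earlier one.
insert = Cell(b"-8", 1432043350384)
delete = Cell(None, 1432043345243)
survivor = reconcile(insert, delete)

# The case asked about in the comments: both statements share one timestamp.
tie = reconcile(Cell(b"-8", 100), Cell(None, 100))
```

Under this model `survivor` is the inserted cell, matching the query result in the answer, while `tie` is the tombstone, which is why a delete and an insert sharing one timestamp leave the row deleted regardless of statement order.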
