How does a single-partition batch in Cassandra work for multiple column updates?

Asked 24/8, 2016 at 10:42 Answered 24/8, 2016 at 12:55

We have multiple update queries against a single partition of a single column family, like the ones below:

update t1 set username = 'abc', url = 'www.something.com', age = ? where userid = 100;
update t1 set username = 'abc', url = 'www.something.com', weight = ? where userid = 100;
update t1 set username = 'abc', url = 'www.something.com', height = ? where userid = 100;

username and url are always the same and are mandatory fields. Depending on the information given, there will be extra columns.

Since this is a single-partition operation and we need atomicity and isolation, we will execute these in a batch.

As per the documentation:

A BATCH statement combines multiple data modification language (DML) statements (INSERT, UPDATE, DELETE) into a single logical operation, and sets a client-supplied timestamp for all columns written by the statements in the batch.

Now, since we are updating the columns (username, url) with the same value in multiple statements, will C* combine them into a single statement before executing, like

update t1 set username = 'abc', url = 'www.something.com', age = ?, weight = ?, height = ? where userid = 100;

or will the same value be upserted repeatedly?

Another question: since they all have the same timestamp, how does C* resolve that conflict? Will C* compare every column (username, url) value?

From "Atomic Batch in Cassandra": As they all have the same timestamp, C* resolves the conflict by choosing the largest value for the cells.

Or should we add the queries to the batch as below? In this case we have to check whether username and url have already been added to a statement.

update t1 set username = 'abc', url = 'www.something.com', age = ? where userid = 100;
update t1 set weight = ? where userid = 100;
update t1 set height = ? where userid = 100;

In short, what is the best way to do this?

Font answered 24/8, 2016 at 10:42 Comment(0)

For your first question (will C* combine them into a single statement?), the answer is yes.

A single partition batch is applied as a single row mutation.

See this link for details: https://issues.apache.org/jira/browse/CASSANDRA-6737
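To make the "single row mutation" idea concrete, here is a minimal Python sketch that folds the three updates from the question into one set of column writes per partition key. It is only a model of the behaviour described above, not Cassandra's actual code: `merge_batch` is a hypothetical helper, and the values for age, weight and height are made up because the question uses `?` placeholders.

```python
from collections import defaultdict


def merge_batch(statements):
    """Fold (partition_key, {column: value}) updates into one dict per key.

    Hypothetical model of a merged mutation. Repeated columns simply
    overwrite each other here, which is harmless in this example because
    the repeated values (username, url) are identical.
    """
    merged = defaultdict(dict)
    for partition_key, columns in statements:
        merged[partition_key].update(columns)
    return dict(merged)


# The three updates from the question, with made-up bind values.
batch = [
    (100, {"username": "abc", "url": "www.something.com", "age": 30}),
    (100, {"username": "abc", "url": "www.something.com", "weight": 70}),
    (100, {"username": "abc", "url": "www.something.com", "height": 180}),
]

merged = merge_batch(batch)
print(merged)
```

In this model the result has a single entry for userid 100 carrying username, url, age, weight and height, which matches the combined update shown in the question.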

For your second question (will C* compare every column (username, url) value?), the answer is also yes.

As stated in the answer you linked: "Conflict is resolved by choosing the largest value for the cells".

So you can write the queries in the batch either way (both forms from your question), since they will ultimately be converted into a single write internally. Because the repeated username and url values are identical, the tie-break does not change the result.
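The tie-break rule quoted above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a cell is modelled as a `(timestamp, value_bytes)` tuple, values are compared as raw bytes, and tombstones and TTLs are ignored. `reconcile` is a hypothetical name, not a Cassandra API.

```python
def reconcile(cell_a, cell_b):
    """Pick the winning cell between two writes to the same column.

    Each cell is a (timestamp, value_bytes) tuple (an assumption made
    for this sketch). The higher timestamp wins; on a timestamp tie,
    the larger value wins.
    """
    ts_a, value_a = cell_a
    ts_b, value_b = cell_b
    if ts_a != ts_b:
        return cell_a if ts_a > ts_b else cell_b
    return cell_a if value_a >= value_b else cell_b


# Same timestamp, different values: the larger value wins.
tie_winner = reconcile((1, b"abc"), (1, b"abd"))

# Different timestamps: the newer write wins regardless of value.
newer_winner = reconcile((2, b"a"), (1, b"z"))

# Same timestamp, same value (the case in the question): nothing changes.
same_winner = reconcile((1, b"abc"), (1, b"abc"))
```

For the batch in the question, every repeated username and url write falls into the last case, so whichever copy is kept, the stored value is the same.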
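If you still prefer the second form from the question, where username and url are set only once, the statement text can be generated on the client. The following is a rough sketch: `build_updates` is a hypothetical helper that only builds CQL strings with `?` placeholders and does not talk to any driver.

```python
def build_updates(table, key_column, shared, optional):
    """Build UPDATE statements that set the shared columns only once.

    Hypothetical helper: the first statement carries the shared columns
    plus the first optional column, and each later statement carries a
    single optional column.
    """
    if not optional:
        groups = [list(shared)]
    else:
        groups = [list(shared) + [optional[0]]]
        groups += [[column] for column in optional[1:]]

    statements = []
    for columns in groups:
        assignments = ", ".join(f"{column} = ?" for column in columns)
        statements.append(
            f"UPDATE {table} SET {assignments} WHERE {key_column} = ?"
        )
    return statements


stmts = build_updates(
    "t1", "userid", ["username", "url"], ["age", "weight", "height"]
)
for stmt in stmts:
    print(stmt)
```

Since both forms end up as one write, this is a matter of taste rather than correctness.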

Clarion answered 24/8, 2016 at 12:55 Comment(0)

You are using a single-partition batch, so everything goes into a single partition. All of your updates will be merged and applied with a single RowMutation.

So your updates will be applied atomically and in isolation, with no batch log.

Camp answered 24/8, 2016 at 11:48 Comment(2)

So that means C* merges those queries into one and executes it as a single row mutation? – Font 24/8, 2016 at 11:51

Yes. All of your updates will be merged and applied with a single RowMutation. (Edited the answer) – Camp 24/8, 2016 at 11:57