We have multiple update queries against a single partition of a single column family, like the ones below:
update t1 set username = 'abc', url = 'www.something.com', age = ? where userid = 100;
update t1 set username = 'abc', url = 'www.something.com', weight = ? where userid = 100;
update t1 set username = 'abc', url = 'www.something.com', height = ? where userid = 100;
username and url will always be the same and are mandatory fields. Depending on the information given, there will be extra columns.
As this is a single-partition operation and we need atomicity and isolation, we will execute these statements in a batch.
As per the documentation:
A BATCH statement combines multiple data modification language (DML) statements (INSERT, UPDATE, DELETE) into a single logical operation, and sets a client-supplied timestamp for all columns written by the statements in the batch.
Now, as we are updating the columns (username, url) with the same value in multiple statements, will C* combine them into a single statement before executing, like
update t1 set username = 'abc', url = 'www.something.com', age = ?, weight = ?, height = ? where userid = 100;
or will the same value be upserted by each statement?
Another question: as they all have the same timestamp, how does C* resolve the conflict? Will C* compare the value of every column (username, url)?
As they all have the same timestamp, C* resolves the conflict by choosing the largest value for the cells (source: Atomic Batch in Cassandra).
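To make my understanding of that rule concrete, here is a minimal Python sketch of it. This is only a simplified model, not Cassandra's actual code: the names Cell, reconcile and apply_writes are hypothetical, values are compared as raw bytes, and tombstones and TTLs are ignored. It assumes all three statements in the batch carry the same timestamp.

```python
from typing import Dict, Iterable, Tuple

# Simplified model of a cell: (write timestamp, value bytes).
# Hypothetical type for illustration only; not a Cassandra API.
Cell = Tuple[int, bytes]


def reconcile(a: Cell, b: Cell) -> Cell:
    """Return the winning cell: highest timestamp, and on a tie the larger value."""
    # Tuple comparison checks the timestamp first, then the value bytes.
    return max(a, b)


def apply_writes(writes: Iterable[Tuple[str, Cell]]) -> Dict[str, Cell]:
    """Merge a sequence of (column, cell) writes into one row, cell by cell."""
    row: Dict[str, Cell] = {}
    for column, cell in writes:
        row[column] = reconcile(row[column], cell) if column in row else cell
    return row


ts = 1000  # assumed: one shared timestamp for the whole batch
writes = [
    # statement 1
    ("username", (ts, b"abc")), ("url", (ts, b"www.something.com")), ("age", (ts, b"30")),
    # statement 2
    ("username", (ts, b"abc")), ("url", (ts, b"www.something.com")), ("weight", (ts, b"70")),
    # statement 3
    ("username", (ts, b"abc")), ("url", (ts, b"www.something.com")), ("height", (ts, b"180")),
]
row = apply_writes(writes)
```

Under this model, writing the same username and url three times with the same timestamp leaves a single cell per column with an unchanged value, so the tie-break only matters when the values differ.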
Or should we add the queries to the batch as below? In this case we have to check whether username and url have already been added to a statement.
update t1 set username = 'abc', url = 'www.something.com', age = ? where userid = 100;
update t1 set weight = ? where userid = 100;
update t1 set height = ? where userid = 100;
In short, what is the best way to do this?
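For reference, the combined single-statement form could also be built on the client side. The following is a rough Python sketch under my own assumptions: build_update and OPTIONAL are hypothetical names, the optional columns are limited to age, weight and height, and the result is a query string with positional ? markers plus a parameter list, not tied to any particular driver.

```python
from typing import Any, List, Tuple

# Assumed whitelist of optional columns; only these may appear in the SET clause.
OPTIONAL = ("age", "weight", "height")


def build_update(userid: int, username: str, url: str, **extra: Any) -> Tuple[str, List[Any]]:
    """Build one UPDATE that sets the mandatory columns once plus any optional ones given."""
    unknown = set(extra) - set(OPTIONAL)
    if unknown:
        raise ValueError(f"unexpected columns: {sorted(unknown)}")

    # Keep a fixed column order; skip optional columns that were not supplied.
    present = [c for c in OPTIONAL if extra.get(c) is not None]
    columns = ["username", "url"] + present

    assignments = ", ".join(f"{c} = ?" for c in columns)
    query = f"UPDATE t1 SET {assignments} WHERE userid = ?"
    params = [username, url] + [extra[c] for c in present] + [userid]
    return query, params


query, params = build_update(100, "abc", "www.something.com", age=30, height=180)
```

With this approach there would be only one statement per partition, so no batch and no duplicate username/url writes, but I am not sure whether it is preferable to the batch variants above.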