Is a row automatically deleted if all the values are garbage-collected?

Let's say there is a row with one column family and one column in it. That column family has a GC policy, and all the values in the column have just expired.

Then what happens to that row? Would the garbage collector delete it, or would it remain and still be accessible?

I've checked the documentation but only found a vague comment on this at https://cloud.google.com/bigtable/docs/overview#empty-cells:

Empty cells in a Cloud Bigtable table do not take up any space. Each row is essentially a collection of key/value entries, where the key is a combination of the column family, column qualifier and timestamp. If a row does not include a value for a specific key, the key/value entry is simply not present.

In my case, the collection would be empty according to the above text. Then, my question is whether or not the empty collection still exists.

Ideally, I want an empty row to be deleted automatically so that the table does not accumulate too many empty rows. If an empty row is not deleted automatically, is there any way to automate that, other than writing a program that scans for and removes such rows?

Finsteraarhorn answered 10/2, 2019 at 13:27 Comment(0)

If all cells are garbage collected for a row key, then the row is indeed deleted. Note that garbage collection is asynchronous, and it can take up to a week for data to be totally removed.
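To make the rule concrete, here is a toy in-memory model in Python. It is only a sketch and not the Bigtable client API: the `collect_garbage` function, the dict layout, and the family names `cf1` and `cf2` are all made up for illustration, and real garbage collection runs asynchronously in the background. A row is modeled as a collection of `(family, qualifier, timestamp)` entries, and the row disappears once that collection is empty.

```python
from datetime import datetime, timedelta, timezone

# Toy model only (hypothetical names, not the Bigtable API).
# table maps row key -> {(family, qualifier, timestamp): value}


def collect_garbage(table, max_age_by_family, now):
    """Drop cells older than their family's max age, then drop empty rows."""
    for row_key in list(table):
        cells = table[row_key]
        for key in list(cells):
            family, _qualifier, ts = key
            max_age = max_age_by_family.get(family)
            if max_age is not None and now - ts > max_age:
                del cells[key]
        # A row is just its cells: with no cells left, the row is gone.
        if not cells:
            del table[row_key]


now = datetime(2019, 1, 1, tzinfo=timezone.utc)
table = {
    "row-1": {
        ("cf1", "col", now - timedelta(days=3)): b"a",
        ("cf2", "col", now - timedelta(days=3)): b"b",
    },
}
# Assumed per-family max-age policies for this example.
policies = {"cf1": timedelta(days=1), "cf2": timedelta(days=7)}

# First pass: the cf1 cell has expired, the cf2 cell keeps the row alive.
collect_garbage(table, policies, now)
survivors = sorted(family for family, _, _ in table["row-1"])

# Later pass: the cf2 cell has expired too, so the whole row is removed.
collect_garbage(table, policies, now + timedelta(days=10))
row_still_exists = "row-1" in table
```

In this model the row survives as long as any column family still holds a live cell, and there is never an "empty row" left behind to clean up.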

Tirza answered 10/2, 2019 at 14:8 Comment(2)
If different column families have different GC policies (in terms of number of days), does the row get deleted when the last column family gets garbage collected? Cherenkov
Correct. A row will be completely removed when all of its cells are completely removed (via garbage collection or deletion). Tirza
