I am new to Google Cloud Bigtable and have a very basic question: does the cloud offering protect my data against user error or application corruption? Google's website frequently mentions that the data is safe and protected, but it is not clear whether this scenario is covered, because I did not see any references to restoring data from a previous point-in-time copy. I am sure someone on this forum knows!
Updated 7/24/2020: Bigtable now supports both backups and replication.
Currently we do create backups of your data to protect against catastrophic events and to provide for disaster recovery.
As of February 2017, however, Cloud Bigtable does not offer backups that you can use to recover from user errors or application bugs. We hope to make this feature available in a future release, but there is no planned delivery date yet. In the meantime you can make your own snapshots using HBase or a similar process.
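For what it's worth, here is a minimal sketch of creating and restoring a backup with the google-cloud-bigtable Python client, assuming a client version with backup support (roughly v1.4+); the project, instance, cluster, table, and backup IDs below are placeholders:

```python
# Minimal sketch: create a Cloud Bigtable backup, then restore it into a
# new table. All resource IDs are placeholders; adjust for your project.
import datetime

from google.cloud import bigtable

client = bigtable.Client(project="my-project", admin=True)
instance = client.instance("my-instance")
table = instance.table("my-table")

# Backups are taken on a specific cluster and must carry an expiry time.
expire_time = (datetime.datetime.now(datetime.timezone.utc)
               + datetime.timedelta(days=7))
backup = table.backup("my-backup",
                      cluster_id="my-cluster",
                      expire_time=expire_time)

operation = backup.create()    # long-running admin operation
operation.result(timeout=600)  # block until the backup is ready

# Restore into a brand-new table in the same instance, e.g. to recover
# from a bad deploy that corrupted rows.
restore_op = backup.restore("my-table-restored")
restore_op.result(timeout=600)
```

Note that a restore always targets a new table, so you can diff the restored data against the live table before cutting over.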
In addition to Google's disaster protection that @Greg Dubicki mentioned, at Egnyte we back up our mission-critical Bigtable data into GCS, as Hadoop sequence files, using a couple of Python wrappers for the Bigtable HBase shaded jar.
This provides for quick recovery, fully under our control (i.e. no need to wait for Google support to recover data on demand), in case our BT cluster fails or an error on our software/admin side corrupts the data. A useful side effect is access to historical BT data for debugging.
Last week I wrote about this approach on Egnyte's engineering blog: https://medium.com/egnyte-engineering/bigtable-backup-for-disaster-recovery-9eeb5ea8e0fb. We are also thinking about open-sourcing it. We'll see how it goes.
UPDATE: On Thu Feb 20 I published the scripts on Egnyte's GitHub, under the MIT license - https://github.com/egnyte/bigtable-backup-and-restore.
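For a flavor of what such a wrapper boils down to, here is a rough, illustrative sketch (not the published scripts; the shaded-jar path, table, and bucket names are my own assumptions) that shells out to the standard HBase Export MapReduce job and writes sequence files to a timestamped GCS path:

```python
# Illustrative sketch only: wrap the HBase Export MapReduce job so it reads
# from Bigtable (via the Bigtable HBase shaded jar on the classpath) and
# writes Hadoop sequence files to GCS. Paths and names are assumptions; it
# also presumes hbase-site.xml points at Bigtable and the Hadoop GCS
# connector is installed so gs:// paths work.
import datetime
import os
import subprocess


def export_table_to_gcs(table: str, bucket: str) -> str:
    """Export one Bigtable table to GCS as sequence files; return the path."""
    dest = f"gs://{bucket}/bigtable-backups/{table}/" + \
        datetime.datetime.utcnow().strftime("%Y%m%d-%H%M%S")
    env = dict(os.environ,
               # put the Bigtable HBase shaded jar on the classpath so the
               # job talks to Bigtable instead of a local HBase cluster
               HBASE_CLASSPATH="/opt/bigtable/bigtable-hbase-1.x-shaded.jar")
    subprocess.run(
        ["hbase", "org.apache.hadoop.hbase.mapreduce.Export", table, dest],
        env=env, check=True)  # check=True raises if the job fails
    return dest


# e.g. export_table_to_gcs("my-table", "my-backup-bucket")
```

Restoring is the mirror image: run the corresponding Import job from the saved GCS path into a pre-created table.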
As of February 2020, Cloud Bigtable does provide backups, but they are described only vaguely:
(...) we [do] create backups of your data to protect against catastrophic events and provide for disaster recovery.