Google Cloud Bigtable backup and recovery

Asked 17/2, 2016 at 0:14 Answered 3/2, 2020 at 11:34

I am new to Google Cloud Bigtable and have a very basic question: does the cloud offering protect my data against user error or application corruption? The Google website says in many places that the data is safe and protected, but it is not clear whether the scenario above is covered, because I did not see any reference to how I can restore data from a previous point-in-time copy. I am sure someone on this forum knows!

Pallbearer answered 17/2, 2016 at 0:14 Comment(0)

Updated 7/24/2020: Bigtable now supports both backups and replication.
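For readers who want to try the backup feature mentioned in the update, here is a minimal sketch using the `google-cloud-bigtable` Python client. It assumes a recent client version that exposes `Table.backup()`, `Backup.create()` and a long-running operation with `result()`. The helper name `create_table_backup`, the seven-day retention and all IDs in the usage comment are illustrative assumptions, not values from this thread.

```python
import datetime


def create_table_backup(table, backup_id, cluster_id, retention_days=7):
    """Create a backup of `table` and block until the operation finishes.

    `table` is expected to be a google.cloud.bigtable table object obtained
    from an admin client. The helper name and the default retention are
    illustrative assumptions.
    """
    # The backup must carry an expiry timestamp; compute it in UTC.
    expire_time = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(
        days=retention_days
    )
    backup = table.backup(backup_id, cluster_id=cluster_id, expire_time=expire_time)
    operation = backup.create()    # starts a long-running operation
    operation.result(timeout=600)  # wait up to 10 minutes for completion
    return backup


# Usage (needs `pip install google-cloud-bigtable` and valid credentials;
# every ID below is a placeholder):
#
#   from google.cloud import bigtable
#   client = bigtable.Client(project="my-project", admin=True)
#   table = client.instance("my-instance").table("my-table")
#   create_table_backup(table, "my-backup", "my-cluster")
```

The helper takes the table object as a parameter so the admin client setup stays separate from the backup logic and the function can be exercised with a stub. Restoring goes through the same backup object in the client library; check the current documentation for the exact call and retention limits.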

Currently we create backups to protect against catastrophic events and provide for disaster recovery.

As of February 2017, Cloud Bigtable does not provide backups that protect against user errors or application bugs. We hope to make this feature available in a future release, but there is no planned delivery date. In the meantime, you can make your own snapshots using HBase or a similar process.

Padegs answered 17/2, 2016 at 21:52 Comment(3)

Thanks for the reply! Does anyone else see this as a big limitation of the offering or is it just me? Interested in hearing how other people are dealing with potential loss from user errors or application corruptions – Pallbearer 18/2, 2016 at 16:55

What I see as a limitation is that it simply doesn't work. The documentation and the error I got are not in sync, and Google is not very good at covering troubleshooting. I'm getting a 403 error saying that the output path does not exist or is not writeable, but I can create the output path and copy files using gsutil. – Marinara 7/9, 2018 at 19:43

It sounds like the machine you are running gsutil on is able to write to the bucket, but the machine you are running HBase on is not. In the Cloud Console, under IAM & Admin > IAM, you can set the correct roles for the service account your instance uses. – Padegs 7/9, 2018 at 21:56

In addition to Google's disaster protection that @Greg Dubicki mentioned, at Egnyte we back up our mission-critical Bigtable data into GCS as Hadoop sequence files, using a couple of Python wrappers for the Bigtable HBase shaded jar.

This provides for a quick recovery that is fully under our control (i.e. no need to wait for Google support to recover data on demand) in case our BT cluster fails or an error on our software/admin side corrupts the data. A useful side effect is access to historical BT data for debugging.

Last week I wrote about this on Egnyte's engineering blog: https://medium.com/egnyte-engineering/bigtable-backup-for-disaster-recovery-9eeb5ea8e0fb. We are also thinking about open-sourcing it. We'll see how it goes.

UPDATE: On Thu Feb 20 I published the scripts on Egnyte’s GitHub, under the MIT license: https://github.com/egnyte/bigtable-backup-and-restore.

Doggery answered 3/2, 2020 at 11:34 Comment(0)

As of February 2020, Cloud Bigtable does provide backups, but they are only vaguely described:

(...) we [do] create backups of your data to protect against catastrophic events and provide for disaster recovery.

Source

Parturient answered 2/2, 2020 at 12:22 Comment(0)
