Migrating Redis to AWS Elasticache with minimal downtime

Asked 13/6, 2016 at 10:55 Answered 28/9, 2023 at 7:41

amazon-web-services redis data-migration

Let's start by listing some facts:

Elasticache can't be a slave of my existing Redis setup. Real shame, that would be so much more efficient.
I have only one Redis server to migrate, with roughly 3gb of data.
Downtime must be less than 10 mins. I assume the usual "stop the site, stop redis, provision cluster with snapshot" will take longer than this.

Similar to this question: How do I set an elasticache redis cluster as a slave?

One idea on how this might work:

Set Redis to use an AOF and trigger BGSAVE at the same time.
When BGSAVE finishes, provision the Elasticache cluster with RDB seed.
Stop the site and shut down my local Redis instance.
Use an aof-replay tool to replay the AOF into Elasticache.
Start the site again, pointed at the Elasticache cluster.

My questions:

How can I guarantee that my AOF file begins at exactly the point the RDB file ends, and that no data will be written in between?
Is there an AOF tool supported by the maintainers of Redis, or are they all third-party solutions, and therefore (potentially) of questionable reliability?*

* No offence intended to any authors of such tools, I'm sure they're great, I just feel much more confident using a tool written by the same team as the product to avoid potential compatibility bugs.

Demonstration answered 13/6, 2016 at 10:55 Comment(4)

Can your app operate without Redis? Will it just be slower (no access to cache of course) or will it fail? – Muscadel 13/6, 2016 at 21:17

How active are your users overnight? 3am-5am? That's when I would migrate a major change in a production app that needed to be up for our business users. Even if your app is used 24/7, the odds are you have a low-usage period that you could plan for, to minimize the noticeable effect if you are offline for 30 mins during a migration. – Muscadel 13/6, 2016 at 21:24

App is mostly useless without Redis, and there isn't really a daily usage pattern. It's a constantly loaded application over a 24-hour period. What makes you guess it would be only 30 mins of downtime? – Demonstration 14/6, 2016 at 2:30

Please check this: aws.amazon.com/about-aws/whats-new/2019/10/… – Haman 7/11, 2019 at 7:44

I have only one Redis server to migrate, with roughly 3gb of data

I would halt the site, save the Redis dump to S3, and then load it into a new cluster.

I'm guessing 10 minutes to save the file and get it into S3.
10 minutes to launch an Elasticache cluster from that data. That leaves you ten extra minutes to configure and test.

But there is a simple way of knowing EXACTLY how long. Do a test migration of it.

DON'T stop your live system.
Run BGSAVE and get a dump of your Redis (leave everything running as normal).
Move the dump to S3.
Launch an Elasticache cluster from it.

Take DETAILED notes, TIME each step, copy the commands to a notepad window.
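If you would rather not time things by hand, a small script can do it. This is only a sketch: run_timed and format_log are names I made up, and the sleeps are stand-ins for whatever commands you actually run (redis-cli, the AWS CLI and so on).

```python
import time


def run_timed(steps):
    """Run (name, callable) pairs in order and record how long each takes."""
    log = []
    for name, action in steps:
        start = time.monotonic()
        action()
        log.append((name, time.monotonic() - start))
    return log


def format_log(log):
    """Render the per-step timings plus a total, one line per entry."""
    lines = [f"{name}: {seconds:.1f}s" for name, seconds in log]
    total = sum(seconds for _, seconds in log)
    lines.append(f"total: {total:.1f}s")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder steps: replace each lambda with the real command
    # for your rehearsal. The sleeps below are stand-ins only.
    rehearsal = [
        ("BGSAVE", lambda: time.sleep(0.01)),
        ("copy dump to S3", lambda: time.sleep(0.01)),
        ("launch cluster from dump", lambda: time.sleep(0.01)),
    ]
    print(format_log(run_timed(rehearsal)))
```

The printed log can be pasted straight into the migration document.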

Put it all in a Word/Excel document so you have a migration document. That way you know how long it takes and there are no surprises. Let us know how it goes.

Muscadel answered 14/6, 2016 at 2:55 Comment(0)

ElastiCache has online migration support. You can use the start-migration API to start a migration from a self-managed cluster to an ElastiCache cluster.

aws elasticache start-migration --replication-group-id <ElastiCache Replication Group Id> --customer-node-endpoint-list "Address='<IP Address>',Port=<Port>"

The input to the API is your ElastiCache replication group ID and the IP address and port of the master of your self-managed cluster. You need to ensure that the IP address is reachable from the ElastiCache node (for example, the private IP address of the master of your self-managed cluster). This API makes the master node of the ElastiCache cluster call 'SLAVEOF' on the master of your self-managed cluster. This establishes a replication stream and starts migrating data from the self-managed cluster to the ElastiCache cluster. During migration, the master of the ElastiCache cluster stops accepting writes sent to it directly. You can already use the ElastiCache cluster from your application for reads.

Once you have all your data in the ElastiCache cluster, you can use the complete-migration API to stop the migration. This API stops the replication from the self-managed cluster to the ElastiCache cluster.

aws elasticache complete-migration --replication-group-id <ElastiCache Replication Group Id>

After this, the master of the ElastiCache cluster will start accepting writes. You can start using the ElastiCache cluster from your application for both reads and writes.
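One way to judge when "all your data" has arrived, before calling complete-migration, is to compare the output of INFO replication on both sides. The sketch below only parses that text; parse_info and replica_caught_up are hypothetical helper names, and I am assuming the ElastiCache primary reports the usual replica fields (master_link_status, master_sync_in_progress, slave_repl_offset) while it is replicating from your server.

```python
def parse_info(text):
    """Parse the text of a Redis INFO section into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def replica_caught_up(replica_info, source_info, max_lag_bytes=0):
    """Return True if the replica is linked, not mid-sync, and within
    max_lag_bytes of the source's replication offset."""
    replica = parse_info(replica_info)
    source = parse_info(source_info)
    if replica.get("master_link_status") != "up":
        return False
    if replica.get("master_sync_in_progress") != "0":
        return False
    lag = int(source["master_repl_offset"]) - int(replica["slave_repl_offset"])
    return lag <= max_lag_bytes
```

On a server that is still taking writes the offsets will rarely be exactly equal, so either pass a small max_lag_bytes or stop writes on the old server first and wait for the lag to reach zero.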

Be aware of the following limitations of this migration method:

An existing or newly created ElastiCache deployment should meet the following requirements for migration:
It's cluster-mode disabled using Redis engine version 5.0.5 or higher.
It doesn't have either encryption in-transit or encryption at-rest enabled.
It has Multi-AZ with Auto-Failover enabled.
It has sufficient memory available to fit the data from your Redis on EC2 instance. To configure the right reserved memory settings, see Managing Reserved Memory.
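The memory requirement is simple arithmetic, sketched below. fits_in_node is a hypothetical helper; used_memory_bytes would come from the used_memory field of INFO memory on your current server, node_memory_bytes from the node type you pick, and reserved_percent from the reserved-memory-percent parameter of your parameter group. The default of 25 here is my assumption, so check your own parameter group.

```python
def fits_in_node(used_memory_bytes, node_memory_bytes, reserved_percent=25):
    """Return True if the dataset fits in the node's memory after
    subtracting the share held back as reserved memory."""
    usable = node_memory_bytes * (100 - reserved_percent) // 100
    return used_memory_bytes <= usable


GIB = 1024 ** 3

if __name__ == "__main__":
    # Roughly 3 GB of data against a hypothetical 4 GiB node.
    print(fits_in_node(3 * GIB, 4 * GIB))
```

This only covers the dataset as it is today; leave extra headroom for growth and for replication buffers.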

Footlights answered 4/11, 2019 at 18:18 Comment(2)

Does online migration only support Redis hosted on Amazon Linux EC2, or does it support other operating systems as well? – Smock 24/11, 2020 at 14:27

The migration process is OS independent. So it should work on other OS as well. – Footlights 26/11, 2020 at 17:47

There are a few ways to migrate the data without downtime. They are harder to achieve though.

You could have your app write to two Redis instances simultaneously, one of which would be on EC. Once both caches are 'warm', you could just restart your app and read from the EC cache.
You could initially migrate to EC2 instead of EC. Not really what you were hoping to hear, I imagine. This is easy to do because you can set the EC2 instance as a slave of your Redis instance. Also, migrating from EC2 to EC is somewhat easier (the data is already on AWS), so there's a benefit for users with huge sets of data.
You could, in theory, intercept the commands from the client and send them to EC, thus effectively "replicating". But this requires some programming (I don't believe a tool like this exists ATM) and would be hard with multiple, ephemeral clients.
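To make the first option concrete, here is a minimal sketch of a dual-write wrapper. DualWriteCache is a made-up class, not part of any library; it assumes both clients expose set(key, value) and get(key), as redis-py clients do, and it covers only those two commands. A real app would need to wrap every command it uses.

```python
class DualWriteCache:
    """Hypothetical wrapper: writes go to both clients, reads to one."""

    def __init__(self, old, new, read_from_new=False):
        self.old = old
        self.new = new
        self.read_from_new = read_from_new
        self.new_write_errors = 0

    def set(self, key, value):
        # The old instance stays the source of truth, so its errors propagate.
        self.old.set(key, value)
        try:
            self.new.set(key, value)
        except Exception:
            # Best effort: count the failure rather than break the site.
            self.new_write_errors += 1

    def get(self, key):
        client = self.new if self.read_from_new else self.old
        return client.get(key)
```

Once the new instance is warm, flip read_from_new to True. Note that anything written before dual writes began never reaches the new instance, so this only suits data that is rewritten or expires regularly.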

Tormoria answered 22/8, 2017 at 7:55 Comment(0)

You could try RedisShake, which was designed for this kind of task.

Assume you have two Redis instances:

Instance A: 127.0.0.1:6379
Instance B: 127.0.0.1:6380

Create a new configuration file shake.toml:

[sync_reader]
address = "127.0.0.1:6379"

[redis_writer]
address = "127.0.0.1:6380"

To start RedisShake, run the following command:

./redis-shake shake.toml

The data will continue to synchronize, and you only need to choose the appropriate timing to switch.
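As a rough aid for picking that moment, you could compare the two instances before switching. This is a sketch, not part of RedisShake: looks_in_sync is a hypothetical helper that assumes both clients expose dbsize(), randomkey() and exists(key), as redis-py clients do, and it checks only key counts and the presence of sampled keys, not their values.

```python
def looks_in_sync(source, target, samples=100, tolerance=0):
    """Cheap sanity check: key counts agree within tolerance and a
    random sample of source keys is present on the target."""
    if abs(source.dbsize() - target.dbsize()) > tolerance:
        return False
    for _ in range(samples):
        key = source.randomkey()
        if key is None:  # the source is empty, nothing to sample
            break
        if not target.exists(key):
            return False
    return True
```

While the source is still taking writes, a freshly created key may not have reached the target yet, so a single False is not conclusive. Run the check a few times, or pause writes first.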

Judicature answered 28/9, 2023 at 7:41 Comment(0)
