How to save memory from unpopular/cold Redis?

Asked 6/5, 2020 at 7:24 Answered 19/12, 2021 at 3:38

We run a lot of Redis instances, which consume terabytes of memory across hundreds of machines.

As our business activity goes up and down, some Redis instances are no longer used that frequently; they are "unpopular" or "cold". But Redis stores everything in memory, so a lot of infrequently accessed data that could live on cheap disk is occupying expensive memory.

We are exploring ways to reclaim the memory held by these unpopular/cold Redis instances, so as to reduce our machine usage.

We cannot delete data, nor can we migrate to another database. Is there some way to achieve our goal?

PS: We are thinking of some Redis compatible product that can "mix" memory and disk, i.e. it stores hot data in memory but cold in disk, and USING LIMITED RESOURCES. We know about RedisLabs' "Redis on Flash (ROF)" solution, but it uses RocksDB, which is very memory unfriendly. What we want is a very memory-constrained product. Besides, ROF is not open source :(

Thanks in advance!

Monogram answered 6/5, 2020 at 7:24 Comment(8)

Did you consider setting TTLs on your unpopular/cold keys, or creating eviction policies depending on the business needs? – Trinl 6/5, 2020 at 10:17

"but it uses RocksDB, which is very memory unfriendly." RedisLabs tuned it so it won't be "memory unfriendly" – Ichang 6/5, 2020 at 12:7

@Trinl We treat Redis as a database, not a cache, so evicting data is not possible. – Monogram 7/5, 2020 at 3:56

Does your application implement the repository pattern? If not, then you may have painted yourself into a corner by making Redis a dependency. From Redis FAQ: "In the past the Redis developers experimented with Virtual Memory and other systems in order to allow larger than RAM datasets, but after all we are very happy if we can do one thing well: data served from memory, disk used for storage. So for now there are no plans to create an on disk backend for Redis. Most of what Redis is, after all, is a direct result of its current design." – Rinna 8/5, 2020 at 10:36

@bayinamy, can you provide some information on the type of data and the Redis data structure you're using to store it? Also, what's the access pattern on the unpopular/cold data? How about a secondary cluster with the cold data serialised into a binary format (Kryo/Avro/ProtoBuf) and compressed? Then you can look up in the primary cluster and, if the key is not there, look up in the secondary/unpopular/cold cluster. This might work if your unpopular data is rarely accessed, so it's okay to take a performance hit on two lookups instead of one. – Lofton 9/5, 2020 at 15:39
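
The two-tier lookup suggested in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: plain dicts stand in for the primary and secondary clusters, JSON plus zlib from the standard library stand in for Kryo/Avro/ProtoBuf, and the function names `archive` and `lookup` are made up for this example.

```python
import json
import zlib
from typing import Any, MutableMapping, Optional


def archive(hot: MutableMapping[str, Any],
            cold: MutableMapping[str, bytes],
            key: str) -> None:
    """Move one value from the hot store to the cold store, compressed."""
    value = hot.pop(key)
    # JSON + zlib is an assumption here; any compact serialiser would do.
    cold[key] = zlib.compress(json.dumps(value).encode("utf-8"))


def lookup(hot: MutableMapping[str, Any],
           cold: MutableMapping[str, bytes],
           key: str) -> Optional[Any]:
    """Try the hot store first, then fall back to the compressed cold store."""
    if key in hot:
        return hot[key]
    blob = cold.get(key)
    if blob is None:
        return None
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Dicts stand in for the two clusters in this sketch.
hot_store = {"user:1": {"name": "alice", "visits": 3}}
cold_store = {}
archive(hot_store, cold_store, "user:1")
print(lookup(hot_store, cold_store, "user:1"))
```

A cold hit pays for a second lookup plus decompression, which is the performance trade-off the comment describes.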

redislabs.com/ebook/part-2-core-concepts/… This might be helpful. – Cassondracassoulet 14/5, 2020 at 19:44

"PS: We are thinking of some Redis compatible product that can "mix" memory and disk, i.e. it stores hot data in memory but cold in disk, and USING LIMITED RESOURCES." That is swap, isn't it? Maybe you should tune the vm.swappiness parameter to aggressively swap unused data to disk? – Adenovirus 15/5, 2020 at 7:56

@ReinerRottmann that's a creative way of handling the problem, it might work depending on the type of load: Redis has its own way of storing data in memory pages and that may not fit the swapping pattern, I would be curious to see some benchmarks. Have you considered the possibility of shutting down the entire Redis instance when not in use, the serverless way? – Roby 15/5, 2020 at 8:27

ElastiCache Redis now supports data tiering. Data tiering provides a new cost optimal option for storing data in Redis by utilizing lower-cost local NVMe SSDs in each cluster node in addition to storing data in memory. It is ideal for workloads that access up to 20 percent of their overall dataset regularly, and for applications that can tolerate additional latency when accessing data on SSD. More details about data tiering can be found here.

Convector answered 19/12, 2021 at 3:38 Comment(0)

Your problem might be solved by using an orchestrator approach: scale down when not in use, scale up when in demand.

Implementation depends heavily on your infrastructure, but a base requirement is proper monitoring of Redis instance usage. Based on that, if you are running on Kubernetes, you can leverage pod autoscaling.
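
As a rough sketch of the monitoring step, the snippet below picks scale-down candidates from per-instance statistics. It assumes you have already collected the output of Redis `INFO` for each instance into a dict (the field names `instantaneous_ops_per_sec` and `used_memory` come from `INFO`); the function name `find_cold_instances`, the instance names, and the threshold are hypothetical.

```python
from typing import List, Mapping


def find_cold_instances(stats: Mapping[str, Mapping[str, float]],
                        max_ops_per_sec: float = 1.0) -> List[str]:
    """Return instances at or below the ops threshold, largest memory first."""
    cold = [
        name for name, info in stats.items()
        if info.get("instantaneous_ops_per_sec", 0) <= max_ops_per_sec
    ]
    # Sort so the instances holding the most memory are considered first.
    return sorted(cold,
                  key=lambda name: stats[name].get("used_memory", 0),
                  reverse=True)


# Hypothetical sample of INFO fields gathered from three instances.
sample = {
    "orders":   {"instantaneous_ops_per_sec": 850, "used_memory": 9_000_000},
    "archive":  {"instantaneous_ops_per_sec": 0,   "used_memory": 4_000_000},
    "sessions": {"instantaneous_ops_per_sec": 1,   "used_memory": 7_000_000},
}
print(find_cold_instances(sample))
```

A single sample of `instantaneous_ops_per_sec` can be misleading, so in practice you would probably average it over a longer window before deciding to shut anything down.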

Otherwise you can implement Consul and use HAProxy to handle the shutdown/spin-up logic. A starting point for that strategy is this article.

Of course Reiner's idea of using swap is a quick win if it works the intended way!

Roby answered 15/5, 2020 at 8:44 Comment(0)
