Should Hadoop FileSystem be closed?
I'm building a Spring Boot powered service that writes data to Hadoop using the FileSystem API. Some data is written to Parquet files, and large blocks are cached in memory, so when the service is shut down, potentially several hundred MB of data have to be written to Hadoop.

FileSystem closes automatically by default, so when the service is shut down, the FileSystem sometimes gets closed before all the writers are closed, resulting in corrupted Parquet files.

There is an fs.automatic.close flag in the filesystem Configuration, but the FileSystem instance is used from multiple threads and I don't know of any clean way to wait for them all to finish before closing the FileSystem manually. I tried using a dedicated filesystem-closing bean implementing Spring's SmartLifecycle with the maximum phase so it is destroyed last, but in fact it is not destroyed last; it is merely notified of shutdown last while other beans are still in the process of shutting down.

Ideally, every object that needs a FileSystem would get its own and would be responsible for closing it. The problem is that FileSystem.get(conf) returns a cached instance. There is FileSystem.newInstance(conf), but it is not clear what the performance consequences of using multiple FileSystem instances are. There is another issue with that approach: there is no way to pass a FileSystem instance to ParquetWriter, which obtains one using path.getFileSystem(conf). One might think that call would return a FileSystem instance assigned to that file only, but one would be wrong: most likely the same cached instance is returned, so closing it would be wrong.
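
A minimal sketch of the caching behaviour I mean (the path is just a placeholder, and FileSystem.get(conf) resolves against fs.defaultFS):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Both calls go through the shared FileSystem cache, so they return the
        // same object; closing one of them effectively closes the other as well.
        FileSystem cached = FileSystem.get(conf);
        FileSystem viaPath = new Path("/data/example.parquet").getFileSystem(conf);
        System.out.println(cached == viaPath); // true

        // newInstance() bypasses the shared entry and returns a separate instance
        // that the caller is responsible for closing.
        FileSystem fresh = FileSystem.newInstance(conf);
        System.out.println(cached == fresh);   // false
        fresh.close();
    }
}
```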

Is there a recommended way of managing the lifecycle of a FileSystem? What would happen if a FileSystem is created with fs.automatic.close set to false and never closed manually? Maybe Spring Boot supports a clean way to close the FileSystem after all other beans are actually destroyed (not merely being destroyed)?

Thanks!

Voluntary answered 14/3, 2019 at 17:41 Comment(1)
Could you please fix the missing backtick in the third paragraph? I would have to change 6 characters (meta.stackexchange.com/q/81520) and don't know how I should do this without adding unnecessary noise.Original

You can disable the FileSystem cache using the fs.<scheme>.impl.disable.cache configuration (found here, some discussion here), where <scheme> in your case would be hdfs (assuming you are using HDFS). This will force ParquetWriter to create a new FileSystem instance when it calls path.getFileSystem(conf). This configuration is undocumented for good reason: while it is widely used in unit tests within Hadoop itself, it can be very dangerous to use in a production system.

To answer your question regarding performance: assuming you are using HDFS, each FileSystem instance will create a separate TCP connection to the HDFS NameNode. Application and library code is typically written on the assumption that calls like path.getFileSystem(conf) and FileSystem.get(conf) are cheap and lightweight, so they are used frequently. In a production system, I have seen a client system DDoS a NameNode server because it disabled caching. You need to carefully manage the lifecycle of not just the FileSystem instances your code creates, but also those created by libraries you use. I would generally recommend against it.
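
If you do go that route, here is a minimal sketch of the configuration, assuming an HDFS filesystem; the NameNode address and path are made-up examples:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DisableFsCache {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Disable caching for the hdfs scheme only; every lookup for an hdfs://
        // URI now builds a brand-new FileSystem (and a new NameNode connection).
        conf.setBoolean("fs.hdfs.impl.disable.cache", true);

        Path path = new Path("hdfs://namenode:8020/tmp/example.parquet");
        // Because the cache is disabled, this instance is not shared, so the
        // caller must close it (here via try-with-resources).
        try (FileSystem fs = path.getFileSystem(conf)) {
            System.out.println(fs.getUri());
        }
    }
}
```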

It sounds like the issue is really coming from a bad interaction between the JVM shutdown hooks used by Spring and those employed by Hadoop, the latter being the mechanism used to automatically close FileSystem instances. Hadoop includes its own ShutdownHookManager which is used to sequence events during shutdown; FileSystem shutdown is purposefully placed at the end so that other shutdown hooks (e.g. cleaning up after a MapReduce task) can be completed first. But Hadoop's ShutdownHookManager is only aware of shutdown tasks that have been registered with it, so it will not be aware of Spring's lifecycle management. It does sound like leveraging Spring's shutdown sequences together with fs.automatic.close=false may be the right fit for your application; I don't have Spring experience, so I can't help you in that regard. You may also be able to register Spring's entire shutdown sequence with Hadoop's ShutdownHookManager, using a very high priority to ensure that Spring's shutdown sequence is first in the shutdown queue.
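
A rough sketch of that last idea, assuming your application can expose its entire shutdown sequence as a single Runnable (the class and parameter names are made up):

```java
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.util.ShutdownHookManager;

public class RegisterAppShutdown {
    // Registers the application's shutdown sequence with Hadoop's ShutdownHookManager.
    // Hooks with a higher priority run earlier, and the FileSystem cache is closed at
    // FileSystem.SHUTDOWN_HOOK_PRIORITY, so anything above that runs before it.
    public static void install(Runnable springShutdownSequence) {
        ShutdownHookManager.get().addShutdownHook(
                springShutdownSequence,
                FileSystem.SHUTDOWN_HOOK_PRIORITY + 10);
    }
}
```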

To answer this portion specifically:

Is there a recommended way of managing the lifecycle of a FileSystem?

The recommended way is generally to not manage it, and let the system do it for you. There be dragons whenever you try to manage it yourself, so proceed with caution.

Outlet answered 8/5, 2020 at 21:33 Comment(2)
Thanks for the detailed answer! I solved my problem by disabling automatic FS shutdown and using a custom shutdown manager that is called by Spring on shutdown and stops every bean that uses the FS (every such bean has to register on init; see the sketch after these comments). It's not ideal, but the issue is that while you can specify shutdown priorities, Spring doesn't wait for a bean with a higher priority to actually stop before stopping beans with a lower priority. For that reason, even if Hadoop's ShutdownHookManager were managed by Spring and shut down last, some beans using the FS might not have completed the shutdown process by that time.Voluntary
One problem with the referenced blog post and the general assumption is that the cache switch is at the "scheme" level, not really at the "URI" level that Cache.Key takes. The problem is that if one application involves multiple namespaces, such as several S3 buckets or HDFS clusters, this would fail. Ref: github.com/apache/hadoop/blob/trunk/hadoop-common-project/…Doorstop
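
For context, a hypothetical sketch of the registry approach described in the first comment above; the class name and the AutoCloseable-based contract are assumptions, not the actual code:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.hadoop.fs.FileSystem;

public class FsShutdownRegistry {
    private final List<AutoCloseable> users = new CopyOnWriteArrayList<>();
    private final FileSystem fs;

    public FsShutdownRegistry(FileSystem fs) {
        this.fs = fs;
    }

    // Every bean that writes through the shared FileSystem registers itself on init.
    public void register(AutoCloseable user) {
        users.add(user);
    }

    // Invoked from Spring's shutdown callback (e.g. a @PreDestroy method): close all
    // registered writers first, then the FileSystem itself, which is safe because
    // automatic FS shutdown has been disabled.
    public void shutdown() throws Exception {
        for (AutoCloseable user : users) {
            user.close();
        }
        fs.close();
    }
}
```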
