Calling ExecutorService.shutdown() in Java

i am starting to learn the ExecutorService class. The documentation (and tutorials online) say to always call ExecutorService.shutdown() to reclaim resources. however, the documentation also says that after you call shutdown(), no new tasks will be accepted. so, my question is, do i always have to instantiate a new ExecutorService whenever i need to parallelize data processing?

right now i have a List of Callable objects, and i do the following.

public void someMethod() throws InterruptedException {
 List<OuterCallable> outerCallables = getOuterCallables();
 ExecutorService executor = Executors.newFixedThreadPool(NUM_CPUS);
 // invokeAll blocks until every submitted task has completed
 executor.invokeAll(outerCallables);
 executor.shutdown();
}

however, my OuterCallable also splits data or executes data processing in parallel using InnerCallable.

public class OuterCallable implements Callable<Long> {
 public Long call() throws Exception {
  long result = 0L;

  List<InnerCallable> innerCallables = getInnerCallables();
  ExecutorService executor = Executors.newFixedThreadPool(NUM_CPUS);
  // invokeAll blocks until all inner tasks have completed
  executor.invokeAll(innerCallables);
  executor.shutdown();

  return result;
 }
}

i can't remember if it was for ExecutorService or the Fork/Join approach, but i remember the documentation and tutorials saying that the actual parallel procedure to manipulate data should not involve I/O operations and that everything should be done in memory. however, in my InnerCallable, i am actually making JDBC calls (not shown here).

ultimately, the way i am using ExecutorService works, but i still have lingering concerns.

  1. is my approach above good programming practice using ExecutorService?
  2. should i be using a singleton instance of ExecutorService?
  3. should i be not only avoiding I/O operations inside my parallel methods but also JDBC calls as well?

as a last concern, i was trying to research a little bit on Fork/Join vs ExecutorService. i came across an article that completely blasted the Fork/Join API/classes. is it worth it to learn Fork/Join? i saw a few articles on stackoverflow and elsewhere, where tests are used to compare Fork/Join vs ExecutorService, and there are graphs showing better CPU usage of Fork/Join vs ExecutorService (via Windows Task Manager). however, when i use ExecutorService (JDK 1.7.x), my CPU usage is max. has ExecutorService improved with the latest JDK?

any help/guidance is appreciated.

Sebaceous answered 27/2, 2012 at 13:8 Comment(0)

You should add awaitTermination calls, because shutdown() returns without waiting for the Callables to finish. Other than that,

  1. Do your OuterCallables have sequential dependencies? If so, your approach is fine, but using ForkJoinPool would be preferable because it will keep the number of worker threads low. If not, it would be better to submit a big flattened collection of Callables to a single Executor.
  2. Only if you want to use the ExecutorService in several different routines and want to avoid passing it around. If it's only used in someMethod, might as well instantiate it there as you are doing.
  3. You should avoid I/O routines that will take a long time to complete. If your ExecutorService has 4 worker threads and all of them are blocking on I/O, the JVM won't be using the CPU at all, even though other Callables may be waiting to do CPU-intensive work. A few JDBC calls are okay as long as the queries do not take long to complete.
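
A minimal sketch of the "single Executor plus awaitTermination" suggestion, assuming Java 8 or later for the lambda syntax. The class name FlattenedPoolSketch, the helpers buildTasks and runAll, and the one-minute timeout are hypothetical and chosen only for illustration; the placeholder tasks stand in for whatever the real Callables do. Note that invokeAll and Future.get() already block until the tasks are done, so awaitTermination here mainly guards the cleanup path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FlattenedPoolSketch {

    // Hypothetical stand-in for a flattened list of tasks; task i returns i.
    static List<Callable<Long>> buildTasks(int n) {
        List<Callable<Long>> tasks = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final long value = i;
            tasks.add(() -> value); // placeholder for real work
        }
        return tasks;
    }

    // Runs every task on one fixed-size pool and sums the results.
    static long runAll(List<Callable<Long>> tasks, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            long total = 0L;
            // invokeAll blocks until every task has completed
            for (Future<Long> future : executor.invokeAll(tasks)) {
                total += future.get();
            }
            return total;
        } finally {
            executor.shutdown(); // stops accepting new tasks, returns immediately
            // Wait for the workers to exit; the timeout value is arbitrary.
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                executor.shutdownNow(); // interrupt anything still running
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(buildTasks(10), 4));
    }
}
```

With these placeholder tasks, runAll(buildTasks(10), 4) returns 55. Changing poolSize is the only adjustment needed to experiment with different worker counts.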
Lashelllasher answered 27/2, 2012 at 13:22 Comment(9)
1. yes, there is a sequential dependency between the OuterCallables and InnerCallables. you are right on shutdown vs awaitTermination, but how come my code still works? 2. ok. thanks. 3. for the InnerCallables, they are all JDBC calls used to compute simple select count(*) SQL statements. the number of rows in the table is huge (beyond 3 million rows). even though each column is indexed, it is taking 10 to 20 seconds per SQL call. the SQL statements are dynamically generated (from the OuterCallable) and so there is no way to make them stored procs for improved speed. - Sebaceous
on point #3, so given that what i am doing is not a few JDBC calls and takes a long time, is this an abuse of ExecutorService? thanks for #1 and i will explore that route too. - Sebaceous
1. Are you using Future objects to communicate results? If so, awaitTermination isn't really necessary since get() blocks. 3. I think most storage engines read through the smallest index (if one is available) for a count(*). You could try adding a dummy int column with an index and filling it with zeros, then doing count(dummy). Might be faster, but still linear time. Can you keep track of table sizes in a separate table which is updated on inserts and deletes, or would that require too many changes? - Lashelllasher
I would say it's not an abuse of ExecutorService, as long as you understand that the tasks will take a long time because of the JDBC calls. You still get the benefit of easily changing the worker pool size, so you can see how this affects database performance. My guess is that the parallelism would help if most of the index files fit in memory, but not if they're on disk. - Lashelllasher
yes, i am using Future to get the results (not shown or implied in my original post). thanks, that explains it. i'll also try the dummy int/short column (using mysql). the table is read-only. thanks for all the explanation and time, much appreciated. - Sebaceous
@Daniel The documentation says: 'Initiates an orderly shutdown in which previously submitted tasks are executed, but no new tasks will be accepted.' Doesn't this imply that shutdown() does wait for Callables to finish? source: docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/… - Normannormand
@ChrisMorris I didn't mean to say that tasks will be canceled, just that shutdown returns immediately rather than blocking until tasks have finished. - Lashelllasher
@Daniel Ok. Let's say the Executor has started X of the Z tasks that it has been given. Can the Executor return from shutdown() before starting the remaining Z - X tasks? - Normannormand
@ChrisMorris yes, shutdown should return immediately and some time later those Z - X tasks will be picked up by worker threads and executed. Looking at ThreadPoolExecutor's implementation of shutdown, it basically just sets a flag which causes execute to reject new tasks; it shouldn't have any effect on tasks already submitted. - Lashelllasher
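
The behaviour discussed in these comments can be checked with a small sketch, assuming Java 8 or later. The class name ShutdownSketch, both method names, the pool size of 2, the 50 ms sleep, and the 10 second timeout are all hypothetical and arbitrary. It illustrates two things under those assumptions: tasks submitted before shutdown() still run to completion, and a task submitted afterwards is rejected (with the default rejection policy, by throwing RejectedExecutionException).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownSketch {

    // Queues taskCount slow tasks, calls shutdown() right away, and reports
    // how many of them still ran to completion.
    static int completedAfterShutdown(int taskCount) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            executor.submit(() -> {
                try {
                    Thread.sleep(50); // simulate work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                completed.incrementAndGet();
            });
        }
        executor.shutdown(); // returns immediately; queued tasks keep running
        executor.awaitTermination(10, TimeUnit.SECONDS); // block until they finish
        return completed.get();
    }

    // Returns true if a task submitted after shutdown() is rejected.
    static boolean rejectsAfterShutdown() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.shutdown();
        try {
            executor.submit(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + completedAfterShutdown(6));
        System.out.println("rejected: " + rejectsAfterShutdown());
    }
}
```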
