Is ARPACK thread-safe?

4

7

Is it safe to use the ARPACK eigensolver from different threads at the same time from a program written in C? Or, if ARPACK itself is not thread-safe, is there an API-compatible thread-safe implementation out there? A quick Google search didn't turn up anything useful, but given the fact that ARPACK is used heavily in large scientific calculations, I'd find it highly surprising to be the first one who needs a thread-safe sparse eigensolver.

I'm not too familiar with Fortran, so I translated the ARPACK source code to C using f2c, and it seems that there are quite a few static variables. Basically, all the local variables in the translated routines seem to be static, implying that the library itself is not thread-safe.

Hypervitaminosis answered 8/10, 2010 at 9:20 Comment(1)
Thanks guys, all of you have helped a lot, and I would have accepted all of the answers if that were possible. Bernoulli
5

Fortran 77 does not support recursion, and hence a standard conforming compiler can allocate all variables in the data section of the program; in principle, neither a stack nor a heap is needed [1].

It might be that this is what f2c is doing, and if so, it might be the f2c step that makes the program non-thread-safe, rather than the program itself. Of course, as others have mentioned, look out for COMMON blocks as well.

EDIT: Also, check for explicit SAVE directives. SAVE means that the value of the variable should be retained between subsequent invocations of the procedure, similar to static in C. Now, allocating all procedure-local data in the data section makes all variables implicitly SAVE, and unfortunately, there is a lot of old code that assumes this even though it's not guaranteed by the Fortran standard. Such code, obviously, is not thread-safe.

With regard to ARPACK specifically, I can't promise anything, but ARPACK is generally well regarded and widely used, so I'd be surprised if it suffered from these kinds of dusty-deck problems.
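As a rough illustration of the SAVE point, here is a minimal C sketch. It is not taken from ARPACK or from f2c output; the function names `next_id_saved` and `next_id_reentrant` are made up for this example. The first function keeps its state in a `static` local, which behaves like a SAVE variable; the second gets its state from the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not ARPACK code.
   `calls` has static storage, like a Fortran SAVE variable:
   every caller, on every thread, shares the same copy. */
int next_id_saved(void)
{
    static int calls = 0;
    calls += 1;   /* unsynchronized read-modify-write: a data race across threads */
    return calls;
}

/* Reentrant variant: the caller owns the counter and passes it in,
   so two threads using two different counters never interfere. */
int next_id_reentrant(int *calls)
{
    assert(calls != NULL);
    *calls += 1;
    return *calls;
}
```

With the second form, each thread can keep its own counter on its stack, which is the property one would want from every routine in a thread-safe library.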

Most modern Fortran compilers do use stack allocation. You might have better luck compiling ARPACK with, say, gfortran and the -frecursive option.

EDIT:

[1] Not because it's more efficient, but because Fortran was originally designed before stacks and heaps were invented, and for some reason the standards committee wanted to retain the option to implement Fortran on hardware with neither stack nor heap support all the way up to Fortran 90. Actually, I'd guess that stacks are more efficient on today's heavily cache-dependent hardware than accessing procedure-local data that is spread all over the data section.

Heinie answered 10/10, 2010 at 18:55 Comment(4)
Thanks for the clarification. So, theoretically, those static variables I see in the translated C code are made static only because allocating them in the data section is more efficient? It seems that there are no COMMON blocks in the ARPACK code (at least a quick grep for COMMON in the ARPACK source code revealed nothing), so is it safe to say then that the ARPACK code is thread-safe? Or are there other non-thread-safe mechanisms in FORTRAN that I should look out for? Bernoulli
@Tamas: I updated my answer, hopefully answering your further questions. Heinie
Thanks, that helped a lot! For the record: it looks like ARPACK depends on LAPACK, and LAPACK has quite a few SAVE directives. They seem to store machine-specific constants such as EPS, which are initialized on the first call to dlamch. I guess it's not too hard to get rid of them, but it looks like ARPACK, as it stands, is not thread-safe. Bernoulli
Update: LAPACK seems to be entirely thread-safe from version 3.3 onward, at least in the Netlib reference implementation. Bernoulli
5

I have converted ARPACK to C using f2c. Whenever you use f2c and you care about thread-safety, you must use the -a switch. This gives local variables automatic storage, i.e. they become stack-based locals rather than statics, which is what f2c produces by default.
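To make the difference between static and automatic locals concrete, here is a small hand-written C sketch. It is not actual f2c output, and `twice_static` and `twice_auto` are hypothetical names. A nested call stands in for a second thread entering the routine while the first is still inside it, which keeps the effect deterministic.

```c
#include <assert.h>

/* Hypothetical routine with a static scratch variable:
   all overlapping calls share the single copy of `tmp`. */
double twice_static(double x, int depth)
{
    static double tmp;
    assert(depth >= 0);
    tmp = 2.0 * x;
    if (depth > 0)
        twice_static(x + 1.0, depth - 1);  /* overlapping call overwrites tmp */
    return tmp;                            /* may no longer be 2.0 * x */
}

/* Same routine with an automatic (stack) local:
   every call gets its own `tmp`, so overlapping calls are harmless. */
double twice_auto(double x, int depth)
{
    double tmp;
    assert(depth >= 0);
    tmp = 2.0 * x;
    if (depth > 0)
        twice_auto(x + 1.0, depth - 1);
    return tmp;
}
```

Calling `twice_static(1.0, 1)` returns the value left behind by the inner call rather than twice the outer argument, while `twice_auto(1.0, 1)` returns the expected result. Two threads running the static version at the same time can corrupt each other in the same way, just without the determinism.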

Even so, ARPACK itself is decidedly not thread-safe. It uses a lot of COMMON blocks (i.e. global variables) to preserve state between different calls to its functions. If memory serves, it uses a reverse communication interface, which tends to lead developers to use global variables. And of course ARPACK was probably written long before multi-threading was common.

I ended up reworking the converted C code to systematically remove all the global variables. I created a handful of C structs and gradually moved the global variables into these structs. Finally, I passed pointers to these structs to each function that needed access to those variables. Although I could have just converted each global into a parameter wherever it was needed, it was much cleaner to keep them all together, contained in structs.

Essentially the idea is to convert global variables into local variables.
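A minimal sketch of that refactoring pattern, under stated assumptions: this is not my reworked ARPACK code, and the names `solver_ctx`, `solver_init` and `solver_step` are invented for illustration. State that would otherwise live in globals is gathered in one struct, and every function takes a pointer to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for state that used to live in global variables. */
typedef struct {
    int    iter;      /* iterations performed so far */
    double residual;  /* current residual estimate */
} solver_ctx;

/* Reset a context; each thread (or each solve) owns its own instance. */
void solver_init(solver_ctx *ctx, double initial_residual)
{
    assert(ctx != NULL);
    ctx->iter = 0;
    ctx->residual = initial_residual;
}

/* One toy "iteration" that halves the residual.
   Returns 1 once the residual is at or below tol, else 0.
   All state is reached through ctx; nothing is global. */
int solver_step(solver_ctx *ctx, double tol)
{
    assert(ctx != NULL);
    ctx->iter += 1;
    ctx->residual *= 0.5;
    return ctx->residual <= tol;
}
```

Because each caller declares its own `solver_ctx`, for example as a local variable in the thread's entry function, two solves can run at the same time without touching each other's state.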

Cony answered 7/8, 2011 at 11:19 Comment(0)
1

ARPACK uses BLAS, right? Then those libraries need to be thread-safe too. I believe your idea of checking with f2c might not be a bulletproof way of telling whether the Fortran code is thread-safe; I would guess it also depends on the Fortran compiler and libraries.

Libna answered 8/10, 2010 at 9:29 Comment(2)
Yes, ARPACK depends on BLAS and LAPACK, so the next step would be to check those too. I didn't mean that the absence of "static" in the translated C source code implies that the code itself is thread-safe, but the presence of it is a sign that the library is probably not thread-safe. However, while browsing the source code, I got a feeling that it wouldn't be too hard to rewrite it in a thread-safe manner, and I'd be surprised to learn that no one did it before me. Bernoulli
I was also thinking that maybe f2c uses a certain compiler strategy by generating statics, where another compiler would have allocated memory on the heap. I don't know Fortran, but it seems a lot more depends on the compiler than in C. Fortran does not expose pointers (right?), and a Fortran compiler has a bit more freedom than a C compiler. Libna
1

I don't know what strategy f2c uses in translating Fortran. Since ARPACK is written in FORTRAN 77, the first thing to do is check for the presence of COMMON blocks. These are global variables, and if they are used, the code is most likely not thread-safe. The ARPACK webpage, http://www.caam.rice.edu/software/ARPACK/, says that there is a parallel version; it seems likely that that version is thread-safe.

Solvolysis answered 8/10, 2010 at 13:4 Comment(1)
Parallel ARPACK is parallelized with MPI; it may or may not be thread-safe. Heinie