Using 2d array vs array of derived type in Fortran 90

Suppose you want a list of arrays, each of the same size. Is it better, performance-wise, to use a 2D array:

integer, allocatable :: data(:,:)

or an array of derived types:

type test
    integer, allocatable :: content(:)
end type
type(test), allocatable :: data(:)

Of course, for arrays of different sizes, we don't have a choice. But how is memory managed in the two cases? Also, is one of them better coding practice?

Lymphosarcoma answered 12/8, 2013 at 12:43 Comment(0)

In general, you want to use the simplest data structure that suits your problem. If a 2d rectangular array meets your needs - and for a huge number of scientific computing problems, problems for which Fortran is a good choice, it does - then that's the choice you want.

The 2d array will be contiguous in memory, which will normally make accessing it faster, both because of caching and because there is one fewer level of indirection. The 2d array also lets you write things like data = data * 2 or data = 0, which the array-of-arrays approach doesn't [Edited to add: though, as IanH points out in the comments, you can create a defined type and defined operations on that type to allow this]. Those advantages are great enough that even when you have "ragged arrays", if the range of expected row lengths isn't that large, implementing them as a rectangular 2d array is sometimes a choice worth considering.
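As a minimal sketch of the difference: whole-array expressions work directly on the 2d array, while the array of derived types needs either an explicit loop or a user-defined operator such as the one below. The type `test` and its component `content` come from the question; the module name `scaled_lists`, the helper `scale_list`, the variable names `grid` and `list`, and the sizes are made up for illustration. It also assumes a Fortran 2003 compiler, since allocatable components and allocation on assignment are not part of Fortran 90.

```fortran
module scaled_lists
    implicit none

    type test
        integer, allocatable :: content(:)
    end type

    ! Hypothetical defined operation: lets "list * k" work on an array of test
    interface operator(*)
        module procedure scale_list
    end interface

contains

    function scale_list(list, k) result(res)
        type(test), intent(in) :: list(:)
        integer, intent(in)    :: k
        type(test)             :: res(size(list))
        integer :: i

        do i = 1, size(list)
            ! Allocation on assignment (Fortran 2003) sizes res(i)%content
            res(i)%content = list(i)%content * k
        end do
    end function

end module

program compare_ops
    use scaled_lists
    implicit none

    integer, allocatable    :: grid(:,:)
    type(test), allocatable :: list(:)
    integer :: i

    ! 2d array: one allocation, whole-array operations are intrinsic
    allocate(grid(3, 4))
    grid = 1
    grid = grid * 2

    ! Array of derived types: one allocation per element
    allocate(list(4))
    do i = 1, size(list)
        allocate(list(i)%content(3))
        list(i)%content = 1
    end do

    ! Compiles only because of the defined operator above
    list = list * 2

    ! Check that both structures now hold 2 everywhere
    print *, all(grid == 2)
    do i = 1, size(list)
        print *, all(list(i)%content == 2)
    end do
end program
```

Note that the defined operator builds a full temporary copy of the list before the assignment, which is one reason the built-in 2d form is likely to be cheaper.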

Glorify answered 12/8, 2013 at 12:51 Comment(2)

Consider the capabilities offered by defined operations with respect to your comment about multiplication and assignment to arrays. – Chuddar 12/8, 2013 at 22:40

Fair enough, but it remains true that using built-in 2d arrays gets you those (a) faster, (b) with probably fewer temporaries (e.g., d=a*b+c), (c) with slicing in both dimensions at the same time, and (d) all for free. It will still make more sense to use arrays-of-arrays in some cases, but if you don't need that extra generality, using the simpler case probably makes sense. – Glorify 13/8, 2013 at 15:23
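To make the slicing point in the comment above concrete, here is a small sketch (variable names and sizes are illustrative, and a Fortran 2003 compiler is assumed for the allocatable component). A 2d array can be sectioned along both dimensions in one expression. Taking the same entry from every element of the array of derived types needs a loop, because the standard does not allow a reference such as list(:)%content(2) when content is allocatable.

```fortran
program slicing_demo
    implicit none

    type test
        integer, allocatable :: content(:)
    end type

    integer    :: grid(3, 4)
    type(test) :: list(4)
    integer    :: row(4)
    integer    :: sub(2, 2)
    integer    :: i, j

    ! Fill both structures with the same values: 10*i + j
    do j = 1, 4
        allocate(list(j)%content(3))
        do i = 1, 3
            grid(i, j) = 10*i + j
            list(j)%content(i) = 10*i + j
        end do
    end do

    ! 2d array: section in both dimensions at once
    sub = grid(2:3, 1:2)

    ! Array of derived types: gather the 2nd entry of each element by hand
    do j = 1, 4
        row(j) = list(j)%content(2)
    end do

    ! The hand-gathered row should match the intrinsic section grid(2, :)
    print *, all(row == grid(2, :))
    print *, sub
end program
```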

Choose the implementation which minimises the conceptual distance that your mind has to leap between the problem in your head and the solution in your code. The force of this approach increases with age, both the age of your code (good conceptual design is a solid foundation for future development) and your own age (the less effort understanding your code demands the longer you'll remain mentally competent enough to understand it).

As to the non-opinion-determined part of your question concerning the way that the memory is managed ... My naive expectation is that most compilers will, under most circumstances, allocate contiguous memory for the first of your outlines, and may not for the second. But I don't care enough about this to check, and I do not think that you should either. I don't, by this, suggest that you should not be interested in what is going on under the hood, but rather that you should be more concerned with the matters referred to in the first paragraph.
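For readers who do want to look under the hood, the difference in allocation can be sketched as follows. Names and sizes are illustrative, and `is_contiguous` is a Fortran 2008 intrinsic, so this assumes a reasonably recent compiler. The 2d array is obtained with a single allocate and is one contiguous block. The array of derived types needs one allocate for the outer array plus one per element, and nothing requires those separate blocks to be adjacent in memory.

```fortran
program allocation_demo
    implicit none

    type test
        integer, allocatable :: content(:)
    end type

    integer, parameter :: n = 3, m = 4
    integer, allocatable    :: grid(:,:)
    type(test), allocatable :: list(:)
    integer :: j

    ! One allocation: n*m integers in a single block, column-major order
    allocate(grid(n, m))

    ! m + 1 allocations: the outer array, then each component separately
    allocate(list(m))
    do j = 1, m
        allocate(list(j)%content(n))
    end do

    ! An allocated allocatable array is contiguous as a whole
    print *, is_contiguous(grid)

    ! Each component is contiguous on its own; the standard says nothing
    ! about where the m components sit relative to one another
    print *, is_contiguous(list(1)%content)

    ! Deallocating list also releases every allocatable component it holds
    deallocate(grid)
    deallocate(list)
end program
```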

Outherod answered 12/8, 2013 at 12:52 Comment(2)

+1 - the more I think about this, "minimizing conceptual distance" is probably more important than some abstract measure of "simplest". – Glorify 13/8, 2013 at 15:31

That's precisely it! You have to find a balance between readability and performance... – Pisciform 15/2, 2014 at 15:40
