Nifty way to iterate over parallel arrays in Java using foreach
Asked Answered
S

8

33

I've inherited a bunch of code that makes extensive use of parallel arrays to store key/value pairs. It actually made sense to do it this way, but it's sort of awkward to write loops that iterate over these values. I really like the new Java foreach construct, but it does not seem like there is a way to iterate over parallel lists using this.

With a normal for loop, I can do this easily:

for (int i = 0; i < list1.length; ++i) {
    doStuff(list1[i]);
    doStuff(list2[i]);
}

But in my opinion this is not semantically pure, since we are not checking the bounds of list2 during iteration. Is there some clever syntax similar to the for-each that I can use with parallel lists?

Sanchez answered 4/4, 2011 at 23:42 Comment(8)
Re: semantic purity -- you could check the list2 bounds during iteration ((i < list1.length) && (i < list2.length)) or if you know the lists will be unmodified during iteration you could check right before the loop to see if list1 and list2 have equal lengths, in which case you can get away with not checking the bounds of both during iteration with a clear conscience.Snafu
If a thread modifies list2 but not list1, then I'm screwed.Sanchez
Well like I said -- "if you know the lists will be unmodified during iteration". Also, that would leave you no worse off than you are now. And if you have to worry about multiple threads hitting these lists then you have more worries than just this loop.Snafu
I'm not sure to what use cases my code may be adapted in the future. I'd prefer to not introduce a potentially very hard-to-find bug.Sanchez
True. But if list1 and list2 can be modified by another thread during iteration (either by changing elements in the existing arrays or by making the variables refer to different arrays entirely) then it doesn't matter how you write the loop or how you check the bounds -- you're in potential trouble here -- and anywhere else you use those arrays.Snafu
foreach checks bounds automatically, so if a list changes during iteration, I don't get an outofbounds exception. Again, I am not using multiple threads, but someone in the future, ignorant of how my functions are implemented, may run into this and it'll be a huge pain to track down.Sanchez
similar question dealing with collections in general, not arrays in particluar: stackoverflow.com/questions/3137944Alidus
+1 for cool header :-)Goldshell
L
23

I would use a Map myself. But taking you at your word that a pair of arrays makes sense in your case, how about a utility method that takes your two arrays and returns an Iterable wrapper?

Conceptually:

for (Pair<K,V> p : wrap(list1, list2)) {
    doStuff(p.getKey());
    doStuff(p.getValue());
}

The Iterable<Pair<K,V>> wrapper would hide the bounds checking.

Leffler answered 4/4, 2011 at 23:50 Comment(3)
The old Pair to solve everything. Didn't think about that, nice. Isn't there a JSR somewhere begging for this to be included in the language?Sanchez
Be careful with this answer. What happen if the first list has duplicated values? The resulting Map will override the values of those keys. So don't use this approach if your list may contain duplicated valuesFile
@File You are correct that a Map cannot contain duplicate keys. My answer does not actually require the use of a Map.Leffler
G
12

From the official Oracle page on the enhanced for loop:

Finally, it is not usable for loops that must iterate over multiple collections in parallel. These shortcomings were known by the designers, who made a conscious decision to go with a clean, simple construct that would cover the great majority of cases.

Basically, you're best off using the normal for loop.

If you're using these pairs of arrays to simulate a Map, you could always write a class that implements the Map interface with the two arrays; this could let you abstract away much of the looping.

Without looking at your code, I cannot tell you whether this option is the best way forward, but it is something you could consider.

Grappling answered 4/4, 2011 at 23:50 Comment(0)
P
10

This was a fun exercise. I created an object called ParallelList that takes a variable number of typed lists, and can iterate over the values at each index (returned as a list of values):

public class ParallelList<T> implements Iterable<List<T>> {

    private final List<List<T>> lists;

    public ParallelList(List<T>... lists) {
        this.lists = new ArrayList<List<T>>(lists.length);
        this.lists.addAll(Arrays.asList(lists));
    }

    public Iterator<List<T>> iterator() {
        return new Iterator<List<T>>() {
            private int loc = 0;

            public boolean hasNext() {
                boolean hasNext = false;
                for (List<T> list : lists) {
                    hasNext |= (loc < list.size());
                }
                return hasNext;
            }

            public List<T> next() {
                List<T> vals = new ArrayList<T>(lists.size());
                for (int i=0; i<lists.size(); i++) {
                    vals.add(loc < lists.get(i).size() ? lists.get(i).get(loc) : null);
                }
                loc++;
                return vals;
            }

            public void remove() {
                for (List<T> list : lists) {
                    if (loc < list.size()) {
                        list.remove(loc);
                    }
                }
            }
        };
    }
}

Example usage:

List<Integer> list1 = Arrays.asList(new Integer[] {1, 2, 3, 4, 5});
List<Integer> list2 = Arrays.asList(new Integer[] {6, 7, 8});
ParallelList<Integer> list = new ParallelList<Integer>(list1, list2);
for (List<Integer> ints : list) {
    System.out.println(String.format("%s, %s", ints.get(0), ints.get(1)));
}

Which would print out:

1, 6
2, 7
3, 8
4, null
5, null

This object supports lists of variable lengths, but clearly it could be modified to be more strict.

Unfortunately I couldn't get rid of one compiler warning on the ParallelList constructor: A generic array of List<Integer> is created for varargs parameters, so if anyone knows how to get rid of that, let me know :)

Pacificism answered 5/4, 2011 at 0:22 Comment(0)
M
6

You can use a second constraint in your for loop:

    for (int i = 0; i < list1.length && i < list2.length; ++i) 
    {
      doStuff(list1[i]);
      doStuff(list2[i]);
    }//for

One of my preferred methods for traversing collections is the for-each loop, but as the oracle tutorial mentions, when dealing with parallel collections to use the iterator rather than the for-each.

The following was an answer by Martin v. Löwis in a similar post:

it1 = list1.iterator();
it2 = list2.iterator();
while(it1.hasNext() && it2.hasNext()) 
{
   value1 = it1.next();
   value2 = it2.next();

   doStuff(value1);
   doStuff(value2);
}//while

The advantage to the iterator is that it's generic so if you don't know what the collections are being used, use the iterator, otherwise if you know what your collections are then you know the length/size functions and so the regular for-loop with the additional constraint can be used here. (Note I'm being very plural in this post as an interesting possibility would be where the collections used are different e.g. one could be a List and the other an array for instance)

Hope this helped.

Mytilene answered 29/11, 2011 at 12:46 Comment(0)
I
1

With Java 8, I use these to loop in the sexy way:

//parallel loop
public static <A, B> void loop(Collection<A> a, Collection<B> b, IntPredicate intPredicate, BiConsumer<A, B> biConsumer) {
    Iterator<A> ait = a.iterator();
    Iterator<B> bit = b.iterator();
    if (ait.hasNext() && bit.hasNext()) {
        for (int i = 0; intPredicate.test(i); i++) {
            if (!ait.hasNext()) {
                ait = a.iterator();
            }
            if (!bit.hasNext()) {
                bit = b.iterator();
            }
            biConsumer.accept(ait.next(), bit.next());
        }
    }
}

//nest loop
public static <A, B> void loopNest(Collection<A> a, Collection<B> b, BiConsumer<A, B> biConsumer) {
    for (A ai : a) {
        for (B bi : b) {
            biConsumer.accept(ai, bi);
        }
    }
}

Some example, with these 2 lists:

List<Integer> a = Arrays.asList(1, 2, 3);
List<String> b = Arrays.asList("a", "b", "c", "d");

Loop within min size of a and b:

loop(a, b, i -> i < Math.min(a.size(), b.size()), (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Output:

1 -> a
2 -> b
3 -> c

Loop within max size of a and b (elements in shorter list will be cycled):

loop(a, b, i -> i < Math.max(a.size(), b.size()), (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Output:

1 -> a
2 -> b
3 -> c
1 -> d

Loop n times ((elements will be cycled if n is bigger than sizes of lists)):

loop(a, b, i -> i < 5, (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Output:

1 -> a
2 -> b
3 -> c
1 -> d
2 -> a

Loop forever:

loop(a, b, i -> true, (x, y) -> {
    System.out.println(x +  " -> " + y);
});

Apply to your situation:

loop(list1, list2, i -> i < Math.min(a.size(), b.size()), (e1, e2) -> {
    doStuff(e1);
    doStuff(e2);
});
Indulgent answered 13/12, 2016 at 3:42 Comment(0)
P
0

Simple answer: No.

You want sexy iteration and Java byte code? Check out Scala: Scala for loop over two lists simultaneously

Disclaimer: This is indeed a "use another language" answer. Trust me, I wish Java had sexy parallel iteration, but no one started developing in Java because they want sexy code.

Palacio answered 11/11, 2014 at 15:38 Comment(3)
Hey Joe! We keep running into each other :) Of course "use a different language" is a solution to the problem, but is not an answer to a question which includes the prepositional phrase "in Java".Sanchez
@TravisWebb Oh, didn't notice it was you! Actual answer to the posit was "No." Other language was a suggestion for sexy iteration. :) Hopefully they'll be a nice Java 8 lambda expression answer soon.Palacio
Yea this question is pretty old at this point. Probably some more modern answers are called for.Sanchez
M
0

ArrayIterator lets you avoid indexing, but you can’t use a for-each loop without writing a separate class or at least function. As @Alexei Blue remarks, official recommendation (at The Collection Interface) is: “Use Iterator instead of the for-each construct when you need to: … Iterate over multiple collections in parallel.”:

import static com.google.common.base.Preconditions.checkArgument;
import org.apache.commons.collections.iterators.ArrayIterator;

// …

  checkArgument(array1.length == array2.length);
  Iterator it1 = ArrayIterator(array1);
  Iterator it2 = ArrayIterator(array2);
  while (it1.hasNext()) {
      doStuff(it1.next());
      doOtherStuff(it2.next());
  }

However:

  • Indexing is natural for arrays – an array is by definition something you index, and a numerical for loop, as in your original code, is perfectly natural and more direct.
  • Key-value pairs naturally form a Map, as @Isaac Truett remarks, so cleanest would be to create maps for all your parallel arrays (so this loop would only be in the factory function that creates the maps), though this would be inefficient if you just want to iterate over them. (Use Multimap if you need to support duplicates.)
  • If you have a lot of these, you could (partially) implement ParallelArrayMap<> (i.e., a map backed by parallel arrays), or maybe ParallelArrayHashMap<> (to add a HashMap if you want efficient lookup by key), and use that, which allows iteration in the original order. This is probably overkill though, but allows a sexy answer.

That is:

Map<T, U> map = new ParallelArrayMap<>(array1, array2);
for (Map.Entry<T, U> entry : map.entrySet()) {
  doStuff(entry.getKey());
  doOtherStuff(entry.getValue());
}

Philosophically, Java style is to have explicit, named types, implemented by classes. So when you say “[I have] parallel arrays [that] store key/value pairs.”, Java replies “Write a ParallelArrayMap class that implements Map (key/value pairs) and that has a constructor that takes parallel arrays, and then you can use entrySet to return a Set that you can iterate over, since Set implements Collection.” – make the structure explicit in a type, implemented by a class.

For iterating over two parallel collections or arrays, you want to iterate over a Iterable<Pair<T, U>>, which less explicit languages allow you to create with zip (which @Isaac Truett calls wrap). This is not idiomatic Java, however – what are the elements of the pair? See Java: How to write a zip function? What should be the return type? for an extensive discussion of how to write this in Java and why it’s discouraged.

This is exactly the stylistic tradeoff Java makes: you know exactly what type everything is, and you have to specify and implement it.

Mucosa answered 16/6, 2015 at 1:40 Comment(0)
P
-1
//Do you think I'm sexy?
if(list1.length == list2.length){
    for (int i = 0; i < list1.length; ++i) {
        doStuff(list1[i]);
        doStuff(list2[i]);
    }
}
Pomeranian answered 4/4, 2011 at 23:48 Comment(2)
And everything mysteriously breaks when list1.length != list2.length.Leffler
See my comment in response to @SnafuSanchez

© 2022 - 2024 — McMap. All rights reserved.