Reorder vector using a vector of indices [duplicate]
Asked Answered
S

13

45

I'd like to reorder the items in a vector, using another vector to specify the order:

char   A[]     = { 'a', 'b', 'c' };
size_t ORDER[] = { 1, 0, 2 };

vector<char>   vA(A, A + sizeof(A) / sizeof(*A));
vector<size_t> vOrder(ORDER, ORDER + sizeof(ORDER) / sizeof(*ORDER));

reorder_naive(vA, vOrder);
// A is now { 'b', 'a', 'c' }

The following is an inefficient implementation that requires copying the vector:

void reorder_naive(vector<char>& vA, const vector<size_t>& vOrder)  
{   
    assert(vA.size() == vOrder.size());  
    vector vCopy = vA; // Can we avoid this?  
    for(int i = 0; i < vOrder.size(); ++i)  
        vA[i] = vCopy[ vOrder[i] ];  
}  

Is there a more efficient way, for example, that uses swap()?

Sawmill answered 8/5, 2009 at 5:35 Comment(4)
Don't use all caps in your function names. Or anything, for that matter. Only #define's can get away with that.Sherard
There is an ambiguity about the content of ORDER. Is it the index at which the corresponding letter should be stored, or is it the index of the letter that should be stored at this position ? Both interpretation would be correct with the given example, though they are different.Spiritless
I had in mind the latter, but both interpretations give exactly the same result.Sawmill
To anybody looking for an efficient version of reorder_naive above, DO NOT use the solutions proposed below. They calculate the first, and not the latter interpretation(see comments above) of the question, but DO NOT provide the same result.Pleasance
F
32

This algorithm is based on chmike's, but the vector of reorder indices is const. This function agrees with his for all 11! permutations of [0..10]. The complexity is O(N^2), taking N as the size of the input, or more precisely, the size of the largest orbit.

See below for an optimized O(N) solution which modifies the input.

template< class T >
void reorder(vector<T> &v, vector<size_t> const &order )  {   
    for ( int s = 1, d; s < order.size(); ++ s ) {
        for ( d = order[s]; d < s; d = order[d] ) ;
        if ( d == s ) while ( d = order[d], d != s ) swap( v[s], v[d] );
    }
}

Here's an STL style version which I put a bit more effort into. It's about 47% faster (that is, almost twice as fast over [0..10]!) because it does all the swaps as early as possible and then returns. The reorder vector consists of a number of orbits, and each orbit is reordered upon reaching its first member. It's faster when the last few elements do not contain an orbit.

template< typename order_iterator, typename value_iterator >
void reorder( order_iterator order_begin, order_iterator order_end, value_iterator v )  {   
    typedef typename std::iterator_traits< value_iterator >::value_type value_t;
    typedef typename std::iterator_traits< order_iterator >::value_type index_t;
    typedef typename std::iterator_traits< order_iterator >::difference_type diff_t;
    
    diff_t remaining = order_end - 1 - order_begin;
    for ( index_t s = index_t(), d; remaining > 0; ++ s ) {
        for ( d = order_begin[s]; d > s; d = order_begin[d] ) ;
        if ( d == s ) {
            -- remaining;
            value_t temp = v[s];
            while ( d = order_begin[d], d != s ) {
                swap( temp, v[d] );
                -- remaining;
            }
            v[s] = temp;
        }
    }
}

And finally, just to answer the question once and for all, a variant which does destroy the reorder vector (filling it with -1's). For permutations of [0..10], It's about 16% faster than the preceding version. Because overwriting the input enables dynamic programming, it is O(N), asymptotically faster for some cases with longer sequences.

template< typename order_iterator, typename value_iterator >
void reorder_destructive( order_iterator order_begin, order_iterator order_end, value_iterator v )  {
    typedef typename std::iterator_traits< value_iterator >::value_type value_t;
    typedef typename std::iterator_traits< order_iterator >::value_type index_t;
    typedef typename std::iterator_traits< order_iterator >::difference_type diff_t;
    
    diff_t remaining = order_end - 1 - order_begin;
    for ( index_t s = index_t(); remaining > 0; ++ s ) {
        index_t d = order_begin[s];
        if ( d == (diff_t) -1 ) continue;
        -- remaining;
        value_t temp = v[s];
        for ( index_t d2; d != s; d = d2 ) {
            swap( temp, v[d] );
            swap( order_begin[d], d2 = (diff_t) -1 );
            -- remaining;
        }
        v[s] = temp;
    }
}
Frazer answered 12/8, 2009 at 18:26 Comment(13)
@Potatoswatter: yes, that's the way stackoverflow works..., never really understood the reasoning when one guy does the (vast majority of the) editing i. It's like they want to discourage you from improving your answer or something...Leontine
Do you have any example code showing that this works? I can't even compile the second version with gcc. And the first version is not reordering my vector correctly.Omari
@mangledorf: ideone.com/TsWbu - Note that the reorder vector contains the final location for each corresponding element in the data vector, which is different from a reorder vector that selects which initial data vector element should appear at each final location. The OP example was ambiguous in this regard. My code can probably be adjusted to work the other way, but I don't have time right now :v( .Frazer
It seems this algorithm (at least second version) does swaps even if the element is already on its place. So if you call it with order vector being 0, 1, 2, ... it will actually do lots of copying around.Ecclesiasticus
I'm not going to test this, but I'm pretty sure the optimization for non-moved objects (singular orbits) is to add if ( s == order[ s ] ) continue; at the beginning of the outer loop for the first code, or the same but substituting order_begin for order in the other two.Frazer
It is not working for me (see also @AlecJacobson comment).Erase
I've written some tests for this and none of these actually return the correct results.Evildoer
@Evildoer Does it disagree with chmike’s solution? See above comment regarding “send-to” indices instead of “fetch-from”Frazer
BEWARE: This answer is incorrect as it answers a different question to the one asked.Karlykarlyn
It looks as though for ( d = order_begin[s]; d > s; d = order_begin[d] ) ; could induce an O(N^2) behaviour.Johathan
@Johathan That’s correct. For the algorithm to work without O(N) additional memory, it needs to determine which swaps were already done, using the given resources. Without modifying the input, that determination takes O(N) time. The pessimal case rotates the whole vector forward by one position. The third solution avoids this cost by caching the result in the given memory. So “16% faster on average” isn’t a very good description of the advantage. It’s asymptotically faster on forward rotations.Frazer
@Johathan Thanks for your edit. I reverted part of it because the best way to call swap is using ADL, without the std:: qualifier. Also, I updated the text. As my first SO post, it's quite old!Frazer
@Potatoswatter: Good call with the ADL.Johathan
S
11

In-place reordering of vector

Warning: there is an ambiguity about the semantic what the ordering-indices mean. Both are answered here

move elements of vector to the position of the indices

Interactive version here.

#include <iostream>
#include <vector>
#include <assert.h>

using namespace std;

void REORDER(vector<double>& vA, vector<size_t>& vOrder)  
{   
    assert(vA.size() == vOrder.size());

    // for all elements to put in place
    for( int i = 0; i < vA.size() - 1; ++i )
    { 
        // while the element i is not yet in place 
        while( i != vOrder[i] )
        {
            // swap it with the element at its final place
            int alt = vOrder[i];
            swap( vA[i], vA[alt] );
            swap( vOrder[i], vOrder[alt] );
        }
    }
}

int main()
{
    std::vector<double> vec {7, 5, 9, 6};
    std::vector<size_t> inds {1, 3,  0, 2};
    REORDER(vec, inds);
    for (size_t vv = 0; vv < vec.size(); ++vv)
    {
        std::cout << vec[vv] << std::endl;
    }
    return 0;
}

output

9
7
6
5

note that you can save one test because if n-1 elements are in place the last nth element is certainly in place.

On exit vA and vOrder are properly ordered.

This algorithm performs at most n-1 swapping because each swap moves the element to its final position. And we'll have to do at most 2N tests on vOrder.

draw the elements of vector from the position of the indices

Try it interactively here.

#include <iostream>
#include <vector>
#include <assert.h>

template<typename T>
void reorder(std::vector<T>& vec, std::vector<size_t> vOrder)
{
    assert(vec.size() == vOrder.size());
            
    for( size_t vv = 0; vv < vec.size() - 1; ++vv )
    {
            if (vOrder[vv] == vv)
            {
                continue;
            }
            size_t oo;
            for(oo = vv + 1; oo < vOrder.size(); ++oo)
            {
                if (vOrder[oo] == vv)
                {
                    break;
                }
            }
            std::swap( vec[vv], vec[vOrder[vv]] );
            std::swap( vOrder[vv], vOrder[oo] );
    }
}

int main()
{
    std::vector<double> vec {7, 5, 9, 6};
    std::vector<size_t> inds {1, 3,  0, 2};
    reorder(vec, inds);
    for (size_t vv = 0; vv < vec.size(); ++vv)
    {
        std::cout << vec[vv] << std::endl;
    }
    return 0;
}

Output

5
6
7
9
Spiritless answered 8/5, 2009 at 8:26 Comment(8)
Your approach of updating the order vector really improves the code, but do you really need the internal while loop? I believe you don't with the following rationale: In the first step of the algorithm you do not need the while loop. It will work in just one step. The fancy part is that you are actually performing a reduction of the original problem.Jame
@dribeas this is unfortunately not true but what I also thought in the first place. Try with the sequence 2 3 1 0 you'll see that 1 and 0 wont be in the correct place.Spiritless
it's broken, vA[i] ends up containing vA[k] where k is the last index in a cycleSusanasusanetta
Could you be more explicit ? What cycle are you referring to ? The only error I see in my code is that int should be size_t.Spiritless
The only error I see in my code is that I should have used size_t instead of int for i and alt. The code is correct and is fast but at the cost of modifying vOrder. If vOrder can't be modified, which was not specified, then a slightly different algorithm must be used.Spiritless
Sorry for updating an old thread, but this didn't work for me. Instead of checking and using i and vOrder[i], I had to use vOrder[i] and vOrder[vOrder[i]]. I added an answer below.Classicist
To clarify it appears to me that vOrder already contains a set of indexes in the desired order (as opposed to duplicating the steps it take to recreate vOrder), so I added an answer that handles this case.Classicist
As I commented in the question, there is an ambiguity about vOrder semantic.Spiritless
C
6

It appears to me that vOrder contains a set of indexes in the desired order (for example the output of sorting by index). The code example here follows the "cycles" in vOrder, where following a sub-set (could be all of vOrder) of indexes will cycle through the sub-set, ending back at the first index of the sub-set.

Wiki article on "cycles"

https://en.wikipedia.org/wiki/Cyclic_permutation

In the following example, every swap places at least one element in it's proper place. This code example effectively reorders vA according to vOrder, while "unordering" or "unpermuting" vOrder back to its original state (0 :: n-1). If vA contained the values 0 through n-1 in order, then after reorder, vA would end up where vOrder started.

template <class T>
void reorder(vector<T>& vA, vector<size_t>& vOrder)  
{   
    assert(vA.size() == vOrder.size());

    // for all elements to put in place
    for( size_t i = 0; i < vA.size(); ++i )
    { 
        // while vOrder[i] is not yet in place 
        // every swap places at least one element in it's proper place
        while(       vOrder[i] !=   vOrder[vOrder[i]] )
        {
            swap( vA[vOrder[i]], vA[vOrder[vOrder[i]]] );
            swap(    vOrder[i],     vOrder[vOrder[i]] );
        }
    }
}

This can also be implemented a bit more efficiently using moves instead swaps. A temp object is needed to hold an element during the moves. Example C code, reorders A[] according to indexes in I[], also sorts I[] :

void reorder(int *A, int *I, int n)
{    
int i, j, k;
int tA;
    /* reorder A according to I */
    /* every move puts an element into place */
    /* time complexity is O(n) */
    for(i = 0; i < n; i++){
        if(i != I[i]){
            tA = A[i];
            j = i;
            while(i != (k = I[j])){
                A[j] = A[k];
                I[j] = j;
                j = k;
            }
            A[j] = tA;
            I[j] = j;
        }
    }
}
Classicist answered 4/3, 2014 at 21:22 Comment(2)
sizeof(A) is sizeof(int*) and sizeof(A[0]) is sizeof (int)Partake
@Partake - answer is fixed now.Classicist
S
3

If it is ok to modify the ORDER array then an implementation that sorts the ORDER vector and at each sorting operation also swaps the corresponding values vector elements could do the trick, I think.

Septimal answered 8/5, 2009 at 6:43 Comment(0)
J
3

A survey of existing answers

You ask if there is "a more efficient way". But what do you mean by efficient and what are your requirements?

Potatoswatter's answer works in O(N²) time with O(1) additional space and doesn't mutate the reordering vector.

chmike and rcgldr give answers which use O(N) time with O(1) additional space, but they achieve this by mutating the reordering vector.

Your original answer allocates new space and then copies data into it while Tim MB suggests using move semantics. However, moving still requires a place to move things to and an object like an std::string has both a length variable and a pointer. In other words, a move-based solution requires O(N) allocations for any objects and O(1) allocations for the new vector itself. I explain why this is important below.

Preserving the reordering vector

We might want that reordering vector! Sorting costs O(N log N). But, if you know you'll be sorting several vectors in the same way, such as in a Structure of Arrays (SoA) context, you can sort once and then reuse the results. This can save a lot of time.

You might also want to sort and then unsort data. Having the reordering vector allows you to do this. A use case here is for performing genomic sequencing on GPUs where maximal speed efficiency is obtained by having sequences of similar lengths processed in batches. We cannot rely on the user providing sequences in this order so we sort and then unsort.

So, what if we want the best of all worlds: O(N) processing without the costs of additional allocation but also without mutating our ordering vector (which we might, after all, want to reuse)? To find that world, we need to ask:

Why is extra space bad?

There are two reasons you might not want to allocate additional space.

The first is that you don't have much space to work with. This can occur in two situations: you're on an embedded device with limited memory. Usually this means you're working with small datasets, so the O(N²) solution is probably fine here. But it can also happen when you are working with really large datasets. In this case O(N²) is unacceptable and you have to use one of the O(N) mutating solutions.

The other reason extra space is bad is because allocation is expensive. For smaller datasets it can cost more than the actual computation. Thus, one way to achieve efficiency is to eliminate allocation.

Outline

When we mutate the ordering vector we are doing so as a way to indicate whether elements are in their permuted positions. Rather than doing this, we could use a bit-vector to indicate that same information. However, if we allocate the bit vector each time that would be expensive.

Instead, we could clear the bit vector each time by resetting it to zero. However, that incurs an additional O(N) cost per function use.

Rather, we can store a "version" value in a vector and increment this on each function use. This gives us O(1) access, O(1) clear, and an amoritzed allocation cost. This works similarly to a persistent data structure. The downside is that if we use an ordering function too often the version counter needs to be reset, though the O(N) cost of doing so is amortized.

This raises the question: what is the optimal data type for the version vector? A bit-vector maximizes cache utilization but requires a full O(N) reset after each use. A 64-bit data type probably never needs to be reset, but has poor cache utilization. Experimenting is the best way to figure this out.

Two types of permutations

We can view an ordering vector as having two senses: forward and backward. In the forward sense, the vector tell us where elements go to. In the backward sense, the vector tells us where elements are coming from. Since the ordering vector is implicitly a linked list, the backward sense requires O(N) additional space, but, again, we can amortize the allocation cost. Applying the two senses sequentially brings us back to our original ordering.

Performance

Running single-threaded on my "Intel(R) Xeon(R) E-2176M CPU @ 2.70GHz", the following code takes about 0.81ms per reordering for sequences 32,767 elements long.

Code

Fully commented code for both senses with tests:

#include <algorithm>
#include <cassert>
#include <random>
#include <stack>
#include <stdexcept>
#include <vector>

///@brief Reorder a vector by moving its elements to indices indicted by another 
///       vector. Takes O(N) time and O(N) space. Allocations are amoritzed.
///
///@param[in,out] values   Vector to be reordered
///@param[in]     ordering A permutation of the vector
///@param[in,out] visited  A black-box vector to be reused between calls and
///                        shared with with `backward_reorder()`
template<class ValueType, class OrderingType, class ProgressType>
void forward_reorder(
  std::vector<ValueType>          &values,
  const std::vector<OrderingType> &ordering,
  std::vector<ProgressType>       &visited
){
  if(ordering.size()!=values.size()){
    throw std::runtime_error("ordering and values must be the same size!");
  }

  //Size the visited vector appropriately. Since vectors don't shrink, this will
  //shortly become large enough to handle most of the inputs. The vector is 1
  //larger than necessary because the first element is special.
  if(visited.empty() || visited.size()-1<values.size());
    visited.resize(values.size()+1);

  //If the visitation indicator becomes too large, we reset everything. This is
  //O(N) expensive, but unlikely to occur in most use cases if an appropriate
  //data type is chosen for the visited vector. For instance, an unsigned 32-bit
  //integer provides ~4B uses before it needs to be reset. We subtract one below
  //to avoid having to think too much about off-by-one errors. Note that
  //choosing the biggest data type possible is not necessarily a good idea!
  //Smaller data types will have better cache utilization.
  if(visited.at(0)==std::numeric_limits<ProgressType>::max()-1)
    std::fill(visited.begin(), visited.end(), 0);

  //We increment the stored visited indicator and make a note of the result. Any
  //value in the visited vector less than `visited_indicator` has not been
  //visited.
  const auto visited_indicator = ++visited.at(0);

  //For doing an early exit if we get everything in place
  auto remaining = values.size();

  //For all elements that need to be placed
  for(size_t s=0;s<ordering.size() && remaining>0;s++){
    assert(visited[s+1]<=visited_indicator);

    //Ignore already-visited elements
    if(visited[s+1]==visited_indicator)
      continue;

    //Don't rearrange if we don't have to
    if(s==visited[s])
      continue;

    //Follow this cycle, putting elements in their places until we get back
    //around. Use move semantics for speed.
    auto temp = std::move(values[s]);
    auto i = s;
    for(;s!=(size_t)ordering[i];i=ordering[i],--remaining){
      std::swap(temp, values[ordering[i]]);
      visited[i+1] = visited_indicator;
    }
    std::swap(temp, values[s]);
    visited[i+1] = visited_indicator;
  }
}



///@brief Reorder a vector by moving its elements to indices indicted by another 
///       vector. Takes O(2N) time and O(2N) space. Allocations are amoritzed.
///
///@param[in,out] values   Vector to be reordered
///@param[in]     ordering A permutation of the vector
///@param[in,out] visited  A black-box vector to be reused between calls and
///                        shared with with `forward_reorder()`
template<class ValueType, class OrderingType, class ProgressType>
void backward_reorder(
  std::vector<ValueType>          &values,
  const std::vector<OrderingType> &ordering,
  std::vector<ProgressType>       &visited
){
  //The orderings form a linked list. We need O(N) memory to reverse a linked
  //list. We use `thread_local` so that the function is reentrant.
  thread_local std::stack<OrderingType> stack;

  if(ordering.size()!=values.size()){
    throw std::runtime_error("ordering and values must be the same size!");
  }

  //Size the visited vector appropriately. Since vectors don't shrink, this will
  //shortly become large enough to handle most of the inputs. The vector is 1
  //larger than necessary because the first element is special.
  if(visited.empty() || visited.size()-1<values.size());
    visited.resize(values.size()+1);

  //If the visitation indicator becomes too large, we reset everything. This is
  //O(N) expensive, but unlikely to occur in most use cases if an appropriate
  //data type is chosen for the visited vector. For instance, an unsigned 32-bit
  //integer provides ~4B uses before it needs to be reset. We subtract one below
  //to avoid having to think too much about off-by-one errors. Note that
  //choosing the biggest data type possible is not necessarily a good idea!
  //Smaller data types will have better cache utilization.
  if(visited.at(0)==std::numeric_limits<ProgressType>::max()-1)
    std::fill(visited.begin(), visited.end(), 0);

  //We increment the stored visited indicator and make a note of the result. Any
  //value in the visited vector less than `visited_indicator` has not been
  //visited.  
  const auto visited_indicator = ++visited.at(0);

  //For doing an early exit if we get everything in place
  auto remaining = values.size();

  //For all elements that need to be placed
  for(size_t s=0;s<ordering.size() && remaining>0;s++){
    assert(visited[s+1]<=visited_indicator);

    //Ignore already-visited elements
    if(visited[s+1]==visited_indicator)
      continue;

    //Don't rearrange if we don't have to
    if(s==visited[s])
      continue;

    //The orderings form a linked list. We need to follow that list to its end
    //in order to reverse it.
    stack.emplace(s);
    for(auto i=s;s!=(size_t)ordering[i];i=ordering[i]){
      stack.emplace(ordering[i]);
    }

    //Now we follow the linked list in reverse to its beginning, putting
    //elements in their places. Use move semantics for speed.
    auto temp = std::move(values[s]);
    while(!stack.empty()){
      std::swap(temp, values[stack.top()]);
      visited[stack.top()+1] = visited_indicator;
      stack.pop();
      --remaining;
    }
    visited[s+1] = visited_indicator;
  }
}



int main(){
  std::mt19937 gen;
  std::uniform_int_distribution<short> value_dist(0,std::numeric_limits<short>::max());
  std::uniform_int_distribution<short> len_dist  (0,std::numeric_limits<short>::max());
  std::vector<short> data;
  std::vector<short> ordering;
  std::vector<short> original;

  std::vector<size_t> progress;

  for(int i=0;i<1000;i++){
    const int len = len_dist(gen);
    data.clear();
    ordering.clear();
    for(int i=0;i<len;i++){
      data.push_back(value_dist(gen));
      ordering.push_back(i);
    }

    original = data;

    std::shuffle(ordering.begin(), ordering.end(), gen);

    forward_reorder(data, ordering, progress);

    assert(original!=data);

    backward_reorder(data, ordering, progress);

    assert(original==data);
  }  
}
Johathan answered 25/7, 2020 at 0:19 Comment(0)
M
1

Never prematurely optimize. Meassure and then determine where you need to optimize and what. You can end with complex code that is hard to maintain and bug-prone in many places where performance is not an issue.

With that being said, do not early pessimize. Without changing the code you can remove half of your copies:

    template <typename T>
    void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
    {
       std::vector<T> tmp;         // create an empty vector
       tmp.reserve( data.size() ); // ensure memory and avoid moves in the vector
       for ( std::size_t i = 0; i < order.size(); ++i ) {
          tmp.push_back( data[order[i]] );
       }
       data.swap( tmp );          // swap vector contents
    }

This code creates and empty (big enough) vector in which a single copy is performed in-order. At the end, the ordered and original vectors are swapped. This will reduce the copies, but still requires extra memory.

If you want to perform the moves in-place, a simple algorithm could be:

template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
{
   for ( std::size_t i = 0; i < order.size(); ++i ) {
      std::size_t original = order[i];
      while ( i < original )  {
         original = order[original];
      }
      std::swap( data[i], data[original] );
   }
}

This code should be checked and debugged. In plain words the algorithm in each step positions the element at the i-th position. First we determine where the original element for that position is now placed in the data vector. If the original position has already been touched by the algorithm (it is before the i-th position) then the original element was swapped to order[original] position. Then again, that element can already have been moved...

This algorithm is roughly O(N^2) in the number of integer operations and thus is theoretically worse in performance time as compare to the initial O(N) algorithm. But it can compensate if the N^2 swap operations (worst case) cost less than the N copy operations or if you are really constrained by memory footprint.

Mayhap answered 8/5, 2009 at 7:41 Comment(2)
Your algorithm is optimal is vOrder can't be modified or if N is small. Yours may spend more time scanning vOrder but will only swap vA values. Mine swap vOrder and vA values which may end up to be more expensive than to rescan vOrder with small N values.Spiritless
Hm, shouldn't 'tmp.resize( data.size() );' be 'tmp.reserve(...)' as you're using 'tmp.push_back(...)' ? To keep 'resize', the insertion should be 'tmp[i] = ...'.Karolinekaroly
T
1

It's an interesting intellectual exercise to do the reorder with O(1) space requirement but in 99.9% of the cases the simpler answer will perform to your needs:

void permute(vector<T>& values, const vector<size_t>& indices)  
{   
    vector<T> out;
    out.reserve(indices.size());
    for(size_t index: indices)
    {
        assert(0 <= index && index < values.size());
        out.push_back(std::move(values[index]));
    }
    values = std::move(out);
}

Beyond memory requirements, the only way I can think of this being slower would be due to the memory of out being in a different cache page than that of values and indices.

Torpedo answered 26/9, 2017 at 21:27 Comment(0)
U
0

You could do it recursively, I guess - something like this (unchecked, but it gives the idea):

// Recursive function
template<typename T>
void REORDER(int oldPosition, vector<T>& vA, 
             const vector<int>& vecNewOrder, vector<bool>& vecVisited)
{
    // Keep a record of the value currently in that position,
    // as well as the position we're moving it to.
    // But don't move it yet, or we'll overwrite whatever's at the next
    // position. Instead, we first move what's at the next position.
    // To guard against loops, we look at vecVisited, and set it to true
    // once we've visited a position.
    T oldVal = vA[oldPosition];
    int newPos = vecNewOrder[oldPosition];
    if (vecVisited[oldPosition])
    {
        // We've hit a loop. Set it and return.
        vA[newPosition] = oldVal;
        return;
    }
    // Guard against loops:
    vecVisited[oldPosition] = true;

    // Recursively re-order the next item in the sequence.
    REORDER(newPos, vA, vecNewOrder, vecVisited);

    // And, after we've set this new value, 
    vA[newPosition] = oldVal;
}

// The "main" function
template<typename T>
void REORDER(vector<T>& vA, const vector<int>& newOrder)
{
    // Initialise vecVisited with false values
    vector<bool> vecVisited(vA.size(), false);

    for (int x = 0; x < vA.size(); x++)
    {
        REORDER(x, vA, newOrder, vecVisited);
    }
}

Of course, you do have the overhead of vecVisited. Thoughts on this approach, anyone?

Ulmer answered 8/5, 2009 at 6:4 Comment(2)
In a first read it seems to me as if this roughly ends up with the same memory and time costs. While you are not copying to another vector, you are copying to just as many local variables as elements in the vector. I would have to work harder on the interpretation to test for correctness.Jame
Yeah, that was my thought too. In fact, it would probably use up more space because of the stack frames, and take more time because of method-calling overhead. But it was a fun intellectual exercise anyway ;-PUlmer
C
0

Your code is broken. You cannot assign to vA and you need to use template parameters.

vector<char> REORDER(const vector<char>& vA, const vector<size_t>& vOrder)  
{   
    assert(vA.size() == vOrder.size());  
    vector<char> vCopy(vA.size()); 
    for(int i = 0; i < vOrder.size(); ++i)  
        vCopy[i] = vA[ vOrder[i] ];  
    return vA;
} 

The above is slightly more efficient.

Costanzo answered 8/5, 2009 at 6:40 Comment(2)
The OP wants to reorder vA vector. So the prototype should be: void REORDER(vector<char>& vA, const vector<size_t>& vOrder).Decline
returning a vector will end up with the result of having to create a copy, which, to my interpretation, is what the OP wants to avoid. Another optimization of your approach would be not creating a vector of size elements but rather reserving memory and using push_backs. This would remove the cost of default constructing all elements before you copy.Jame
B
0

It is not clear by the title and the question if the vector should be ordered with the same steps it takes to order vOrder or if vOrder already contains the indexes of the desired order. The first interpretation has already a satisfying answer (see chmike and Potatoswatter), I add some thoughts about the latter. If the creation and/or copy cost of object T is relevant

template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> & order )
{
 std::size_t i,j,k;
  for(i = 0; i < order.size() - 1; ++i) {
    j = order[i];
    if(j != i) {
      for(k = i + 1; order[k] != i; ++k);
      std::swap(order[i],order[k]);
      std::swap(data[i],data[j]);
    }
  }
}

If the creation cost of your object is small and memory is not a concern (see dribeas):

template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
{
 std::vector<T> tmp;         // create an empty vector
 tmp.reserve( data.size() ); // ensure memory and avoid moves in the vector
 for ( std::size_t i = 0; i < order.size(); ++i ) {
  tmp.push_back( data[order[i]] );
 }
 data.swap( tmp );          // swap vector contents
}

Note that the two pieces of code in dribeas answer do different things.

Brasilin answered 27/1, 2014 at 10:39 Comment(0)
C
0

I was trying to use @Potatoswatter's solution to sort multiple vectors by a third one and got really confused by output from using the above functions on a vector of indices output from Armadillo's sort_index. To switch from a vector output from sort_index (the arma_inds vector below) to one that can be used with @Potatoswatter's solution (new_inds below), you can do the following:

vector<int> new_inds(arma_inds.size());
for (int i = 0; i < new_inds.size(); i++) new_inds[arma_inds[i]] = i;
Crymotherapy answered 4/8, 2017 at 17:5 Comment(0)
M
0

I came up with this solution which has the space complexity of O(max_val - min_val + 1), but it can be integrated with std::sort and benefits from std::sort's O(n log n) decent time complexity.

std::vector<int32_t> dense_vec = {1, 2, 3};
std::vector<int32_t> order = {1, 0, 2};

int32_t max_val = *std::max_element(dense_vec.begin(), dense_vec.end());
std::vector<int32_t> sparse_vec(max_val + 1);

int32_t i = 0;
for(int32_t j: dense_vec)
{
    sparse_vec[j] = order[i];
    i++;
}

std::sort(dense_vec.begin(), dense_vec.end(),
    [&sparse_vec](int32_t i1, int32_t i2) {return sparse_vec[i1] < sparse_vec[i2];});

The following assumptions made while writing this code:

  • Vector values start from zero.
  • Vector does not contain repeated values.
  • We have enough memory to sacrifice in order to use std::sort
Melidamelilot answered 20/8, 2018 at 17:42 Comment(0)
D
-1

This should avoid copying the vector:

void REORDER(vector<char>& vA, const vector<size_t>& vOrder)  
{   
    assert(vA.size() == vOrder.size()); 
    for(int i = 0; i < vOrder.size(); ++i)
        if (i < vOrder[i])
            swap(vA[i], vA[vOrder[i]]);
}
Decline answered 8/5, 2009 at 6:54 Comment(3)
I believe this to be broken. Order: [ 2, 0, 1 ], first step will swap first and third element. In the second step i > order[i] and nothing is done, in the third step i > order[i] again and nothing is done.Jame
This code works with the example but wouldn't work with the vOrder sequence [3 0 1 2]. For i == 0 , 3 and 2 are swapped, then nothing is changed for subsequent i values.Spiritless
Thanks for pointing out the mistake. I should've checked more before posting the code. :(Decline

© 2022 - 2024 — McMap. All rights reserved.