how do you insert the value in a sorted vector?

Asked 5/4, 2013 at 21:4 Answered 4/4, 2022 at 22:47

This question is a continuation of this one. I think that STL misses this functionality, but it just my IMHO.

Consider following code:

class Foo
{
public:
    Foo();
    int paramA, paramB;
    std::string name;
};

struct Sorter
{
    bool operator()(const Foo &foo1, const Foo &foo2) const
    {
         switch( paramSorter )
         {
             case 1:
                 return foo1.paramA < foo2.paramA;
             case 2:
                 return foo1.paramB < foo2.paramB;
             default:
                 return foo1.name < foo2.name;
         }
    }

    int paramSorter;
};

int main()
{
    std::vector<Foo> foo;
    Sorter sorter;
    sorter.paramSorter = 0;
        // fill the vector
    std::sort( foo.begin(), foo.end(), sorter );
}

At any given moment of time the vector can be re-sorted. The class also have the getter methods which are used in the sorter structure.

What would be the most efficient way to insert a new element in the vector?

Situation I have is:

I have a grid (spreadsheet), that uses the sorted vector of a class. At any given time the vector can be re-sorted and the grid will display the sorted data accordingly.

Now I will need to insert a new element in the vector/grid. I can insert, then re-sort and then re-display the whole grid, but this is very inefficient especially for the big grid.

Any help?

Ultrastructure answered 5/4, 2013 at 21:4 Comment(3)

I think you should use set, if you don't have duplicates. Or std::list otherwise. Vector doesn't seem fit for something that needs to be sorted frequently. – Herra 5/4, 2013 at 21:9

std::set is based on red-black tree, with reinsert complexity of O(logn). You might want to consider related tree structure for this insert into sorted array problem, like rb-tree, avl-tree and etc. – Idaho 5/4, 2013 at 21:51

@stardust_ sorting is fine, I think that main drawback is insertion when it leads to shift of elements and especially when it leads to the memory reallocation. Good choice will be dequeue - as it consolidates all good traits of vector(search) and list(insert), all depends on data properties and operations with it. – Crasis 5/4, 2013 at 22:43

116

The simple answer to the question:

template< typename T >
typename std::vector<T>::iterator 
   insert_sorted( std::vector<T> & vec, T const& item )
{
    return vec.insert
        ( 
            std::upper_bound( vec.begin(), vec.end(), item ),
            item 
        );
}

Version with a predicate.

template< typename T, typename Pred >
typename std::vector<T>::iterator
    insert_sorted( std::vector<T> & vec, T const& item, Pred pred )
{
    return vec.insert
        ( 
           std::upper_bound( vec.begin(), vec.end(), item, pred ),
           item 
        );
}

Where Pred is a strictly-ordered predicate on type T.

For this to work the input vector must already be sorted on this predicate.

The complexity of doing this is O(log N) for the upper_bound search (finding where to insert) but up to O(N) for the insert itself.

For a better complexity you could use std::set<T> if there are not going to be any duplicates or std::multiset<T> if there may be duplicates. These will retain a sorted order for you automatically and you can specify your own predicate on these too.

There are various other things you could do which are more complex, e.g. manage a vector and a set / multiset / sorted vector of newly added items then merge these in when there are enough of them. Any kind of iterating through your collection will need to run through both collections.

Using a second vector has the advantage of keeping your data compact. Here your "newly added" items vector will be relatively small so the insertion time will be O(M) where M is the size of this vector and might be more feasible than the O(N) of inserting in the big vector every time. The merge would be O(N+M) which is better than O(NM) it would be inserting one at a time, so in total it would be O(N+M) + O(M²) to insert M elements then merge.

You would probably keep the insertion vector at its capacity too, so as you grow that you will not be doing any reallocations, just moving of elements.

Tyndareus answered 27/8, 2014 at 9:55 Comment(2)

Note that logically there is no difference between using lower_bound and upper_bound as it only makes a difference when the element you are inserting already exists. If it does, upper_bound is slightly better because you are inserting it nearer the back which means shifting fewer elements. – Tyndareus 9/2, 2015 at 10:59

To be added (beyond original question): This applies for std::vector, for std::list it's just inverse... – Malevolent 14/4, 2022 at 10:28

If you need to keep the vector sorted all the time, first you might consider whether using std::set or std::multiset won't simplify your code.

If you really need a sorted vector and want to quickly insert an element into it, but do not want to enforce a sorting criterion to be satisfied all the time, then you can first use std::lower_bound() to find the position in a sorted range where the element should be inserted in logarithmic time, then use the insert() member function of vector to insert the element at that position.

If performance is an issue, consider benchmarking std::list vs std::vector. For small items, std::vector is known to be faster because of a higher cache hit rate, but the insert() operation itself is computationally faster on lists (no need to move elements around).

Insignificancy answered 5/4, 2013 at 21:14 Comment(11)

will I be able to re-sort std::set on a different sorting criteria? Especially the one that have non-unique values? – Ultrastructure 5/4, 2013 at 23:30

@Igor: No, you can't re-sort an std::set based on a different criterion. But you can move the content to a new std::set sorted with a different criterion. – Insignificancy 6/4, 2013 at 10:54

Thank you for this answer. I just googled "insert into sorted vector" and ended up using std::set. Also, a note to your remark concerning performance: std::vector is going to win in almost all cases: Stroustrup: Why you should avoid linked lists? – Confer 7/12, 2013 at 20:36

@AndyProwl (oh, and just so you know: you helped me with my... wait for it... programming homework! :P ) – Confer 7/12, 2013 at 20:45

@H2CO3: I'm proud of that :P – Insignificancy 7/12, 2013 at 20:55

Although it may have solved OP's specific problem, if you really need a sorted vector, then a set won't do. Your leading sentence should probably go as a suggestion to carefully consider whether a vector is really needed, at the end. – Myrt 27/8, 2014 at 9:16

You might want multiset as the user has not specified there will be no duplicates – Tyndareus 27/8, 2014 at 9:58

It should also be noted that both set and multiset does not invalidate references and iterators when inserting elements, vector might. – Embitter 11/6, 2017 at 11:2

sets suck because you can't modify a member of a set, you have to delete it & re-add :o) – Cleaning 31/5, 2018 at 20:25

My use case for std::vector over std::set is the better cache locality – Jaramillo 9/5, 2019 at 6:45

upper_bound is more appropriate than lower_bound because then you move fewer elements. – Frogman 25/1, 2020 at 20:34

Just a note, you can use upper_bound as well depending on your needs. upper_bound will assure new entries that are equivalent to others will appear at the end of their sequence, lower_bound will assure new entries equivalent to others will appear at the beginning of their sequence. Can be useful for certain implementations (maybe classes that can share a "position" but not all of their details!)

Both will assure you that the vector remains sorted according to < result of elements, although inserting into lower_bound will mean moving more elements.

Example:

insert 7 @ lower_bound of { 5, 7, 7, 9 } => { 5, *7*, 7, 7, 9 }
insert 7 @ upper_bound of { 5, 7, 7, 9 } => { 5, 7, 7, *7*, 9 }

Delbert answered 6/8, 2014 at 18:39 Comment(1)

The speed differences between them depend on whether or not comparison or copy is more expensive and on how big the vector is. The upper/lower_bound functions are O(logN), and the insertion is O(N). Consider this pathological case: insert 0 into { 0, 0, 0, ... /* lots of zeros */, 0 } – Presidio 10/12, 2014 at 1:34

Instead of inserting and sorting. You should do a find and then insert

Keep the vector sorted. (sort once). When you have to insert

find the first element that compares as greater to the one you are going to insert.
Do an insert just before that position.

This way the vector stays sorted.

Here is an example of how it goes.

start {} empty vector

insert 1 -> find first greater returns end() = 1 -> insert at 1 -> {1}
insert 5 -> find first greater returns end() = 2 -> insert at 2 -> {1,5}
insert 3 -> find first greater returns 2 -> insert at 2 -> {1,3,5}
insert 4 -> find first greater returns 3 -> insert at 3 -> {1,3,4,5}

Herra answered 5/4, 2013 at 21:16 Comment(11)

But keep in mind that this is not efficient (both find and insert are O(n) operations). The algorithm in my answer (https://mcmap.net/q/268602/-how-do-you-insert-the-value-in-a-sorted-vector) is a bit more efficient if there are more greater than lesser elements. – Infatuation 5/4, 2013 at 21:23

@Sebastian: Actually, you might want to profile that before declaring it so. Efficiency is not entirely bound by algorithmic complexity. Also, finding in this case would be O(log n), since you can use a binary search. – Earthlight 5/4, 2013 at 21:26

@stardust_, I can't "sort once". My vector can be resorted at any given moment by different class field. If I start with vector sorted by name, the user might sort by paramA or paramB and then insert the new element. Problem is I don't know which sorter will be used. – Ultrastructure 5/4, 2013 at 21:27

@Benjamin: That O(log(n)) find is discussed in the comments of my answer. And, yes, asymptotic complexity isn't everything. But it's an important first step. – Infatuation 5/4, 2013 at 21:28

@Ultrastructure Actually you don't even have to sort once (if you start with empty vector). The important thing is where you insert. If you insert at the right place always your vector is always sorted. No need to sort it at any point. Think about it. – Herra 5/4, 2013 at 21:35

@startdust_: That doesn't help if lgor wants to chage the filed on which it shall be sorted, after the vector already contains elements. – Infatuation 5/4, 2013 at 21:37

@Infatuation Ahhh. I see now. He sorts it on different criteria every time. – Herra 5/4, 2013 at 21:43

It's not a question of inserting element per se. It is a question: How to insert element in the std::vector efficiently independently of the sorting criteria". I can sort on any field of the class and want to insert an element of this class in this sorted vector. I guess the only solution is to keep resorting and recreating the grid. – Ultrastructure 5/4, 2013 at 23:34

@Ultrastructure The thing is sorting immidiatly after insert is not necessary. Unless you have other reasons to do so. the above method will still work up until the point where you want to resort in another field. Then you do find based on that field and insert.. unti you need to sort again in some other field. This will reduce the number of sortss you need somewhow while keeping the list sorted at least by one field at any time – Herra 5/4, 2013 at 23:43

@Ultrastructure remember then you have to use find_if() and provide a comparator just like you did for the sort depending on find field. – Herra 5/4, 2013 at 23:44

@Ultrastructure boost.org/doc/libs/1_53_0/libs/multi_index/doc/tutorial/… from Sebastian seems like a good way to go. – Herra 5/4, 2013 at 23:46

When you want to switch between sort orders, you can use multiple index datastructures, each of which you keep in sorted order (probably some kind of balanced tree, like std::map, which maps sort-keys to vector-indices, or std::set to store pointers to youre obects - but with different comparison functions).

Here's a library which does this: http://www.boost.org/doc/libs/1_53_0/libs/multi_index/doc/index.html

For every change (insert of new elements or update of keys) you must update all index datastructure, or flag them as invalid.

This works if there are not "too many" sort orders and not "too many" updates of your datastructure. Otherwise - bad luck, you have to re-sort everytime you want to change the order.

In other words: The more indices you need (to speed up lookup operations), the more time you need for update operations. And every index needs memory, of course.

To keep the count of indices small, you could use some query engine which combines the indices of several fields to support more complex sort orders over several fields. Like an SQL query optimizer. But that may be overkill...

Example: If you have two fields, a and b, you can support 4 sort orders:

a
b
first a then b
first b then a

with 2 indices (3. and 4.). With more fields, the possible combinations of sort orders gets big, fast. But you can still use an index which sorts "almost as you want it" and, during the query, sort the remaining fields you couldn't catch with that index, as needed. For sorted output of the whole data, this doesn't help much. But if you only want to lookup some elements, the first "narrowing down" can help much.

Infatuation answered 5/4, 2013 at 21:32 Comment(0)

Here is one I wrote for simplicity. Horribly slow for large sets but fine for small sets. It sorts as values are added:

void InsertionSortByValue(vector<int> &vec, int val)
{
    vector<int>::iterator it;

    for (it = vec.begin(); it < vec.end(); it++)
    {
        if (val < *it)
        {
            vec.insert(it, val);
            return;
        }
    }

    vec.push_back(val);
}

int main()
{
    vector<int> vec;

    for (int i = 0; i < 20; i++)
        InsertionSortByValue(vec, rand()%20);
}

Here is another I found on some website. It sorts by array:

void InsertionSortFromArray(vector<int> &vec)
{
    int elem;
    unsigned int i, j, k, index;

    for (i = 1; i < vec.size(); i++)
    {
        elem = vec[i];

        if (elem < vec[i-1])
        {
            for (j = 0; j <= i; j++)
            {
                if (elem < vec[j])
                {
                    index = j;

                    for (k = i; k > j; k--)
                        vec[k] = vec[k-1];

                    break;
                }
            }
        }
        else
            continue;

        vec[index] = elem;
    }
}

int main()
{
    vector<int> vec;

    for (int i = 0; i < 20; i++)
        vec.push_back(rand()%20);

    InsertionSortFromArray(vec);
}

Cataclinal answered 4/4, 2022 at 22:47 Comment(0)

-2

Assuming you really want to use a vector, and the sort criterium or keys don't change (so the order of already inserted elements always stays the same): Insert the element at the end, then move it to the front one step at a time, until the preceeding element isn't bigger.

It can't be done faster (regarding asymptotic complexity, or "big O notation"), because you must move all bigger elements. And that's the reason why STL doesn't provide this - because it's inefficient on vectors, and you shouldn't use them if you need it.

Edit: Another assumption: Comparing the elements is not much more expensive than moving them. See comments.

Edit 2: As my first assumption doesn't hold (you want to change the sort criterium), scrap this answer and see my other one: https://mcmap.net/q/268602/-how-do-you-insert-the-value-in-a-sorted-vector

Infatuation answered 5/4, 2013 at 21:10 Comment(3)

It sounds like a binary search. – Dogberry 5/4, 2013 at 21:14

You could use the binary search to find the position where you want to insert. Doesn't help much, though. You still need O(n) steps to create the gap where you can insert the element. Helps, if moving is cheap and comparing is expensive. – Infatuation 5/4, 2013 at 21:16

Moving is often a lot cheaper particularly if the elements are ints, and in C++11 even moving objects can be cheap. – Tyndareus 10/2, 2015 at 9:16

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags