Find max/min of vector of vectors

M

11

16

What is the most efficient and standard (C++11/14) way to find the max/min item of vector of vectors?

std::vector<std::vector<double>> some_values{{5,0,8},{3,1,9}};

the wanted max element is 9

the wanted min element is 0

Magical answered 22/7, 2015 at 7:2 Comment(9)

std::minmax_element for the inner vectors. – Expeditionary 22/7, 2015 at 7:35

Why not to use 2 nested loops? The other ways may be less readable. – Prompt 22/7, 2015 at 7:36

@Expeditionary You mean to pass through each inner vector, call minmax_element on it, and then find the minmax_element of the results? – Magical 22/7, 2015 at 7:38

@SergeRogatch it is a solution. But I was wondering if there is a std function or pattern for this. – Magical 22/7, 2015 at 7:39

Just go over each vector and create two variables min and max and compare them – Jarib 22/7, 2015 at 7:45

@gilad yes that is the last option for me. I was looking for something more clear – Magical 22/7, 2015 at 7:46

@HumamHelfawi: For the outer loop, it seems more complicated to use directly standard algorithm. – Expeditionary 22/7, 2015 at 7:47

@Expeditionary I see... BTW, is it not possible to benefit from the contiguous storage of a vector in order to treat it as one-dimensional? – Magical 22/7, 2015 at 7:52

@HumamHelfawi a vector of vectors is not stored contiguously, each inner vector is stored in its own contiguous block of dynamically allocated memory so you can't really treat them as "one dimensional". You could change how you store your 2D array such that it is stored contiguously, then you might be able to simplify the implementation a little. – Bremsstrahlung 22/7, 2015 at 8:30

T

4

Any efficient way to calculate the maximum element in a 2-D array (or vector in your case) involves a complexity of O(n^2) for an n-by-n array irrespective of what you do, as the calculation involves a comparison between n*n elements. The best way in terms of ease of use is to use std::max_element on the vector of vectors. I will not delve into details. Here is the reference.

Timberhead answered 22/7, 2015 at 7:40 Comment(17)

Thanks but as I know std::max_element work for only one dimension? Or I am missing something... I will appreciate if you add a two line code that describe your idea. many thanks – Magical 22/7, 2015 at 7:45

pls have a look ideone.com/c8npHh. There might be some issues as i have not brushed up my c++ for a long time but the logic will work. – Timberhead 22/7, 2015 at 8:1

@HumamHelfawi modify it to your own use. – Timberhead 22/7, 2015 at 8:2

Aha, I see.. I thought that it is one line solution.. it seems that it is only solution :( – Magical 22/7, 2015 at 8:5

@HumamHelfawi You can't help it; computational theory will not permit you. But there is one more solution. Are you familiar with something called a max-heap? It can be used directly while the input is taken. Then you have a time complexity of O(n log(n)) just for inserting the values. – Timberhead 22/7, 2015 at 8:8

unfortunately no, I will try to take a look. – Magical 22/7, 2015 at 8:27

It's actually just O(N) where N is the total number of elements to be compared. The fact that this is physically laid out as K vectors of on average N/K sub-elements is a bit misleading. – Coachman 22/7, 2015 at 9:56

@Coachman is right. This is a linear complexity problem O(n). I don't know why people are saying quadratic. – Wive 22/7, 2015 at 18:18

@Wive can you just explain to me why it is not quadratic? You need to compare n^2 elements to get to the correct answer. No matter what you do the complexity is not reducible beyond that. It's like the famous minimum number of races to find the fastest horse problem. – Timberhead 22/7, 2015 at 18:26

@Timberhead complexity of std::min_element is in terms of the number of calls to operator< compared to the total number of elements N. The fact that they are divided into several bins does not matter except for writing nested loops, but the nesting is not a "round-robin" tournament comparing every element to every other element. Every element is only compared to the min so far. For 100 numbers, you have 99 comparisons. For a 1000, you have 999. It's O(N). – Coachman 22/7, 2015 at 18:29
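To make the counting argument above concrete, here is a minimal sketch that finds the maximum of a vector of vectors while counting calls to the comparison. The helper name `count_comparisons_for_max` is made up for illustration, and the sketch assumes the outer vector and every inner vector are non-empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the thread): stores the maximum in max_out
// and returns how many comparisons were needed to find it.
// Assumes the outer vector and every inner vector are non-empty.
std::size_t count_comparisons_for_max(const std::vector<std::vector<double>>& vv,
                                      double& max_out)
{
    std::size_t comparisons = 0;
    auto counting_less = [&comparisons](double a, double b) {
        ++comparisons;
        return a < b;
    };

    bool first_row = true;
    for (const auto& v : vv) {
        // std::max_element uses k - 1 comparisons for k elements.
        const double row_max = *std::max_element(v.begin(), v.end(), counting_less);
        if (first_row) {
            max_out = row_max;
            first_row = false;
        } else if (counting_less(max_out, row_max)) {  // one more per extra row
            max_out = row_max;
        }
    }
    return comparisons;
}
```

For the 2x3 example from the question this gives 2 + 2 comparisons inside the rows plus 1 to combine them, i.e. 5 = N - 1 for N = 6 elements, however the elements are split into rows.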

@Timberhead from my understanding of the OP's question he wants the max and min element of all doubles. Regardless of the layout of those elements, the min and max are found in linear time by examining them once. – Wive 22/7, 2015 at 18:30

@Wive You forget one basic fact it is a vector of vectors not a simple vector where the calculation could have taken something like O(n). even if you try changing the data you need access to m*n elements doesn't he ? – Timberhead 22/7, 2015 at 18:53

@Timberhead complexity for the Standard Library's std::min_element is defined as the number of comparisons in relation to the total number of elements. It's been explained several times that the layout in vector of vector is immaterial here. So downvoted until fixed. – Coachman 22/7, 2015 at 18:55

@Timberhead I think the nested data structure is throwing you off. If I have 100 elements, it doesn't matter if they are 1 list of 100 or 10 lists of 10. I still have to check all 100. The problem is linear in the number of elements regardless of how they are stored. – Wive 22/7, 2015 at 18:57

@Wive {{5,**0**,8},{3,1,**9**}} in a case like this it is conveniently 6 but if you want a generalisation based on the number of rows and columns, it would still be m*n.#11032515 – Timberhead 22/7, 2015 at 19:3

@Timberhead rows and columns are irrelevant. The complexity is O(n) for whatever values of r and c where r*c = n. Personally, reporting the complexity as O(n^2) is misleading. I understand what you mean though. I just think it will confuse people by saying that it is quadratic just because you need a nested loop. – Wive 22/7, 2015 at 19:9

@Wive I have to go now. We can have the discussion tonight maybe (i.e. if you are interested, I am). If he did not use a vector, then the simplest one would have been just to use a heap. But here since data is organised in a particular way and there is no chance to manipulate the input options, I believe complexity would be n*n. Have a look at the second solution here #21637741. First one is irrelevant as it manipulates the input. – Timberhead 22/7, 2015 at 19:18

A

8

Here's a multi-threaded solution that returns an iterator to the maximum (or throws) for a general type T (assuming operator< is defined for T). Note the most important optimisation is to perform the inner max operations on the inner vectors (the 'rows'), since each inner vector is stored contiguously in memory.

#include <vector>
#include <algorithm>
#include <stdexcept>
#include <utility>

template <typename T>
typename std::vector<T>::const_iterator max_element(const std::vector<std::vector<T>>& values)
{
    if (values.empty()) throw std::runtime_error {"values cannot be empty"};

    std::vector<std::pair<typename std::vector<T>::const_iterator, bool>> maxes(values.size());

    threaded_transform(values.cbegin(), values.cend(), maxes.begin(),
                       [] (const auto& v) {
                           return std::make_pair(std::max_element(v.cbegin(), v.cend()), v.empty());
                       });

    auto it = std::remove_if(maxes.begin(), maxes.end(), [] (auto p) { return p.second; });

    if (it == maxes.begin()) throw std::runtime_error {"values cannot be empty"};

    return std::max_element(maxes.begin(), it,
                            [] (auto lhs, auto rhs) {
                                return *lhs.first < *rhs.first;
                            })->first;
}

threaded_transform is not part of the standard library (yet), but here's an implementation you could use.

#include <vector>
#include <thread>
#include <algorithm>
#include <cstddef>
#include <iterator>

template <typename InputIterator, typename OutputIterator, typename UnaryOperation>
OutputIterator threaded_transform(InputIterator first, InputIterator last, OutputIterator result, UnaryOperation op, unsigned num_threads)
{
    std::size_t num_values_per_threads = std::distance(first, last) / num_threads;

    std::vector<std::thread> threads;
    threads.reserve(num_threads);

    for (unsigned i = 1; i <= num_threads; ++i) {
        if (i == num_threads) {
            threads.push_back(std::thread(std::transform<InputIterator,
                                      OutputIterator, UnaryOperation>,
                                      first, last, result, op));
        } else {
            threads.push_back(std::thread(std::transform<InputIterator,
                                      OutputIterator, UnaryOperation>,
                                      first, first + num_values_per_threads,
                                      result, op));
        }
        first  += num_values_per_threads;
        result += num_values_per_threads;
    }

    for (auto& thread : threads) thread.join();

    return result;
}

template <typename InputIterator, typename OutputIterator, typename UnaryOperation>
OutputIterator threaded_transform(InputIterator first, InputIterator last, OutputIterator result, UnaryOperation op)
{
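    // hardware_concurrency() may return 0 if the value is not computable; fall back to one thread.
    const unsigned num_threads = std::max(1u, std::thread::hardware_concurrency());
    return threaded_transform<InputIterator, OutputIterator, UnaryOperation>(first, last, result, op, num_threads);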
}

Armillas answered 22/7, 2015 at 9:31 Comment(2)

Note that fails if an inner vector is empty. – Expeditionary 22/7, 2015 at 19:30

@Expeditionary You're right, thanks. Should work (or throw) for all cases now.. not quite as pretty anymore :( – Armillas 22/7, 2015 at 20:19

B

6

If you used a boost::multi_array<double, 2> instead of a std::vector<std::vector<double>> it would be as simple as:

auto minmax = std::minmax_element(values.data(), values.data() + values.num_elements());

Live demo.
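If Boost is not an option, the same idea works with any storage that keeps all elements in one contiguous buffer. Below is a minimal sketch using a hand-rolled row-major wrapper around a single std::vector<double>. The `FlatMatrix` type and `minmax_of` function are hypothetical names for illustration, not part of Boost or the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal 2D container: one contiguous row-major buffer.
struct FlatMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;  // element (r, c) lives at data[r * cols + c]

    FlatMatrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}

    double& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
};

// Returns {min, max}. Assumes the matrix holds at least one element.
std::pair<double, double> minmax_of(const FlatMatrix& m)
{
    // A single call is enough because all elements sit in one range.
    auto p = std::minmax_element(m.data.begin(), m.data.end());
    return std::make_pair(*p.first, *p.second);
}
```

The trade-off is that every row must have the same length, which a vector of vectors does not require.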

Bremsstrahlung answered 22/7, 2015 at 14:36 Comment(1)

This is an interesting answer if boost is available. For me it is available. I am going to look into it, many thanks – Magical 22/7, 2015 at 15:10

T

6

The plain for loop way:

T max_e = std::numeric_limits<T>::lowest(); // min() would be the smallest positive value for floating-point T
for(const auto& v: vv) {
    for(const auto& e: v) {   
        max_e = std::max(max_e, e);
    }
}
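Since the question asks for both the min and the max, the same plain loop can track the two together. This is a minimal sketch, with `minmax_of_all` being a made-up name. It seeds both values from the first element it sees, so no sentinel value is needed, and it reports the "no elements" case through its return value.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: writes {min, max} over all elements into `out` and
// returns true, or returns false (leaving `out` untouched) if there are no
// elements at all.
template <typename T>
bool minmax_of_all(const std::vector<std::vector<T>>& vv, std::pair<T, T>& out)
{
    bool found = false;
    for (const auto& v : vv) {
        for (const auto& e : v) {
            if (!found) {
                out = std::make_pair(e, e);  // seed from the first element seen
                found = true;
            } else {
                if (e < out.first)  out.first  = e;
                if (out.second < e) out.second = e;
            }
        }
    }
    return found;
}
```

Only operator< is required of T, and each element is visited exactly once.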

Throttle answered 23/7, 2015 at 5:11 Comment(1)

Concise and clear. Best answer so far IMHO. Sometimes simple is best. – Wive 27/7, 2015 at 18:15

G

5

You must at least look at every element, so, as Anony-mouse mentioned, complexity will be at least O(n^2).

#include <vector>
#include <limits>
#include <algorithm>

int main() {
    std::vector<std::vector<double>> some_values;
    double max = std::numeric_limits<double>::lowest();
    for (const auto& v : some_values)
    {
        double current_max = *std::max_element(v.cbegin(), v.cend());
        max = max < current_max ? current_max : max; // max = std::max(current_max, max);
    }
}
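The loop above dereferences the result of std::max_element unconditionally, which is undefined behaviour for an empty inner vector (and some_values is empty in the snippet, so the loop body never runs at all). A possible variant that skips empty inner vectors is sketched below. The function name `max_of_all` and the bool-plus-out-parameter interface are my own assumptions, not part of the original answer.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: stores the overall maximum in `result` and returns
// true; returns false when the outer vector or every inner vector is empty.
bool max_of_all(const std::vector<std::vector<double>>& some_values, double& result)
{
    bool found = false;
    for (const auto& v : some_values) {
        if (v.empty()) continue;  // max_element would return end() here
        const double current_max = *std::max_element(v.cbegin(), v.cend());
        if (!found || result < current_max) {
            result = current_max;
            found = true;
        }
    }
    return found;
}
```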

Gandhi answered 22/7, 2015 at 7:59 Comment(2)

The complexity of searching for the minimum and maximum values is O(n), not O(n^2). The fact that it takes two loops to do it doesn't matter. Each element is examined exactly once. – Elasticize 22/7, 2015 at 17:40

Note that fails if an inner vector is empty. – Expeditionary 22/7, 2015 at 19:32

P

5

Using the accumulate function you could write:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main()
{
  std::vector<std::vector<double>> m{ {5, 0, 8}, {3, 1, 9} };

  double x = std::accumulate(m.begin(), m.end(), m[0][0],
                             [](double max, const std::vector<double> &v)
                             {
                               return std::max(max,
                                               *std::max_element(v.begin(),
                                                                 v.end()));
                             });

  std::cout << x << '\n';
  return 0;
}

but I'd prefer the good, old for-loop.

The example can be extended to find both the min and max values:

std::accumulate(m.begin(), m.end(),
                std::make_pair(m[0][0], m[0][0]),
                [](std::pair<double, double> minmax, const std::vector<double> &v)
                {
                  auto tmp(std::minmax_element(v.begin(), v.end()));

                  return std::make_pair(
                    std::min(minmax.first, *tmp.first),
                    std::max(minmax.second, *tmp.second));
                });

(in real code you have to handle the empty-vector case)

Unfortunately a vector of vectors isn't stored contiguously in memory, so you don't have a single block containing all the values (this is one of the reasons why a vector of vectors isn't a good model for a matrix).

You can take advantage of a vector of vector if it contains a lot of elements.

Since each sub-vector is autonomous, you could use std::async to fill asynchronously a vector of futures containing the max value of each sub-vector.
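A rough sketch of that std::async idea follows. It launches one task per non-empty sub-vector, which is an assumption made to keep the example short. The name `async_max` is made up, and throwing on "no elements" is just one way to handle that case.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <stdexcept>
#include <vector>

// Sketch only: computes the max of each non-empty sub-vector in its own
// asynchronous task, then combines the partial results sequentially.
double async_max(const std::vector<std::vector<double>>& m)
{
    std::vector<std::future<double>> futures;
    futures.reserve(m.size());

    for (const auto& v : m) {
        if (v.empty()) continue;  // nothing to contribute
        // v refers to an element of m, which outlives every future below.
        futures.push_back(std::async(std::launch::async, [&v] {
            return *std::max_element(v.begin(), v.end());
        }));
    }

    if (futures.empty()) throw std::invalid_argument("no elements");

    double result = futures.front().get();
    for (std::size_t i = 1; i < futures.size(); ++i)
        result = std::max(result, futures[i].get());
    return result;
}
```

With many small sub-vectors the task overhead will likely outweigh the gain, so real code would probably hand each task a batch of sub-vectors instead.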

Putdown answered 22/7, 2015 at 8:38 Comment(2)

Nice. Just 2 things : #include <numeric> for accumulate not #include <algorithm>. And you need to add an if in your lambda to handle an empty vector. – Waring 21/1, 2020 at 12:58

@MartinMorterol You're right, I've somewhat fixed the answer. Thank you. – Putdown 24/1, 2020 at 17:8

C

5

You can do it pretty easily with Eric Niebler's range-v3 library (which obviously isn't standard yet, but hopefully will be in the not-too-distant future):

vector<vector<double>> some_values{{5,0,8},{3,1,9}};

auto joined = some_values | ranges::view::join;
auto p = std::minmax_element(joined.begin(), joined.end());

p.first is an iterator to the min element; p.second to the max.

(range-v3 does have an implementation of minmax_element, but unfortunately, it requires a ForwardRange and view::join only gives me an InputRange, so I can't use it.)

Crayon answered 22/7, 2015 at 13:46 Comment(0)