Is there a faster way to find both Min and Max values?

Often I have written: {Min@#, Max@#} &

Yet this seems inefficient, as the expression must be scanned twice, once to find the minimum value, and once to find the maximum value. Is there a faster way to do this? The expression is often a tensor or array.

Quarrelsome answered 4/5, 2011 at 12:45 Comment(6)
@Mr. I wonder if you'd care to add an edit to your question with a plot of the timings for each answer in Mma 7. Post the code and then I'll run it in Mma 8 to add those results too. Agree?Inveracity
@belisarius I am rather sleepy at the moment, but I don't understand.Quarrelsome
@Mr. Don't worry, me too. Perhaps just a silly thing ..Inveracity
@belisarius, I added timings to my answer below.Borisborja
@Mr. You probably already know about this. I haven't checked it out, just stumbled across the site.Aristotle
@TomD I am, but I haven't seen that site before, and it is useful information for anyone finding this question with a search. Thanks.Quarrelsome

This beats it by a bit.

minMax = Compile[{{list, _Integer, 1}},
   Module[{currentMin, currentMax},
    currentMin = currentMax = First[list];
    Do[
     Which[
      x < currentMin, currentMin = x,
      x > currentMax, currentMax = x],
     {x, list}];
    {currentMin, currentMax}],
   CompilationTarget -> "C",
   RuntimeOptions -> "Speed"];
v = RandomInteger[{0, 1000000000}, {10000000}];
minMax[v] // Timing

I think it's a little faster than Leonid's version because Do is a bit faster than For, even in compiled code.

Ultimately, though, this is an example of the kind of performance hit you take when using a high level, functional programming language.

Addition in response to Leonid:

I don't think that the algorithm can account for all the time difference. Much more often than not, I think, both tests will be applied anyway. The difference between Do and For is measurable, however.

cf1 = Compile[{{list, _Real, 1}},
   Module[{sum},
    sum = 0.0;
    Do[sum = sum + list[[i]]^2,
     {i, Length[list]}];
    sum]];
cf2 = Compile[{{list, _Real, 1}},
   Module[{sum, i},
    sum = 0.0;
    For[i = 1, i <= Length[list],
     i = i + 1, sum = sum + list[[i]]^2];
    sum]];
v = RandomReal[{0, 1}, {10000000}];
First /@{Timing[cf1[v]], Timing[cf2[v]]}

{0.685562, 0.898232}
Beatitude answered 4/5, 2011 at 13:19 Comment(9)
Pretty cool! It is about twice as fast as my fastest solution. The main reason, though, is that you were able to cut the number of tests roughly in half, since you actually use a better algorithm - you exploit the fact that, at any given element, at most one of the two tests can succeed - +1.Bilinear
I may have to start specifying "For Mathematica 7" in all my posts. I will leave it to others to vote for this, because I cannot test it. That doesn't mean I am not grateful for your efforts.Quarrelsome
Mark, congratulations on 1k rep. Can you direct me to information on why this kind of thing is not or cannot be optimized at run time?Quarrelsome
@Quarrelsome I think that for JIT to be generally effective, Mathematica would have to be more strongly typed. Certain JIT-like features like auto-compilation are present in mma though.Bilinear
@Quarrelsome Even in Mathematica 7 you can still use Compile with Mark's code, just drop all the options.Amalamalbena
@Amalamalbena If you do that in M7, Mark's code is 10 times slower than the Min-Max code, so this is hardly an option.Bilinear
@Mark, the whole fact that you use Which means that you skip the second case entirely every time when the first case condition is satisfied. My code always tests the second condition. It indeed can not account for all the difference, but I still think that your algorithm is substantially better. Of course, the difference between Do and For is significant, no question about it.Bilinear
@Mr. Wizard You certainly can call Max and Min in compiled functions, and this is nice if you have procedural code that uses Max or Min that needs to be optimized. There's generally no reason to Compile a call to a single built-in function, though, since that function is likely already optimized. Also, optimization will not fundamentally alter your algorithm, so your original code will still take twice as long as a single call to Max or Min.Beatitude
I am giving this the check mark, even though I cannot use it, because I originally forgot to specify "for Mathematica 7" in my question.Quarrelsome

I think this is as fast as it gets within ordinary Mathematica programming practice. The only way I see to make it faster within mma is to use Compile with the C compilation target, as follows:

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = 0., max = 0.},
    (* initialize from the first element, not 0., so all-positive
       or all-negative lists are handled correctly *)
    min = lst[[1]]; max = lst[[1]];
    For[i = 2, i <= Length[lst], i++,
     If[min > lst[[i]], min = lst[[i]]];
     If[max < lst[[i]], max = lst[[i]]]];
    {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

However, even this seems to be somewhat slower than your code:

In[61]:= tst = RandomReal[{-10^7,10^7},10^6];

In[62]:= Do[getMinMax[tst],{100}]//Timing
Out[62]= {0.547,Null}

In[63]:= Do[{Min@#,Max@#}&[tst],{100}]//Timing
Out[63]= {0.484,Null}

You probably can write the function entirely in C, and then compile it and load it as a DLL - you may eliminate some overhead this way, but I doubt that you will win more than a few percent - not worth the effort IMO.

EDIT

It is interesting that you may significantly increase the speed of the compiled solution with manual common subexpression elimination (lst[[i]] here):

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 2, min = 0., max = 0., temp},
    (* seed min and max from the first element for correctness *)
    min = lst[[1]]; max = lst[[1]];
    While[i <= Length[lst],
     temp = lst[[i++]];
     If[min > temp, min = temp];
     If[max < temp, max = temp]];
    {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

is a little faster than {Min@#,Max@#}.

Bilinear answered 4/5, 2011 at 13:3 Comment(2)
Especially not worth the effort if you use Mma7! ;-p Leonid, you surely understand the mechanics of Mathematica better than I do. Can you (in a few words...) tell me why something like {Min@#, Max@#}& cannot be internally "JIT" optimized? That is, why Mathematica is not or cannot be built this way?Quarrelsome
I think such optimizations are hard to make generally reliable, for more complex functions, symbolic arguments, etc. Also, it is somewhat orthogonal to idiomatic mma programming, since by doing so you break the high-level thinking process (declarative programming style) by specifying details of how this should be computed. I think that, unless you gain an order-of-magnitude speed-up or so, such optimizations do not pay off, since you lose the advantage of high-level thinking (abstractions, readability, screen space), which is often more important, especially for larger projects.Bilinear

For an array, you could do the simplest functional thing and use

Fold[{Min[##],Max[##]}&, First@#, Rest@#]& @ data

Unfortunately, it is no speed demon. Even for short lists (5 elements), all of Leonid's answers and Mark's answer are at least 7 times faster, uncompiled, in v.7. For long lists (25000 elements) this gets worse, with Mark's being 19.6 times faster; yet even at this length this solution took only about 0.05 secs to run.

However, I'd not count out {Min[#], Max[#]}& as an option. Uncompiled, it was 1.7 times faster than Mark's for short lists and nearly 15 times faster for long lists (8 times and nearly 300 times faster, respectively, than the Fold solution).

Unfortunately, I could not get good numbers for the compiled versions of {Min[#], Max[#]}&, Leonid's, or Mark's answers; instead I got numerous incomprehensible error messages. In fact, compiling {Min[#], Max[#]}& increased its execution time. The Fold solution improved dramatically, though, to about twice the uncompiled times of Leonid's answers.

Edit: for the curious, here's some timing measurements of the uncompiled functions -

[timing measurements on a log-log plot]

Each function was used on 100 lists of the length specified on the horizontal axis and the average time, in seconds, is the vertical axis. In ascending order of time, the curves are {Min[#], Max[#]}&, Mark's answer, Leonid's second answer, Leonid's first answer, and the Fold method from above.

Borisborja answered 4/5, 2011 at 16:29 Comment(1)
@belisarius, I know. I had already computed the timings, so I plotted them for all to see.Borisborja

For all of you who are doing timings, I'd like to warn you that order of execution is extremely important. For instance, look at the following two subtly different timing tests:

(1)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First,
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First
   }
  , {100}
  ]

[timing plot]
The odd man out here is the last Min

(2)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First,
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First
   }
  , {100}
  ]

[timing plot]

Here, the highest timing is found for the first Max, the second-highest for the second Max, and the two Mins are about the same and lowest. Actually, I'd expect Max and Min to take about the same time, but they don't: Max seems to take 50% more time than Min. Having the pole position also seems to come with a 50% handicap.

Now a comparison with the algorithms given by Mark and Leonid:

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   {Max[a], Min[a]} // AbsoluteTiming // First,
   {Min@#, Max@#} &@a // AbsoluteTiming // First,
   getMinMax[a] // AbsoluteTiming // First,
   minMax[a] // AbsoluteTiming // First,
   {Min[a], Max[a]} // AbsoluteTiming // First
   }
  , {100}
  ]

[timing plot]

Here we find about 0.3 s for {Max[a], Min[a]} (which includes the pole-position handicap), the 0.1 s level is Mark's method, and the others are all about the same.

Swoosh answered 4/5, 2011 at 20:30 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.