Finding Mode of Vector of Ints in C++
Asked Answered
F

8

10

So I'm trying to make a basic program to learn the basics of C++, I'm generating 100 random numbers from 0 to 100 and storing them in a vector, I am then displaying the sum, mean, median, mode, high and low of the vector. I have everything else done except the mode which is where I get stuck. Here is the code I have so far.

int modeFunction()
     {
         numMode = 0;
         count = 0;
         for (int n = 0; n < 100; n++)
         {
             for (int y = 0; y < 100; y++)
             {
                 if (numVector.at(y) == numVector.at(n))
                {
                    numMode = numVector.at(y);
                    count++;
                }
             }

         }
         return numMode;
     }

After that I get stuck because in my mind that should work but it doesn't. It just out puts the last number, usually 100. Any help would be much appreciated.

Freida answered 25/3, 2011 at 22:29 Comment(4)
if myVector is a std::vector<int> (seems like it atleast), you can index it like an array: myVector[y] and myVector[n] will yield the same as the myVector.at version, but looks nicer imho. :)Gotha
@Xeo: the difference being that at has defined behaviour when the index is out of range. Arguably operator[] is a micro-optimization, although as you say it's also a bit of a style difference.Edgeways
@Steve: Ah, thanks for that tidbit. :) Didn't bother with at yet, but a normal array also has undefined behaviour for out of range access, though it certainly is nice to have defined behavious when you need it. :)Gotha
@Xeo: to be honest, I never use at myself. I occasionally wonder whether I should, but in practice I never write code that I want to throw an exception when the index is out of bounds, so it only serves as a debugging assist, it "should" never happen. Despite calling it a micro-optimization, it's a reasonable amount of redundant code, so in the end if I want bounds-checking I just switch to Python ;-)Edgeways
C
12

since all the values are between 0 and 100, you can find the mode efficiently with a histogram:

std::vector<int> histogram(101,0);
for( int i=0; i<100; ++i )
  ++histogram[ numVector[i] ];
return std::max_element( histogram.begin(), histogram.end() ) - histogram.begin();
Codding answered 25/3, 2011 at 23:3 Comment(1)
Would this not work for values outside of [0,100] (this should work for any symmetric histogram, no?)Brothers
S
4

Since mode is the number that occurs most frequent you shouldn't change numMode unless the new number's count is greater than numMode's count.

EDIT: To clarify, you need to keep a separate count for the current element and the current number that you think is the mode. Ideally, setting newMode to the first element is a good approach.

In addition, mode isn't necessary unique (i.e. "1 1 2 2"). You may want to keep that in mind if you care about that.

newMode = element[0]
modeCount = # of occurrence of newMode

for ( i-th element from [1 to end] ) {
   tmpCount = # of occurrence of element[i]
   if tmpCount > modeCount {
     newMode = element[i]
     modeCount = tmpCount
   }
}
Sweettempered answered 25/3, 2011 at 22:32 Comment(1)
I am who downvoted. I did it because this answer is incomplete as it assumes the array of number of occurrences of the values is known, however, this array is the most important here.Pilsner
P
3

bmcnett's approach works great if number of elements are small enough. If you have large number of elements but the all element value are with in a small range using map/hashmap works well. Something like

typedef std::pair<int, int> mode_pair;

struct mode_predicate
{
  bool operator()(mode_pair const& lhs, mode_pair const& rhs)
  {
    return lhs.second < rhs.second;
  }
};

int modeFunction()
{
  std::map<int, int> mode_map;
  for (int n = 0; n < 100; n++)
    mode_map[numVector[n]]++;
  mode_predicate mp;
  return std::max_element(mode_map.begin(), mode_map.end(), mp)->first;
}
Publican answered 25/3, 2011 at 23:32 Comment(0)
E
1

Your algorithm is wrong - it outputs the last number in the array because that's all it can ever do. Every time the number at index y matches the number at index n you overwrite the results for the previous n. Since you're using the same loop conditions, y and n are always the same at at least one point in the nested loop for each possible n value - and you'll always end up with numMode being numVector.at(99).

You need to change your algorithm to save the count for each n index along the way (or at least which n index ended up with the greatest count), so that you can know at the end of the n loop which entry occured the most times.

Elide answered 25/3, 2011 at 22:34 Comment(0)
B
1

Alternative solutions. Note: untested.

int mode1(const std::vector<int>& values)
{   
    int old_mode = 0;
    int old_count = 0;
    for(size_t n=0; n < values.size(); ++n) 
    {
        int mode = values[n];
        int count = std::count(values.begin()+n+1, values.end(), mode);

        if(count > old_count) 
        {
            old_mode = mode;
            old_count = count;
        }
    }
    return old_mode;
}

int mode2(const std::vector<int>& values)
{   
    return std::max_element(values.begin(), values.end(), [](int value)
    {
        return std::count(values.begin(), values.end(), value);
    });
}
Biochemistry answered 25/3, 2011 at 22:42 Comment(2)
I think in mode1() you should compare count + 1 with old_count because you start counting from the position one after the position of the counted value: i.e. you must add the initial one to count.Pilsner
Moreover, I think it's not a good practice to assign the initial value for the value with the searched property as 0. It should be better to use the first element for this purpose.Pilsner
P
0

Mode means a number with highest frequency. The logic should be -

//Start of function

int mode = 0, globalCount = 0 ;  
// Start of outer for loop
for i = 0 to length - 1    
   int localCount = 0   

  // Start of inner for loop  
  for j = 0 to length - 1      
     if vec[i] == vec[j]    
     ++localCount    
 // End of Inner for loop 

 if ( localCount > globalCount )  
     globalCount = localCount  
     mode = vec[i]  
// End of outer for loop  

if globalCount > 1 // This should be checked whether vec has repetitions at all
   return mode
else
   return 0

// End of function
Poussette answered 25/3, 2011 at 22:52 Comment(1)
@Freida - The logic can be even better improving efficiency but this is what the algorithm should be according to your thought process.Poussette
B
0
    int number = array_list[0];
    int mode = number;
    int count = 1;
    int countMode = 1;

    for (int i=1; i<size_of_list; i++)
    {
          if (array_list[i] == number) 
          { // count occurrences of the current number
             count++;
             if (count > countMode) 
                {
                      countMode = count; // mode is the biggest ocurrences
                      mode = number;
                }
          }
          else
          { // now this is a different number
                if (count > countMode) 
                {
                      countMode = count; // mode is the biggest ocurrences
                      mode = number;
                }
               count = 1; // reset count for the new number
               number = array_list[i];
      }
    }
Bespoke answered 14/6, 2019 at 11:44 Comment(0)
B
0

So, you have sorted vector with integers.
Iterate over the vector and accumulate repeatments if current value equals to next one.
If neighboring values are not equal, reset current repeatment to 1 and continue the loop.
Along the way store and update max repeatments.

vector<int> nums;
int mode = 1;
for (int i = 0, cur_reps = 1; i < nums.size(); i++)
{
    if (i < nums.size()-1 && nums[i] == nums[i + 1])
    {
        cur_reps++;
        continue;
    }
    if (cur_reps > mode) mode = cur_reps;
    cur_reps = 1;
}
Biform answered 26/11, 2023 at 16:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.