Before going into further details, I recommend you to checkout this leetcode problem to get an idea about the problem statement.
So I expect, by now you've gone through the problem statement and are well acquainted with the question. Let me just put that here again, feel free to skip this :p
Given a sequence of numbers of length n, one of them occurs more than ⌊n ÷ 2⌋ times (i.e. is a majority). We need to find that majority element.
Give the problem a try, and move to solutions afterwards.
One of the most obvious solutions that would be striking you mind would be to simply count the number of occurrences of each element & compare that with ⌊n ÷ 2⌋. It works 🚀️.
Coming to the time complexity of this method, since we'll be taking an element and looping through the entire list to find its number of occurrences, and we will be doing this for every element, we end up with a nested for loop. Thus:
Time complexity: O(n2)
Space complexity: O(1)
Now that we have a correct solution, let's try improving it. The above approach tries to count the number of occurrences of each element. And to do this we're looping through the entire list, for every single element giving us a time complexity of O(n2). This is the bottleneck. So is there any way to count the occurrences in single pass? Just think for a while. Well there can be two methods that may come to your mind, each having their own tradeoffs.
Since we want to count the number of occurrences of each distinct element, we can simply sort the array in O(n×log n) time. And then we can traverse the array again and maintain a counter for the element. Whenever, a new distinct element is found we reset the counter to 1 and move ahead. When the value of the counter becomes greater than ⌊n ÷ 2⌋, a majority element is found.
Time complexity: O(n×log n)
Space complexity: O(1)
The title explains the approach very well. Use the array elements as a key to the hash map & its number of occurrences so far as the value. In every iteration update the element's count & check if it has become greater than ⌊n ÷ 2⌋. If it has, then we have found the majority element. Since updation in hash map happens in constant time on an average, we get a linear time algorithm. However, it is going to use n entries in the hash map in the worst case, giving us a space complexity of O(n).
Time complexity: O(n)
Space complexity: O(n)
The solutions we explored above were fine, depending on the problem constraints. However, still there is a lot of room for improvement. We can obtain a time complexity of O(n) and at the same time get rid of the additional O(n) space constraint. This is exactly what Moore's Voting algorithm is.
The key idea behind the method is that, if we pair distinct elements in the array and eliminate them, then the element we will be left with in the end would be the one which is a majority element, provided that there exists a majority in the array. Here the order in which the elements are paired does not affect the answer. Let's just try this with an example here:
I highly recommend you to try this by taking a few more examples.
Another caveat here is that, if there does not exist a majority, then the elements remaining at the end don't give the solution. So if it is not guaranteed that the array always has a majority, then we have to traverse the array once again to find out the number of occurrences of the last remaining element. Then only we can decide if it is a majority or not.
So to implement this solution, we can maintain a variable
candidate(represents the candidate for majority element) and its
count. Initialise these to null & 0 respectively. Now we traverse the array, and for every element encountered we have three choices:
count == 0, then set
count = 1.
current element == candidate, then increment
current element != candidate, then decrement
Finally, at the end we need to check if this
candidate is indeed the majority or not (we may omit this step, it is guaranteed that a majority exists).
So, overall we did two passes through the list to find the majority, giving us a time complexity of O(n).
Time complexity: O(n)
Space complexity: O(1) If you managed to follow this far, and understand the idea, then congratulations 🎉🎉, you did it. I encourage you to try to code the solution yourself. If not, then try to give this a little more thought, it is in fact a little tricky to get at the first time.😀
Next, we'll take a look at the code (in C++) for this approach. Give it a try yourself, before moving ahead. I leave the code for other approaches as an exercise for you.
So this concludes the blog post. As a side note, I would like to add that this method can be extended by taking triplets, quadruples, and so on (instead of pairs) to find the elements having occurrences more than 33%, 25% and so on.
If you understood it clearly, then you may try your hands on this related problem.