The original question: given a 5 GB file containing the URLs visited during the last day, find the top k most frequent URLs. This can be solved by using a hash map to count the occurrences of each distinct URL and then selecting the top k with a min-heap of size k, which takes O(n log k) time.
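For reference, here is a minimal sketch of that static approach in Python. The function name `top_k_urls` is just illustrative, and it assumes that the set of distinct URLs (not the whole file) fits in memory and that the input is any iterable of URL strings, for example the stripped lines of the file.

```python
import heapq
from collections import Counter


def top_k_urls(urls, k):
    """Return the k most frequent URLs as (url, count) pairs, most frequent first."""
    if k <= 0:
        return []

    # Pass 1: hash map from URL to number of occurrences, O(n).
    counts = Counter()
    for url in urls:
        counts[url] += 1

    # Pass 2: keep a min-heap holding at most k (count, url) entries.
    # The root is always the weakest of the current top-k candidates.
    heap = []
    for url, count in counts.items():
        entry = (count, url)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # The new entry beats the weakest candidate, so swap it in.
            heapq.heapreplace(heap, entry)

    # Sort the (at most k) survivors in descending order of count.
    return [(url, count) for count, url in sorted(heap, reverse=True)]
```

The heap never grows beyond k entries, so each push or replace costs O(log k), which is where the log k factor comes from.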
Now I'm wondering: what if the input were an unbounded online data stream (instead of a static file)? How could I then find the top k URLs of the last day?
Or is there any improvement I could make to the system that would let me get the top k URLs for the last minute, the last hour, and the last day dynamically?
Any hints would be appreciated!