Memory-efficient way of computing the median of a large data set? [closed]
Asked Answered
C

4

7

If one computer can only hold 1 million numbers, how to find out the median number from 100 million numbers?

Chilcote answered 25/9, 2009 at 2:2 Comment(4)
#1312264Praxiteles
At best, this should probably be Community Wiki.Pileus
This is a valid programming related question, how to compute the median in a memory efficient way. It just comes along as a puzzle.Fonteyn
Use the "Median of medians" approach.Fonteyn
A
3

Do an external sort and then scan once for the median.

Hopefully, the real problem was "how do I do an external sort"? (If this is homework...I want to help in the right way. :-)

Androsterone answered 25/9, 2009 at 2:5 Comment(3)
This is what I thought. :) But I'm not sure it's the correct answer, so I asked here.Chilcote
There has got to be a way to do this with the literal constraint that the device can only store 1 million numbers. Using the external sort seems like cheating. Now I'm gonna be up all night thinking about this.Sulfapyrazine
Heh, I wondered that myself. It's a really good question.Androsterone
D
3

Reduce the problem to a more difficult one: sort the 100 million numbers using merge sort Then, take the 50 millionth element.

Drandell answered 25/9, 2009 at 2:6 Comment(3)
But the computer can hold 1 million numbers only, how can I find the 50 millionth one?Chilcote
On the tape (oh, right, this is no longer the 80s. I meant "on the disk"), at the 50 millionth position. You have storage for your 100M elements, right? Because if you don't (elements read from a stream) the exercice can be proved impossible by a counting argument.Drandell
It is not correct for 100 million numbers to take 50 millionth element, because 100 million is even number, so one has to take mean of 50 millionth and 50 millionth + 1 elementsLycanthrope
L
1

Using 101 computers and a sort-merge just like a database.

Lurdan answered 25/9, 2009 at 2:7 Comment(2)
Lol. This answer should feature as the best programmer joke!Tuneberg
I'll take it as one of my answers.:)Chilcote
M
0

Find the middle million numbers and then report the median of them. (Hmmm, now how to find those middle million numbers...)

Majka answered 25/9, 2009 at 2:19 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.