Find Median Given Frequency of Numbers
Problem
Numbers(value, frequency) tells how many times each value appears. Return the median.
rows = [(0,7),(1,1),(2,3),(3,1)]0.0SELECT AVG(n.Number) AS median FROM Numbers n WHERE
n.Frequency >= ABS((SELECT SUM(Frequency) FROM Numbers WHERE Number <= n.Number)
- (SELECT SUM(Frequency) FROM Numbers WHERE Number >= n.Number));
Explanation
Each row says a number appears a certain number of times, so the data is really a long sorted list hidden inside a frequency table. The median is the middle of that expanded list, and this query finds it without ever expanding the data.
The trick is a balance test. For a candidate number, count how many items are at or below it and how many are at or above it. The two inner subqueries do exactly that with SUM(Frequency). For a true median, those two counts are nearly equal — their difference is no bigger than the number's own frequency.
So the condition Frequency >= ABS(below - above) keeps precisely the row (or rows) that straddle the center of the list.
Wrapping it in AVG(n.Number) handles the even-count case: if two middle values qualify, their average is the median; if one qualifies, its average is itself.
Example: with values 0,1,2,3 and frequencies 7,1,3,1, the value 0 dominates the low end, the counts balance there, and the median comes out to 0.0.