Snap II
Closed this issue · 1 comments
darrenfu commented
12/12/2019
But the interface is borrowed from 295. Find Median from Data Stream. You are asked to implement two methods:
class MedianFinder:
def __init__(self, k: int):
"""
initialize your data structure here.
k is the sliding window size
"""
def addNum(self, num: int) -> None:
def findMedian(self) -> float:
Note:
- The sliding window elements can be put inside the memory.
- There could be duplicate elements in this stream.
darrenfu commented
2-heap (min, max) based solution.
The difference from 295. Find Median from Data Stream is:
- uses 2 heaps left-(max)heap
lh
and right-(min)heaprh
. The key idea is the maintain the size invariance of the heaps as we add and remove elements. The top of both the heaps can be used to calculate the median. - We use lazy-deletion from the heap
- using the first k elements construct a min heap
lh
. Then popk-k/2
and add it to therh
. Now the heap sized are set atk/2
andk-k/2
- Iterate over rest of the numbers and add it to the appropriate heap and maintain heap size invariance by moving the top number from one heap to another as needed.
My notes:
- We need to maintain two integer variables to record how many out-of-window elements in each heap, and always balance the two heap sizes, ensuring:
| (L(heap1)-N1(out-of-window)) - (L(heap2)-N2(out-of-window)) | <= 1
- We also need to store every value with its index to the heaps for removal purpose for the element out of sliding window.
- The removal may happen after adding new element into the heap. Keep checking whether any heap top index is out-of-window, pop the heap, repeat this checking process until there is no out-of-window elements in both heaps. Meanwhile, ensure Criteria 1 is met: heap size is balanced.
Time complexity: O(N logK)
Space complexity: O(K) (at most 2*K)
For more details, refer to this post and this post
Besides maintaining two heaps, we can also use insertion-sort or BST (multiset in C++) to solve this problem.