darrenfu/LeetcodePy

Snap II

Closed this issue · 1 comments

12/12/2019

480. Sliding Window Median

But the interface is borrowed from 295. Find Median from Data Stream. You are asked to implement two methods:

class MedianFinder:

    def __init__(self, k: int):
        """
        initialize your data structure here.
        k is the sliding window size
        """

    def addNum(self, num: int) -> None:

    def findMedian(self) -> float:

Note:

  1. The sliding window elements can be put inside the memory.
  2. There could be duplicate elements in this stream.

2-heap (min, max) based solution.
The difference from 295. Find Median from Data Stream is:

  1. uses 2 heaps left-(max)heap lh and right-(min)heap rh. The key idea is the maintain the size invariance of the heaps as we add and remove elements. The top of both the heaps can be used to calculate the median.
  2. We use lazy-deletion from the heap
  3. using the first k elements construct a min heap lh. Then pop k-k/2 and add it to the rh. Now the heap sized are set at k/2 and k-k/2
  4. Iterate over rest of the numbers and add it to the appropriate heap and maintain heap size invariance by moving the top number from one heap to another as needed.

My notes:

  1. We need to maintain two integer variables to record how many out-of-window elements in each heap, and always balance the two heap sizes, ensuring:
    | (L(heap1)-N1(out-of-window)) - (L(heap2)-N2(out-of-window)) | <= 1
  2. We also need to store every value with its index to the heaps for removal purpose for the element out of sliding window.
  3. The removal may happen after adding new element into the heap. Keep checking whether any heap top index is out-of-window, pop the heap, repeat this checking process until there is no out-of-window elements in both heaps. Meanwhile, ensure Criteria 1 is met: heap size is balanced.

Time complexity: O(N logK)
Space complexity: O(K) (at most 2*K)

For more details, refer to this post and this post

Besides maintaining two heaps, we can also use insertion-sort or BST (multiset in C++) to solve this problem.