[Leetcode]1508. Range Sum of Sorted Subarray Sums

Given the array nums consisting of n positive integers. You computed the sum of all non-empty continous subarrays from the array and then sort them in non-decreasing order, creating a new array of n * (n + 1) / 2 numbers.
Return the sum of the numbers from index left to index right (indexed from 1), inclusive, in the new array. Since the answer can be a huge number return it modulo 10^9 + 7.
Example:
Input: nums = [1,2,3,4], n = 4, left = 1, right = 5 Output: 13 Explanation: All subarray sums are 1, 3, 6, 10, 2, 5, 9, 3, 7, 4. After sorting them in non-decreasing order we have the new array [1, 2, 3, 3, 4, 5, 6, 7, 9, 10]. The sum of the numbers from index le = 1 to ri = 5 is 1 + 2 + 3 + 3 + 4 = 13.

This question first appeared in Biweekly Leetcode Contest 30. Its brute force approach is quite straightforward and also is able to pass the test cases. However, there are another two efficient solutions and, especially, the third one improves the time complexity from O(n^2logn) to O(nlog(sum(nums))).

Brute force – O(n^2logn)

Follow the context and ‘translate’ it into Python: use a nested for-loop to find the summation of all subarrays and return [left:right]. Computing sums for all subarrays takes O(n^2) and sorting them takes O((n^2)log((n^2))), which gives O(n^2logn).

class Solution:
    def rangeSum(self, nums: List[int], n: int, left: int, right: int) -> int:
        sums = []
        n = len(nums)
        for i in range(n):
            s = 0
            for j in range(i,n):
                s += nums[j]
                sums.append(s)
        return sum(sorted(sums)[left-1:right]) % (10**9 + 7)

Optimisation 1 – Priority queue – O(n^2logn)

Well, we don’t have to compute sums for all subarrays, and this approach saves time by computing sums for only top right subarrays. Although this approach still has O(n^2logn) as the worst cases time complexity.

To find the sum for one subarray starting with nums[a], we could keep adding the next element to it so sums for the subarrays starting with nums[a] would be: nums[a],nums[a]+nums[a+1],....With that being said, we could use a min-heap to store (cur_sum,next) where cur_sum is the sum for nums[a]+...+nums[cur] and next is the index of the next element to be added. Therefore we could generate the sums for subarrays in increasing order, and we would stop until we have right sums.

Use a min-heap to store the sums for subarrays.
class Solution:
    def rangeSum(self, nums: List[int], n: int, left: int, right: int) -> int:
        pq = []
        ret = 0
        for i in range(n):
            heapq.heappush(pq,(nums[i],i+1))
        for i in range(1,right+1):
            cur_sum,_next = heapq.heappop(pq)
            if i >= left:
                ret += cur_sum
            if _next<n:
                heapq.heappush(pq,(cur_sum+nums[_next],_next+1))
        return ret % (10**9 + 7)

Optimisation 2 – Binary search + sliding window – O(nlog(sum(nums)))

The previous two approaches calculate sums for subarrays which takes O(n^2), and this approach uses a sliding window to reduce it into n credit to @HuaHua.

We will look into this approach through two steps:

  1. Define a function count_arrays(t) so that given a target t, it returns the number of subarrays with sums smaller than t – sliding window with O(n).
  2. Apply binary search with [1,sum(nums)] as the search space to find the target t_min and t_max such that there are left subarrays with sums smaller than t_min and right subarrays with sums smaller than t_max. The difference between them is the answer – O(sum(nums)).

Define count_arrays(t)

We use a sliding window with two pointers pointer_l and pointer_r and constrain that sum(nums[pointer_l...pointer_r]) is smaller or equal to t. Each time we move the pointer_r, we would have pointer_r-pointer_l+1 new subarrays ending with nums[pointer_r]. If the constraint still holds, then all new subarrays are qualified; or if not, we can move pointer_l until the constraint is again satisfied.

Use a sliding window to count subarrays that are summed smaller than t.

While we slide the window, we would also compute the sums for all subarrays along the way. We can observe that, say we have sum for the current window [pointer_l,pointer_r-1] as sub_sum, when we reach pointer_r, the sums for all subarrays within the current window is sub_sum+(pointer_r-pointer_l)*nums[r] as shown below:

Increase by len(window)*nums[pointer_r]

At the same time, if we move from [pointer_l-1,pointer_r] to [pointer_l,pointer_r], we would lose sum(nums[[pointer_l-1,pointer_r]):

Decrease by sum(window)

Putting these altogether, we have:

        def count_arrays(t):
            l = r = s_window = 0
            cnt = sum_arrays = sub_sum = 0
            while r < n:
                s_window += nums[r]
                sub_sum += (r-l+1)*nums[r]
                while s_window > t:
                    sub_sum -= s_window
                    s_window -= nums[l]
                    l += 1
                cnt += r - l + 1
                sum_arrays += sub_sum
                r += 1
            return cnt,sum_arrays

Binary search

Apply binary search until we locate the top t subarrays. Note that, there might be repeated sums so we need to remove the extra (cnt-t)*l from the s.

        def binary_search(l,r,t):
            while l < r:
                mid = (l+r)//2
                cnt,sum_arrays = count_arrays(mid)
                if cnt >= t:
                    r = mid
                else:
                    l = mid+1
            cnt,s = count_arrays(l)
            # might be same values, skip them
            return s-(cnt-t)*l

Finally, we can put them together:

class Solution:
    def rangeSum(self, nums: List[int], n: int, left: int, right: int) -> int:
        def count_arrays(t):
            l = r = s_window = 0
            cnt = sum_arrays = sub_sum = 0
            while r < n:
                s_window += nums[r]
                sub_sum += (r-l+1)*nums[r]
                while s_window > t:
                    sub_sum -= s_window
                    s_window -= nums[l]
                    l += 1
                cnt += r - l + 1
                sum_arrays += sub_sum
                r += 1
            return cnt,sum_arrays
        def binary_search(l,r,t):
            while l < r:
                mid = (l+r)//2
                cnt,sum_arrays = count_arrays(mid)
                if cnt >= t:
                    r = mid
                else:
                    l = mid+1
            cnt,s = count_arrays(l)
            # might be same values, skip them
            return s-(cnt-t)*l
        total = sum(nums)
        sum_arrays_1 = binary_search(min(nums),total,left-1)
        sum_arrays_2 = binary_search(min(nums),total,right)
        return (sum_arrays_2 - sum_arrays_1) % (10**9 + 7)

Reference

https://zxi.mytechroad.com/blog/queue/leetcode-1508-range-sum-of-sorted-subarray-sums/

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments