[Leetcode]340. Longest Substring with At Most K Distinct Characters

Given a string, find the length of the longest substring T that contains at most k distinct characters.

This question could be solved with two pointers and more importantly, using the Ordered Dictionary would be able to achieve O(n) time complexity.

Two pointers

With two pointers i and j, we can explore the string using j while marking the left boundary with i such that s[i..j] contains at most k distance characters. We would record the rightmost position for each character in s[i..j] using a hashmap and, when the number of distinct characters exceeds k, i.e., becomes k+1, we need to find the leftmost character in the hashmap.

When we have 5 distinct characters, we need to put the left pointer to the leftmost of all, which is ‘O’.
Graph source: https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/solution/

The time complexity would be O(nk) as for each character we would search the hashmap of size O(k).

class Solution:
    def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
        n = len(s)
        if not k or not n:
            return 0
        l = r = ret = 0
        hashmap = {}
        while r < n:
            hashmap[s[r]] = r
            r += 1
            if len(hashmap) > k:
                leftmost = n
                for c in hashmap:
                    leftmost = min(leftmost,hashmap[c])
                l = leftmost+1
                del hashmap[s[leftmost]]
            ret = max(ret,r-l)
        return ret

Optimization with the Ordered Dictionary

However, to achieve O(n) time complexity, we need the Ordered Dictionary. It remembers the order of insertion/deletion and therefore allows us to find the first/last inserted item in O(1) time. More info about the Ordered Dictionary could be found at:

Hence, we can let the order in the ordered dictionary to be the relative positions of each item – when we meet a duplicated character, we could update its value by deleting and re-inserting, thus the first item in it is always the leftmost one.

class Solution:
    def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
        n = len(s)
        if not k or not n:
            return 0
        l = r = ret = 0
        hashmap = collections.OrderedDict()
        while r < n:
            if s[r] in hashmap:
                del hashmap[s[r]]
            hashmap[s[r]] = r
            r += 1
            if len(hashmap) > k:
                _,del_idx = hashmap.popitem(last=False)
                l = del_idx+1
            ret = max(ret,r-l)
        return ret

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments