roadmap/arrays-hashing/Encode and Decode Strings

Encode and Decode Strings

MediumGoogleFacebookAmazon
Solve on LeetCode ↗

Design an algorithm to encode a list of strings to a single string. The encoded string is then sent over the network and decoded back to the original list of strings.

The challenge: any character (including delimiters) can appear in the strings. The trick is length-prefixed encoding: store each string as its_length + '#' + the_string itself. The '#' just marks where the length ends — even if '#' appears in the string, we use the length to skip exactly the right number of characters.

1. Length prefix encoding

Encode each string as len(s) + '#' + s. Join all encoded strings. The '#' delimiter is unambiguous because we always read the exact number of characters specified by the length.

2. Decode by scanning

Find the '#' to get the length. Read exactly that many characters. Move past them. Repeat.

Time
O(n)
Space
O(n)

n = total characters across all strings. Encode and decode each character once.

Python
def encode(self, strs: List[str]) -> str:
    return ''.join(f'{len(s)}#{s}' for s in strs)

def decode(self, s: str) -> List[str]:
    result = []
    i = 0
    while i < len(s):
        j = s.index('#', i)       # find the '#'
        length = int(s[i:j])      # parse the length
        result.append(s[j+1 : j+1+length])  # read exactly 'length' chars
        i = j + 1 + length        # advance past this string
    return result
Empty string in list: encodes as '0#'
String containing '#': length prefix handles it correctly
Empty list: encodes to empty string
Progress is saved on this device