Data Structure Patterns
This skill provides recognition signals, core ideas, Python templates, and pitfall guidance for seven advanced data structure patterns. Use it when a problem's constraints or access patterns suggest a specialized structure beyond basic arrays, hash maps, or linked lists. Each pattern section follows a consistent format: when to reach for it, how it works, a clean implementation template, and the mistakes that cost time in practice.
Pattern Recognition Table
| Trigger Signals | Technique | Typical Complexity |
|---|---|---|
| "k-th largest/smallest", "top K", "merge K sorted" | Heap / Priority Queue | O(n log k) |
| "next greater/smaller element", "sliding window max/min" | Monotonic Stack / Queue | O(n) |
| "prefix matching", "autocomplete", "word search in grid" | Trie | O(L) per operation |
| "range query + point update", "range min/max/sum" | Segment Tree | O(log n) per query/update |
| "prefix sums with updates", "count of elements less than X" | Fenwick Tree (BIT) | O(log n) per query/update |
| "balanced parentheses", "evaluate expression", "nested structures" | Stack-Based Parsing | O(n) |
| "rank of element", "k-th smallest in dynamic set", "floor/ceiling" | Ordered Set (SortedList) | O(log n) per operation |
Constraint-to-Technique Mapping
When the problem statement does not name a structure directly, use constraints to narrow the choice:
- n <= 10^5 with repeated range queries -- Segment Tree or Fenwick Tree. Prefer Fenwick when only prefix sums are needed; use Segment Tree for arbitrary range operations.
- n <= 10^6 and "next greater" or "span" language -- Monotonic Stack. Linear time is essential at this scale.
- String dictionary with prefix lookups -- Trie. Hash maps work for exact match but not prefix enumeration.
- "Top K" or "k-th element" with streaming data -- Heap. A size-k heap avoids sorting the full dataset.
- Dynamic insertions + rank/order queries -- SortedList. Python lacks a built-in balanced BST; sortedcontainers fills this gap.
- Nested or recursive syntax (math expressions, HTML tags) -- Stack-Based Parsing. The stack mirrors the nesting depth.
Heap / Priority Queue
Recognition Signals
- "Find the k-th largest/smallest element"
- "Merge k sorted lists/arrays"
- "Continuously add elements and query the median or top-K"
- "Schedule tasks by priority"
Core Idea A heap maintains partial order so that the min (or max) element is always accessible in O(1) with O(log n) insertion and extraction. For "top K" problems, maintain a min-heap of size k: every new element either replaces the root or is discarded, guaranteeing O(n log k) total work.
Python Template
import heapq
from typing import List
def top_k_frequent(nums: List[int], k: int) -> List[int]:
from collections import Counter
freq = Counter(nums)
# Min-heap of size k on frequency
return heapq.nlargest(k, freq.keys(), key=freq.get)
def merge_k_sorted(lists: List[List[int]]) -> List[int]:
result: list[int] = []
heap: list[tuple[int, int, int]] = []
for i, lst in enumerate(lists):
if lst:
heapq.heappush(heap, (lst[0], i, 0))
while heap:
val, list_idx, elem_idx = heapq.heappop(heap)
result.append(val)
if elem_idx + 1 < len(lists[list_idx]):
heapq.heappush(heap, (lists[list_idx][elem_idx + 1], list_idx, elem_idx + 1))
return result
Key Edge Cases
- Empty input lists or sublists in merge-k scenarios
- All elements identical (heap still works, but deduplication may be needed)
- k equals n (heap degenerates; sorting may be simpler)
- Negative numbers with max-heap emulation (negate values for
heapq)
Common Mistakes
- Forgetting that Python's
heapqis a min-heap; for max-heap, negate values or useheapq.nlargest - Not including a tiebreaker index in the heap tuple, causing crashes on uncomparable objects
- Using
heapifythen manually inserting withoutheappush(breaks the heap invariant)
Monotonic Stack / Queue
Recognition Signals
- "Next greater element" or "next smaller element"
- "Sliding window maximum/minimum"
- "Largest rectangle in histogram"
- "Daily temperatures / stock span"
Core Idea A monotonic stack maintains elements in strictly increasing or decreasing order. When a new element arrives, pop all elements that violate the monotonic property -- each popped element has found its "next greater" (or "next smaller"). Every element is pushed and popped at most once, yielding O(n) total. For sliding windows, a monotonic deque adds front-eviction to maintain the window boundary.
Python Template
from collections import deque
from typing import List
def next_greater_element(nums: List[int]) -> List[int]:
n = len(nums)
result = [-1] * n
stack: list[int] = [] # indices
for i in range(n):
while stack and nums[stack[-1]] < nums[i]:
result[stack.pop()] = nums[i]
stack.append(i)
return result
def sliding_window_max(nums: List[int], k: int) -> List[int]:
dq: deque[int] = deque() # indices, front = max
result: list[int] = []
for i, val in enumerate(nums):
while dq and dq[0] <= i - k:
dq.popleft()
while dq and nums[dq[-1]] <= val:
dq.pop()
dq.append(i)
if i >= k - 1:
result.append(nums[dq[0]])
return result
Key Edge Cases
- Array with all equal elements (no element is "greater"; result is all -1)
- Single-element array (result is trivially -1)
- Circular array variant (iterate 2n elements using modular indexing)
- Window size k = 1 (output equals input)
Common Mistakes
- Storing values instead of indices on the stack (cannot determine window boundaries)
- Using the wrong comparison direction (< vs. <=) and getting a non-strictly monotonic stack
- Forgetting to handle the circular case by iterating
range(2 * n)
Trie
Recognition Signals
- "Find all words with a given prefix"
- "Word search in a 2D grid with a dictionary"
- "Autocomplete / typeahead suggestions"
- "Maximum XOR of two numbers in an array"
Core Idea A trie stores strings character-by-character in a tree where each edge represents one character. Shared prefixes share nodes, making prefix lookups O(L) regardless of dictionary size. For XOR tries, store numbers bit-by-bit (MSB to LSB) and greedily choose the opposite bit at each level.
Python Template
from typing import Optional
class TrieNode:
__slots__ = ("children", "is_end", "count")
def __init__(self) -> None:
self.children: dict[str, "TrieNode"] = {}
self.is_end: bool = False
self.count: int = 0 # words passing through this node
class Trie:
def __init__(self) -> None:
self.root = TrieNode()
def insert(self, word: str) -> None:
node = self.root
for ch in word:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.count += 1
node.is_end = True
def starts_with(self, prefix: str) -> int:
"""Return count of words with given prefix."""
node = self.root
for ch in prefix:
if ch not in node.children:
return 0
node = node.children[ch]
return node.count
Key Edge Cases
- Empty string insertion (root itself becomes an end node)
- Prefix that is also a complete word (must check
is_endseparately from prefix existence) - Unicode or case-sensitive input (normalize before insertion)
- Deletion requires decrementing counts and pruning childless nodes
Common Mistakes
- Not distinguishing "prefix exists" from "exact word exists" (missing
is_endcheck) - Building a trie for exact-match-only problems where a hash set suffices
- Forgetting
__slots__on trie nodes, leading to high memory usage on