Data Structure Patterns

This skill provides recognition signals, core ideas, Python templates, and pitfall guidance for seven advanced data structure patterns. Use it when a problem's constraints or access patterns suggest a specialized structure beyond basic arrays, hash maps, or linked lists. Each pattern section follows a consistent format: when to reach for it, how it works, a clean implementation template, and the mistakes that cost time in practice.

Pattern Recognition Table

Trigger Signals	Technique	Typical Complexity
"k-th largest/smallest", "top K", "merge K sorted"	Heap / Priority Queue	O(n log k)
"next greater/smaller element", "sliding window max/min"	Monotonic Stack / Queue	O(n)
"prefix matching", "autocomplete", "word search in grid"	Trie	O(L) per operation
"range query + point update", "range min/max/sum"	Segment Tree	O(log n) per query/update
"prefix sums with updates", "count of elements less than X"	Fenwick Tree (BIT)	O(log n) per query/update
"balanced parentheses", "evaluate expression", "nested structures"	Stack-Based Parsing	O(n)
"rank of element", "k-th smallest in dynamic set", "floor/ceiling"	Ordered Set (SortedList)	O(log n) per operation

Constraint-to-Technique Mapping

When the problem statement does not name a structure directly, use constraints to narrow the choice:

n <= 10^5 with repeated range queries -- Segment Tree or Fenwick Tree. Prefer Fenwick when only prefix sums are needed; use Segment Tree for arbitrary range operations.
n <= 10^6 and "next greater" or "span" language -- Monotonic Stack. Linear time is essential at this scale.
String dictionary with prefix lookups -- Trie. Hash maps work for exact match but not prefix enumeration.
"Top K" or "k-th element" with streaming data -- Heap. A size-k heap avoids sorting the full dataset.
Dynamic insertions + rank/order queries -- SortedList. Python lacks a built-in balanced BST; sortedcontainers fills this gap.
Nested or recursive syntax (math expressions, HTML tags) -- Stack-Based Parsing. The stack mirrors the nesting depth.

Heap / Priority Queue

Recognition Signals

"Find the k-th largest/smallest element"
"Merge k sorted lists/arrays"
"Continuously add elements and query the median or top-K"
"Schedule tasks by priority"

Core Idea A heap maintains partial order so that the min (or max) element is always accessible in O(1) with O(log n) insertion and extraction. For "top K" problems, maintain a min-heap of size k: every new element either replaces the root or is discarded, guaranteeing O(n log k) total work.

Python Template

import heapq
from typing import List

def top_k_frequent(nums: List[int], k: int) -> List[int]:
    from collections import Counter
    freq = Counter(nums)
    # Min-heap of size k on frequency
    return heapq.nlargest(k, freq.keys(), key=freq.get)

def merge_k_sorted(lists: List[List[int]]) -> List[int]:
    result: list[int] = []
    heap: list[tuple[int, int, int]] = []
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst[0], i, 0))
    while heap:
        val, list_idx, elem_idx = heapq.heappop(heap)
        result.append(val)
        if elem_idx + 1 < len(lists[list_idx]):
            heapq.heappush(heap, (lists[list_idx][elem_idx + 1], list_idx, elem_idx + 1))
    return result

Key Edge Cases

Empty input lists or sublists in merge-k scenarios
All elements identical (heap still works, but deduplication may be needed)
k equals n (heap degenerates; sorting may be simpler)
Negative numbers with max-heap emulation (negate values for heapq)

Common Mistakes

Forgetting that Python's heapq is a min-heap; for max-heap, negate values or use heapq.nlargest
Not including a tiebreaker index in the heap tuple, causing crashes on uncomparable objects
Using heapify then manually inserting without heappush (breaks the heap invariant)

Monotonic Stack / Queue

Recognition Signals

"Next greater element" or "next smaller element"
"Sliding window maximum/minimum"
"Largest rectangle in histogram"
"Daily temperatures / stock span"

Core Idea A monotonic stack maintains elements in strictly increasing or decreasing order. When a new element arrives, pop all elements that violate the monotonic property -- each popped element has found its "next greater" (or "next smaller"). Every element is pushed and popped at most once, yielding O(n) total. For sliding windows, a monotonic deque adds front-eviction to maintain the window boundary.

Python Template

from collections import deque
from typing import List

def next_greater_element(nums: List[int]) -> List[int]:
    n = len(nums)
    result = [-1] * n
    stack: list[int] = []  # indices
    for i in range(n):
        while stack and nums[stack[-1]] < nums[i]:
            result[stack.pop()] = nums[i]
        stack.append(i)
    return result

def sliding_window_max(nums: List[int], k: int) -> List[int]:
    dq: deque[int] = deque()  # indices, front = max
    result: list[int] = []
    for i, val in enumerate(nums):
        while dq and dq[0] <= i - k:
            dq.popleft()
        while dq and nums[dq[-1]] <= val:
            dq.pop()
        dq.append(i)
        if i >= k - 1:
            result.append(nums[dq[0]])
    return result

Key Edge Cases

Array with all equal elements (no element is "greater"; result is all -1)
Single-element array (result is trivially -1)
Circular array variant (iterate 2n elements using modular indexing)
Window size k = 1 (output equals input)

Common Mistakes

Storing values instead of indices on the stack (cannot determine window boundaries)
Using the wrong comparison direction (< vs. <=) and getting a non-strictly monotonic stack
Forgetting to handle the circular case by iterating range(2 * n)

Trie

Recognition Signals

"Find all words with a given prefix"
"Word search in a 2D grid with a dictionary"
"Autocomplete / typeahead suggestions"
"Maximum XOR of two numbers in an array"

Core Idea A trie stores strings character-by-character in a tree where each edge represents one character. Shared prefixes share nodes, making prefix lookups O(L) regardless of dictionary size. For XOR tries, store numbers bit-by-bit (MSB to LSB) and greedily choose the opposite bit at each level.

Python Template

from typing import Optional

class TrieNode:
    __slots__ = ("children", "is_end", "count")
    def __init__(self) -> None:
        self.children: dict[str, "TrieNode"] = {}
        self.is_end: bool = False
        self.count: int = 0  # words passing through this node

class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
            node.count += 1
        node.is_end = True

    def starts_with(self, prefix: str) -> int:
        """Return count of words with given prefix."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return 0
            node = node.children[ch]
        return node.count

Key Edge Cases

Empty string insertion (root itself becomes an end node)
Prefix that is also a complete word (must check is_end separately from prefix existence)
Unicode or case-sensitive input (normalize before insertion)
Deletion requires decrementing counts and pruning childless nodes

Common Mistakes

Not distinguishing "prefix exists" from "exact word exists" (missing is_end check)
Building a trie for exact-match-only problems where a hash set suffices
Forgetting __slots__ on trie nodes, leading to high memory usage on

data-structures

How to add

Drop this on your repo README

Related skills

claude-api

skill-creator

oh-my-issues

claude-mem

Get new Desenvolvimento skills every Monday

Data Structure Patterns

Pattern Recognition Table

Constraint-to-Technique Mapping

Heap / Priority Queue

Monotonic Stack / Queue

Trie

Comments · No comments