Math and Combinatorics Patterns
Mathematical and combinatorial problems appear frequently in competitive programming and algorithm design. They often hide behind constraints that look like brute-force problems but require algebraic shortcuts to meet time limits. This reference covers eight core pattern families with recognition signals, templates, and pitfalls.
Pattern Recognition Table
| Trigger Signals | Technique | Typical Complexity |
|---|---|---|
| "answer modulo 10^9+7", large product/sum | Modular Arithmetic | O(1) per operation |
| "count primes up to N", "smallest factor" | Sieve of Eratosthenes | O(N log log N) |
| "greatest common divisor", "least common multiple" | GCD / LCM | O(log min(a,b)) |
| "how many ways to choose", "combinations mod p" | Binomial Coefficients | O(N) precompute, O(1) query |
| "compute a^b mod m", "matrix recurrence" | Fast Exponentiation | O(log b) |
| "count arrangements", "distribute items into groups" | Counting Techniques | Varies |
| "count elements satisfying at least one", "derangements" | Inclusion-Exclusion | O(2^k) for k constraints |
| "two players, optimal play", "who wins" | Game Theory (Sprague-Grundy) | Varies by state space |
Constraint-to-Technique Mapping
- N up to 10^7 and asks about primes: Sieve of Eratosthenes.
- N up to 10^6 and asks "how many ways": Precomputed factorials + modular inverse for nCr.
- "modulo 10^9+7" anywhere in the problem: Modular arithmetic throughout; use modular inverse for division.
- Two players, finite game, no randomness: Sprague-Grundy theorem or direct DP on game states.
- "count subsets satisfying property" with small constraint count (k <= 20): Inclusion-exclusion over 2^k subsets.
- Linear recurrence with large N (10^18): Matrix exponentiation to compute nth term in O(k^3 log N).
- "distribute N items into K groups": Stars and bars, possibly combined with inclusion-exclusion for bounded constraints.
Individual Patterns
Modular Arithmetic
Recognition Signals
- Problem says "answer modulo 10^9+7" or any large prime
- Intermediate values overflow 64-bit integers
- Division is required under a modular context
- Counting problem with astronomically large answers
Core Idea
All arithmetic operations (add, subtract, multiply) distribute over modulo. Division under a prime modulus uses modular inverse: a / b mod p = a * b^(p-2) mod p via Fermat's little theorem. Always reduce intermediate results to prevent overflow.
Python Template
MOD = 10**9 + 7
def mod_add(a: int, b: int) -> int:
return (a + b) % MOD
def mod_mul(a: int, b: int) -> int:
return (a * b) % MOD
def mod_pow(base: int, exp: int, mod: int = MOD) -> int:
result = 1
base %= mod
while exp > 0:
if exp & 1:
result = result * base % mod
exp >>= 1
base = base * base % mod
return result
def mod_inv(a: int, mod: int = MOD) -> int:
"""Modular inverse via Fermat's little theorem. Requires mod is prime."""
return mod_pow(a, mod - 2, mod)
def mod_div(a: int, b: int) -> int:
return mod_mul(a, mod_inv(b))
Key Edge Cases
- Subtraction can go negative: use
(a - b + MOD) % MOD - Zero divisor:
mod_inv(0)is undefined; guard against it - Non-prime modulus: Fermat's theorem does not apply; use extended GCD instead
- Chained multiplications: reduce after every multiply, not just at the end
Common Mistakes
- Forgetting to take mod after subtraction, producing negative values
- Using Fermat's inverse with a composite modulus
- Applying mod to intermediate index calculations where exact values are needed
Sieve of Eratosthenes
Recognition Signals
- "Find all primes up to N" with N up to 10^7
- "Smallest prime factor of every number"
- Need to factorize many numbers efficiently
- "Count of prime factors" across a range
Core Idea
The classic sieve marks composites by iterating through multiples of each prime. The smallest-prime-factor (SPF) variant stores the smallest divisor at each index, enabling O(log N) factorization of any number after an O(N log log N) precompute. For ranges beyond memory, use a segmented sieve.
Python Template
def sieve_primes(n: int) -> list[bool]:
"""Standard boolean sieve. is_prime[i] is True if i is prime."""
is_prime = [True] * (n + 1)
is_prime[0] = is_prime[1] = False
for i in range(2, int(n**0.5) + 1):
if is_prime[i]:
for j in range(i * i, n + 1, i):
is_prime[j] = False
return is_prime
def smallest_prime_factor(n: int) -> list[int]:
"""SPF sieve for fast factorization."""
spf = list(range(n + 1))
for i in range(2, int(n**0.5) + 1):
if spf[i] == i: # i is prime
for j in range(i * i, n + 1, i):
if spf[j] == j:
spf[j] = i
return spf
def factorize(x: int, spf: list[int]) -> list[int]:
"""Factorize x using precomputed SPF table."""
factors: list[int] = []
while x > 1:
factors.append(spf[x])
x //= spf[x]
return factors
Key Edge Cases
- N = 0 or 1: no primes exist; handle boundary
- Memory limit: standard sieve needs O(N) space; switch to segmented sieve above ~10^8
- SPF of 1: not defined; skip or handle separately
- Even numbers: can optimize by treating 2 specially and only sieving odds
Common Mistakes
- Starting inner loop at
2*iinstead ofi*i(correct but slower) - Off-by-one on sieve size (allocate
n+1slots) - Forgetting that
sieve[0]andsieve[1]are not prime
GCD / LCM
Recognition Signals
- Problem explicitly mentions GCD or LCM
- "Find the largest number that divides all elements"
- "Smallest number divisible by all elements"
- Equations involving linear combinations of two integers (Bezout's identity)
Core Idea
Euclidean algorithm computes GCD in O(log min(a,b)). LCM derives from GCD: lcm(a,b) = a * b // gcd(a,b). Extended Euclidean finds coefficients x, y such that a*x + b*y = gcd(a,b), which is essential for modular inverse when the modulus is not prime.
Python Template
from math import gcd
from functools import reduce
def lcm(a: int, b: int) -> int:
return a * b // gcd(a, b)
def lcm_array(arr: list[int]) -> int:
return reduce(lcm, arr)
def gcd_array(arr: list[int]) -> int:
return reduce(gcd, arr)
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
"""Returns (g, x, y) such that a*x + b*y = g = gcd(a, b)."""
if b == 0:
return a, 1, 0
g, x1, y1 = extended_gcd(b, a % b)
return g, y1, x1 - (a // b) * y1
def mod_inv_ext(a: int, m: int) -> int:
"""Modular inverse via extended GCD. Works for any coprime a, m."""
g, x, _ = extended_gcd(a % m, m)
if g != 1:
raise ValueError(f"Inverse does not exist: gcd({a}, {m}) = {g}")
return x % m
Key Edge Cases
gcd(0, x) = xandgcd(0, 0) = 0by convention- LCM overflow: compute
a // gcd(a,b) * binstead ofa * b // gcd(a,b) - Negative inputs: take absolute values before computing
- Extended GCD with
b = 0: base case returns(a, 1, 0)
Common Mistakes
- LCM overflow from multiplying before dividing by GCD
- Assuming modular inverse exists when
gcd(a, m) != 1 - Recursion depth for very large inputs (Python's default limit); use iterative version if needed
Binomial Coefficients
Recognition Signals
- "How many ways to choose k items from n"
- "nCr mod p" with large n
- Pascal's triangle or binomial theorem mentioned
- Lattice path counting (grid movement problems)
Core Idea
For small n (up to ~10^6), precompute factorials and inverse factorials modulo a prime, then answer nCr queries in O(1). Pascal's triangle DP works for smaller n without modular arithmetic. Lucas' theorem handles nCr mod p when n is very large but p is small.
Python Template
MOD = 10**9 + 7
def precompute_factorials(n: int) -> tuple[list[in