mathguard — Math-Heavy Optimization for AI Code
lemmaly makes you pick the right classical algorithm. mathguard kicks in when the classical algorithm is already optimal but mathematics gives a better bound — usually by accepting bounded approximation, exploiting structure, or moving to a smarter algebraic space.
The model knows these techniques. It almost never proposes them spontaneously. mathguard fixes that.
Violating the letter of these rules is violating the spirit of the skill. A Bloom filter where the caller assumed exact answers is a production incident, not an optimization.
When to Use This Skill
Use mathguard when:
- Working with large-scale data (
n ≥ 10⁶): similarity search, deduplication, top-K / heavy-hitters, streaming analytics, cardinality estimation, embeddings, recommender systems. - Doing signal/image processing, polynomial or big-integer arithmetic, convolution, graph distance, computational geometry, randomized algorithms.
- The classical O(n log n) is already the floor and you need an asymptotic win (Bloom filter, HyperLogLog, Count-Min Sketch, MinHash/LSH, FFT/NTT, Johnson-Lindenstrauss projection, sweep line, kd-tree/BVH, fast exponentiation, monoid parallel reduction, amortized potential method).
- Loaded after
lemmalyhas confirmed the classical answer is not enough.
Do not use mathguard when:
- The caller needs exact answers (auth, billing, dedup-for-correctness, primary keys).
nis small (n < 10⁴) and the path is not hot.- The bottleneck is I/O, not CPU/memory.
The Iron Law
NO APPROXIMATE STRUCTURE WITHOUT WRITTEN ε/δ AND EXPLICIT CALLER ACCEPTANCE
Probabilistic data structures (Bloom, HyperLogLog, Count-Min, MinHash/LSH, t-digest), randomized projections (JL), and lossy transforms (floating FFT) all change the answer's meaning. Before proposing one:
- Write the error parameter the caller will see (false-positive rate, relative error, distortion bound).
- Identify the caller and state, in one sentence, that they tolerate this kind of wrong answer.
- If you cannot identify the caller, or they need exact (auth checks, billing, dedup keys, deduplication for correctness, anything that flows into a primary key), DO NOT propose the approximate structure. Keep classical, or escalate to a sharded/streaming exact design.
This rule has saved more incidents than any other in this skill. Do not soften it.
Non-negotiable rules
-
Declare exact vs approximate up front. Before suggesting a math-level technique, state:
mode: exactormode: approximate- If approximate: the error parameter (ε, δ, false-positive rate) and a sentence on whether the caller can tolerate it.
- If the caller needs exact and there is no exact win, say so and stop — do not silently degrade to approximate.
-
Cite the technique by name. Never describe a probabilistic or numerical trick in vague terms. Name it:
Bloom filter,HyperLogLog,Count-Min Sketch,MinHash + LSH,Johnson–Lindenstrauss projection,FFT,NTT,fast exponentiation,Karatsuba,Strassen,sweep line,kd-tree,BVH,union-find with path compression,Floyd's cycle detection,Boyer-Moore majority,reservoir sampling,Knuth shuffle,Aho-Corasick,suffix automaton,segment tree with lazy propagation,Fenwick tree,monoid scan / parallel prefix. A named technique is auditable; "a smart approximation" is not. -
State the trade you are making. Every math-level optimization buys something at a cost. In one line:
- Buys:
space,time,wall-clock,parallelism. - Costs:
accuracy ε=?,code complexity,dependency,non-determinism,numerical stability. - If the cost is invisible to the caller, write "callers see no change".
- Buys:
-
Justify the asymptotic win. Do not propose a math technique without a one-line bound argument:
- "HyperLogLog: count uniques in O(log log n) bits at standard error 1.04/√m."
- "FFT: polynomial multiplication O(n log n) vs schoolbook O(n²)."
- "JL projection: preserves pairwise distances within (1±ε) using O(log n / ε²) dimensions."
- "Sweep line: rectangle overlap from O(n²) pair checks to O(n log n) events." No bound, no proposal.
-
Forbid math cargo-culting. Do not introduce these techniques when:
- n is small enough that a linear scan finishes in microseconds (n < ~10⁴ unless it is a hot path).
- The problem is I/O-bound — the math win disappears behind network/disk.
- Exact answers are required and no exact technique exists.
- The team will not maintain it (write that down: "team familiarity: ?").
The pre-proposal protocol
Before suggesting a math-level technique, your message must contain — in this order:
- The classical floor — what is the best non-mathy algorithm and its Big-O? ("Hash join is O(n+m); we're already there.")
- Why classical is not enough — n too large, space blows up, real-time deadline, etc.
- The math technique — named (rule 2).
- Exact or approximate — with ε if approximate (rule 1).
- The new bound — with one-line derivation (rule 4).
- The trade — buys/costs (rule 3).
- When NOT to use this — at least one disqualifier.
- The code or pseudocode.
If any of 1–7 is missing, do not propose the technique.
Playbook — math technique → problem → win → caveat
Sketches and probabilistic structures (massive data, approximate)
| Problem | Classical | Math technique | Win | Caveat |
|---|---|---|---|---|
| Membership: "have I seen this key?" at scale | Set<id>, O(n) space | Bloom filter | O(n) bits at chosen ε false-positive | False positives only; cannot remove (use Cuckoo if needed) |
| Count distinct values in a stream | Set to count, O(unique) space | HyperLogLog | O(log log n) bits, ~1% relative error | Approximate; cannot list elements |
| Top-K / heavy hitters in a stream | full counter, O(unique) space | Count-Min Sketch + heap | O(log(1/δ)·1/ε) space | Overestimates; choose ε,δ deliberately |
| Document / set similarity at scale | full Jaccard, O(n·m) | MinHash + LSH | Sub-linear ANN query | Tunes recall vs precision; param search |
| k-NN in high-dim vectors | brute O(n·d) | JL projection → HNSW / IVF | O(log n) per query, (1±ε) distortion | Index build cost; recall < 1 |
| Reservoir of size k from a stream of unknown length | buffer all, O(n) space | Reservoir sampling | O(k) space, uniform sample | Single-pass only |
| Find majority element | counter map | Boyer-Moore majority vote | O(1) space, O(n) time | Requires majority exists; verify pass |
| Quantiles in a stream | sort, O(n log n) | t-digest / GK | O(1/ε) space, ε-accurate quantiles | Approximate |
Fast arithmetic / transforms (numeric and combinatorial)
| Problem | Classical | Math technique | Win | Caveat |
|---|---|---|---|---|
| Multiply two polynomials / big integers | O(n²) | FFT / NTT / Karatsuba | O(n log n) | Floating FFT loses precision — use NTT for integers |
| Convolution of two signals | O(n·m) | FFT-based convolution | O((n+m) log(n+m)) | Numerical noise at very small magnitudes |
pow(a, b) mod p, b large | O(b) multiplications | Fast exponentiation (square-and-multiply) | O(log b) | Watch for overflow inside; use modular arithmetic |
| GCD of large integers | repeated subtraction | Euclidean algorithm | O(log min) | Standard; AI sometimes still writes the subtraction loop |
| Matrix multiplication, n large | O(n³) | Strassen (then Coppersmith-Winograd family) | O(n^2.81) | High constant; only wins for very large dense |
| Solving Ax=b for sparse A | O(n³) dense | Conjugate gradient / sparse LU | O(nnz · iterations) | Numerical conditioning matters |
| Modular inverse | brute force | Extended Euclidean or Fermat when p prime | O(log p) | p must be prime for Fermat |
Dimensionality reduction and linear algebra
| Problem | Classical | Math technique |