Product Analytics Setup
A senior PM and analyst's playbook for instrumenting product analytics correctly the first time.
Most product analytics setups are some combination of inherited mistakes, dashboard sprawl, and events nobody trusts. The team launches a new feature; instrumentation gets bolted on under deadline pressure; naming drifts; properties are inconsistent; six months later nobody can answer simple questions because the answer depends on which event you trust.
This skill is the discipline that prevents that. It assumes you have answered the strategic questions about what to measure (see analytics-strategy). It assumes you have a tool connected (Mixpanel, Heap, PostHog, Amplitude, or warehouse-native via BigQuery, Snowflake, or dbt). The hard part is the systematic execution: naming conventions, property design, schema versioning, funnel construction, cohort definitions, retention measurement.
When to use this skill: setting up product analytics from scratch, auditing existing instrumentation, fixing a "we have data but cannot trust it" problem, or designing instrumentation for a new feature.
What this skill is for
This skill covers instrumentation execution. It does not cover measurement strategy (use analytics-strategy), experimentation result interpretation (use experimentation-analytics), paid media analytics (use ads-performance-analytics), or platform decisions (use experimentation-platform-orchestrator). Pair this skill with the relevant integrations microsite for your specific tool.
The clean distinction from analytics-strategy. That skill (Growth category) is strategic: what to measure and why, KPI hierarchy, dashboard architecture, attribution models. This skill (Product category) is execution: how to actually instrument the product correctly. The two compose. Read analytics-strategy first to decide what matters; read this skill to instrument it.
The instrumentation hierarchy
The mental model. Every analytics setup is a stack of layers. Each layer depends on the one below it being correct.
- Events are the atomic facts: user_signed_up, checkout_completed, feature_x_used.
- Properties describe events: who, what, where, when, with what context.
- Identities map events to people: anonymous_id, user_id, account_id.
- Cohorts are filters across events: "users acquired via paid in March."
- Funnels are sequences of events: signup, then activated, then first paid action.
- Retention measures repeat behavior: signups still active at week N.
You cannot construct higher levels without correct lower levels. Garbage events produce garbage funnels. The discipline is bottom-up. Most "we have data but cannot trust it" problems trace back to the bottom two layers.
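A minimal sketch of that dependency, assuming an in-memory list of events rather than any particular tool. The Event class, the funnel_counts helper, and the user_activated event name are hypothetical, chosen only for illustration. One drifted event name at the bottom layer is enough to drop a user out of the funnel.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    name: str            # events layer: the atomic fact
    user_id: str         # identities layer: who the fact belongs to
    timestamp: datetime
    properties: dict = field(default_factory=dict)  # properties layer: context


def funnel_counts(events: list[Event], steps: list[str]) -> list[int]:
    """Count users who completed each funnel step, in order."""
    progress: dict[str, int] = {}  # user_id -> number of steps completed so far
    for event in sorted(events, key=lambda e: e.timestamp):
        done = progress.get(event.user_id, 0)
        if done < len(steps) and event.name == steps[done]:
            progress[event.user_id] = done + 1
    return [sum(1 for done in progress.values() if done > i) for i in range(len(steps))]


def at(minute: int) -> datetime:
    return datetime(2024, 3, 1, 12, minute, tzinfo=timezone.utc)


events = [
    Event("user_signed_up", "u1", at(0)),
    Event("user_activated", "u1", at(5)),
    Event("user_signedUp", "u2", at(1)),   # naming drift at the events layer
    Event("user_activated", "u2", at(6)),
]
counts = funnel_counts(events, ["user_signed_up", "user_activated"])
print(counts)
```

Two users signed up and activated, but the funnel reports one at each step, because u2's signup was recorded under a different name. The funnel logic is correct; the event layer is not.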
Event taxonomy design
Three rules for event design.
- Past tense, action-oriented. checkout_completed, not checkout_complete or completing_checkout. Past tense reads as "this happened" rather than as a state.
- Object-action format. Noun then verb. video_played, form_submitted, email_opened. Reading the event name aloud should describe what happened.
- Granular but not redundant. Track distinct user actions, not button clicks. Fire checkout_completed once at the moment of completion, not submit_button_clicked plus checkout_completed. UI events are noise; semantic events are signal.
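The three rules can be approximated with a small name check. This is a heuristic sketch, not a complete linter: the APPROVED_VERBS and UI_NOISE_VERBS sets are assumptions a team would maintain itself, and the check only confirms that some token after the object is an approved past-tense verb.

```python
import re

# At least two lowercase tokens joined by underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$")

# Hypothetical, team-maintained lists. Extend them as the taxonomy grows.
APPROVED_VERBS = {"signed", "completed", "played", "submitted", "opened",
                  "canceled", "upgraded", "used", "viewed"}
UI_NOISE_VERBS = {"clicked", "tapped", "hovered"}


def check_event_name(name: str) -> list[str]:
    """Return a list of problems; an empty list means the name passes."""
    if not SNAKE_CASE.match(name):
        return ["not in snake_case object_action form"]
    tokens = name.split("_")
    # The object comes first, so only later tokens may be the verb.
    verb_candidates = set(tokens[1:])
    if verb_candidates & UI_NOISE_VERBS:
        return ["looks like a UI event; track the semantic action instead"]
    if not verb_candidates & APPROVED_VERBS:
        return ["no approved past-tense verb after the object"]
    return []


for candidate in ["checkout_completed", "user_signed_up", "checkout_complete",
                  "completing_checkout", "submit_button_clicked"]:
    print(candidate, check_event_name(candidate))
```

An explicit verb list is used instead of an "ends in -ed" test because names like user_signed_up carry the verb in the middle and irregular verbs would slip through a suffix check.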
The verbs vs states trap.
- Verbs ARE events. checkout_completed, subscription_canceled, account_upgraded.
- States are NOT events; they are properties. user_status: active is a property on the user, not an event. Setting state via events ("status_changed_to_active") is a code smell that produces double-counting.
How many events to design. Thirty to fifty events is the sweet spot for a typical SaaS product. Below twenty means under-instrumented; above one hundred almost always means tracking UI noise or duplicating events in different formats.
Detail and a canonical event spec in references/event-taxonomy-template.md.
Property design: event-level vs user-level
Two property types, treated separately.
Event-level properties describe THIS event. The checkout_completed event has properties like cart_value, item_count, payment_method, discount_code. They live on the event payload and are immutable once fired.
User-level properties describe the USER over time. subscription_tier, lifetime_value, acquisition_channel. Set them once on the user profile; the analytics tool joins them onto every event the user fires. They update over time as the user changes.
The trap. Putting user-level properties on every event. Do not track subscription_tier on every event payload; set it once on the user profile and rely on the join. Putting it on the event creates payload bloat, schema drift when the value changes, and reporting confusion when a user upgrades mid-session.
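A minimal sketch of the split, assuming an in-memory store in place of a real analytics tool. The function names (set_user_properties, track, events_by_user_property) are hypothetical and do not correspond to any vendor SDK. Event payloads carry only event-level properties; the user-level property is joined in at query time.

```python
from collections import Counter

user_profiles: dict[str, dict] = {}   # user-level properties, one record per user
events: list[dict] = []               # event-level properties, one record per event


def set_user_properties(user_id: str, **props) -> None:
    """Set or update properties on the user profile."""
    user_profiles.setdefault(user_id, {}).update(props)


def track(user_id: str, name: str, **props) -> None:
    """Record an event with event-level properties only."""
    events.append({"user_id": user_id, "event": name, "properties": dict(props)})


def events_by_user_property(event_name: str, user_property: str) -> dict[str, int]:
    """Break down an event count by a user-level property via a join."""
    counts: Counter = Counter()
    for event in events:
        if event["event"] == event_name:
            profile = user_profiles.get(event["user_id"], {})
            counts[profile.get(user_property, "unknown")] += 1
    return dict(counts)


set_user_properties("u1", subscription_tier="pro")
set_user_properties("u2", subscription_tier="free")
track("u1", "checkout_completed", cart_value=49.0, item_count=2)
track("u2", "checkout_completed", cart_value=12.5, item_count=1)

breakdown = events_by_user_property("checkout_completed", "subscription_tier")
print(breakdown)
```

This sketch joins the current profile value. Tools differ on whether the join uses the current value or the value at the time the event fired, so confirm the behavior of your tool before relying on historical breakdowns.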
Data type discipline.
- Strings for enums: status, tier, channel, region. Enumerable values where the set is bounded.
- Numbers only for actual numbers: count, value, duration, score. Never use strings for numeric data ("free trial day 7" should be trial_day: 7).
- Booleans for actual booleans: is_admin, has_trial, is_new_user. Two values; nothing else.
- Timestamps in ISO 8601, always. Always. The number of bugs caused by inconsistent date formats is uncountable.
- Arrays rarely. An array property is usually a sign you should split into multiple events with one item per event.
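The type rules above can be enforced before an event leaves the client. A sketch under stated assumptions: the PROPERTY_TYPES mapping is a hypothetical schema, and occurred_at is an illustrative property name that does not appear elsewhere in this skill.

```python
from datetime import datetime

# Hypothetical schema: property name -> expected kind.
PROPERTY_TYPES = {
    "payment_method": "string",
    "cart_value": "number",
    "trial_day": "number",
    "is_new_user": "boolean",
    "occurred_at": "timestamp",
}


def matches_kind(kind: str, value) -> bool:
    if kind == "string":
        return isinstance(value, str)
    if kind == "boolean":
        return isinstance(value, bool)
    if kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind == "timestamp":
        if not isinstance(value, str):
            return False
        try:
            # Older Python versions reject a trailing "Z", so normalize it first.
            datetime.fromisoformat(value.replace("Z", "+00:00"))
            return True
        except ValueError:
            return False
    return False


def validate_properties(props: dict) -> list[str]:
    """Return one error string per property that breaks the schema."""
    errors = []
    for key, value in props.items():
        kind = PROPERTY_TYPES.get(key)
        if kind is None:
            errors.append(f"{key}: not in schema")
        elif not matches_kind(kind, value):
            errors.append(f"{key}: expected {kind}, got {value!r}")
    return errors


good = {"payment_method": "card", "cart_value": 49.0, "trial_day": 7,
        "is_new_user": True, "occurred_at": "2024-03-01T12:00:00Z"}
bad = {"trial_day": "free trial day 7", "is_new_user": "yes",
       "occurred_at": "03/01/2024"}

print(validate_properties(good))
print(validate_properties(bad))
```

datetime.fromisoformat accepts a subset of ISO 8601, which is enough to reject slash-separated dates here. A stricter parser may be needed if the team standardizes on one exact format.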
Worked example in references/property-design-patterns.md showing right and wrong design for a product_viewed event.
Naming conventions
Pick ONE convention per category and enforce it. Three conventions worth adopting together.
- snake_case for events and properties: user_signed_up, cart_value. Most platforms default to this; pushback is rarely worth it.
- Object-action format for events: user_signed_up, video_played. Reading the name should describe what happened.
- Verb-noun for user properties (or just nouns): subscription_tier, is_admin, last_active_at.
What NOT to do.
- Mixed case across events. user_signedUp, User Signed Up, userSignedUp all coexisting in the same project. Pick one and migrate.
- Spaces in event or property names. "Sign Up Completed" breaks every URL-encoding scenario and confuses every tool.
- Inconsistent verbs. user_signed_up plus completedCheckout plus VIEW_PRODUCT in the same project means nobody can predict an event name without looking it up.
- Brand names in event names. mailchimp_email_opened ages badly when you switch to Customer.io.
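For the "pick one and migrate" step, a rename map is a reasonable starting point. A sketch, assuming the list of existing names has been exported from the tool; to_snake_case, build_rename_map, has_brand_name, and the BRAND_TERMS set are hypothetical helpers, not part of any SDK.

```python
import re

# Hypothetical list of vendor names to keep out of event names.
BRAND_TERMS = {"mailchimp", "customerio"}


def to_snake_case(name: str) -> str:
    """Normalize camelCase, Title Case, and UPPER_CASE names to snake_case."""
    # Insert an underscore at lower-to-upper boundaries: signedUp -> signed_Up.
    separated = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse spaces and hyphens into underscores.
    separated = re.sub(r"[\s\-]+", "_", separated)
    return re.sub(r"_+", "_", separated).lower()


def build_rename_map(names: list[str]) -> dict[str, str]:
    """Map every non-canonical name to its snake_case form."""
    return {n: to_snake_case(n) for n in names if n != to_snake_case(n)}


def has_brand_name(name: str) -> bool:
    return any(token in BRAND_TERMS for token in to_snake_case(name).split("_"))


existing = ["user_signedUp", "User Signed Up", "userSignedUp",
            "VIEW_PRODUCT", "completedCheckout", "user_signed_up"]
rename_map = build_rename_map(existing)
for old, new in rename_map.items():
    print(f"{old!r} -> {new!r}")
print(has_brand_name("mailchimp_email_opened"))
```

The normalizer fixes casing and separators only. completedCheckout becomes completed_checkout, which is still verb-first, so word order and verb choice need a manual pass after the mechanical rename.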
The naming convention reference file provides a complete style guide. Cite it in your team's data contract.
Detail in references/naming-convention-reference.md.
Schema versioning
Schema changes are inevitable. The pattern.
Additive changes are safe. New event, new property on an existing event, new value in an enum. Just ship. Existing dashboards continue to work.
Breaking changes require migration. Renamed event, removed property, changed property type, narrowed enum. These break dashboards downstream; the migration plan is part of the change.
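The additive-versus-breaking distinction can be checked mechanically by diffing two schema versions. A sketch, assuming schemas are plain dicts of the form event -> property -> {"type": ..., "enum": [...]}; that shape and the find_breaking_changes helper are illustrative assumptions, not a standard format.

```python
def find_breaking_changes(old: dict, new: dict) -> list[str]:
    """List changes that would break consumers of the old schema.

    New events, new properties, and new enum values are additive and ignored.
    """
    breaking = []
    for event, old_props in old.items():
        if event not in new:
            breaking.append(f"{event}: event removed or renamed")
            continue
        for prop, old_spec in old_props.items():
            if prop not in new[event]:
                breaking.append(f"{event}.{prop}: property removed")
                continue
            new_spec = new[event][prop]
            if new_spec["type"] != old_spec["type"]:
                breaking.append(f"{event}.{prop}: type changed")
            # Only compare enums when both versions declare one.
            if "enum" in old_spec and "enum" in new_spec:
                dropped = set(old_spec["enum"]) - set(new_spec["enum"])
                if dropped:
                    breaking.append(f"{event}.{prop}: enum narrowed, dropped {sorted(dropped)}")
    return breaking


v1 = {"checkout_completed": {
    "payment_method": {"type": "string", "enum": ["card", "paypal"]},
    "cart_value": {"type": "number"},
}}

additive = {
    "checkout_completed": {
        "payment_method": {"type": "string", "enum": ["card", "paypal", "apple_pay"]},
        "cart_value": {"type": "number"},
        "discount_code": {"type": "string"},   # new property
    },
    "video_played": {},                         # new event
}

breaking = {"checkout_completed": {
    "payment_method": {"type": "string", "enum": ["card"]},  # narrowed enum
    "cart_value": {"type": "string"},                        # changed type
}}

print(find_breaking_changes(v1, additive))
print(find_breaking_changes(v1, breaking))
```

A non-empty result is the signal that the change needs a migration plan rather than a plain ship.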
Versioning patterns.
- Append _v2 to events when semantics change. checkout_completed_v2 fires alongside checkout_completed during a transition.
- Keep old events firing during the transition (90 days is typical). Both versions fire; analytics queries gradually migrate to v2.
- Migrate dashboards to v2 before retiring v1. Then deprecate v1 explicitly with a documentation note.
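The dual-fire pattern can be sketched as a lookup that decides which names to emit for a given day. The transition record, its field names, and the names_to_fire helper are hypothetical. The sketch keeps v1 firing until both the window has elapsed and the dashboards are marked as migrated.

```python
from datetime import date, timedelta

# Hypothetical transition registry: v1 name -> transition details.
TRANSITIONS = {
    "checkout_completed": {
        "v2_name": "checkout_completed_v2",
        "started": date(2024, 1, 1),
        "window_days": 90,
        "dashboards_migrated": False,
    },
}


def names_to_fire(event_name: str, today: date, transitions: dict) -> list[str]:
    """Return every event name to emit for one logical event."""
    transition = transitions.get(event_name)
    if transition is None:
        return [event_name]
    window_ends = transition["started"] + timedelta(days=transition["window_days"])
    window_over = today >= window_ends
    # Retire v1 only after the window has passed AND dashboards read from v2.
    if window_over and transition["dashboards_migrated"]:
        return [transition["v2_name"]]
    return [event_name, transition["v2_name"]]


print(names_to_fire("checkout_completed", date(2024, 2, 1), TRANSITIONS))
print(names_to_fire("checkout_completed", date(2024, 6, 1), TRANSITIONS))

migrated = {"checkout_completed": {**TRANSITIONS["checkout_completed"],
                                   "dashboards_migrated": True}}
print(names_to_fire("checkout_completed", date(2024, 6, 1), migrated))
```

This handles names only. When semantics change, the v2 payload usually differs from v1 as well, so a real implementation would also need a per-version payload builder.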
The data contract idea.
- Document the canonical schema in code: TypeScript interface, JSON Schema, or Protobuf definition.
- Code review every schema change. Schema is product, not afterthought.
- CI lint rejects schema