Meeting Insights Analyzer
Originally contributed by maximcoding — enhanced and integrated by the claude-skills team.
Transform meeting transcripts into concrete, evidence-backed feedback on communication patterns, leadership behaviors, and interpersonal dynamics.
Core Workflow
1. Ingest & Inventory
Scan the target directory for transcript files (.txt, .md, .vtt, .srt, .docx, .json).
For each file:
- Extract meeting date from filename or content (expect a YYYY-MM-DD prefix or embedded timestamps)
- Identify speaker labels: look for patterns like Speaker 1:, [John]:, John Smith 00:14:32, or VTT/SRT cue formatting
- Detect the user's identity: ask if ambiguous, otherwise infer from the most frequent speaker or filename hints
- Log: filename, date, duration (from timestamps), participant count, word count
Print a brief inventory table so the user confirms scope before heavy analysis begins.
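A minimal sketch of the inventory step, assuming transcripts sit directly in one directory and that dates appear as a filename prefix. The function names (date_from_filename, build_inventory, format_inventory) are hypothetical, and the sketch covers only filename and date; duration, participant count, and word count would be added once the files are parsed.

```python
import re
from pathlib import Path
from typing import Optional

# Extensions listed in the ingest step above.
TRANSCRIPT_EXTS = {".txt", ".md", ".vtt", ".srt", ".docx", ".json"}
DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})")


def date_from_filename(name: str) -> Optional[str]:
    """Return the YYYY-MM-DD prefix of a filename, or None if there is none."""
    match = DATE_PREFIX.match(name)
    return match.group(1) if match else None


def build_inventory(directory: str) -> list:
    """List transcript files with the date inferred from each filename."""
    rows = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in TRANSCRIPT_EXTS:
            rows.append({"filename": path.name, "date": date_from_filename(path.name)})
    return rows


def format_inventory(rows: list) -> str:
    """Render the inventory as a plain-text table for the user to confirm."""
    lines = [f"{'filename':<40} date"]
    for row in rows:
        lines.append(f"{row['filename']:<40} {row['date'] or 'unknown'}")
    return "\n".join(lines)
```

Files without a date prefix are listed with an unknown date rather than dropped, so the user can still see them and confirm scope.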
2. Normalize Transcripts
Different tools produce wildly different formats. Normalize everything into a common internal structure before analysis:
{ speaker: string, timestamp_sec: number | null, text: string }[]
Handling per format:
- VTT/SRT: Parse cue timestamps + text. Speaker labels may be inline (<v Speaker>) or prefixed.
- Plain text: Look for Name: or [Name] prefixes per line. If no speaker labels exist, warn the user that per-speaker analysis is limited.
- Markdown: Strip formatting, then treat as plain text.
- DOCX: Extract text content, then treat as plain text.
- JSON: Expect an array of objects with speaker and text fields (common Otter/Fireflies export).
If timestamps are missing, degrade gracefully — skip timing-dependent metrics (speaking pace, pause analysis) but still run text-based analysis.
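As an illustration of the plain-text path, the sketch below turns labeled lines into the common structure and converts cue timestamps to seconds. The regular expressions, the Turn type, and both function names are assumptions made for this example; real transcripts will likely need looser or stricter patterns. Unlabeled lines are treated as continuations of the previous turn, which is also an assumption.

```python
import re
from typing import List, Optional, TypedDict


class Turn(TypedDict):
    speaker: str
    timestamp_sec: Optional[float]
    text: str


# Hypothetical patterns for the two label styles named above.
LABEL_PATTERNS = [
    re.compile(r"^\[(?P<speaker>[^\]]+)\]:?\s*(?P<text>.*)$"),          # [Name] or [Name]:
    re.compile(r"^(?P<speaker>[A-Z][\w .'-]{0,40}?):\s+(?P<text>.*)$"),  # Name: text
]


def timestamp_to_sec(stamp: str) -> float:
    """Convert HH:MM:SS.mmm, HH:MM:SS,mmm (SRT style), or MM:SS to seconds."""
    seconds = 0.0
    for part in stamp.replace(",", ".").split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_plain_text(raw: str) -> List[Turn]:
    """Parse labeled lines into turns; timestamps stay None for plain text."""
    turns: List[Turn] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        for pattern in LABEL_PATTERNS:
            match = pattern.match(line)
            if match:
                turns.append({
                    "speaker": match["speaker"].strip(),
                    "timestamp_sec": None,
                    "text": match["text"].strip(),
                })
                break
        else:
            # No label found: append to the previous turn if there is one.
            if turns:
                turns[-1]["text"] = (turns[-1]["text"] + " " + line).strip()
    return turns
```

If parse_plain_text returns an empty list for a non-empty file, that is the signal to warn the user that per-speaker analysis is limited.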
3. Analyze
Run all applicable analysis modules below. Each module is independent — skip any that don't apply (e.g., skip speaking ratios if there are no speaker labels).
Module: Speaking Dynamics
Calculate per-speaker:
- Word count & percentage of total meeting words
- Turn count — how many times each person spoke
- Average turn length — words per uninterrupted speaking turn
- Longest monologue — flag turns exceeding 60 seconds or 200 words
- Interruption detection: a turn that starts within 2 seconds of the previous speaker's last timestamp, or that begins while the previous speaker's sentence is still unfinished
Produce a per-meeting summary and a cross-meeting average if multiple transcripts exist.
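The per-speaker numbers can be computed roughly as follows. This is a sketch that assumes the normalized turn dicts from step 2 and assumes consecutive lines from the same speaker were already merged into one turn; speaking_stats is a hypothetical name. Only the 200-word monologue threshold is checked, since the 60-second check needs timestamps.

```python
from collections import defaultdict


def speaking_stats(turns: list) -> dict:
    """Per-speaker word count, share of total words, turn count, and turn lengths."""
    words = defaultdict(int)
    turn_counts = defaultdict(int)
    longest = defaultdict(int)
    for turn in turns:
        n = len(turn["text"].split())
        speaker = turn["speaker"]
        words[speaker] += n
        turn_counts[speaker] += 1
        longest[speaker] = max(longest[speaker], n)
    total = sum(words.values()) or 1  # avoid division by zero on empty input
    return {
        speaker: {
            "words": words[speaker],
            "share_pct": round(100 * words[speaker] / total, 1),
            "turns": turn_counts[speaker],
            "avg_turn_words": round(words[speaker] / turn_counts[speaker], 1),
            "longest_turn_words": longest[speaker],
            "monologue_flag": longest[speaker] > 200,
        }
        for speaker in words
    }
```

A cross-meeting average can then be taken over the share_pct values from each meeting's result.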
Red flags to surface:
- User speaks > 60% in a 1:many meeting (dominating)
- User speaks < 15% in a meeting they're facilitating (disengaged or over-delegating)
- One participant never speaks (excluded voice)
- Interruption ratio > 2:1 (user interrupts others more than twice as often as they are interrupted)
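The thresholds above could be checked with a small function like this one. It is a sketch under stated assumptions: red_flags and its parameters are hypothetical, "1:many" is interpreted as more than two participants, and the interruption counts are assumed to have been computed upstream.

```python
def red_flags(user: str, share_pct: dict, participants: list,
              is_facilitator: bool, interruptions_by_user: int,
              interruptions_of_user: int) -> list:
    """Return the red-flag labels that apply to one meeting."""
    flags = []
    user_share = share_pct.get(user, 0.0)

    # Assumption: a 1:many meeting is any meeting with more than two participants.
    if len(participants) > 2 and user_share > 60:
        flags.append("dominating")

    if is_facilitator and user_share < 15:
        flags.append("disengaged_or_over_delegating")

    silent = [p for p in participants if share_pct.get(p, 0.0) == 0]
    if silent:
        flags.append("excluded_voice: " + ", ".join(silent))

    # Ratio above 2:1, written without division so zero counts are safe.
    if interruptions_by_user > 2 * interruptions_of_user:
        flags.append("interruption_ratio")

    return flags
```

With very small counts (for example one interruption against zero) the ratio check fires easily, so a minimum count may be worth adding before surfacing it.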
Module: Conflict & Directness
Scan the user's speech for hedging and avoidance markers:
Hedging language (score per-instance, aggregate per meeting):
- Qualifiers: "maybe", "kind of", "sort of", "I guess", "potentially", "arguably"
- Permission-seeking: "if that's okay", "would it be alright if", "I don't know if this is right but"
- Deflection: "whatever you think", "up to you", "I'm flexible"
- Softeners before disagreement: "I don't want to push back but", "this might be a dumb question"
Conflict avoidance patterns (requires more context, flag with confidence level):
- Topic changes after tension (speaker A raises problem → user pivots to logistics)
- Agreement-without-commitment: "yeah totally" followed by no action or follow-up
- Reframing others' concerns as smaller than stated: "it's probably not that big a deal"
- Absent feedback in 1:1s where performance topics would be expected
For each flagged instance, extract:
- The full quote (with surrounding context — 2 turns before and after)
- A severity tag: low (single hedge word), medium (pattern of hedging in one exchange), high (clearly avoided a necessary conversation)
- A rewrite suggestion: what a more direct version would sound like
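The phrase-matching part of this module can be sketched as below, using the hedging phrases listed above. The names HEDGES, find_hedges, and flag_turns are hypothetical. The sketch assigns only low and medium severity from the number of hits in a turn; high severity and the conflict avoidance patterns depend on context that simple matching cannot judge.

```python
import re

HEDGES = {
    "qualifier": ["maybe", "kind of", "sort of", "i guess", "potentially", "arguably"],
    "permission_seeking": ["if that's okay", "would it be alright if",
                           "i don't know if this is right but"],
    "deflection": ["whatever you think", "up to you", "i'm flexible"],
    "softener": ["i don't want to push back but", "this might be a dumb question"],
}


def find_hedges(text: str) -> list:
    """Return (category, phrase) pairs for each hedge occurrence in one turn."""
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    hits = []
    for category, phrases in HEDGES.items():
        for phrase in phrases:
            pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
            hits.extend((category, phrase) for _ in re.finditer(pattern, lowered))
    return hits


def flag_turns(turns: list, user: str) -> list:
    """Collect the user's hedged turns with two turns of context on each side."""
    flagged = []
    for i, turn in enumerate(turns):
        if turn["speaker"] != user:
            continue
        hits = find_hedges(turn["text"])
        if hits:
            flagged.append({
                "quote": turn["text"],
                "context": turns[max(0, i - 2): i + 3],
                "hits": hits,
                "severity": "low" if len(hits) == 1 else "medium",
            })
    return flagged
```

The rewrite suggestion is left out on purpose: it needs an understanding of what the speaker meant, not just which phrases they used.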
Module: Filler Words & Verbal Habits
Count occurrences of: "um", "uh", "like" (non-comparative), "you know", "actually", "basically", "literally", "right?" (tag question), "so yeah", "I mean"
Report:
- Total count per meeting
- Rate per 100 words spoken (normalizes across meeting lengths)
- Breakdown by filler type
- Contextual spikes — do fillers increase in specific situations? (e.g., when responding to a senior stakeholder, when giving negative feedback, when asked a question cold)
Only flag this as an issue if the rate exceeds ~3 per 100 words. Below that, it's normal speech.
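The rate calculation can be sketched like this. As an assumption of this example, "like" and "right?" are left out, because separating filler use from literal use needs context that simple matching does not have. filler_report is a hypothetical name, and the threshold is the approximate 3 per 100 words stated above.

```python
import re

# Unambiguous fillers from the list above; "like" and "right?" are omitted here.
FILLERS = ["um", "uh", "you know", "actually", "basically", "literally", "so yeah", "i mean"]
FLAG_THRESHOLD = 3.0  # fillers per 100 words


def filler_report(text: str) -> dict:
    """Count fillers in the user's speech and compute the rate per 100 words."""
    lowered = text.lower()
    total_words = len(lowered.split())
    counts = {}
    for filler in FILLERS:
        pattern = r"(?<!\w)" + re.escape(filler) + r"(?!\w)"
        n = len(re.findall(pattern, lowered))
        if n:
            counts[filler] = n
    total = sum(counts.values())
    rate = round(100 * total / total_words, 2) if total_words else 0.0
    return {
        "total": total,
        "rate_per_100_words": rate,
        "by_type": counts,
        "flag": rate > FLAG_THRESHOLD,
    }
```

Running the same function on subsets of turns (for example, only replies to a senior stakeholder) gives a rough way to look for contextual spikes.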
Module: Question Quality & Listening
Classify the user's questions:
- Closed (yes/no): "Did you finish the report?"
- Leading (answer embedded): "Don't you think we should ship sooner?"
- Open genuine: "What's blocking you on this?"
- Clarifying (references prior speaker): "When you said X, did you mean Y?"
- Building (extends another's idea): "That's interesting — what if we also Z?"
Good listening indicators:
- Clarifying and building questions (shows active processing)
- Paraphrasing: "So what I'm hearing is..."
- Referencing a point someone made earlier in the meeting
- Asking quieter participants for input
Poor listening indicators:
- Asking a question that was already answered
- Restating own point without acknowledging the response
- Responding to a question with an unrelated topic
Report the ratio of open/clarifying/building vs. closed/leading questions.
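A rough first pass at the classification and ratio might look like the sketch below. The cue lists and the function names are assumptions chosen to fit the example questions above. Keyword rules like these will misclassify many real questions, so treat the output as a starting point for review rather than a verdict.

```python
from collections import Counter

# Hypothetical cue phrases, chosen to match the examples in the list above.
CLARIFYING_CUES = ("when you said", "did you mean", "do you mean")
BUILDING_CUES = ("what if",)
OPEN_STARTERS = ("what", "how", "why", "where", "who", "which")


def classify_question(question: str) -> str:
    """Label a question as clarifying, building, leading, open, or closed."""
    q = question.lower().replace("\u2019", "'").strip()
    if any(cue in q for cue in CLARIFYING_CUES):
        return "clarifying"
    if any(cue in q for cue in BUILDING_CUES):
        return "building"
    words = q.split()
    first = words[0] if words else ""
    if first.endswith("n't"):  # "Don't you think...", "Isn't it..."
        return "leading"
    if first.startswith(OPEN_STARTERS):
        return "open"
    return "closed"


def question_ratio(questions: list) -> dict:
    """Count open/clarifying/building questions against closed/leading ones."""
    labels = Counter(classify_question(q) for q in questions)
    return {
        "by_type": dict(labels),
        "open_clarifying_building": labels["open"] + labels["clarifying"] + labels["building"],
        "closed_leading": labels["closed"] + labels["leading"],
    }
```

The two totals are reported separately instead of as a single quotient, which avoids dividing by zero when a meeting has no closed or leading questions.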
Module: Facilitation & Decision-Making
Only apply when the user is the meeting organizer or facilitator.
Evaluate:
- Agenda adherence: Did the meeting follow a structure or drift?
- Time management: How long did each topic take vs. expected?
- Inclusion: Did the facilitator actively draw in quiet participants?
- Decision clarity: Were decisions explicitly stated? ("So we're going with option B — Sarah owns the follow-up by Friday.")
- Action items: Were they assigned with owners and deadlines, or left vague?
- Parking lot discipline: Were off-topic items acknowledged and deferred, or did they derail?
Module: Sentiment & Energy
Track the emotional arc of the user's language across the meeting:
- Positive markers: enthusiastic agreement, encouragement, humor, praise
- Negative markers: frustration, dismissiveness, sarcasm, curt responses
- Neutral/flat: low-energy responses, monosyllabic answers
Flag energy drops — moments where the user's engagement visibly decreases (shorter turns, less substantive responses). These often correlate with discomfort, boredom, or avoidance.
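One way to approximate the "shorter turns" signal is to compare a short run of the user's turns against their earlier average. The sketch below does that on a list of per-turn word counts. The function name, the window of 3 turns, and the 0.5 ratio are hypothetical defaults, not values from this skill, and the check says nothing about how substantive a response is.

```python
def energy_drops(turn_lengths: list, window: int = 3, ratio: float = 0.5) -> list:
    """Return indices where a run of short turns begins.

    Index i is reported when the mean length of turns i..i+window-1 falls
    below `ratio` times the mean length of all turns before i.
    """
    drops = []
    for i in range(window, len(turn_lengths) - window + 1):
        baseline = sum(turn_lengths[:i]) / i
        recent = sum(turn_lengths[i:i + window]) / window
        if baseline > 0 and recent < ratio * baseline:
            drops.append(i)
    return drops
```

For example, energy_drops([40, 35, 45, 5, 4, 6, 38, 42]) returns [3], the position where the user's turns shrink from about 40 words to about 5. Each reported index should be checked against the transcript before it is described as discomfort, boredom, or avoidance.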
4. Output the Report
Structure the final output as a single cohesive report. Use this skeleton — omit any section where data was insufficient:
# Meeting Insights Report
**Period**: [earliest date] – [latest date]
**Meetings analyzed**: [count]
**Total transcript words**: [count]
**Your speaking share (avg)**: [X%]
---
## Top 3 Findings
[Rank by impact. Each finding gets 2-3 sentences + one concrete example with a direct quote and timestamp.]
## Detailed Analysis
### Speaking Dynamics
[Stats table + narrative interpretation + red flags]
### Directness & Conflict Patterns
[Flagged instances grouped by pattern type, with quotes and rewrites]
### Verbal Habits
[Filler word stats, contextual spikes, only if rate > 3/100 words]
### Listening & Questions
[Question type breakdown, listening indicators, specific examples]
### Facilitation
[Only if applicable — agenda, decisions, action items]
### Energy & Sentiment
[Arc summary, flagged drops]
## Strengths
[3 specific things the user does well, with evidence]
## Growth Opportunities
[3 ranked by impact, each with: what to change, why it matters, a concrete "try this next time" action]
## Comparison to Previous Period
[Only if prior analysis exists — delta on