Brief Outline Generator
Generates a content outline as a formatted .docx file. The output is a skeleton — section headings, short topic prompts, angles for each section — that a writer fills in with their own conclusions, numbers, and prose.
This is an outline generator, not a brief generator. If your output reads like an article in note form, you've gone too far. Read references/section-rules.md before generating anything.
The DOCX is always produced by running scripts/generate-brief.py. Do not reimplement the renderer. Do not write inline docx code. Assemble the config JSON and run the script.
Inputs — collect from user before proceeding
| Field | Required | Notes |
|---|---|---|
| title | ✅ | Blog post title. Warn if > 70 chars. |
| focus_keyword | ✅ | Primary keyword |
| sitemap_url | ✅ | Full sitemap URL, e.g. https://firefly.ai/sitemap.xml. domain_url is derived from this automatically. |
| word_count_range | ✅ | e.g. 1500-2000 |
| target_intent | ✅ | Informational, Commercial, Transactional, or Navigational |
| target_product | ⬜ | Product name; triggers a product integration section if provided |
| secondary_keywords | ⬜ | Pre-supplied list; skip generation if provided |
Execution workflow — follow these steps in order
Step 1 — Validate inputs
- title: non-empty; warn (don't block) if > 70 chars
- focus_keyword: non-empty
- sitemap_url: non-empty, starts with http:// or https:// (accept any path: .xml, .xml.gz, or extensionless dynamic URLs are all valid)
- word_count_range: two positive integers separated by - (e.g. 1500-2000)
- target_intent: one of Informational, Commercial, Transactional, Navigational (case-insensitive)
Derive two values from sitemap_url:
- domain_url: strip the path to get the origin, e.g. https://firefly.ai/sitemap.xml → https://firefly.ai. Used in the config JSON.
- domain_hostname: strip the protocol too, e.g. https://firefly.ai → firefly.ai. Used in tool calls that require a bare hostname (e.g. site:firefly.ai).
Stop and report all validation errors before proceeding.
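The checks and derivations in this step can be sketched in Python as follows. This is a minimal illustration, not part of scripts/generate-brief.py; the function names validate_inputs and derive_domain and the error message wording are hypothetical.

```python
import re
from urllib.parse import urlsplit

VALID_INTENTS = {"informational", "commercial", "transactional", "navigational"}


def validate_inputs(inputs: dict) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for the required fields (hypothetical helper)."""
    errors, warnings = [], []

    title = (inputs.get("title") or "").strip()
    if not title:
        errors.append("title: must be non-empty")
    elif len(title) > 70:
        # Over-long titles warn but do not block.
        warnings.append(f"title: {len(title)} chars (over 70)")

    if not (inputs.get("focus_keyword") or "").strip():
        errors.append("focus_keyword: must be non-empty")

    # Any path is accepted; only the scheme is checked.
    sitemap_url = (inputs.get("sitemap_url") or "").strip()
    if not sitemap_url.startswith(("http://", "https://")):
        errors.append("sitemap_url: must start with http:// or https://")

    word_range = (inputs.get("word_count_range") or "").strip()
    match = re.fullmatch(r"(\d+)-(\d+)", word_range)
    if not match or int(match.group(1)) == 0 or int(match.group(2)) == 0:
        errors.append("word_count_range: expected two positive integers, e.g. 1500-2000")

    if (inputs.get("target_intent") or "").strip().lower() not in VALID_INTENTS:
        errors.append("target_intent: must be Informational, Commercial, Transactional, or Navigational")

    return errors, warnings


def derive_domain(sitemap_url: str) -> tuple[str, str]:
    """Return (domain_url, domain_hostname) derived from the sitemap URL."""
    parts = urlsplit(sitemap_url)
    return f"{parts.scheme}://{parts.netloc}", parts.netloc
```

Collecting every error into one list before returning matches the instruction to report all validation errors together rather than stopping at the first one. Note that netloc keeps a port number if the URL has one.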
Step 2 — Read the rules
Read references/section-rules.md in full now. It contains:
- The outline-vs-brief distinction (the most important rule)
- Hard bullet rules (≤ 12 words, no invented numbers, no conclusions, no em-dash clauses)
- Hard structure rules (no topic_summary, no Writer Directives box, TLDR carries 2–3 topic pointers)
- The four archetypes (Listicle, Comparison, How-to, Concept/Explainer) and their section sets
- Per-section rules and good/bad bullet examples
- A final quality check to run before generating the DOCX
Do not skip this step. Generating without reading the rules produces brief-style output every time.
Step 3 — Domain analysis
DataForSEO tools are deferred — load them before calling. Call tool_search(query="on_page content parsing") at the start of this step.
3a — Discover site URLs
Call web_fetch on sitemap_url.
- Readable XML (response contains <loc> tags) → extract every <loc> URL. These are your site URLs.
- Binary / compressed (unreadable response) → fall back: call dataforseo:serp_organic_live_advanced with keyword="site:{domain_hostname}" (e.g. site:firefly.ai) to get all indexed URLs.
Cap at 20 URLs. If more exist, prioritise: homepage → product/use-case pages → blog/resource pages.
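For the readable-XML case, the extraction and the 20-URL cap could look like the sketch below. It assumes the sitemap text is already in hand as a string; the function names and the path hints used to recognise product and blog pages are hypothetical and should be adjusted to the site's real URL structure.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

MAX_URLS = 20


def extract_loc_urls(sitemap_xml: str) -> list[str]:
    """Collect the text of every <loc> element, ignoring XML namespaces."""
    root = ET.fromstring(sitemap_xml)
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]


def priority(url: str) -> int:
    """Lower sorts first: homepage, product/use-case, blog/resources, other."""
    path = urlsplit(url).path.strip("/").lower()
    if not path:
        return 0
    # Hypothetical path hints, not taken from the source document.
    if any(hint in path for hint in ("product", "use-case", "solutions", "platform")):
        return 1
    if any(hint in path for hint in ("blog", "resources")):
        return 2
    return 3


def select_site_urls(sitemap_xml: str) -> list[str]:
    """Return at most MAX_URLS URLs, highest-priority pages first."""
    urls = extract_loc_urls(sitemap_xml)
    # sorted() is stable, so sitemap order is preserved within each tier.
    return sorted(urls, key=priority)[:MAX_URLS]
```

The sketch does not follow sitemap index files, whose <loc> entries point at further sitemaps rather than pages.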
3b — Read full page content
For each URL from 3a, call dataforseo:on_page_content_parsing with enable_javascript: true.
This returns fully rendered page text — headings, body copy, product descriptions, etc. Collect all output.
If a page fails (bot protection, timeout), skip it and continue — do not abort.
3c — Extract meta title and description
From the homepage content parsed in 3b, extract:
- Page <title> → meta_title
- <meta name="description"> content → meta_description
Surface character count advisories if outside recommended ranges:
- meta_title outside 50–60 chars → "Note: fetched meta title is N chars (recommended: 50–60)."
- meta_description outside 150–160 chars → "Note: fetched meta description is N chars (recommended: 150–160)."
These are passed into the config as meta_title and meta_description and rendered in the metadata table.
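The character count advisories reduce to a simple range check. A minimal sketch, assuming both values were extracted as plain strings; the helper names are hypothetical.

```python
from typing import Optional


def length_advisory(label: str, text: str, low: int, high: int) -> Optional[str]:
    """Return an advisory note if len(text) falls outside [low, high], else None."""
    n = len(text)
    if low <= n <= high:
        return None
    return f"Note: fetched {label} is {n} chars (recommended: {low}–{high})."


def meta_advisories(meta_title: str, meta_description: str) -> list[str]:
    """Collect the advisories to surface for the homepage metadata."""
    notes = [
        length_advisory("meta title", meta_title, 50, 60),
        length_advisory("meta description", meta_description, 150, 160),
    ]
    return [note for note in notes if note is not None]
```

The advisories are informational only; the fetched values still go into the config unchanged.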
3d — Compile domain_context
From all parsed page content, extract and store:
- Product name and core value proposition
- Key technical terms and vocabulary used on the site
- Target audience signals (roles, team types, use cases mentioned)
- Existing content topics — used to avoid duplication in the outline
Store as domain_context. Do not render domain_context in the output document. If all fetches fail, set domain_context = null and continue.
Step 4 — Classify the title's archetype
Based on the title pattern, pick one of:
- Listicle / Tool Roundup — "Top N", "Best X", "X Alternatives"
- Comparison / Versus — "X vs Y", "X or Y"
- How-to / Implementation Guide — "How to X", "How do X teams Y", "Implementing X"
- Concept / Explainer — "What is X", "Designing X", "[Function/Pattern]: Designing X"
If multiple seem to fit, use the defaults from section-rules.md. If still unclear, ask the user.
Announce the chosen archetype to the user before generating — one line, e.g. "Detected archetype: How-to. Generating outline with intro → strategies → case studies → implementation → FAQs." Let them override if they disagree.
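As a rough illustration of the title-pattern matching, the sketch below returns every archetype whose pattern appears in the title. The regular expressions are assumptions derived from the example patterns listed above and are not the contents of section-rules.md. Zero or multiple matches mean the defaults in that file, or the user, must decide.

```python
import re

# Hypothetical patterns based on the example title shapes for each archetype.
ARCHETYPE_PATTERNS = {
    "Listicle / Tool Roundup": r"\b(top \d+|best|alternatives)\b",
    "Comparison / Versus": r"\b(vs\.?|versus|or)\b",
    "How-to / Implementation Guide": r"^(how to|how do|implementing)\b",
    "Concept / Explainer": r"\b(what is|designing)\b",
}


def candidate_archetypes(title: str) -> list[str]:
    """Return the archetypes whose pattern matches the title (may be 0 or many)."""
    lowered = title.lower()
    return [
        name
        for name, pattern in ARCHETYPE_PATTERNS.items()
        if re.search(pattern, lowered)
    ]
```

A single match can be announced directly. A title such as "How to choose X or Y" matches two patterns, which is the ambiguous case the step above sends to the defaults.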
Step 5 — Generate keyword volumes
Fetch USA monthly search volumes for the focus keyword and each secondary keyword using DataForSEO.
DataForSEO tools are deferred — load them before calling.
- Call tool_search(query="keyword search volume google ads") to load the DataForSEO keyword tools.
- Call dataforseo:kw_data_google_ads_search_volume with the full list of keywords (focus + all secondaries) in a single call. Use location_code for the USA (2840).
- From the response, extract the search_volume field for each keyword.
- Format each volume as a thousands-separated string (e.g. 3400 → "3,400"). Volumes under 1,000 stay as plain digits (e.g. "500", "30"). A volume of 0 should be rendered as "0", not "N/A", because it is a real datapoint.
- If a keyword returns no data, set its volume to "N/A" and continue. Don't abort the whole run.
- If tool_search returns no DataForSEO tools (connector not installed), set every volume to "N/A" and surface a one-line warning: "DataForSEO connector not available — keyword volumes set to N/A."
Never omit the volume field on any keyword. Every row must have one.
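The formatting rules above can be sketched as a small helper. This assumes the search_volume values have already been pulled out of the tool response into a dict keyed by keyword, with None for keywords that returned no data; that intermediate shape and the function names are hypothetical.

```python
from typing import Optional


def format_volume(search_volume: Optional[int]) -> str:
    """Render a volume as a thousands-separated string, or "N/A" if missing."""
    if search_volume is None:
        return "N/A"
    # 0 is a real datapoint and renders as "0", not "N/A".
    return f"{search_volume:,}"


def build_keyword_rows(keywords: list[str], volumes: dict) -> list[dict]:
    """Build one row per keyword; every row always carries a volume field."""
    return [
        {"keyword": kw, "volume": format_volume(volumes.get(kw))}
        for kw in keywords
    ]
```

Using volumes.get(kw) means a keyword missing from the response falls back to "N/A" instead of raising, so one gap does not abort the run.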
Flag volume mismatches. If the focus keyword's volume is less than one-tenth of any secondary keyword's volume, tell the user before generating: "Note: your focus keyword has volume X, but secondary keyword Y has volume Z. Consider whether Y should be the focus." Let them decide; don't auto-swap.
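A minimal sketch of the mismatch check, working on raw integer volumes before they are formatted. The function name and the dict input shape are hypothetical; keywords with no data (None) are skipped on the assumption that they cannot be compared.

```python
from typing import Optional


def volume_mismatch_notes(focus_volume: Optional[int], secondary_volumes: dict) -> list[str]:
    """Return one note per secondary keyword with more than 10x the focus volume."""
    if focus_volume is None:
        return []
    notes = []
    for keyword, volume in secondary_volumes.items():
        if volume is not None and volume > 10 * focus_volume:
            notes.append(
                f"Note: your focus keyword has volume {focus_volume:,}, "
                f"but secondary keyword {keyword} has volume {volume:,}. "
                f"Consider whether {keyword} should be the focus."
            )
    return notes
```

The function only reports; swapping the focus keyword stays a user decision.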
Store as:
"focus_keyword_volume": "2,400",
"secondary_keywords": [
{ "keyword": "disaster recovery plan", "volume": "1,900" },
{ "keyword": "cloud DR strategy", "volume": "N/A" }
]
If no secondary keywords were supplied, generate 5 by combining base terms from the focus keyword, top domain key terms, and modifiers ("best practices", "guide", "checklist", "for teams", current year). Then fetch volumes for them the same way.
Step 6 — Build the outline using the archetype's section set
Use the section set for the archetype you chose in Step 4. Do not force every topic into a how-to template. A listicle has no Problem or Case Studies section. A comparison has no Implementation steps.
Each section object shape:
{
"heading": "H2",
"title": "Section Title",
"rules": ["short topic prompt 1", "short topic prompt 2"],
"subsections": [
{
"heading": "H3",
"title": "Subsection Titl