narrator-ai-cli — AI Video Narration CLI Skill

CLI client for the Narrator AI video narration API. Designed for AI agents and developers.

CLI repo: https://github.com/NarratorAI-Studio/narrator-ai-cli
Resources preview (BGM / dubbing / templates): https://ceex7z9m67.feishu.cn/wiki/WLPnwBysairenFkZDbicZOfKnbc

Reference Index

This file covers decision flow, the common workflow, and pointers. Detailed lookups live in references/:

| Topic | File |
| --- | --- |
| Resource selection (material / BGM / dubbing / templates): list commands, response formats, field mapping | `references/resources.md` |
| Full workflow steps with parameter tables and JSON examples (Fast Path + Standard Path) | `references/workflows.md` |
| Magic Video, the optional visual template step (catalog, params, language rules) | `references/magic-video.md` |
| Polling pattern, task types, file ops, user account, error codes | `references/operations.md` |

Pipeline at a Glance

                    ┌─── Fast Path (original, cheaper) ───┐
                    │   fast-writing → fast-clip-data     │
  Source material ──┤              ↓                      ├──→ video-composing ──→ (magic-video)
  (material list /  │   [video-composing keys off         │   final MP4 URL       optional visual
   search-movie /   │    fast-clip-data.task_order_num]   │                        template pass
   file upload)     └─────────────────────────────────────┘
                    ┌─── Standard Path (derivative) ──────┐
                    │   popular-learning → generate-      │
                    │   writing → clip-data               │
                    │              ↓                      │
                    │   [video-composing keys off         │
                    │    generate-writing.task_order_num] │
                    └─────────────────────────────────────┘

Agent Rules (mandatory — apply across all steps)

Always:

Confirm before acting. Every resource (source, BGM, dubbing, template) and every magic-video submission requires explicit user approval. Never auto-select, never auto-submit.

Source data, never invent. Construct confirmed_movie_json from material list fields or task search-movie output. If neither yields it, ask the user — do not fabricate.

Honor the language chain. The dubbing voice's language defines the writing task language param AND every magic-video text param. All three must match. → references/magic-video.md § Language Awareness

Paginate material list to exhaustion, search programmatically. Fetch every page until the number of items retrieved reaches the reported total, then search the JSON with grep -i or python3 -c. Never trust truncated terminal display.

Poll with the canonical while loop at 5-second intervals. Never use a fixed-iteration for loop. → references/operations.md § Task Polling
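The pagination and polling rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the canonical loop from references/operations.md, which remains authoritative. `fetch_page` and `query_task` are hypothetical callables standing in for however the agent invokes `material list` and `task query`; the field names `items`, `total`, `title`, and `status`, and the terminal status values, are assumptions for illustration.

```python
import time
from typing import Callable, Dict, List


def fetch_all_materials(fetch_page: Callable[[int], Dict]) -> List[Dict]:
    """Collect every page until the reported total is consumed.

    fetch_page(page) is a hypothetical wrapper; it is assumed to return
    a dict shaped like {"total": int, "items": [...]}.
    """
    items: List[Dict] = []
    page = 1
    while True:
        data = fetch_page(page)
        batch = data.get("items", [])
        items.extend(batch)
        # Stop when the total is reached, or when a page comes back empty
        # (guards against looping forever on an inconsistent total).
        if not batch or len(items) >= data.get("total", 0):
            return items
        page += 1


def search_materials(items: List[Dict], keyword: str) -> List[Dict]:
    """Case-insensitive match on the (assumed) "title" field."""
    needle = keyword.lower()
    return [m for m in items if needle in str(m.get("title", "")).lower()]


def poll_task(query_task: Callable[[], Dict], interval: float = 5.0,
              sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll with a while loop (no fixed iteration count) until terminal."""
    terminal = {"success", "failed"}  # assumed status values
    result = query_task()
    while result.get("status") not in terminal:
        sleep(interval)
        result = query_task()
    return result
```

Passing `sleep` as a parameter keeps the 5-second default while letting a dry run substitute a no-op, so the loop can be exercised without waiting.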

Never:

Submit magic-video without showing the full request body (templates + every template_params value) and getting user confirmation. Magic-video costs 30 pts/minute, and the charge is irreversible.

Submit Chinese default values for magic-video text params when narration language is non-Chinese. The defaults are hardcoded Chinese and will appear as Chinese text in a non-Chinese video.

Submit .task_id (32-char hex) as order_num. Downstream tasks expect .task_order_num (the prefixed string, such as generate_writing_xxxxx), not .task_id. Submitting the hex value returns error 10001 任务关联记录数据异常 ("task association record data error"). The other look-alike, .results.order_info.order_num (script_xxxxx), is also wrong; see references/operations.md § Task Query Response Shape.

Auto-switch paths after a failure. If a step fails, surface the error to the user and ask explicitly: retry the same path, switch to the other path, or abort. Never infer a path switch on the agent's own initiative.
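Two of the prohibitions above lend themselves to a mechanical pre-submission check. The following is a heuristic sketch, not part of the CLI: the function names are hypothetical, the `script_` prefix test relies only on the example shape given above, and the character range covers only the basic CJK Unified Ideographs block, so it will also flag other languages written with those characters.

```python
import re
from typing import Dict, List

HEX32 = re.compile(r"[0-9a-fA-F]{32}")
CJK = re.compile(r"[\u4e00-\u9fff]")


def check_order_num(value: str) -> str:
    """Return "ok" or a reason the value should not be sent as order_num."""
    if HEX32.fullmatch(value):
        return "looks like .task_id (32-char hex); use .task_order_num"
    if value.startswith("script_"):
        return "looks like .results.order_info.order_num; use .task_order_num"
    return "ok"


def params_with_chinese_text(template_params: Dict[str, object]) -> List[str]:
    """List the param names whose string values contain Chinese characters."""
    return [name for name, val in template_params.items()
            if isinstance(val, str) and CJK.search(val)]
```

For a non-Chinese narration, a non-empty result from `params_with_chinese_text` suggests that a hardcoded default was left in place and should be shown to the user before anything is submitted.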

Prerequisites

This skill assumes the narrator-ai-cli binary is installed and configured with a valid NARRATOR_APP_KEY. See README.md for install / setup. Agents can verify with narrator-ai-cli user balance.
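An agent could wrap the verification step roughly as shown below. This is a sketch under assumptions: `cli_ready` is a hypothetical helper name, and treating exit code 0 from `user balance` as proof of a valid key is an assumption about the CLI's behavior, not something this file states.

```python
import shutil
import subprocess


def cli_ready(binary: str = "narrator-ai-cli") -> bool:
    """True if the binary is on PATH and `user balance` exits with code 0.

    Treating exit code 0 as "configured correctly" is an assumption.
    """
    if shutil.which(binary) is None:
        return False
    try:
        done = subprocess.run([binary, "user", "balance"],
                              capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return done.returncode == 0
```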

Core Concepts

| Concept | Description |
| --- | --- |
| file_id | 32-char hex string identifying an uploaded file. Obtained via `file upload` or from task results |
| task_id | 32-char hex string returned on task creation. Poll with `task query` |
| task_order_num | Assigned after task creation. Used as `order_num` for downstream tasks |
| files[] | Output files in the completed task response (flat, top-level array). Each entry has `file_id`, `file_path`, `suffix`. Read `.files[0].file_id` for the next step's input |
| learning_model_id | Narration style model, taken from a pre-built template (90+) or a `popular-learning` result |
| learning_srt | Reference SRT file_id. Mutually exclusive with `learning_model_id` |
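The way these identifiers hand off between steps can be sketched as follows. The helper names are hypothetical and the sample values used with them are made up; only the field paths (`.files[0].file_id`, `.task_order_num`) and the mutual exclusivity rule come from the table above.

```python
from typing import Dict, Optional


def next_step_inputs(task: Dict) -> Dict[str, str]:
    """Pull the two values a downstream task needs from a completed task."""
    files = task.get("files") or []
    if not files:
        raise ValueError("completed task has no files[] entries")
    return {
        "file_id": files[0]["file_id"],       # .files[0].file_id
        "order_num": task["task_order_num"],  # not .task_id
    }


def style_source(learning_model_id: Optional[str] = None,
                 learning_srt: Optional[str] = None) -> Dict[str, str]:
    """Reject requests that set both mutually exclusive style params."""
    if learning_model_id is not None and learning_srt is not None:
        raise ValueError("learning_model_id and learning_srt are mutually exclusive")
    chosen = {"learning_model_id": learning_model_id, "learning_srt": learning_srt}
    return {key: val for key, val in chosen.items() if val is not None}
```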

Conversation Initiation

⚠️ Agent behavior — first message of a session: Before asking the user for a movie title or workflow path, proactively orient them about what the skill offers. Most users assume they need to upload their own video + SRT and don't realize a pre-built material library ships with the skill. Skipping this step often results in unnecessary uploads or aborted sessions.

Required opening (adapt to the conversation language):

Lead with the pre-built material library. Mention upfront that ~100 ready-to-use movies are available with video + SRT already loaded — no upload needed in most cases.
Offer three concrete entry points (let the user pick one):
- "I have a specific movie in mind" → take the title, search materials first, fall back to task search-movie only if not found
- "Show me what's available" → run material list --json and present 5–8 titles spanning varied genres; offer to filter by genre on request
- "I'll upload my own video + SRT" → guide through file upload
Defer the Fast vs Standard path question until source material is confirmed. Asking both at once forces a decision the user has no context for yet.
Optionally share the visual resources preview link (BGM / dubbing / templates browsable visually): https://ceex7z9m67.feishu.cn/wiki/WLPnwBysairenFkZDbicZOfKnbc — but only if the user wants to browse, not as a wall of links upfront.

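Example opening (translated from a Chinese conversation):

Hello, and welcome to AI 解说大师. This skill helps you produce narrated videos for movies and short dramas. About 100 movies are built in, with video and subtitles ready to use, so in most cases you do not need to upload any files yourself.

How would you like to start?

Tell me a title directly: I will check the built-in material library first and search externally only if it is not there.

Have me list some built-in material: you can pick by genre (comedy / action / mystery / sci-fi…).

Upload your own video and subtitles: I will guide you through the upload process.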
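For the "show me what's available" entry point, one way to pick a handful of titles spanning varied genres is a round-robin over genres. This is a sketch only: the `title` and `genre` field names are assumptions (the actual field mapping is in references/resources.md), and `pick_varied_titles` is a hypothetical helper.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, List


def pick_varied_titles(materials: List[Dict], limit: int = 8) -> List[str]:
    """Round-robin across genres so the shortlist is not one genre deep."""
    by_genre: Dict[str, List[str]] = defaultdict(list)
    for m in materials:
        by_genre[m.get("genre", "unknown")].append(m["title"])
    picked: List[str] = []
    # zip_longest walks the first title of every genre, then the second, ...
    for round_ in zip_longest(*by_genre.values()):
        for title in round_:
            if title is not None and len(picked) < limit:
                picked.append(title)
    return picked
```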

After source material is confirmed, walk the user through the decision sequence below — one question per turn, in order. Do NOT collapse multiple decisions into one message; users cannot reason about target_mode before they've picked a path.

Decision sequence (each step waits for explicit user confirmation):

Source material: covered above.
Workflow path: Fast (原创文案, original script) or Standard (二创文案, derivative script). See "Two Workflow Paths" below.
target_mode: ask only if path = Fast. Choose mode 1 / 2 / 3 (see "Fast Path internal: target_mode" below). If path = Standard, skip this question entirely; Standard Path has no target_mode.
BGM → Dubbing voice → Narration template: see "Resource Selection Protocol".
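The one-question-per-turn ordering above can be expressed as a small lookup that returns the next unanswered decision. This is an illustrative sketch: the step keys and the `"fast"` / `"standard"` values are hypothetical names chosen for the example, not CLI parameters.

```python
from typing import Dict, Optional

ORDER = ["source", "path", "target_mode", "bgm", "dubbing", "template"]


def next_question(confirmed: Dict[str, object]) -> Optional[str]:
    """Return the next decision to ask about, or None when all are confirmed."""
    for step in ORDER:
        if step in confirmed:
            continue
        # target_mode exists only on the Fast Path; skip it for Standard.
        # (If "path" were missing we would have returned "path" already.)
        if step == "target_mode" and confirmed.get("path") != "fast":
            continue
        return step
    return None
```

Because the function returns a single step at a time, an agent built around it cannot ask about target_mode before the path is confirmed.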

⚠️ Anti-pattern (do NOT do this): Asking "① narration mode (pure narration / original-audio mix) ② production route (fast / standard)" in the same message. Pure narration (纯解说) and original-audio mix (原声混剪) are Fast Path internal modes (target_mode 1 vs 2). They do not exist in Standard Path. Asking about them alongside the path choice forces the user to make decisions in the wrong order and conflates two layers of the decision tree.

Two Workflow Paths

Two end-to-end paths produce a finished narrated video. Choose with the user before starting.

	Fast Path (原创文案, recommended)	Standard Path (二创文案)
Pipeline	material → fast-writing → fast-clip-data → video-composing → magic-video*	material → popular-learning** → generate-writing →
