narrator-ai-cli — AI Video Narration CLI Skill
CLI client for the Narrator AI video narration API. Designed for AI agents and developers.
- CLI repo: https://github.com/NarratorAI-Studio/narrator-ai-cli
- Resources preview (BGM / dubbing / templates): https://ceex7z9m67.feishu.cn/wiki/WLPnwBysairenFkZDbicZOfKnbc
Reference Index
This file covers decision flow, the common workflow, and pointers. Detailed lookups live in references/:
| Topic | File |
|---|---|
| Resource selection (material / BGM / dubbing / templates) — list commands, response formats, field mapping | references/resources.md |
| Full workflow steps with parameter tables and JSON examples (Fast Path + Standard Path) | references/workflows.md |
| Magic Video — optional visual template step (catalog, params, language rules) | references/magic-video.md |
| Polling pattern, task types, file ops, user account, error codes | references/operations.md |
Pipeline at a Glance
                  ┌─── Fast Path (原创文案, cheaper) ───┐
                  │  fast-writing → fast-clip-data      │
Source material ──┤         ↓                           ├──→ video-composing ──→ (magic-video)
(material list /  │  [video-composing keys off          │    final MP4 URL       optional visual
 search-movie /   │   fast-clip-data.task_order_num]    │                        template pass
 file upload)     └─────────────────────────────────────┘
                  ┌─── Standard Path (二创文案) ────────┐
                  │  popular-learning → generate-       │
                  │    writing → clip-data              │
                  │         ↓                           │
                  │  [video-composing keys off          │
                  │   generate-writing.task_order_num]  │
                  └─────────────────────────────────────┘
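The one dependency the diagram encodes, which upstream task's `task_order_num` the `video-composing` step consumes on each path, can be restated as a small lookup. This is an illustrative sketch only: `PIPELINES` and `composing_source` are hypothetical names invented here and are not part of narrator-ai-cli.

```python
# Hypothetical lookup table restating the diagram; not part of narrator-ai-cli.
PIPELINES = {
    "fast": {
        "steps": ["fast-writing", "fast-clip-data", "video-composing"],
        "composing_keys_off": "fast-clip-data",
    },
    "standard": {
        "steps": ["popular-learning", "generate-writing", "clip-data", "video-composing"],
        "composing_keys_off": "generate-writing",
    },
}


def composing_source(path: str) -> str:
    """Return the step whose task_order_num video-composing should receive."""
    try:
        return PIPELINES[path]["composing_keys_off"]
    except KeyError:
        raise ValueError(f"unknown path: {path!r}") from None


print(composing_source("fast"))      # fast-clip-data
print(composing_source("standard"))  # generate-writing
```

Note that on the Standard Path the source is `generate-writing`, not the `clip-data` step that runs immediately before composing.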
Agent Rules (mandatory — apply across all steps)
Always:
- Confirm before acting. Every resource (source, BGM, dubbing, template) and every `magic-video` submission requires explicit user approval. Never auto-select, never auto-submit.
- Source data, never invent. Construct `confirmed_movie_json` from `material list` fields or `task search-movie` output. If neither yields it, ask the user; do not fabricate.
- Honor the language chain. The dubbing voice's language defines the writing task `language` param AND every `magic-video` text param. All three must match. → `references/magic-video.md` § Language Awareness
- Paginate `material list` to exhaustion, search programmatically. Fetch all pages until `total` is consumed, then run `grep -i` or `python3 -c` on the JSON. Never trust truncated terminal display.
- Poll with the canonical `while` loop at 5-second intervals. Never use a fixed-iteration `for` loop. → `references/operations.md` § Task Polling
Never:
- Submit `magic-video` without showing the full request body (templates + every `template_params` value) and getting user confirmation. The cost is 30 pts/minute and irreversible.
- Submit Chinese default values for `magic-video` text params when the narration language is non-Chinese. The defaults are hardcoded Chinese and will appear as Chinese text in a non-Chinese video.
- Submit `.task_id` (32-char hex) as `order_num`. Downstream tasks want `.task_order_num` (the prefixed string like `generate_writing_xxxxx`), not `.task_id`. Submitting the hex returns `10001 任务关联记录数据异常` (roughly: "task association record data is abnormal"). The other look-alike, `.results.order_info.order_num` (`script_xxxxx`), is also wrong; see `references/operations.md` § Task Query Response Shape.
- Auto-switch paths after a failure. If a step fails, surface the error to the user and ask explicitly: retry the same path, switch to the other path, or abort. Never infer a path switch on the agent's own initiative.
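Three of the mechanical rules above (use `task_order_num` rather than `task_id`, paginate until `total` is consumed, poll in an open-ended `while` loop) can be sketched in Python. This is a minimal sketch under stated assumptions, not the canonical pattern, which lives in `references/operations.md`. The helper names, the `fetch_page` / `query` / `is_done` callables, the page-number argument, and the `"items"` key are all hypothetical. Only `task_id`, `task_order_num`, and `total` come from this document.

```python
import re
import time
from typing import Any, Callable

HEX32 = re.compile(r"[0-9a-fA-F]{32}")  # shape of a task_id / file_id


def pick_order_num(task: dict[str, Any]) -> str:
    """Return the value to pass downstream as order_num (never task_id)."""
    order_num = task.get("task_order_num")
    if not isinstance(order_num, str) or not order_num:
        raise ValueError("task_order_num missing; ask the user instead of guessing")
    if HEX32.fullmatch(order_num):
        raise ValueError("value looks like a 32-char hex task_id, not task_order_num")
    return order_num


def fetch_all(fetch_page: Callable[[int], dict[str, Any]]) -> list[Any]:
    """Collect items page by page until `total` is consumed.

    fetch_page(n) is a hypothetical callable that returns one page as
    {"total": int, "items": [...]}; the "items" key is an assumption.
    """
    items: list[Any] = []
    page = 1
    while True:
        data = fetch_page(page)
        batch = data.get("items", [])
        items.extend(batch)
        total = data.get("total")
        if not batch or (total is not None and len(items) >= total):
            return items
        page += 1


def poll(query: Callable[[], Any], is_done: Callable[[Any], bool],
         interval: float = 5.0, sleep: Callable[[float], None] = time.sleep) -> Any:
    """Open-ended while loop at 5-second intervals, with no iteration cap."""
    while True:
        result = query()
        if is_done(result):
            return result
        sleep(interval)
```

The `sleep` parameter exists only so the loop can be exercised without waiting. What counts as "done" (including failure states) is left to the caller-supplied `is_done`, because the task status values are not listed in this file.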
Prerequisites
This skill assumes the narrator-ai-cli binary is installed and configured with a valid NARRATOR_APP_KEY. See README.md for install / setup. Agents can verify with narrator-ai-cli user balance.
Core Concepts
| Concept | Description |
|---|---|
| file_id | 32-char hex string for uploaded files. Obtained via file upload or from task results |
| task_id | 32-char hex string returned on task creation. Poll with task query |
| task_order_num | Assigned after task creation. Used as order_num for downstream tasks |
| files[] | Output files in the completed task response (flat, top-level array). Each entry has file_id, file_path, suffix. Read .files[0].file_id for the next step's input |
| learning_model_id | Narration style model — from a pre-built template (90+) or popular-learning result |
| learning_srt | Reference SRT file_id. Mutually exclusive with learning_model_id |
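Two rows of the table imply small checks that are easy to get wrong: reading `.files[0].file_id` from a completed task response, and never sending `learning_model_id` together with `learning_srt`. The sketch below illustrates both. The function names are hypothetical, and the response is assumed to be an already-parsed JSON object with the flat, top-level `files` array described above.

```python
from typing import Any, Optional


def first_file_id(task_response: dict[str, Any]) -> str:
    """Read .files[0].file_id from a completed task response."""
    files = task_response.get("files") or []
    if not files:
        raise ValueError("completed task response has no files[] entries")
    return files[0]["file_id"]


def style_params(learning_model_id: Optional[str] = None,
                 learning_srt: Optional[str] = None) -> dict[str, str]:
    """Build the narration-style parameters, rejecting the forbidden combination."""
    if learning_model_id is not None and learning_srt is not None:
        raise ValueError("learning_model_id and learning_srt are mutually exclusive")
    params: dict[str, str] = {}
    if learning_model_id is not None:
        params["learning_model_id"] = learning_model_id
    if learning_srt is not None:
        params["learning_srt"] = learning_srt
    return params
```

The table only states that the two parameters are mutually exclusive, so the sketch rejects the case where both are set and does not assume that one of them is required.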
Conversation Initiation
⚠️ Agent behavior — first message of a session: Before asking the user for a movie title or workflow path, proactively orient them about what the skill offers. Most users assume they need to upload their own video + SRT and don't realize a pre-built material library ships with the skill. Skipping this step often results in unnecessary uploads or aborted sessions.
Required opening (adapt to the conversation language):
- Lead with the pre-built material library. Mention upfront that ~100 ready-to-use movies are available with video + SRT already loaded — no upload needed in most cases.
- Offer three concrete entry points (let the user pick one):
  - "I have a specific movie in mind" → take the title, search materials first, fall back to `task search-movie` only if not found
  - "Show me what's available" → run `material list --json` and present 5–8 titles spanning varied genres; offer to filter by genre on request
  - "I'll upload my own video + SRT" → guide through `file upload`
- Defer the Fast vs Standard path question until source material is confirmed. Asking both at once forces a decision the user has no context for yet.
- Optionally share the visual resources preview link (BGM / dubbing / templates browsable visually): https://ceex7z9m67.feishu.cn/wiki/WLPnwBysairenFkZDbicZOfKnbc — but only if the user wants to browse, not as a wall of links upfront.
Example opening (Chinese conversation):
你好,欢迎使用 AI 解说大师。这个技能可以帮你生成电影/短剧解说视频。我这边内置了约 100 部电影素材(视频 + 字幕都是现成的),所以大多数情况你不需要自己上传任何文件。
你想怎么开始?
- 直接告诉我片名 — 我先查内置素材库,没有再去外部搜
- 让我列一些内置素材 — 你可以按类型挑(喜剧 / 动作 / 悬疑 / 科幻…)
- 自己上传视频 + 字幕 — 我引导你完成上传流程
After source material is confirmed, walk the user through the decision sequence below — one question per turn, in order. Do NOT collapse multiple decisions into one message; users cannot reason about target_mode before they've picked a path.
Decision sequence (each step waits for explicit user confirmation):
- Source material — covered above.
- Workflow path — Fast (原创文案) or Standard (二创文案). See "Two Workflow Paths" below.
- `target_mode`: only ask if path = Fast. Choose mode 1 / 2 / 3 (see "Fast Path internal: `target_mode`" below). If path = Standard, skip this question entirely; Standard Path has no `target_mode`.
- BGM → Dubbing voice → Narration template: see "Resource Selection Protocol".
⚠️ Anti-pattern (do NOT do this): Asking "① 解说模式 (纯解说/原声混剪) ② 制作路线 (快速/标准)" (in English: "① narration mode (pure narration / original-audio mix) ② production route (fast / standard)") in the same message.
纯解说 (pure narration) and 原声混剪 (original-audio mix) are Fast Path internal modes (`target_mode` 1 vs 2). They do not exist in Standard Path. Asking about them alongside the path choice forces the user to make decisions in the wrong order and conflates two layers of the decision tree.
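The ordering rule in the decision sequence, where `target_mode` is asked only on the Fast Path and only after the path is chosen, can be expressed as a short function. This is an illustrative sketch: `decision_steps` and its step labels are hypothetical and not part of the CLI.

```python
def decision_steps(path: str) -> list[str]:
    """Ordered questions for a session, one per turn, given the chosen path."""
    if path not in ("fast", "standard"):
        raise ValueError(f"unknown path: {path!r}")
    steps = ["source material", "workflow path"]
    if path == "fast":
        # target_mode exists only on the Fast Path, so it follows the path choice.
        steps.append("target_mode")
    steps += ["BGM", "dubbing voice", "narration template"]
    return steps


for step in decision_steps("fast"):
    print(step)
```

Keeping the sequence as an ordered list makes the anti-pattern above structurally impossible: `target_mode` can never appear before, or in the same position as, the workflow path question.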
Two Workflow Paths
Two end-to-end paths produce a finished narrated video. Choose with the user before starting.
| | Fast Path (原创文案, recommended) | Standard Path (二创文案) |
|---|---|---|
| Pipeline | material → fast-writing → fast-clip-data → video-composing → magic-video* | material → popular-learning** → generate-writing → |