agent-research-aggregator

Should I run? (decision gate)

Before starting Phase 1, check whether aggregation is actually needed:

Situation	Action
`workspace/inputs/idea.md` and `workspace/inputs/experimental_log.md` both exist and are non-empty	Skip this skill entirely. Proceed directly to `paper-orchestra`.
Either file is missing or empty, and the user provided a directory path	Run this skill with that directory as `--search-roots`.
Either file is missing or empty, and no directory was provided	Scan cwd and `~` by default; show the discovery summary to the user before continuing.
The inputs exist but look thin (e.g. `idea.md` has fewer than 5 lines, or `experimental_log.md` contains no numeric data)	Ask the user whether to supplement with aggregation or proceed as-is.

The skill is intentionally a pre-pass: it is cheap to skip and should run only when the structured inputs don't already exist.
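
The decision table above can be sketched as a small check. This is a minimal illustration, not part of the skill's scripts: the function name `decide`, its return labels, and the exact "thin" heuristic (fewer than 5 non-blank lines, no digit in the log) are assumptions made for the example.

```python
import re
from pathlib import Path
from typing import Optional


def decide(workspace: str, search_root: Optional[str] = None) -> str:
    """Return 'skip', 'ask', 'run', or 'scan-defaults' per the table above."""
    inputs = Path(workspace) / "inputs"
    idea = inputs / "idea.md"
    log = inputs / "experimental_log.md"

    def read(path: Path) -> str:
        # Missing files are treated the same as empty ones.
        return path.read_text(encoding="utf-8") if path.is_file() else ""

    idea_text, log_text = read(idea), read(log)
    if idea_text.strip() and log_text.strip():
        # Assumed heuristic for "thin": fewer than 5 non-blank lines in
        # idea.md, or no digit anywhere in experimental_log.md.
        non_blank = [ln for ln in idea_text.splitlines() if ln.strip()]
        if len(non_blank) < 5 or not re.search(r"\d", log_text):
            return "ask"
        return "skip"
    # Either file is missing or empty: aggregate, using the user's
    # directory if one was given, otherwise the default scan roots.
    return "run" if search_root else "scan-defaults"
```

A result of `"ask"` means the user should choose between supplementing and proceeding as-is; `"scan-defaults"` corresponds to scanning cwd and `~` and showing the discovery summary first.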

A pre-processing skill for PaperOrchestra (arXiv:2604.05018). Reads scattered experimentation artifacts from AI coding-agent cache directories and synthesizes them into the structured (I, E) input pair the PaperOrchestra pipeline expects.

[.claude/]  [.cursor/]  [.antigravity/]  [.openclaw/]
      │            │              │               │
      └────────────┴──────────────┴───────────────┘
                          │
                    Phase 1: Discovery
                  (discover_logs.py)
                          │
                    discovered_logs.json
                          │
                    Phase 2: Extraction
                  (LLM call per log batch)
                          │
                    raw_experiments.json
                          │
                    Phase 3: Synthesis
                  (LLM call — consolidate)
                          │
                    synthesis.json
                          │
                    Phase 4: Formatting
                  (format_po_inputs.py)
                          │
             ┌────────────┴────────────┐
      workspace/inputs/         workspace/ara/
        idea.md                   aggregation_report.md
        experimental_log.md       discovered_logs.json
                                  raw_experiments.json
                                  synthesis.json

The output drops directly into workspace/inputs/ so the user can immediately run paper-orchestra on the same workspace.

Inputs

Parameter	Required	Default	Description
`--search-roots`	no	cwd, `~`	Comma-separated directories to scan for agent caches
`--agents`	no	all	Comma-separated subset: `claude,cursor,antigravity,openclaw`
`--workspace`	no	`./workspace`	PaperOrchestra workspace root
`--depth`	no	4	Max directory scan depth (prevents runaway scans on large home dirs)
`--since`	no	none	Only include logs modified after this date (ISO 8601: `2025-01-01`)

The user specifies these when invoking the skill. If the current directory has no detectable agent caches, ask the user for --search-roots.

Phase 1 — Discovery (deterministic)

Run the discovery script to catalog every relevant log file:

python skills/agent-research-aggregator/scripts/discover_logs.py \
    --search-roots <roots> \
    --agents <agents> \
    --depth <depth> \
    --since <since> \
    --out workspace/ara/discovered_logs.json

The script exits with code 2 when no --project filter is set (this is expected on the first run). It prints a "Projects found" list to stdout — show it to the user immediately.

If no logs are found at all: stop and ask the user to specify --search-roots or point you at a directory that contains agent cache folders.
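
If the discovery step is driven from Python rather than typed by hand, the invocation and its exit codes might be handled as below. This is a sketch under assumptions: the helper names `build_discovery_cmd` and `run_discovery` are hypothetical, omitting an unset flag is assumed to fall back to the defaults listed under Inputs, and any exit code other than 0 or 2 is assumed to be an error.

```python
import subprocess
import sys
from typing import List, Optional, Tuple

SCRIPT = "skills/agent-research-aggregator/scripts/discover_logs.py"


def build_discovery_cmd(
    out: str,
    search_roots: Optional[str] = None,
    agents: Optional[str] = None,
    depth: Optional[int] = None,
    since: Optional[str] = None,
    project: Optional[str] = None,
) -> List[str]:
    """Assemble the argv list, skipping flags the user did not set."""
    cmd = [sys.executable, SCRIPT]
    flags = (
        ("--search-roots", search_roots),
        ("--agents", agents),
        ("--depth", depth),
        ("--since", since),
        ("--project", project),
    )
    for flag, value in flags:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd + ["--out", out]


def run_discovery(cmd: List[str]) -> Tuple[int, str]:
    """Run discovery; 0 and 2 are the exit codes this document describes."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode not in (0, 2):
        # Assumption: anything else is a real failure.
        raise RuntimeError(f"discovery failed ({proc.returncode}): {proc.stderr}")
    # Exit code 2 means no --project filter was set; stdout then holds
    # the "Projects found" list to show to the user.
    return proc.returncode, proc.stdout
```

Passing an argv list rather than a shell string avoids quoting problems when a project path contains spaces.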

Phase 1.5 — Project Selection (mandatory)

A paper can only be written from a single project. You must ask the user which project to use before any LLM processing begins.

Display the numbered project list from the discovery summary, e.g.:

Projects found:
  [1] /home/alice/projects/my-rl-experiment  (42 files)
  [2] /home/alice/projects/llm-eval-suite    (17 files)
  [3] /home/alice/projects/old-demo          (3 files)

Ask: "Which project should this paper be based on? Please choose a number or paste the project path."
Do not proceed to Phase 2 until the user has answered.
Re-run discovery with the chosen project to filter the manifest:

python skills/agent-research-aggregator/scripts/discover_logs.py \
    --search-roots <roots> \
    --agents <agents> \
    --depth <depth> \
    --since <since> \
    --project "<chosen project path>" \
    --out workspace/ara/discovered_logs.json

This overwrites discovered_logs.json so only the selected project's files remain. The script exits 0 on success.

If the discovery finds only one project: skip the question and inform the user: "Only one project found: <path>. Using it for the paper." — then re-run with --project automatically.

If the discovery summary shows irrelevant files after filtering: ask the user whether to include or exclude them before continuing to Phase 2. Err on the side of inclusion — the extraction prompt is conservative.
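
The selection rules above can be sketched as follows. The line format is assumed to match the example listing (`[n] path  (k files)`); the real script output may differ, and the names `parse_projects` and `resolve_choice` are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumed line shape, taken from the example listing: "[1] /path  (42 files)"
PROJECT_LINE = re.compile(r"^\s*\[(\d+)\]\s+(.+?)\s+\((\d+) files?\)\s*$")

Project = Tuple[int, str, int]  # (number, path, file count)


def parse_projects(stdout: str) -> List[Project]:
    """Extract the numbered project entries from the discovery summary."""
    projects = []
    for line in stdout.splitlines():
        match = PROJECT_LINE.match(line)
        if match:
            projects.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return projects


def resolve_choice(projects: List[Project], answer: Optional[str]) -> Optional[str]:
    """Map the user's answer (a number or a path) to a project path.

    Returns None when the user still has to be asked (or asked again).
    """
    if len(projects) == 1:
        return projects[0][1]  # single project: no question needed
    if answer is None:
        return None
    answer = answer.strip()
    if answer.isdigit():
        for number, path, _count in projects:
            if number == int(answer):
                return path
        return None
    return answer if answer in {path for _n, path, _c in projects} else None
```

A `None` result maps to "do not proceed to Phase 2"; any other result is the value to pass as `--project` on the re-run.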

Phase 2 — Extraction (LLM-assisted)

Process discovered logs in batches (group by agent type; keep batches under ~50 KB of raw text to stay within context limits):

For each batch:

Read the log files in the batch (the script's --list output tells you which file paths to read).
Apply the extraction prompt from references/extraction-prompt.md as your system message.
Pass the raw log text as the user message.
Collect the structured JSON the LLM returns (see schema in the prompt).
Append to workspace/ara/raw_experiments.json.
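
One way to form the batches is a greedy pass per agent type. This sketch assumes each manifest entry carries `agent`, `path`, and `size` (bytes) fields; the actual layout of discovered_logs.json is not specified here, so those field names and the function `make_batches` are illustrative only.

```python
from collections import defaultdict
from typing import Any, Dict, List

MAX_BATCH_BYTES = 50 * 1024  # the ~50 KB guideline from the text above

Entry = Dict[str, Any]


def make_batches(entries: List[Entry], max_bytes: int = MAX_BATCH_BYTES) -> List[List[Entry]]:
    """Group entries by agent, then pack each group greedily under max_bytes."""
    by_agent: Dict[str, List[Entry]] = defaultdict(list)
    for entry in entries:
        by_agent[entry["agent"]].append(entry)

    batches: List[List[Entry]] = []
    for agent in sorted(by_agent):
        current: List[Entry] = []
        size = 0
        for entry in by_agent[agent]:
            # Close the current batch if adding this file would overflow it.
            if current and size + entry["size"] > max_bytes:
                batches.append(current)
                current, size = [], 0
            current.append(entry)
            size += entry["size"]
        if current:
            batches.append(current)
    return batches
```

A single file larger than the limit ends up alone in its own batch; such a file would still need to be truncated or split before it is sent to the LLM.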

After all batches:

python skills/agent-research-aggregator/scripts/extract_experiments.py \
    --discovered workspace/ara/discovered_logs.json \
    --out workspace/ara/raw_experiments.json \
    --validate-only

The --validate-only flag checks that the combined JSON is well-formed and meets the minimum schema: the experiments array is non-empty, and each entry has at least one of hypothesis, method, or results. Fix any malformed entries before Phase 3.
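
For intuition, the minimum-schema rule can be written out directly. This is not the code of extract_experiments.py; it is a sketch that assumes the file is a JSON object with a top-level `experiments` key, and the function name `validate_raw_experiments` is hypothetical.

```python
from typing import Any, List

REQUIRED_ANY = ("hypothesis", "method", "results")


def validate_raw_experiments(doc: Any) -> List[str]:
    """Return a list of problems; an empty list means the minimum schema holds."""
    experiments = doc.get("experiments") if isinstance(doc, dict) else None
    if not isinstance(experiments, list) or not experiments:
        return ["'experiments' must be a non-empty array"]

    problems = []
    for index, entry in enumerate(experiments):
        if not isinstance(entry, dict):
            problems.append(f"experiments[{index}] is not an object")
        elif not any(entry.get(key) for key in REQUIRED_ANY):
            # Each entry needs at least one non-empty field of the three.
            problems.append(
                f"experiments[{index}] has none of: {', '.join(REQUIRED_ANY)}"
            )
    return problems
```

Load the file with `json.load` first; a `json.JSONDecodeError` at that point corresponds to the "well-formed" half of the check.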

Phase 3 — Synthesis (LLM-assisted)

Consolidate possibly-redundant experiment records from multiple agent caches into a single coherent research narrative. This is ONE LLM call.

System message: Use references/synthesis-prompt.md verbatim.

User message:

<raw_experiments>
{contents of workspace/ara/raw_experiments.json}
</raw_experiments>

The LLM must return a synthesis.json with keys:

research_question — the overarching question being investigated
hypothesis — the core proposed solution / claim
method_summary — how the approach works (concise, no data leakage)
key_contributions — 2–5 bullet strings
experimental_setup — datasets, metrics, baselines, implementation notes
results_tables — array of {title, headers[], rows[]} markdown-table objects
qualitative_observations — free-form text blocks (what worked, what didn't, failure modes, ablation insights)
iteration_history — ordered list of {iteration_id, change_description, outcome} entries if multiple iterations are detected
open_questions — questions that remain unanswered in the logs

Save to workspace/ara/synthesis.json.
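
Before saving, it can help to confirm that the LLM's reply actually carries the keys listed above. The check below is a sketch: treating `iteration_history` as optional is an assumption (the list makes it conditional on multiple iterations being detected), and the function name `check_synthesis` is hypothetical.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = (
    "research_question",
    "hypothesis",
    "method_summary",
    "key_contributions",
    "experimental_setup",
    "results_tables",
    "qualitative_observations",
    "open_questions",
)
# Assumption: only present when multiple iterations are detected.
OPTIONAL_KEYS = ("iteration_history",)


def check_synthesis(synthesis: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed synthesis.json."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in synthesis]

    if "key_contributions" in synthesis:
        count = len(synthesis["key_contributions"])
        if not 2 <= count <= 5:
            problems.append("key_contributions should hold 2-5 entries")

    # Each results table is expected to be a {title, headers, rows} object.
    for index, table in enumerate(synthesis.get("results_tables", [])):
        for field in ("title", "headers", "rows"):
            if field not in table:
                problems.append(f"results_tables[{index}] lacks '{field}'")
    return problems
```

An empty result means the structure matches the key list; it says nothing about whether the content is a faithful summary of the logs.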

Note: By this point, the user has already selected a single project in Phase 1.5. The synthesis should represent one coherent research thread. If the LLM still surfaces multiple disconnected research questions, flag this as a data quality warning in the audit report (Phase 5) but do not re-ask for project selection — that decision was made earlier.

Phase 4 — Formatting (deterministic)

Convert synthesis.json into PaperOrchestra input files:

pyt
