pipeline-eval

Name: pipeline-eval
Rating: 5 (1 review)
Author: EvXata

A system-level evaluation framework for multi-stage LLM pipelines, scoring the entire pipeline across 8 dimensions including input/output quality and prompt design. It complements `deepeval` by evaluating the pipeline architecture itself, rather than single content artifacts.

1 star

Updated 2 months ago

License: MIT

How to add

/plugin marketplace add EvXata/deepeval-bcg

The exact command may vary by repository. Check the README on GitHub.


#llm


pipeline-eval — System-level evaluation for LLM pipelines

What this skill is for

deepeval scores one content artifact. pipeline-eval scores the system that produced it. Different question, different rubric, complementary.

Use pipeline-eval when:

The user has a multi-stage pipeline (pipeline.json, n8n workflow, LangGraph, custom orchestrator)
They want to know where quality leaks: at input, at the prompt, at sequencing, at fact-grounding, or somewhere else
They

[Description truncated. See the full README on GitHub.]
