evaluating-code-models

Evaluates code generation models on HumanEval, MBPP, MultiPL-E, and 15+ other benchmarks using pass@k metrics. It builds on the BigCode Project's evaluation tooling, which HuggingFace leaderboards also use, and is suited to benchmarking code models and measuring code generation quality.
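For readers unfamiliar with pass@k, here is a minimal sketch of the commonly used unbiased estimator: for a problem with n generated samples of which c pass the tests, pass@k = 1 - C(n-c, k) / C(n, k). The function names `pass_at_k` and `mean_pass_at_k` are illustrative assumptions and are not taken from this skill or from any particular library.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: total samples generated, c: samples that passed the tests,
    k: number of attempts allowed. Hypothetical helper for illustration.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every draw of k contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random draw of k samples contains no passing one,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over a benchmark, given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

A benchmark score is then the mean over problems, for example `mean_pass_at_k([(10, 3), (10, 0)], k=1)` averages the per-problem estimates for two problems with 10 samples each.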

4 stars
Updated 13 days ago

License: MIT

How to add

/plugin marketplace add immacualate/claude-forge

The exact command may vary by repository. Check the README on GitHub.
