nemo-curator

GPU-accelerated data curation for LLM training. Supports text, image, video, and audio data. Features include fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, and NSFW detection. Scales across GPUs with RAPIDS. Use it to prepare high-quality training datasets, clean web data, or deduplicate large corpora.

7 stars
Updated 2 months ago

License: MIT

How to add

/plugin marketplace add braxtonROSE4/zorro-agent

The exact command may vary by repository. Check the README on GitHub.

For the skill author

Drop this on your repo README

Shows your skill is listed on Skillteca, generates a backlink and trackable traffic.

[![Listada na Skillteca](https://www.skillteca.com.br/api/badge/nemo-curator-braxtonrose4/svg)](https://www.skillteca.com.br/skills/nemo-curator-braxtonrose4?utm_source=badge&utm_medium=readme&utm_campaign=badge)
