Explore skills
4,845 skills found
ml-pipeline
Designs and implements production-grade ML pipeline infrastructure: configures experiment tracking with MLflow or Weights & Biases, creates Kubeflow or Airflow DAGs for training orchestration, builds feature store schemas with Feast, deploys model registries, and automates retraining and validation workflows. Use when building ML pipelines, orchestrating training workflows, automating model lifecy
terraform-engineer
Use when implementing infrastructure as code with Terraform across AWS, Azure, or GCP. Invoke for module development (create reusable modules, manage module versioning), state management (migrate backends, import existing resources, resolve state conflicts), provider configuration, multi-environment workflows, and infrastructure testing.
gguf-quantization
GGUF format and llama.cpp quantization for efficient CPU/GPU inference. Use when deploying models on consumer hardware, Apple Silicon, or when needing flexible 2-8 bit quantization without GPU requirements.
prompt-guard
Meta's 86M-parameter prompt injection and jailbreak detector, which filters malicious prompts and third-party data for LLM applications. It reports over 99% TPR and under 1% FPR, runs in under 2 ms on GPU, supports 8 languages, and can be deployed via HuggingFace or batch processing for RAG security.
model-merging
Merge multiple fine-tuned models with mergekit to combine capabilities without retraining, ideal for creating specialized models by blending domain-specific expertise or improving performance. It covers various merging techniques like SLERP, TIES-Merging, DARE, Task Arithmetic, and linear merging, plus production deployment strategies.
nemo-evaluator-sdk
NVIDIA's enterprise-grade platform evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. It provides scalable evaluation on local Docker, Slurm HPC, or cloud platforms, featuring a container-first architecture for reproducible benchmarking.
modal-serverless-gpu
Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling.
llamaguard
Meta's 7-8B-parameter specialized moderation model filters LLM input/output across 6 safety categories: violence/hate, sexual content, weapons, substances, self-harm, and criminal planning. It reports 94-95% accuracy, can be deployed with vLLM, HuggingFace, or SageMaker, and integrates with NeMo Guardrails.
fine-tuning-openvla-oft
Fine-tunes and evaluates OpenVLA-OFT and OpenVLA-OFT+ policies for robot action generation using continuous action heads, LoRA adaptation, and FiLM conditioning on LIBERO simulation and ALOHA real-world setups. This is useful for reproducing paper results, training custom VLA action heads, deploying ALOHA inference, or debugging related components.
speculative-decoding
Accelerate LLM inference using speculative decoding, Medusa's multiple decoding heads, and lookahead decoding. These techniques increase inference speed (1.5-3.6x speedup), reduce latency for real-time applications, and are useful for deploying models with limited compute.
knowledge-distillation
Compress large language models using knowledge distillation from teacher to student models. This technique is useful for deploying smaller models with retained performance, transferring GPT-4 capabilities to open-source models, or reducing inference costs, covering strategies like temperature scaling, soft targets, reverse KLD, logit distillation, and MiniLLM training.
wiki
Claude + Obsidian knowledge companion. Sets up a persistent wiki vault, scaffolds structure from a one-sentence description, and routes to specialized sub-skills. Use for setup, scaffolding, cross-project referencing, and hot cache management. Triggers on: "set up wiki", "scaffold vault", "create knowledge base", "/wiki", "wiki setup", "obsidian vault", "knowledge base", "second brain setup", "run