Explore skills
5,474 skills found
lambda-labs-gpu-cloud
Reserved and on-demand GPU cloud instances for ML training and inference. Use when you need dedicated GPU instances with simple SSH access, persistent filesystems, or high-performance multi-node clusters for large-scale training.
instructor
Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor, a battle-tested structured output library.
outlines
Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines, dottxt.ai's structured generation library.
long-context
Extend transformer context windows using RoPE, YaRN, ALiBi, and position interpolation. Use when processing long documents, extending pre-trained models, or implementing efficient positional encodings. Covers positional embedding and length-extrapolation strategies for LLMs.
brainstorming-research-ideas
Guides researchers through structured ideation frameworks to discover high-impact research directions. Use when exploring new problem spaces, pivoting between projects, or seeking novel angles on existing work.
qdrant-vector-search
High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems that need fast nearest-neighbor search, hybrid search with filtering, or scalable Rust-powered vector storage.
ml-paper-writing
Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions; for systems venues, use 'systems-paper-writing'.
segment-anything-model
Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image.
implementing-llms-litgpt
Implement and train LLMs using Lightning AI's LitGPT, which supports over 20 pretrained architectures such as Llama, Gemma, Phi, Qwen, and Mistral and provides single-file implementations without abstraction layers. Use when you need clean model implementations, an educational view of model architectures, or production fine-tuning with LoRA/QLoRA.
awq-quantization
4-bit LLM weight compression using activation-aware weight quantization (AWQ), winner of the MLSys 2024 Best Paper Award, with a reported 3x speedup and minimal accuracy loss. Use when deploying large models on limited GPU memory or when you need faster, more accurate inference than GPTQ, especially for instruction-tuned and multimodal models.
pytorch-fsdp2
Adds PyTorch FSDP2 (fully_shard) to training scripts with correct init, sharding, mixed precision/offload config, and distributed checkpointing. Use when models exceed single-GPU memory or when you need DTensor-based sharding with DeviceMesh.
distributed-llm-pretraining-torchtitan
PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.