Explore skills

5,474 skills found

guidance

Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance - Microsoft Research's constrained generation framework.

Research and Web #llm #ai by Orchestra-Research

nanogpt

9.1k

A minimal educational GPT implementation by Andrej Karpathy, with roughly 300 lines each for the model definition and the training loop, that reproduces GPT-2 (124M) on OpenWebText. The code is clean and hackable, which suits learning transformers and the GPT architecture from scratch. It can be trained on Shakespeare (CPU) or OpenWebText (multi-GPU).

Research and Web #ai by Orchestra-Research

pytorch-lightning

9.1k

A high-level PyTorch framework featuring a Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), and a callbacks system, designed for minimal boilerplate. It scales from laptops to supercomputers with the same code, providing clean training loops with built-in best practices.

Research and Web #ai by Orchestra-Research

skypilot-multi-cloud-orchestration

9.1k

Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or optimize GPU costs across providers.

Research and Web #ai by Orchestra-Research

serving-llms-vllm

9.1k

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use it to deploy production LLM APIs, optimize inference, or serve models with limited GPU memory. It supports OpenAI-compatible endpoints, quantization, and tensor parallelism.

Research and Web #llm #deploy by Orchestra-Research

weights-and-biases

9.1k

Track ML experiments with automatic logging, visualize training in real time, optimize hyperparameters with sweeps, and manage a model registry with W&B, a collaborative MLOps platform.

Research and Web #ai by Orchestra-Research

evolving-ai-agents

9.1k

Provides guidance for automatically evolving and optimizing AI agents across any domain using LLM-driven evolution algorithms. Use when building self-improving agents, optimizing agent prompts and skills against benchmarks, or implementing automated agent evaluation loops.

Research and Web #llm #ai by Orchestra-Research

llama-cpp

9.1k

Runs LLM inference on CPUs, Apple Silicon, and consumer GPUs without requiring NVIDIA hardware. Use it for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or any setup where CUDA is unavailable. It supports GGUF quantization (1.5 to 8 bits), which reduces memory use and gives a 4-10x speedup over PyTorch on CPU.

Research and Web #llm #deploy by Orchestra-Research

sglang

9.1k

Fast structured generation and serving for LLMs using RadixAttention prefix caching. Use it for JSON/regex outputs, constrained decoding, agentic workflows, or workloads with heavy prefix sharing, where it is reported to be up to 5x faster than vLLM. It is reported to power over 300,000 GPUs at major tech companies.

Research and Web #llm #ai by Orchestra-Research

deepspeed

9.1k

Expert guidance for distributed training with DeepSpeed, covering ZeRO optimization stages, pipeline parallelism, FP16/BF16/FP8 mixed precision, 1-bit Adam, and sparse attention.

Research and Web #ai by Orchestra-Research

evaluating-llms-harness

9.1k

Evaluates LLMs across 60+ academic benchmarks like MMLU and HumanEval. It's an industry standard for benchmarking model quality, comparing models, and tracking training progress, supporting HuggingFace, vLLM, and APIs.

Research and Web #llm #ai by Orchestra-Research

nemo-guardrails

9.1k

NVIDIA's runtime safety framework for LLM applications. It provides jailbreak, hallucination, and toxicity detection, alongside input/output validation, fact-checking, and PII filtering. It uses the Colang 2.0 DSL for programmable rails, is production-ready, and runs on T4 GPUs.

Research and Web #llm #ai by Orchestra-Research