
training-llms-megatron

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use it for models with more than 1B parameters, when you need maximum GPU efficiency (47% MFU on H100), or when a job has to combine several types of parallelism. Megatron-Core is a production-ready framework used for Nemotron, LLaMA, and DeepSeek.
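To put the MFU (model FLOPs utilization) figure in context, here is a minimal sketch of how the metric is commonly estimated. It is not taken from this skill or from Megatron-Core. It assumes the widely used approximation of about 6 FLOPs per parameter per training token for a dense transformer (attention FLOPs ignored). The function name, the example model size, the throughput, and the peak-FLOPs value are all illustrative assumptions.

```python
def model_flops_utilization(n_params: float,
                            tokens_per_sec_per_gpu: float,
                            peak_flops_per_gpu: float) -> float:
    """Estimate MFU as achieved training FLOPs divided by hardware peak FLOPs.

    Assumption: a dense transformer spends roughly 6 * n_params FLOPs per
    training token (forward + backward), with attention FLOPs ignored.
    """
    if peak_flops_per_gpu <= 0:
        raise ValueError("peak_flops_per_gpu must be positive")
    achieved_flops_per_sec = 6.0 * n_params * tokens_per_sec_per_gpu
    return achieved_flops_per_sec / peak_flops_per_gpu


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only (not measured results).
    n_params = 70e9              # assumed 70B-parameter dense model
    tokens_per_sec = 900.0       # assumed per-GPU training throughput
    peak_flops = 989e12          # assumed dense BF16 peak for one H100 SXM
    mfu = model_flops_utilization(n_params, tokens_per_sec, peak_flops)
    print(f"Estimated MFU: {mfu:.1%}")
```

To compare against the figure quoted above, substitute the throughput you measure and the peak FLOPs for your own GPU and precision.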

9.1k stars
Updated 2 months ago

License: MIT

How to add

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs

The exact command may vary by repository. Check the README on GitHub.


