# HuggingFace Accelerate - Unified Distributed Training
## Quick Start
Accelerate reduces distributed training to 4 added lines of code.
**Install**:
```bash
pip install accelerate
```
**Convert a PyTorch script** (4 lines):
```diff
  import torch
+ from accelerate import Accelerator

+ accelerator = Accelerator()

  model = torch.nn.Transformer()
  optimizer = torch.optim.Adam(model.parameters())
  dataloader = torch.utils.data.DataLoader(dataset)

+ model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

  for batch in dataloader:
      optimizer.zero_grad()
      loss = model(batch)
-     loss.backward()
+     accelerator.backward(loss)
      optimizer.step()
```
**Run** (a single command):
```bash
accelerate launch train.py
```
## Common Workflows
### Workflow 1: From Single GPU to Multi-GPU
**Original script**:
```python
# train.py
import torch

model = torch.nn.Linear(10, 2).to('cuda')
optimizer = torch.optim.Adam(model.parameters())
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)

for epoch in range(10):
    for batch in dataloader:
        batch = batch.to('cuda')
        optimizer.zero_grad()
        loss = model(batch).mean()
        loss.backward()
        optimizer.step()
```
**With Accelerate** (add 4 lines):
```python
# train.py
import torch
from accelerate import Accelerator  # +1

accelerator = Accelerator()  # +2

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters())
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)  # +3

for epoch in range(10):
    for batch in dataloader:
        # No .to('cuda') needed - device placement is handled automatically!
        optimizer.zero_grad()
        loss = model(batch).mean()
        accelerator.backward(loss)  # +4
        optimizer.step()
```
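On multiple GPUs the loop above runs once per process, so prints and checkpoints would otherwise be duplicated. A minimal sketch of main-process-only output and saving, assuming the `accelerator` and `model` objects from the script above (the helpers shown are standard Accelerate methods):

```python
accelerator.print(f"epoch {epoch} done")            # prints on the main process only

accelerator.wait_for_everyone()                     # sync all processes before saving
unwrapped_model = accelerator.unwrap_model(model)   # strip the distributed wrapper
if accelerator.is_main_process:
    accelerator.save(unwrapped_model.state_dict(), "model.pt")
```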
**Configure** (interactive):
```bash
accelerate config
```
**Prompts** (answers are saved to a config file; see the check after this list):
- What kind of machine? (single/multi-GPU/TPU/CPU)
- How many machines? (1)
- Mixed precision? (no/fp16/bf16/fp8)
- DeepSpeed? (no/yes)
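The answers are written to a default config file (typically `~/.cache/huggingface/accelerate/default_config.yaml`), which `accelerate launch` reads automatically. A quick way to check what was saved:

```bash
# Show the Accelerate version, platform info, and the saved default config
accelerate env

# Or inspect the config file directly (default location; may vary per setup)
cat ~/.cache/huggingface/accelerate/default_config.yaml
```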
**Launch** (works with any setup):
```bash
# Single GPU
accelerate launch train.py

# Multi-GPU (8 GPUs)
accelerate launch --multi_gpu --num_processes 8 train.py

# Multi-node (2 machines)
accelerate launch --multi_gpu --num_processes 16 \
    --num_machines 2 --machine_rank 0 \
    --main_process_ip $MASTER_ADDR \
    train.py
```
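Inside the script, each launched process gets its own rank and device through the `Accelerator` object. A small sketch for sanity-checking a launch (the attributes shown are standard Accelerate properties):

```python
from accelerate import Accelerator

accelerator = Accelerator()
# Every process reports its rank, the total process count, and its assigned device.
print(f"rank {accelerator.process_index} of {accelerator.num_processes} "
      f"running on {accelerator.device}")
```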
### Workflow 2: Mixed Precision Training
**Enable FP16/BF16**:
```python
from accelerate import Accelerator