


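# Modal Troubleshooting Guide

## Installation Issues

### Authentication Failure

**Error**: `modal setup` does not complete, or the token is invalid

**Solution**:

```bash
# Re-authenticate
modal token new

# Show the current configuration
modal config show

# Set the token through environment variables
export MODAL_TOKEN_ID=ak-...
export MODAL_TOKEN_SECRET=as-...
```

### Package Installation Issues

**Error**: `pip install modal` fails

**Solution**:

```bash
# Upgrade pip
pip install --upgrade pip

# Install with a specific Python version
python3.11 -m pip install modal

# Prefer prebuilt wheels over source builds
pip install modal --prefer-binary
```

## Container Image Issues

### Image Build Failure

**Error**: `ImageBuilderError: Failed to build image`

**Solution**:

```python
import modal

# Pin package versions to avoid conflicts
image = modal.Image.debian_slim().pip_install(
    "torch==2.1.0",
    "transformers==4.36.0",  # pinned version
    "accelerate==0.25.0",
)

# Use a compatible CUDA version
image = modal.Image.from_registry(
    "nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04",  # match the PyTorch CUDA build
    add_python="3.11",
)
```

### Dependency Conflicts

**Error**: `ERROR: Cannot install package due to conflicting dependencies`

**Solution**:

```python
import modal

# Install dependencies in layers
base = modal.Image.debian_slim().pip_install("torch")
ml = base.pip_install("transformers")  # installed after torch

# Use uv for better dependency resolution
image = modal.Image.debian_slim().uv_pip_install(
    "torch",
    "transformers",
)
```

### Large Image Build Timeout

**Error**: The image build exceeds the time limit

**Solution**:

```python
import modal

# Split the build into multiple layers (better caching)
base = modal.Image.debian_slim().pip_install("torch")  # cached
ml = base.pip_install("transformers", "datasets")  # cached
# Rebuilt only when the code changes.
# Note: newer Modal releases deprecate copy_local_dir in favor of add_local_dir.
app_image = ml.copy_local_dir("./src", "/app")

# Download the model at build time rather than at runtime
image = modal.Image.debian_slim().pip_install("transformers").run_commands(
    "python -c 'from transformers import AutoModel; AutoModel.from_pretrained(\"bert-base\")'"
)
```

## GPU Issues

### GPU Not Available

**Error**: `RuntimeError: CUDA not available`

**Solution**:

```python
import modal

app = modal.App()  # assumed: the app this function belongs to

# Make sure a GPU is specified
@app.function(gpu="T4")  # the GPU must be requested explicitly
def my_function():
    import torch
    assert torch.cuda.is_available()

# Check CUDA compatibility in the image
image = modal.Image.from_registry(
    "nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04",
    add_python="3.11"
).pip_install(
    "torch",
    index_url="http"
```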
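
When a `CUDA not available` error like the one above appears, it can help to collect a few facts from inside the container before changing the image. The following is a minimal sketch, not part of the original guide: `cuda_report` is a hypothetical helper name, and it relies only on the standard library plus PyTorch's `torch.cuda` and `torch.version` attributes when PyTorch is installed.

```python
import shutil


def cuda_report() -> dict:
    """Collect basic facts about CUDA visibility in the current environment.

    Hypothetical diagnostic helper; it does not call any Modal API.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "torch_cuda_build": None,  # CUDA version PyTorch was compiled against
        "cuda_available": False,
        "device_count": 0,
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }
    try:
        import torch
    except ImportError:
        # Without PyTorch only the nvidia-smi check is meaningful.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

One way to use it is to call `cuda_report()` at the top of the GPU function and print the result. As a rough reading: `torch_cuda_build` being `None` suggests a CPU-only PyTorch build was installed, while `nvidia_smi_on_path` being `False` may indicate that no GPU was requested for the function.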

Source: claude-code-templates (MIT). The Chinese translation was generated by AI.

BAGUA AI