Bases: CpuPlatform
CPU platform with AMD Zen (ZenDNN/zentorch) optimizations.
Model-load time (dispatch_cpu_unquantized_gemm in layers/utils.py):
- Routes linear ops to zentorch_linear_unary.
- When VLLM_ZENTORCH_WEIGHT_PREPACK=1 (default), eagerly prepacks weights via zentorch_weight_prepack_for_linear.
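The load-time behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not vLLM's implementation. Only the environment variable name and its default of 1 come from the docstring. The helpers `prepack_enabled` and `build_linear`, the integer parsing of the flag, and the two stand-in ops are hypothetical, and the stand-ins do not reflect the real zentorch signatures.

```python
import os

PREPACK_ENV = "VLLM_ZENTORCH_WEIGHT_PREPACK"  # name taken from the docstring


def prepack_enabled(env=None):
    # Assumption: the flag is parsed as an integer; unset means 1 (enabled).
    env = os.environ if env is None else env
    return bool(int(env.get(PREPACK_ENV, "1")))


def build_linear(weight, linear_op, prepack_op, env=None):
    # Hypothetical dispatcher: prepack once at load time, then reuse the
    # (possibly prepacked) weight for every forward call.
    if prepack_enabled(env):
        weight = prepack_op(weight)
    return lambda x: linear_op(x, weight)


# Stand-ins for the zentorch ops (NOT the real signatures).
calls = []


def fake_prepack(w):
    calls.append("prepack")
    return w


def fake_linear(x, w):
    # Computes x @ w.T for plain nested lists.
    return [[sum(a * b for a, b in zip(row, w_row)) for w_row in w] for row in x]


layer = build_linear([[1.0, 2.0], [3.0, 4.0]], fake_linear, fake_prepack, env={})
print(layer([[1.0, 1.0]]))  # prints [[3.0, 7.0]]
```

The point of the eager variant is that the prepack cost is paid once when the layer is built, so repeated forward calls reuse the same packed weight. Passing `env={PREPACK_ENV: "0"}` in this sketch skips the prepack step entirely.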
Source code in vllm/platforms/zen_cpu.py
class ZenCpuPlatform(CpuPlatform):
    """CPU platform with AMD Zen (ZenDNN/zentorch) optimizations.

    Model-load time (dispatch_cpu_unquantized_gemm in layers/utils.py):
      - Routes linear ops to zentorch_linear_unary.
      - When VLLM_ZENTORCH_WEIGHT_PREPACK=1 (default), eagerly prepacks
        weights via zentorch_weight_prepack_for_linear.
    """

    device_name: str = "cpu"
    device_type: str = "cpu"

    def is_zen_cpu(self) -> bool:
        # is_cpu() also returns True for this platform (inherited from CpuPlatform).
        return True
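Because is_cpu() is inherited and also returns True, callers that need Zen-specific behavior have to test the more specific predicate first. The following is a minimal sketch using stand-in classes rather than the real vLLM ones. `describe` is a hypothetical helper, and the stand-in base class defines only what the sketch needs.

```python
class CpuPlatform:
    # Stand-in for vLLM's CpuPlatform (assumption: is_cpu() returns True).
    def is_cpu(self) -> bool:
        return True


class ZenCpuPlatform(CpuPlatform):
    def is_zen_cpu(self) -> bool:
        return True


def describe(platform) -> str:
    # Check the specific predicate first; a Zen platform also passes is_cpu().
    is_zen = getattr(platform, "is_zen_cpu", lambda: False)
    if is_zen():
        return "zen-cpu"
    if platform.is_cpu():
        return "cpu"
    return "other"


print(describe(ZenCpuPlatform()))  # prints zen-cpu
print(describe(CpuPlatform()))     # prints cpu
```

Reversing the two checks in `describe` would classify the Zen platform as a generic CPU, which is the pitfall the comment in is_zen_cpu hints at.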