We support FlashMLA from day one, with the code open-sourced at:
Repository: https://github.com/MetaX-MACA/FlashMLA
Issues, feedback, and contributions are welcome: https://github.com/MetaX-MACA/FlashMLA/issues
FlashMLA on MXMACA
We provide an implementation of FlashMLA derived from FlashAttention-2 (version 2.6.3), built on the MACA toolkit and C500 chips. It currently supports:
- Datatypes fp16 and bf16.
- Multi-Token Parallelism = 1.
- Paged KV cache with a block size equal to 2^n (n >= 0).
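The paged KV-cache constraint above means the block size must be a power of two (1, 2, 4, 8, ...). A minimal sketch of that validity check in plain Python; the function name and `block_size` parameter are illustrative, not part of the MetaX API:

```python
def is_valid_block_size(block_size: int) -> bool:
    """Return True if block_size is 2**n for some n >= 0."""
    # A positive integer is a power of two iff exactly one bit is set,
    # which is equivalent to (x & (x - 1)) == 0 for x > 0.
    return block_size > 0 and (block_size & (block_size - 1)) == 0

# Block sizes a paged KV cache could use under this constraint (up to 128):
valid_sizes = [b for b in range(1, 129) if is_valid_block_size(b)]
# valid_sizes == [1, 2, 4, 8, 16, 32, 64, 128]
```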
How to run on an MXMACA device
Installation
Requirements:
- MXMACA GPUs.
- MACA development toolkit.
- mcTlass source code.
- mcPytorch 2.1 and mcTriton 2.1 (or later) from the MACA toolkit wheel packages.
To install: