DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
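For context, a minimal sketch of what training with DeepSpeed typically looks like; the placeholder model, config values, and batch shapes below are illustrative assumptions, not taken from this list, and the script is meant to be run via the `deepspeed` launcher.

```python
# Minimal DeepSpeed training sketch (placeholder model and hypothetical config).
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # placeholder model

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 1},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that applies the
# parallelism, mixed-precision, and ZeRO settings from the config.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

inputs = torch.randn(8, 1024).to(model_engine.device).half()
loss = model_engine(inputs).float().mean()
model_engine.backward(loss)  # engine-managed backward (handles loss scaling)
model_engine.step()          # engine-managed optimizer step
```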
An efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server.
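An OpenAI-compatible server can be queried with the official `openai` Python client by pointing it at the local endpoint; the host, port, and model name below are assumptions for illustration.

```python
# Minimal sketch of calling an OpenAI-compatible API server.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local server address
    api_key="EMPTY",                      # many local servers ignore the key
)

response = client.chat.completions.create(
    model="my-local-model",               # hypothetical model name
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```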
A large-model inference system for the Enflame GCU (S60), built on the native vLLM framework.
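Since the system is built on native vLLM, offline inference presumably follows upstream vLLM's Python API; this is a sketch under that assumption, with a hypothetical model name.

```python
# Minimal vLLM-style offline inference sketch (model name is hypothetical).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["What is the Enflame GCU?"], params)
for out in outputs:
    print(out.outputs[0].text)
```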
Minimalist ML framework for Rust
ccache – a fast compiler cache
PaddlePaddle Model Zoo
A collection of models developed and maintained by Enflame, providing training and inference application examples for classic and SOTA models across AI application domains (including but not limited to CV, NLP, and recommendation).