diff --git a/README.md b/README.md
index de2864e758f527f5aa16a91ab1738b79acb9d675..4f77effa29669c91331ad7b1a3ce72fc1ec5f2c4 100644
--- a/README.md
+++ b/README.md
@@ -41,12 +41,14 @@ DeepSparkHub selects hundreds of application algorithms and models, covering the fields of AI and general-purpose computing
 | [Llama3-8B](nlp/llm/llama3_8b/pytorch) | PyTorch | Megatron-DeepSpeed | Bookcorpus | 4.1.1 |
 | [Llama3-8B](nlp/llm/llama3_8b/megatron-lm) | PyTorch | Megatron-LM | GPT Small-117M | 4.3.0 |
 | [Llama3-8B SFT](nlp/llm/llama3_8b_sft/pytorch) | PyTorch | ColossalAI | school_math_0.25M | 4.1.1 |
+| [Llama3-8B SFT](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Meta-Llama-3-8B | 4.3.0 |
 | [Llama3-8B PPO](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Llama-3-8b-sft-mixture | 4.2.0 |
 | [Llama3-8B DPO](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Llama-3-8b-sft-mixture | 4.2.0 |
 | [Llama3-8B KTO](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Llama-3-8b-sft-mixture | 4.2.0 |
 | [Mamba-2](nlp/llm/mamba-2/pytorch) | PyTorch | Megatron-LM | GPT Small-117M | 4.1.1 |
 | [MiniCPM](nlp/llm/minicpm/pytorch) | PyTorch | DeepSpeed | MiniCPM-2B-sft-bf16 | 4.2.0 |
 | [Mixtral 8x7B](nlp/llm/mixtral/pytorch) | PyTorch | Megatron-LM | GPT Small-117M | 4.1.1 |
+| [Mixtral 8x7B](nlp/llm/mixtral/openrlhf) | PyTorch | OpenRLHF | Mixtral-8x7B-v0.1 | 4.3.0 |
 | [Phi-3](nlp/llm/phi-3/pytorch) | PyTorch | Torchrun | Phi-3-mini-4k-instruct | 4.2.0 |
 | [QWen-7B](nlp/llm/qwen-7b/pytorch) | PyTorch | Firefly | qwen-7b | 3.4.0 |
 | [QWen1.5-7B](nlp/llm/qwen1.5-7b/pytorch) | PyTorch | Firefly | school_math | 4.1.1 |
diff --git a/README_en.md b/README_en.md
index 3116386737b60d693bde9627ba883a1c87b74a59..3aba08f0a1e1aa13d6fb69ef4730e9cea36ac8c5 100644
--- a/README_en.md
+++ b/README_en.md
@@ -43,12 +43,14 @@ individuals, healthcare, education, communication, energy, and more.
 | [Llama3-8B](nlp/llm/llama3_8b/pytorch) | PyTorch | Megatron-DeepSpeed | Bookcorpus | 4.1.1 |
 | [Llama3-8B](nlp/llm/llama3_8b/megatron-lm) | PyTorch | Megatron-LM | GPT Small-117M | 4.3.0 |
 | [Llama3-8B SFT](nlp/llm/llama3_8b_sft/pytorch) | PyTorch | ColossalAI | school_math_0.25M | 4.1.1 |
+| [Llama3-8B SFT](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Meta-Llama-3-8B | 4.3.0 |
 | [Llama3-8B PPO](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Llama-3-8b-sft-mixture | 4.2.0 |
 | [Llama3-8B DPO](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Llama-3-8b-sft-mixture | 4.2.0 |
 | [Llama3-8B KTO](nlp/llm/llama3_8b/openrlhf) | PyTorch | OpenRLHF | Llama-3-8b-sft-mixture | 4.2.0 |
 | [Mamba-2](nlp/llm/mamba-2/pytorch) | PyTorch | Megatron-LM | GPT Small-117M | 4.1.1 |
 | [MiniCPM](nlp/llm/minicpm/pytorch) | PyTorch | DeepSpeed | MiniCPM-2B-sft-bf16 | 4.2.0 |
 | [Mixtral 8x7B](nlp/llm/mixtral/pytorch) | PyTorch | Megatron-LM | GPT Small-117M | 4.1.1 |
+| [Mixtral 8x7B](nlp/llm/mixtral/openrlhf) | PyTorch | OpenRLHF | Mixtral-8x7B-v0.1 | 4.3.0 |
 | [Phi-3](nlp/llm/phi-3/pytorch) | PyTorch | Torchrun | Phi-3-mini-4k-instruct | 4.2.0 |
 | [QWen-7B](nlp/llm/qwen-7b/pytorch) | PyTorch | Firefly | qwen-7b | 3.4.0 |
 | [QWen1.5-7B](nlp/llm/qwen1.5-7b/pytorch) | PyTorch | Firefly | school_math | 4.1.1 |
diff --git a/RELEASE.md b/RELEASE.md
index 30575c0b666935783ae702b6a3aeec37fa81fc29..de9449591beac6accee4d1a72854b59020bee2b8 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -3,6 +3,56 @@
 # DeepSparkHub Release Notes
+## DeepSparkHub 25.09 Release Notes
+
+### Models and Algorithms
+
+* Added 10 reinforcement learning fine-tuning examples for large models, using the [verl](https://github.com/volcengine/verl), [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), [Megatron-LM](https://github.com/NVIDIA/Megatron-LM), [Colossal-AI](https://github.com/hpcaitech/ColossalAI), and [deepspeed](https://github.com/deepspeedai/DeepSpeed) toolkits
+
+| Large Models | | |
+|---|---|---|
+| Qwen2-7B GRPO (verl) | Qwen2.5-VL-7B GRPO (verl) | Qwen3-8B GRPO (verl) |
+| DeepSeek-LLM-7B PPO (verl) | Gemma-2-2B-IT PPO (verl) | Llama-3-8B SFT (OpenRLHF) |
+| Mixtral-8x7B-v0.1 SFT (OpenRLHF) | Llama-3-8B (Megatron-LM) | Qwen2.5-3B (Colossal-AI) |
+| CosyVoice2-0.5B (deepspeed) | | |