From dd75ee403a184b632f1b3f302b567636cdd214ab Mon Sep 17 00:00:00 2001
From: zhangyihuiben
Date: Fri, 31 Oct 2025 10:56:09 +0800
Subject: [PATCH] =?UTF-8?q?=E6=95=B4=E6=94=B9=E7=B2=BE=E5=BA=A6=E6=96=87?=
 =?UTF-8?q?=E6=A1=A3?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../advanced_development/accuracy_comparison.md | 12 ++++++------
 .../advanced_development/accuracy_comparison.md |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/mindformers/docs/source_en/advanced_development/accuracy_comparison.md b/docs/mindformers/docs/source_en/advanced_development/accuracy_comparison.md
index d098180d13..b79afb3f86 100644
--- a/docs/mindformers/docs/source_en/advanced_development/accuracy_comparison.md
+++ b/docs/mindformers/docs/source_en/advanced_development/accuracy_comparison.md
@@ -191,18 +191,18 @@ The following tables describe the configuration comparison with Megatron-LM.
  | `moe-router-topk-scaling-factor` | Top-*k* score scaling factor. | `routed_scaling_factor` | Top-*k* score scaling factor. |
  | `moe-router-enable-expert-bias` | Specifies whether to use the bias of an expert. | `balance_via_topk_bias` | Specifies whether to use the bias of an expert. |
  | `moe-router-bias-update-rate` | Update rate of expert bias. | `topk_bias_update_rate` | Update rate of expert bias. |
- | `moe-use-legacy-grouped-gemm` | Specifies whether to use the source version of Grouped GEMM. | Not supported. | |
- | `moe-aux-loss-coeff` | Auxiliary loss coefficient of MoE. | Not supported. | |
- | `moe-z-loss-coeff` | MoE z-loss coefficient. | Not supported. | |
+ | `moe-use-legacy-grouped-gemm` | Specifies whether to use the source version of Grouped GEMM. | Not supported. | |
+ | `moe-aux-loss-coeff` | Auxiliary loss coefficient of MoE. | Not supported. | |
+ | `moe-z-loss-coeff` | MoE z-loss coefficient. | Not supported. | |
  | `moe-input-jitter-eps` | Input jitter noise of MoE. | `moe_input_jitter_eps` | Input jitter noise of MoE. |
- | `moe-token-dispatcher-type` | Token scheduling policy (for example, **allgather**). | Not supported. | |
+ | `moe-token-dispatcher-type` | Token scheduling policy (for example, **allgather**). | `moe_token_dispatcher_type` | Token scheduling policy (for example, **allgather**). |
  | `moe-enable-deepep` | Specifies whether to enable DeepEP hybrid expert optimization. | `moe_enable_deepep` | Specifies whether to enable DeepEP hybrid expert optimization. |
  | `moe-per-layer-logging` | Prints logs at each MoE layer. | `moe_per_layer_logging` | Prints logs at each MoE layer. |
  | `moe-expert-capacity-factor` | Expansion ratio of the expert capacity. | `capacity_factor` | Expansion ratio of the expert capacity. |
  | `moe-pad-expert-input-to-capacity` | Specifies whether to fill the expert input to the capacity upper limit. | `moe_pad_expert_input_to_capacity` | Specifies whether to fill the expert input to the capacity upper limit. |
  | `moe-token-drop-policy` | Token discarding policy (for example, **probs** or **position**).| `enable_sdrop` | Token discarding policy (for example, **probs** or **position**).|
- | `moe-extended-tp` | Enables extended tensor parallelism. | Not supported. | |
- | `moe-use-upcycling` | Specifies whether to enable expert upcycling. | Not supported. | |
+ | `moe-extended-tp` | Enables extended tensor parallelism. | Not supported. | |
+ | `moe-use-upcycling` | Specifies whether to enable expert upcycling. | Not supported. | |
  | `moe-permute-fusion` | Enables internal permute fusion optimization of experts. | `moe_permute_fusion` | Enables internal permute fusion optimization of experts. |
  | `mtp-num-layers` | Number of MoE layers. | `mtp_depth` | Number of MoE layers. |
  | `mtp-loss-scaling-factor` | Loss scaling in the MoE architecture. | `mtp_loss_factor` | Loss scaling in the MoE architecture. |
diff --git a/docs/mindformers/docs/source_zh_cn/advanced_development/accuracy_comparison.md b/docs/mindformers/docs/source_zh_cn/advanced_development/accuracy_comparison.md
index d75faadc99..8b77d80cf6 100644
--- a/docs/mindformers/docs/source_zh_cn/advanced_development/accuracy_comparison.md
+++ b/docs/mindformers/docs/source_zh_cn/advanced_development/accuracy_comparison.md
@@ -195,7 +195,7 @@ Megatron-LM 是一个面向大规模训练任务的成熟框架，具备高度
  | `moe-aux-loss-coeff` | MoE 辅助损失系数 | 不支持配置 | |
  | `moe-z-loss-coeff` | MoE z-loss 系数 | 不支持配置 | |
  | `moe-input-jitter-eps` | MoE 输入 jitter 噪声量 | `moe_input_jitter_eps` | MoE 输入 jitter 噪声量 |
- | `moe-token-dispatcher-type` | token 调度策略（allgather 等） | 不支持配置 | |
+ | `moe-token-dispatcher-type` | token 调度策略（allgather 等） | `moe_token_dispatcher_type` | token 调度策略（allgather 等） |
  | `moe-enable-deepep` | 是否启用 DeepEP 混合专家优化 | `moe_enable_deepep` | 是否启用 DeepEP 混合专家优化 |
  | `moe-per-layer-logging` | 每层 MoE 打印日志 | `moe_per_layer_logging` | 每层 MoE 打印日志 |
  | `moe-expert-capacity-factor` | expert 容量扩展比例 | `capacity_factor` | expert 容量扩展比例 |
--
Gitee
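For orientation, a minimal sketch of how the newly documented mapping could look in practice: Megatron-LM takes `--moe-token-dispatcher-type` as a command-line flag, while MindFormers reads `moe_token_dispatcher_type` from its YAML configuration, as the updated table states. The `moe_config` section name and the neighboring keys and values below are illustrative assumptions, not taken from this patch.

```yaml
# Megatron-LM side (CLI flag, as named in the comparison table):
#   --moe-token-dispatcher-type allgather

# MindFormers side (YAML key, as named in the comparison table).
# NOTE: the `moe_config` section and the example values are assumptions
# used only to illustrate the parameter mapping.
moe_config:
  moe_token_dispatcher_type: "allgather"  # token scheduling policy (e.g. allgather)
  moe_input_jitter_eps: 0.01              # input jitter noise of MoE
  capacity_factor: 1.1                    # expansion ratio of the expert capacity
```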