方舟编译器 / OpenArkCompiler
SR-add opt performance analysis
#I45F06 · Backlog · Leo Young (member) opened this issue on 2021-08-13 15:52
Take the following case as an example (`a` and `b` are global arrays whose declarations are not shown):

```
void add(int n) {
  for (int i = 0; i < n; i++) {
    int c = a[i];
    a[i] = c+1;
    b[i] = c;
  }
}
```

gcc O2 shows (judging from the output, no strength reduction (SR) was performed):

```
add:
        cmp     w0, 0
        ble     .L1
        adrp    x3, .LANCHOR0
        mov     x1, 0
        add     x3, x3, :lo12:.LANCHOR0
        add     x5, x3, 24
        .p2align 2
.L3:
        ldr     w2, [x3, x1, lsl 2]
        add     w4, w2, 1
        str     w2, [x5, x1, lsl 2]
        str     w4, [x3, x1, lsl 2]
        add     x1, x1, 1
        cmp     w0, w1
        bgt     .L3
.L1:
        ret
        .size   add, .-add
```

maplec O2 shows (only the SR of the mul is performed):

```
        .text
        .align 3
        .globl  add
        .hidden add
        .type   add, %function
add:
.L.107__5:
        cmp     w0, #0
        ble     .L.107__1
        sxtw    x1, w0
        mov     x0, #0
        lsl     x4, x1, #2
        adrp    x1, b
        add     x5, x1, #:lo12:b
        adrp    x1, a
        add     x2, x1, #:lo12:a
.L.107__2:
        ldr     w1, [x2,x0]
        add     w3, w1, #1
        str     w3, [x2,x0]
        str     w1, [x5,x0]
        add     x0, x0, #4
        cmp     x0, x4
        blo     .L.107__2
.L.107__1:
        ret
.L.107__7:
        .size   add, .-add
```

maplec O2 + sradd shows:

```
add:
.L.107__5:
        cmp     w0, #0
        ble     .L.107__1
        sxtw    x2, w0
        adrp    x0, a
        add     x0, x0, #:lo12:a
        adrp    x1, b
        add     x1, x1, #:lo12:b
        add     x4, x0, x2, LSL #2
.L.107__2:
        ldr     w2, [x0]
        add     w3, w2, #1
        str     w3, [x0]
        add     x0, x0, #4      <====== can be merged upward
        str     w2, [x1]
        add     x1, x1, #4      <==== can be merged upward; after merging, the loop needs one fewer add instruction.
        cmp     x0, x4
        blo     .L.107__2
.L.107__1:
        ret
.L.107__7:
        .size   add, .-add
```

llvm O2 shows:

```
add:                                    // @add
// %bb.0:                               // %entry
        cmp     w0, #1                  // =1
        b.lt    .LBB0_3
// %bb.1:                               // %for.body.preheader
        adrp    x9, b
        adrp    x10, a
        mov     w8, w0
        add     x9, x9, :lo12:b
        add     x10, x10, :lo12:a
.LBB0_2:                                // %for.body
                                        // =>This Inner Loop Header: Depth=1
        ldr     w11, [x10]
        subs    x8, x8, #1              // =1
        add     w12, w11, #1            // =1
        str     w12, [x10], #4
        str     w11, [x9], #4
        b.ne    .LBB0_2
.LBB0_3:                                // %for.cond.cleanup
        ret
.Lfunc_end0:
        .size   add, .Lfunc_end0-add
```
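To make the difference between the listings easier to follow, here is a rough source-level sketch of the three loop shapes: plain indexing (as in the gcc output), a byte offset that advances by 4 (maplec O2, SR of the mul only), and two advancing pointers (maplec O2 + sradd, which is also the shape that post-index stores such as `str w12, [x10], #4` encode directly). This is only an analogy written in C, not what any of the compilers does internally. The array length `N`, the function names, and the `variants_agree` helper are assumptions introduced for illustration and do not come from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 6 /* hypothetical array length, chosen only for this sketch */

static int a[N], b[N];

/* Plain indexing: the address of a[i] / b[i] is base + i * 4 on every access. */
void add_indexed(int n) {
    for (int i = 0; i < n; i++) {
        int c = a[i];
        a[i] = c + 1;
        b[i] = c;
    }
}

/* SR of the multiply only: one shared byte offset that steps by sizeof(int),
 * comparable to "ldr w1, [x2,x0]" followed by "add x0, x0, #4". */
void add_byte_offset(int n) {
    if (n <= 0) return;
    size_t end = (size_t)n * sizeof(int);
    size_t off = 0;
    do {
        int *pa = (int *)((char *)a + off);
        int *pb = (int *)((char *)b + off);
        int c = *pa;
        *pa = c + 1;
        *pb = c;
        off += sizeof(int);
    } while (off < end);
}

/* SR of the address add: each array gets its own pointer that is bumped
 * after the store, which is what a post-index store does in one instruction. */
void add_pointer_bump(int n) {
    if (n <= 0) return;
    int *pa = a, *pb = b;
    int *end = a + n;
    do {
        int c = *pa;
        *pa++ = c + 1; /* store, then advance */
        *pb++ = c;     /* store, then advance */
    } while (pa != end);
}

/* Fill the arrays with a known pattern before each run. */
static void reset(void) {
    for (int i = 0; i < N; i++) {
        a[i] = 10 * i;
        b[i] = -1;
    }
}

/* Returns 1 if both rewritten loops leave a and b exactly as the indexed loop does. */
int variants_agree(int n) {
    assert(n >= 0 && n <= N); /* stay inside the hypothetical arrays */
    int ref_a[N], ref_b[N];
    void (*variants[])(int) = { add_byte_offset, add_pointer_bump };

    reset();
    add_indexed(n);
    memcpy(ref_a, a, sizeof a);
    memcpy(ref_b, b, sizeof b);

    for (size_t v = 0; v < sizeof variants / sizeof variants[0]; v++) {
        reset();
        variants[v](n);
        if (memcmp(ref_a, a, sizeof a) != 0 || memcmp(ref_b, b, sizeof b) != 0)
            return 0;
    }
    return 1;
}
```

In the pointer form each array needs its own increment, which is why the sradd loop carries two `add ..., #4` instructions where the byte-offset form needs one. Folding each increment into its store removes both from the loop body.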