AArch64: Add NEON, SVE and SVE2 RTL patterns for Complex Addition.
This adds implementations of the optabs for complex addition operations.
With this, the following C code:

  void f90 (float complex a[restrict N], float complex b[restrict N],
            float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
          mov     x3, 0
          .p2align 3,,7
  .L2:
          ldr     q0, [x0, x3]
          ldr     q1, [x1, x3]
          fcadd   v0.4s, v0.4s, v1.4s, #90
          str     q0, [x2, x3]
          add     x3, x3, 16
          cmp     x3, 1600
          bne     .L2
          ret

instead of

  f90:
          add     x3, x1, 1600
          .p2align 3,,7
  .L2:
          ld2     {v4.4s - v5.4s}, [x0], 32
          ld2     {v2.4s - v3.4s}, [x1], 32
          fsub    v0.4s, v4.4s, v3.4s
          fadd    v1.4s, v5.4s, v2.4s
          st2     {v0.4s - v1.4s}, [x2], 32
          cmp     x3, x1
          bne     .L2
          ret

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (cadd<rot><mode>3): New.
	* config/aarch64/iterators.md (SVE2_INT_CADD_OP): New.
	* config/aarch64/aarch64-sve.md (cadd<rot><mode>3): New.
	* config/aarch64/aarch64-sve2.md (cadd<rot><mode>3): New.
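To illustrate why the two assembly sequences compute the same result, here is a minimal scalar sketch in C. It is not part of the patch: the helper names `cadd90_ref`, `cadd90_parts` and `check_cadd90` are hypothetical. Multiplying `b` by `I` rotates it by 90 degrees, so `a + b * I` has real part `re(a) - im(b)` and imaginary part `im(a) + re(b)`. That matches the `fsub`/`fadd` pair applied to the de-interleaved `ld2` registers, and it is the per-element operation the commit message shows being emitted as a single `fcadd ... #90`.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical helper: the loop body from the commit message,
   written with the C complex operators.  */
float complex cadd90_ref (float complex a, float complex b)
{
  return a + (b * I);
}

/* Hypothetical helper: the same operation on separate real and
   imaginary parts, mirroring the fsub/fadd sequence:
     real = re(a) - im(b)
     imag = im(a) + re(b)  */
float complex cadd90_parts (float complex a, float complex b)
{
  float re = crealf (a) - cimagf (b);
  float im = cimagf (a) + crealf (b);
  return CMPLXF (re, im);
}

/* Check that both forms agree on one small, exactly representable
   input.  */
void check_cadd90 (void)
{
  float complex a = 1.0f + 2.0f * I;
  float complex b = 3.0f + 4.0f * I;
  assert (cadd90_ref (a, b) == cadd90_parts (a, b));
}
```

Calling `check_cadd90 ()` from a test driver exercises both forms on the input `a = 1 + 2i`, `b = 3 + 4i`. The sketch assumes a C11 toolchain, since `CMPLXF` was introduced in C11.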
Changed files:
- gcc/config/aarch64/aarch64-simd.md: 8 additions, 0 deletions
- gcc/config/aarch64/aarch64-sve.md: 14 additions, 0 deletions
- gcc/config/aarch64/aarch64-sve2.md: 10 additions, 0 deletions
- gcc/config/aarch64/iterators.md: 4 additions, 0 deletions