aarch64: Add codegen support for AdvSIMD faminmax
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
mandatory from Armv9.5-a. It introduces instructions for computing the
floating point absolute maximum and minimum of the two vectors
element-wise.

This patch adds code generation support for famax and famin in terms of
existing RTL operators.

famax/famin is equivalent to first taking abs of the operands and then
taking smax/smin on the results of abs.

	famax/famin (a, b) = smax/smin (abs (a), abs (b))

This fusion of operators is only possible when -march=armv9-a+faminmax
flags are passed. We also need to pass -ffast-math flag; if we don't,
then a statement like

	c[i] = __builtin_fmaxf16 (a[i], b[i]);

is RTL expanded to UNSPEC_FMAXNM instead of smax (likewise for smin).

This code generation is only available on -O2 or -O3 as that is when
auto-vectorization is enabled.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md
	(*aarch64_faminmax_fused): Instruction pattern for faminmax
	codegen.
	* config/aarch64/iterators.md: Attribute for faminmax codegen.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/faminmax-codegen-no-flag.c: New test.
	* gcc.target/aarch64/simd/faminmax-codegen.c: New test.
	* gcc.target/aarch64/simd/faminmax-no-codegen.c: New test.
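As an illustration of the identity famax/famin (a, b) = smax/smin (abs (a), abs (b)), the following is a minimal sketch and is not taken from the new tests. The names ref_famax, ref_famin and absmax_loop are hypothetical, and float is used in place of a half-precision type purely for portability. The scalar helpers model a single vector lane. The loop has the abs-then-max shape that the fused pattern is meant to match; whether a given compiler actually emits famax for it depends on the flags described above.

```c
#include <stddef.h>

/* Hypothetical scalar model of one famax lane: smax (abs (a), abs (b)).
   NaNs and signed zeros are deliberately ignored here, which mirrors the
   assumptions -ffast-math allows the compiler to make.  */
static inline float
ref_famax (float a, float b)
{
  float x = a < 0.0f ? -a : a;
  float y = b < 0.0f ? -b : b;
  return x > y ? x : y;
}

/* Hypothetical scalar model of one famin lane: smin (abs (a), abs (b)).  */
static inline float
ref_famin (float a, float b)
{
  float x = a < 0.0f ? -a : a;
  float y = b < 0.0f ? -b : b;
  return x < y ? x : y;
}

/* Element-wise absolute maximum over two arrays.  This is the loop shape
   (abs on both operands, then max) that an auto-vectorizer could turn
   into the fused operation; the function name is an assumption.  */
void
absmax_loop (float *restrict c, const float *restrict a,
	     const float *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    c[i] = ref_famax (a[i], b[i]);
}
```

On a toolchain with FEAT_FAMINMAX support, compiling such a loop with -O2 -ffast-math -march=armv9-a+faminmax and inspecting the assembly is one way to check whether the fusion took place.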
Showing
- gcc/config/aarch64/aarch64-simd.md: 9 additions, 0 deletions
- gcc/config/aarch64/iterators.md: 3 additions, 0 deletions
- gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c: 217 additions, 0 deletions
- gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c: 197 additions, 0 deletions
- gcc/testsuite/gcc.target/aarch64/simd/faminmax-no-codegen.c: 267 additions, 0 deletions