-
- Downloads
aarch64: Expand CTZ to RBIT + CLZ for SVE [PR109498]
Currently, we vectorize CTZ for SVE by using the following operation:
.CTZ (X) = (PREC - 1) - .CLZ (X & -X)
Instead, this patch expands CTZ to RBIT + CLZ for SVE, as suggested in PR109498.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by:
Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
PR target/109498
* config/aarch64/aarch64-sve.md (ctz<mode>2): Added pattern to expand
CTZ to RBIT + CLZ for SVE.
gcc/testsuite/ChangeLog:
PR target/109498
* gcc.target/aarch64/sve/ctz.c: New test.
Loading
Please register or sign in to comment