aarch64: convert vector shift + bitwise and + multiply to vector compare

When using SWAR (SIMD within a register) techniques, a comparison
within such a register can be implemented with a combination of
shifts, bitwise AND, and multiplication. If code using this scheme is
vectorized, all of these operations can potentially be replaced with a
single vector comparison by reinterpreting the vector types to match
the lane width of the SWAR register.

For example, for the test function packed_cmp_16_32, the original
generated code is:

	ldr     q0, [x0]
	add     w1, w1, 1
	ushr    v0.4s, v0.4s, 15
	and     v0.16b, v0.16b, v2.16b
	shl     v1.4s, v0.4s, 16
	sub     v0.4s, v1.4s, v0.4s
	str     q0, [x0], 16
	cmp     w2, w1
	bhi     .L20

With this pattern, the above can be optimized to:

	ldr     q0, [x0]
	add     w1, w1, 1
	cmlt    v0.8h, v0.8h, #0
	str     q0, [x0], 16
	cmp     w2, w1
	bhi     .L20

The effect is similar for x86-64.

Bootstrapped and reg-tested for x86 and aarch64.

gcc/ChangeLog:

	* match.pd: simplify vector shift + bit_and + multiply.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/swar_to_vec_cmp.c: New test.

Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
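
For illustration, the scalar SWAR idiom that the commit message above describes can be sketched in C as follows. This is a reconstruction from the assembly listing, not the contents of swar_to_vec_cmp.c: the function names, the mask 0x00010001, and the multiplier 0xffff are assumptions inferred from the `ushr #15`, `and`, and `shl #16` / `sub` sequence (a multiply by 0xffff expressed as `(t << 16) - t`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical SWAR form: x holds two packed 16-bit lanes.
   Each lane becomes 0xffff if its sign bit is set, else 0.  */
uint32_t swar_lt0_16x2(uint32_t x)
{
    /* Move each lane's sign bit to bit 0 of that lane, isolate it,
       then multiply by 0xffff to spread it across the lane.  */
    return ((x >> 15) & 0x00010001u) * 0xffffu;
}

/* Reference: the same result computed lane by lane.  Testing bit 15
   is equivalent to "lane < 0" for a two's-complement 16-bit value,
   which is what cmlt ..., #0 computes per lane.  */
uint32_t lanes_lt0_16x2(uint32_t x)
{
    uint32_t lo = (x & 0x00008000u) ? 0x0000ffffu : 0u;
    uint32_t hi = (x & 0x80000000u) ? 0xffff0000u : 0u;
    return hi | lo;
}

/* Small self-check over a handful of sample inputs.  */
void self_check(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0x00008000u, 0x80000000u,
        0x7fff7fffu, 0xffffffffu, 0x1234abcdu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(swar_lt0_16x2(samples[i]) == lanes_lt0_16x2(samples[i]));
}
```

When a loop applies the first function to an array of 32-bit words, the vectorizer sees a `.4s` shift, AND, and multiply; viewing the same 128 bits as `.8h` lanes turns the whole expression into a single compare against zero, as in the optimized listing.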