LoongArch: Improve cpymemsi expansion [PR109465]
We'd been generating really bad block move sequences, which kernel
developers who tried __builtin_memcpy recently complained about.  To
improve it:

1. Take advantage of -mno-strict-align.  When it is set, set the mode
   size to UNITS_PER_WORD regardless of the alignment.
2. Halve the mode size when (block size) % (mode size) != 0, instead of
   falling back to ld.bu/st.b at once.
3. Limit the length of the block move sequence by the number of
   instructions, not the size of the block.  When -mstrict-align is set
   and the block is not aligned, the old size limit for the
   straight-line implementation (64 bytes) was definitely too large (we
   don't have 64 registers anyway).

Change since v1: add a comment about the calculation of num_reg.

gcc/ChangeLog:

	PR target/109465
	* config/loongarch/loongarch-protos.h
	(loongarch_expand_block_move): Add a parameter as alignment RTX.
	* config/loongarch/loongarch.h
	(LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER): Remove.
	(LARCH_MAX_MOVE_BYTES_STRAIGHT): Remove.
	(LARCH_MAX_MOVE_OPS_PER_LOOP_ITER): Define.
	(LARCH_MAX_MOVE_OPS_STRAIGHT): Define.
	(MOVE_RATIO): Use LARCH_MAX_MOVE_OPS_PER_LOOP_ITER instead of
	LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER.
	* config/loongarch/loongarch.cc (loongarch_expand_block_move):
	Take the alignment from the parameter, but set it to
	UNITS_PER_WORD if !TARGET_STRICT_ALIGN.  Limit the length of
	straight-line implementation with LARCH_MAX_MOVE_OPS_STRAIGHT
	instead of LARCH_MAX_MOVE_BYTES_STRAIGHT.
	(loongarch_block_move_straight): When there are left-over bytes,
	halve the mode size instead of falling back to byte mode at once.
	(loongarch_block_move_loop): Limit the length of loop body with
	LARCH_MAX_MOVE_OPS_PER_LOOP_ITER instead of
	LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER.
	* config/loongarch/loongarch.md (cpymemsi): Pass the alignment
	to loongarch_expand_block_move.

gcc/testsuite/ChangeLog:

	PR target/109465
	* gcc.target/loongarch/pr109465-1.c: New test.
	* gcc.target/loongarch/pr109465-2.c: New test.
	* gcc.target/loongarch/pr109465-3.c: New test.
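
The effect of points 1 and 2 on instruction count can be sketched with a small standalone model.  This is not the code in loongarch.cc: the function names, the WORD_BYTES constant (standing in for UNITS_PER_WORD on a 64-bit target) and the strict_align flag are all assumptions made for illustration, and the model only counts load/store pairs rather than emitting RTL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for UNITS_PER_WORD on a 64-bit target.  */
#define WORD_BYTES 8

/* Pick the initial chunk size in bytes.  Without strict alignment the
   full word is used regardless of ALIGN; with strict alignment the
   chunk is capped by the known alignment of the block.  */
static size_t
initial_chunk (size_t align, int strict_align)
{
  /* ALIGN is assumed to be a non-zero power of two (in bytes).  */
  assert (align != 0 && (align & (align - 1)) == 0);
  size_t chunk = strict_align ? align : WORD_BYTES;
  return chunk > WORD_BYTES ? WORD_BYTES : chunk;
}

/* Model of the new scheme: copy as many chunks as fit, then halve the
   chunk size for the left-over bytes until nothing remains.  Returns
   the number of load/store pairs needed.  */
static unsigned
count_move_ops (size_t length, size_t align, int strict_align)
{
  size_t chunk = initial_chunk (align, strict_align);
  unsigned ops = 0;

  while (length > 0)
    {
      ops += (unsigned) (length / chunk);
      length %= chunk;
      /* When CHUNK reaches 1 the remainder is always 0, so the loop
	 ends before CHUNK could become 0.  */
      chunk /= 2;
    }
  return ops;
}

/* Model of the old scheme: left-over bytes fall back to byte moves
   at once.  */
static unsigned
count_move_ops_old (size_t length, size_t align, int strict_align)
{
  size_t chunk = initial_chunk (align, strict_align);
  return (unsigned) (length / chunk + length % chunk);
}
```

Under these assumptions, a 15-byte copy without strict alignment takes 4 load/store pairs (8 + 4 + 2 + 1 bytes) in the halving model, against 8 pairs (one 8-byte move plus seven byte moves) in the old fallback model.  With strict alignment and a byte-aligned block, both models need 15 pairs, which is why point 3 limits straight-line expansion by operation count rather than by byte size.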
Changed files:
- gcc/config/loongarch/loongarch-protos.h: 1 addition, 1 deletion
- gcc/config/loongarch/loongarch.cc: 54 additions, 41 deletions
- gcc/config/loongarch/loongarch.h: 4 additions, 6 deletions
- gcc/config/loongarch/loongarch.md: 2 additions, 1 deletion
- gcc/testsuite/gcc.target/loongarch/pr109465-1.c: 9 additions, 0 deletions
- gcc/testsuite/gcc.target/loongarch/pr109465-2.c: 9 additions, 0 deletions
- gcc/testsuite/gcc.target/loongarch/pr109465-3.c: 12 additions, 0 deletions