-
Roger Sayle authored
This patch is the backend piece of a solution to PRs 101955 and 106245, that adds a define_insn_and_split to the i386 backend, to perform sign extension of a single (least significant) bit using and $1 then neg. Previously, (x<<31)>>31 would be generated as sall $31, %eax // 3 bytes sarl $31, %eax // 3 bytes with this patch the backend now generates: andl $1, %eax // 3 bytes negl %eax // 2 bytes Not only is this smaller in size, but microbenchmarking confirms that it's a performance win on both Intel and AMD; Intel sees only a 2% improvement (perhaps just a size effect), but AMD sees a 7% win. 2023-10-21 Roger Sayle <roger@nextmovesoftware.com> Uros Bizjak <ubizjak@gmail.com> gcc/ChangeLog PR middle-end/101955 PR tree-optimization/106245 * config/i386/i386.md (*extv<mode>_1_0): New define_insn_and_split. gcc/testsuite/ChangeLog PR middle-end/101955 PR tree-optimization/106245 * gcc.target/i386/pr106245-2.c: New test case. * gcc.target/i386/pr106245-3.c: New 32-bit test case. * gcc.target/i386/pr106245-4.c: New 64-bit test case. * gcc.target/i386/pr106245-5.c: Likewise.
Roger Sayle authoredThis patch is the backend piece of a solution to PRs 101955 and 106245, that adds a define_insn_and_split to the i386 backend, to perform sign extension of a single (least significant) bit using and $1 then neg. Previously, (x<<31)>>31 would be generated as sall $31, %eax // 3 bytes sarl $31, %eax // 3 bytes with this patch the backend now generates: andl $1, %eax // 3 bytes negl %eax // 2 bytes Not only is this smaller in size, but microbenchmarking confirms that it's a performance win on both Intel and AMD; Intel sees only a 2% improvement (perhaps just a size effect), but AMD sees a 7% win. 2023-10-21 Roger Sayle <roger@nextmovesoftware.com> Uros Bizjak <ubizjak@gmail.com> gcc/ChangeLog PR middle-end/101955 PR tree-optimization/106245 * config/i386/i386.md (*extv<mode>_1_0): New define_insn_and_split. gcc/testsuite/ChangeLog PR middle-end/101955 PR tree-optimization/106245 * gcc.target/i386/pr106245-2.c: New test case. * gcc.target/i386/pr106245-3.c: New 32-bit test case. * gcc.target/i386/pr106245-4.c: New 64-bit test case. * gcc.target/i386/pr106245-5.c: Likewise.
pr106245-3.c 261 B