x86_64: Improve code expanded for highpart multiplications.
While working on a middle-end patch to more aggressively use highpart
multiplications on targets that support them, I noticed that the RTL
expanded by the x86 backend interacts poorly with register allocation,
leading to suboptimal code.

For the testcase,

    typedef int __attribute ((mode(TI))) ti_t;
    long foo(long x) {
      return ((ti_t)x * 19065) >> 64;
    }

we'd like to avoid:

    foo:    movq    %rdi, %rax
            movl    $19065, %edx
            imulq   %rdx
            movq    %rdx, %rax
            ret

and would prefer:

    foo:    movl    $19065, %eax
            imulq   %rdi
            movq    %rdx, %rax
            ret

This patch provides a pair of peephole2 transformations to tweak the
spills generated by reload, and at the same time replaces the current
define_expand with a define_insn pattern using the new [su]mul_highpart
RTX codes.

2021-12-20  Roger Sayle  <roger@nextmovesoftware.com>
            Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
        * config/i386/i386.md (any_mul_highpart): New code iterator.
        (sgnprefix, s): Add attribute support for [su]mul_highpart.
        (<s>mul<mode>3_highpart): Delete expander.
        (<s>mul<mode>3_highpart, <s>mulsi32_highpart_zext):
        New define_insn patterns.
        (define_peephole2): Tweak the register allocation for the above
        instructions after reload.

gcc/testsuite/ChangeLog
        * gcc.target/i386/smuldi3_highpart.c: New test case.
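
For readers unfamiliar with the term, a highpart multiplication returns only the upper half of a double-width product. The sketch below is illustrative and not part of the patch: the helper names `smul_highpart`, `umul_highpart` and `foo_highpart` are hypothetical, and it assumes a compiler that provides `__int128` (GCC or Clang on a 64-bit target) and performs an arithmetic right shift on negative signed values, as GCC does.

```c
#include <stdint.h>

/* Signed highpart: upper 64 bits of the 128-bit product x * y.
   Assumes __int128 support and an arithmetic right shift for
   negative values (GCC's documented behaviour). */
int64_t smul_highpart(int64_t x, int64_t y)
{
    return (int64_t)(((__int128)x * y) >> 64);
}

/* Unsigned highpart: upper 64 bits of the unsigned 128-bit product. */
uint64_t umul_highpart(uint64_t x, uint64_t y)
{
    return (uint64_t)(((unsigned __int128)x * y) >> 64);
}

/* Same computation as the testcase's foo, written with the
   hypothetical helper above and a fixed-width type. */
int64_t foo_highpart(int64_t x)
{
    return smul_highpart(x, 19065);
}
```

On x86_64, a one-operand `imulq` or `mulq` leaves this upper half in `%rdx`, which is why the preferred sequence only needs a final `movq %rdx, %rax`.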