Optimize v4si broadcast for noavx512vl.

This enables the following code generation change:

-	vbroadcastss	.LC1(%rip), %xmm0
+	movl	$-45, %edx
+	vmovd	%edx, %xmm0
+	vpshufd	$0, %xmm0, %xmm0

According to a microbenchmark, this is faster than a broadcast from
memory for TARGET_INTER_UNIT_MOVES_TO_VEC.

gcc/ChangeLog:

	* config/i386/sse.md (*vec_dupv4si): Disable memory operand
	for !TARGET_INTER_UNIT_MOVES_TO_VEC when prefer_for_speed.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr100865-8a.c: Adjust testcase.
	* gcc.target/i386/pr100865-8c.c: Ditto.
	* gcc.target/i386/pr100865-9c.c: Ditto.
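
For illustration only, here is a minimal sketch of source-level code that performs a v4si broadcast (one scalar copied into all four 32-bit lanes), written with GCC's generic vector extensions. The typedef and the function name `broadcast_v4si` are hypothetical and are not taken from the adjusted testcases. Whether the compiler emits the `movl`/`vmovd`/`vpshufd` sequence or a load from the constant pool depends on the `-march`/`-mtune` options and the optimization level.

```c
/* Hypothetical example, not part of the commit.
   v4si is a vector of four 32-bit ints (GCC vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Broadcast the scalar X into every lane of a v4si.  */
v4si
broadcast_v4si (int x)
{
  v4si v = { x, x, x, x };
  return v;
}
```

Calling `broadcast_v4si (-45)` produces a vector whose four lanes all hold -45, the same constant used in the assembly shown in the commit message.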
Showing 4 changed files:
- gcc/config/i386/sse.md: 6 additions, 1 deletion
- gcc/testsuite/gcc.target/i386/pr100865-8a.c: 1 addition, 1 deletion
- gcc/testsuite/gcc.target/i386/pr100865-8c.c: 1 addition, 1 deletion
- gcc/testsuite/gcc.target/i386/pr100865-9c.c: 1 addition, 1 deletion