Commit 727f8b14 authored by Roger Sayle

i386: Add additional variant of bswaphisi2_lowpart peephole2.

This patch adds an additional variation of the peephole2 used to convert
bswaphisi2_lowpart into rotlhi3_1_slp, which converts xchgb %ah,%al into
a rotate word (rolw) if the flags register isn't live.  The motivating
example is:

void ext(int x);
void foo(int x)
{
  ext((x&~0xffff)|((x>>8)&0xff)|((x&0xff)<<8));
}

where GCC with -O2 currently produces:

foo:	movl    %edi, %eax
        rolw    $8, %ax
        movl    %eax, %edi
        jmp     ext

The issue is that the original xchgb (bswaphisi2_lowpart) can only be
performed in "Q" registers, which allow the %?h register to be used, so
reload generates the two movl instructions above.  However, it is only
later, in peephole2, that we see that CC_FLAGS can be clobbered, so we can
use a rotate word, which is more forgiving about register allocation.  With
the additional peephole2 proposed here, we now generate:

foo:	rolw    $8, %di
        jmp     ext
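
As a sanity check on why a rotate word is a valid replacement, here is a
minimal sketch (not part of the patch) showing that the expression in foo
swaps the two low bytes, which is the same as rotating the low 16 bits by 8
while leaving the upper 16 bits untouched.  The helper names are
hypothetical, and unsigned types are used here in place of the int in the
original example to keep the shifts well defined:

```c
#include <assert.h>
#include <stdint.h>

/* The expression from foo, written on an unsigned type (an assumption
   made for this sketch; the original example uses int).  */
static uint32_t swap_low_bytes(uint32_t x)
{
  return (x & ~0xffffu) | ((x >> 8) & 0xff) | ((x & 0xff) << 8);
}

/* Model of a 16-bit rotate by 8 applied to the low word only:
   the upper 16 bits of x are preserved.  */
static uint32_t rol16_lowpart(uint32_t x)
{
  uint16_t lo = (uint16_t)x;
  lo = (uint16_t)((lo << 8) | (lo >> 8));
  return (x & 0xffff0000u) | lo;
}

/* Compare both forms on a few sample values (hypothetical test inputs).  */
static void check_equivalence(void)
{
  static const uint32_t samples[] = {
    0x00000000u, 0x000000ffu, 0x0000ff00u, 0x12345678u, 0xffffffffu
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert(swap_low_bytes(samples[i]) == rol16_lowpart(samples[i]));
}
```

Calling check_equivalence() from a test harness exercises both forms; the
model ignores the flags register, which rolw clobbers and xchgb does not.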

2024-07-04  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.md (bswaphisi2_lowpart peephole2): New
	peephole2 variant to eliminate register shuffling.

gcc/testsuite/ChangeLog
	* gcc.target/i386/xchg-4.c: New test case.
parent 759f4abe