    PR 106245: Split (x<<31)>>31 as -(x&1) in i386.md · e2886967
    Roger Sayle authored
    This patch is the backend piece of a solution to PRs 101955 and 106245.
    It adds a define_insn_and_split to the i386 backend that performs sign
    extension of a single (least significant) bit using and $1 then neg.
    
    Previously, (x<<31)>>31 would be generated as
    
            sall    $31, %eax	// 3 bytes
            sarl    $31, %eax	// 3 bytes
    
    With this patch, the backend now generates:
    
            andl    $1, %eax	// 3 bytes
            negl    %eax		// 2 bytes
    
    Not only is this smaller in size, but microbenchmarking confirms
    that it's a performance win on both Intel and AMD; Intel sees only a
    2% improvement (perhaps just a size effect), but AMD sees a 7% win.
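
The identity behind the split can be checked in plain C. The following is a minimal sketch, not code from the patch or its test cases: the function names and the sample values are hypothetical, and it assumes GCC's documented behaviour that converting an out-of-range unsigned value to a signed type wraps and that right-shifting a negative value is an arithmetic shift. The left shift is done on uint32_t to avoid signed overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift form: move bit 0 up to the sign bit, then smear it back down
   with an arithmetic right shift.  Relies on GCC's implementation-defined
   unsigned-to-signed conversion and signed right shift (assumption). */
static int32_t signext_bit0_shift(int32_t x)
{
    return (int32_t)((uint32_t)x << 31) >> 31;
}

/* And/neg form: isolate bit 0, then negate, giving 0 or -1. */
static int32_t signext_bit0_andneg(int32_t x)
{
    return -(x & 1);
}

/* Compare both forms on a few hypothetical sample values and return
   the number of inputs on which they disagree. */
static int count_mismatches(void)
{
    static const int32_t samples[] = {
        0, 1, 2, 3, -1, -2, INT32_MAX, INT32_MIN
    };
    int mismatches = 0;

    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (signext_bit0_shift(samples[i]) != signext_bit0_andneg(samples[i]))
            mismatches++;
    return mismatches;
}
```

Both forms yield -1 when bit 0 is set and 0 otherwise, which is why the backend is free to pick whichever instruction sequence is cheaper.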
    
    2023-10-21  Roger Sayle  <roger@nextmovesoftware.com>
    	    Uros Bizjak  <ubizjak@gmail.com>
    
    gcc/ChangeLog
    	PR middle-end/101955
    	PR tree-optimization/106245
    	* config/i386/i386.md (*extv<mode>_1_0): New define_insn_and_split.
    
    gcc/testsuite/ChangeLog
    	PR middle-end/101955
    	PR tree-optimization/106245
    	* gcc.target/i386/pr106245-2.c: New test case.
    	* gcc.target/i386/pr106245-3.c: New 32-bit test case.
    	* gcc.target/i386/pr106245-4.c: New 64-bit test case.
    	* gcc.target/i386/pr106245-5.c: Likewise.
pr106245-3.c 261 B