Commit e7393cbb authored 5 months ago by Denis Chertykov
The detailed explanation from PR116550:

Test file: udivmoddi.c
problem insn: 484

Before LRA pass we have:
(insn 484 483 485 72 (parallel [
            (set (reg/v:SI 143 [ __q1 ])
                (plus:SI (reg/v:SI 143 [ __q1 ])
                    (const_int -2 [0xfffffffffffffffe])))
            (clobber (scratch:QI))
        ]) "udivmoddi.c":163:405 discrim 5 186 {addsi3}
     (nil))

LRA substitute all scratches with new pseudos, so we have:
(insn 484 483 485 72 (parallel [
            (set (reg/v:SI 143 [ __q1 ])
                (plus:SI (reg/v:SI 143 [ __q1 ])
                    (const_int -2 [0xfffffffffffffffe])))
            (clobber (reg:QI 619))
        ]) "/mnt/d/avr-lra/udivmoddi.c":163:405 discrim 5 186 {addsi3}
     (expr_list:REG_UNUSED (reg:QI 619)
        (nil)))

Pseudo 619 is a special scratch register generated by LRA which is marked in `scratch_bitmap' and can be tested by call `ira_former_scratch_p(regno)'.

In dump file (udivmoddi.c.317r.reload) we have:
      Creating newreg=619
Removing SCRATCH to p619 in insn #484 (nop 3)
rescanning insn with uid = 484.

After that LRA tries to spill (reg:QI 619)
It's a bug because (reg:QI 619) is an output scratch register which is already something like spill register.

Fragment from udivmoddi.c.317r.reload:
      Choosing alt 2 in insn 484:  (0) r  (1) 0  (2) nYnn  (3) &d {addsi3}
      Creating newreg=728 from oldreg=619, assigning class LD_REGS to r728

IMHO: the bug is in lra-constraints.cc in function `get_reload_reg'
fragment of `get_reload_reg':
  if (type == OP_OUT)
    {
      /* Output reload registers tend to start out with a conservative
	 choice of register class.  Usually this is ALL_REGS, although
	 a target might narrow it (for performance reasons) through
	 targetm.preferred_reload_class.  It's therefore quite common
	 for a reload instruction to require a more restrictive class
	 than the class that was originally assigned to the reload register.

	 In these situations, it's more efficient to refine the choice
	 of register class rather than create a second reload register.
	 This also helps to avoid cycling for registers that are only
	 used by reload instructions.  */
      if (REG_P (original)
	  && (int) REGNO (original) >= new_regno_start
	  && INSN_UID (curr_insn) >= new_insn_uid_start
__________________________________^^
	  && in_class_p (original, rclass, &new_class, true))
	{
	  unsigned int regno = REGNO (original);
	  if (lra_dump_file != NULL)
	    {
	      fprintf (lra_dump_file, "	 Reuse r%d for output ", regno);
	      dump_value_slim (lra_dump_file, original, 1);
	    }

This condition incorrectly limits register reuse to ONLY newly generated instructions.
i.e. LRA can reuse registers only from insns generated by himself.

IMHO:It's wrong.
Scratch registers generated by LRA also have to be reused.

The patch is very simple.
On x86_64, it bootstraps+regtests fine.

gcc/
	PR target/116550
	* lra-constraints.cc (get_reload_reg): Reuse scratch registers
	generated by LRA.
parent e32fff67
No related branches found
No related tags found
Hide whitespace changes
Inline Side-by-side
Showing with 2 additions and 1 deletion
Please register or to comment