Commit ba773a86 authored by Jeff Law

[RISC-V] Slightly improve broadcasting small constants into vectors

I probably spent way more time on this than it's worth...

I was looking at the code we generate for vector SAD and noticed that we were
being a bit silly.  Specifically:

        li      a4,0            # 272   [c=4 l=4]  *movsi_internal/1

Followed shortly by:

        vmv.s.x v3,a4   # 261   [c=4 l=4]  *pred_broadcastrvvm1si/6

And no other uses of a4.  We could have used x0 trivially.

First we adjust the expander so that it doesn't force the constant into a
register.  In the matching pattern we change the appropriate source constraints
from "r" to "rJ" and the output template is changed to use %z for the operand.
The net result is that we drop the li completely and emit vmv.s.x v3,x0.
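
As a rough illustration of the effect described above, the following standalone C sketch mimics what the %z operand modifier does for the scalar source: print the zero register when the operand is (const_int 0), otherwise print the GPR name. The helper names are hypothetical and are not GCC internals, and the exact spelling GCC prints for the zero register may differ (for example the ABI name zero rather than x0).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the %z modifier: choose the zero register
   when the operand is constant zero, otherwise use the given GPR.  */
static const char *
scalar_operand_name (bool is_const_zero, const char *gpr)
{
  return is_const_zero ? "x0" : gpr;
}

/* Hypothetical helper: format a vmv.s.x into BUF.  With a constant-zero
   source no preceding li is needed, because x0 is always zero.  */
static const char *
format_vmv_s_x (char *buf, size_t len, const char *vd,
                bool is_const_zero, const char *gpr)
{
  snprintf (buf, len, "vmv.s.x %s,%s", vd,
            scalar_operand_name (is_const_zero, gpr));
  return buf;
}
```

With a constant-zero source and destination v3 this formats the same instruction as the one quoted above; with a non-constant source it falls back to the named register.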

But wait, there's more.  If we're broadcasting a constant in the range
[-16..15] into a vector, we currently load the constant into a register and use
vmv.v.x.  We can instead use vmv.v.i, which avoids loading the constant into a
GPR.  For that case we again avoid forcing the constant into a register in the
expander and adjust the output template to emit vmv.v.x or vmv.v.i based on
whether or not the appropriate operand is a constant or general purpose
register.  So again, we'll drop a load immediate into a scalar for this case.
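
The selection between the two forms can be sketched in plain C as below. This is only an illustration of the range check and the choice of mnemonic, not the code in vector.md; the function names and the is_const flag are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdio.h>

/* True if VALUE fits the 5-bit signed immediate of vmv.v.i,
   i.e. lies in [-16, 15].  */
static bool
fits_simm5 (long value)
{
  return value >= -16 && value <= 15;
}

/* Hypothetical helper: format the splat of a scalar into vector register
   VD.  A small constant uses vmv.v.i directly; anything else is assumed
   to already live in the GPR named by GPR and uses vmv.v.x.  */
static const char *
format_splat (char *buf, size_t len, const char *vd,
              bool is_const, long value, const char *gpr)
{
  if (is_const && fits_simm5 (value))
    snprintf (buf, len, "vmv.v.i %s,%ld", vd, value);
  else
    snprintf (buf, len, "vmv.v.x %s,%s", vd, gpr);
  return buf;
}
```

A constant outside [-16..15] still has to be loaded into a GPR first, so in this sketch it takes the vmv.v.x path just like a non-constant operand.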

Whether or not we should use vmv.v.i vs vmv.s.x for loading [-16..15] into the
0th element is probably uarch dependent.  The tradeoff is loading the GPR vs
the broadcast in the vector unit.  I didn't bother with this case.

Tested in my tester (which tests rv64gcv as a default codegen option). Will
wait for the pre-commit tester to render a verdict.

gcc/
	* config/riscv/constraints.md (P): New constraint.
	* config/riscv/vector.md (pred_broadcast<mode> expander): Do
	not force small integers into GPRs so aggressively.
	(pred_broadcast<mode> insn & splitter): Allow splatting small
	constants across the vector register directly.  Allow splatting
	(const_int 0) into element 0 directly.
parent 34b77d1b