Commit 962a994d, authored and committed by Richard Biener 5 months ago

Remove SLP_INSTANCE_UNROLLING_FACTOR, compute VF in vect_make_slp_decision

The following prepares us for SLP instances with a non-uniform number
of lanes.  We already have this with load permutation lowering, but
we managed to keep that within the constraints of the per SLP instance
computed VF based on its max_nunits (with a vector type fixed for
each node) and the instance group size which is the number of lanes
in the SLP instance root.  But once arbitrary splitting and merging
of SLP nodes at non-power-of-two lane boundaries is allowed, this
simple calculation based on the outgoing group size falls apart.

The following, instead of computing a VF during SLP instance
discovery, computes it at vect_make_slp_decision time by walking
the SLP graph and looking at each SLP node in isolation.  We still
track max_nunits per node; that could become a per-node VF instead,
or we could forgo both completely (though for BB vectorization we
need to communicate a VF > 1 requirement upward, or compute that
after the fact).  In the end we'd like to delay vector type assignment
and only compute a minimum VF here, allowing vector types to
grow when the actual VF is bigger.
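
As a rough illustration of the per-node walk described above, here is a
minimal standalone sketch.  It is not the code from this commit: SlpNode,
unrolling_factor, update_vf_for_node and compute_vf are hypothetical
names, and plain integers stand in for GCC's poly-int values.  It assumes
each node's requirement is the smallest factor that makes its lane count
a multiple of its max_nunits, and that the overall VF is the least common
multiple of those factors.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for an SLP node (not GCC's slp_tree).
struct SlpNode {
  std::uint64_t max_nunits;        // largest vector element count used by the node
  std::uint64_t lanes;             // number of scalar lanes the node covers
  std::vector<SlpNode*> children;  // operand nodes; may be shared between parents
};

// Smallest factor f such that lanes * f is a multiple of nunits.
std::uint64_t unrolling_factor(std::uint64_t nunits, std::uint64_t lanes) {
  assert(nunits != 0 && lanes != 0);
  return std::lcm(nunits, lanes) / lanes;
}

// Visit each node once and fold its own requirement into vf.
void update_vf_for_node(const SlpNode* node, std::uint64_t& vf,
                        std::unordered_set<const SlpNode*>& visited) {
  if (node == nullptr || !visited.insert(node).second)
    return;  // missing node, or already handled via another parent
  for (const SlpNode* child : node->children)
    update_vf_for_node(child, vf, visited);
  vf = std::lcm(vf, unrolling_factor(node->max_nunits, node->lanes));
}

// Compute a VF that satisfies every node reachable from the given roots.
std::uint64_t compute_vf(const std::vector<const SlpNode*>& roots) {
  std::uint64_t vf = 1;
  std::unordered_set<const SlpNode*> visited;
  for (const SlpNode* root : roots)
    update_vf_for_node(root, vf, visited);
  return vf;
}
```

In this model the root's lane count no longer decides the result on its
own: a root with 4 lanes and 4 units asks for a factor of 1, but a child
with 3 lanes and 4 units still raises the computed VF to 4.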

There's a slight complication with permutes of externs / constants
as those get their vector type (and thus max_nunits) assigned late.
While we force them to have the same vector type as the result at
the moment, their number of lanes can differ.  So those get handled
explicitly there right now to up the VF as needed.  The alternative
is to fail vectorization; I have an addition to
vect_maybe_update_slp_op_vectype that would FAIL if the set
vector type isn't within the constraints of the VF.
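
The special case for permutes can be pictured with a similar standalone
sketch.  Again this is an assumption-laden model rather than the actual
patch: Operand, PermuteNode and min_vf_for_permute are hypothetical
names.  It assumes extern / constant operands share the permute result's
element count but may have a different lane count, so each such operand
contributes its own factor.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical operand of a permute node (not a GCC type).
struct Operand {
  bool is_internal;     // false for externs and constants
  std::uint64_t lanes;  // number of lanes this operand provides
};

// Hypothetical permute node: operands are assumed to use the result's
// vector type, hence the same nunits, but possibly other lane counts.
struct PermuteNode {
  std::uint64_t nunits;
  std::uint64_t lanes;
  std::vector<Operand> operands;
};

// Smallest factor f such that lanes * f is a multiple of nunits.
std::uint64_t unrolling_factor(std::uint64_t nunits, std::uint64_t lanes) {
  assert(nunits != 0 && lanes != 0);
  return std::lcm(nunits, lanes) / lanes;
}

// Raise vf so that it covers the permute itself and each extern/constant
// operand, whose lane count may differ from the permute's.
std::uint64_t min_vf_for_permute(const PermuteNode& node, std::uint64_t vf) {
  vf = std::lcm(vf, unrolling_factor(node.nunits, node.lanes));
  for (const Operand& op : node.operands)
    if (!op.is_internal)
      vf = std::lcm(vf, unrolling_factor(node.nunits, op.lanes));
  return vf;
}
```

Internal operands are skipped here on the assumption that the graph walk
already accounts for them as nodes in their own right.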

	* tree-vectorizer.h (SLP_INSTANCE_UNROLLING_FACTOR): Remove.
	(slp_instance::unrolling_factor): Likewise.
	* tree-vect-slp.cc (vect_build_slp_instance): Do not set
	SLP_INSTANCE_UNROLLING_FACTOR.  Remove the code this makes dead.
	Compute and set max_nunits from the RHS nodes merged.
	(vect_update_slp_vf_for_node): New function.
	(vect_make_slp_decision): Use vect_update_slp_vf_for_node
	to compute VF recursively.
	(vect_build_slp_store_interleaving): Get max_nunits and
	properly set that on the permute nodes built.
	(vect_analyze_slp): Do not set SLP_INSTANCE_UNROLLING_FACTOR.

parent 65abc81c

Showing with 55 additions and 21 deletions