Remove SLP_INSTANCE_UNROLLING_FACTOR, compute VF in vect_make_slp_decision
The following prepares us for SLP instances with a non-uniform number of lanes. We already have this with load permutation lowering, but we managed to keep that within the constraints of the per SLP instance computed VF, which is based on the instance's max_nunits (with a vector type fixed for each node) and the instance group size, which is the number of lanes in the SLP instance root. But once arbitrary splitting and merging of SLP nodes at non-power-of-two lane boundaries is allowed, this simple calculation based on the outgoing group size falls apart.

The following, instead of computing a VF during SLP instance discovery, computes it at vect_make_slp_decision time by walking the SLP graph and looking at each SLP node in isolation. We do track max_nunits per node; this could be a VF per node instead, or we could forgo both completely (though for BB vectorization we need to communicate a VF > 1 requirement upward, or compute that after the fact). In the end we'd like to delay vector type assignment and only compute a minimum VF here, allowing vector types to grow when the actual VF is bigger.

There's a slight complication with permutes of externs / constants, as those get their vector type (and thus max_nunits) assigned late. While we force them to have the same vector type as the result at the moment, their number of lanes can differ. So those get handled explicitly there right now to up the VF as needed - the alternative is to fail vectorization. I have an addition to vect_maybe_update_slp_op_vectype that would FAIL if the set vector type isn't within the constraints of the VF.

	* tree-vectorizer.h (SLP_INSTANCE_UNROLLING_FACTOR): Remove.
	(slp_instance::unrolling_factor): Likewise.
	* tree-vect-slp.cc (vect_build_slp_instance): Do not set
	SLP_INSTANCE_UNROLLING_FACTOR. Remove then-dead code.
	Compute and set max_nunits from the RHS nodes merged.
	(vect_update_slp_vf_for_node): New function.
	(vect_make_slp_decision): Use vect_update_slp_vf_for_node
	to compute VF recursively.
	(vect_build_slp_store_interleaving): Get max_nunits and
	properly set that on the permute nodes built.
	(vect_analyze_slp): Do not set SLP_INSTANCE_UNROLLING_FACTOR.
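
To illustrate the idea of computing the VF by walking the SLP graph and looking at each node in isolation, here is a minimal, self-contained sketch. It is not the code from this patch: the Node type and the functions node_min_vf, update_vf_for_node and compute_vf are hypothetical stand-ins. It also assumes that a node's minimum VF is the smallest factor that makes lanes * VF a multiple of the number of vector elements, and that the overall VF is the least common multiple of the per-node values.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node; this is not GCC's slp_tree.
struct Node {
  unsigned lanes;                // number of scalar lanes (assumed > 0)
  unsigned nunits;               // elements in the node's vector type (> 0)
  std::vector<Node *> children;  // operand nodes; nodes may be shared
};

static std::uint64_t gcd_u64(std::uint64_t a, std::uint64_t b) {
  while (b != 0) {
    std::uint64_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

static std::uint64_t lcm_u64(std::uint64_t a, std::uint64_t b) {
  return a / gcd_u64(a, b) * b;
}

// Assumed per-node requirement: the smallest VF such that lanes * VF
// is a multiple of nunits, i.e. lcm(nunits, lanes) / lanes.
static std::uint64_t node_min_vf(const Node &n) {
  return lcm_u64(n.nunits, n.lanes) / n.lanes;
}

// Visit each node reachable from N exactly once and fold its
// requirement into VF as a common multiple.
static void update_vf_for_node(const Node *n, std::uint64_t &vf,
                               std::unordered_set<const Node *> &visited) {
  if (n == nullptr || !visited.insert(n).second)
    return;
  for (const Node *child : n->children)
    update_vf_for_node(child, vf, visited);
  vf = lcm_u64(vf, node_min_vf(*n));
}

// Compute one VF over all instance roots, starting from VF = 1.
std::uint64_t compute_vf(const std::vector<Node *> &roots) {
  std::unordered_set<const Node *> visited;
  std::uint64_t vf = 1;
  for (const Node *root : roots)
    update_vf_for_node(root, vf, visited);
  return vf;
}
```

In this sketch, a root with 4 lanes and a 4-element vector type requires only VF 1 on its own, but if one of its children has 3 lanes with the same vector type, that child requires VF 4, so compute_vf over the graph yields 4. A calculation based only on the root's group size would miss this.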