Commit 7164d982, authored and committed by Richard Biener

Also lower SLP grouped loads with just one consumer

This makes sure to produce interleaving schemes or load-lanes
for single-element interleaving and other permutes that otherwise
would use more than three vectors.
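
For illustration, a minimal sketch of a loop with single-element
interleaving.  The function name and the group size of 4 are
assumptions for the example and are not taken from the testsuite:
only the first element of each 4-element group is read, so the load
group has a gap of 3.

```c
#include <stddef.h>

/* Hypothetical example: each iteration reads only the first element
   of a 4-element group, so the load group has size 4 and a gap of 3.
   A vectorizer has to extract every fourth element from the loaded
   vectors, either with an interleaving scheme or with load-lanes.  */
void
pick_every_fourth (const int *restrict in, int *restrict out, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = in[4 * i] + 1;
}
```

Whether such a loop is actually vectorized, and with which scheme,
depends on the target and on the options in effect.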

It exposes the latent issue that single-element interleaving with
large gaps can be inefficient - the mitigation in get_group_load_store_type
doesn't trigger when we clear the load permutation.

It also exposes the fact that not all permutes can be lowered
optimally in a vector-length-agnostic way, so I've added an
exception to keep power-of-two sized, contiguous, aligned chunks
unlowered (unless we want load-lanes).  The optimal handling
of load/store vectorization is going to continue to be a learning
process.
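
The exception can be pictured with the following sketch.  This is a
hypothetical stand-alone helper, not the code in tree-vect-slp.cc; the
name is_aligned_pow2_chunk and the representation of a load
permutation as an array of lane indices are assumptions made for the
example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if PERM, a load permutation of
   LANES lane indices into the load group, selects a contiguous chunk
   whose size is a power of two and whose first lane is a multiple of
   that size.  */
bool
is_aligned_pow2_chunk (const unsigned *perm, size_t lanes)
{
  /* The chunk size must be a nonzero power of two.  */
  if (lanes == 0 || (lanes & (lanes - 1)) != 0)
    return false;
  /* The chunk must start at a multiple of its size.  */
  if (perm[0] % lanes != 0)
    return false;
  /* The lanes must be consecutive and in order.  */
  for (size_t i = 1; i < lanes; ++i)
    if (perm[i] != perm[0] + i)
      return false;
  return true;
}
```

Under these assumptions the permutation { 4, 5, 6, 7 } qualifies,
while { 2, 3, 4, 5 } (not aligned), { 0, 2 } (not contiguous) and
{ 0, 1, 2 } (size not a power of two) do not.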

	* tree-vect-slp.cc (vect_lower_load_permutations): Also
	process single-use grouped loads.
	Avoid lowering contiguous aligned power-of-two sized
	chunks, those are better handled by the vector size
	specific SLP code generation.
	* tree-vect-stmts.cc (get_group_load_store_type): Drop
	the unrelated requirement of a load permutation for the
	single-element interleaving limit.

	* gcc.dg/vect/slp-46.c: Remove XFAIL.
parent 4292297a