gcc/df-core.cc · adb70fc775221b31db19c3649445e3f37e00f14b · COBOLworx / gcc-cobol

3 months ago

middle-end/117932 - further speedup DF worklist solver · 6b390f82

Richard Biener authored 3 months ago

The triple-indirect memory reference we perform for each incoming
edge age <= last_change_age[bbindex_to_postorder[e->src->index]]
is pretty bad and when there are a lot of small BBs like for the
PR26854 testcase this shows in the profile.  The following reduces
this by one level by making last_change_age indexed by BB index
rather than postorder number and realizing that for the first
iteration the age check is always true.  We pay for this by
allocating last_change_age for all BBs in the function but we
do it like for sparsesets and avoid initializing given we check
the considerd bitmap anyway.  We can also elide initializing
last_visit_age in an obvious way given we separated the initial
iteration in the previous change.

Together this improves compile-time in the PR117932 setting by
another 2%.

	PR middle-end/117932
	* df-core.cc (df_worklist_propagate_forward): Elide
	age check for the first iteration, adjust for
	last_change_age change.
	(df_worklist_propagate_backward): Likewise.
	(df_worklist_dataflow_doublequeue): Make last_change_age
	indexed by BB index, avoid clearing both age arrays.

6b390f82

History

middle-end/117932 - further speedup DF worklist solver

Richard Biener authored 3 months ago

The triple-indirect memory reference we perform for each incoming
edge age <= last_change_age[bbindex_to_postorder[e->src->index]]
is pretty bad and when there are a lot of small BBs like for the
PR26854 testcase this shows in the profile.  The following reduces
this by one level by making last_change_age indexed by BB index
rather than postorder number and realizing that for the first
iteration the age check is always true.  We pay for this by
allocating last_change_age for all BBs in the function but we
do it like for sparsesets and avoid initializing given we check
the considerd bitmap anyway.  We can also elide initializing
last_visit_age in an obvious way given we separated the initial
iteration in the previous change.

Together this improves compile-time in the PR117932 setting by
another 2%.

	PR middle-end/117932
	* df-core.cc (df_worklist_propagate_forward): Elide
	age check for the first iteration, adjust for
	last_change_age change.
	(df_worklist_propagate_backward): Likewise.
	(df_worklist_dataflow_doublequeue): Make last_change_age
	indexed by BB index, avoid clearing both age arrays.