Fix setting of call graph node AutoFDO count
We are initializing both the call graph node count and the entry block count of the function with the head_count value from the profile. Count propagation algorithm may refine the entry block count and we may end up with a case where the call graph node count is set to zero but the entry block count is non-zero. That becomes a problem because we have this code in execute_fixup_cfg: profile_count num = node->count; profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count; bool scale = num.initialized_p () && !(num == den); Here if num is 0 but den is not 0, scale becomes true and we lose the counts in if (scale) bb->count = bb->count.apply_scale (num, den); This is what happened in the issue reported in PR116743 (a 10% regression in MySQL HAMMERDB tests). 3d9e6767 made an improvement in AutoFDO count propagation, which caused a mismatch between the call graph node count (zero) and the entry block count (non-zero) and subsequent loss of counts as described above. The fix is to update the call graph node count once we've done count propagation. Tested on x86_64-pc-linux-gnu. gcc/ChangeLog: PR gcov-profile/116743 * auto-profile.cc (afdo_annotate_cfg): Fix mismatch between the call graph node count and the entry block count.
Loading
Please register or sign in to comment