Richard Biener authored
For a TSVC testcase we see failed register coalescing due to a
different schedule of GIMPLE .FMA and stores fed by it.  This can be
mitigated by making direct internal functions participate in TER -
given we're using more and more of such functions to expose target
capabilities it seems to be a natural thing to not exempt those.

Unfortunately the internal function expanding API doesn't match what
we usually have - passing in a target and returning an RTX - but
instead the LHS of the call is expanded and written to.  This makes
the TER expansion of a call SSA def a bit unwieldy.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

The ccmp changes have likely not seen any coverage, the debug stmt
changes might not be optimal, we might end up losing on replaceable
calls.

	PR middle-end/117801
	* tree-outof-ssa.cc (ssa_is_replaceable_p): Make
	direct internal function calls replaceable.
	* expr.cc (get_def_for_expr): Handle replacements with calls.
	(get_def_for_expr_class): Likewise.
	(optimize_bitfield_assignment_op): Likewise.
	(expand_expr_real_1): Likewise.  Properly expand direct
	internal function defs.
	* cfgexpand.cc (expand_call_stmt): Handle replacements with calls.
	(avoid_deep_ter_for_debug): Likewise, always create a debug temp
	for calls.
	(expand_debug_expr): Likewise, give up for calls.
	(expand_gimple_basic_block): Likewise.
	* ccmp.cc (ccmp_candidate_p): Likewise.
	(get_compare_parts): Likewise.
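
For illustration only, here is a minimal sketch of the kind of source that leads to the pattern described in the message: a multiply-add whose result feeds a store. This is not the TSVC testcase from the PR, and the function name `mul_add_store` is hypothetical. On a target with fused multiply-add instructions and with floating-point contraction enabled, GCC may represent the `b[i] * c[i] + d[i]` expression as a direct internal function call (`.FMA`) in GIMPLE, whose result is then stored to `a[i]`.

```c
#include <stddef.h>

/* Hypothetical kernel: each iteration computes a multiply-add and
   stores the result.  When contraction is permitted, the compiler
   may turn the multiply-add into a single fused operation.  */
void
mul_add_store (double *restrict a, const double *restrict b,
               const double *restrict c, const double *restrict d,
               size_t n)
{
  for (size_t i = 0; i < n; ++i)
    a[i] = b[i] * c[i] + d[i];
}
```

Whether a fused operation is actually formed depends on the target and on the options in effect, so treat this as an assumption about a typical configuration rather than a reproduction recipe.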