    dc0dea98
    middle-end/117801 - failed register coalescing due to GIMPLE schedule · dc0dea98
    Richard Biener authored
    For a TSVC testcase we see failed register coalescing due to a
    different schedule of GIMPLE .FMA and stores fed by it.  This
    can be mitigated by making direct internal functions participate
    in TER - given we're using more and more of such functions to
    expose target capabilities it seems to be a natural thing to not
    exempt those.
    
    Unfortunately the internal function expanding API doesn't match
    what we usually have.  Instead of passing in a target and getting
    an RTX returned, the LHS of the call is expanded and written to.
    This makes the TER expansion of a call SSA def a bit unwieldy.
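
The mismatch between the two expander styles can be pictured with a small toy model. This is only a sketch: `Value`, `Env`, `Call`, `expand_add`, `expand_fma_call` and `expand_call_as_value` are hypothetical names made up for illustration and are not GCC's types or functions. It shows one expander that returns its result, one that writes to the call's LHS, and an adapter that reads the LHS back so the second style can be used where a value is expected.

```cpp
// Toy model only: none of these types or functions exist in GCC.
#include <cassert>
#include <map>
#include <string>

using Value = int;                         // stand-in for an RTX
using Env = std::map<std::string, Value>;  // stand-in for pseudo registers

// Style 1: the expander computes the result and returns it to the caller.
Value expand_add(const Env &env, const std::string &a, const std::string &b)
{
  return env.at(a) + env.at(b);
}

// A call statement with a left-hand side and three operands.
struct Call
{
  std::string lhs, a, b, c;
};

// Style 2: the expander returns nothing and stores into the call's LHS.
void expand_fma_call(Env &env, const Call &call)
{
  Value result = env.at(call.a) * env.at(call.b) + env.at(call.c);
  env[call.lhs] = result;
}

// Adapter: run the style-2 expander, then read the LHS back so the
// caller still receives a value as it would from a style-1 expander.
Value expand_call_as_value(Env &env, const Call &call)
{
  expand_fma_call(env, call);
  assert(env.count(call.lhs) == 1);  // the expander must have set the LHS
  return env.at(call.lhs);
}
```

In this model the adapter is the awkward part: the caller cannot pick where the result lands and has to go through the LHS to get at it.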
    
    Bootstrapped and tested on x86_64-unknown-linux-gnu.
    
    The ccmp changes have likely not seen any coverage.  The debug stmt
    changes might not be optimal; we might end up losing on replaceable
    calls.
    
    	PR middle-end/117801
    	* tree-outof-ssa.cc (ssa_is_replaceable_p): Make
    	direct internal function calls replaceable.
    	* expr.cc (get_def_for_expr): Handle replacements with calls.
    	(get_def_for_expr_class): Likewise.
    	(optimize_bitfield_assignment_op): Likewise.
    	(expand_expr_real_1): Likewise.  Properly expand direct
    	internal function defs.
    	* cfgexpand.cc (expand_call_stmt): Handle replacements with calls.
    	(avoid_deep_ter_for_debug): Likewise, always create a debug temp
    	for calls.
    	(expand_debug_expr): Likewise, give up for calls.
    	(expand_gimple_basic_block): Likewise.
    	* ccmp.cc (ccmp_candidate_p): Likewise.
    	(get_compare_parts): Likewise.