i386: Add br_mispredict_scale in cost table.
For later processors, the pipeline went deeper, so the penalty for an untaken branch can be larger than before. Add a new parameter, br_mispredict_scale, to describe the penalty, and adapt the noce_max_ifcvt_seq_cost hook to use it so that longer sequences can be converted with cmove.

This improves cpu2017 544 with -Ofast -march=native by 14% on P-core SPR and by 8% on E-core SRF. No other regressions observed.

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_noce_max_ifcvt_seq_cost): Adjust
	cost with ix86_tune_cost->br_mispredict_scale.
	* config/i386/i386.h (processor_costs): Add br_mispredict_scale.
	* config/i386/x86-tune-costs.h: Add new br_mispredict_scale to
	all processor_costs, in which icelake_cost/alderlake_cost use
	the value COSTS_N_INSNS (2) + 3 and all other processors use
	the value COSTS_N_INSNS (2).

gcc/testsuite/ChangeLog:

	* gcc.target/i386/cmov12.c: New test.
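As a rough illustration of the effect on the if-conversion budget, here is a minimal standalone sketch. It is not the GCC source: the struct and function names ending in _sketch are hypothetical, and it assumes the hook simply multiplies the branch cost by br_mispredict_scale, which the message above does not spell out. COSTS_N_INSNS is defined the way GCC's rtl.h defines it.

```c
/* Illustrative sketch only; not the actual i386 backend code.  */

/* Same definition as in GCC's rtl.h: one "instruction" costs 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for the one processor_costs field of interest.  */
struct processor_costs_sketch
{
  int br_mispredict_scale;
};

/* Values taken from the ChangeLog: most cost tables use COSTS_N_INSNS (2),
   icelake_cost/alderlake_cost use COSTS_N_INSNS (2) + 3.  */
static const struct processor_costs_sketch default_cost_sketch
  = { COSTS_N_INSNS (2) };
static const struct processor_costs_sketch deep_pipeline_cost_sketch
  = { COSTS_N_INSNS (2) + 3 };

/* Assumption: the maximum cost of a sequence that may replace a branch
   is the branch cost scaled by br_mispredict_scale.  */
static unsigned int
max_ifcvt_seq_cost_sketch (unsigned int branch_cost,
			   const struct processor_costs_sketch *cost)
{
  return branch_cost * (unsigned int) cost->br_mispredict_scale;
}
```

With a hypothetical branch cost of 3, the budget under this sketch is 24 for the default tables and 33 for the icelake/alderlake tables, so a somewhat longer cmove sequence fits before if-conversion is rejected.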
Changed files:
- gcc/config/i386/i386.cc: 7 additions, 1 deletion
- gcc/config/i386/i386.h: 2 additions, 0 deletions
- gcc/config/i386/x86-tune-costs.h: 33 additions, 0 deletions
- gcc/testsuite/gcc.target/i386/cmov12.c: 21 additions, 0 deletions