- Oct 20, 2023
-
-
Oleg Endo authored
Fix accidentally inverted comparison. gcc/ChangeLog: PR target/101177 * config/sh/sh.md (unnamed split pattern): Fix comparison of find_regno_note result.
-
Richard Biener authored
The following makes sure to rewrite all gather/scatter detected by dataref analysis plus stmts classified as VMAT_GATHER_SCATTER. Maybe we need to rewrite all refs, the following covers the cases I've run into now. * tree-vect-loop.cc (update_epilogue_loop_vinfo): Rewrite both STMT_VINFO_GATHER_SCATTER_P and VMAT_GATHER_SCATTER stmt refs.
-
Richard Biener authored
I went a little bit too simple with implementing SLP gather support for emulated and builtin based gathers. The following fixes the conflict that appears when running into .MASK_LOAD where we rely on vect_get_operand_map and the bolted-on STMT_VINFO_GATHER_SCATTER_P checking wrecks that. The following properly integrates this with vect_get_operand_map, adding another special index referring to the vect_check_gather_scatter analyzed offset. This unbreaks aarch64 (and hopefully riscv); I'll follow up with more fixes and testsuite coverage for x86 where I think I got masked gather SLP support wrong. * tree-vect-slp.cc (off_map, off_op0_map, off_arg2_map, off_arg3_arg2_map): New. (vect_get_operand_map): Get flag whether the stmt was recognized as gather or scatter and use the above accordingly. (vect_get_and_check_slp_defs): Adjust. (vect_build_slp_tree_2): Likewise.
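For illustration, the kind of loop an SLP gather concerns might look like the sketch below: two adjacent stores, each fed by a load through an index array. The function name and data layout are hypothetical, and whether GCC actually vectorizes this with gathers depends on the target and options.

```cpp
#include <cstddef>

// Hypothetical kernel, not taken from the GCC testsuite.
// out[2*i] and out[2*i+1] are adjacent stores (a candidate SLP group);
// each is fed by an indexed load src[idx[...]], i.e. a gather.
void gather_pairs(float *out, const float *src, const int *idx, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      out[2 * i]     = src[idx[2 * i]];
      out[2 * i + 1] = src[idx[2 * i + 1]];
    }
}
```

Calling `gather_pairs(out, src, idx, 2)` fills four output elements from the positions named in `idx`.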
-
Tobias Burnus authored
The omp_lock_hint_* parameters were deprecated in favor of omp_sync_hint_*. While omp.h contained deprecation markers for those, the omp_lib module only contained them for omp_{g,s}_nested. Note: The -Wdeprecated-declarations warning will only become active once openmp_version / _OPENMP is bumped from 201511 (4.5) to 201811 (5.0). libgomp/ChangeLog: * omp_lib.f90.in: Tag omp_lock_hint_* as being deprecated when _OPENMP >= 201811.
-
Juzhe-Zhong authored
1. Remove "m_" prefix as they are not private members. 2. Rename infos -> local_infos, info -> global_info to clarify their meaning. Pushed as it is obvious. gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (pre_vsetvl::fuse_local_vsetvl_info): Rename variables. (pre_vsetvl::pre_global_vsetvl_info): Ditto. (pre_vsetvl::emit_vsetvl): Ditto.
-
Tamar Christina authored
With the patch enabling the vectorization of early-breaks, we'd like to allow bitfield lowering in such loops, which requires the relaxation of allowing multiple exits when doing so. In order to avoid a similar issue to PR107275, the code that rejects loops with certain types of gimple_stmts was hoisted from 'if_convertible_loop_p_1' to 'get_loop_body_in_if_conv_order', to avoid trying to lower bitfields in loops we are not going to vectorize anyway. This also ensures 'ifcvt_local_dce' doesn't accidentally remove statements it shouldn't as it will never come across them. I made sure to add a comment to make clear that there is a direct connection between the two and if we were to enable vectorization of any other gimple statement we should make sure both handle it. gcc/ChangeLog: * tree-if-conv.cc (if_convertible_loop_p_1): Move check from here ... (get_loop_body_in_if_conv_order): ... to here. (if_convertible_loop_p): Remove single_exit check. (tree_if_conversion): Move single_exit check to if-conversion part and support multiple exits. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-bitfield-read-1-not.c: New test. * gcc.dg/vect/vect-bitfield-read-2-not.c: New test. * gcc.dg/vect/vect-bitfield-read-8.c: New test. * gcc.dg/vect/vect-bitfield-read-9.c: New test. Co-Authored-By:
Andre Vieira <andre.simoesdiasvieira@arm.com>
-
Tamar Christina authored
The bitfield vectorization support does not currently recognize bitfields inside gconds. This means they can't be used as conditions for early break vectorization which is a functionality we require. This adds support for them by explicitly matching and handling gcond as a source. Testcases are added in the testsuite update patch as the only way to get there is with the early break vectorization. See tests: - vect-early-break_20.c - vect-early-break_21.c gcc/ChangeLog: * tree-vect-patterns.cc (vect_init_pattern_stmt): Copy STMT_VINFO_TYPE from original statement. (vect_recog_bitfield_ref_pattern): Support bitfields in gcond. Co-Authored-By:
Andre Vieira <andre.simoesdiasvieira@arm.com>
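As a rough sketch of the source pattern described here, the loop below exits early on a condition that reads a bitfield. The struct and function names are made up for illustration; this is not one of the referenced vect-early-break tests, and whether the loop is vectorized depends on target and options.

```cpp
#include <cstddef>

// Hypothetical record with two bitfields packed into one word.
struct Item
{
  unsigned value : 7;  // payload bits
  unsigned stop  : 1;  // flag used as the loop exit condition
};

// Returns the index of the first item whose 'stop' bit is set, or n if none.
std::size_t first_stop(const Item *items, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    if (items[i].stop)   // bitfield read feeding the branch condition
      return i;          // early break out of the loop
  return n;
}
```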
-
Hu, Lin1 authored
Hi all, this patch aims to fix some scan-asm failures in pr89229-{5,6,7}b.c, which occur because we emit scalar vmov{s,d} when trying to use x/ymm 16+ without avx512vl but with avx512f+evex512. If nobody objects to this change in behavior, we intend to resolve these failures by modifying the testcases. BRs, Lin gcc/testsuite/ChangeLog: * gcc.target/i386/pr89229-5b.c: Modify test. * gcc.target/i386/pr89229-6b.c: Ditto. * gcc.target/i386/pr89229-7b.c: Ditto.
-
Juzhe-Zhong authored
Confirmed that the dynamic LMUL algorithm works well, choosing LMUL = 4 for the PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848 But it generates horrible register spilling. The root cause is that we didn't hoist the vmv.v.x outside the loop, which increases the SLP loop register pressure. So, change the CONST_VECTOR move into a vec_duplicate splitter so that we can gain better optimizations: 1. Better LICM. 2. More opportunities for transforming 'vv' into 'vx' in the future.

Before this patch:

f3:
        ble     a4,zero,.L8
        csrr    t0,vlenb
        slli    t1,t0,4
        csrr    a6,vlenb
        sub     sp,sp,t1
        csrr    a5,vlenb
        slli    a6,a6,3
        slli    a5,a5,2
        add     a6,a6,sp
        vsetvli a7,zero,e16,m8,ta,ma
        slli    a4,a4,3
        vid.v   v8
        addi    t6,a5,-1
        vand.vi v8,v8,-2
        neg     t5,a5
        vs8r.v  v8,0(sp)
        vadd.vi v8,v8,1
        vs8r.v  v8,0(a6)
        j       .L4
.L12:
        vsetvli a7,zero,e16,m8,ta,ma
.L4:
        csrr    t0,vlenb
        slli    t0,t0,3
        vl8re16.v       v16,0(sp)
        add     t0,t0,sp
        vmv.v.x v8,t6
        mv      t1,a4
        vand.vv v24,v16,v8
        mv      a6,a4
        vl8re16.v       v16,0(t0)
        vand.vv v8,v16,v8
        bleu    a4,a5,.L3
        mv      a6,a5
.L3:
        vsetvli zero,a6,e8,m4,ta,ma
        vle8.v  v20,0(a2)
        vle8.v  v16,0(a3)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v20,v24
        vadd.vv v4,v16,v4
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a0)
        vle8.v  v20,0(a2)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v20,v8
        vadd.vv v4,v4,v16
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a1)
        add     a4,a4,t5
        add     a0,a0,a5
        add     a3,a3,a5
        add     a1,a1,a5
        add     a2,a2,a5
        bgtu    t1,a5,.L12
        csrr    t0,vlenb
        slli    t1,t0,4
        add     sp,sp,t1
        jr      ra
.L8:
        ret

After this patch:

f3:
        ble     a4,zero,.L6
        csrr    a6,vlenb
        csrr    a5,vlenb
        slli    a6,a6,2
        slli    a5,a5,2
        addi    a6,a6,-1
        slli    a4,a4,3
        neg     t5,a5
        vsetvli t1,zero,e16,m8,ta,ma
        vmv.v.x v24,a6
        vid.v   v8
        vand.vi v8,v8,-2
        vadd.vi v16,v8,1
        vand.vv v8,v8,v24
        vand.vv v16,v16,v24
.L4:
        mv      t1,a4
        mv      a6,a4
        bleu    a4,a5,.L3
        mv      a6,a5
.L3:
        vsetvli zero,a6,e8,m4,ta,ma
        vle8.v  v28,0(a2)
        vle8.v  v24,0(a3)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v28,v8
        vadd.vv v4,v24,v4
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a0)
        vle8.v  v28,0(a2)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v28,v16
        vadd.vv v4,v4,v24
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a1)
        add     a4,a4,t5
        add     a0,a0,a5
        add     a3,a3,a5
        add     a1,a1,a5
        add     a2,a2,a5
        bgtu    t1,a5,.L4
.L6:
        ret

Note that this patch triggers multiple FAILs:

FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test

They all fail because of bugs in the VSETVL pass:

10dd4: 0c707057 vsetvli zero,zero,e8,mf2,ta,ma
10dd8: 5e06b8d7 vmv.v.i v17,13
10ddc: 9ed030d7 vmv1r.v v1,v13
10de0: b21040d7 vncvt.x.x.w v1,v1 ----> raise illegal instruction since we don't have SEW = 8 -> SEW = 4 narrowing.
10de4: 5e0785d7 vmv.v.v v11,v15

Confirmed that the recent VSETVL refactor patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633231.html fixes all of them. So this patch should be committed after the VSETVL refactor patch. PR target/111848 gcc/ChangeLog: * config/riscv/riscv-selftests.cc (run_const_vector_selftests): Adapt selftest. * config/riscv/riscv-v.cc (expand_const_vector): Change it into vec_duplicate splitter. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Adapt test. * gcc.dg/vect/costmodel/riscv/rvv/pr111848.c: New test.
-
Lehua Ding authored
This patch refactors and cleans up the vsetvl pass in order to make the code easier to modify and understand. This patch does several things: 1. Introduces a virtual CFG for vsetvl infos; Phases 1, 2 and 3 only maintain and modify this virtual CFG. Phase 4 performs insertion, modification and deletion of vsetvl insns based on the virtual CFG. The basic block in the virtual CFG is called vsetvl_block_info and the vsetvl information inside is called vsetvl_info. 2. Combines Phases 1 and 2 into a single Phase 1 and unifies the demand system; this phase only fuses local vsetvl info in the forward direction. 3. Refactors Phase 3, changing the logic for determining whether to uplift vsetvl info to a pred basic block to a more unified method: there must be a vsetvl info among the reaching vsetvl definitions that is compatible with it. 4. Places all modification operations to the RTL in Phase 4 and Phase 5. Phase 4 is responsible for inserting, modifying and deleting vsetvl instructions based on fully optimized vsetvl infos. Phase 5 removes the avl operand from the RVV instruction and removes the unused dest operand register from the vsetvl insns. These modifications resulted in some testcases needing to be updated. The reasons for updating are summarized below: 1. more optimized vlmax_back_prop-{25,26}.c vlmax_conflict-{3,12}.c/vsetvl-{13,23}.c/vsetvl-23.c/ avl_single-{23,84,95}.c/pr109773-1.c 2. less unnecessary fusion avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c 3. local fuse direction (backward -> forward) scalar_move-1.c 4. add some bugfix testcases. pr111037-{3,4}.c/pr111037-4.c avl_single-{89,104,105,106,107,108,109}.c PR target/111037 PR target/111234 PR target/111725 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry): New. (debug): Removed. (compute_reaching_defintion): New. (enum vsetvl_type): Moved. (vlmax_avl_p): Moved. (enum emit_type): Moved. (vlmul_to_str): Moved. (vlmax_avl_insn_p): Removed. (policy_to_str): Moved.
(loop_basic_block_p): Removed. (valid_sew_p): Removed. (vsetvl_insn_p): Moved. (vsetvl_vtype_change_only_p): Removed. (after_or_same_p): Removed. (before_p): Removed. (anticipatable_occurrence_p): Removed. (available_occurrence_p): Removed. (insn_should_be_added_p): Removed. (get_all_sets): Moved. (get_same_bb_set): Moved. (gen_vsetvl_pat): Removed. (calculate_vlmul): Moved. (get_max_int_sew): New. (emit_vsetvl_insn): Removed. (get_max_float_sew): New. (eliminate_insn): Removed. (insert_vsetvl): Removed. (count_regno_occurrences): Moved. (get_vl_vtype_info): Removed. (enum def_type): Moved. (validate_change_or_fail): Moved. (change_insn): Removed. (get_all_real_uses): Moved. (get_forward_read_vl_insn): Removed. (get_backward_fault_first_load_insn): Removed. (change_vsetvl_insn): Removed. (avl_source_has_vsetvl_p): Removed. (source_equal_p): Moved. (calculate_sew): Removed. (same_equiv_note_p): Moved. (get_expr_id): New. (incompatible_avl_p): Removed. (get_regno): New. (different_sew_p): Removed. (get_bb_index): New. (different_lmul_p): Removed. (has_no_uses): Moved. (different_ratio_p): Removed. (different_tail_policy_p): Removed. (different_mask_policy_p): Removed. (possible_zero_avl_p): Removed. (enum demand_flags): New. (second_ratio_invalid_for_first_sew_p): Removed. (second_ratio_invalid_for_first_lmul_p): Removed. (enum class): New. (float_insn_valid_sew_p): Removed. (second_sew_less_than_first_sew_p): Removed. (first_sew_less_than_second_sew_p): Removed. (class vsetvl_info): New. (compare_lmul): Removed. (second_lmul_less_than_first_lmul_p): Removed. (second_ratio_less_than_first_ratio_p): Removed. (DEF_INCOMPATIBLE_COND): Removed. (greatest_sew): Removed. (first_sew): Removed. (second_sew): Removed. (first_vlmul): Removed. (second_vlmul): Removed. (first_ratio): Removed. (second_ratio): Removed. (vlmul_for_first_sew_second_ratio): Removed. (vlmul_for_greatest_sew_second_ratio): Removed. (ratio_for_second_sew_first_vlmul): Removed. 
(class vsetvl_block_info): New. (DEF_SEW_LMUL_FUSE_RULE): New. (always_unavailable): Removed. (avl_unavailable_p): Removed. (class demand_system): New. (sew_unavailable_p): Removed. (lmul_unavailable_p): Removed. (ge_sew_unavailable_p): Removed. (ge_sew_lmul_unavailable_p): Removed. (ge_sew_ratio_unavailable_p): Removed. (DEF_UNAVAILABLE_COND): Removed. (same_sew_lmul_demand_p): Removed. (propagate_avl_across_demands_p): Removed. (reg_available_p): Removed. (support_relaxed_compatible_p): Removed. (demands_can_be_fused_p): Removed. (earliest_pred_can_be_fused_p): Removed. (vsetvl_dominated_by_p): Removed. (avl_info::avl_info): Removed. (avl_info::single_source_equal_p): Removed. (avl_info::multiple_source_equal_p): Removed. (DEF_SEW_LMUL_RULE): New. (avl_info::operator=): Removed. (avl_info::operator==): Removed. (DEF_POLICY_RULE): New. (avl_info::operator!=): Removed. (avl_info::has_non_zero_avl): Removed. (vl_vtype_info::vl_vtype_info): Removed. (vl_vtype_info::operator==): Removed. (DEF_AVL_RULE): New. (vl_vtype_info::operator!=): Removed. (vl_vtype_info::same_avl_p): Removed. (vl_vtype_info::same_vtype_p): Removed. (vl_vtype_info::same_vlmax_p): Removed. (vector_insn_info::operator>=): Removed. (vector_insn_info::operator==): Removed. (class pre_vsetvl): New. (vector_insn_info::parse_insn): Removed. (vector_insn_info::compatible_p): Removed. (vector_insn_info::skip_avl_compatible_p): Removed. (vector_insn_info::compatible_avl_p): Removed. (vector_insn_info::compatible_vtype_p): Removed. (vector_insn_info::available_p): Removed. (vector_insn_info::fuse_avl): Removed. (vector_insn_info::fuse_sew_lmul): Removed. (vector_insn_info::fuse_tail_policy): Removed. (vector_insn_info::fuse_mask_policy): Removed. (vector_insn_info::local_merge): Removed. (vector_insn_info::global_merge): Removed. (vector_insn_info::get_avl_or_vl_reg): Removed. (vector_insn_info::update_fault_first_load_avl): Removed. (vector_insn_info::dump): Removed. 
(vector_infos_manager::vector_infos_manager): Removed. (vector_infos_manager::create_expr): Removed. (vector_infos_manager::get_expr_id): Removed. (vector_infos_manager::all_same_ratio_p): Removed. (vector_infos_manager::all_avail_in_compatible_p): Removed. (vector_infos_manager::all_same_avl_p): Removed. (vector_infos_manager::expr_set_num): Removed. (vector_infos_manager::release): Removed. (vector_infos_manager::create_bitmap_vectors): Removed. (vector_infos_manager::free_bitmap_vectors): Removed. (vector_infos_manager::dump): Removed. (class pass_vsetvl): Adjust. (pass_vsetvl::get_vector_info): Removed. (pass_vsetvl::get_block_info): Removed. (pass_vsetvl::update_vector_info): Removed. (pass_vsetvl::update_block_info): Removed. (pre_vsetvl::compute_avl_def_data): New. (pass_vsetvl::simple_vsetvl): Removed. (pass_vsetvl::compute_local_backward_infos): Removed. (pass_vsetvl::need_vsetvl): Removed. (pass_vsetvl::transfer_before): Removed. (pass_vsetvl::transfer_after): Removed. (pre_vsetvl::compute_vsetvl_def_data): New. (pass_vsetvl::emit_local_forward_vsetvls): Removed. (pass_vsetvl::prune_expressions): Removed. (pass_vsetvl::compute_local_properties): Removed. (pre_vsetvl::compute_lcm_local_properties): New. (pass_vsetvl::earliest_fusion): Removed. (pre_vsetvl::fuse_local_vsetvl_info): New. (pass_vsetvl::vsetvl_fusion): Removed. (pass_vsetvl::can_refine_vsetvl_p): Removed. (pre_vsetvl::earliest_fuse_vsetvl_info): New. (pass_vsetvl::refine_vsetvls): Removed. (pass_vsetvl::cleanup_vsetvls): Removed. (pass_vsetvl::commit_vsetvls): Removed. (pass_vsetvl::pre_vsetvl): Removed. (pass_vsetvl::get_vsetvl_at_end): Removed. (local_avl_compatible_p): Removed. (pass_vsetvl::local_eliminate_vsetvl_insn): Removed. (pre_vsetvl::pre_global_vsetvl_info): New. (get_first_vsetvl_before_rvv_insns): Removed. (pass_vsetvl::global_eliminate_vsetvl_insn): Removed. (pre_vsetvl::emit_vsetvl): New. (pass_vsetvl::ssa_post_optimization): Removed. (pre_vsetvl::cleaup): New. 
(pre_vsetvl::remove_avl_operand): New. (pass_vsetvl::df_post_optimization): Removed. (pre_vsetvl::remove_unused_dest_operand): New. (pass_vsetvl::init): Removed. (pass_vsetvl::done): Removed. (pass_vsetvl::compute_probabilities): Removed. (pass_vsetvl::lazy_vsetvl): Adjust. (pass_vsetvl::execute): Adjust. * config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed. (DEF_SEW_LMUL_RULE): New. (DEF_SEW_LMUL_FUSE_RULE): Removed. (DEF_POLICY_RULE): New. (DEF_UNAVAILABLE_COND): Removed (DEF_AVL_RULE): New demand type. (sew_lmul): New demand type. (ratio_only): New demand type. (sew_only): New demand type. (ge_sew): New demand type. (ratio_and_ge_sew): New demand type. (tail_mask_policy): New demand type. (tail_policy_only): New demand type. (mask_policy_only): New demand type. (ignore_policy): New demand type. (avl): New demand type. (non_zero_avl): New demand type. (ignore_avl): New demand type. * config/riscv/t-riscv: Removed riscv-vsetvl.h * config/riscv/riscv-vsetvl.h: Removed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust. * gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust. * gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust. * gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust. * gcc.target/riscv/rvv/base/pr111037-1.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here. * gcc.target/riscv/rvv/base/pr111037-2.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust. 
* gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-106.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-107.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-108.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-109.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test.
-
Alexandre Oliva authored
The need to initialize edge probabilities has made make_eh_edges undesirably hard to use. I suppose we don't want make_eh_edges to initialize the probability of the newly-added edge itself, so that the caller takes care of it, but identifying the added edge in need of adjustments is inefficient and cumbersome. Change make_eh_edges so that it returns the added edge. for gcc/ChangeLog * tree-eh.cc (make_eh_edges): Return the new edge. * tree-eh.h (make_eh_edges): Likewise.
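The interface change can be pictured with a small toy model: a function that adds an edge and hands it back, so the caller can finish initializing it without searching for it. The `Edge` and `Graph` types below are hypothetical stand-ins and not GCC's own CFG data structures.

```cpp
#include <memory>
#include <vector>

// Toy stand-in for a CFG edge; probability < 0 means "not yet initialized".
struct Edge
{
  int src;
  int dest;
  double probability;
  Edge(int s, int d) : src(s), dest(d), probability(-1.0) {}
};

// Toy stand-in for a graph that owns its edges.
struct Graph
{
  std::vector<std::unique_ptr<Edge>> edges;

  // Add an edge and return it, leaving the probability for the caller to set.
  Edge *add_edge(int src, int dest)
  {
    edges.push_back(std::make_unique<Edge>(src, dest));
    return edges.back().get();
  }
};

// Caller side: no lookup is needed to find the edge that was just added.
double connect_with_probability(Graph &g, int src, int dest, double p)
{
  Edge *e = g.add_edge(src, dest);
  e->probability = p;
  return e->probability;
}
```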
-
Nathaniel Shead authored
This patch adds checks for attempting to change the active member of a union by methods other than a member access expression. To be able to properly distinguish `*(&u.a) = ` from `u.a = `, this patch redoes the solution for c++/59950 to avoid extraneous *&; it seems that the only case that needed the workaround was when copying empty classes. This patch also ensures that constructors for a union field mark that field as the active member before entering the call itself; this ensures that modifications of the field within the constructor's body don't cause false positives (as these will not appear to be member access expressions). This means that we no longer need to start the lifetime of empty union members after the constructor body completes. As a drive-by fix, this patch also ensures that value-initialised unions are considered to have activated their initial member for the purpose of checking stores and accesses, which catches some additional mistakes pre-C++20. PR c++/101631 PR c++/102286 gcc/cp/ChangeLog: * call.cc (build_over_call): Fold more indirect refs for trivial assignment op. * class.cc (type_has_non_deleted_trivial_default_ctor): Create. * constexpr.cc (cxx_eval_call_expression): Start lifetime of union member before entering constructor. (cxx_eval_component_reference): Check against first member of value-initialised union. (cxx_eval_store_expression): Activate member for value-initialised union. Check for accessing inactive union member indirectly. * cp-tree.h (type_has_non_deleted_trivial_default_ctor): Forward declare. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/constexpr-89336-3.C: Fix union initialisation. * g++.dg/cpp1y/constexpr-union6.C: New test. * g++.dg/cpp1y/constexpr-union7.C: New test. * g++.dg/cpp2a/constexpr-union2.C: New test. * g++.dg/cpp2a/constexpr-union3.C: New test. * g++.dg/cpp2a/constexpr-union4.C: New test. * g++.dg/cpp2a/constexpr-union5.C: New test. * g++.dg/cpp2a/constexpr-union6.C: New test. Signed-off-by:
Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by:
Jason Merrill <jason@redhat.com>
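A minimal sketch of the permitted form, assuming a C++20 compiler for the constant-evaluation part: the active member is switched by assigning through a member access expression. The union, macro and function names are illustrative and do not come from the new tests.

```cpp
union U
{
  int a;
  int b;
};

// Only C++20 and later allow changing the active member during constant
// evaluation, so the function is constexpr only there (illustrative macro).
#if __cplusplus >= 202002L
#define UNION_DEMO_CONSTEXPR constexpr
#else
#define UNION_DEMO_CONSTEXPR inline
#endif

UNION_DEMO_CONSTEXPR int switch_by_member_access()
{
  U u{1};    // 'a' is the active member
  u.b = 2;   // assignment through a member access expression activates 'b'
  return u.b;
}

#if __cplusplus >= 202002L
static_assert(switch_by_member_access() == 2, "member access form is allowed");
#endif
```

Going by the commit message, the indirect form `*(&u.b) = 2` is the kind of store the new checks are meant to reject when it appears in a constant expression.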
-
Nathaniel Shead authored
This patch improves the errors given when casting from void* in C++26 to include the expected type if the types of the pointed-to objects were not similar. It also ensures (for all standard modes) that void* casts are checked even for DECL_ARTIFICIAL declarations, such as lifetime-extended temporaries, and is only ignored for cases where we know it's OK (e.g. source_location::current) or have no other choice (heap-allocated data). gcc/cp/ChangeLog: * constexpr.cc (is_std_source_location_current): New. (cxx_eval_constant_expression): Only ignore cast from void* for specific cases and improve other diagnostics. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-cast4.C: New test. Signed-off-by:
Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by:
Marek Polacek <polacek@redhat.com> Reviewed-by:
Jason Merrill <jason@redhat.com>
-
GCC Administrator authored
-
- Oct 19, 2023
-
-
Marek Polacek authored
This patch is an optimization tweak for cp_fold_r. If we cp_fold_r the COND_EXPR's op0 first, we may be able to evaluate it to a constant if -O. cp_fold has:

    if (callee && DECL_DECLARED_CONSTEXPR_P (callee)
        && !flag_no_inline)
    ...
      r = maybe_constant_value (x, /*decl=*/NULL_TREE,

flag_no_inline is 1 for -O0:

    if (opts->x_optimize == 0)
      {
        /* Inlining does not work if not optimizing,
           so force it not to be done.  */
        opts->x_warn_inline = 0;
        opts->x_flag_no_inline = 1;
      }

but otherwise it's 0 and cp_fold will maybe_constant_value calls to constexpr functions. And if it doesn't, then folding the COND_EXPR will keep both arms, and we can avoid calling maybe_constant_value. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_fold_r): Don't call maybe_constant_value.
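For context, the source-level shape in question is a conditional expression whose condition calls a constexpr function. The names below are invented for this sketch; whether the condition is folded is up to the compiler and the optimization level.

```cpp
// A constexpr predicate (hypothetical example function).
constexpr bool is_small(int n) { return n < 10; }

int pick()
{
  // The condition is a constexpr call with a constant argument, so with
  // optimization enabled a compiler may fold it and keep only one arm.
  return is_small(3) ? 1 : 2;
}
```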
-
Marek Polacek authored
I noticed that Patrick is missing here. gcc/ChangeLog: * doc/contrib.texi: Add entry for Patrick Palka.
-
Andre Vieira authored
This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function compatible with mask parameters in clone. * tree-vect-stmts.cc (vect_build_all_ones_mask): Allow vector boolean typed masks. (vectorizable_simd_clone_call): Enable the use of masked clones in fully masked loops.
-
Andre Vieira authored
When analyzing a loop and choosing a simdclone to use it is possible to choose a simdclone that cannot be used 'inbranch' for a loop that can use partial vectors. This may lead to the vectorizer deciding to use partial vectors which are not supported for notinbranch simd clones. This patch fixes that by disabling the use of partial vectors once a notinbranch simd clone has been selected. gcc/ChangeLog: PR tree-optimization/110485 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Disable partial vectors usage if a notinbranch simdclone has been selected. gcc/testsuite/ChangeLog: * gcc.dg/gomp/pr110485.c: New test.
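The two kinds of clones come from the OpenMP `declare simd` directive. The sketch below, with made-up function names, shows one function declared `notinbranch` and one declared `inbranch`; the pragmas only take effect when OpenMP SIMD support is enabled (for example with -fopenmp or -fopenmp-simd) and are otherwise ignored.

```cpp
// Request SIMD clones that are only called unconditionally.
#pragma omp declare simd notinbranch
float scale(float x) { return 2.0f * x; }

// Request SIMD clones that take a mask and may be called under a condition.
#pragma omp declare simd inbranch
float bump(float x) { return x + 1.0f; }

void apply(float *out, const float *in, int n)
{
  for (int i = 0; i < n; ++i)
    {
      float v = scale(in[i]);              // unconditional call
      out[i] = (v > 4.0f) ? bump(v) : v;   // conditional call
    }
}
```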
-
Andre Vieira authored
The vect_get_smallest_scalar_type helper function was using any argument to a simd clone call when trying to determine the smallest scalar type that would be vectorized. This included the function pointer type in a MASK_CALL for instance, and would result in the wrong type being selected. Instead this patch special cases simd_clone_call's and uses only scalar types of the original function that get transformed into vector types. gcc/ChangeLog: * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Special case simd clone calls and only use types that are mapped to vectors. (simd_clone_call_p): New helper function. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-16f.c: Remove unnecessary differentiation between targets with different pointer sizes. * gcc.dg/vect/vect-simd-clone-17f.c: Likewise. * gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
-
Andre Vieira authored
Teach parloops how to handle a poly nit and bound ahead of the changes to enable non-constant simdlen. gcc/ChangeLog: * tree-parloops.cc (try_transform_to_exit_first_loop_alt): Accept poly NIT and ALT_BOUND.
-
Andre Vieira authored
SVE simd clones must be compiled with an SVE target enabled or the argument types will not be created properly. To achieve this we need to copy DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the clones. I decided it was probably also a good idea to copy DECL_FUNCTION_SPECIFIC_OPTIMIZATION in case the original function is meant to be compiled with specific optimization options. gcc/ChangeLog: * tree-parloops.cc (create_loop_fn): Copy specific target and optimization options to clone.
-
Andre Vieira authored
Refactor simd clone handling code ahead of support for poly simdlen. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_subparts): Remove. (simd_clone_init_simd_arrays): Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS. (ipa_simd_modify_function_body): Likewise. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Likewise. (simd_clone_subparts): Remove.
-
François Dumont authored
On merge, reuse a merged node's possibly cached hash code only if we are on the same type of hash and this hash is stateless. Usage of function pointers or std::function as hash functor will prevent reusing cached hash code. libstdc++-v3/ChangeLog * include/bits/hashtable_policy.h (_Hash_code_base::_M_hash_code(const _Hash&, const _Hash_node_value<>&)): Remove. (_Hash_code_base::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): Remove. * include/bits/hashtable.h (_M_src_hash_code<_H2>(const _H2&, const key_type&, const __node_value_type&)): New. (_M_merge_unique<>, _M_merge_multi<>): Use latter. * testsuite/23_containers/unordered_map/modifiers/merge.cc (test04, test05, test06): New test cases.
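From the user's side, the situation arises when merging containers whose hash functors differ. A minimal sketch, assuming C++17 and using a hypothetical second hasher `ShiftedHash`:

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>

// A second, distinct hash functor type (hypothetical, for illustration).
struct ShiftedHash
{
  std::size_t operator()(int k) const noexcept
  { return std::hash<int>{}(k) + 1; }
};

using Dst = std::unordered_map<int, int>;               // uses std::hash<int>
using Src = std::unordered_map<int, int, ShiftedHash>;  // different hasher

// Moves every node whose key is not already in dst; returns how many
// elements stay behind in src.
std::size_t merge_into(Dst &dst, Src &src)
{
  dst.merge(src);   // C++17: the source may use a different hash type
  return src.size();
}
```

Because the two maps hash differently, a hash code cached in a source node cannot simply be reused by the destination, which is the case the commit message distinguishes.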
-
Andrew Pinski authored
In the case of convert_argument, we would return the same expression back rather than error_mark_node after the error message about trying to convert to an incomplete type. This causes issues in the gimplifier trying to see if another conversion is needed. The code here dates back to before the revision history too so it might be the case it never noticed we should return an error_mark_node. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR c/100532 gcc/c/ChangeLog: * c-typeck.cc (convert_argument): After erroring out about an incomplete type return error_mark_node. gcc/testsuite/ChangeLog: * gcc.dg/pr100532-1.c: New test.
-
Andrew Pinski authored
Just as we don't warn about converting a NULL pointer constant to a different named address space, we should not warn about converting it to a different scalar storage order (sso) endianness either. This adds the simple check. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR c/104822 gcc/c/ChangeLog: * c-typeck.cc (convert_for_assignment): Check for null pointer before warning about an incompatible scalar storage order. gcc/testsuite/ChangeLog: * gcc.dg/sso-18.c: New test. * gcc.dg/sso-19.c: New test.
-
Jason Merrill authored
gcc/ChangeLog: * ABOUT-GCC-NLS: Add usage guidance.
-
Jason Merrill authored
While checking another change, I noticed that the new permerror overloads break gettext with "permerror used incompatibly as both --keyword=permerror:2 --flag=permerror:2:gcc-internal-format and --keyword=permerror:3 --flag=permerror:3:gcc-internal-format". So let's change the name. gcc/ChangeLog: * diagnostic-core.h (permerror): Rename new overloads... (permerror_opt): To this. * diagnostic.cc: Likewise. gcc/cp/ChangeLog: * typeck2.cc (check_narrowing): Adjust.
-
Jason Merrill authored
Since these strings are passed to error_at, they should be marked for translation with G_, like other diagnostic messages, rather than _, which forces immediate (redundant) translation. The use of N_ is less problematic, but also imprecise. gcc/cp/ChangeLog: * parser.cc (cp_parser_primary_expression): Use G_. (cp_parser_using_enum): Likewise. * decl.cc (identify_goto): Likewise.
-
Yannick Moy authored
SPARK RM 6.1.11 introduces a new aspect Side_Effects to denote those functions which may have output parameters, write global variables, raise exceptions and not terminate. This adds support for this aspect and the corresponding pragma in the frontend. Handling of this aspect in the frontend is very similar to the handling of aspect Extensions_Visible: both are Boolean aspects whose expression should be static, they can be specified on the same entities, with the same rule of inheritance from overridden to overriding primitives for tagged types. There is no impact on code generation. gcc/ada/ * aspects.ads: Add aspect Side_Effects. * contracts.adb (Add_Pre_Post_Condition) (Inherit_Subprogram_Contract): Add support for new contract. * contracts.ads: Update comments. * einfo-utils.adb (Get_Pragma): Add support. * einfo-utils.ads (Prag): Update comment. * errout.ads: Add explain codes. * par-prag.adb (Prag): Add support. * sem_ch13.adb (Analyze_Aspect_Specifications) (Check_Aspect_At_Freeze_Point): Add support. * sem_ch6.adb (Analyze_Subprogram_Body_Helper) (Analyze_Subprogram_Declaration): Call new analysis procedure to check SPARK legality rules. (Analyze_SPARK_Subprogram_Specification): New procedure to check SPARK legality rules. Use an explain code for the error. (Analyze_Subprogram_Specification): Move checks to new subprogram. This code was effectively dead, as the kind for parameters was set to E_Void at this point to detect early references. * sem_ch6.ads (Analyze_Subprogram_Specification): Add new procedure. * sem_prag.adb (Analyze_Depends_In_Decl_Part) (Analyze_Global_In_Decl_Part): Adapt legality check to apply only to functions without side-effects. (Analyze_If_Present): Extract functionality in new procedure Analyze_If_Present_Internal. (Analyze_If_Present_Internal): New procedure to analyze given pragma kind. (Analyze_Pragmas_If_Present): New procedure to analyze given pragma kind associated with a declaration. 
(Analyze_Pragma): Adapt support for Always_Terminates and Exceptional_Cases. Add support for Side_Effects. Make sure to call Analyze_If_Present to ensure pragma Side_Effects is analyzed prior to analyzing pragmas Global and Depends. Use explain codes for the errors. * sem_prag.ads (Analyze_Pragmas_If_Present): Add new procedure. * sem_util.adb (Is_Function_With_Side_Effects): New query function to determine if a function is a function with side-effects. * sem_util.ads (Is_Function_With_Side_Effects): Same. * snames.ads-tmpl: Declare new names for pragma and aspect. * doc/gnat_rm/implementation_defined_aspects.rst: Document new aspect. * doc/gnat_rm/implementation_defined_pragmas.rst: Document new pragma. * gnat_rm.texi: Regenerate.
-
Sheri Bernstein authored
Rewrite for loop containing an exit (which violates GNATcheck rule Exits_From_Conditional_Loops), to use a while loop which contains the exit criteria in its condition. Also, move special case of first time through loop, to come before loop. gcc/ada/ * libgnat/s-imagef.adb (Set_Image_Fixed): Refactor loop.
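The shape of this refactoring can be illustrated in C; the actual change is in Ada, and the function names and the search criterion below are made up for illustration. The first version leaves a counted loop through a conditional exit in its body, while the second puts the exit criterion into the loop condition.

```c
#include <assert.h>
#include <stddef.h>

/* Before: a counted loop with a conditional exit inside the body. */
size_t first_zero_with_exit(const int *digits, size_t n)
{
    size_t i;

    assert(digits != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (digits[i] == 0)
            break;              /* exit from inside the loop */
    }
    return i;
}

/* After: the exit criterion is part of the loop condition. */
size_t first_zero_in_condition(const int *digits, size_t n)
{
    size_t i = 0;

    assert(digits != NULL || n == 0);
    while (i < n && digits[i] != 0)
        i++;
    return i;
}
```

Both functions return the index of the first zero element, or `n` when there is none.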
-
Sheri Bernstein authored
Exempt the GNATcheck rule "Unassigned_OUT_Parameters" with the rationale "the OUT parameter is assigned by component". gcc/ada/ * libgnat/s-imguti.adb (Set_Decimal_Digits): Add pragma to exempt Unassigned_OUT_Parameters. (Set_Floating_Invalid_Value): Likewise.
-
Patrick Bernardi authored
Add documentation for the -Q gnatbind switch in GNAT User's Guide and improve gnatbind's help output for the switch to emphasize that it adds the requested number of stacks to the secondary stack pool generated by the binder. gcc/ada/ * bindusg.adb (Display): Make it clear -Q adds to the number of secondary stacks generated by the binder. * doc/gnat_ugn/building_executable_programs_with_gnat.rst: Document the -Q gnatbind switch and fix references to old runtimes. * gnat-style.texi: Regenerate. * gnat_rm.texi: Regenerate. * gnat_ugn.texi: Regenerate.
-
Ronan Desplanques authored
This patch is intended as a readability improvement. It doesn't change the behavior of the compiler. gcc/ada/ * sem_ch3.adb (Constrain_Array): Replace manual list length computation by call to List_Length.
-
Piotr Trojanek authored
gcc/ada/ * exp_aggr.adb (Expand_Container_Aggregate): Simplify with "No".
-
Lewis Hyatt authored
As noted on the PR, commit r13-1544, the fix for PR53431, did not handle the specific case of -Wunknown-pragmas, because that warning is issued during preprocessing, but not by libcpp directly (it comes from the cb_def_pragma callback). Address that by handling this pragma in addition to libcpp pragmas during the early pragma handler. gcc/c-family/ChangeLog: PR c++/89038 * c-pragma.cc (handle_pragma_diagnostic_impl): Handle -Wunknown-pragmas during early processing. gcc/testsuite/ChangeLog: PR c++/89038 * c-c++-common/cpp/Wunknown-pragmas-1.c: New test.
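A minimal sketch of the situation this commit targets follows; it is not the new test itself. The pragma name `hypothetical_vendor_pragma` and the function `pragma_demo` are made up. Going by the commit message, the unknown pragma inside the push/pop region should not produce a -Wunknown-pragmas warning even when that warning is enabled on the command line.

```c
#include <assert.h>

/* Suppress -Wunknown-pragmas for a limited region only. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunknown-pragmas"
#pragma hypothetical_vendor_pragma enable  /* made-up name, unknown to GCC */
#pragma GCC diagnostic pop

/* The pragmas only affect diagnostics, so the code behaves as usual. */
int pragma_demo(void)
{
    return 42;
}
```

An unknown pragma placed after the `pop` would be subject to the warning again.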
-
Lewis Hyatt authored
This PR was fixed by r12-4797 and r12-5454. Add test coverage from the PR that is not represented elsewhere. gcc/testsuite/ChangeLog: PR preprocessor/82335 * c-c++-common/cpp/diagnostic-pragma-3.c: New test.
-
Tamar Christina authored
As the testcase shows, when a PHI node dominates the loop there is no new definition inside the loop. As such there would be no PHI nodes to update. When we maintain LCSSA form we create an intermediate node in between the two loops to thread along the value. However, later on when we update the second loop we don't have any PHI nodes to update and so adjust_phi_and_debug_stmts does nothing. This leaves us with an incorrect PHI node. Normally this does nothing and just gets ignored. But in the case of the vUSE chain we end up corrupting the chain. As such, whenever a PHI node's argument dominates the loop, we should remove the newly created PHI node after edge redirection. The one exception to this is when the loop has been versioned. In such cases the versioned loop may not use the value but the second loop can. When this happens and we add the loop guard, it can't find the original value for use inside the guard block unless the join block has the PHI. The next refactoring in the series moves the formation of the guard block inside peeling itself. Here we have all the information and wouldn't need to re-create it later. gcc/ChangeLog: PR tree-optimization/111860 * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Remove PHI nodes that dominate loop. gcc/testsuite/ChangeLog: PR tree-optimization/111860 * gcc.dg/vect/pr111860.c: New test.
-
Richard Biener authored
The following implements SLP vectorization support for gathers without relying on IFNs being pattern detected (and supported by the target). That includes support for emulated gathers but also the legacy x86 builtin path. PR tree-optimization/111131 * tree-vect-loop.cc (update_epilogue_loop_vinfo): Make sure to update all gather/scatter stmt DRs, not only those that eventually got VMAT_GATHER_SCATTER set. * tree-vect-slp.cc (_slp_oprnd_info::first_gs_info): Add. (vect_get_and_check_slp_defs): Handle gathers/scatters, adding the offset as SLP operand and comparing base and scale. (vect_build_slp_tree_1): Handle gathers. (vect_build_slp_tree_2): Likewise. * gcc.dg/vect/vect-gather-1.c: Now expected to vectorize everywhere. * gcc.dg/vect/vect-gather-2.c: Expected to not SLP anywhere. Massage the scale case to more reliably produce a different one. Scan for the specific messages. * gcc.dg/vect/vect-gather-3.c: Masked gather is also supported for AVX2, but not emulated. * gcc.dg/vect/vect-gather-4.c: Expected to not SLP anywhere. Massage to more properly ensure this. * gcc.dg/vect/tsvc/vect-tsvc-s353.c: Expect to vectorize everywhere.
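For readers unfamiliar with the term, a gather is a load whose address comes from an index that is itself loaded from memory. The sketch below shows a two-lane group of such loads, which is the general kind of code SLP gather support is about. It is not one of the listed testcases, the function name is made up, and whether a given GCC version and target vectorizes it is not something this example asserts.

```c
#include <assert.h>
#include <stddef.h>

/* Each iteration stores two adjacent elements of a[]; both values are
   read from b[] through indices loaded from idx[], so each is a gather. */
void gather_pairs(double *restrict a, const double *restrict b,
                  const int *restrict idx, size_t n)
{
    assert(a != NULL && b != NULL && idx != NULL);
    for (size_t i = 0; i < n; i++) {
        a[2 * i]     = b[idx[2 * i]];
        a[2 * i + 1] = b[idx[2 * i + 1]];
    }
}
```

The two statements share the same base `b` and the same element size, which corresponds to the "comparing base and scale" step mentioned in the ChangeLog entry.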
-
Richard Biener authored
The following moves the builtin decl gather vectorization path along the internal function and emulated gather vectorization paths, simplifying the existing function down to generating the call and required conversions to the actual argument types. This thereby exposes the unique support of two times larger number of offset or data vector lanes. It also makes the code path handle SLP in principle (but SLP build needs adjustments for this, patch coming). * tree-vect-stmts.cc (vect_build_gather_load_calls): Rename to ... (vect_build_one_gather_load_call): ... this. Refactor, inline widening/narrowing support ... (vectorizable_load): ... here, do gather vectorization with builtin decls along other gather vectorization.
-
Alex Coplan authored
This patch generalises the TFmode load/store pair patterns to TImode and TDmode. This brings them in line with the DXmode patterns, and uses the same technique with separate mode iterators (TX and TX2) to allow for distinct modes in each arm of the load/store pair. For example, in combination with the post-RA load/store pair fusion pass in the following patch, this improves the codegen for the following varargs testcase involving TImode stores: void g(void *); int foo(int x, ...) { __builtin_va_list ap; __builtin_va_start (ap, x); g(&ap); __builtin_va_end (ap); } from: foo: .LFB0: stp x29, x30, [sp, -240]! .LCFI0: mov w9, -56 mov w8, -128 mov x29, sp add x10, sp, 176 stp x1, x2, [sp, 184] add x1, sp, 240 add x0, sp, 16 stp x1, x1, [sp, 16] str x10, [sp, 32] stp w9, w8, [sp, 40] str q0, [sp, 48] str q1, [sp, 64] str q2, [sp, 80] str q3, [sp, 96] str q4, [sp, 112] str q5, [sp, 128] str q6, [sp, 144] str q7, [sp, 160] stp x3, x4, [sp, 200] stp x5, x6, [sp, 216] str x7, [sp, 232] bl g ldp x29, x30, [sp], 240 .LCFI1: ret to: foo: .LFB0: stp x29, x30, [sp, -240]! .LCFI0: mov w9, -56 mov w8, -128 mov x29, sp add x10, sp, 176 stp x1, x2, [sp, 184] add x1, sp, 240 add x0, sp, 16 stp x1, x1, [sp, 16] str x10, [sp, 32] stp w9, w8, [sp, 40] stp q0, q1, [sp, 48] stp q2, q3, [sp, 80] stp q4, q5, [sp, 112] stp q6, q7, [sp, 144] stp x3, x4, [sp, 200] stp x5, x6, [sp, 216] str x7, [sp, 232] bl g ldp x29, x30, [sp], 240 .LCFI1: ret Note that this patch isn't needed if we only use the mode canonicalization approach in the new ldp fusion pass (since we canonicalize T{I,F,D}mode to V16QImode), but we seem to get slightly better performance with mode canonicalization disabled (see --param=aarch64-ldp-canonicalize-modes in the following patch). gcc/ChangeLog: * config/aarch64/aarch64.md (load_pair_dw_tftf): Rename to ... (load_pair_dw_<TX:mode><TX2:mode>): ... this. (store_pair_dw_tftf): Rename to ... (store_pair_dw_<TX:mode><TX2:mode>): ... this.
* config/aarch64/iterators.md (TX2): New.
-