  1. Oct 20, 2023
    • RISC-V: Refactor and cleanup vsetvl pass · 29331e72
      Lehua Ding authored
      This patch refactors and cleans up the vsetvl pass in order to make the code
      easier to modify and understand. This patch does several things:
      
      1. Introduce a virtual CFG for vsetvl infos. Phases 1, 2 and 3 only maintain
         and modify this virtual CFG; Phase 4 performs insertion, modification and
         deletion of vsetvl insns based on it. A basic block in the virtual CFG
         is called a vsetvl_block_info and the vsetvl information inside it is
         called a vsetvl_info.
      2. Combine Phases 1 and 2 into a single Phase 1 and unify the demand system.
         This phase only fuses local vsetvl infos in the forward direction.
      3. Refactor Phase 3: the decision whether to uplift a vsetvl info to a
         predecessor basic block now uses a more unified method, namely whether
         the reaching vsetvl definitions contain a vsetvl info compatible with it.
      4. Place all modifications of the RTL in Phase 4 and Phase 5.
         Phase 4 is responsible for inserting, modifying and deleting vsetvl
         instructions based on the fully optimized vsetvl infos. Phase 5 removes the
         avl operand from the RVV instructions and removes the unused dest operand
         register from the vsetvl insns.
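
The forward local fusion described in item 2 can be pictured with a toy model. This is only a sketch under loud assumptions: `toy_info`, `compatible`, `fuse` and `fuse_forward` are hypothetical names, and the two optional fields stand in for the much richer demand system of the real pass. It is not the code in riscv-vsetvl.cc.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for a vsetvl info: each field is a demand that may
// or may not be present.
struct toy_info {
  std::optional<int> sew;    // demanded element width, if any
  std::optional<int> ratio;  // demanded SEW/LMUL ratio, if any
};

// Two demands agree when at least one side does not care or both match.
static bool agree(const std::optional<int>& a, const std::optional<int>& b) {
  return !a || !b || *a == *b;
}

bool compatible(const toy_info& prev, const toy_info& next) {
  return agree(prev.sew, next.sew) && agree(prev.ratio, next.ratio);
}

// The fused info carries the union of both sets of demands.
toy_info fuse(const toy_info& prev, const toy_info& next) {
  return {prev.sew ? prev.sew : next.sew,
          prev.ratio ? prev.ratio : next.ratio};
}

// Walk the infos of one block front to back, folding each info into the
// previous one whenever the two are compatible.
std::vector<toy_info> fuse_forward(const std::vector<toy_info>& block) {
  std::vector<toy_info> out;
  for (const toy_info& cur : block) {
    if (!out.empty() && compatible(out.back(), cur))
      out.back() = fuse(out.back(), cur);
    else
      out.push_back(cur);
  }
  return out;
}
```

In this toy model, a block whose infos demand SEW 8, then ratio 8, then SEW 16 ends up with two infos: the first two fuse, and the third conflicts on SEW and starts a new one.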
      
      These modifications resulted in some testcases needing to be updated. The reasons
      for updating are summarized below:
      
      1. more optimized
         vlmax_back_prop-{25,26}.c
         vlmax_conflict-{3,12}.c/vsetvl-{13,23}.c/
         avl_single-{23,84,95}.c/pr109773-1.c
      2. less unnecessary fusion
         avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
      3. local fuse direction (backward -> forward)
         scalar_move-1.c
      4. add some bugfix testcases.
         pr111037-{3,4}.c
         avl_single-{89,104,105,106,107,108,109}.c
      
      	PR target/111037
      	PR target/111234
      	PR target/111725
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry): New.
      	(debug): Removed.
      	(compute_reaching_defintion): New.
      	(enum vsetvl_type): Moved.
      	(vlmax_avl_p): Moved.
      	(enum emit_type): Moved.
      	(vlmul_to_str): Moved.
      	(vlmax_avl_insn_p): Removed.
      	(policy_to_str): Moved.
      	(loop_basic_block_p): Removed.
      	(valid_sew_p): Removed.
      	(vsetvl_insn_p): Moved.
      	(vsetvl_vtype_change_only_p): Removed.
      	(after_or_same_p): Removed.
      	(before_p): Removed.
      	(anticipatable_occurrence_p): Removed.
      	(available_occurrence_p): Removed.
      	(insn_should_be_added_p): Removed.
      	(get_all_sets): Moved.
      	(get_same_bb_set): Moved.
      	(gen_vsetvl_pat): Removed.
      	(calculate_vlmul): Moved.
      	(get_max_int_sew): New.
      	(emit_vsetvl_insn): Removed.
      	(get_max_float_sew): New.
      	(eliminate_insn): Removed.
      	(insert_vsetvl): Removed.
      	(count_regno_occurrences): Moved.
      	(get_vl_vtype_info): Removed.
      	(enum def_type): Moved.
      	(validate_change_or_fail): Moved.
      	(change_insn): Removed.
      	(get_all_real_uses): Moved.
      	(get_forward_read_vl_insn): Removed.
      	(get_backward_fault_first_load_insn): Removed.
      	(change_vsetvl_insn): Removed.
      	(avl_source_has_vsetvl_p): Removed.
      	(source_equal_p): Moved.
      	(calculate_sew): Removed.
      	(same_equiv_note_p): Moved.
      	(get_expr_id): New.
      	(incompatible_avl_p): Removed.
      	(get_regno): New.
      	(different_sew_p): Removed.
      	(get_bb_index): New.
      	(different_lmul_p): Removed.
      	(has_no_uses): Moved.
      	(different_ratio_p): Removed.
      	(different_tail_policy_p): Removed.
      	(different_mask_policy_p): Removed.
      	(possible_zero_avl_p): Removed.
      	(enum demand_flags): New.
      	(second_ratio_invalid_for_first_sew_p): Removed.
      	(second_ratio_invalid_for_first_lmul_p): Removed.
      	(enum class): New.
      	(float_insn_valid_sew_p): Removed.
      	(second_sew_less_than_first_sew_p): Removed.
      	(first_sew_less_than_second_sew_p): Removed.
      	(class vsetvl_info): New.
      	(compare_lmul): Removed.
      	(second_lmul_less_than_first_lmul_p): Removed.
      	(second_ratio_less_than_first_ratio_p): Removed.
      	(DEF_INCOMPATIBLE_COND): Removed.
      	(greatest_sew): Removed.
      	(first_sew): Removed.
      	(second_sew): Removed.
      	(first_vlmul): Removed.
      	(second_vlmul): Removed.
      	(first_ratio): Removed.
      	(second_ratio): Removed.
      	(vlmul_for_first_sew_second_ratio): Removed.
      	(vlmul_for_greatest_sew_second_ratio): Removed.
      	(ratio_for_second_sew_first_vlmul): Removed.
      	(class vsetvl_block_info): New.
      	(DEF_SEW_LMUL_FUSE_RULE): New.
      	(always_unavailable): Removed.
      	(avl_unavailable_p): Removed.
      	(class demand_system): New.
      	(sew_unavailable_p): Removed.
      	(lmul_unavailable_p): Removed.
      	(ge_sew_unavailable_p): Removed.
      	(ge_sew_lmul_unavailable_p): Removed.
      	(ge_sew_ratio_unavailable_p): Removed.
      	(DEF_UNAVAILABLE_COND): Removed.
      	(same_sew_lmul_demand_p): Removed.
      	(propagate_avl_across_demands_p): Removed.
      	(reg_available_p): Removed.
      	(support_relaxed_compatible_p): Removed.
      	(demands_can_be_fused_p): Removed.
      	(earliest_pred_can_be_fused_p): Removed.
      	(vsetvl_dominated_by_p): Removed.
      	(avl_info::avl_info): Removed.
      	(avl_info::single_source_equal_p): Removed.
      	(avl_info::multiple_source_equal_p): Removed.
      	(DEF_SEW_LMUL_RULE): New.
      	(avl_info::operator=): Removed.
      	(avl_info::operator==): Removed.
      	(DEF_POLICY_RULE): New.
      	(avl_info::operator!=): Removed.
      	(avl_info::has_non_zero_avl): Removed.
      	(vl_vtype_info::vl_vtype_info): Removed.
      	(vl_vtype_info::operator==): Removed.
      	(DEF_AVL_RULE): New.
      	(vl_vtype_info::operator!=): Removed.
      	(vl_vtype_info::same_avl_p): Removed.
      	(vl_vtype_info::same_vtype_p): Removed.
      	(vl_vtype_info::same_vlmax_p): Removed.
      	(vector_insn_info::operator>=): Removed.
      	(vector_insn_info::operator==): Removed.
      	(class pre_vsetvl): New.
      	(vector_insn_info::parse_insn): Removed.
      	(vector_insn_info::compatible_p): Removed.
      	(vector_insn_info::skip_avl_compatible_p): Removed.
      	(vector_insn_info::compatible_avl_p): Removed.
      	(vector_insn_info::compatible_vtype_p): Removed.
      	(vector_insn_info::available_p): Removed.
      	(vector_insn_info::fuse_avl): Removed.
      	(vector_insn_info::fuse_sew_lmul): Removed.
      	(vector_insn_info::fuse_tail_policy): Removed.
      	(vector_insn_info::fuse_mask_policy): Removed.
      	(vector_insn_info::local_merge): Removed.
      	(vector_insn_info::global_merge): Removed.
      	(vector_insn_info::get_avl_or_vl_reg): Removed.
      	(vector_insn_info::update_fault_first_load_avl): Removed.
      	(vector_insn_info::dump): Removed.
      	(vector_infos_manager::vector_infos_manager): Removed.
      	(vector_infos_manager::create_expr): Removed.
      	(vector_infos_manager::get_expr_id): Removed.
      	(vector_infos_manager::all_same_ratio_p): Removed.
      	(vector_infos_manager::all_avail_in_compatible_p): Removed.
      	(vector_infos_manager::all_same_avl_p): Removed.
      	(vector_infos_manager::expr_set_num): Removed.
      	(vector_infos_manager::release): Removed.
      	(vector_infos_manager::create_bitmap_vectors): Removed.
      	(vector_infos_manager::free_bitmap_vectors): Removed.
      	(vector_infos_manager::dump): Removed.
      	(class pass_vsetvl): Adjust.
      	(pass_vsetvl::get_vector_info): Removed.
      	(pass_vsetvl::get_block_info): Removed.
      	(pass_vsetvl::update_vector_info): Removed.
      	(pass_vsetvl::update_block_info): Removed.
      	(pre_vsetvl::compute_avl_def_data): New.
      	(pass_vsetvl::simple_vsetvl): Removed.
      	(pass_vsetvl::compute_local_backward_infos): Removed.
      	(pass_vsetvl::need_vsetvl): Removed.
      	(pass_vsetvl::transfer_before): Removed.
      	(pass_vsetvl::transfer_after): Removed.
      	(pre_vsetvl::compute_vsetvl_def_data): New.
      	(pass_vsetvl::emit_local_forward_vsetvls): Removed.
      	(pass_vsetvl::prune_expressions): Removed.
      	(pass_vsetvl::compute_local_properties): Removed.
      	(pre_vsetvl::compute_lcm_local_properties): New.
      	(pass_vsetvl::earliest_fusion): Removed.
      	(pre_vsetvl::fuse_local_vsetvl_info): New.
      	(pass_vsetvl::vsetvl_fusion): Removed.
      	(pass_vsetvl::can_refine_vsetvl_p): Removed.
      	(pre_vsetvl::earliest_fuse_vsetvl_info): New.
      	(pass_vsetvl::refine_vsetvls): Removed.
      	(pass_vsetvl::cleanup_vsetvls): Removed.
      	(pass_vsetvl::commit_vsetvls): Removed.
      	(pass_vsetvl::pre_vsetvl): Removed.
      	(pass_vsetvl::get_vsetvl_at_end): Removed.
      	(local_avl_compatible_p): Removed.
      	(pass_vsetvl::local_eliminate_vsetvl_insn): Removed.
      	(pre_vsetvl::pre_global_vsetvl_info): New.
      	(get_first_vsetvl_before_rvv_insns): Removed.
      	(pass_vsetvl::global_eliminate_vsetvl_insn): Removed.
      	(pre_vsetvl::emit_vsetvl): New.
      	(pass_vsetvl::ssa_post_optimization): Removed.
      	(pre_vsetvl::cleaup): New.
      	(pre_vsetvl::remove_avl_operand): New.
      	(pass_vsetvl::df_post_optimization): Removed.
      	(pre_vsetvl::remove_unused_dest_operand): New.
      	(pass_vsetvl::init): Removed.
      	(pass_vsetvl::done): Removed.
      	(pass_vsetvl::compute_probabilities): Removed.
      	(pass_vsetvl::lazy_vsetvl): Adjust.
      	(pass_vsetvl::execute): Adjust.
      	* config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed.
      	(DEF_SEW_LMUL_RULE): New.
      	(DEF_SEW_LMUL_FUSE_RULE): Removed.
      	(DEF_POLICY_RULE): New.
      	(DEF_UNAVAILABLE_COND): Removed.
      	(DEF_AVL_RULE): New demand type.
      	(sew_lmul): New demand type.
      	(ratio_only): New demand type.
      	(sew_only): New demand type.
      	(ge_sew): New demand type.
      	(ratio_and_ge_sew): New demand type.
      	(tail_mask_policy): New demand type.
      	(tail_policy_only): New demand type.
      	(mask_policy_only): New demand type.
      	(ignore_policy): New demand type.
      	(avl): New demand type.
      	(non_zero_avl): New demand type.
      	(ignore_avl): New demand type.
      	* config/riscv/t-riscv: Removed riscv-vsetvl.h
      	* config/riscv/riscv-vsetvl.h: Removed.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust.
      	* gcc.target/riscv/rvv/base/pr111037-1.c: Moved to...
      	* gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here.
      	* gcc.target/riscv/rvv/base/pr111037-2.c: Moved to...
      	* gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here.
      	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-106.c: New test.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-107.c: New test.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-108.c: New test.
      	* gcc.target/riscv/rvv/vsetvl/avl_single-109.c: New test.
      	* gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test.
      	* gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test.
    • return edge in make_eh_edges · df252e0f
      Alexandre Oliva authored
      The need to initialize edge probabilities has made make_eh_edges
      undesirably hard to use.  I suppose we don't want make_eh_edges to
      initialize the probability of the newly-added edge itself, so that the
      caller takes care of it, but identifying the added edge in need of
      adjustments is inefficient and cumbersome.  Change make_eh_edges so
      that it returns the added edge.
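
The interface change can be illustrated with a toy graph. This is a sketch under assumptions, not GCC code: `toy_edge`, `toy_cfg` and `add_eh_edge` are hypothetical names. The helper hands back the edge it just created, so the caller can set the probability without searching for the edge again.

```cpp
#include <list>

// Hypothetical edge type; a negative probability means "not yet initialised".
struct toy_edge {
  int src;
  int dest;
  double probability = -1.0;
};

struct toy_cfg {
  // std::list keeps element addresses stable as more edges are added.
  std::list<toy_edge> edges;

  // Add an edge to a landing pad and return it. The probability is
  // deliberately left for the caller to fill in.
  toy_edge* add_eh_edge(int src, int landing_pad) {
    edges.push_back(toy_edge{src, landing_pad});
    return &edges.back();
  }
};

// Caller side: the returned pointer identifies exactly the edge that needs
// its probability adjusted.
void add_unlikely_eh_edge(toy_cfg& cfg, int src, int landing_pad) {
  toy_edge* e = cfg.add_eh_edge(src, landing_pad);
  e->probability = 0.0;
}
```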
      
      
      for  gcc/ChangeLog
      
      	* tree-eh.cc (make_eh_edges): Return the new edge.
      	* tree-eh.h (make_eh_edges): Likewise.
    • c++: indirect change of active union member in constexpr [PR101631,PR102286] · 1d260ab0
      Nathaniel Shead authored
      
      This patch adds checks for attempting to change the active member of a
      union by methods other than a member access expression.
      
      To be able to properly distinguish `*(&u.a) = ` from `u.a = `, this
      patch redoes the solution for c++/59950 to avoid extraneous *&; it
      seems that the only case that needed the workaround was when copying
      empty classes.
      
      This patch also ensures that constructors for a union field mark that
      field as the active member before entering the call itself; this ensures
      that modifications of the field within the constructor's body don't
      cause false positives (as these will not appear to be member access
      expressions). This means that we no longer need to start the lifetime of
      empty union members after the constructor body completes.
      
      As a drive-by fix, this patch also ensures that value-initialised unions
      are considered to have activated their initial member for the purpose of
      checking stores and accesses, which catches some additional mistakes
      pre-C++20.
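
A small sketch of the distinction, assuming a compiler that includes this patch. The union `U` and both function names are made up for illustration. The C++20 part is guarded so the file still compiles in older modes.

```cpp
union U {
  int a;
  float b;
};

// Empty-brace initialisation makes the first member active, so reading it
// during constant evaluation is fine.
constexpr int read_initial() {
  U u{};
  return u.a;
}
static_assert(read_initial() == 0, "first member is active and zero");

#if __cplusplus >= 202002L
// C++20: an assignment through a member access expression may switch the
// active member during constant evaluation.
constexpr float switch_directly() {
  U u{};
  u.b = 2.0f;
  return u.b;
}
static_assert(switch_directly() == 2.0f, "direct switch is allowed");

// The indirect spelling, *(&u.b) = 2.0f;, is not a member access expression.
// That is the form the patch is meant to diagnose inside a constant
// expression, so it is left out of this compilable sketch.
#endif
```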
      
      	PR c++/101631
      	PR c++/102286
      
      gcc/cp/ChangeLog:
      
      	* call.cc (build_over_call): Fold more indirect refs for trivial
      	assignment op.
      	* class.cc (type_has_non_deleted_trivial_default_ctor): Create.
      	* constexpr.cc (cxx_eval_call_expression): Start lifetime of
      	union member before entering constructor.
      	(cxx_eval_component_reference): Check against first member of
      	value-initialised union.
      	(cxx_eval_store_expression): Activate member for
      	value-initialised union. Check for accessing inactive union
      	member indirectly.
      	* cp-tree.h (type_has_non_deleted_trivial_default_ctor):
      	Forward declare.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp1y/constexpr-89336-3.C: Fix union initialisation.
      	* g++.dg/cpp1y/constexpr-union6.C: New test.
      	* g++.dg/cpp1y/constexpr-union7.C: New test.
      	* g++.dg/cpp2a/constexpr-union2.C: New test.
      	* g++.dg/cpp2a/constexpr-union3.C: New test.
      	* g++.dg/cpp2a/constexpr-union4.C: New test.
      	* g++.dg/cpp2a/constexpr-union5.C: New test.
      	* g++.dg/cpp2a/constexpr-union6.C: New test.
      
      Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
      Reviewed-by: Jason Merrill <jason@redhat.com>
    • c++: Improve diagnostics for constexpr cast from void* · b69ee500
      Nathaniel Shead authored
      
      This patch improves the errors given when casting from void* in C++26 to
      include the expected type if the types of the pointed-to objects were
      not similar. It also ensures (for all standard modes) that void* casts
      are checked even for DECL_ARTIFICIAL declarations, such as
      lifetime-extended temporaries, and is only ignored for cases where we
      know it's OK (e.g. source_location::current) or have no other choice
      (heap-allocated data).
      
      gcc/cp/ChangeLog:
      
      	* constexpr.cc (is_std_source_location_current): New.
      	(cxx_eval_constant_expression): Only ignore cast from void* for
      	specific cases and improve other diagnostics.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp0x/constexpr-cast4.C: New test.
      
      Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
      Reviewed-by: Marek Polacek <polacek@redhat.com>
      Reviewed-by: Jason Merrill <jason@redhat.com>
    • Daily bump. · c85f7481
      GCC Administrator authored
  2. Oct 19, 2023
    • c++: small tweak for cp_fold_r · 4d81962b
      Marek Polacek authored
      This patch is an optimization tweak for cp_fold_r.  If we cp_fold_r the
      COND_EXPR's op0 first, we may be able to evaluate it to a constant if -O.
      cp_fold has:
      
          if (callee && DECL_DECLARED_CONSTEXPR_P (callee)
              && !flag_no_inline)
      ...
              r = maybe_constant_value (x, /*decl=*/NULL_TREE,
      
      flag_no_inline is 1 for -O0:
      
        if (opts->x_optimize == 0)
          {
            /* Inlining does not work if not optimizing,
               so force it not to be done.  */
            opts->x_warn_inline = 0;
            opts->x_flag_no_inline = 1;
          }
      
      but otherwise it's 0 and cp_fold will maybe_constant_value calls to
      constexpr functions.  And if it doesn't, then folding the COND_EXPR
      will keep both arms, and we can avoid calling maybe_constant_value.
      
      gcc/cp/ChangeLog:
      
      	* cp-gimplify.cc (cp_fold_r): Don't call maybe_constant_value.
    • doc: Update contrib.texi · 86d0b086
      Marek Polacek authored
      I noticed that Patrick is missing here.
      
      gcc/ChangeLog:
      
      	* doc/contrib.texi: Add entry for Patrick Palka.
    • vect: Use inbranch simdclones in masked loops · d8e4e7de
      Andre Vieira authored
      This patch enables the compiler to use inbranch simdclones when generating
      masked loops in autovectorization.
      
      gcc/ChangeLog:
      
      	* omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function
      	compatible with mask parameters in clone.
      	* tree-vect-stmts.cc (vect_build_all_ones_mask): Allow vector boolean
      	typed masks.
      	(vectorizable_simd_clone_call): Enable the use of masked clones in
      	fully masked loops.
    • vect: don't allow fully masked loops with non-masked simd clones [PR 110485] · 8b704ed0
      Andre Vieira authored
      When analyzing a loop and choosing a simdclone to use it is possible to choose
      a simdclone that cannot be used 'inbranch' for a loop that can use partial
      vectors.  This may lead to the vectorizer deciding to use partial vectors which
      are not supported for notinbranch simd clones.  This patch fixes that by
      disabling the use of partial vectors once a notinbranch simd clone has been
      selected.
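
For readers unfamiliar with the two kinds of clone, here is a minimal sketch of the source-level construct involved. The function names are made up. The `inbranch` clause requests only masked clones, which can be called under a condition; `notinbranch` would request only unmasked ones. Without OpenMP SIMD support enabled the pragmas are ignored and the code runs as plain scalar C++.

```cpp
#include <cstddef>
#include <vector>

// Request masked ("inbranch") SIMD clones of this function.
#pragma omp declare simd inbranch
int scale(int x) { return 2 * x; }

// The call sits under a condition, so a vectorized version of this loop
// needs a clone that accepts a mask.
std::vector<int> scale_odd(std::vector<int> v) {
  #pragma omp simd
  for (std::size_t i = 0; i < v.size(); ++i)
    if (v[i] % 2 != 0)
      v[i] = scale(v[i]);
  return v;
}
```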
      
      gcc/ChangeLog:
      
      	PR tree-optimization/110485
      	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Disable partial
      	vectors usage if a notinbranch simdclone has been selected.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/gomp/pr110485.c: New test.
    • vect: Fix vect_get_smallest_scalar_type for simd clones · c9ce8467
      Andre Vieira authored
      The vect_get_smallest_scalar_type helper function was using any argument to a
      simd clone call when trying to determine the smallest scalar type that would be
      vectorized.  This included the function pointer type in a MASK_CALL for
      instance, and would result in the wrong type being selected.  Instead this
      patch special cases simd_clone_call's and uses only scalar types of the
      original function that get transformed into vector types.
      
      gcc/ChangeLog:
      
      	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Special case
      	simd clone calls and only use types that are mapped to vectors.
      	(simd_clone_call_p): New helper function.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/vect/vect-simd-clone-16f.c: Remove unnecessary differentiation
      	between targets with different pointer sizes.
      	* gcc.dg/vect/vect-simd-clone-17f.c: Likewise.
      	* gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
    • parloops: Allow poly nit and bound · 53d40858
      Andre Vieira authored
      Teach parloops how to handle a poly nit and bound ahead of the changes to
      enable non-constant simdlen.
      
      gcc/ChangeLog:
      
      	* tree-parloops.cc (try_transform_to_exit_first_loop_alt): Accept
      	poly NIT and ALT_BOUND.
    • parloops: Copy target and optimizations when creating a function clone · 87d97e26
      Andre Vieira authored
      SVE simd clones need to be compiled with an SVE target enabled or the argument
      types will not be created properly. To achieve this we need to copy
      DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the
      clones.  I decided it was probably also a good idea to copy
      DECL_FUNCTION_SPECIFIC_OPTIMIZATION in case the original function is meant to
      be compiled with specific optimization options.
      
      gcc/ChangeLog:
      
      	* tree-parloops.cc (create_loop_fn): Copy specific target and
      	optimization options to clone.
    • omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS · 79a50a17
      Andre Vieira authored
      Refactor simd clone handling code ahead of support for poly simdlen.
      
      gcc/ChangeLog:
      
      	* omp-simd-clone.cc (simd_clone_subparts): Remove.
      	(simd_clone_init_simd_arrays): Replace simd_clone_subparts with
      	TYPE_VECTOR_SUBPARTS.
      	(ipa_simd_modify_function_body): Likewise.
      	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Likewise.
      	(simd_clone_subparts): Remove.
    • libstdc++: [_Hashtable] Do not reuse untrusted cached hash code · c714b4d3
      François Dumont authored
      On merge, reuse a merged node's possibly cached hash code only if we are on the
      same type of hash and this hash is stateless.
      
      Usage of function pointers or std::function as hash functor will prevent reusing
      cached hash code.
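
The user-visible situation is a merge between containers whose hash functors differ. A short sketch, assuming C++17; `length_hash`, `dst_map`, `src_map` and `merge_maps` are illustrative names. Because the source hashes by string length and the destination uses std::hash, any hash code cached in a source node means nothing to the destination and has to be recomputed.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// A deliberately different hash: strings of equal length collide.
struct length_hash {
  std::size_t operator()(const std::string& s) const { return s.size(); }
};

using dst_map = std::unordered_map<std::string, int>;
using src_map = std::unordered_map<std::string, int, length_hash>;

// merge accepts a source with a different hasher; nodes whose keys already
// exist in the destination stay behind in the source.
dst_map merge_maps(dst_map dst, src_map src) {
  dst.merge(src);
  return dst;
}
```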
      
      libstdc++-v3/ChangeLog
      
      	* include/bits/hashtable_policy.h
      	(_Hash_code_base::_M_hash_code(const _Hash&, const _Hash_node_value<>&)): Remove.
      	(_Hash_code_base::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): Remove.
      	* include/bits/hashtable.h
      	(_M_src_hash_code<_H2>(const _H2&, const key_type&, const __node_value_type&)): New.
      	(_M_merge_unique<>, _M_merge_multi<>): Use latter.
      	* testsuite/23_containers/unordered_map/modifiers/merge.cc
      	(test04, test05, test06): New test cases.
    • c: Fix ICE when an argument was an error mark [PR100532] · 2454ba9e
      Andrew Pinski authored
      In the case of convert_argument, we would return the same expression
      back rather than error_mark_node after the error message about
      trying to convert to an incomplete type. This causes issues in
      the gimplifier trying to see if another conversion is needed.
      
      The code here dates back to before the revision history too so
      it might be the case it never noticed we should return an error_mark_node.
      
      Bootstrapped and tested on x86_64-linux-gnu with no regressions.
      
      	PR c/100532
      
      gcc/c/ChangeLog:
      
      	* c-typeck.cc (convert_argument): After erroring out
      	about an incomplete type return error_mark_node.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/pr100532-1.c: New test.
    • c: Don't warn about converting NULL to different sso endian [PR104822] · 9f33e4c5
      Andrew Pinski authored
      Just as we don't warn when converting a NULL pointer constant to a
      different named address space, we should not warn when converting it to
      a different SSO endianness either.
      This adds the simple check.
      
      Bootstrapped and tested on x86_64-linux-gnu with no regressions.
      
      	PR c/104822
      
      gcc/c/ChangeLog:
      
      	* c-typeck.cc (convert_for_assignment): Check for null pointer
      	before warning about an incompatible scalar storage order.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/sso-18.c: New test.
      	* gcc.dg/sso-19.c: New test.
    • ABOUT-GCC-NLS: add usage guidance · 00e7c49f
      Jason Merrill authored
      gcc/ChangeLog:
      
      	* ABOUT-GCC-NLS: Add usage guidance.
    • diagnostic: rename new permerror overloads · 1ec36bcd
      Jason Merrill authored
      While checking another change, I noticed that the new permerror overloads
      break gettext with "permerror used incompatibly as both
       --keyword=permerror:2 --flag=permerror:2:gcc-internal-format and
       --keyword=permerror:3 --flag=permerror:3:gcc-internal-format".  So let's
      change the name.
      
      gcc/ChangeLog:
      
      	* diagnostic-core.h (permerror): Rename new overloads...
      	(permerror_opt): To this.
      	* diagnostic.cc: Likewise.
      
      gcc/cp/ChangeLog:
      
      	* typeck2.cc (check_narrowing): Adjust.
    • c++: use G_ instead of _ · f53de2ba
      Jason Merrill authored
      Since these strings are passed to error_at, they should be marked for
      translation with G_, like other diagnostic messages, rather than _, which
      forces immediate (redundant) translation.  The use of N_ is less
      problematic, but also imprecise.
      
      gcc/cp/ChangeLog:
      
      	* parser.cc (cp_parser_primary_expression): Use G_.
      	(cp_parser_using_enum): Likewise.
      	* decl.cc (identify_goto): Likewise.
    • ada: Support new SPARK aspect Side_Effects · 04d6c745
      Yannick Moy authored
      SPARK RM 6.1.11 introduces a new aspect Side_Effects to denote
      those functions which may have output parameters, write global
      variables, raise exceptions and not terminate. This adds support
      for this aspect and the corresponding pragma in the frontend.
      
      Handling of this aspect in the frontend is very similar to
      the handling of aspect Extensions_Visible: both are Boolean
      aspects whose expression should be static, they can be specified
      on the same entities, with the same rule of inheritance from
      overridden to overriding primitives for tagged types.
      
      There is no impact on code generation.
      
      gcc/ada/
      
      	* aspects.ads: Add aspect Side_Effects.
      	* contracts.adb (Add_Pre_Post_Condition)
      	(Inherit_Subprogram_Contract): Add support for new contract.
      	* contracts.ads: Update comments.
      	* einfo-utils.adb (Get_Pragma): Add support.
      	* einfo-utils.ads (Prag): Update comment.
      	* errout.ads: Add explain codes.
      	* par-prag.adb (Prag): Add support.
      	* sem_ch13.adb (Analyze_Aspect_Specifications)
      	(Check_Aspect_At_Freeze_Point): Add support.
      	* sem_ch6.adb (Analyze_Subprogram_Body_Helper)
      	(Analyze_Subprogram_Declaration): Call new analysis procedure to
      	check SPARK legality rules.
      	(Analyze_SPARK_Subprogram_Specification): New procedure to check
      	SPARK legality rules. Use an explain code for the error.
      	(Analyze_Subprogram_Specification): Move checks to new subprogram.
      	This code was effectively dead, as the kind for parameters was set
      	to E_Void at this point to detect early references.
      	* sem_ch6.ads (Analyze_Subprogram_Specification): Add new
      	procedure.
      	* sem_prag.adb (Analyze_Depends_In_Decl_Part)
      	(Analyze_Global_In_Decl_Part): Adapt legality check to apply only
      	to functions without side-effects.
      	(Analyze_If_Present): Extract functionality in new procedure
      	Analyze_If_Present_Internal.
      	(Analyze_If_Present_Internal): New procedure to analyze given
      	pragma kind.
      	(Analyze_Pragmas_If_Present): New procedure to analyze given
      	pragma kind associated with a declaration.
      	(Analyze_Pragma): Adapt support for Always_Terminates and
      	Exceptional_Cases. Add support for Side_Effects. Make sure to call
      	Analyze_If_Present to ensure pragma Side_Effects is analyzed prior
      	to analyzing pragmas Global and Depends. Use explain codes for the
      	errors.
      	* sem_prag.ads (Analyze_Pragmas_If_Present): Add new procedure.
      	* sem_util.adb (Is_Function_With_Side_Effects): New query function
      	to determine if a function is a function with side-effects.
      	* sem_util.ads (Is_Function_With_Side_Effects): Same.
      	* snames.ads-tmpl: Declare new names for pragma and aspect.
      	* doc/gnat_rm/implementation_defined_aspects.rst: Document new aspect.
      	* doc/gnat_rm/implementation_defined_pragmas.rst: Document new pragma.
      	* gnat_rm.texi: Regenerate.
    • ada: Refactor code to remove GNATcheck violation · c1fbfe5a
      Sheri Bernstein authored
      Rewrite a for loop containing an exit (which violates the GNATcheck
      rule Exits_From_Conditional_Loops) as a while loop whose condition
      contains the exit criteria.
      Also, move the special case for the first pass through the loop to
      before the loop.
      
      gcc/ada/
      
      	* libgnat/s-imagef.adb (Set_Image_Fixed): Refactor loop.
    • ada: Add pragma Annotate for GNATcheck exemptions · 0f3c6348
      Sheri Bernstein authored
      Exempt the GNATcheck rule "Unassigned_OUT_Parameters"
      with the rationale "the OUT parameter is assigned by component".
      
      gcc/ada/
      
      	* libgnat/s-imguti.adb (Set_Decimal_Digits): Add pragma to exempt
      	Unassigned_OUT_Parameters.
      	(Set_Floating_Invalid_Value): Likewise.
    • ada: Document gnatbind -Q switch · 1555d181
      Patrick Bernardi authored
      Add documentation for the -Q gnatbind switch in GNAT User's Guide and
      improve gnatbind's help output for the switch to emphasize that it adds the
      requested number of stacks to the secondary stack pool generated by the
      binder.
      
      gcc/ada/
      
      	* bindusg.adb (Display): Make it clear -Q adds to the number of
      	secondary stacks generated by the binder.
      	* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
      	Document the -Q gnatbind switch and fix references to old
      	runtimes.
      	* gnat-style.texi: Regenerate.
      	* gnat_rm.texi: Regenerate.
      	* gnat_ugn.texi: Regenerate.
    • ada: Seize opportunity to reuse List_Length · 0c29a990
      Ronan Desplanques authored
      This patch is intended as a readability improvement. It doesn't
      change the behavior of the compiler.
      
      gcc/ada/
      
      	* sem_ch3.adb (Constrain_Array): Replace manual list length
      	computation by call to List_Length.
      0c29a990
    • Piotr Trojanek's avatar
      ada: Simplify "not Present" with "No" · 7b1b787b
      Piotr Trojanek authored
      gcc/ada/
      
      	* exp_aggr.adb (Expand_Container_Aggregate): Simplify with "No".
      7b1b787b
    • Lewis Hyatt's avatar
      c++: Make -Wunknown-pragmas controllable by #pragma GCC diagnostic [PR89038] · 19cc4b9d
      Lewis Hyatt authored
      As noted on the PR, commit r13-1544, the fix for PR53431, did not handle
      the specific case of -Wunknown-pragmas, because that warning is issued
      during preprocessing, but not by libcpp directly (it comes from the
      cb_def_pragma callback).  Address that by handling this pragma in
      addition to libcpp pragmas during the early pragma handler.
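
      A minimal sketch of the usage this is meant to enable is shown below.
      It is not the new testcase; the pragma name hypothetical_vendor_pragma
      and the function answer are made up for illustration.  With a compiler
      that includes this change, the unknown pragma inside the push/pop
      region should not produce a -Wunknown-pragmas warning.

```c
/* Disable -Wunknown-pragmas for a limited region only.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunknown-pragmas"

/* Hypothetical pragma that GCC does not recognise.  */
#pragma hypothetical_vendor_pragma some_option

/* Restore the previous diagnostic state.  */
#pragma GCC diagnostic pop

int answer(void)
{
    return 42;
}
```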
      
      gcc/c-family/ChangeLog:
      
      	PR c++/89038
      	* c-pragma.cc (handle_pragma_diagnostic_impl):  Handle
      	-Wunknown-pragmas during early processing.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/89038
      	* c-c++-common/cpp/Wunknown-pragmas-1.c: New test.
      19cc4b9d
    • Lewis Hyatt's avatar
      libcpp: testsuite: Add test for fixed _Pragma bug [PR82335] · 202a214d
      Lewis Hyatt authored
      This PR was fixed by r12-4797 and r12-5454. Add test coverage from the PR
      that is not represented elsewhere.
      
      gcc/testsuite/ChangeLog:
      
      	PR preprocessor/82335
      	* c-c++-common/cpp/diagnostic-pragma-3.c: New test.
      202a214d
    • Tamar Christina's avatar
      middle-end: don't create LC-SSA PHI variables for PHI nodes who dominate loop · 217a0fcb
      Tamar Christina authored
      As the testcase shows, when a PHI node dominates the loop there is no new
      definition inside the loop.  As such there would be no PHI nodes to update.
      
      When we maintain LCSSA form we create an intermediate node in between the two
      loops to thread along the value.  However later on when we update the second
      loop we don't have any PHI nodes to update and so adjust_phi_and_debug_stmts
      does nothing.   This leaves us with an incorrect phi node.  Normally this does
      nothing and just gets ignored.  But in the case of the vUSE chain we end up
      corrupting the chain.
      
      As such whenever a PHI node's argument dominates the loop, we should remove
      the newly created PHI node after edge redirection.
      
      The one exception to this is when the loop has been versioned.  In such cases
      the versioned loop may not use the value but the second loop can.
      
      When this happens and we add the loop guard, the guard block can't find
      the original value to use unless the join block has the PHI.
      
      The next refactoring in the series moves the formation of the guard block
      inside peeling itself.  Here we have all the information and wouldn't
      need to re-create it later.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/111860
      	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
      	Remove PHI nodes that dominate loop.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/111860
      	* gcc.dg/vect/pr111860.c: New test.
      217a0fcb
    • Richard Biener's avatar
      tree-optimization/111131 - SLP for non-IFN gathers · beab5b95
      Richard Biener authored
      The following implements SLP vectorization support for gathers
      without relying on IFNs being pattern detected (and supported by
      the target).  That includes support for emulated gathers but also
      the legacy x86 builtin path.
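
      For readers unfamiliar with the term, the sketch below shows the kind
      of source loops involved.  It is not one of the vect-gather testcases;
      the function names are hypothetical.  Whether GCC actually vectorizes
      either loop, and by which path, depends on the target and options.

```c
#include <stddef.h>

/* A gather: the address of each load depends on a run-time index.  */
void gather(int *restrict out, const int *restrict data,
            const int *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = data[idx[i]];
}

/* Two gathers per iteration feeding adjacent stores: a group of
   statements that an SLP vectorizer could consider together.  */
void gather_pairs(int *restrict out, const int *restrict data,
                  const int *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = data[idx[2 * i]];
        out[2 * i + 1] = data[idx[2 * i + 1]];
    }
}
```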
      
      	PR tree-optimization/111131
      	* tree-vect-loop.cc (update_epilogue_loop_vinfo): Make
      	sure to update all gather/scatter stmt DRs, not only those
      	that eventually got VMAT_GATHER_SCATTER set.
      	* tree-vect-slp.cc (_slp_oprnd_info::first_gs_info): Add.
      	(vect_get_and_check_slp_defs): Handle gathers/scatters,
      	adding the offset as SLP operand and comparing base and scale.
      	(vect_build_slp_tree_1): Handle gathers.
      	(vect_build_slp_tree_2): Likewise.
      
      	* gcc.dg/vect/vect-gather-1.c: Now expected to vectorize
      	everywhere.
      	* gcc.dg/vect/vect-gather-2.c: Expected to not SLP anywhere.
      	Massage the scale case to more reliably produce a different
      	one.  Scan for the specific messages.
      	* gcc.dg/vect/vect-gather-3.c: Masked gather is also supported
      	for AVX2, but not emulated.
      	* gcc.dg/vect/vect-gather-4.c: Expected to not SLP anywhere.
      	Massage to more properly ensure this.
      	* gcc.dg/vect/tsvc/vect-tsvc-s353.c: Expect to vectorize
      	everywhere.
      beab5b95
    • Richard Biener's avatar
      Refactor x86 vectorized gather path · b068886d
      Richard Biener authored
      The following moves the builtin decl gather vectorization path along
      the internal function and emulated gather vectorization paths,
      simplifying the existing function down to generating the call and
      required conversions to the actual argument types.  This thereby
      exposes the unique support of two times larger number of offset
      or data vector lanes.  It also makes the code path handle SLP
      in principle (but SLP build needs adjustments for this, patch coming).
      
      	* tree-vect-stmts.cc (vect_build_gather_load_calls): Rename
      	to ...
      	(vect_build_one_gather_load_call): ... this.  Refactor,
      	inline widening/narrowing support ...
      	(vectorizable_load): ... here, do gather vectorization
      	with builtin decls along other gather vectorization.
      b068886d
    • Alex Coplan's avatar
      aarch64: Generalise TFmode load/store pair patterns · 947fb34a
      Alex Coplan authored
      This patch generalises the TFmode load/store pair patterns to TImode and
      TDmode.  This brings them in line with the DXmode patterns, and uses the
      same technique with separate mode iterators (TX and TX2) to allow for
      distinct modes in each arm of the load/store pair.
      
      For example, in combination with the post-RA load/store pair fusion pass
      in the following patch, this improves the codegen for the following
      varargs testcase involving TImode stores:
      
      void g(void *);
      int foo(int x, ...)
      {
          __builtin_va_list ap;
          __builtin_va_start (ap, x);
          g(&ap);
          __builtin_va_end (ap);
      }
      
      from:
      
      foo:
      .LFB0:
      	stp	x29, x30, [sp, -240]!
      .LCFI0:
      	mov	w9, -56
      	mov	w8, -128
      	mov	x29, sp
      	add	x10, sp, 176
      	stp	x1, x2, [sp, 184]
      	add	x1, sp, 240
      	add	x0, sp, 16
      	stp	x1, x1, [sp, 16]
      	str	x10, [sp, 32]
      	stp	w9, w8, [sp, 40]
      	str	q0, [sp, 48]
      	str	q1, [sp, 64]
      	str	q2, [sp, 80]
      	str	q3, [sp, 96]
      	str	q4, [sp, 112]
      	str	q5, [sp, 128]
      	str	q6, [sp, 144]
      	str	q7, [sp, 160]
      	stp	x3, x4, [sp, 200]
      	stp	x5, x6, [sp, 216]
      	str	x7, [sp, 232]
      	bl	g
      	ldp	x29, x30, [sp], 240
      .LCFI1:
      	ret
      
      to:
      
      foo:
      .LFB0:
      	stp	x29, x30, [sp, -240]!
      .LCFI0:
      	mov	w9, -56
      	mov	w8, -128
      	mov	x29, sp
      	add	x10, sp, 176
      	stp	x1, x2, [sp, 184]
      	add	x1, sp, 240
      	add	x0, sp, 16
      	stp	x1, x1, [sp, 16]
      	str	x10, [sp, 32]
      	stp	w9, w8, [sp, 40]
      	stp	q0, q1, [sp, 48]
      	stp	q2, q3, [sp, 80]
      	stp	q4, q5, [sp, 112]
      	stp	q6, q7, [sp, 144]
      	stp	x3, x4, [sp, 200]
      	stp	x5, x6, [sp, 216]
      	str	x7, [sp, 232]
      	bl	g
      	ldp	x29, x30, [sp], 240
      .LCFI1:
      	ret
      
      Note that this patch isn't needed if we only use the mode
      canonicalization approach in the new ldp fusion pass (since we
      canonicalize T{I,F,D}mode to V16QImode), but we seem to get slightly
      better performance with mode canonicalization disabled (see
      --param=aarch64-ldp-canonicalize-modes in the following patch).
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.md (load_pair_dw_tftf): Rename to ...
      	(load_pair_dw_<TX:mode><TX2:mode>): ... this.
      	(store_pair_dw_tftf): Rename to ...
      	(store_pair_dw_<TX:mode><TX2:mode>): ... this.
      	* config/aarch64/iterators.md (TX2): New.
      947fb34a
    • Alex Coplan's avatar
      aarch64, testsuite: Fix up pr71727.c · 61ea0a89
      Alex Coplan authored
      The test is trying to check that we don't use q-register stores with
      -mstrict-align, so actually check specifically for that.
      
      This is a prerequisite to avoid regressing:
      
      scan-assembler-not "add\tx0, x0, :"
      
      with the upcoming ldp fusion pass, as we change where the ldps are
      formed such that a register is used rather than a symbolic (lo_sum)
      address for the first load.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/pr71727.c: Adjust scan-assembler-not to
      	make sure we don't have q-register stores with -mstrict-align.
      61ea0a89
    • Alex Coplan's avatar
      aarch64, testsuite: Tweak sve/pcs/args_9.c to allow stps · cf776eeb
      Alex Coplan authored
      With the new ldp/stp pass enabled, there is a change in the codegen for
      this test as follows:
      
              add     x8, sp, 16
              ptrue   p3.h, mul3
              str     p3, [x8]
      -       str     x8, [sp, 8]
      -       str     x9, [sp]
      +       stp     x9, x8, [sp]
              ptrue   p3.d, vl8
              ptrue   p2.s, vl7
              ptrue   p1.h, vl6
      
      i.e. we now form an stp that we were missing previously. This patch
      adjusts the scan-assembler such that it should pass whether or not
      we form the stp.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/sve/pcs/args_9.c: Adjust scan-assemblers to
      	allow for stp.
      cf776eeb
    • Alex Coplan's avatar
      aarch64, testsuite: Prevent stp in lr_free_1.c · 583ca5f5
      Alex Coplan authored
      The test is looking for individual stores which are able to be merged
      into stp instructions.  The test currently passes -fno-schedule-fusion
      -fno-peephole2, presumably to prevent these stores from being turned
      into stps, but this is no longer sufficient with the new ldp/stp fusion
      pass.
      
      As such, we add --param=aarch64-stp-policy=never to prevent stps being
      formed.
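
      As an illustration of the kind of code affected (this is not the
      content of lr_free_1.c, and store_two is a hypothetical name), the
      function below writes two adjacent 8-byte slots.  On AArch64 such
      stores could be emitted either as two str instructions or as a single
      stp; building with --param=aarch64-stp-policy=never is intended to
      keep them separate.

```c
/* Two stores to adjacent 8-byte slots: candidates for merging
   into one store-pair instruction on AArch64.  */
void store_two(long long *p, long long a, long long b)
{
    p[0] = a;
    p[1] = b;
}
```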
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/lr_free_1.c: Add
      	--param=aarch64-stp-policy=never to dg-options.
      583ca5f5
    • Alex Coplan's avatar
      rtl-ssa: Support inferring uses of mem in change_insns · 505f1202
      Alex Coplan authored
      Currently, rtl_ssa::change_insns requires all new uses and defs to be
      specified explicitly.  This turns out to be rather inconvenient for
      forming load pairs in the new aarch64 load pair pass, as the pass has to
      determine which mem def the final load pair consumes, and then obtain or
      create a suitable use (i.e. significant bookkeeping, just to keep the
      RTL-SSA IR consistent).  It turns out to be much more convenient to
      allow change_insns to infer which def is consumed and create a suitable
      use of mem itself.  This patch does that.
      
      gcc/ChangeLog:
      
      	* rtl-ssa/changes.cc (function_info::finalize_new_accesses): Add new
      	parameter to give final insn position, infer use of mem if it isn't
      	specified explicitly.
      	(function_info::change_insns): Pass down final insn position to
      	finalize_new_accesses.
      	* rtl-ssa/functions.h: Add parameter to finalize_new_accesses.
      505f1202
    • Alex Coplan's avatar
      rtl-ssa: Add entry point to allow re-parenting uses · ba230aa1
      Alex Coplan authored
      This is needed by the upcoming aarch64 load pair pass, as it can
      re-order stores (when alias analysis determines this is safe) and thus
      change which mem def a given use consumes (in the RTL-SSA view, there is
      no alias disambiguation of memory).
      
      gcc/ChangeLog:
      
      	* rtl-ssa/accesses.cc (function_info::reparent_use): New.
      	* rtl-ssa/functions.h (function_info): Declare new member
      	function reparent_use.
      ba230aa1
    • Alex Coplan's avatar
      rtl-ssa: Add drop_memory_access helper · c95aab23
      Alex Coplan authored
      Add a helper routine to access-utils.h which removes the memory access
      from an access_array, if it has one.
      
      gcc/ChangeLog:
      
      	* rtl-ssa/access-utils.h (drop_memory_access): New.
      c95aab23
    • Alex Coplan's avatar
      rtl-ssa: Fix bug in function_info::add_insn_after · c3380833
      Alex Coplan authored
      In the case that !insn->is_debug_insn () && next->is_debug_insn (), this
      function was missing an update of the prev pointer on the first nondebug
      insn following the sequence of debug insns starting at next.
      
      This can lead to corruption of the insn chain, in that we end up with:
      
        insn->next_any_insn ()->prev_any_insn () != insn
      
      in this case.  This patch fixes that.
      
      gcc/ChangeLog:
      
      	* rtl-ssa/insns.cc (function_info::add_insn_after): Ensure we
      	update the prev pointer on the following nondebug insn in the
      	case that !insn->is_debug_insn () && next->is_debug_insn ().
      c3380833
    • Haochen Jiang's avatar
      x86: Correct ISA enabled for clients since Arrow Lake · faa0e82b
      Haochen Jiang authored
      gcc/ChangeLog:
      
      	* config/i386/i386.h: Correct the ISA enabled for Arrow Lake.
      	Also make Clearwater Forest depend on Sierra Forest.
      	* config/i386/i386-options.cc: Revise the order of the macro
      	definition to avoid confusion.
      	* doc/extend.texi: Revise documentation.
      	* doc/invoke.texi: Correct documentation.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/funcspec-56.inc: Group Clearwater Forest
      	with atom cores.
      faa0e82b
    • Andrew Stubbs's avatar
      amdgcn: deprecate Fiji device and multilib · 56ed1055
      Andrew Stubbs authored
      LLVM wants to remove it, which breaks our build.  This patch means that
      most users won't notice that change, when it comes, and those that do will
      have chosen to enable Fiji explicitly.
      
      I'm selecting gfx900 as the new default as that's the least likely for users
      to want, which means most users will specify -march explicitly, which means
      we'll be free to change the default again, when we need to, without breaking
      anybody's makefiles.
      
      gcc/ChangeLog:
      
      	* config.gcc (amdgcn): Switch default to --with-arch=gfx900.
      	Implement support for --with-multilib-list.
      	* config/gcn/t-gcn-hsa: Likewise.
      	* doc/install.texi: Likewise.
      	* doc/invoke.texi: Mark Fiji deprecated.
      56ed1055