  1. Jan 11, 2025
    • LoongArch: Generate the final immediate for lu12i.w, lu32i.d and lu52i.d · f30423ea
      mengqinggang authored
      Generate 0x1010 instead of 0x1010000>>12 for lu12i.w. lu32i.d and lu52i.d use
      the same processing.
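
As a rough illustration of the folding described above, the sketch below extracts the bit fields that each instruction consumes from a 64-bit constant. The helper names are hypothetical and do not come from the patch; the fields are treated as unsigned here, and how GCC actually prints them (signed or unsigned) is not shown in this commit message.

```cpp
#include <cstdint>

// Bits [31:12]: the 20-bit field that lu12i.w places in the register.
// (Hypothetical helper name, for illustration only.)
constexpr std::uint32_t lu12i_w_field(std::uint64_t value) {
    return static_cast<std::uint32_t>((value >> 12) & 0xfffff);
}

// Bits [51:32]: the 20-bit field that lu32i.d inserts.
constexpr std::uint32_t lu32i_d_field(std::uint64_t value) {
    return static_cast<std::uint32_t>((value >> 32) & 0xfffff);
}

// Bits [63:52]: the 12-bit field that lu52i.d inserts.
constexpr std::uint32_t lu52i_d_field(std::uint64_t value) {
    return static_cast<std::uint32_t>((value >> 52) & 0xfff);
}

// 0x1010000 >> 12 folds to 0x1010, the value quoted in the commit message.
static_assert(lu12i_w_field(0x1010000) == 0x1010, "lu12i.w field");
```

Emitting the folded value keeps the shift out of the assembly output, which is what the updated imm-load.c test checks for.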
      
      gcc/ChangeLog:
      
      	* config/loongarch/lasx.md: Use new loongarch_output_move.
      	* config/loongarch/loongarch-protos.h (loongarch_output_move):
      	Change parameters from (rtx, rtx) to (rtx *).
      	* config/loongarch/loongarch.cc (loongarch_output_move):
      	Generate final immediate for lu12i.w and lu52i.d.
      	* config/loongarch/loongarch.md:
      	Generate final immediate for lu32i.d and lu52i.d.
      	* config/loongarch/lsx.md: Use new loongarch_output_move.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/loongarch/imm-load.c: Do not generate ">>".
    • d: Merge dmd, druntime 2b89c2909d, phobos bdedad3bf · dd3026f0
      Iain Buclaw authored
      D front-end changes:
      
              - Import latest fixes from dmd v2.110.0-beta.1.
      
      D runtime changes:
      
              - Import latest fixes from druntime v2.110.0-beta.1.
      
      Phobos changes:
      
              - Import latest fixes from phobos v2.110.0-beta.1.
      	- Added `popGrapheme' function to `std.uni'.
      
      gcc/d/ChangeLog:
      
      	* dmd/MERGE: Merge upstream dmd 2b89c2909d.
      	* Make-lang.in (D_FRONTEND_OBJS): Rename d/basicmangle.o to
      	d/mangle-basic.o, d/cppmangle.o to d/mangle-cpp.o, and d/dmangle.o to
      	d/mangle-package.o.
      	(d/mangle-%.o): New rule.
      	* d-builtins.cc (maybe_set_builtin_1): Update for new front-end
      	interface.
      	* d-diagnostic.cc (verrorReport): Likewise.
      	(verrorReportSupplemental): Likewise.
      	* d-frontend.cc (getTypeInfoType): Likewise.
      	* d-lang.cc (d_init_options): Likewise.
      	(d_handle_option): Likewise.
      	(d_post_options): Likewise.
      	* d-target.cc (TargetC::contributesToAggregateAlignment): New.
      	* d-tree.h (create_typeinfo): Adjust prototype.
      	* decl.cc (layout_struct_initializer): Update for new front-end
      	interface.
      	* typeinfo.cc (create_typeinfo): Remove generate parameter.
      	* types.cc (layout_aggregate_members): Update for new front-end
      	interface.
      
      libphobos/ChangeLog:
      
      	* libdruntime/MERGE: Merge upstream druntime 2b89c2909d.
      	* src/MERGE: Merge upstream phobos bdedad3bf.
    • Use relations when simplifying MIN and MAX. · b0eeb540
      Andrew MacLeod authored
      Query for known relations between the operands, and pass that to
      fold_range to help simplify MIN and MAX relations.
      Make it type agnostic as well.
      
      Adapt testcases from DOM to EVRP (e suffix) and test floats (f suffix).
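
To make the idea concrete, here is a minimal sketch of how a known relation between two operands can decide a MIN or MAX without looking at value ranges. The `Relation` and `Pick` enums and both functions are hypothetical stand-ins, not the types used in vr-values.cc, and the model is integer-style: it ignores floating-point subtleties such as NaNs and signed zeros.

```cpp
// Known relation of the form "op0 REL op1" (hypothetical enum).
enum class Relation { Unknown, Less, LessEqual, Greater, GreaterEqual, Equal };

// Which operand the expression is known to equal, if any.
enum class Pick { Neither, Op0, Op1 };

// MIN (op0, op1): the smaller operand wins; on equality either one is fine.
constexpr Pick simplify_min(Relation rel) {
    switch (rel) {
    case Relation::Less:
    case Relation::LessEqual:
    case Relation::Equal:
        return Pick::Op0;
    case Relation::Greater:
    case Relation::GreaterEqual:
        return Pick::Op1;
    default:
        return Pick::Neither;  // no relation known: leave the MIN alone
    }
}

// MAX (op0, op1): the mirror image of simplify_min.
constexpr Pick simplify_max(Relation rel) {
    switch (rel) {
    case Relation::Greater:
    case Relation::GreaterEqual:
    case Relation::Equal:
        return Pick::Op0;
    case Relation::Less:
    case Relation::LessEqual:
        return Pick::Op1;
    default:
        return Pick::Neither;
    }
}

static_assert(simplify_min(Relation::Less) == Pick::Op0, "a < b => MIN is a");
```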
      
      	PR tree-optimization/88575
      	gcc/
      	* vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Query
      	relation between op0 and op1 and utilize it.
      	(simplify_using_ranges::simplify): Do not eliminate float checks.
      
      	gcc/testsuite/
      	* gcc.dg/tree-ssa/minmax-27.c: Disable VRP.
      	* gcc.dg/tree-ssa/minmax-27e.c: New.
      	* gcc.dg/tree-ssa/minmax-27f.c: New.
      	* gcc.dg/tree-ssa/minmax-28.c: Disable VRP.
      	* gcc.dg/tree-ssa/minmax-28e.c: New.
      	* gcc.dg/tree-ssa/minmax-28f.c: New.
    • Daily bump. · 4951a90e
      GCC Administrator authored
  2. Jan 10, 2025
    • d: Merge dmd, druntime 4ccb01fde5, phobos eab6595ad · c82395e0
      Iain Buclaw authored
      D front-end changes:
      
      	- Added pragma for ImportC to allow setting `nothrow', `@nogc'
      	  or `pure'.
      	- Mixin templates can now use assignment syntax.
      
      D runtime changes:
      
      	- Removed `ThreadBase.criticalRegionLock' from `core.thread'.
      	- Added `expect', `[un]likely', `trap' to `core.builtins'.
      
      Phobos changes:
      
      	- Import latest fixes from phobos v2.110.0-beta.1.
      
      gcc/d/ChangeLog:
      
      	* dmd/MERGE: Merge upstream dmd 4ccb01fde5.
      	* Make-lang.in (D_FRONTEND_OBJS): Rename d/foreachvar.o to
      	d/visitor-foreachvar.o, d/visitor.o to d/visitor-package.o, and
      	d/statement_rewrite_walker.o to d/visitor-statement_rewrite_walker.o.
      	(D_FRONTEND_OBJS): Rename
      	d/{parsetime,permissive,postorder,transitive}visitor.o to
      	d/visitor-{parsetime,permissive,postorder,transitive}.o.
      	(D_FRONTEND_OBJS): Remove d/sapply.o.
      	(d.tags): Add dmd/common/*.h.
      	(d/visitor-%.o:): New rule.
      	* d-codegen.cc (get_frameinfo): Update for new front-end interface.
      
      libphobos/ChangeLog:
      
      	* libdruntime/MERGE: Merge upstream druntime 4ccb01fde5.
      	* src/MERGE: Merge upstream phobos eab6595ad.
    • d: Merge dmd, druntime 6884b433d2, phobos 48d581a1f · a7ae0c31
      Iain Buclaw authored
      D front-end changes:
      
      	- It's now deprecated to declare `auto ref' parameters without
      	  putting those two keywords next to each other.
              - An error is now given for case fallthrough for multivalued
      	  cases.
              - An error is now given for constructors with field destructors
      	  with stricter attributes.
              - An error is now issued for `in'/`out' contracts of `nothrow'
      	  functions that may throw.
      	- `auto ref' can now be applied to local, static, extern, and
      	  global variables.
      
      D runtime changes:
      
              - Import latest fixes from druntime v2.110.0-beta.1.
      
      Phobos changes:
      
              - Import latest fixes from phobos v2.110.0-beta.1.
      
      gcc/d/ChangeLog:
      
      	* dmd/MERGE: Merge upstream dmd 6884b433d2.
      	* d-builtins.cc (build_frontend_type): Update for new front-end
      	interface.
      	(d_build_builtins_module): Likewise.
      	(matches_builtin_type): Likewise.
      	(covariant_with_builtin_type_p): Likewise.
      	* d-codegen.cc (lower_struct_comparison): Likewise.
      	(call_side_effect_free_p): Likewise.
      	* d-compiler.cc (Compiler::paintAsType): Likewise.
      	* d-convert.cc (convert_expr): Likewise.
      	(convert_for_assignment): Likewise.
      	* d-target.cc (Target::isVectorTypeSupported): Likewise.
      	(Target::isVectorOpSupported): Likewise.
      	(Target::isReturnOnStack): Likewise.
      	* decl.cc (get_symbol_decl): Likewise.
      	* expr.cc (build_return_dtor): Likewise.
      	* imports.cc (class ImportVisitor): Likewise.
      	* toir.cc (class IRVisitor): Likewise.
      	* types.cc (class TypeVisitor): Likewise.
      
      libphobos/ChangeLog:
      
      	* libdruntime/MERGE: Merge upstream druntime 6884b433d2.
      	* src/MERGE: Merge upstream phobos 48d581a1f.
    • vect: Also cost gconds for scalar [PR118211] · 086031c0
      Alex Coplan authored
      Currently we only cost gconds for the vector loop while we omit costing
      them when analyzing the scalar loop; this unfairly penalizes the vector
      loop in the case of loops with early exits.
      
      This (together with the previous patches) enables us to vectorize
      std::find with 64-bit element sizes.
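
For readers unfamiliar with the loop shape in question, the sketch below shows a std::find-style search over 64-bit elements. The function and array names are hypothetical; the point is only that the early `break` is a conditional (a gcond in GIMPLE terms) that executes in every scalar iteration, so it arguably belongs in the scalar cost as well as the vector cost.

```cpp
#include <cstdint>

// Linear search with an early exit, the same shape as std::find.
constexpr const std::uint64_t *find_u64(const std::uint64_t *first,
                                        const std::uint64_t *last,
                                        std::uint64_t value) {
    for (; first != last; ++first)
        if (*first == value)  // early-exit condition, evaluated per iteration
            break;
    return first;             // == last when the value is absent
}

// Hypothetical sample data for the compile-time checks below.
constexpr std::uint64_t kSample[] = {3, 1, 4, 1, 5, 9};

static_assert(find_u64(kSample, kSample + 6, 4) == kSample + 2, "found at 2");
static_assert(find_u64(kSample, kSample + 6, 7) == kSample + 6, "not found");
```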
      
      gcc/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* tree-vect-loop.cc (vect_compute_single_scalar_iteration_cost):
      	Don't skip over gconds.
    • vect: Ensure we add vector skip guard even when versioning for aliasing [PR118211] · f4e259b4
      Alex Coplan authored
      This fixes a latent wrong code issue whereby vect_do_peeling determined
      the wrong condition for inserting the vector skip guard.  Specifically
      in the case where the loop niters are unknown at compile time we used to
      check:
      
        !LOOP_REQUIRES_VERSIONING (loop_vinfo)
      
      but LOOP_REQUIRES_VERSIONING is true for loops which we have versioned
      for aliasing, and that has nothing to do with prolog peeling.  I think
      this condition should instead be checking specifically if we aren't
      versioning for alignment.
      
      As it stands, when we version for alignment, we don't peel, so the
      vector skip guard is indeed redundant in that case.
      
      With the testcase added (reduced from the Fortran frontend) we would
      version for aliasing, omit the vector skip guard, and then at runtime we
      would peel sufficient iterations for alignment that there wasn't a full
      vector iteration left when we entered the vector body, thus overflowing
      the output buffer.
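
The failure mode can be pictured with a small iteration-count model. This is a sketch under simplifying assumptions (a plain counted loop, a fixed vectorization factor `vf > 0`, hypothetical names throughout) and is not how vect_do_peeling is written; it only shows why the guard matters once prolog peeling has consumed some iterations.

```cpp
// How the scalar iterations are split between the three loops.
struct LoopPlan {
    unsigned prolog;        // iterations peeled for alignment
    unsigned vector_iters;  // full vector iterations
    unsigned epilogue;      // leftover scalar iterations
};

// Assumes vf > 0.  prolog_peel is only known at runtime in the real case.
constexpr LoopPlan plan_iterations(unsigned niters, unsigned prolog_peel,
                                   unsigned vf) {
    const unsigned prolog = prolog_peel < niters ? prolog_peel : niters;
    const unsigned remaining = niters - prolog;
    // Vector skip guard: without this check the vector body would run
    // once even though fewer than vf iterations are left.
    if (remaining < vf)
        return {prolog, 0, remaining};
    return {prolog, remaining / vf, remaining % vf};
}

// 5 iterations, 3 peeled, vf 4: only 2 remain, so the vector loop is skipped.
static_assert(plan_iterations(5, 3, 4).vector_iters == 0, "guard taken");
```

In this model, omitting the guard for the (5, 3, 4) case would let a 4-wide vector iteration touch two elements past the end, which matches the overflow described above.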
      
      gcc/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* tree-vect-loop-manip.cc (vect_do_peeling): Adjust skip_vector
      	condition to only omit the edge if we're versioning for
      	alignment.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* gcc.dg/vect/vect-early-break_130.c: New test.
    • vect: Fix dominators when adding a guard to skip the vector loop [PR118211] · f1c6789a
      Tamar Christina authored
      
      The alignment peeling changes exposed a latent missing dominator update
      with early break vectorization, specifically when inserting the vector
      skip edge, since the new edge bypasses the prolog skip block and thus
      has the potential to subvert its dominance.  This patch fixes that.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* tree-vect-loop-manip.cc (vect_do_peeling): Update immediate
      	dominators of nodes that were dominated by the prolog skip block
      	after inserting vector skip edge.  Initialize prolog variable to
      	NULL to avoid bogus -Wmaybe-uninitialized during bootstrap.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* g++.dg/vect/vect-early-break_6.cc: New test.
      
      Co-Authored-By: Alex Coplan <alex.coplan@arm.com>
    • vect: Don't guard scalar epilogue for inverted loops [PR118211] · 0a462451
      Alex Coplan authored
      For loops with LOOP_VINFO_EARLY_BREAKS_VECT_PEELED we should always
      enter the scalar epilogue, so avoid emitting a guard on entry to the
      epilogue.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* tree-vect-loop-manip.cc (vect_do_peeling): Avoid emitting an
      	epilogue guard for inverted early-exit loops.
    • vect: Force alignment peeling to vectorize more early break loops [PR118211] · 68326d5d
      Alex Coplan authored
      
      This allows us to vectorize more loops with early exits by forcing
      peeling for alignment to make sure that we're guaranteed to be able to
      safely read an entire vector iteration without crossing a page boundary.
      
      To make this work for VLA architectures we have to allow compile-time
      non-constant target alignments.  We also have to override the result of
      the target's preferred_vector_alignment hook if it isn't a power-of-two
      multiple of the TYPE_SIZE of the chosen vector type.
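
The page-boundary argument can be checked with a little address arithmetic. The sketch below is illustrative only: the function names are hypothetical, the target alignment is assumed to be a power of two, and the start address is assumed to be a multiple of the element size.

```cpp
#include <cstdint>

// Scalar iterations to peel so that the next access is target_align-aligned.
// Assumes target_align is a power of two and addr is element-aligned.
constexpr std::uint64_t prolog_peel_iters(std::uint64_t addr,
                                          std::uint64_t elem_size,
                                          std::uint64_t target_align) {
    const std::uint64_t misalign = addr & (target_align - 1);
    const std::uint64_t bytes = (target_align - misalign) & (target_align - 1);
    return bytes / elem_size;
}

// True if a vec_bytes-wide access starting at addr spans two pages.
constexpr bool crosses_page(std::uint64_t addr, std::uint64_t vec_bytes,
                            std::uint64_t page_size) {
    return addr / page_size != (addr + vec_bytes - 1) / page_size;
}

// A 16-byte load at 0x1ff8 straddles the 4 KiB boundary at 0x2000 ...
static_assert(crosses_page(0x1ff8, 16, 4096), "unaligned load crosses");
// ... but after peeling two 4-byte elements it starts at 0x2000 and does not.
static_assert(prolog_peel_iters(0x1ff8, 4, 16) == 2, "peel two elements");
static_assert(!crosses_page(0x2000, 16, 4096), "aligned load stays in page");
```

An access aligned to its own size cannot straddle a page whenever that size divides the page size, which is why forcing the peel makes the speculative vector read safe in this model.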
      
      gcc/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* tree-vect-data-refs.cc (vect_analyze_early_break_dependences):
      	Set need_peeling_for_alignment flag on read DRs instead of
      	failing vectorization.  Punt on gathers.
      	(dr_misalignment): Handle non-constant target alignments.
      	(vect_compute_data_ref_alignment): If need_peeling_for_alignment
      	flag is set on the DR, then override the target alignment chosen
      	by the preferred_vector_alignment hook to choose a safe
      	alignment.
      	(vect_supportable_dr_alignment): Override
      	support_vector_misalignment hook if need_peeling_for_alignment
      	is set on the DR: in this case we must return
      	dr_unaligned_unsupported in order to force peeling.
      	* tree-vect-loop-manip.cc (vect_do_peeling): Allow prolog
      	peeling by a compile-time non-constant amount.
      	* tree-vectorizer.h (dr_vec_info): Add new flag
      	need_peeling_for_alignment.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/118211
      	PR tree-optimization/116126
      	* gcc.dg/tree-ssa/cunroll-13.c: Don't vectorize.
      	* gcc.dg/tree-ssa/cunroll-14.c: Likewise.
      	* gcc.dg/unroll-6.c: Likewise.
      	* gcc.dg/tree-ssa/gen-vect-28.c: Likewise.
      	* gcc.dg/vect/vect-104.c: Expect to vectorize.
      	* gcc.dg/vect/vect-early-break_108-pr113588.c: Likewise.
      	* gcc.dg/vect/vect-early-break_109-pr113588.c: Likewise.
      	* gcc.dg/vect/vect-early-break_110-pr113467.c: Likewise.
      	* gcc.dg/vect/vect-early-break_3.c: Likewise.
      	* gcc.dg/vect/vect-early-break_65.c: Likewise.
      	* gcc.dg/vect/vect-early-break_8.c: Likewise.
      	* gfortran.dg/vect/vect-5.f90: Likewise.
      	* gfortran.dg/vect/vect-8.f90: Likewise.
      	* gcc.dg/vect/vect-switch-search-line-fast.c:
      
      Co-Authored-By: Tamar Christina <tamar.christina@arm.com>
    • AArch64: correct Cortex-X4 MIDR · ddcfae1d
      Tamar Christina authored
      The PartNum field of the MIDR for Cortex-X4 is wrong.  It currently holds the
      part number of Cortex-A720 (whose own entry is correct).
      
      The correct number can be found in the Cortex-X4 Technical Reference Manual [1]
      on page 382 in Issue Number 5.
      
      [1] https://developer.arm.com/documentation/102484/latest/
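
For context, the part number occupies bits [15:4] of MIDR_EL1 and the implementer code occupies bits [31:24]. The sketch below decodes both fields; the helper names are hypothetical and the sample register value is made up for illustration, so it should not be read as the MIDR of any particular core.

```cpp
#include <cstdint>

// Implementer code, MIDR_EL1 bits [31:24] (0x41 is Arm Limited).
constexpr unsigned midr_implementer(std::uint32_t midr) {
    return (midr >> 24) & 0xff;
}

// Primary part number, MIDR_EL1 bits [15:4].
constexpr unsigned midr_partnum(std::uint32_t midr) {
    return (midr >> 4) & 0xfff;
}

// Made-up register value, used only to exercise the field extraction.
constexpr std::uint32_t kExampleMidr = 0x41012340;

static_assert(midr_implementer(kExampleMidr) == 0x41, "implementer field");
static_assert(midr_partnum(kExampleMidr) == 0x234, "part number field");
```

Two cores sharing one part number in aarch64-cores.def would be indistinguishable to any lookup keyed on these fields.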
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-cores.def (AARCH64_CORE): Fix cortex-x4 parts
      	num.
    • d: Merge dmd, druntime 34875cd6e1, phobos ebd24da8a · 89629b27
      Iain Buclaw authored
      D front-end changes:
      
              - Import dmd v2.110.0-beta.1.
              - `ref' can now be applied to local, static, extern, and global
      	  variables.
      
      D runtime changes:
      
              - Import druntime v2.110.0-beta.1.
      
      Phobos changes:
      
              - Import phobos v2.110.0-beta.1.
      
      gcc/d/ChangeLog:
      
      	* dmd/MERGE: Merge upstream dmd 34875cd6e1.
      	* dmd/VERSION: Bump version to v2.110.0-beta.1.
      	* Make-lang.in (D_FRONTEND_OBJS): Add d/deps.o, d/timetrace.o.
      	* decl.cc (class DeclVisitor): Update for new front-end interface.
      	* expr.cc (class ExprVisitor): Likewise.
      	* typeinfo.cc (check_typeinfo_type): Likewise.
      
      libphobos/ChangeLog:
      
      	* libdruntime/MERGE: Merge upstream druntime 34875cd6e1.
      	* src/MERGE: Merge upstream phobos ebd24da8a.
    • libstdc++: Fix unused parameter warnings in <bits/atomic_futex.h> · c9353e0f
      Jonathan Wakely authored
      This fixes warnings like the following during bootstrap:
      
      sparc-sun-solaris2.11/libstdc++-v3/include/bits/atomic_futex.h:324:53: warning: unused parameter ‘__mo’ [-Wunused-parameter]
        324 |     _M_load_when_equal(unsigned __val, memory_order __mo)
            |                                        ~~~~~~~~~~~~~^~~~
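
The usual C++ remedy is to leave the parameter unnamed, which keeps the signature intact while telling the compiler the argument is deliberately ignored. The function below is a hypothetical stand-in, not the libstdc++ member itself.

```cpp
#include <atomic>

// Naming the second parameter without using it would trigger
// -Wunused-parameter; leaving it unnamed does not.
constexpr unsigned load_when_equal_sketch(unsigned val,
                                          std::memory_order /* unused */) {
    return val;
}

static_assert(load_when_equal_sketch(7u, std::memory_order_acquire) == 7u,
              "the ordering argument is accepted and ignored");
```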
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/atomic_futex.h (__atomic_futex_unsigned): Remove
      	names of unused parameters in non-futex implementation.
    • c++: add fixed test [PR118391] · d2017159
      Marek Polacek authored
      Fixed by r15-6740.
      
      	PR c++/118391
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/lambda-uneval20.C: New test.
    • libatomic: Cleanup AArch64 ifunc selection · 81bcf412
      Wilco Dijkstra authored
      Simplify and cleanup ifunc selection logic.  Since LRCPC3 does
      not imply LSE2, has_rcpc3() should also check LSE2 is enabled.
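
The dependency can be sketched as an early-out on the weaker feature. The bit assignments and function signatures below are hypothetical (the real host-config.h queries hardware capability words), so this only illustrates the ordering of the checks.

```cpp
// Hypothetical feature bits, for illustration only.
constexpr unsigned kFeatLse2 = 1u << 0;
constexpr unsigned kFeatRcpc3 = 1u << 1;

constexpr bool has_lse2_sketch(unsigned features) {
    return (features & kFeatLse2) != 0;
}

// LRCPC3 alone is not enough: bail out early unless LSE2 is also present.
constexpr bool has_rcpc3_sketch(unsigned features) {
    if (!has_lse2_sketch(features))
        return false;
    return (features & kFeatRcpc3) != 0;
}

static_assert(!has_rcpc3_sketch(kFeatRcpc3), "RCPC3 without LSE2 is rejected");
static_assert(has_rcpc3_sketch(kFeatLse2 | kFeatRcpc3), "both present");
```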
      
      Passes regress and bootstrap, OK for commit?
      
      libatomic:
      	* config/linux/aarch64/host-config.h (has_lse2): Cleanup.
      	(has_lse128): Likewise.
      	(has_rcpc3): Add early check for LSE2.
    • testsuite: arm: Add pattern for armv8-m.base to cmse-15.c test · cfd7c54b
      Torbjörn SVENSSON authored
      
      Since armv8-m.base uses thumb1, which does not support sibcall/tailcall,
      a pattern is needed that uses a PUSH/BL/POP sequence instead of a single
      B instruction to reuse an already existing function in the compilation unit.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/arm/cmse/cmse-15.c: Added pattern for armv8-m.base.
      
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
    • Do not call cp_parser_omp_dispatch directly in cp_parser_pragma · b5a67989
      Paul-Antoine Arras authored
      This is a followup to
      ed49709a OpenMP: C++ front-end support for dispatch + adjust_args.
      
      The call to cp_parser_omp_dispatch only belongs in cp_parser_omp_construct. In
      cp_parser_pragma, handle PRAGMA_OMP_DISPATCH by calling cp_parser_omp_construct.
      
      gcc/cp/ChangeLog:
      
      	* parser.cc (cp_parser_pragma): Replace call to cp_parser_omp_dispatch
      	with cp_parser_omp_construct and check context.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/gomp/dispatch-8.C: New test.
    • c++: Fix ICE with invalid defaulted operator <=> [PR118387] · 4c688399
      Jakub Jelinek authored
      In the following testcase there are 2 issues, one is that B doesn't
      have operator<=> and the other is that A's operator<=> has int return
      type, i.e. not the standard comparison category.
      Because of the int return type, retcat is cc_last; when we first
      try to synthetize it, it is therefore with tentative false and complain
      tf_none, we find that B doesn't have operator<=> and because retcat isn't
      tc_last, don't try to search for other operators in genericize_spaceship.
      And then mark the operator deleted.
      When trying to explain the use of the deleted operator, tentative is still
      false, but complain is tf_error_or_warning.
      do_one_comp will first do:
        tree comp = build_new_op (loc, code, flags, lhs, rhs,
                                  NULL_TREE, NULL_TREE, &overload,
                                  tentative ? tf_none : complain);
      and because complain isn't tf_none, it will actually diagnose the bug
      already, but then (tentative || complain) is true and we call
      genericize_spaceship, which has
        if (tag == cc_last && is_auto (type))
          {
      ...
          }
      
        gcc_checking_assert (tag < cc_last);
      and because tag is cc_last and type isn't auto, we just ICE on that
      assertion.
      
      The patch fixes it by returning error_mark_node from genericize_spaceship
      instead of failing the assertion.
      
      Note, the PR raises another problem.
      If on the same testcase the B b; line is removed, we silently synthetize
      operator<=> which will crash at runtime due to returning without a return
      statement.  That is because the standard says that in that case
      it should return static_cast<int>(std::strong_ordering::equal);
      but I can't find anywhere wording which would say that if that isn't
      valid, the function is deleted.
      https://eel.is/c++draft/class.compare#class.spaceship-2.2
      seems to talk just about cases where there are some members and their
      comparison is invalid it is deleted, but here there are none and it
      follows
      https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
      So, we synthetize with tf_none, see the static_cast is invalid, don't
      add error_mark_node statement silently, but as the function isn't deleted,
      we just silently emit it.
      Should the standard be amended to say that the operator should be deleted
      even if it has no elements and the static cast from
      https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
      is invalid?
      
      2025-01-10  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/118387
      	* method.cc (genericize_spaceship): For tag == cc_last if
      	type is not auto just return error_mark_node instead of failing
      	checking assertion.
      
      	* g++.dg/cpp2a/spaceship-synth17.C: New test.
    • c++: modules and DECL_REPLACEABLE_P · e86daddb
      Jason Merrill authored
      We need to remember that the ::operator new is replaceable to avoid a bogus
      error about __builtin_operator_new finding a non-replaceable function.
      
      This affected __get_temporary_buffer in stl_tempbuf.h.
      
      gcc/cp/ChangeLog:
      
      	* module.cc (trees_out::core_bools): Write replaceable_operator.
      	(trees_in::core_bools): Read it.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/modules/operator-2_a.C: New test.
      	* g++.dg/modules/operator-2_b.C: New test.
    • Fix some memory leaks · 9193641d
      Richard Biener authored
      The following fixes memory leaks found compiling SPEC CPU 2017 with
      valgrind.
      
      	* df-core.cc (rest_of_handle_df_finish): Release dflow for
      	problems without free function (like LR).
      	* gimple-crc-optimization.cc (crc_optimization::loop_may_calculate_crc):
      	Release loop_bbs on all exits.
      	* tree-vectorizer.h (supportable_indirect_convert_operation): Change.
      	* tree-vect-generic.cc (expand_vector_conversion): Adjust.
      	* tree-vect-stmts.cc (vectorizable_conversion): Use auto_vec for
      	converts.
      	(supportable_indirect_convert_operation): Get a reference to
      	the output vector of converts.
    • [PR118017][LRA]: Fix test for i686 · 94d8de53
      Vladimir N. Makarov authored
      My previous patch for PR118017 contains a test which fails on i686.  The patch fixes this.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/118017
      	* gcc.target/i386/pr118017.c: Check target int128.
    • arm: [MVE intrinsics] Fix tuples field name (PR 118332) · 288ac095
      Christophe Lyon authored
      The previous fix only worked for C; for C++ we need to add more
      information to the underlying type so that
      finish_class_member_access_expr accepts it.
      
      We use the same logic as in aarch64's register_tuple_type for AdvSIMD
      tuples.
      
      This patch makes gcc.target/arm/mve/intrinsics/pr118332.c pass in C++
      mode.
      
      gcc/ChangeLog:
      
      	PR target/118332
      	* config/arm/arm-mve-builtins.cc (wrap_type_in_struct): Delete.
      	(register_type_decl): Delete.
      	(register_builtin_tuple_types): Use
      	lang_hooks.types.simulate_record_decl.
    • Fix bootstrap on !HARDREG_PRE_REGNOS targets · 55341185
      Richard Biener authored
      Pushed as obvious.
      
      	* gcse.cc (pass_hardreg_pre::gate): Wrap possibly unused
      	fun argument.
    • rtl-optimization/117467 - limit ext-dce memory use · 03faac50
      Richard Biener authored
      The following puts in a hard limit on ext-dce because it might end
      up requiring memory on the order of the number of basic blocks
      times the number of pseudo registers.  The limiting follows what
      GCSE based passes do and thus I re-use --param max-gcse-memory here.
      
      This doesn't in any way address the implementation issues of the pass,
      but it reduces the memory-use when compiling the
      module_first_rk_step_part1.F90 TU from 521.wrf_r from 25GB to 1GB.
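
A budget check of this kind might look like the sketch below. Everything here is an assumption made for illustration: the function name, the per-entry byte cost, and the units of the limit are not taken from ext-dce.cc or from the definition of --param max-gcse-memory.

```cpp
#include <cstdint>

// True if an estimate of blocks * pseudos * bytes_per_entry fits in
// max_bytes.  Written with divisions so the product cannot overflow.
constexpr bool within_memory_budget(std::uint64_t n_basic_blocks,
                                    std::uint64_t n_pseudos,
                                    std::uint64_t bytes_per_entry,
                                    std::uint64_t max_bytes) {
    if (n_basic_blocks == 0 || n_pseudos == 0 || bytes_per_entry == 0)
        return true;  // nothing to allocate
    // a * b * c <= max  is equivalent to  a <= max / b / c  for
    // positive integers with truncating division.
    return n_basic_blocks <= max_bytes / n_pseudos / bytes_per_entry;
}

// 1000 blocks * 1000 pseudos * 4 bytes is exactly 4,000,000 bytes.
static_assert(within_memory_budget(1000, 1000, 4, 4000000), "fits exactly");
static_assert(!within_memory_budget(1001, 1000, 4, 4000000), "one block over");
```

A pass using such a check would simply return without doing any work when the estimate is over budget, which matches the "do nothing" behaviour in the ChangeLog entry below.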
      
      	PR rtl-optimization/117467
      	PR rtl-optimization/117934
      	* ext-dce.cc (ext_dce_execute): Do nothing if a memory
      	allocation estimate exceeds what is allowed by
      	--param max-gcse-memory.
    • c++: ICE with pack indexing and partial inst [PR117937] · d6444794
      Marek Polacek authored
      Here we ICE in expand_expr_real_1:
      
            if (exp)
              {
                tree context = decl_function_context (exp);
                gcc_assert (SCOPE_FILE_SCOPE_P (context)
                            || context == current_function_decl
      
      on something like this test:
      
        void
        f (auto... args)
        {
          [&]<size_t... i>(seq<i...>) {
      	g(args...[i]...);
          }(seq<0>());
        }
      
      because while current_function_decl is:
      
        f<int>(int)::<lambda(seq<i ...>)> [with long unsigned int ...i = {0}]
      
      (correct), context is:
      
        f<int>(int)::<lambda(seq<i ...>)>
      
      which is only the partial instantiation.
      
      I think that when tsubst_pack_index gets a partial instantiation, e.g.
      {*args#0} as the pack, we should still tsubst it.  The args#0's value-expr
      can be __closure->__args#0 where the closure's context is the partially
      instantiated operator().  So we should let retrieve_local_specialization
      find the right args#0.
      
      	PR c++/117937
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (tsubst_pack_index): tsubst the pack even when it's not
      	PACK_EXPANSION_P.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp26/pack-indexing13.C: New test.
      	* g++.dg/cpp26/pack-indexing14.C: New test.
    • s390: Add expander for uaddc/usubc optabs · 8a2d5bc2
      Stefan Schulze Frielinghaus authored
      gcc/ChangeLog:
      
      	* config/s390/s390-protos.h (s390_emit_compare): Add mode
      	parameter for the resulting RTX.
      	* config/s390/s390.cc (s390_emit_compare): Ditto.
      	(s390_emit_compare_and_swap): Change.
      	(s390_expand_vec_strlen): Change.
      	(s390_expand_cs_hqi): Change.
      	(s390_expand_split_stack_prologue): Change.
      	* config/s390/s390.md (*add<mode>3_carry1_cc): Renamed to ...
      	(add<mode>3_carry1_cc): this and in order to use the
      	corresponding gen function, encode CC mode into pattern.
      	(*sub<mode>3_borrow_cc): Renamed to ...
      	(sub<mode>3_borrow_cc): this and in order to use the
      	corresponding gen function, encode CC mode into pattern.
      	(*add<mode>3_alc_carry1_cc): Renamed to ...
      	(add<mode>3_alc_carry1_cc): this and in order to use the
      	corresponding gen function, encode CC mode into pattern.
      	(sub<mode>3_slb_borrow1_cc): New.
      	(uaddc<mode>5): New.
      	(usubc<mode>5): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/s390/uaddc-1.c: New test.
      	* gcc.target/s390/uaddc-2.c: New test.
      	* gcc.target/s390/uaddc-3.c: New test.
      	* gcc.target/s390/usubc-1.c: New test.
      	* gcc.target/s390/usubc-2.c: New test.
      	* gcc.target/s390/usubc-3.c: New test.
    • docs: Document new hardreg PRE pass · 016e2f00
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* doc/passes.texi: Document hardreg PRE pass.
    • Add new hardreg PRE pass · e7f98d96
      Andrew Carlotti authored
      This pass is used to optimise assignments to the FPMR register in
      aarch64.  I chose to implement this as a middle-end pass because it
      mostly reuses the existing RTL PRE code within gcse.cc.
      
      Compared to RTL PRE, the key difference in this new pass is that we
      insert new writes directly to the destination hardreg, instead of
      writing to a new pseudo-register and copying the result later.  This
      requires changes to the analysis portion of the pass, because sets
      cannot be moved before existing instructions that set, use or clobber
      the hardreg, and the value becomes unavailable after any uses or
      clobbers of the hardreg.
      
      Any uses of the hardreg in debug insns will be deleted.  We could do
      better than this, but for the aarch64 fpmr I don't think we emit useful
      debuginfo for deleted fp8 instructions anyway (and I don't even know if
      it's possible to have a debug fpmr use when entering hardreg PRE).
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.h (HARDREG_PRE_REGNOS): New macro.
      	* gcse.cc (doing_hardreg_pre_p): New global variable.
      	(do_load_motion): New boolean check.
      	(current_hardreg_regno): New global variable.
      	(compute_local_properties): Unset transp for hardreg clobbers.
      	(prune_hardreg_uses): New function.
      	(want_to_gcse_p): Use different checks for hardreg PRE.
      	(oprs_unchanged_p): Disable load motion for hardreg PRE pass.
      	(hash_scan_set): For hardreg PRE, skip non-hardreg sets and
      	check for hardreg clobbers.
      	(record_last_mem_set_info): Skip for hardreg PRE.
      	(compute_pre_data): Prune hardreg uses from transp bitmap.
      	(pre_expr_reaches_here_p_work): Add sentence to comment.
      	(insert_insn_start_basic_block): New functions.
      	(pre_edge_insert): Don't add hardreg sets to predecessor block.
      	(pre_delete): Use hardreg for the reaching reg.
      	(reset_hardreg_debug_uses): New function.
      	(pre_gcse): For hardreg PRE, reset debug uses and don't insert
      	copies.
      	(one_pre_gcse_pass): Disable load motion for hardreg PRE.
      	(execute_hardreg_pre): New.
      	(class pass_hardreg_pre): New.
      	(pass_hardreg_pre::gate): New.
      	(make_pass_hardreg_pre): New.
      	* passes.def (pass_hardreg_pre): New pass.
      	* tree-pass.h (make_pass_hardreg_pre): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/acle/fpmr-1.c: New test.
      	* gcc.target/aarch64/acle/fpmr-2.c: New test.
      	* gcc.target/aarch64/acle/fpmr-3.c: New test.
      	* gcc.target/aarch64/acle/fpmr-4.c: New test.
    • Disable a broken multiversioning optimisation · 21212f08
      Andrew Carlotti authored
      This patch skips redirect_to_specific_clone for aarch64 and riscv,
      because the optimisation has two flaws:
      
      1. It checks the value of the "target" attribute, even on targets that
      don't use this attribute for multiversioning.
      
      2. The algorithm used is too aggressive, and will eliminate the
      indirection in some cases where the runtime choice of callee version
      can't be determined statically at compile time.  A correct implementation would need to
      verify that:
       - if the current caller version were selected at runtime, then the
         chosen callee version would be eligible for selection.
       - if any higher priority callee version were selected at runtime, then
         a higher priority caller version would have been eligible for
         selection (and hence the current caller version wouldn't have been
         selected).
      
      The current checks only verify a more restrictive version of the first
      condition, and don't check the second condition at all.
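
The two conditions above can be sketched as follows. This is only an illustration under a simplified, hypothetical model, not the code in multiple_target.cc: each version is reduced to a bitmask of required CPU features, versions are listed in descending priority order, and the names FeatureSet, implies and can_redirect are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: a version is the bitmask of CPU features it requires.
// Each vector lists the versions of one function, highest priority first.
using FeatureSet = std::uint32_t;

// True if every feature in `required` is also present in `available`.
static bool implies(FeatureSet available, FeatureSet required) {
  return (available & required) == required;
}

// True if a call from callers[ci] can be bound statically to callees[di].
bool can_redirect(const std::vector<FeatureSet>& callers, std::size_t ci,
                  const std::vector<FeatureSet>& callees, std::size_t di) {
  // Condition 1: whenever this caller version is selected, the chosen
  // callee version must be eligible.
  if (!implies(callers[ci], callees[di]))
    return false;

  // Condition 2: for each higher-priority callee version, any CPU on which
  // it would be selected alongside this caller must also make some
  // higher-priority caller version eligible.
  for (std::size_t h = 0; h < di; ++h) {
    // Smallest feature set on which both this caller and callee h are eligible.
    FeatureSet runtime = callers[ci] | callees[h];
    bool preempted = false;
    for (std::size_t c = 0; c < ci; ++c) {
      if (implies(runtime, callers[c])) {
        preempted = true;
        break;
      }
    }
    if (!preempted)
      return false;
  }
  return true;
}
```

In this model, a default caller version cannot be bound to the default callee version if the callee has a feature-specific version that no higher-priority caller version covers, because the runtime choice then depends on the CPU.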
      
      Fixing the optimisation properly would require implementing target hooks
      to check for implications between version attributes, which is too
      complicated for this stage.  However, I would like to see this hook
      implemented in the future, since it could also help deduplicate other
      multiversioning code.
      
      Since this behaviour has existed for x86 and powerpc for a while, I
      think it's best to preserve the existing behaviour on those targets,
      unless any maintainer for those targets disagrees.
      
      gcc/ChangeLog:
      
      	* multiple_target.cc
      	(redirect_to_specific_clone): Assert that "target" attribute is
      	used for FMV before checking it.
      	(ipa_target_clone): Skip redirect_to_specific_clone on some
      	targets.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.target/aarch64/mv-pragma.C: New test.
      21212f08
    • Andrew Carlotti's avatar
      docs: Add new AArch64 flags · abbe2905
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* doc/invoke.texi: Add new AArch64 flags.
      abbe2905
    • Andrew Carlotti's avatar
      aarch64: Add new +xs flag · f06c6f8b
      Andrew Carlotti authored
      GCC does not emit tlbi instructions, so this only affects the flags
      passed through to the assembler.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_7A): Add XS.
      	* config/aarch64/aarch64-option-extensions.def (XS): New flag.
      f06c6f8b
    • Andrew Carlotti's avatar
      aarch64: Add new +wfxt flag · 4984119b
      Andrew Carlotti authored
      GCC does not currently emit the wfet or wfit instructions, so this
      primarily affects the flags passed through to the assembler.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_7A): Add WFXT.
      	* config/aarch64/aarch64-option-extensions.def (WFXT): New flag.
      4984119b
    • Andrew Carlotti's avatar
      aarch64: Add new +rcpc2 flag · 5747c121
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_4A): Add RCPC2.
      	* config/aarch64/aarch64-option-extensions.def
      	(RCPC2): New flag.
      	(RCPC3): Add RCPC2 dependency.
      	* config/aarch64/aarch64.h (TARGET_RCPC2): Use new flag.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add rcpc2 to
      	expected feature string instead of rcpc.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
      5747c121
    • Andrew Carlotti's avatar
      aarch64: Add new +flagm2 flag · f5915726
      Andrew Carlotti authored
      GCC does not currently emit the axflag or xaflag instructions, so this
      primarily affects the flags passed through to the assembler.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_5A): Add FLAGM2.
      	* config/aarch64/aarch64-option-extensions.def (FLAGM2): New flag.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add flagm2 to
      	expected feature string instead of flagm.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
      f5915726
    • Andrew Carlotti's avatar
      aarch64: Add new +frintts flag · 32a45a21
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_5A): Add FRINTTS.
      	* config/aarch64/aarch64-option-extensions.def (FRINTTS): New flag.
      	* config/aarch64/aarch64.h (TARGET_FRINT): Use new flag.
      	* config/aarch64/arm_acle.h: Use new flag for frintts intrinsics.
      	* config/aarch64/arm_neon.h: Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add frintts to
      	expected feature string.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
      32a45a21
    • Andrew Carlotti's avatar
      aarch64: Add new +jscvt flag · 2c891357
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_3A): Add JSCVT.
      	* config/aarch64/aarch64-option-extensions.def (JSCVT): New flag.
      	* config/aarch64/aarch64.h (TARGET_JSCVT): Use new flag.
      	* config/aarch64/arm_acle.h: Use new flag for jscvt intrinsics.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add jscvt to
      	expected feature string.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
      2c891357
    • Andrew Carlotti's avatar
      aarch64: Add new +fcma flag · 9bbb91e8
      Andrew Carlotti authored
      This includes +fcma as a dependency of +sve, and means that we can
      finally support fcma intrinsics on a64fx.
      
      Also add fcma to the Features list in several cpunative testcases that
      incorrectly included sve without fcma.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_3A): Add FCMA.
      	* config/aarch64/aarch64-option-extensions.def (FCMA): New flag.
      	(SVE): Add FCMA dependency.
      	* config/aarch64/aarch64.h (TARGET_COMPLEX): Use new flag.
      	* config/aarch64/arm_neon.h: Use new flag for fcma intrinsics.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/info_15: Add fcma to Features.
      	* gcc.target/aarch64/cpunative/info_16: Ditto.
      	* gcc.target/aarch64/cpunative/info_17: Ditto.
      	* gcc.target/aarch64/cpunative/info_8: Ditto.
      	* gcc.target/aarch64/cpunative/info_9: Ditto.
      9bbb91e8
    • Andrew Carlotti's avatar
      aarch64: Use PAUTH instead of V8_3A in some places · 20385cb9
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.cc
      	(aarch64_expand_epilogue): Use TARGET_PAUTH.
      	* config/aarch64/aarch64.md: Update comment.
      20385cb9
    • Jakub Jelinek's avatar
      c: Fix up expr location for __builtin_stdc_rotate_* [PR118376] · 76b7f60f
      Jakub Jelinek authored
      Seems I forgot to set_c_expr_source_range for the __builtin_stdc_rotate_*
      case (the other __builtin_stdc_* cases already have it), which means
      the locations in expr are uninitialized, sometimes causing ICEs in linemap
      code, at other times just valgrind errors about uninitialized var uses.
      
      2025-01-10  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/118376
      	* c-parser.cc (c_parser_postfix_expression): Call
      	set_c_expr_source_range before break in the __builtin_stdc_rotate_*
      	case.
      
      	* gcc.dg/pr118376.c: New test.
      76b7f60f