  1. Jan 10, 2025
    • libstdc++: Fix unused parameter warnings in <bits/atomic_futex.h> · c9353e0f
      Jonathan Wakely authored
      This fixes warnings like the following during bootstrap:
      
      sparc-sun-solaris2.11/libstdc++-v3/include/bits/atomic_futex.h:324:53: warning: unused parameter ‘__mo’ [-Wunused-parameter]
        324 |     _M_load_when_equal(unsigned __val, memory_order __mo)
            |                                        ~~~~~~~~~~~~~^~~~
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/atomic_futex.h (__atomic_futex_unsigned): Remove
      	names of unused parameters in non-futex implementation.
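A minimal sketch of the technique, not the libstdc++ code itself: leaving a parameter unnamed keeps the signature intact while avoiding -Wunused-parameter. The function name `load_when_equal_fallback` is hypothetical and only stands in for a non-futex fallback that ignores the requested memory order.

```cpp
#include <atomic>

// Hypothetical stand-in for a fallback implementation that has no use
// for the memory order.  The second parameter is deliberately unnamed,
// so the compiler has nothing to report as unused.
inline unsigned
load_when_equal_fallback(const unsigned& data, std::memory_order /* unused */)
{
  return data;
}
```

Callers are unaffected, since the parameter type is still part of the signature.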
    • c++: add fixed test [PR118391] · d2017159
      Marek Polacek authored
      Fixed by r15-6740.
      
      	PR c++/118391
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/lambda-uneval20.C: New test.
    • libatomic: Cleanup AArch64 ifunc selection · 81bcf412
      Wilco Dijkstra authored
      Simplify and clean up the ifunc selection logic.  Since LRCPC3 does
      not imply LSE2, has_rcpc3() should also check that LSE2 is enabled.
      
      Passes regress and bootstrap, OK for commit?
      
      libatomic:
      	* config/linux/aarch64/host-config.h (has_lse2): Cleanup.
      	(has_lse128): Likewise.
      	(has_rcpc3): Add early check for LSE2.
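The early check can be sketched as follows. This is only an illustration under stated assumptions: the feature bit values and the single `features` argument are hypothetical and do not mirror the real host-config.h interface.

```cpp
#include <cstdint>

// Hypothetical feature bits; the real values come from the platform's
// hwcap definitions.
constexpr std::uint64_t FEAT_LSE2  = 1u << 0;
constexpr std::uint64_t FEAT_RCPC3 = 1u << 1;

constexpr bool has_lse2(std::uint64_t features)
{
  return (features & FEAT_LSE2) != 0;
}

// RCPC3 does not imply LSE2, so reject early when LSE2 is absent.
constexpr bool has_rcpc3(std::uint64_t features)
{
  if (!has_lse2(features))
    return false;
  return (features & FEAT_RCPC3) != 0;
}
```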
    • testsuite: arm: Add pattern for armv8-m.base to cmse-15.c test · cfd7c54b
      Torbjörn SVENSSON authored
      
      Since armv8-m.base uses thumb1, which does not support sibcall/tailcall,
      a pattern is needed that uses PUSH/BL/POP sequence instead of a single
      B instruction to reuse an already existing function in the compile unit.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/arm/cmse/cmse-15.c: Added pattern for armv8-m.base.
      
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
    • Do not call cp_parser_omp_dispatch directly in cp_parser_pragma · b5a67989
      Paul-Antoine Arras authored
      This is a followup to
      ed49709a OpenMP: C++ front-end support for dispatch + adjust_args.
      
      The call to cp_parser_omp_dispatch only belongs in cp_parser_omp_construct. In
      cp_parser_pragma, handle PRAGMA_OMP_DISPATCH by calling cp_parser_omp_construct.
      
      gcc/cp/ChangeLog:
      
      	* parser.cc (cp_parser_pragma): Replace call to cp_parser_omp_dispatch
      	with cp_parser_omp_construct and check context.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/gomp/dispatch-8.C: New test.
    • c++: Fix ICE with invalid defaulted operator <=> [PR118387] · 4c688399
      Jakub Jelinek authored
      In the following testcase there are 2 issues, one is that B doesn't
      have operator<=> and the other is that A's operator<=> has int return
      type, i.e. not the standard comparison category.
      Because of the int return type, retcat is cc_last; when we first
      try to synthetize it, it is therefore with tentative false and complain
      tf_none, we find that B doesn't have operator<=> and because retcat isn't
      tc_last, don't try to search for other operators in genericize_spaceship.
      And then mark the operator deleted.
      When trying to explain the use of the deleted operator, tentative is still
      false, but complain is tf_error_or_warning.
      do_one_comp will first do:
        tree comp = build_new_op (loc, code, flags, lhs, rhs,
                                  NULL_TREE, NULL_TREE, &overload,
                                  tentative ? tf_none : complain);
      and because complain isn't tf_none, it will actually diagnose the bug
      already, but then (tentative || complain) is true and we call
      genericize_spaceship, which has
        if (tag == cc_last && is_auto (type))
          {
      ...
          }
      
        gcc_checking_assert (tag < cc_last);
      and because tag is cc_last and type isn't auto, we just ICE on that
      assertion.
      
      The patch fixes it by returning error_mark_node from genericize_spaceship
      instead of failing the assertion.
      
      Note, the PR raises another problem.
      If on the same testcase the B b; line is removed, we silently synthetize
      operator<=> which will crash at runtime due to returning without a return
      statement.  That is because the standard says that in that case
      it should return static_cast<int>(std::strong_ordering::equal);
      but I can't find anywhere wording which would say that if that isn't
      valid, the function is deleted.
      https://eel.is/c++draft/class.compare#class.spaceship-2.2
      seems to talk just about cases where there are some members and their
      comparison is invalid it is deleted, but here there are none and it
      follows
      https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
      So, we synthetize with tf_none, see the static_cast is invalid, don't
      add error_mark_node statement silently, but as the function isn't deleted,
      we just silently emit it.
      Should the standard be amended to say that the operator should be deleted
      even if it has no elements and the static cast from
      https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
      is invalid?
      
      2025-01-10  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/118387
      	* method.cc (genericize_spaceship): For tag == cc_last if
      	type is not auto just return error_mark_node instead of failing
      	checking assertion.
      
      	* g++.dg/cpp2a/spaceship-synth17.C: New test.
    • c++: modules and DECL_REPLACEABLE_P · e86daddb
      Jason Merrill authored
      We need to remember that the ::operator new is replaceable to avoid a bogus
      error about __builtin_operator_new finding a non-replaceable function.
      
      This affected __get_temporary_buffer in stl_tempbuf.h.
      
      gcc/cp/ChangeLog:
      
      	* module.cc (trees_out::core_bools): Write replaceable_operator.
      	(trees_in::core_bools): Read it.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/modules/operator-2_a.C: New test.
      	* g++.dg/modules/operator-2_b.C: New test.
    • Fix some memory leaks · 9193641d
      Richard Biener authored
      The following fixes memory leaks found compiling SPEC CPU 2017 with
      valgrind.
      
      	* df-core.cc (rest_of_handle_df_finish): Release dflow for
      	problems without free function (like LR).
      	* gimple-crc-optimization.cc (crc_optimization::loop_may_calculate_crc):
      	Release loop_bbs on all exits.
      	* tree-vectorizer.h (supportable_indirect_convert_operation): Change.
      	* tree-vect-generic.cc (expand_vector_conversion): Adjust.
      	* tree-vect-stmts.cc (vectorizable_conversion): Use auto_vec for
      	converts.
      	(supportable_indirect_convert_operation): Get a reference to
      	the output vector of converts.
    • [PR118017][LRA]: Fix test for i686 · 94d8de53
      Vladimir N. Makarov authored
      My previous patch for PR118017 contains a test which fails on i686.  The patch fixes this.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/118017
      	* gcc.target/i386/pr118017.c: Check target int128.
    • arm: [MVE intrinsics] Fix tuples field name (PR 118332) · 288ac095
      Christophe Lyon authored
      The previous fix only worked for C; for C++ we need to add more
      information to the underlying type so that
      finish_class_member_access_expr accepts it.
      
      We use the same logic as in aarch64's register_tuple_type for AdvSIMD
      tuples.
      
      This patch makes gcc.target/arm/mve/intrinsics/pr118332.c pass in C++
      mode.
      
      gcc/ChangeLog:
      
      	PR target/118332
      	* config/arm/arm-mve-builtins.cc (wrap_type_in_struct): Delete.
      	(register_type_decl): Delete.
      	(register_builtin_tuple_types): Use
      	lang_hooks.types.simulate_record_decl.
    • Fix bootstrap on !HARDREG_PRE_REGNOS targets · 55341185
      Richard Biener authored
      Pushed as obvious.
      
      	* gcse.cc (pass_hardreg_pre::gate): Wrap possibly unused
      	fun argument.
    • rtl-optimization/117467 - limit ext-dce memory use · 03faac50
      Richard Biener authored
      The following puts in a hard limit on ext-dce because it might end
      up requiring memory on the order of the number of basic blocks
      times the number of pseudo registers.  The limiting follows what
      GCSE based passes do and thus I re-use --param max-gcse-memory here.
      
      This doesn't in any way address the implementation issues of the pass,
      but it reduces the memory-use when compiling the
      module_first_rk_step_part1.F90 TU from 521.wrf_r from 25GB to 1GB.
      
      	PR rtl-optimization/117467
      	PR rtl-optimization/117934
      	* ext-dce.cc (ext_dce_execute): Do nothing if a memory
      	allocation estimate exceeds what is allowed by
      	--param max-gcse-memory.
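The kind of guard described above can be sketched like this. It is an illustration only, not the code in ext-dce.cc: the helper name, the `bytes_per_entry` parameter, and the assumption that the limit is expressed in kilobytes are all choices made for this sketch.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when a pass whose working set grows
// like (basic blocks x pseudo registers) should be skipped.
// The estimate ignores overflow, which is acceptable for a sketch.
constexpr bool
exceeds_memory_budget(std::uint64_t n_basic_blocks,
                      std::uint64_t n_pseudos,
                      std::uint64_t bytes_per_entry,
                      std::uint64_t limit_kb)
{
  const std::uint64_t estimate_kb
    = n_basic_blocks * n_pseudos * bytes_per_entry / 1024;
  return estimate_kb > limit_kb;
}
```

A pass using such a guard would simply return early, doing nothing, when the estimate exceeds the limit.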
    • c++: ICE with pack indexing and partial inst [PR117937] · d6444794
      Marek Polacek authored
      Here we ICE in expand_expr_real_1:
      
            if (exp)
              {
                tree context = decl_function_context (exp);
                gcc_assert (SCOPE_FILE_SCOPE_P (context)
                            || context == current_function_decl
      
      on something like this test:
      
        void
        f (auto... args)
        {
          [&]<size_t... i>(seq<i...>) {
      	g(args...[i]...);
          }(seq<0>());
        }
      
      because while current_function_decl is:
      
        f<int>(int)::<lambda(seq<i ...>)> [with long unsigned int ...i = {0}]
      
      (correct), context is:
      
        f<int>(int)::<lambda(seq<i ...>)>
      
      which is only the partial instantiation.
      
      I think that when tsubst_pack_index gets a partial instantiation, e.g.
      {*args#0} as the pack, we should still tsubst it.  The args#0's value-expr
      can be __closure->__args#0 where the closure's context is the partially
      instantiated operator().  So we should let retrieve_local_specialization
      find the right args#0.
      
      	PR c++/117937
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (tsubst_pack_index): tsubst the pack even when it's not
      	PACK_EXPANSION_P.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp26/pack-indexing13.C: New test.
      	* g++.dg/cpp26/pack-indexing14.C: New test.
    • s390: Add expander for uaddc/usubc optabs · 8a2d5bc2
      Stefan Schulze Frielinghaus authored
      gcc/ChangeLog:
      
      	* config/s390/s390-protos.h (s390_emit_compare): Add mode
      	parameter for the resulting RTX.
      	* config/s390/s390.cc (s390_emit_compare): Ditto.
      	(s390_emit_compare_and_swap): Change.
      	(s390_expand_vec_strlen): Change.
      	(s390_expand_cs_hqi): Change.
      	(s390_expand_split_stack_prologue): Change.
      	* config/s390/s390.md (*add<mode>3_carry1_cc): Renamed to ...
      	(add<mode>3_carry1_cc): this and in order to use the
      	corresponding gen function, encode CC mode into pattern.
      	(*sub<mode>3_borrow_cc): Renamed to ...
      	(sub<mode>3_borrow_cc): this and in order to use the
      	corresponding gen function, encode CC mode into pattern.
      	(*add<mode>3_alc_carry1_cc): Renamed to ...
      	(add<mode>3_alc_carry1_cc): this and in order to use the
      	corresponding gen function, encode CC mode into pattern.
      	(sub<mode>3_slb_borrow1_cc): New.
      	(uaddc<mode>5): New.
      	(usubc<mode>5): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/s390/uaddc-1.c: New test.
      	* gcc.target/s390/uaddc-2.c: New test.
      	* gcc.target/s390/uaddc-3.c: New test.
      	* gcc.target/s390/usubc-1.c: New test.
      	* gcc.target/s390/usubc-2.c: New test.
      	* gcc.target/s390/usubc-3.c: New test.
    • docs: Document new hardreg PRE pass · 016e2f00
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* doc/passes.texi: Document hardreg PRE pass.
    • Add new hardreg PRE pass · e7f98d96
      Andrew Carlotti authored
      This pass is used to optimise assignments to the FPMR register in
      aarch64.  I chose to implement this as a middle-end pass because it
      mostly reuses the existing RTL PRE code within gcse.cc.
      
      Compared to RTL PRE, the key difference in this new pass is that we
      insert new writes directly to the destination hardreg, instead of
      writing to a new pseudo-register and copying the result later.  This
      requires changes to the analysis portion of the pass, because sets
      cannot be moved before existing instructions that set, use or clobber
      the hardreg, and the value becomes unavailable after any uses or
      clobbers of the hardreg.
      
      Any uses of the hardreg in debug insns will be deleted.  We could do
      better than this, but for the aarch64 fpmr I don't think we emit useful
      debuginfo for deleted fp8 instructions anyway (and I don't even know if
      it's possible to have a debug fpmr use when entering hardreg PRE).
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.h (HARDREG_PRE_REGNOS): New macro.
      	* gcse.cc (doing_hardreg_pre_p): New global variable.
      	(do_load_motion): New boolean check.
      	(current_hardreg_regno): New global variable.
      	(compute_local_properties): Unset transp for hardreg clobbers.
      	(prune_hardreg_uses): New function.
      	(want_to_gcse_p): Use different checks for hardreg PRE.
      	(oprs_unchanged_p): Disable load motion for hardreg PRE pass.
      	(hash_scan_set): For hardreg PRE, skip non-hardreg sets and
      	check for hardreg clobbers.
      	(record_last_mem_set_info): Skip for hardreg PRE.
      	(compute_pre_data): Prune hardreg uses from transp bitmap.
      	(pre_expr_reaches_here_p_work): Add sentence to comment.
      	(insert_insn_start_basic_block): New functions.
      	(pre_edge_insert): Don't add hardreg sets to predecessor block.
      	(pre_delete): Use hardreg for the reaching reg.
      	(reset_hardreg_debug_uses): New function.
      	(pre_gcse): For hardreg PRE, reset debug uses and don't insert
      	copies.
      	(one_pre_gcse_pass): Disable load motion for hardreg PRE.
      	(execute_hardreg_pre): New.
      	(class pass_hardreg_pre): New.
      	(pass_hardreg_pre::gate): New.
      	(make_pass_hardreg_pre): New.
      	* passes.def (pass_hardreg_pre): New pass.
      	* tree-pass.h (make_pass_hardreg_pre): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/acle/fpmr-1.c: New test.
      	* gcc.target/aarch64/acle/fpmr-2.c: New test.
      	* gcc.target/aarch64/acle/fpmr-3.c: New test.
      	* gcc.target/aarch64/acle/fpmr-4.c: New test.
    • Disable a broken multiversioning optimisation · 21212f08
      Andrew Carlotti authored
      This patch skips redirect_to_specific clone for aarch64 and riscv,
      because the optimisation has two flaws:
      
      1. It checks the value of the "target" attribute, even on targets that
      don't use this attribute for multiversioning.
      
      2. The algorithm used is too aggressive, and will eliminate the
      indirection in some cases where the runtime choice of callee version
      can't be determined statically at compile time.  A correct check would need to
      verify that:
       - if the current caller version were selected at runtime, then the
         chosen callee version would be eligible for selection.
       - if any higher priority callee version were selected at runtime, then
         a higher priority caller version would have been eligible for
         selection (and hence the current caller version wouldn't have been
         selected).
      
      The current checks only verify a more restrictive version of the first
      condition, and don't check the second condition at all.
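The two conditions can be illustrated with a brute-force model. This is a sketch under stated assumptions, not the proposed target hook: versions are modelled as a required-feature bitmask plus a priority, the runtime is assumed to pick the highest-priority eligible version, and all names here are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Version {
  std::uint32_t required;  // feature bits this version needs
  int priority;            // higher wins; assumed distinct within a set
};

// Index of the version picked at run time for the given CPU features,
// or -1 if no version is eligible.
inline int
select_version(const std::vector<Version>& versions, std::uint32_t cpu)
{
  int best = -1;
  for (std::size_t i = 0; i < versions.size(); ++i)
    {
      const bool eligible = (versions[i].required & ~cpu) == 0;
      if (eligible
          && (best < 0 || versions[i].priority > versions[best].priority))
        best = static_cast<int>(i);
    }
  return best;
}

// A direct call from callers[caller] to callees[callee] is safe only if,
// for every CPU on which that caller version is selected, that callee
// version is selected too.  Enumerates all feature sets, so it assumes
// n_feature_bits is small (well below 32).
inline bool
redirect_is_safe(const std::vector<Version>& callers, int caller,
                 const std::vector<Version>& callees, int callee,
                 unsigned n_feature_bits)
{
  for (std::uint32_t cpu = 0; cpu < (1u << n_feature_bits); ++cpu)
    if (select_version(callers, cpu) == caller
        && select_version(callees, cpu) != callee)
      return false;
  return true;
}
```

With feature bits A and B, a caller set {default, A} and a callee set {default, A, B with the highest priority}, redirecting the A caller to the A callee is unsafe: on a CPU with both A and B the caller's A version runs, but the runtime would have picked the callee's B version. That is the second condition in the list above.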
      
      Fixing the optimisation properly would require implementing target hooks
      to check for implications between version attributes, which is too
      complicated for this stage.  However, I would like to see this hook
      implemented in the future, since it could also help deduplicate other
      multiversioning code.
      
      Since this behaviour has existed for x86 and powerpc for a while, I
      think it's best to preserve the existing behaviour on those targets,
      unless any maintainer for those targets disagrees.
      
      gcc/ChangeLog:
      
      	* multiple_target.cc
      	(redirect_to_specific_clone): Assert that "target" attribute is
      	used for FMV before checking it.
      	(ipa_target_clone): Skip redirect_to_specific_clone on some
      	targets.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.target/aarch64/mv-pragma.C: New test.
    • docs: Add new AArch64 flags · abbe2905
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* doc/invoke.texi: Add new AArch64 flags.
    • aarch64: Add new +xs flag · f06c6f8b
      Andrew Carlotti authored
      GCC does not emit tlbi instructions, so this only affects the flags
      passed through to the assembler.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_7A): Add XS.
      	* config/aarch64/aarch64-option-extensions.def (XS): New flag.
    • aarch64: Add new +wfxt flag · 4984119b
      Andrew Carlotti authored
      GCC does not currently emit the wfet or wfit instructions, so this
      primarily affects the flags passed through to the assembler.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_7A): Add WFXT.
      	* config/aarch64/aarch64-option-extensions.def (WFXT): New flag.
    • aarch64: Add new +rcpc2 flag · 5747c121
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_4A): Add RCPC2.
      	* config/aarch64/aarch64-option-extensions.def
      	(RCPC2): New flag.
      	(RCPC3): Add RCPC2 dependency.
      	* config/aarch64/aarch64.h (TARGET_RCPC2): Use new flag.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add rcpc2 to
      	expected feature string instead of rcpc.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
    • aarch64: Add new +flagm2 flag · f5915726
      Andrew Carlotti authored
      GCC does not currently emit the axflag or xaflag instructions, so this
      primarily affects the flags passed through to the assembler.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_5A): Add FLAGM2.
      	* config/aarch64/aarch64-option-extensions.def (FLAGM2): New flag.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add flagm2 to
      	expected feature string instead of flagm.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
    • aarch64: Add new +frintts flag · 32a45a21
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_5A): Add FRINTTS.
      	* config/aarch64/aarch64-option-extensions.def (FRINTTS): New flag.
      	* config/aarch64/aarch64.h (TARGET_FRINT): Use new flag.
      	* config/aarch64/arm_acle.h: Use new flag for frintts intrinsics.
      	* config/aarch64/arm_neon.h: Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add frintts to
      	expected feature string.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
    • aarch64: Add new +jscvt flag · 2c891357
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_3A): Add JSCVT.
      	* config/aarch64/aarch64-option-extensions.def (JSCVT): New flag.
      	* config/aarch64/aarch64.h (TARGET_JSCVT): Use new flag.
      	* config/aarch64/arm_acle.h: Use new flag for jscvt intrinsics.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add jscvt to
      	expected feature string.
      	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
    • aarch64: Add new +fcma flag · 9bbb91e8
      Andrew Carlotti authored
      This includes +fcma as a dependency of +sve, and means that we can
      finally support fcma intrinsics on a64fx.
      
      Also add fcma to the Features list in several cpunative testcases that
      incorrectly included sve without fcma.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-arches.def (V8_3A): Add FCMA.
      	* config/aarch64/aarch64-option-extensions.def (FCMA): New flag.
      	(SVE): Add FCMA dependency.
      	* config/aarch64/aarch64.h (TARGET_COMPLEX): Use new flag.
      	* config/aarch64/arm_neon.h: Use new flag for fcma intrinsics.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/cpunative/info_15: Add fcma to Features.
      	* gcc.target/aarch64/cpunative/info_16: Ditto.
      	* gcc.target/aarch64/cpunative/info_17: Ditto.
      	* gcc.target/aarch64/cpunative/info_8: Ditto.
      	* gcc.target/aarch64/cpunative/info_9: Ditto.
    • aarch64: Use PAUTH instead of V8_3A in some places · 20385cb9
      Andrew Carlotti authored
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.cc
      	(aarch64_expand_epilogue): Use TARGET_PAUTH.
      	* config/aarch64/aarch64.md: Update comment.
    • c: Fix up expr location for __builtin_stdc_rotate_* [PR118376] · 76b7f60f
      Jakub Jelinek authored
      Seems I forgot to set_c_expr_source_range for the __builtin_stdc_rotate_*
      case (the other __builtin_stdc_* cases already have it), which means
      the locations in expr are uninitialized, sometimes causing ICEs in linemap
      code, at other times just valgrind errors about uninitialized var uses.
      
      2025-01-10  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/118376
      	* c-parser.cc (c_parser_postfix_expression): Call
      	set_c_expr_source_range before break in the __builtin_stdc_rotate_*
      	case.
      
      	* gcc.dg/pr118376.c: New test.
    • rtl: Remove invalid compare simplification [PR117186] · 06c4cf39
      Richard Sandiford authored
      g:d882fe51, posted at
      https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
      added code to treat:
      
        (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))
      
      as a nop.  This PR shows that that isn't always correct.
      The compare in the set above is between two 0/1 booleans (at least
      on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that
      produced the incoming (reg:CC cc) is unconstrained; it could be between
      arbitrary integers, or even floats.  The fold is therefore replacing a
      cc that is valid for both signed and unsigned comparisons with one that
      is only known to be valid for signed comparisons.
      
        (gt (compare (gt cc 0) (lt cc 0)) 0)
      
      does simplify to:
      
        (gt cc 0)
      
      but:
      
        (gtu (compare (gt cc 0) (lt cc 0)) 0)
      
      does not simplify to:
      
        (gtu cc 0)
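The difference can be checked on concrete values. The following is a minimal sketch that assumes the incoming cc came from comparing two 32-bit integers and that STORE_FLAG_VALUE is 1; the function names are hypothetical.

```cpp
#include <cstdint>

// (gt cc 0) and (lt cc 0) as 0/1 booleans for the original operands.
constexpr int gt_flag(std::int32_t a, std::int32_t b) { return a > b; }
constexpr int lt_flag(std::int32_t a, std::int32_t b) { return a < b; }

// Signed "greater than" on the compare of the two flags.
constexpr bool gt_of_flags(std::int32_t a, std::int32_t b)
{
  return gt_flag(a, b) > lt_flag(a, b);
}

// Unsigned "greater than" on the compare of the two flags.
constexpr bool gtu_of_flags(std::int32_t a, std::int32_t b)
{
  return static_cast<unsigned>(gt_flag(a, b))
         > static_cast<unsigned>(lt_flag(a, b));
}

// Unsigned "greater than" on the original operands, i.e. (gtu cc 0).
constexpr bool gtu_of_operands(std::int32_t a, std::int32_t b)
{
  return static_cast<std::uint32_t>(a) > static_cast<std::uint32_t>(b);
}

// The signed form agrees with the original signed comparison...
static_assert(gt_of_flags(-1, 1) == (-1 > 1), "signed fold is valid");
// ...but the unsigned form does not agree with the unsigned comparison.
static_assert(gtu_of_flags(-1, 1) != gtu_of_operands(-1, 1),
              "unsigned fold is invalid");
```

For a = -1 and b = 1 the flags are 0 and 1, so the unsigned test on the flags is false, while the unsigned comparison of the operands themselves is true.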
      
      The optimisation didn't come with a testcase, but it was added for
      i386's cmpstrsi, now cmpstrnsi.  That probably doesn't matter as much
      as it once did, since it's now conditional on -minline-all-stringops.
      But the patch is almost 25 years old, so whatever the original
      motivation was, it seems likely that other things now rely on it.
      
      It therefore seems better to try to preserve the optimisation on rtl
      rather than get rid of it.  To do that, we need to look at how the
      result of the outer compare is used.  We'd therefore be looking at four
      instructions (the gt, the lt, the compare, and the use of the compare),
      but combine already allows that for 3-instruction combinations thanks
      to:
      
        /* If the source is a COMPARE, look for the use of the comparison result
           and try to simplify it unless we already have used undobuf.other_insn.  */
      
      When applied to boolean inputs, a comparison operator is
      effectively a boolean logical operator (AND, ANDNOT, XOR, etc.).
      simplify_logical_relational_operation already had code to simplify
      logical operators between two comparison results, but:
      
      * It only handled IOR, which doesn't cover all the cases needed here.
        The others are easily added.
      
      * It treated comparisons of integers as having an ORDERED/UNORDERED result.
        Therefore:
      
        * it would not treat "true for LT + EQ + GT" as "always true" for
          comparisons between integers, because the mask excluded the UNORDERED
          condition.
      
        * it would try to convert "true for LT + GT" into LTGT even for comparisons
          between integers.  To prevent an ICE later, the code used:
      
             /* Many comparison codes are only valid for certain mode classes.  */
             if (!comparison_code_valid_for_mode (code, mode))
               return 0;
      
          However, this used the wrong mode, since "mode" is here the integer
          result of the comparisons (and the mode of the IOR), not the mode of
          the things being compared.  Thus the effect was to reject all
          floating-point-only codes, even when comparing floats.
      
        I think instead the code should detect whether the comparison is between
        integer values and remove UNORDERED from consideration if so.  It then
        always produces a valid comparison (or an always true/false result),
        and so comparison_code_valid_for_mode is not needed.  In particular,
        "true for LT + GT" becomes NE for comparisons between integers but
        remains LTGT for comparisons between floats.
      
      * There was a missing check for whether the comparison inputs had
        side effects.
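The mask idea can be sketched as follows. This is an illustration, not the simplify-rtx.cc code: the bit assignment and the `describe` helper are hypothetical, and only a few mask values are named.

```cpp
#include <string>

// One bit per possible outcome of comparing X with Y (hypothetical layout).
enum Outcome : unsigned { LT = 1, EQ = 2, GT = 4, UNORDERED = 8 };

// Name the comparison that is true exactly for the outcomes in MASK.
// When the operands are always ordered (integers), UNORDERED cannot
// happen and is removed from consideration.
inline std::string
describe(unsigned mask, bool always_ordered)
{
  const unsigned all
    = always_ordered ? (LT | EQ | GT) : (LT | EQ | GT | UNORDERED);
  mask &= all;
  if (mask == all)
    return "always true";
  if (mask == 0)
    return "always false";
  if (mask == (LT | GT))
    return always_ordered ? "NE" : "LTGT";
  if (mask == (LT | GT | UNORDERED))
    return "NE";
  return "other";
}
```

In this representation IOR, AND and XOR of two comparison results on the same operands become `|`, `&` and `^` on their masks. For example, `describe(LT | GT, true)` yields "NE" while `describe(LT | GT, false)` yields "LTGT".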
      
      While there, it also seemed worth extending
      simplify_logical_relational_operation to unsigned comparisons, since
      that makes the testing easier.
      
      As far as that testing goes: the patch exhaustively tests all
      combinations of integer comparisons in:
      
        (cmp1 (cmp2 X Y) (cmp3 X Y))
      
      for the 10 integer comparisons, giving 1000 fold attempts in total.
      It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1})
      on the result of the fold, giving 9 checks per fold, or 9000 in total.
      That's probably more than is typical for self-tests, but it seems to
      complete in negligible time, even for -O0 builds.
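The checking half of that scheme can be sketched like this. It is a simplified illustration, not the self-test added by the patch: `agree_on_small_inputs` is a hypothetical helper that compares an original predicate with its claimed fold on the nine (X, Y) pairs.

```cpp
#include <functional>

using Pred = std::function<bool(int, int)>;

// Return true if ORIGINAL and FOLDED give the same answer for every
// (x, y) with x and y drawn from {-1, 0, 1}.  Since -1 is the largest
// value when viewed as unsigned, this small set also separates signed
// from unsigned comparisons.
inline bool
agree_on_small_inputs(const Pred& original, const Pred& folded)
{
  for (int x = -1; x <= 1; ++x)
    for (int y = -1; y <= 1; ++y)
      if (original(x, y) != folded(x, y))
        return false;
  return true;
}
```

For instance, "x < y or x > y" agrees with "x != y" on all nine pairs, whereas a signed "x > y" and an unsigned "x > y" disagree at x = -1, y = 0.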
      
      gcc/
      	PR rtl-optimization/117186
      	* rtl.h (simplify_context::simplify_logical_relational_operation): Add
      	an invert0_p parameter.
      	* simplify-rtx.cc (unsigned_comparison_to_mask): New function.
      	(mask_to_unsigned_comparison): Likewise.
      	(comparison_code_valid_for_mode): Delete.
      	(simplify_context::simplify_logical_relational_operation): Add
      	an invert0_p parameter.  Handle AND and XOR.  Handle unsigned
      	comparisons.  Handle always-false results.  Ignore the low bit
      	of the mask if the operands are always ordered and remove the
      	then-redundant check of comparison_code_valid_for_mode.  Check
      	for side-effects in the operands before simplifying them away.
      	(simplify_context::simplify_binary_operation_1): Remove
      	simplification of (compare (gt ...) (lt ...)) and instead...
      	(simplify_context::simplify_relational_operation_1): ...handle
      	comparisons of comparisons here.
      	(test_comparisons): New function.
      	(test_scalar_ops): Call it.
      
      gcc/testsuite/
      	PR rtl-optimization/117186
      	* gcc.dg/torture/pr117186.c: New test.
      	* gcc.target/aarch64/pr117186.c: Likewise.
    • [ifcombine] drop other misuses of uniform_integer_cst_p · 47ac6ca9
      Alexandre Oliva authored
      As Jakub pointed out in PR118206, the use of uniform_integer_cst_p in
      ifcombine makes no sense: we're not dealing with vectors.  Indeed,
      I've been misunderstanding and misusing it since I cut&pasted it from
      some preexisting match predicate in an earlier version of the ifcombine
      field-merge patch.
      
      
      for  gcc/ChangeLog
      
      	* gimple-fold.cc (decode_field_reference): Drop misuses of
      	uniform_integer_cst_p.
      	(fold_truth_andor_for_ifcombine): Likewise.
    • [ifcombine] fix mask variable test to match use [PR118344] · fd4e979d
      Alexandre Oliva authored
      There was a cut&pasto in the rr_and_mask's adjustment to match the
      combined type: the test on whether there was a mask already was
      testing the wrong variable, and then it might crash or otherwise fail
      accessing an undefined mask.  This only hit with checking enabled,
      and rarely at that.
      
      
      for  gcc/ChangeLog
      
      	PR tree-optimization/118344
      	* gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix typo in
      	rr_and_mask's type adjustment test.
      
      for  gcc/testsuite/ChangeLog
      
      	PR tree-optimization/118344
      	* gcc.dg/field-merge-19.c: New.
    • [ifcombine] reuse left-hand mask to decode right-hand xor operand · 740c8497
      Alexandre Oliva authored
      If fold_truth_andor_for_ifcombine applies a mask to an xor, say
      because the result of the xor is compared with a power of two [minus
      one], we have to apply the same mask when processing both the left-
      and right-hand xor paths for the transformation to be sound.  Arrange
      for decode_field_reference to propagate the incoming mask along with
      the expression to the right-hand operand.
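The underlying identity can be illustrated on plain integers. This is a sketch only, with hypothetical function names, and it says nothing about how gimple-fold.cc represents the operands.

```cpp
#include <cstdint>

// Mask applied to the result of the xor.
constexpr std::uint32_t
mask_after_xor(std::uint32_t a, std::uint32_t b, std::uint32_t mask)
{
  return (a ^ b) & mask;
}

// Same mask applied to both xor operands: always equal to the above.
constexpr std::uint32_t
mask_both_operands(std::uint32_t a, std::uint32_t b, std::uint32_t mask)
{
  return (a & mask) ^ (b & mask);
}

// Mask applied to the left operand only: not equivalent in general.
constexpr std::uint32_t
mask_left_only(std::uint32_t a, std::uint32_t b, std::uint32_t mask)
{
  return (a & mask) ^ b;
}

static_assert(mask_after_xor(0x12, 0x34, 0x0f)
              == mask_both_operands(0x12, 0x34, 0x0f), "sound");
static_assert(mask_after_xor(0x12, 0x34, 0x0f)
              != mask_left_only(0x12, 0x34, 0x0f), "unsound");
```

With a = 0x12, b = 0x34 and mask = 0x0f, the first two forms give 0x06 while the left-only form gives 0x36.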
      
      Don't require the right-hand xor operand to be a constant; that was a
      cut&pasto.
      
      
      for  gcc/ChangeLog
      
      	* gimple-fold.cc (decode_field_reference): Add xor_pand_mask.
      	Propagate pand_mask to the right-hand xor operand.  Don't
      	require the right-hand xor operand to be a constant.
      	(fold_truth_andor_for_ifcombine): Pass right-hand mask when
      	appropriate.
    • [ifcombine] adjust for narrowing converts before shifts [PR118206] · c96a6c2c
      Alexandre Oliva authored
      A narrowing conversion and a shift both drop bits from the loaded
      value, but we need to take into account which one comes first to get
      the right number of bits and mask.
      
      Fold when applying masks to parts, comparing the parts, and combining
      the results, in the odd chance either mask happens to be zero.
      
      
      for  gcc/ChangeLog
      
      	PR tree-optimization/118206
      	* gimple-fold.cc (decode_field_reference): Account for upper
      	bits dropped by narrowing conversions whether before or after
      	a right shift.
      	(fold_truth_andor_for_ifcombine): Fold masks, compares, and
      	combined results.
      
      for  gcc/testsuite/ChangeLog
      
      	PR tree-optimization/118206
      	* gcc.dg/field-merge-18.c: New.
      c96a6c2c
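The ordering issue this commit describes can be seen with ordinary integer arithmetic: a right shift followed by a narrowing conversion keeps different bits of the loaded value than the same two operations in the opposite order. This is a small sketch under that reading of the commit message; the function names are made up for illustration and do not appear in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Shift first, then narrow: the result holds bits 4..11 of X.  */
uint8_t shift_then_narrow (uint32_t x)
{
  return (uint8_t) (x >> 4);
}

/* Narrow first, then shift: the conversion already dropped bits 8
   and up, so only bits 4..7 of X survive.  */
uint8_t narrow_then_shift (uint32_t x)
{
  return (uint8_t) ((uint8_t) x >> 4);
}

/* Worked example for X = 0xABCD.  */
void check_example (void)
{
  assert (shift_then_narrow (0xABCD) == 0xBC);
  assert (narrow_then_shift (0xABCD) == 0x0C);
}
```

In terms of the loaded value, the first form corresponds to the mask 0xFF0 and the second to 0x0F0, which is why the number of bits and the mask depend on which operation comes first.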
    • Alexandre Oliva's avatar
      testsuite: generalized field-merge tests for <32-bit int [PR118025] · d3c91b04
      Alexandre Oliva authored
      Explicitly convert constants to the desired types, so as to not elicit
      warnings about implicit truncations, nor execution errors, on targets
      whose ints are narrower than 32 bits.
      
      
      for  gcc/testsuite/ChangeLog
      
      	PR testsuite/118025
      	* gcc.dg/field-merge-1.c: Convert constants to desired types.
      	* gcc.dg/field-merge-3.c: Likewise.
      	* gcc.dg/field-merge-4.c: Likewise.
      	* gcc.dg/field-merge-5.c: Likewise.
      	* gcc.dg/field-merge-11.c: Likewise.
      	* gcc.dg/field-merge-17.c: Don't mess with padding bits.
      d3c91b04
    • Alexandre Oliva's avatar
      testsuite: generalize ifcombine field-merge tests [PR118025] · 261ffe68
      Alexandre Oliva authored
      A number of tests that check for specific ifcombine transformations
      fail on AVR and PRU targets, whose type sizes and alignments aren't
      conducive to the expected transformations.  Adjust the expectations.
      
      Most execution tests should run successfully regardless of the
      transformations, but a few that could conceivably fail if short and
      char have the same bit width now check for that and bypass the tests
      that would fail.
      
      Conversely, one test that had such a runtime test, but that would work
      regardless, no longer has that runtime test, and its types are
      narrowed so that the transformations on 32-bit targets are more likely
      to be the same as those that used to take place on 64-bit targets.
      This latter change is somewhat obviated by a separate patch, but I've
      left it in place anyway.
      
      
      for  gcc/testsuite/ChangeLog
      
      	PR testsuite/118025
      	* gcc.dg/field-merge-1.c: Skip BIT_FIELD_REF counting on AVR and PRU.
      	* gcc.dg/field-merge-3.c: Bypass the test if short doesn't have the
      	expected size.
      	* gcc.dg/field-merge-8.c: Likewise.
      	* gcc.dg/field-merge-9.c: Likewise.  Skip optimization counting on
      	AVR and PRU.
      	* gcc.dg/field-merge-13.c: Skip optimization counting on AVR and PRU.
      	* gcc.dg/field-merge-15.c: Likewise.
      	* gcc.dg/field-merge-17.c: Likewise.
      	* gcc.dg/field-merge-16.c: Likewise.  Drop runtime bypass.  Use
      	smaller types.
      	* gcc.dg/field-merge-14.c: Add comments.
      261ffe68
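A runtime bypass of the kind mentioned in the ChangeLog above ("Bypass the test if short doesn't have the expected size") might look roughly like the sketch below. The struct, the constants, the expected size of 2, and the function names are all assumptions for illustration; the actual tests may be written differently.

```c
#include <assert.h>

/* Hypothetical pair of adjacent fields that a field-merging
   optimization could compare with a single wider load.  */
struct pair { unsigned short lo; unsigned short hi; };

int fields_match (const struct pair *p)
{
  return p->lo == 0x1234 && p->hi == 0x5678;
}

/* Returns 1 if the check ran, 0 if it was bypassed because short
   does not have the size this sketch assumes.  */
int run_test (void)
{
  struct pair p = { 0x1234, 0x5678 };

  if (sizeof (short) != 2)
    return 0;

  assert (fields_match (&p));
  return 1;
}
```

The bypass returns early instead of failing, so targets with unusual type sizes still report the test as passing.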
    • Alexandre Oliva's avatar
      ifcombine field-merge: improve handling of dwords · 38401c58
      Alexandre Oliva authored
      On 32-bit hosts, data types with 64-bit alignment aren't getting
      treated as desired by ifcombine field-merging: we limit the choice of
      modes at BITS_PER_WORD sizes, but when deciding the boundary for a
      split, we'd limit the choice only by the alignment, so we wouldn't
      even consider a split at an odd 32-bit boundary.  Fix that by limiting
      the boundary choice by word choice as well.
      
      Now, this would still leave misaligned 64-bit fields in 64-bit-aligned
      data structures unhandled by ifcombine on 32-bit hosts.  We already
      need to load them as double words, and if they're not byte-aligned,
      the code gets really ugly, but ifcombine could improve it if it allows
      double-word loads as a last resort.  I've added that.
      
      
      for  gcc/ChangeLog
      
      	* gimple-fold.cc (fold_truth_andor_for_ifcombine): Limit
      	boundary choice by word size as well.  Try aligned double-word
      	loads as a last resort.
      
      for  gcc/testsuite/ChangeLog
      
      	* gcc.dg/field-merge-17.c: New.
      38401c58
    • Martin Jambor's avatar
      ipa-cp: Fold-convert values when necessary (PR 118138) · d019ab4f
      Martin Jambor authored
      PR 118138 and quite a few duplicates that it has acquired in a short
      time show that even though we are careful to make sure we do not lose
      any bits when newly allowing type conversions in jump-functions, we
      still need to perform the fold conversions during IPA constant
      propagation and not just at the end in order to properly perform
      sign-extensions or zero-extensions as appropriate.
      
      This patch does just that, changing a safety predicate we already use
      at the appropriate places to return the necessary type.
      
      gcc/ChangeLog:
      
      2025-01-03  Martin Jambor  <mjambor@suse.cz>
      
      	PR ipa/118138
      	* ipa-cp.cc (ipacp_value_safe_for_type): Return the appropriate
      	type instead of a bool, accept NULL_TREE VALUEs.
      	(propagate_vals_across_arith_jfunc): Use the new returned value of
      	ipacp_value_safe_for_type.
      	(propagate_vals_across_ancestor): Likewise.
      	(propagate_scalar_across_jump_function): Likewise.
      
      gcc/testsuite/ChangeLog:
      
      2025-01-03  Martin Jambor  <mjambor@suse.cz>
      
      	PR ipa/118138
      	* gcc.dg/ipa/pr118138.c: New test.
      d019ab4f
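The distinction between sign-extension and zero-extension that this commit relies on can be illustrated in plain C: the same 8-bit pattern widens to different 32-bit values depending on the signedness of the source type, so a propagated constant has to be converted to the right type at each step. This is only an illustration of the arithmetic, not of the ipa-cp code; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widening a signed 8-bit value sign-extends it.  */
int32_t widen_signed (int8_t v)
{
  return v;
}

/* Widening an unsigned 8-bit value zero-extends it.  */
uint32_t widen_unsigned (uint8_t v)
{
  return v;
}

/* The all-ones 8-bit pattern is -1 when signed and 255 when
   unsigned, and the two widen to different 32-bit patterns.  */
void check_extensions (void)
{
  assert (widen_signed ((int8_t) -1) == -1);
  assert (widen_unsigned ((uint8_t) 0xFF) == 255u);
  assert ((uint32_t) widen_signed ((int8_t) -1) == 0xFFFFFFFFu);
}
```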
    • Thomas Schwinge's avatar
      nvptx: Add '__builtin_frame_address(0)' test case · 86175a64
      Thomas Schwinge authored
      Documenting the status quo.
      
      	gcc/testsuite/
      	* gcc.target/nvptx/__builtin_frame_address_0-1.c: New.
      86175a64
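For readers unfamiliar with the builtin exercised by this test, a minimal use of `__builtin_frame_address (0)` is sketched below. It is written for a typical hosted target and is not taken from the new nvptx test; the helper names are hypothetical, and the non-null assertion is an assumption about such hosts rather than a guarantee of the builtin.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the frame address of this function itself (level 0).
   noinline keeps the call from being merged into the caller.  */
__attribute__ ((noinline)) void *current_frame (void)
{
  return __builtin_frame_address (0);
}

/* Prints the address; the assertion reflects what is expected on a
   typical hosted target, which is an assumption of this sketch.  */
void report_frame (void)
{
  void *frame = current_frame ();
  assert (frame != NULL);
  printf ("frame address: %p\n", frame);
}
```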
    • Thomas Schwinge's avatar
      nvptx: Add '__builtin_stack_address()' test case · 91dec10f
      Thomas Schwinge authored
      Documenting the status quo.
      
      	gcc/testsuite/
      	* gcc.target/nvptx/__builtin_stack_address-1.c: New.
      91dec10f
    • Torbjörn SVENSSON's avatar
      testsuite: arm: Use -std=c17 and effective-target arm_arch_v5te_thumb · f447c3c0
      Torbjörn SVENSSON authored
      
      With -std=c23, the following errors are now emitted as the function
      prototype and implementation do not match:
      
      .../pr59858.c: In function 're_search_internal':
      .../pr59858.c:95:17: error: too many arguments to function 'check_matching'
      .../pr59858.c:75:12: note: declared here
      .../pr59858.c: At top level:
      .../pr59858.c:100:1: error: conflicting types for 'check_matching'; have 'int(re_match_context_t *, int *)'
      .../pr59858.c:75:12: note: previous declaration of 'check_matching' with type 'int(void)'
      .../pr59858.c: In function 'check_matching':
      .../pr59858.c:106:14: error: too many arguments to function 'transit_state'
      .../pr59858.c:77:23: note: declared here
      .../pr59858.c: At top level:
      .../pr59858.c:111:1: error: conflicting types for 'transit_state'; have 're_dfastate_t *(re_match_context_t *, re_dfastate_t *)'
      .../pr59858.c:77:23: note: previous declaration of 'transit_state' with type 're_dfastate_t *(void)'
      .../pr59858.c: In function 'transit_state':
      .../pr59858.c:116:7: error: too many arguments to function 'build_trtable'
      .../pr59858.c:79:12: note: declared here
      .../pr59858.c: At top level:
      .../pr59858.c:121:1: error: conflicting types for 'build_trtable'; have 'int(const re_dfa_t *, re_dfastate_t *)'
      .../pr59858.c:79:12: note: previous declaration of 'build_trtable' with type 'int(void)'
      
      Adding -std=c17 removes these errors.
      
      Also, updated test case to use -mcpu=unset/-march=unset feature
      introduced in r15-3606-g7d6c6a0d15c.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/arm/pr59858.c: Use -std=c17 and effective-target
      	arm_arch_v5te_thumb.
      
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
      f447c3c0
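The errors quoted above follow from a change in C23: a declaration with an empty parameter list, such as `int check_matching ();`, now means `int check_matching (void)`, so a later definition that takes parameters conflicts with it. Under C17 and earlier the same declaration leaves the parameters unspecified. The sketch below shows a form that is accepted under both standards because it declares a full prototype; the type and the function bodies are hypothetical stand-ins, not the contents of pr59858.c.

```c
#include <assert.h>

/* Hypothetical stand-in for the context type used in the test.  */
typedef struct { int state; } match_context_t;

/* Full prototype.  Writing "()" here instead would declare a
   function taking no arguments in C23, and both the call and the
   definition below would then be rejected.  */
static int check_matching (match_context_t *ctx, int *result);

int search (match_context_t *ctx)
{
  int result = 0;
  return check_matching (ctx, &result);
}

static int check_matching (match_context_t *ctx, int *result)
{
  *result = ctx->state + 1;
  return *result;
}
```

The commit takes the other route and pins the test to `-std=c17`, which keeps the original source unchanged.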
    • squirek's avatar
      ada: Incorrect accessibility level for library level subprograms · 3ff216b7
      squirek authored
      The patch fixes an issue in the compiler whereby accessibility level
      calculations for objects declared within library-level subprograms
      were done incorrectly - potentially allowing runtime accessibility
      checks to spuriously pass.
      
      gcc/ada/ChangeLog:
      
      	* accessibility.adb:
      	(Innermost_master_Scope_Depth): Add special case for expressions
      	within library level subprograms.
      3ff216b7