  1. Jan 24, 2024
    • AArch64: Do not allow SIMD clones with simdlen 1 [PR113552] · 306713c9
      Tamar Christina authored
The AArch64 vector PCS does not allow simd calls with simdlen 1;
however, due to a bug we currently do allow it for num == 0.
      
      This causes us to emit a symbol that doesn't exist and we fail to link.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/113552
      	* config/aarch64/aarch64.cc
      	(aarch64_simd_clone_compute_vecsize_and_simdlen): Block simdlen 1.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/113552
      	* gcc.target/aarch64/pr113552.c: New test.
      	* gcc.target/aarch64/simd_pcs_attribute-3.c: Remove bogus check.
    • ipa-cp: Fix check for exceeding param_ipa_cp_value_list_size (PR 113490) · bc4a20bc
      Martin Jambor authored
      When the check for exceeding param_ipa_cp_value_list_size limit was
      modified to be ignored for generating values from self-recursive
calls, it should have been changed from 'equal to' to 'equal to or
greater than'.  This omission manifests itself as PR 113490.
      
      When I examined the condition I also noticed that the parameter should
      come from the callee rather than the caller, since the value list is
      associated with the former and not the latter.  In practice the limit
      is of course very likely to be the same, but I fixed this aspect of
      the condition too.  I briefly audited all other uses of opt_for_fn in
      ipa-cp.cc and all the others looked OK.
      
      gcc/ChangeLog:
      
      2024-01-19  Martin Jambor  <mjambor@suse.cz>
      
      	PR ipa/113490
      	* ipa-cp.cc (ipcp_lattice<valtype>::add_value): Bail out if value
      	count is equal or greater than the limit.  Use the limit from the
      	callee.
      
      gcc/testsuite/ChangeLog:
      
      2024-01-22  Martin Jambor  <mjambor@suse.cz>
      
      	PR ipa/113490
      	* gcc.dg/ipa/pr113490.c: New test.
    • analyzer: fix taint false +ve due to overzealous state purging [PR112977] · e503f9ac
      David Malcolm authored
      
      gcc/analyzer/ChangeLog:
      	PR analyzer/112977
      	* engine.cc (impl_region_model_context::on_liveness_change): Pass
      	m_ext_state to sm_state_map::on_liveness_change.
      	* program-state.cc (sm_state_map::on_svalue_leak): Guard removal
      	of map entry based on can_purge_p.
      	(sm_state_map::on_liveness_change): Add ext_state param.  Add
      	workaround for bad interaction between state purging and
      	alt-inherited sm-state.
      	* program-state.h (sm_state_map::on_liveness_change): Add
      	ext_state param.
      	* sm-taint.cc
      	(taint_state_machine::has_alt_get_inherited_state_p): New.
      	(taint_state_machine::can_purge_p): Return false for "has_lb" and
      	"has_ub".
      	* sm.h (state_machine::has_alt_get_inherited_state_p): New vfunc.
      
      gcc/testsuite/ChangeLog:
      	PR analyzer/112977
      	* gcc.dg/plugin/plugin.exp: Add taint-pr112977.c.
      	* gcc.dg/plugin/taint-pr112977.c: New test.
      
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
    • analyzer kernel plugin: implement __check_object_size [PR112927] · b6e53757
      David Malcolm authored
      
      PR analyzer/112927 reports a false positive from -Wanalyzer-tainted-size
      seen on the Linux kernel's drivers/char/ipmi/ipmi_devintf.c with the
      analyzer kernel plugin.
      
      The issue is that in:
      
      (A):
        if (msg->data_len > 272) {
          return -90;
        }
      
      (B):
        n = msg->data_len;
        __check_object_size(to, n);
        n = copy_from_user(to, from, n);
      
      the analyzer is treating __check_object_size as having arbitrary side
      effects, and, in particular could modify msg->data_len.  Hence the
      sanitization that occurs at (A) above is treated as being for a
      different value than the size obtained at (B), hence the bogus warning
      at the call to copy_from_user.
      
      Fixed by extending the analyzer kernel plugin to "teach" it that
      __check_object_size has no side effects.
      
      gcc/testsuite/ChangeLog:
      	PR analyzer/112927
      	* gcc.dg/plugin/analyzer_kernel_plugin.c
      	(class known_function___check_object_size): New.
      	(kernel_analyzer_init_cb): Register it.
      	* gcc.dg/plugin/plugin.exp: Add taint-pr112927.c.
      	* gcc.dg/plugin/taint-pr112927.c: New test.
      
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
    • PR modula2/113559 FIO.mod lseek requires cssize_t rather than longint · 3de031c9
      Gaius Mulley authored
      
      This patch fixes a bug in gcc/m2/gm2-libs/FIO.mod which failed to cast the
      whence parameter into the correct type.  The patch casts the whence
      parameter for lseek to SYSTEM.CSSIZE_T.
      
      gcc/m2/ChangeLog:
      
      	PR modula2/113559
      	* gm2-libs/FIO.mod (SetPositionFromBeginning): Convert pos into
      	CSSIZE_T during call to lseek.
      	(SetPositionFromEnd): Convert pos into CSSIZE_T during call to
      	lseek.
      
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
    • testsuite: i386: Don't restrict gcc.dg/vect/vect-simd-clone-16c.c etc. to i686 [PR113556] · b8f54195
      Rainer Orth authored
      A couple of gcc.dg/vect/vect-simd-clone-1*.c tests FAIL on 32-bit
      Solaris/x86 since 20230222:
      
      FAIL: gcc.dg/vect/vect-simd-clone-16c.c scan-tree-dump-times vect
      "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
      FAIL: gcc.dg/vect/vect-simd-clone-16d.c scan-tree-dump-times vect
      "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
      FAIL: gcc.dg/vect/vect-simd-clone-17c.c scan-tree-dump-times vect
      "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
      FAIL: gcc.dg/vect/vect-simd-clone-17d.c scan-tree-dump-times vect
      "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
      FAIL: gcc.dg/vect/vect-simd-clone-18c.c scan-tree-dump-times vect
      "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
      FAIL: gcc.dg/vect/vect-simd-clone-18d.c scan-tree-dump-times vect
      "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
      
The problem is that the 32-bit Solaris/x86 triple still uses i386,
although gcc defaults to -mpentium4.  However, the tests only handle
x86_64* and i686*, even though they don't seem to require any
specific ISA extension not covered by vect_simd_clones.
      
      To fix this, the tests now allow generic i?86.  At the same time, I've
      removed the wildcards from x86_64* and i686* since DejaGnu uses the
      canonical forms.
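The adjusted selector shape can be sketched as follows (an illustrative dg directive, not the exact test text): `i?86-*-*` matches i386, i586 and i686 triples alike, while the canonical `x86_64-*-*` form needs no extra wildcard.

```cpp
/* Illustrative target selector (the dump name and count are assumed,
   not copied from the actual tests):  */
/* { dg-final { scan-tree-dump-times "foo\\.simdclone" 2 "vect"
     { target { i?86-*-* x86_64-*-* } } } } */
```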
      
      Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.
      
      2024-01-24  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      
      	gcc/testsuite:
      	PR target/113556
      	* gcc.dg/vect/vect-simd-clone-16c.c: Don't wildcard x86_64 in
      	target specs.  Allow any i?86 target instead of i686 only.
      	* gcc.dg/vect/vect-simd-clone-16d.c: Likewise.
      	* gcc.dg/vect/vect-simd-clone-17c.c: Likewise.
      	* gcc.dg/vect/vect-simd-clone-17d.c: Likewise.
      	* gcc.dg/vect/vect-simd-clone-18c.c: Likewise.
      	* gcc.dg/vect/vect-simd-clone-18d.c: Likewise.
    • testsuite: i386: Fix gcc.target/i386/pr80833-1.c on 32-bit Solaris/x86 · f4a2478f
      Rainer Orth authored
      gcc.target/i386/pr80833-1.c FAILs on 32-bit Solaris/x86 since 20220609:
      
      FAIL: gcc.target/i386/pr80833-1.c scan-assembler pextrd
      
      Unlike e.g. Linux/i686, 32-bit Solaris/x86 defaults to -mstackrealign,
      so this patch overrides that to match.
      
      Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.
      
      2024-01-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      
      	gcc/testsuite:
      	* gcc.target/i386/pr80833-1.c: Add -mno-stackrealign to dg-options.
    • MIPS: Accept arguments for -mexplicit-relocs · 58af788d
      YunQiang Su authored
GAS has supported explicit relocs since 2001, and %pcrel_hi/low were
introduced in 2014.  In the future, we may introduce more.
      
Let's convert the -mexplicit-relocs option to accept the arguments:
    none, base, pcrel.
      
We also update gcc/configure.ac to set the option's default according
to the gas support detected when GCC itself is built.
      
      gcc
      	* configure.ac: Detect the explicit relocs support for
      	mips, and define C macro MIPS_EXPLICIT_RELOCS.
      	* config.in: Regenerated.
      	* configure: Regenerated.
      	* doc/invoke.texi(MIPS Options): Add -mexplicit-relocs.
      	* config/mips/mips-opts.h: Define enum mips_explicit_relocs.
      	* config/mips/mips.cc(mips_set_compression_mode): Sorry if
      	!TARGET_EXPLICIT_RELOCS instead of just set it.
      	* config/mips/mips.h: Define TARGET_EXPLICIT_RELOCS and
      	TARGET_EXPLICIT_RELOCS_PCREL with mips_opt_explicit_relocs.
      	* config/mips/mips.opt: Introduce -mexplicit-relocs= option
      	and define -m(no-)explicit-relocs as aliases.
    • MAINTAINERS: Update my work email address · 7fcdb501
      Thomas Schwinge authored
      	* MAINTAINERS: Update my work email address.
    • aarch64: Re-enable ldp/stp fusion pass · da9647e9
      Alex Coplan authored
      Since, to the best of my knowledge, all reported regressions related to
      the ldp/stp fusion pass have now been fixed, and PGO+LTO bootstrap with
      --enable-languages=all is working again with the passes enabled, this
      patch turns the passes back on by default, as agreed with Jakub here:
      
      https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642478.html
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.opt (-mearly-ldp-fusion): Set default
      	to 1.
      	(-mlate-ldp-fusion): Likewise.
    • middle-end: rename main_exit_p in reduction code. · 59004711
      Tamar Christina authored
This renames main_exit_p to last_val_reduc_p to more accurately
reflect what the value is calculating.
      
      gcc/ChangeLog:
      
      	* tree-vect-loop.cc (vect_get_vect_def,
      	vect_create_epilog_for_reduction): Rename main_exit_p to
      	last_val_reduc_p.
    • middle-end: fix epilog reductions when vector iters peeled [PR113364] · 72429448
      Tamar Christina authored
      This fixes a bug where vect_create_epilog_for_reduction does not handle the
case where all exits are early exits.  In this case we should do as the
induction handling code does and not have a main exit.
      
      This shows that some new miscompiles are happening (stage3 is likely miscompiled)
      but that's unrelated to this patch and I'll look at it next.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/113364
	* tree-vect-loop.cc (vect_create_epilog_for_reduction): If all
	exits are early exits then we must reduce from the first offset
	for all of them.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/113364
      	* gcc.dg/vect/vect-early-break_107-pr113364.c: New test.
    • libgomp.texi: Document omp_pause_resource{,_all} and omp_target_memcpy* · d89537a1
      Tobias Burnus authored
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (Runtime Library Routines): Document
      	omp_pause_resource, omp_pause_resource_all and
      	omp_target_memcpy{,_rect}{,_async}.
      
Co-authored-by: Sandra Loosemore <sandra@codesourcery.com>
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
    • libstdc++: [_Hashtable] Remove useless check for _M_before_begin node · ec0a68b9
      Huanghui Nie authored
      
      When removing the first node of a bucket it is useless to check if this bucket
is the one containing the _M_before_begin node.  The bucket's before-begin node is
already transferred to the next pointed-to bucket regardless of whether it is the
container's before-begin node.
      
      libstdc++-v3/ChangeLog:
      
	* include/bits/hashtable.h (_Hashtable<>::_M_remove_bucket_begin): Remove
	the _M_before_begin check and clean up the implementation.
      
Co-authored-by: Théo Papadopoulo <papadopoulo@gmail.com>
    • RISC-V: Add regression test for vsetvl bug pr113429 · 7f7d9c52
      Patrick O'Neill authored
      
The reduced testcase for pr113429 (cam4 failure) needed additional
modules, so it wasn't committed.
The fuzzer found a C testcase that was also fixed by pr113429's fix.
Adding it as a regression test.
      
      	PR target/113429
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/vsetvl/pr113429.c: New test.
      
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
    • RISC-V: Fix large memory usage of VSETVL PASS [PR113495] · 3132d2d3
      Juzhe-Zhong authored
The SPEC 2017 wrf benchmark exposes unreasonable memory usage in the
VSETVL pass: it consumes over 33 GB of memory, which makes it
impossible to compile SPEC 2017 wrf on a laptop.
      
The root cause is these memory-wasting variables:
      
      unsigned num_exprs = num_bbs * num_regs;
      sbitmap *avl_def_loc = sbitmap_vector_alloc (num_bbs, num_exprs);
      sbitmap *m_kill = sbitmap_vector_alloc (num_bbs, num_exprs);
      m_avl_def_in = sbitmap_vector_alloc (num_bbs, num_exprs);
      m_avl_def_out = sbitmap_vector_alloc (num_bbs, num_exprs);
      
I found that compute_avl_def_data can be implemented with the RTL_SSA
framework, so this patch replaces the implementation with one based on
RTL_SSA.
      
      After this patch, the memory-hog issue is fixed.
      
      simple vsetvl memory usage (valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out)
      is 1.673 GB.
      
      lazy vsetvl memory usage (valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out)
      is 2.441 GB.
      
      Tested on both RV32 and RV64, no regression.
      
      gcc/ChangeLog:
      
      	PR target/113495
      	* config/riscv/riscv-vsetvl.cc (get_expr_id): Remove.
      	(get_regno): Ditto.
      	(get_bb_index): Ditto.
      	(pre_vsetvl::compute_avl_def_data): Ditto.
      	(pre_vsetvl::earliest_fuse_vsetvl_info): Fix large memory usage.
      	(pre_vsetvl::pre_global_vsetvl_info): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/113495
      	* gcc.target/riscv/rvv/vsetvl/avl_single-107.c: Adapt test.
    • Daily bump. · 3128786c
      GCC Administrator authored
  2. Jan 23, 2024
    • testsuite: Disable new test for PR113292 on targets without TLS support · bf358eaa
      Nathaniel Shead authored
      
      This disables the new test added by r14-8168 on machines that don't have
      TLS support, such as bare-metal ARM.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/modules/pr113292_c.C: Require TLS.
      
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
    • c++: -Wdangling-reference and lambda false warning [PR109640] · 9010fdba
      Marek Polacek authored
      -Wdangling-reference checks if a function receives a temporary as its
      argument, and only warns if any of the arguments was a temporary.  But
      we should not warn when the temporary represents a lambda or we generate
      false positives as in the attached testcases.
      
      	PR c++/113256
      	PR c++/111607
      	PR c++/109640
      
      gcc/cp/ChangeLog:
      
      	* call.cc (do_warn_dangling_reference): Don't warn if the temporary
      	is of lambda type.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/warn/Wdangling-reference14.C: New test.
      	* g++.dg/warn/Wdangling-reference15.C: New test.
      	* g++.dg/warn/Wdangling-reference16.C: New test.
    • MAINTAINERS: Update my email address · ed4c7893
      Tobias Burnus authored
      
      ChangeLog:
      
      	* MAINTAINERS: Update my email address.
      
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
    • c: Call c_fully_fold on __atomic_* operands in atomic_bitint_fetch_using_cas_loop [PR113518] · dbc5f1f5
      Jakub Jelinek authored
      As the following testcase shows, I forgot to call c_fully_fold on the
      __atomic_*/__sync_* operands called on _BitInt address, the expressions
      are then used inside of TARGET_EXPR initializers etc. and are never fully
      folded later, which means we can ICE e.g. on C_MAYBE_CONST_EXPR trees
      inside of those.
      
      The following patch fixes it, while the function currently is only called
      in the C FE because C++ doesn't support BITINT_TYPE, I think guarding the
      calls on !c_dialect_cxx () is safer.
      
      2024-01-23  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/113518
      	* c-common.cc (atomic_bitint_fetch_using_cas_loop): Call c_fully_fold
      	on lhs_addr, val and model for C.
      
      	* gcc.dg/bitint-77.c: New test.
    • aarch64/expr: Use ccmp when the outer expression is used twice [PR100942] · 06ee648e
      Andrew Pinski authored
      
      Ccmp is not used if the result of the and/ior is used by both
      a GIMPLE_COND and a GIMPLE_ASSIGN. This improves the code generation
      here by using ccmp in this case.
Two changes are required: first, we need to allow the outer statement's
result to be used more than once.
The second change is that during the expansion of the gimple, we need
to try using ccmp.  This is needed because we don't expand the ssa
name of the lhs but rather expand directly from the gimple.
      
A small note on the ccmp_4.c testcase: we should be able to do slightly
better than with this patch, but it is one extra instruction compared
to before.
      
      	PR target/100942
      
      gcc/ChangeLog:
      
      	* ccmp.cc (ccmp_candidate_p): Add outer argument.
      	Allow if the outer is true and the lhs is used more
      	than once.
      	(expand_ccmp_expr): Update call to ccmp_candidate_p.
      	* expr.h (expand_expr_real_gassign): Declare.
      	* expr.cc (expand_expr_real_gassign): New function, split out from...
      	(expand_expr_real_1): ...here.
      	* cfgexpand.cc (expand_gimple_stmt_1): Use expand_expr_real_gassign.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/ccmp_3.c: New test.
      	* gcc.target/aarch64/ccmp_4.c: New test.
      	* gcc.target/aarch64/ccmp_5.c: New test.
      
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>
    • Update my email in MAINTAINERS · cc082cf9
      Andrew Stubbs authored
      
      ChangeLog:
      
      	* MAINTAINERS: Update
      
Signed-off-by: Andrew Stubbs <ams@baylibre.com>
    • aarch64: Fix up debug uses in ldp/stp pass [PR113089] · 3d82ebb6
      Alex Coplan authored
      As the PR shows, we were missing code to update debug uses in the
      load/store pair fusion pass.  This patch fixes that.
      
      The patch tries to give a complete treatment of the debug uses that will
      be affected by the changes we make, and in particular makes an effort to
      preserve debug info where possible, e.g. when re-ordering an update of
      a base register by a constant over a debug use of that register.  When
      re-ordering loads over a debug use of a transfer register, we reset the
      debug insn.  Likewise when re-ordering stores over debug uses of mem.
      
      While doing this I noticed that try_promote_writeback used a strange
      choice of move_range for the pair insn, in that it chose the previous
      nondebug insn instead of the insn itself.  Since the insn is being
      changed, these move ranges are equivalent (at least in terms of nondebug
      insn placement as far as RTL-SSA is concerned), but I think it is more
      natural to choose the pair insn itself.  This is needed to avoid
      incorrectly updating some debug uses.
      
      gcc/ChangeLog:
      
      	PR target/113089
      	* config/aarch64/aarch64-ldp-fusion.cc (reset_debug_use): New.
      	(fixup_debug_use): New.
      	(fixup_debug_uses_trailing_add): New.
      	(fixup_debug_uses): New. Use it ...
      	(ldp_bb_info::fuse_pair): ... here.
      	(try_promote_writeback): Call fixup_debug_uses_trailing_add to
      	fix up debug uses of the base register that are affected by
      	folding in the trailing add insn.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/113089
      	* gcc.c-torture/compile/pr113089.c: New test.
    • aarch64: Re-parent trailing nondebug base reg uses [PR113089] · 49bfda60
      Alex Coplan authored
While working on PR113089, I realised we were missing code to re-parent
      trailing nondebug uses of the base register in the case of cancelling
      writeback in the load/store pair pass.  This patch fixes that.
      
      gcc/ChangeLog:
      
      	PR target/113089
      	* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
      	Update trailing nondebug uses of the base register in the case
      	of cancelling writeback.
    • rtl-ssa: Provide easier access to debug uses [PR113089] · cef60316
      Alex Coplan authored
      This patch adds some accessors to set_info and use_info to make it
      easier to get at and iterate through uses in debug insns.
      
      It is used by the aarch64 load/store pair fusion pass in a subsequent
      patch to fix PR113089, i.e. to update debug uses in the pass.
      
      gcc/ChangeLog:
      
      	PR target/113089
      	* rtl-ssa/accesses.h (use_info::next_debug_insn_use): New.
      	(debug_insn_use_iterator): New.
      	(set_info::first_debug_insn_use): New.
      	(set_info::debug_insn_uses): New.
      	* rtl-ssa/member-fns.inl (use_info::next_debug_insn_use): New.
      	(set_info::first_debug_insn_use): New.
      	(set_info::debug_insn_uses): New.
    • aarch64: Don't record hazards against paired insns [PR113356] · 639ae543
      Alex Coplan authored
      For the testcase in the PR, we try to pair insns where the first has
      writeback and the second uses the updated base register.  This causes us
      to record a hazard against the second insn, thus narrowing the move
      range away from the end of the BB.
      
      However, it isn't meaningful to record hazards against the other insn
      in the pair, as this doesn't change which pairs can be formed, and also
      doesn't change where the pair is formed (from the perspective of
      nondebug insns).
      
      To see why this is the case, consider the two cases:
      
- Suppose we are finding hazards for insns[0].  If we record a hazard
         against insns[1], then range.last becomes
         insns[1]->prev_nondebug_insn (), but note that this is equivalent to
         inserting after insns[1] (since insns[1] is being changed).
       - Now consider finding hazards for insns[1].  Suppose we record
         insns[0] as a hazard.  Then we set range.first = insns[0], which is a
         no-op.
      
      As such, it seems better to never record hazards against the other insn
      in the pair, as we check whether the insns themselves are suitable for
      combination separately (e.g. for ldp checking that they use distinct
      transfer registers).  Avoiding unnecessarily narrowing the move range
      avoids unnecessarily re-ordering over debug insns.
      
      This should also mean that we can only narrow the move range away from
      the end of the BB in the case that we record a hazard for insns[0]
      against insns[1]->prev_nondebug_insn () or earlier.  This means that for
      the non-call-exceptions case, either the move range includes insns[1],
      or we reject the pair (thus the assert tripped in the PR should always
      hold).
      
      gcc/ChangeLog:
      
      	PR target/113356
      	* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::try_fuse_pair):
      	Don't record hazards against the opposite insn in the pair.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/113356
      	* gcc.target/aarch64/pr113356.C: New test.
    • Update year in Gnatvsn · 0ad6908f
      Ronan Desplanques authored
      gcc/ada/
      	* gnatvsn.ads: Update year.
    • LoongArch: testsuite: Disable stack protector for got-load.C · 46f3ba56
      Xi Ruoyao authored
      When building GCC with --enable-default-ssp, the stack protector is
      enabled for got-load.C, causing additional GOT loads for
      __stack_chk_guard.  So mem/u will be matched more than 2 times and the
      test will fail.
      
      Disable stack protector to fix this issue.
      
      gcc/testsuite:
      
      	* g++.target/loongarch/got-load.C (dg-options): Add
      	-fno-stack-protector.
    • Ifdef `.hidden`, `.type`, and `.size` pseudo-ops for `aarch64-w64-mingw32` target · c608ada2
      Zac Walker authored
A recent change
(https://gcc.gnu.org/pipermail/gcc-cvs/2023-December/394915.html)
added generic SME support using the `.hidden`, `.type`, and `.size`
pseudo-ops in the assembly sources, but `aarch64-w64-mingw32` does not
support those pseudo-ops.  This patch wraps the uses of the pseudo-ops
in macros and ifdefs them on the `__ELF__` define.
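The wrapping can be sketched like this (the macro names follow the ChangeLog below, but the exact definitions are assumptions): on ELF targets the macros expand to the pseudo-ops, elsewhere to nothing.

```cpp
/* Sketch of aarch64-asm.h-style wrappers (definitions are
   illustrative, not the committed ones).  */
#ifdef __ELF__
# define HIDDEN(name)              .hidden name
# define SYMBOL_TYPE(name, _type)  .type name, _type
# define SYMBOL_SIZE(name)         .size name, . - name
#else
# define HIDDEN(name)
# define SYMBOL_TYPE(name, _type)
# define SYMBOL_SIZE(name)
#endif
```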
      
      libgcc/
      	* config/aarch64/aarch64-asm.h (HIDDEN, SYMBOL_SIZE, SYMBOL_TYPE)
      	(ENTRY_ALIGN, GNU_PROPERTY): New macros.
      	* config/aarch64/__arm_sme_state.S: Use them.
      	* config/aarch64/__arm_tpidr2_save.S: Likewise.
      	* config/aarch64/__arm_za_disable.S: Likewise.
      	* config/aarch64/crti.S: Likewise.
      	* config/aarch64/lse.S: Likewise.
    • gcc.dg/torture/pr113255.c: Fix ia32 test failure · 3936c870
      H.J. Lu authored
      Fix ia32 test failure:
      
      FAIL: gcc.dg/torture/pr113255.c   -O1  (test for excess errors)
      Excess errors:
      cc1: error: '-mstringop-strategy=rep_8byte' not supported for 32-bit code
      
      	PR rtl-optimization/113255
      	* gcc.dg/torture/pr113255.c (dg-additional-options): Add only
      	if not ia32.
    • m2: Use time_t in time and don't redefine alloca · 2bdf138a
      H.J. Lu authored
      Fix the m2 build warning and error:
      
      [...]
      ../../src/gcc/m2/mc/mc.flex:32:9: warning: "alloca" redefined
         32 | #define alloca __builtin_alloca
            |         ^~~~~~
      In file included from /usr/include/stdlib.h:587,
                       from <stdout>:22:
      /usr/include/alloca.h:35:10: note: this is the location of the previous definition
         35 | # define alloca(size)   __builtin_alloca (size)
            |          ^~~~~~
      ../../src/gcc/m2/mc/mc.flex: In function 'handleDate':
../../src/gcc/m2/mc/mc.flex:333:25: error: passing argument 1 of 'time' from incompatible pointer type [-Wincompatible-pointer-types]
        333 |   time_t  clock = time ((long *)0);
            |                         ^~~~~~~~~
            |                         |
            |                         long int *
      In file included from ../../src/gcc/m2/mc/mc.flex:28:
/usr/include/time.h:76:29: note: expected 'time_t *' {aka 'long long int *'} but argument is of type 'long int *'
         76 | extern time_t time (time_t *__timer) __THROW;
      
      	PR bootstrap/113554
      	* mc/mc.flex (alloca): Don't redefine.
      	(handleDate): Replace (long *)0 with (time_t *)0 when calling
      	time.
    • aarch64: Fix up uses of mem following stp insert [PR113070] · ef86659d
      Alex Coplan authored
      As the PR shows (specifically #c7) we are missing updating uses of mem
      when inserting an stp in the aarch64 load/store pair fusion pass.  This
      patch fixes that.
      
      RTL-SSA has a simple view of memory and by default doesn't allow stores
      to be re-ordered w.r.t. other stores.  In the ldp fusion pass, we do our
      own alias analysis and so can re-order stores over other accesses when
      we deem this is safe.  If neither store can be re-purposed (moved into
      the required position to form the stp while respecting the RTL-SSA
      constraints), then we turn both the candidate stores into "tombstone"
      insns (logically delete them) and insert a new stp insn.
      
      As it stands, we implement the insert case separately (after dealing
      with the candidate stores) in fuse_pair by inserting into the middle of
      the vector of changes.  This is OK when we only have to insert one
      change, but with this fix we would need to insert the change for the new
      stp plus multiple changes to fix up uses of mem (note the number of
      fix-ups is naturally bounded by the alias limit param to prevent
      quadratic behaviour).  If we kept the code structured as is and inserted
      into the middle of the vector, that would lead to repeated moving of
      elements in the vector which seems inefficient.  The structure of the
      code would also be a little unwieldy.
      
      To improve on that situation, this patch introduces a helper class,
      stp_change_builder, which implements a state machine that helps to build
      the required changes directly in program order.  That state machine is
      reponsible for deciding what changes need to be made in what order, and
      the code in fuse_pair then simply follows those steps.
      
      Together with the fix in the previous patch for installing new defs
      correctly in RTL-SSA, this fixes PR113070.
      
      We take the opportunity to rename the function decide_stp_strategy to
      try_repurpose_store, as that seems more descriptive of what it actually
      does, since stp_change_builder is now responsible for the overall change
      strategy.
      
      gcc/ChangeLog:
      
      	PR target/113070
      	* config/aarch64/aarch64-ldp-fusion.cc
      	(struct stp_change_builder): New.
	(decide_stp_strategy): Rename to ...
      	(try_repurpose_store): ... this.
      	(ldp_bb_info::fuse_pair): Refactor to use stp_change_builder to
      	construct stp changes.  Fix up uses when inserting new stp insns.
    • rtl-ssa: Ensure new defs get inserted [PR113070] · 6dd613df
      Alex Coplan authored
      In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to
      RTL-SSA for inserting new insns, which included support for users
      creating new defs.
      
      However, I missed that apply_changes_to_insn needed updating to ensure
      that the new defs actually got inserted into the main def chain.  This
      meant that when the aarch64 ldp/stp pass inserted a new stp insn, the
      stp would just get skipped over during subsequent alias analysis, as its
      def never got inserted into the memory def chain.  This (unsurprisingly)
      led to wrong code.
      
      This patch fixes the issue by ensuring new user-created defs get
      inserted.  I would have preferred to have used a flag internal to the
      defs instead of a separate data structure to keep track of them, but since
      machine_mode increased to 16 bits we're already at 64 bits in access_info,
      and we can't really reuse m_is_temp as the logic in finalize_new_accesses
      requires it to get cleared.
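      The idea of tracking the new defs in a side table rather than a flag
      bit can be sketched as follows.  All types here are illustrative
      stand-ins, not RTL-SSA's real access_info or def-chain structures;
      std::unordered_set plays the role of the hash_set.

      ```cpp
      #include <cassert>
      #include <unordered_set>
      #include <vector>

      /* Illustrative stand-in for a def and the per-resource def chain.  */
      struct def_sketch { int id; def_sketch *next = nullptr; };

      struct def_chain_sketch
      {
        def_sketch *head = nullptr;

        void add_def (def_sketch *d) { d->next = head; head = d; }

        bool contains (int id) const
        {
          for (def_sketch *d = head; d; d = d->next)
            if (d->id == id)
              return true;
          return false;
        }
      };

      /* Sketch of the fix: only defs recorded as user-created (in new_sets)
         get linked into the chain when changes are applied.  Without this
         step, a new store's memory def would be missing from the chain and
         later alias analysis would skip right over it.  */
      inline void
      apply_changes_sketch (def_chain_sketch &chain,
                            const std::vector<def_sketch *> &defs,
                            const std::unordered_set<def_sketch *> &new_sets)
      {
        for (def_sketch *d : defs)
          if (new_sets.count (d))
            chain.add_def (d);
      }
      ```

      Keeping the set external avoids growing the (already tightly packed)
      per-access structure by another flag bit.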
      
      gcc/ChangeLog:
      
      	PR target/113070
      	* rtl-ssa.h: Include hash-set.h.
      	* rtl-ssa/changes.cc (function_info::finalize_new_accesses): Add
      	new_sets parameter and use it to keep track of new user-created sets.
      	(function_info::apply_changes_to_insn): Also call add_def on new sets.
      	(function_info::change_insns): Add hash_set to keep track of new
      	user-created defs.  Plumb it through.
      	* rtl-ssa/functions.h: Add hash_set parameter to finalize_new_accesses and
      	apply_changes_to_insn.
      6dd613df
    • Alex Coplan's avatar
      rtl-ssa: Support for creating new uses [PR113070] · fce3994d
      Alex Coplan authored
      This exposes an interface for users to create new uses in RTL-SSA.
      This is needed for updating uses after inserting a new store pair insn
      in the aarch64 load/store pair fusion pass.
      
      gcc/ChangeLog:
      
      	PR target/113070
      	* rtl-ssa/accesses.cc (function_info::create_use): New.
      	* rtl-ssa/changes.cc (function_info::finalize_new_accesses):
      	Ensure new uses end up referring to permanent defs.
      	* rtl-ssa/functions.h (function_info::create_use): Declare.
      fce3994d
    • Alex Coplan's avatar
      rtl-ssa: Run finalize_new_accesses forwards [PR113070] · e0374b02
      Alex Coplan authored
      The next patch in this series exposes an interface for creating new uses
      in RTL-SSA.  The intent is that new user-created uses can consume new
      user-created defs in the same change group.  This is so that we can
      correctly update uses of memory when inserting a new store pair insn in
      the aarch64 load/store pair fusion pass (the affected uses need to
      consume the new store pair insn).
      
      As it stands, finalize_new_accesses is called as part of the backwards
      insn placement loop within change_insns, but if we want new uses to be
      able to depend on new defs in the same change group, we need
      finalize_new_accesses to be called on earlier insns first.  This is so
      that when we process temporary uses and turn them into permanent uses,
      we can follow the last_def link on the temporary def to ensure we end up
      with a permanent use consuming a permanent def.
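      Why the processing order matters can be shown with a small model
      (illustrative only, not the real finalize_new_accesses logic): a use on
      a later insn may consume the def created by an earlier insn in the same
      change group, so the earlier insn's def must already be permanent when
      the use is finalized.

      ```cpp
      #include <cassert>
      #include <vector>

      /* uses[i] is the index of the earlier insn whose new def insn i
         consumes, or -1 if it consumes none.  Returns true iff, finalizing
         insns in the given order, every use resolves to an
         already-finalized (permanent) def.  */
      inline bool
      all_uses_see_permanent_defs (const std::vector<int> &order,
                                   const std::vector<int> &uses)
      {
        std::vector<bool> finalized (uses.size (), false);
        for (int insn : order)
          {
            if (uses[insn] >= 0 && !finalized[uses[insn]])
              return false;   /* Use would refer to a temporary def.  */
            finalized[insn] = true;
          }
        return true;
      }
      ```

      Running forwards over the insns satisfies the invariant; the old
      backwards placement order does not, which is why the call is split out
      into its own forwards loop.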
      
      gcc/ChangeLog:
      
      	PR target/113070
      	* rtl-ssa/changes.cc (function_info::change_insns): Split out the call
      	to finalize_new_accesses from the backwards placement loop, run it
      	forwards in a separate loop.
      e0374b02
    • Richard Biener's avatar
      tree-optimization/113552 - fix num_call accounting in simd clone vectorization · d5d43dc3
      Richard Biener authored
      The following avoids using exact_log2 on the number of SIMD clone calls
      to be emitted when vectorizing calls, since that number can easily be
      a non-power-of-two, in which case exact_log2 returns -1.  For different
      simd clones the numbers of calls differ only by a power-of-two factor,
      so using floor_log2 is good enough here.
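      The difference between the two helpers is easy to demonstrate.  The
      functions below are minimal stand-ins for GCC's exact_log2/floor_log2
      (from hwint.h), written out here only to show the behaviour on a
      non-power-of-two call count:

      ```cpp
      #include <cassert>

      /* Largest l with (1 << l) <= x, or -1 for x == 0; stand-in for GCC's
         floor_log2.  */
      inline int
      floor_log2_sketch (unsigned long x)
      {
        int l = -1;
        while (x)
          {
            x >>= 1;
            l++;
          }
        return l;
      }

      /* log2 (x) if x is a power of two, else -1; stand-in for GCC's
         exact_log2.  */
      inline int
      exact_log2_sketch (unsigned long x)
      {
        if (x == 0 || (x & (x - 1)) != 0)
          return -1;
        return floor_log2_sketch (x);
      }
      ```

      For a call count of, say, 6, exact_log2 yields -1 while floor_log2
      yields 2; and since the counts for different clones differ only by a
      power-of-two factor, floor_log2 still orders them correctly.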
      
      	PR tree-optimization/113552
      	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Use
      	floor_log2 instead of exact_log2 on the number of calls.
      d5d43dc3
    • Jakub Jelinek's avatar
      ia64: Fix up -Wunused-parameter warning · ac98aa78
      Jakub Jelinek authored
      Since r14-6945-gc659dd8bfb55e02a1b97407c1c28f7a0e8f7f09b
      there is a warning
      ../../gcc/config/ia64/ia64.cc: In function ‘void ia64_start_function(FILE*, const char*, tree)’:
      ../../gcc/config/ia64/ia64.cc:3889:59: warning: unused parameter ‘decl’ [-Wunused-parameter]
       3889 | ia64_start_function (FILE *file, const char *fnname, tree decl)
            |                                                      ~~~~~^~~~
      which presumably breaks the bootstrap, where warnings are treated as
      errors.
      While the decl parameter is passed to the ASM_OUTPUT_FUNCTION_LABEL macro,
      that macro actually doesn't use that argument, so the removal of
      ATTRIBUTE_UNUSED was incorrect.
      This patch reverts the first ia64.cc hunk from r14-6945.
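      The situation can be sketched as below (identifiers are illustrative,
      not the real ia64.cc code).  The macro that the parameter is forwarded
      to discards its argument during preprocessing, so the parameter never
      appears in the function body and -Wunused-parameter fires unless the
      parameter carries GCC's ATTRIBUTE_UNUSED, spelled out here as the
      underlying GNU attribute:

      ```cpp
      #include <cassert>

      #define MY_ATTRIBUTE_UNUSED __attribute__ ((__unused__))

      /* Expands without using 'decl', mirroring targets where
         ASM_OUTPUT_FUNCTION_LABEL ignores that argument.  */
      #define OUTPUT_LABEL_SKETCH(fnname, decl) (fnname)

      /* Without MY_ATTRIBUTE_UNUSED on 'decl', compiling this with
         -Wunused-parameter warns, because the macro expansion drops the
         argument before the compiler ever sees it used.  */
      inline const char *
      start_function_sketch (const char *fnname, int decl MY_ATTRIBUTE_UNUSED)
      {
        return OUTPUT_LABEL_SKETCH (fnname, decl);
      }
      ```

      This is why removing ATTRIBUTE_UNUSED was incorrect even though the
      argument is textually passed to the macro.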
      
      2024-01-23  Jeff Law  <jlaw@ventanamicro.com>
      	    Jakub Jelinek  <jakub@redhat.com>
      
      	* config/ia64/ia64.cc (ia64_start_function): Add ATTRIBUTE_UNUSED to
      	decl.
      ac98aa78
    • Richard Biener's avatar
      Refactor exit PHI handling in vectorizer epilogue peeling · 02e68389
      Richard Biener authored
      This refactors the handling of PHIs in between the main and the
      epilogue loop.  Instead of trying to handle the multiple-exit
      and the original single-exit case together, the following separates
      these cases, resulting in code that is much easier to understand.
      
      	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
      	Separate single and multi-exit case when creating PHIs between
      	the main and epilogue.
      02e68389
    • Richard Sandiford's avatar
      aarch64: Avoid registering duplicate C++ overloads [PR112989] · 659a5a90
      Richard Sandiford authored
      In the original fix for this PR, I'd made sure that
      including <arm_sme.h> didn't reach the final return in
      simulate_builtin_function_decl (which would indicate duplicate
      function definitions).  But it seems I forgot to do the same
      thing for C++, which defines all of its overloads directly.
      
      This patch fixes a case where we still recorded duplicate
      functions for C++.  Thanks to Iain for reporting the resulting
      GC ICE and for help with reproducing it.
      
      gcc/
      	PR target/112989
      	* config/aarch64/aarch64-sve-builtins-shapes.cc (build_one): Skip
      	MODE_single variants of functions that don't take tuple arguments.
      659a5a90