  1. Sep 04, 2024
    • [PATCH 1/3] RISC-V: Improve codegen for negative repeating large constants · cbea72b2
      Raphael Moreira Zinsly authored
      Improve handling of constants whose upper and lower 32-bit
      halves are the same and have negative values.
      
      e.g. for:
      
      unsigned long f (void) { return 0xf0f0f0f0f0f0f0f0UL; }
      
      Without the patch:
      
      li      a0,-252645376
      addi    a0,a0,240
      li      a5,-252645376
      addi    a5,a5,241
      slli    a5,a5,32
      add     a0,a5,a0
      
      With the patch:
      
      li      a5,252645376
      addi    a5,a5,-241
      slli    a0,a5,32
      add     a0,a0,a5
      xori    a0,a0,-1
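
      The arithmetic behind the patched sequence can be checked with a small
      C sketch.  The function name is hypothetical and the code only mirrors
      the five instructions above on a 64-bit unsigned value; it is not taken
      from riscv.cc.

      ```c
      #include <stdint.h>

      /* Mirrors the patched sequence: build the complement of one 32-bit
         half, replicate it into both halves, then invert all 64 bits.  */
      uint64_t synth_repeating_negative(void)
      {
          uint64_t a5 = 252645376;   /* li   a5,252645376  (0x0f0f1000) */
          a5 += (uint64_t)-241;      /* addi a5,a5,-241    (0x0f0f0f0f) */
          uint64_t a0 = a5 << 32;    /* slli a0,a5,32 */
          a0 += a5;                  /* add  a0,a0,a5      (0x0f0f0f0f0f0f0f0f) */
          a0 ^= (uint64_t)-1;        /* xori a0,a0,-1 */
          return a0;
      }
      ```

      Calling synth_repeating_negative () yields 0xf0f0f0f0f0f0f0f0, the
      constant returned by f above.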
      
      gcc/ChangeLog:
      	* config/riscv/riscv.cc (riscv_split_integer_cost): Adjust the
      	cost of negative repeating constants.
      	(riscv_split_integer): Handle negative repeating constants.
      
      gcc/testsuite/ChangeLog:
      	* gcc.target/riscv/synthesis-11.c: New test.
      cbea72b2
    • Check DECL_NAMELESS in modified_type_die · 5326306e
      Tom Tromey authored
      While working on a patch to the Ada compiler, I found a spot in
      dwarf2out.cc that calls add_name_attribute without respecting
      DECL_NAMELESS.
      
      gcc
      
      	* dwarf2out.cc (modified_type_die): Check DECL_NAMELESS.
      5326306e
    • [RISC-V] Fix scan test output after recent path-splitting changes · 0455e85e
      Jeff Law authored
      The recent path splitting changes from Andrew result in identifying more
      saturation idioms instead of just identifying an overflow check.  As a result
      many of the tests in the RISC-V port started failing a scan check on the
      .expand output.
      
      As expected, identifying a saturation idiom is more helpful than identifying an
      overflow check and the resultant code is better based on my spot checks.
      
      So the right thing to do is to expect more saturation intrinsics in the .expand
      output.
      
      I've verified this fixes the regressions for riscv32-elf and riscv64-elf.
      Pushing to the trunk.
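
      For readers unfamiliar with the idiom, here is a minimal sketch of an
      unsigned saturating add loop.  The function name and the exact
      branchless form are assumptions made for illustration; the adjusted
      tests may spell the idiom differently.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      /* Element-wise unsigned saturating add: if x[i] + y[i] wraps, the
         comparison is true and the OR with all-ones clamps to UINT32_MAX.  */
      void vec_sat_u_add_sketch(uint32_t *out, const uint32_t *x,
                                const uint32_t *y, size_t n)
      {
          for (size_t i = 0; i < n; i++) {
              uint32_t sum = x[i] + y[i];
              out[i] = sum | -(uint32_t)(sum < x[i]);
          }
      }
      ```

      The "sum < x[i]" comparison on its own is an overflow check; combined
      with the clamp it forms the saturation idiom described above.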
      
      gcc/testsuite
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-13.c: Adjust
      	expected output.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-14.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-15.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-16.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-17.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-18.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-19.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-20.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-1.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-2.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-5.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-6.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-9.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-10.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-13.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-14.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm_reconcile-15.c:
      	Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-9.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-10.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-11.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-12.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-13.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-14.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-15.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-16.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-17.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-18.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-19.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-20.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-21.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-22.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-23.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-24.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-33.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-34.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-35.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-36.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-37.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-38.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-39.c: Likewise.
      	* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-40.c: Likewise.
      0455e85e
    • c++: cleanup coerce_template_template_parm · dedf4534
      Marek Polacek authored
      This function could use some sprucing up.
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (coerce_template_template_parm): Return bool instead of int.
      dedf4534
    • c++: noexcept and pointer to member function type [PR113108] · c755c7a3
      Marek Polacek authored
      We ICE in nothrow_spec_p because it got a DEFERRED_NOEXCEPT.
      This DEFERRED_NOEXCEPT was created in implicitly_declare_fn
      when declaring
      
        Foo& operator=(Foo&&) = default;
      
      in the test.  The problem is that in resolve_overloaded_unification
      we call maybe_instantiate_noexcept before try_one_overload only in
      the TEMPLATE_ID_EXPR case.
      
      	PR c++/113108
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (resolve_overloaded_unification): Call
      	maybe_instantiate_noexcept.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp1z/noexcept-type28.C: New test.
      c755c7a3
    • c++: add a testcase for [PR 108620] · 858918ef
      Arsen Arsenović authored
      Fixed by r15-2540-g32e678b2ed7521.  Add a testcase, as the original ones
      do not cover this particular failure mode.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/108620
      	* g++.dg/coroutines/pr108620.C: New test.
      858918ef
    • coros: mark .CO_YIELD as LEAF [PR106973] · 7b7ad3f4
      Arsen Arsenović authored
      We rely on .CO_YIELD calls being followed by an assignment (optionally)
      and then a switch/if in the same basic block.  This implies that a
      .CO_YIELD can never end a block.  However, since a call to .CO_YIELD is
      still a call, if the function containing it calls setjmp, GCC thinks
      that the .CO_YIELD can introduce abnormal control flow, and generates an
      edge for the call.
      
      We know this is not the case; .CO_YIELD calls get removed quite early on
      and have no effect, and result in no other calls, so .CO_YIELD can be
      considered a leaf function, preventing generating an edge when calling
      it.
      
      PR c++/106973 - coroutine generator and setjmp
      
      	PR c++/106973
      
      gcc/ChangeLog:
      
      	* internal-fn.def (CO_YIELD): Mark as ECF_LEAF.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/coroutines/pr106973.C: New test.
      7b7ad3f4
    • object-size: Use simple_dce_from_worklist in object-size pass · 97e011a4
      Andrew Pinski authored
      
      While trying to see if there was a way to improve object-size pass
      to use the ranger (for pointer plus), I noticed that it leaves around
      the statement containing __builtin_object_size if it was reduced to a constant.
      This fixes that by using simple_dce_from_worklist.
      
      Bootstrapped and tested on x86_64-linux-gnu.
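
      As an illustration of a call that can be reduced to a constant, here is
      a minimal sketch.  The function name is hypothetical and the example is
      not from the patch or its testsuite.

      ```c
      #include <stddef.h>

      /* The size of buf is known at compile time, so the builtin call can be
         replaced by a constant.  */
      size_t buffer_capacity(void)
      {
          char buf[16];
          /* Type 0: number of bytes from the pointer to the end of the
             whole object it points into.  */
          return __builtin_object_size(buf, 0);
      }
      ```

      Once the call is replaced by the constant, the statement that held it
      is dead, which is the leftover this change cleans up.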
      
      gcc/ChangeLog:
      
      	* tree-object-size.cc (object_sizes_execute): Mark lhs for maybe dceing
      	if doing a propagate. Call simple_dce_from_worklist.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      97e011a4
    • Use dg-additional-options for gfortran.dg/vect/vect-8.f90 and RISC-V · 284feaa8
      Richard Biener authored
      r14-9122-g67a29f99cc8138 disabled scheduling on a lot of testcases
      for RISC-V for PR113249 but using dg-options.  This makes
      gfortran.dg/vect/vect-8.f90 UNRESOLVED as it relies on default
      flags to enable vectorization.
      
      The following uses dg-additional-options instead.
      
      Tested on riscv64-linux with qemu-user, pushed.
      
      I didn't check all the other adjusted tests for similar issues.
      
      	* gfortran.dg/vect/vect-8.f90: Use dg-additional-options.
      284feaa8
    • nvptx: Use 'enum ptx_version', 'enum ptx_isa' instead of 'int' · fee2fbed
      Thomas Schwinge authored
      This allows getting rid of the respective type casts.  No change in behavior
      intended.
      
      	gcc/
      	* config/nvptx/gen-opt.sh: Use 'enum ptx_isa' instead of 'int'.
      	* config/nvptx/nvptx-gen.opt: Regenerate.
      	* config/nvptx/nvptx.opt: Use 'enum ptx_version' instead of 'int'.
      	* config/nvptx/nvptx-opts.h (enum ptx_isa): Add 'PTX_ISA_unset'.
      	(enum ptx_version): Add 'PTX_VERSION_unset'.
      	* config/nvptx/nvptx-c.cc (nvptx_cpu_cpp_builtins): Adjust.
      	* config/nvptx/nvptx.cc (default_ptx_version_option)
      	(handle_ptx_version_option, nvptx_option_override)
      	(nvptx_file_start): Likewise.
      fee2fbed
    • Fix branch prediction dump message · 35e4414b
      Frederik Harwath authored
      
      Instead of, for instance, "Loop got predicted 1 to iterate 10 times"
      the message should be "Loop 1 got predicted to iterate 10 times".
      
      gcc/ChangeLog:
      
      	* predict.cc (pass_profile::execute): Fix dump message.
      
      Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
      35e4414b
    • Fix gimple_debug_cfg declaration · 347a953d
      Frederik Harwath authored
      Silence a warning. The argument type did not match the definition.
      
      gcc/ChangeLog:
      
      	* tree-cfg.h (gimple_debug_cfg): Change argument type from int
      	to dump_flags_t.
      347a953d
    • Document 'pass_postreload' vs. 'pass_late_compilation' · 438381ef
      Thomas Schwinge authored
      See Subversion r217124 (Git commit 433e4164)
      "Reorganize post-ra pipeline for targets without register allocation".
      
      	gcc/
      	* passes.cc: Document 'pass_postreload' vs. 'pass_late_compilation'.
      	* passes.def: Likewise.
      438381ef
    • nvptx: Specify '-mno-alias' for 'gcc.dg/pr60797.c' [PR60797, PR104957] · b9be3113
      Thomas Schwinge authored
      2014 Subversion r209299 (Git commit 8330537b)
      "Fix PR60797" added this test case, which we now amend so that it's able to
      test its thing also in '--target=nvptx-none' configurations with symbol alias
      support enabled (..., and test nvptx '-mno-alias').
      
      	PR middle-end/60797
      	PR target/104957
      	gcc/testsuite/
      	* gcc.dg/pr60797.c: For nvptx, specify '-mno-alias'.
      b9be3113
    • Add 'gcc.target/nvptx/alias-to-alias-1.c' · a89321c8
      Thomas Schwinge authored
      ... similar to alias to alias usage in 'libgomp.c-c++-common/pr96390.c'.
      
      	PR target/104957
      	gcc/testsuite/
      	* gcc.target/nvptx/alias-to-alias-1.c: New.
      a89321c8
    • Add 'gcc.target/nvptx/alias-weak-1.c' · 2267d254
      Thomas Schwinge authored
      ... testing for the GCC/nvptx "weak alias definitions not supported" error
      diagnostic (limitation of PTX).
      
      	gcc/testsuite/
      	* gcc.target/nvptx/alias-weak-1.c: New.
      2267d254
    • rust: avoid clobbering LIBS · da3a2985
      Marc Poulhiès authored
      
      Save LIBS around calls to AC_SEARCH_LIBS to avoid clobbering $LIBS.
      
      ChangeLog:
      
      	* configure: Regenerate.
      	* configure.ac: Save LIBS around calls to AC_SEARCH_LIBS.
      
      Signed-off-by: Marc Poulhiès <dkm@kataplop.net>
      Reviewed-by: Thomas Schwinge <tschwinge@baylibre.com>
      Tested-by: Thomas Schwinge <tschwinge@baylibre.com>
      da3a2985
    • Also lower SLP grouped loads with just one consumer · 7164d982
      Richard Biener authored
      This makes sure to produce interleaving schemes or load-lanes
      for single-element interleaving and other permutes that otherwise
      would use more than three vectors.
      
      It exposes the latent issue that single-element interleaving with
      large gaps can be inefficient - the mitigation in get_group_load_store_type
      doesn't trigger when we clear the load permutation.
      
      It also exposes the fact that not all permutes can be lowered in
      the best way in a vector length agnostic way so I've added an
      exception to keep power-of-two size contiguous aligned chunks
      unlowered (unless we want load-lanes).  The optimal handling
      of load/store vectorization is going to continue to be a learning
      process.
      
      	* tree-vect-slp.cc (vect_lower_load_permutations): Also
      	process single-use grouped loads.
      	Avoid lowering contiguous aligned power-of-two sized
      	chunks, those are better handled by the vector size
      	specific SLP code generation.
      	* tree-vect-stmts.cc (get_group_load_store_type): Drop
      	the unrelated requirement of a load permutation for the
      	single-element interleaving limit.
      
      	* gcc.dg/vect/slp-46.c: Remove XFAIL.
      7164d982
    • Zen5 tuning part 5: update instruction latencies in x86-tune-costs · 4292297a
      Jan Hubicka authored
      There is nothing exciting in this patch.  I measured latencies and also compared
      them with the newly released optimization guide.  There are no dramatic changes
      compared to zen4.  One interesting new bit is that addss is faster and can be
      2 cycles when fed by another addss.
      
      I also increased the large insn bound since the decoders no longer seem to require
      instructions to be 8 bytes or less.
      
      gcc/ChangeLog:
      
      	* config/i386/x86-tune-costs.h (znver5_cost): Update instruction
      	costs.
      4292297a
    • expand: Add dump for costing of positive divides · dbd0eb39
      Andrew Pinski authored
      
      While trying to understand PR 115910 I found it useful to print out
      the two costs of doing a signed and an unsigned division, just as was added in
      r15-3272-g3c89c41991d8e8 for popcount==1.
      
      Bootstrapped and tested on x86_64-linux-gnu.
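
      The reason two costs exist at all can be shown with a short sketch:
      when both operands are known to be non-negative, signed and unsigned
      division give the same quotient, so either expansion is valid and the
      cheaper one can be chosen.  The function names are hypothetical and the
      code is not from expr.cc.

      ```c
      #include <stdint.h>

      /* Plain signed division.  */
      int32_t div_signed(int32_t a, int32_t b)
      {
          return a / b;
      }

      /* Same quotient through unsigned division; only valid when both
         a and b are non-negative (and b is nonzero).  */
      int32_t div_via_unsigned(int32_t a, int32_t b)
      {
          return (int32_t)((uint32_t)a / (uint32_t)b);
      }
      ```

      For example, both functions return 14 for the operands 100 and 7.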
      
      gcc/ChangeLog:
      
      	* expr.cc (expand_expr_divmod): Add dump of the two costs for
      	positive division.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      dbd0eb39
    • CRIS: Add new peephole2 "lra_szext_decomposed_indir_plus" · 62dd893f
      Hans-Peter Nilsson authored
      Exposed when running the test-suite with -flate-combine-instructions.
      
      	* config/cris/cris.md (lra_szext_decomposed_indir_plus): New
      	peephole2 pattern.
      62dd893f
    • RISC-V: Allow IMM operand for unsigned scalar .SAT_ADD · 9ea9d059
      Pan Li authored
      
      This patch allows an immediate (IMM) operand for the unsigned
      scalar .SAT_ADD.  Like operand 0, operand 1 of .SAT_ADD
      is zero extended to Xmode before the underlying code generation.
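
      A scalar saturating add with a constant operand might look like the
      following sketch.  The function name, the uint8_t type, and the
      constant 9 are assumptions made for illustration and are not taken from
      the sat_u_add tests.

      ```c
      #include <stdint.h>

      /* Unsigned saturating add of a constant: if the 8-bit sum wraps,
         clamp the result to UINT8_MAX.  */
      uint8_t sat_u_add_imm9(uint8_t x)
      {
          uint8_t sum = (uint8_t)(x + 9);
          return sum < x ? UINT8_MAX : sum;
      }
      ```

      Here the constant plays the role of operand 1, which is narrower than
      Xmode just like x.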
      
      The following test suite passes with this patch:
      * The full rv64gcv regression test.
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv.cc (riscv_expand_usadd): Zero extend
      	the second operand of usadd as the first operand does.
      	* config/riscv/riscv.md (usadd<m>3): Allow imm operand for
      	scalar usadd pattern.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/sat_u_add-11.c: Make asm check robust.
      	* gcc.target/riscv/sat_u_add-15.c: Ditto.
      	* gcc.target/riscv/sat_u_add-19.c: Ditto.
      	* gcc.target/riscv/sat_u_add-23.c: Ditto.
      	* gcc.target/riscv/sat_u_add-3.c: Ditto.
      	* gcc.target/riscv/sat_u_add-7.c: Ditto.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
      9ea9d059
    • aarch64: Fix testcase vec-init-22-speed.c [PR116589] · d8bc31d9
      Andrew Pinski authored
      
      For this testcase, the trunk produces:
      ```
      f_s16:
              fmov    s31, w0
              fmov    s0, w1
      ```
      
      While the testcase was expecting what was produced in GCC 14:
      ```
      f_s16:
              sxth    w0, w0
              sxth    w1, w1
              fmov    d31, x0
              fmov    d0, x1
      ```
      
      After r15-1575-gea8061f46a30 the code was:
      ```
              dup     v31.4h, w0
              dup     v0.4h, w1
      ```
      But when ext-dce was added with r15-1901-g98914f9eba5f19, we get the better code generation now and only fmov's.
      
      Pushed as obvious after running the testcase.
      
      	PR target/116589
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/vec-init-22-speed.c: Update scan for better code gen.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      d8bc31d9
  2. Sep 03, 2024
    • split-path: Improve ifcvt heuristic for split path [PR112402] · b2b20b27
      Andrew Pinski authored
      
      This simplifies the heuristic for split path to see if the join
      bb is an ifcvt candidate.
      The predecessor bbs need either to be empty or to have only one
      statement in them, which could be a decent ifcvt candidate.
      The previous heuristics would miss that:
      ```
      if (a) goto B else goto C;
      B:  goto C;
      C:
      c = PHI<d,e>
      ```
      
      Would be a decent ifcvt candidate. And would also miss:
      ```
      if (a) goto B else goto C;
      B: d = f + 1;  goto C;
      C:
      c = PHI<d,e>
      ```
      
      Also, since the maximum number of cmovs that can currently be produced is 3, we
      should only assume `<= 3` phis can be ifcvt candidates.
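
      At the source level, the second shape above corresponds roughly to the
      following sketch.  The function and variable names are hypothetical and
      the code is not taken from the new split-path-13.c test.

      ```c
      /* One predecessor of the join is empty, the other holds the single
         statement d = f + 1; the join selects between the two values.  */
      int select_sketch(int a, int e, int f)
      {
          int c = e;        /* value reaching the join when a is false */
          if (a)
              c = f + 1;    /* block B: d = f + 1 */
          return c;         /* join C: c = PHI<d, e> */
      }
      ```

      A function of this shape has a single PHI at the join, well within the
      `<= 3` limit, so it can become one conditional move.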
      
      The testcase change for split-path-6.c is that the lookharder function
      is a true ifcvt case where we would get cmov as expected; it looks like it
      was not a candidate when the heuristic was added but became one later on.
      pr88797.C is now rejected via it being an ifcvt candidate rather than being about
      DCE/const prop.
      
      The rest of the testsuite changes are just slight changes in the dump,
      removing the "*diamond" part as it was removed from the print.
      
      Bootstrapped and tested on x86_64.
      
      	PR tree-optimization/112402
      
      gcc/ChangeLog:
      
      	* gimple-ssa-split-paths.cc (poor_ifcvt_pred): New function.
      	(is_feasible_trace): Remove old heuristics for ifcvt cases.
      	For num_stmts <=1 for both pred check poor_ifcvt_pred on both
      	pred.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/split-path-11.c: Update scan.
      	* gcc.dg/tree-ssa/split-path-2.c: Update scan.
      	* gcc.dg/tree-ssa/split-path-5.c: Update scan.
      	* gcc.dg/tree-ssa/split-path-6.c: Update scan.
      	* g++.dg/tree-ssa/pr88797.C: Update scan.
      	* gcc.dg/tree-ssa/split-path-13.c: New test.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      b2b20b27
    • split-paths: Move check for # of statements in join earlier · 77e17558
      Andrew Pinski authored
      
      This moves the check for # of statements to copy in join to
      be the first check. This check is the cheapest check so it
      should be first. Plus add a print to the dump file since there
      was none beforehand.
      
      gcc/ChangeLog:
      
      	* gimple-ssa-split-paths.cc (is_feasible_trace): Move
      	check for # of statements in join earlier and add a
      	debug print.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      77e17558
    • Explicitly document that the "counted_by" attribute is only supported in C. · f9642ffe
      Qing Zhao authored
      The "counted_by" attribute is currently only supported in C.  Mention this
      explicitly in the documentation and also issue a warning under -Wattributes
      when the "counted_by" attribute is seen in C++.
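
      For reference, a minimal C sketch of the attribute on a flexible array
      member follows.  The struct, the helper, and the COUNTED_BY macro are
      hypothetical names; the macro guard is only there so the sketch still
      compiles with compilers that lack the attribute.

      ```c
      #include <stdlib.h>

      #ifndef __has_attribute
      #define __has_attribute(x) 0
      #endif
      #if __has_attribute(counted_by)
      #define COUNTED_BY(member) __attribute__((counted_by(member)))
      #else
      #define COUNTED_BY(member)
      #endif

      /* count records how many elements data holds.  */
      struct int_buf {
          size_t count;
          int data[] COUNTED_BY(count);
      };

      /* Allocate a buffer with room for n elements and record the count.  */
      struct int_buf *int_buf_new(size_t n)
      {
          struct int_buf *b = malloc(sizeof *b + n * sizeof b->data[0]);
          if (b)
              b->count = n;
          return b;
      }
      ```

      Compiled as C++, the same declaration would have the attribute ignored
      with a -Wattributes warning, per this change.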
      
      gcc/c-family/ChangeLog:
      
      	* c-attribs.cc (handle_counted_by_attribute): Is ignored and issues
      	warning with -Wattributes in C++ for now.
      
      gcc/ChangeLog:
      
      	* doc/extend.texi: Explicitly mentions counted_by is available
      	only in C for now.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/ext/flex-array-counted-by.C: New test.
      	* g++.dg/ext/flex-array-counted-by-2.C: New test.
      f9642ffe
    • c++: support C++11 attributes in C++98 · 3775f71c
      Jason Merrill authored
      I don't see any reason why we can't allow the [[]] attribute syntax in C++98
      mode with a pedwarn just like many other C++11 features.  In fact, we
      already do support it in some places in the grammar, but not in places that
      check cp_nth_tokens_can_be_std_attribute_p.
      
      Let's also follow the C front-end's lead in only warning about them when
      -pedantic.
      
      It still isn't necessary for this function to guard against Objective-C
      message passing syntax; we handle that with tentative parsing in
      cp_parser_statement, and we don't call this function in that context anyway.
      
      gcc/cp/ChangeLog:
      
      	* parser.cc (cp_nth_tokens_can_be_std_attribute_p): Don't check
      	cxx_dialect.
      	* error.cc (maybe_warn_cpp0x): Only complain about C++11 attributes
      	if pedantic.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp0x/gen-attrs-1.C: Also run in C++98 mode.
      	* g++.dg/cpp0x/gen-attrs-11.C: Likewise.
      	* g++.dg/cpp0x/gen-attrs-13.C: Likewise.
      	* g++.dg/cpp0x/gen-attrs-15.C: Likewise.
      	* g++.dg/cpp0x/gen-attrs-75.C: Don't expect C++98 warning after
      	__extension__.
      3775f71c
    • PR116080: Fix test suite checks for musttail · 1fad396d
      Andi Kleen authored
      This is a new attempt to fix PR116080. The previous try was reverted
      because it just broke a bunch of tests, hiding the problem.
      
      - musttail behaves differently than tailcall at -O0. Some of the tests
      run at -O0, so add separate effective target tests for musttail.
      - New effective target tests need to use unique file names
      to make dejagnu caching work.
      - Change the tests to use the new targets.
      - Add an external_musttail test to check for the target's ability
      to do tail calls between translation units. This covers some powerpc
      ABIs.
      
      gcc/testsuite/ChangeLog:
      
      	PR testsuite/116080
      	* c-c++-common/musttail1.c: Use musttail target.
      	* c-c++-common/musttail12.c: Use struct_musttail target.
      	* c-c++-common/musttail2.c: Use musttail target.
      	* c-c++-common/musttail3.c: Likewise.
      	* c-c++-common/musttail4.c: Likewise.
      	* c-c++-common/musttail7.c: Likewise.
      	* c-c++-common/musttail8.c: Likewise.
      	* g++.dg/musttail10.C: Likewise. Replace powerpc checks with
      	external_musttail.
      	* g++.dg/musttail11.C: Use musttail target.
      	* g++.dg/musttail6.C: Use musttail target. Replace powerpc
      	checks with external_musttail.
      	* g++.dg/musttail9.C: Use musttail target.
      	* lib/target-supports.exp: Add musttail, struct_musttail,
      	external_musttail targets. Remove optimization for musttail.
      	Use unique file names for musttail.
      1fad396d
    • pretty-print: split up pretty_printer::format into subroutines · 07e74798
      David Malcolm authored
      
      The body of pretty_printer::format is almost 500 lines long,
      mostly comprising two distinct phases.
      
      This patch splits it up so that there are explicit subroutines
      for the two different phases, reducing the scope of various
      locals, and making it easier to e.g. put a breakpoint on phase 2.
      
      No functional change intended.
      
      gcc/ChangeLog:
      	* pretty-print-markup.h (pp_markup::context::context): Drop
      	params "buf" and "chunk_idx", initializing m_buf from pp.
      	(pp_markup::context::m_chunk_idx): Drop field.
      	* pretty-print.cc (pretty_printer::format): Convert param
      	from a text_info * to a text_info &.  Split out phase 1
      	and phase 2 into subroutines...
      	(format_phase_1): New, from pretty_printer::format.
      	(format_phase_2): Likewise.
      	* pretty-print.h (pretty_printer::format): Convert param
      	from a text_info * to a text_info &.
      	(pp_format): Update for above change.  Assert that text_info is
      	non-null.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      07e74798
    • pretty-print: add selftest of pp_format's stack · d0891f3a
      David Malcolm authored
      
      gcc/ChangeLog:
      	* pretty-print-format-impl.h (pp_formatted_chunks::get_prev): New
      	accessor.
      	* pretty-print.cc (selftest::push_pp_format): New.
      	(ASSERT_TEXT_TOKEN): New macro.
      	(selftest::test_pp_format_stack): New test.
      	(selftest::pretty_print_cc_tests): New.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      d0891f3a
    • pretty-print: naming cleanups · 34f01475
      David Malcolm authored
      
      This patch is a followup to r15-3311-ge31b6176996567 making some
      cleanups to pretty-printing to reflect those changes:
      - renaming "chunk_info" to "pp_formatted_chunks"
      - renaming "cur_chunk_array" to "m_cur_formatted_chunks"
      - rewording/clarifying comments
      and taking the opportunity to add a "m_" prefix to all fields of
      output_buffer.
      
      No functional change intended.
      
      gcc/analyzer/ChangeLog:
      	* analyzer-logging.cc (logger::logger): Prefix all output_buffer
      	fields with "m_".
      
      gcc/c-family/ChangeLog:
      	* c-ada-spec.cc (dump_ada_node): Prefix all output_buffer fields
      	with "m_".
      	* c-pretty-print.cc (pp_c_integer_constant): Likewise.
      	(pp_c_integer_constant): Likewise.
      	(pp_c_floating_constant): Likewise.
      	(pp_c_fixed_constant): Likewise.
      
      gcc/c/ChangeLog:
      	* c-objc-common.cc (print_type): Prefix all output_buffer fields
      	with "m_".
      
      gcc/cp/ChangeLog:
      	* error.cc (type_to_string): Prefix all output_buffer fields with
      	"m_".
      	(append_formatted_chunk): Likewise.  Rename "chunk_info" to
      	"pp_formatted_chunks" and field cur_chunk_array with
      	m_cur_formatted_chunks.
      
      gcc/fortran/ChangeLog:
      	* error.cc (gfc_move_error_buffer_from_to): Prefix all
      	output_buffer fields with "m_".
      	(gfc_diagnostics_init): Likewise.
      
      gcc/ChangeLog:
      	* diagnostic.cc (diagnostic_set_caret_max_width): Prefix all
      	output_buffer fields with "m_".
      	* dumpfile.cc (emit_any_pending_textual_chunks): Likewise.
      	(emit_any_pending_textual_chunks): Likewise.
      	* gimple-pretty-print.cc (gimple_dump_bb_buff): Likewise.
      	* json.cc (value::dump): Likewise.
      	* pretty-print-format-impl.h (class chunk_info): Rename to...
      	(class pp_formatted_chunks): ...this.  Add friend
      	class output_buffer.  Update comment near end of decl to show
      	the pp_formatted_chunks instance on the chunk_obstack.
      	(pp_formatted_chunks::pop_from_output_buffer): Delete decl.
      	(pp_formatted_chunks::on_begin_quote): Delete decl that should
      	have been removed in r15-3311-ge31b6176996567.
      	(pp_formatted_chunks::on_end_quote): Likewise.
      	(pp_formatted_chunks::m_prev): Update for renaming.
      	* pretty-print.cc (output_buffer::output_buffer): Prefix all
      	fields with "m_".  Rename "cur_chunk_array" to
      	"m_cur_formatted_chunks".
      	(output_buffer::~output_buffer): Prefix all fields with "m_".
      	(output_buffer::push_formatted_chunks): New.
      	(output_buffer::pop_formatted_chunks): New.
      	(pp_write_text_to_stream): Prefix all output_buffer fields with
      	"m_".
      	(pp_write_text_as_dot_label_to_stream): Likewise.
      	(pp_write_text_as_html_like_dot_to_stream): Likewise.
      	(chunk_info::append_formatted_chunk): Rename to...
      	(pp_formatted_chunks::append_formatted_chunk): ...this.
      	(chunk_info::pop_from_output_buffer): Delete.
      	(pretty_printer::format): Update leading comment to mention
      	pushing pp_formatted_chunks, and to reflect changes in
      	r15-3311-ge31b6176996567.  Prefix all output_buffer fields with
      	"m_".
      	(pp_output_formatted_text): Update leading comment to mention
      	popping a pp_formatted_chunks, and to reflect the changes in
      	r15-3311-ge31b6176996567.  Prefix all output_buffer fields with
      	"m_" and rename "cur_chunk_array" to "m_cur_formatted_chunks".
      	Replace call to chunk_info::pop_from_output_buffer with a call to
      	output_buffer::pop_formatted_chunks.
      	(pp_flush): Prefix all output_buffer fields with "m_".
      	(pp_really_flush): Likewise.
      	(pp_clear_output_area): Likewise.
      	(pp_append_text): Likewise.
      	(pretty_printer::remaining_character_count_for_line): Likewise.
      	(pp_newline): Likewise.
      	(pp_character): Likewise.
      	(pp_markup::context::push_back_any_text): Likewise.
      	* pretty-print.h (class chunk_info): Rename to...
      	(class pp_formatted_chunks): ...this.
      	(class output_buffer): Delete unimplemented rule-of-5 members.
      	(output_buffer::push_formatted_chunks): New decl.
      	(output_buffer::pop_formatted_chunks): New decl.
      	(output_buffer::formatted_obstack): Rename to...
      	(output_buffer::m_formatted_obstack): ...this.
      	(output_buffer::chunk_obstack): Rename to...
      	(output_buffer::m_chunk_obstack): ...this.
      	(output_buffer::obstack): Rename to...
      	(output_buffer::m_obstack): ...this.
      	(output_buffer::cur_chunk_array): Rename to...
      	(output_buffer::m_cur_formatted_chunks): ...this.
      	(output_buffer::stream): Rename to...
      	(output_buffer::m_stream): ...this.
      	(output_buffer::line_length): Rename to...
      	(output_buffer::m_line_length): ...this.
      	(output_buffer::digit_buffer): Rename to...
      	(output_buffer::m_digit_buffer): ...this.
      	(output_buffer::flush_p): Rename to...
      	(output_buffer::m_flush_p): ...this.
      	(output_buffer_formatted_text): Prefix all output_buffer fields
      	with "m_".
      	(output_buffer_append_r): Likewise.
      	(output_buffer_last_position_in_text): Likewise.
      	(pretty_printer::set_output_stream): Likewise.
      	(pp_scalar): Likewise.
      	(pp_wide_int): Likewise.
      	* tree-pretty-print.cc (dump_generic_node): Likewise.
      	(dump_generic_node): Likewise.
      	(pp_double_int): Likewise.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      34f01475
    • Marek Polacek's avatar
      c++: add fixed test [PR109095] · 5f3a6e26
      Marek Polacek authored
      Fixed by r13-6693.
      
      	PR c++/109095
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/nontype-class66.C: New test.
      5f3a6e26
    • Jan Hubicka's avatar
      Zen5 tuning part 4: update reassociation width · f0ab3de6
      Jan Hubicka authored
      Zen5 has 6 instead of 4 ALUs and the integer multiplication can now execute in
      3 of them.  FP units can do 2 additions and 2 multiplications with latency 2
      and 3.  This patch updates reassociation width accordingly.  This has the
      potential to increase register pressure, but unlike when benchmarking znver1
      tuning, I did not notice this actually causing problems on SPEC, so this patch bumps
      up reassociation width to 6 for everything except for integer vectors, where
      there are 4 units with typical latency of 1.
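
      As a rough illustration of what reassociation width controls, the sketch
      below sums the same data once as a single dependency chain and once as
      several independent chains that a wide core can execute in parallel.  The
      function names are hypothetical and the code is a hand-written analogue,
      not what the reassociation pass emits; unsigned integers are used so that
      both orderings give the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One dependency chain: every addition has to wait for the previous one.
std::uint64_t sum_serial(const std::vector<std::uint64_t>& v)
{
  std::uint64_t s = 0;
  for (std::uint64_t x : v)
    s += x;
  return s;
}

// The same sum split into Width independent chains.  Width plays the role
// of the reassociation width here (an analogy, not GCC's implementation).
template <std::size_t Width>
std::uint64_t sum_reassociated(const std::vector<std::uint64_t>& v)
{
  std::uint64_t acc[Width] = {};
  std::size_t i = 0;
  // Main loop: each accumulator forms its own dependency chain.
  for (; i + Width <= v.size(); i += Width)
    for (std::size_t k = 0; k < Width; ++k)
      acc[k] += v[i + k];
  // Leftover elements that do not fill a whole group.
  for (; i < v.size(); ++i)
    acc[0] += v[i];
  // Combine the partial sums.
  std::uint64_t s = 0;
  for (std::size_t k = 0; k < Width; ++k)
    s += acc[k];
  return s;
}
```

      Each extra accumulator is one more live value, which is where the
      register pressure concern mentioned above comes from.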
      
      Bootstrapped/regtested x86_64-linux, committed.
      
      gcc/ChangeLog:
      
      	* config/i386/i386.cc (ix86_reassociation_width): Update for Znver5.
      	* config/i386/x86-tune-costs.h (znver5_costs): Update reassociation
      	widths.
      f0ab3de6
    • Jeff Law's avatar
      Drop file that should not have been committed. · 36f63000
      Jeff Law authored
      	* J: Drop file that should not have been committed
      36f63000
    • Jan Hubicka's avatar
      Zen5 tuning part 3: fix typo in previous patch · 910e1769
      Jan Hubicka authored
      gcc/ChangeLog:
      
      	* config/i386/x86-tune-sched.cc (ix86_fuse_mov_alu_p): Fix
      	typo.
      910e1769
    • Jonathan Wakely's avatar
      libstdc++: Fix error handling in fs::hard_link_count for Windows · 71b1639c
      Jonathan Wakely authored
      The recent change to use auto_win_file_handle for
      std::filesystem::hard_link_count caused a regression. The
      std::error_code argument should be cleared if no error occurs, but this
      no longer happens. Add a call to ec.clear() in fs::hard_link_count to
      fix this.
      
      Also change the auto_win_file_handle class to take a reference to the
      std::error_code and set it if an error occurs, to slightly simplify the
      control flow in the fs::equiv_files function.
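
      The contract being restored can be sketched as a small check: an
      error_code that holds a stale error before the call should compare
      false after a successful call.  This is a minimal illustration rather
      than the new testsuite code; the helper name and the scratch file name
      are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Returns true if fs::hard_link_count clears a stale error_code on success.
// The scratch file name below is made up for this illustration.
bool hard_link_count_clears_ec()
{
  fs::path p = fs::temp_directory_path() / "hard_link_count_sketch.tmp";
  { std::ofstream out(p); out << "x"; }  // create a regular file

  // Start with a stale error so that a missing clear() is observable.
  std::error_code ec = std::make_error_code(std::errc::io_error);
  std::uintmax_t n = fs::hard_link_count(p, ec);

  // On success ec must be cleared and n must not be the error value.
  bool ok = !ec && n != static_cast<std::uintmax_t>(-1);
  fs::remove(p);
  return ok;
}
```

      Starting from a non-zero error_code matters: with a default-constructed
      one, a missing ec.clear() would go unnoticed.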
      
      libstdc++-v3/ChangeLog:
      
      	* src/c++17/fs_ops.cc (auto_win_file_handle): Add error_code&
      	member and set it if CreateFileW or GetFileInformationByHandle
      	fails.
      	(fs::equiv_files) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Simplify
      	control flow.
      	(fs::hard_link_count) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Clear ec
      	on success.
      	* testsuite/27_io/filesystem/operations/hard_link_count.cc:
      	Check error handling.
      71b1639c
    • Jonathan Wakely's avatar
      libstdc++: Specialize std::disable_sized_sentinel_for for std::move_iterator [PR116549] · 819deae0
      Jonathan Wakely authored
      LWG 3736 added a partial specialization of this variable template for
      two std::move_iterator types. This is needed for the case where the
      types satisfy std::sentinel_for and are subtractable, but do not model
      the semantic requirements of std::sized_sentinel_for.
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/116549
      	* include/bits/stl_iterator.h (disable_sized_sentinel_for):
      	Define specialization for two move_iterator types, as per LWG
      	3736.
      	* testsuite/24_iterators/move_iterator/lwg3736.cc: New test.
      819deae0
    • Richard Biener's avatar
      Dump whether a SLP node represents load/store-lanes · ef0c4482
      Richard Biener authored
      This makes it easier to discover whether SLP load or store nodes
      participate in load/store-lanes accesses.
      
      	* tree-vect-slp.cc (vect_print_slp_tree): Annotate load
      	and store-lanes nodes.
      ef0c4482
    • Richard Biener's avatar
      Fix missed peeling for gaps with SLP load-lanes · bd120de1
      Richard Biener authored
      The following stops using smaller vector accesses to avoid peeling for
      gaps when load-lanes is used.
      
      	* tree-vect-stmts.cc (get_group_load_store_type): Only disable
      	peeling for gaps by using smaller vectors when not using
      	load-lanes.
      bd120de1
    • Jan Hubicka's avatar
      Zen5 tuning part 3: scheduler tweaks · e2125a60
      Jan Hubicka authored
      This patch adds support for the new fusion in znver5 documented in the
      optimization manual:
      
         The Zen5 microarchitecture adds support to fuse reg-reg MOV Instructions
         with certain ALU instructions. The following conditions need to be met for
         fusion to happen:
           - The MOV should be reg-reg mov with Opcode 0x89 or 0x8B
           - The MOV is followed by an ALU instruction where the MOV and ALU destination register match.
           - The ALU instruction may source only registers or immediate data. There cannot be any memory source.
           - The ALU instruction sources either the source or dest of MOV instruction.
           - If ALU instruction has 2 reg sources, they should be different.
           - The following ALU instructions can fuse with an older qualified MOV instruction:
             ADD ADC AND XOR OP SUB SBB INC DEC NOT SAL / SHL SHR SAR
             (I assume OP is OR)
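
      The conditions quoted above can be sketched as a predicate over a toy
      instruction model.  Everything here (the Insn struct, the Op enum, the
      function names) is hypothetical and only mirrors the list; it is not
      GCC's RTL representation and not the actual ix86_fuse_mov_alu_p.

```cpp
#include <cassert>

// Toy model for illustration only; these types are hypothetical.
enum class Op { MOV, ADD, ADC, AND, XOR, OR, SUB, SBB, INC, DEC, NOT,
                SHL, SHR, SAR, IMUL };  // SHL also stands for SAL

constexpr int kNoReg = -1;  // marks an unused register slot

struct Insn {
  Op op;
  int dest;              // destination register number
  int src1;              // first register source, or kNoReg
  int src2;              // second register source, or kNoReg
  bool has_mem_operand;  // true if any operand is in memory
};

// ALU opcodes listed as fusible in the quoted text (reading "OP" as OR).
bool is_fusible_alu(Op op)
{
  switch (op) {
    case Op::ADD: case Op::ADC: case Op::AND: case Op::XOR: case Op::OR:
    case Op::SUB: case Op::SBB: case Op::INC: case Op::DEC: case Op::NOT:
    case Op::SHL: case Op::SHR: case Op::SAR:
      return true;
    default:
      return false;
  }
}

// Returns true if `mov` followed by `alu` meets every listed condition.
bool can_fuse_mov_alu(const Insn& mov, const Insn& alu)
{
  // The MOV must be a plain reg-reg move.
  if (mov.op != Op::MOV || mov.has_mem_operand || mov.src1 == kNoReg)
    return false;
  // The ALU opcode must be on the fusible list.
  if (!is_fusible_alu(alu.op))
    return false;
  // MOV and ALU destination registers must match.
  if (alu.dest != mov.dest)
    return false;
  // The ALU may source only registers or immediates, never memory.
  if (alu.has_mem_operand)
    return false;
  // The ALU must read the MOV's source or destination register.
  auto reads = [&](int r) { return alu.src1 == r || alu.src2 == r; };
  if (!reads(mov.src1) && !reads(mov.dest))
    return false;
  // Two register sources must be different registers.
  if (alu.src1 != kNoReg && alu.src2 != kNoReg && alu.src1 == alu.src2)
    return false;
  return true;
}
```

      Under this model a pair such as "movq %rbx, %rbp" followed by
      "shrq $6, %rbp" qualifies, while an ALU instruction with a memory
      source or a different destination register does not.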
      
      I also increased the issue rate from 4 to 6.  Theoretically znver5 can do more,
      but with our model we can't really use it.
      Increasing the issue rate to 8 leads to an infinite loop in the scheduler.
      
      Finally, I also enabled fuse_alu_and_branch since it is supported by
      znver5 (I think by earlier zens too).
      
      The new fusion pattern moves quite a few instructions around in common code:
      @@ -2210,13 +2210,13 @@
              .cfi_offset 3, -32
              leaq    63(%rsi), %rbx
              movq    %rbx, %rbp
      +       shrq    $6, %rbp
      +       salq    $3, %rbp
              subq    $16, %rsp
              .cfi_def_cfa_offset 48
              movq    %rdi, %r12
      -       shrq    $6, %rbp
      -       movq    %rsi, 8(%rsp)
      -       salq    $3, %rbp
              movq    %rbp, %rdi
      +       movq    %rsi, 8(%rsp)
              call    _Znwm
              movq    8(%rsp), %rsi
              movl    $0, 8(%r12)
      @@ -2224,8 +2224,8 @@
              movq    %rax, (%r12)
              movq    %rbp, 32(%r12)
              testq   %rsi, %rsi
      -       movq    %rsi, %rdx
              cmovns  %rsi, %rbx
      +       movq    %rsi, %rdx
              sarq    $63, %rdx
              shrq    $58, %rdx
              sarq    $6, %rbx
      which should help decoder bandwidth and perhaps also cache, though I was not
      able to measure an effect above the noise on SPEC.
      
      gcc/ChangeLog:
      
      	* config/i386/i386.h (TARGET_FUSE_MOV_AND_ALU): New tune.
      	* config/i386/x86-tune-sched.cc (ix86_issue_rate): Update for znver5.
      	(ix86_adjust_cost): Add TODO about znver5 memory latency.
      	(ix86_fuse_mov_alu_p): New.
      	(ix86_macro_fusion_pair_p): Use it.
      	* config/i386/x86-tune.def (X86_TUNE_FUSE_ALU_AND_BRANCH): Add ZNVER5.
      	(X86_TUNE_FUSE_MOV_AND_ALU): New tune.
      e2125a60