  1. Feb 08, 2025
    • Clarify that effective-targets 'exceptions' and 'exceptions_enabled' are orthogonal · 9f4feba6
      Thomas Schwinge authored
      In Subversion r268025 (Git commit 3f21b8e3)
      "Add dg-require-effective-target exceptions", effective-target 'exceptions'
      was added, which "says that AMD GCN does not support [exception handling]".
      
      In Subversion r279246 (Git commit a9046e98)
      "MSP430: Add -fno-exceptions multilib", effective-target 'exceptions_enabled'
      was added "to check if the testing configuration supports exceptions".  Testing
      "if exceptions are unsupported or disabled (e.g. by passing -fno-exceptions)"
      works as expected if exception handling is disabled at the front-end level
      ('-fno-exceptions'; the "exceptions are [...] disabled" case):
      
          exceptions_enabled2066068.cc: In function ‘void foo()’:
          exceptions_enabled2066068.cc:3:27: error: exception handling disabled, use ‘-fexceptions’ to enable
      
      However, effective-target 'exceptions_enabled' additionally assumes that
      "If exceptions aren't supported [by the target], then they're not enabled".
      This is not correct: it's not unlikely that, in presence of explicit/implicit
      '-fexceptions', exception handling code gets fully optimized away by the
      compiler, and therefore effective-target 'exceptions_enabled' test cases may
      PASS even for targets that don't support effective-target 'exceptions'; these
      two effective-targets are orthogonal concepts.
      
      (For completeness: code with trivial instances of C++ exception handling may
      translate into simple '__cxa_allocate_exception', '__cxa_throw' function calls
      without requiring any back end-level "exceptions magic", and then trigger
      unresolved symbols at link time, if these functions are not available.)
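As a minimal sketch of what "trivial" exception handling can look like, consider the function below. The name `abs_via_throw` is hypothetical and this is not the probe used by the testsuite. Depending on optimization, a compiler may remove the throw/catch pair entirely, or it may emit plain calls into the C++ runtime, which then need to be resolvable at link time.

```cpp
#include <cassert>

// Hypothetical example: the throw and its handler live in the same function,
// so no exception ever propagates across a call boundary.
int abs_via_throw(int x) {
  try {
    if (x < 0)
      throw x;        // thrown value is caught a few lines below
    return x;
  } catch (int v) {
    return -v;        // handler turns the negative input into its magnitude
  }
}
```

Whether such code builds and links on a target says little about whether the target's back end implements exception handling in general, which is the distinction the commit draws.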
      
      This change only affects GCN, as that one currently is the only target declared
      as not supporting effective-target 'exceptions'.
      
      	gcc/
      	* doc/sourcebuild.texi (Effective-Target Keywords): Clarify that
      	effective-target 'exceptions' and 'exceptions_enabled' are
      	orthogonal.
      	gcc/testsuite/
      	* lib/gcc-dg.exp (gcc-dg-prune): Clarify effective-target
      	'exceptions_enabled'.
      	* lib/target-supports.exp
      	(check_effective_target_exceptions_enabled): Don't consider
      	effective-target 'exceptions'.
      	libstdc++-v3/
      	* testsuite/lib/prune.exp (libstdc++-dg-prune): Clarify
      	effective-target 'exceptions_enabled'.
    • 'gcc.dg/pr88870.c': don't 'dg-require-effective-target nonlocal_goto' · 0e602b23
      Thomas Schwinge authored
      I confirm that back then, 'gcc.dg/pr88870.c' for nvptx failed due to
      'sorry, unimplemented: target cannot support nonlocal goto', however at some
      (indeterminate) point in time, that must've disappeared, and we now don't have
      to 'dg-require-effective-target nonlocal_goto' anymore, and therefore get:
      
          [-UNSUPPORTED:-]{+PASS:+} gcc.dg/pr88870.c {+(test for excess errors)+}
      
      (And, if ever necessary again, this nowadays probably should
      'dg-require-effective-target exceptions' instead of 'nonlocal_goto'.)
      
      	gcc/testsuite/
      	* gcc.dg/pr88870.c: Don't 'dg-require-effective-target nonlocal_goto'.
    • i386: Fix ICE with conditional QI/HI vector maxmin [PR118776] · 64d8ea05
      Jakub Jelinek authored
      The following testcase ICEs starting with GCC 12 since r12-4526
      although the bug has been introduced already in r12-2751.
      The problem was in the addition of cond_<code><mode> define_expand
      which uses nonimmediate_operand predicates for both maxmin operands
      for all VI1248_AVX512VLBW modes.  It works fine with
      VI48_AVX512VL modes because the <code><mode>3_mask VI48_AVX512VL
      define_expand uses ix86_fixup_binary_operands_no_copy and the
      *avx512f_<code><mode>3<mask_name> VI48_AVX512VL define_insn uses
      % in constraint and !(MEM_P && MEM_P) check in condition (and
      <code><mode>3 define_expand with VI124_256_AVX512F_AVX512BW iterator
      does that too), but even though the 8-bit and 16-bit element maxmin
      is commutative too, the <mask_codefor><code><mode>3<mask_name>
      define_insn with VI12_AVX512VL iterator didn't use % in constraint
      to make it commutative.  So, e.g. cond_umaxv32qi define_expand
      allowed nonimmediate_operand for both umax operands, but used
      gen_umaxv32qi_mask which wasn't commutative and only allowed
      nonimmediate_operand for the second operand.
      
      The following patch fixes it by keeping the <code><mode>3
      VI124_256_AVX512F_AVX512BW define_expand as is (it does
      ix86_fixup_binary_operands_no_copy) but extending the
      <code><mode>3_mask define_expand from VI48_AVX512VL to
      VI1248_AVX512VLBW which keeps the current modes with their
      ISA conditions and adds the VI12_AVX512VL modes under additional
      TARGET_AVX512BW condition, and turning the actual define_insn
      into an * prefixed name (which it was before just for the non-masked
      case) and having the same commutative operand handling as in other
      define_insns.
      
      2025-02-08  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/118776
      	* config/i386/sse.md (<code><mode>3_mask): Use VI1248_AVX512VLBW
      	iterator rather than VI48_AVX512VL.
      	(<mask_codefor><code><mode>3<mask_name>): Rename to ...
      	(*avx512bw_<code><mode>3<mask_name>): ... this.  Use
      	nonimmediate_operand rather than register_operand predicate and %v
      	rather than v constraint for operand 1 and adjust condition to reject
      	MEMs in both operand 1 and 2.
      
      	* gcc.target/i386/pr118776.c: New test.
    • x86: Verify that PUSH/POP can be skipped · 846837c2
      H.J. Lu authored
      
      For
      
      int f(int);
      
      int advance(int dz)
      {
          if (dz > 0)
              return (dz + dz) * dz;
          else
              return dz * f(dz);
      }
      
      Before r15-1619-g3b9b8d6cfdf593
      
      advance(int):
              push    rbx
              mov     ebx, edi
              test    edi, edi
              jle     .L2
              imul    ebx, edi
              lea     eax, [rbx+rbx]
              pop     rbx
              ret
      .L2:
              call    f(int)
              imul    eax, ebx
              pop     rbx
              ret
      
      After
      
      advance(int):
              test    edi, edi
              jle     .L2
              imul    edi, edi
              lea     eax, [rdi+rdi]
              ret
      .L2:
              sub     rsp, 24
              mov     DWORD PTR [rsp+12], edi
              call    f(int)
              imul    eax, DWORD PTR [rsp+12]
              add     rsp, 24
              ret
      
      There's no call in the if branch, so it's not optimal to push rbx at function
      entry; the push can be sunk into the else branch.  When "jle .L2" is not
      taken, this saves one push instruction.  Update pr111673.c to verify
      that this optimization isn't turned off.
      
      	PR rtl-optimization/111673
      	* gcc.target/i386/pr111673.c: Verify that PUSH/POP can be
      	skipped.
      
      Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
    • Daily bump. · 278bf572
      GCC Administrator authored
  2. Feb 07, 2025
    • aarch64: gimple fold aes[ed] [PR114522] · 7d8e8f89
      Andrew Pinski authored
      
      Instead of waiting for the combine/RTL optimizations to be fixed, this folds the
      builtins at the gimple level.  It should also give slightly faster compile times,
      since the simplification happens earlier on.
      
      Built and tested for aarch64-linux-gnu.
      
      gcc/ChangeLog:
      
      	PR target/114522
      	* config/aarch64/aarch64-builtins.cc (aarch64_fold_aes_op): New function.
      	(aarch64_general_gimple_fold_builtin): Call aarch64_fold_aes_op for crypto_aese
      	and crypto_aesd.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
    • Fortran: fix initialization of allocatable non-deferred character [PR59252] · 818c36a8
      Harald Anlauf authored
      	PR fortran/59252
      
      gcc/fortran/ChangeLog:
      
      	* trans-expr.cc (gfc_trans_subcomponent_assign): Initialize
      	allocatable non-deferred character with NULL properly.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/allocatable_char_1.f90: New test.
    • rs6000: Add cast to avoid pointer to integer comparison warning [PR117674] · c9b8a8fc
      Peter Bergner authored
      2025-02-07  Peter Bergner  <bergner@linux.ibm.com>
      
      libgcc/
      	PR target/117674
      	* config/rs6000/linux-unwind.h (ppc_backchain_fallback): Add cast to
      	avoid comparison between pointer and integer warning.
    • Add a cache of recent lines · 66af77cb
      Andi Kleen authored
      For larger files the file_cache line index will be spread out to make
      the index fit into the fixed buffer, so any access to a line other than
      the latest one will need to skip over some lines.

      Most line accesses are near the latest line, because a diagnostic is
      likely to be near where the scanner is currently lexing.
      
      Add a second cache for recent lines. It is organized as a ring buffer
      and maintains the last 256 lines relative to the last input line.
      
      With that, the cost of enabling -Wmisleading-indentation for the test case
      in PR preprocessor/118168 is within the run-to-run variation.
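The idea of a ring buffer covering the most recent 256 lines can be sketched as follows. This is an illustration only: the class `RecentLines`, its members and the direct-mapped `line % 256` indexing are assumptions made for the example, not the data layout that input.cc actually uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical cache that remembers the start offset of recently read lines.
class RecentLines {
 public:
  static constexpr std::size_t kCapacity = 256;

  // Record the start offset of a line; the newest entry overwrites the one
  // that is exactly kCapacity lines older.
  void add(std::size_t line_num, std::size_t start_offset) {
    slots_[line_num % kCapacity] = Entry{line_num, start_offset, true};
  }

  // Return the offset of the line if it is still cached.
  std::optional<std::size_t> lookup(std::size_t line_num) const {
    const Entry& e = slots_[line_num % kCapacity];
    if (e.valid && e.line_num == line_num)
      return e.start_offset;
    return std::nullopt;  // caller falls back to the slower index walk
  }

 private:
  struct Entry {
    std::size_t line_num = 0;
    std::size_t start_offset = 0;
    bool valid = false;
  };
  std::array<Entry, kCapacity> slots_{};
};
```

A lookup that hits costs a single array access, which matches the motivation in the commit message: diagnostics tend to ask for lines close to the one that was read last.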
      
      gcc/ChangeLog:
      
      	PR preprocessor/118168
      	* input.cc (file_cache::m_line_recent,
      	m_line_recent_first, m_line_recent_last): Add.
      	(file_cache_slot::evict): Clear new fields.
      	(file_cache_slot::create): Clear new fields.
      	(file_cache_slot::file_cache_slot): Initialize new fields.
      	(file_cache_slot::~file_cache_slot): Release m_line_recent.
      	(file_cache_slot::get_next_line): Maintain ring buffer of lines
      	in m_line_recent.
      	(file_cache_slot::read_line_num): Use m_line_recent to look up
      	recent lines quickly.
    • arm: Prefer POP {lo-reg} over LDR lo-reg, ... for thumb2 [PR118089] · 0b6453d5
      Richard Earnshaw authored
      For thumb2, popping a single low register off the stack should prefer
      POP over LDR to mirror the behaviour of the PUSH on entry.  This saves
      a couple of bytes in the resulting image.  This is a relatively niche
      case as it's rare to push a single low register onto the stack, but
      still worth getting right.
      
      Whilst fixing this I've also restructured the code here somewhat to
      fix a bug I observed by inspection and to improve the code slightly.
      
      Firstly, the single register case is hoisted above the main loop.
      This not only avoids creating some RTL that immediately becomes
      garbage but also avoids us needing to check for this case in every
      iteration of the main loop body.
      
      Secondly, we iterate over just the non-zero bits in the reg mask
      rather than every bit and then checking if there's work to do for that
      bit.
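Iterating over only the set bits of a mask is a standard bit trick. A minimal sketch follows, assuming a 32-bit mask; the function name `set_bit_indices` is hypothetical and the code is not taken from arm.cc.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Return the indices of the set bits in `mask`, lowest first.
std::vector<int> set_bit_indices(std::uint32_t mask) {
  std::vector<int> regs;
  while (mask != 0) {
    std::uint32_t lowest = mask & (~mask + 1);  // isolate the lowest set bit
    int idx = 0;
    while ((lowest >> idx) != 1u)               // find its position
      ++idx;
    regs.push_back(idx);
    mask &= mask - 1;                           // clear that bit and continue
  }
  return regs;
}
```

The outer loop runs once per set bit rather than once per bit position, so a mask with a single register set needs a single iteration.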
      
      Finally, when emitting a pop that also pops SP off the stack we
      shouldn't be emitting a stack-adjust CFA note.  The new SP value comes
      from the popped value, not from an adjustment of the previous SP
      value.
      
      gcc:
      	PR target/118089
      	* config/arm/arm.cc (arm_emit_multi_reg_pop): Restructure.
      	Don't emit LDR on thumb2 when POP can be used for smaller code.
      	Don't add a CFA adjust note when SP is popped off the stack.
      
      gcc/testsuite:
      	PR target/118089
      	* gcc.target/arm/thumb2-pop-loreg.c: New test.
    • arm: fix ICE due to fix for POP {PC} change · 7bee3709
      Richard Earnshaw authored
      My earlier change for making the compiler prefer
      
      	POP	{PC}
      
      over
      
      	LDR	PC, [SP], #4
      
      had a slightly unexpected consequence in that we now also call
      arm_emit_multi_reg_pop to handle single register pops when the
      register is not PC.  This exposed a latent bug in this function where
      the dwarf unwinding notes on the single-register POP were not being
      set correctly.
      
      gcc/
      	PR target/118089
      	* config/arm/arm.cc (arm_emit_multi_reg_pop): Add a CFA adjust
      	note to single-register POP instructions.
    • [rtl-optimization/116244] Don't create bogus regs in alter_subreg · 38891014
      Jeff Law authored
      > Jeff Law <jeffreyalaw@gmail.com> writes:
      >> So pulling on this thread leads me into the code that sets up
      >> ALLOCNO_WMODE in create_insn_allocnos:
      >>
      >>>            if ((a = ira_curr_regno_allocno_map[regno]) == NULL)
      >>>              {
      >>>                a = ira_create_allocno (regno, false, ira_curr_loop_tree_node);
      >>>                if (outer != NULL && GET_CODE (outer) == SUBREG)
      >>>                  {
      >>>                    machine_mode wmode = GET_MODE (outer);
      >>>                    if (partial_subreg_p (ALLOCNO_WMODE (a), wmode))
      >>>                      ALLOCNO_WMODE (a) = wmode;
      >>>                  }
      >>>              }
      >> Note how we set ALLOCNO_WMODE only at allocno creation, so it'll
      >> work as intended if and only if the first reference is via a SUBREG.
      >
      > Huh, yeah, I agree that that looks wrong.
      >
      >> ISTM the fix here is to always do the check and set ALLOCNO_WMODE.
      >>[ Snipped discussion on a non-issue. ]
      
      >
      > So ISTM that moving the code out of the "if (... == NULL)" should be
      > enough on its own.
      >
      >> And it all makes sense that you caught this.  You and another colleague
      >> at ARM were trying to address this exact problem ~11 years ago ;-)
      >
      > Heh, thought it sounded familiar :)
      
      So attached is the updated patch that adjusts IRA to avoid this problem.
      
      Georg-Johann, this may explain an issue you were running into as well where you
      got an invalid allocation.  I think yours was at the higher end of the register
      file, but the core issue is potentially the same (looking at the first use
      rather than all of them for paradoxical subregs).
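The difference between checking only the first use and checking all uses can be modelled in a few lines. This is a simplified sketch under stated assumptions: `Allocno`, `record_use` and the use of bit widths in place of machine modes are hypothetical stand-ins, not IRA's real data structures.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for an allocno: tracks only the widest access in bits.
struct Allocno {
  int wmode_bits;
};

// Record one use of pseudo register `regno`.  `natural_bits` is the register's
// own width; `outer_bits` is the width of an enclosing SUBREG, or 0 if none.
void record_use(std::map<int, Allocno>& allocnos, int regno,
                int natural_bits, int outer_bits) {
  auto it = allocnos.find(regno);
  if (it == allocnos.end())
    it = allocnos.emplace(regno, Allocno{natural_bits}).first;
  // The widening check runs for every use, not only at creation time, so a
  // wider SUBREG seen on a later use is still taken into account.
  if (outer_bits > it->second.wmode_bits)
    it->second.wmode_bits = outer_bits;
}
```

If the widening check sat inside the `it == allocnos.end()` branch instead, a register whose first use is plain and whose second use is through a wider SUBREG would keep its narrow width, which is the situation described in the quoted discussion.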
      
      I've had this in my tester about a week.  So it's been through the crosses as
      well as various native bootstraps, including but not limited to m68k, ppc,
      s390, hppa, sh4, etc.  And just for good measure I bootstrapped & regression
      tested it on x86_64 a few minutes ago.
      
      Pushing to the trunk.
      
      	PR rtl-optimization/116244
      gcc/
      	* ira-build.cc (create_insn_allocnos): Do not restrict the check for
      	subreg uses to allocno creation time.  Do it for all uses.
      
      gcc/testsuite/
      	* g++.target/m68k/m68k.exp: New test driver.
      	* g++.target/m68k/pr116244.C: New test.
    • c++: Fix up name independent decl in structured binding handling in range for [PR115586] · ca7c6d12
      Jakub Jelinek authored
      cp_parser_range_for temporarily reverts IDENTIFIER_BINDING changes
      to hide the decls from the structured bindings from lookup during
      parsing of the expression after the ':'.
      If there are 2 or more name independent decls, we undo IDENTIFIER_BINDING
      for the same name multiple times, even when just one has been added
      (with a TREE_LIST inside of it as decl).
      
      The following patch fixes it by handling the _ name at most once, the
      later loop will DTRT then and just reinstall the temporarily hidden
      binding with the TREE_LIST in there.
      
      2025-02-07  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/115586
      	* parser.cc (cp_parser_range_for): For name independent decls in
      	structured bindings, only push the name/binding once per
      	structured binding.
      
      	* g++.dg/cpp26/name-independent-decl9.C: New test.
      	* g++.dg/cpp26/name-independent-decl10.C: New test.
    • c++: Fix up handling of for/while loops with declarations in condition [PR86769] · 35d40b56
      Jakub Jelinek authored
      As the following testcases show (note, just for-{3,4,6,7,8}.C, constexpr-86769.C
      and stmtexpr27.C FAIL without the patch; for the rest, I couldn't find
      coverage for some details and so added tests to make sure we don't regress,
      and for5.C is from Marek's attempt in the PR), we weren't correctly handling
      for/while loops with declarations as conditions.
      
      The C++ FE has the simplify_loop_decl_cond function which transforms
      such loops as mentioned in the comment:
                  while (A x = 42) { }
                  for (; A x = 42;) { }
         becomes
                  while (true) { A x = 42; if (!x) break; }
                  for (;;) { A x = 42; if (!x) break; }
      For for loops this is not enough, as the x declaration should be
      still in scope when expr (if any) is executed, and injecting the
      expr expression into the body in the FE needs to have the continue
      label in between, something normally added by the c-family
      genericization.  One of my thoughts was to just add there an artificial
      label plus the expr expression in the FE and tell c-family about that
      label, so that it doesn't create it but uses what has been emitted.
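For readers unfamiliar with the construct, here is a small sketch of a for loop whose condition is a declaration, next to a hand-written equivalent. Both function names are hypothetical, and the variable has a trivial type, so none of the cleanup ordering that the patch deals with is exercised.

```cpp
#include <cassert>

// Sum a zero-terminated array; `v` is declared in the loop condition and is
// re-initialized and tested on every iteration.
int sum_with_decl_cond(const int* p) {
  int sum = 0;
  for (int i = 0; int v = p[i]; ++i)
    sum += v;
  return sum;
}

// Hand-written equivalent: declare, test, run the body, then the increment.
int sum_desugared(const int* p) {
  int sum = 0;
  for (int i = 0;;) {
    int v = p[i];
    if (!v)
      break;
    sum += v;
    ++i;  // in the compiler's lowering, the continue label sits before this
  }
  return sum;
}
```

In the hand-written form the increment is inside the scope of `v`, which is what the language rules require and what lowering the loop in the front end alone could not express.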
      
      Unfortunately break/continue are resolved to labels only at c-family
      genericization time and by moving the condition (and its preparation
      statements such as the DECL_EXPR) into the body (and perhaps by also
      moving there the (increment) expr as well) we resolve incorrectly any
      break/continue statement appearing in cond (or newly perhaps also expr)
      expression(s).  While in standard C++ one can't have something like that
      there, with statement expressions they are possible there, and we actually
      have testsuite coverage that when they appear outside of the body of the
      loop they bind to an outer loop rather than the inner one.  When the FE
      moves everything into the body, c-family can't distinguish any more between
      the user body vs. the condition/preparation statements vs. expr expression.
      
      So, the following patch instead keeps them separate and does the merging
      only at the c-family loop genericization time.  For that the patch
      introduces two new operands of FOR_STMT and WHILE_STMT, *_COND_PREP
      which is forced to be a BIND_EXPR which contains the preparation statements
      like DECL_EXPR, and the initialization of that variable, so basically what
      {FOR,WHILE}_BODY has when we encounter the function dealing with this,
      except one optional CLEANUP_STMT at the end which holds cleanups for the
      variable if it needs to be destructed.  This CLEANUP_STMT is removed and
      the actual cleanup saved into another new operand, *_COND_CLEANUP.
      
      The c-family loop genericization handles such loops roughly the way
      https://eel.is/c++draft/stmt.for and https://eel.is/c++draft/stmt.while
      specifies, so the body is (if *_COND_CLEANUP is non-NULL)
      { A x = 42; try { if (!x) break; body; cont_label: expr; } finally { cleanup; } }
      and otherwise
      { A x = 42; if (!x) break; body; cont_label: expr; }
      i.e. the *_COND, *_BODY, optional continue label, FOR_EXPR are appended
      into the body of the *_COND_PREP BIND_EXPR.
      
      And when doing constexpr evaluation of such FOR/WHILE loops, we treat
      it similarly, first evaluate *_COND_PREP except the
            for (tree decl = BIND_EXPR_VARS (t); decl; decl = DECL_CHAIN (decl))
              destroy_value_checked (ctx, decl, non_constant_p);
      part of BIND_EXPR handling for it, then evaluate *_COND (and decide based
      on whether it was false or true like before), then *_BODY, then FOR_EXPR,
      then *_COND_CLEANUP (similarly to the way how CLEANUP_STMT handling handles
      that) and finally do those destroy_value_checked.
      
      Note, the constexpr-86769.C testcase FAILs with both clang++ and MSVC (note,
      the rest of tests PASS with clang++) but I believe it must be just a bug
      in those compilers, new int is done in all the constructors and delete is
      done in the destructor, so when clang++ reports one of the new int wasn't
      deallocated during constexpr evaluation I don't see how that would be
      possible.  When the same test has all the constexpr stuff, all the new int
      are properly deleted at runtime when compiled by both compilers and valgrind
      is happy about it, no leaks.
      
      2025-02-07  Jakub Jelinek  <jakub@redhat.com>
      	    Jason Merrill  <jason@redhat.com>
      
      	PR c++/86769
      gcc/c-family/
      	* c-common.def (FOR_STMT): Add 2 operands and document them.
      	(WHILE_STMT): Likewise.
      	* c-common.h (WHILE_COND_PREP, WHILE_COND_CLEANUP): Define.
      	(FOR_COND_PREP, FOR_COND_CLEANUP): Define.
      	* c-gimplify.cc (genericize_c_loop): Add COND_PREP and COND_CLEANUP
      	arguments, handle them if they are non-NULL.
      	(genericize_for_stmt, genericize_while_stmt, genericize_do_stmt):
      	Adjust callers.
      gcc/c/
      	* c-parser.cc (c_parser_while_statement): Add 2 further NULL_TREE
      	operands to build_stmt.
      	(c_parser_for_statement): Likewise.
      gcc/cp/
      	* semantics.cc (set_one_cleanup_loc): New function.
      	(set_cleanup_locs): Use it.
      	(simplify_loop_decl_cond): Remove.
      	(adjust_loop_decl_cond): New function.
      	(begin_while_stmt): Add 2 further NULL_TREE operands to build_stmt.
      	(finish_while_stmt_cond): Call adjust_loop_decl_cond instead of
      	simplify_loop_decl_cond.
      	(finish_while_stmt): Call do_poplevel also on WHILE_COND_PREP if
      	non-NULL and also use pop_stmt_list rather than do_poplevel for
      	WHILE_BODY in that case.  Call set_one_cleanup_loc.
      	(begin_for_stmt): Add 2 further NULL_TREE operands to build_stmt.
      	(finish_for_cond): Call adjust_loop_decl_cond instead of
      	simplify_loop_decl_cond.
      	(finish_for_stmt): Call do_poplevel also on FOR_COND_PREP if non-NULL
      	and also use pop_stmt_list rather than do_poplevel for FOR_BODY in
      	that case.  Call set_one_cleanup_loc.
      	* constexpr.cc (cxx_eval_loop_expr): Handle
      	{WHILE,FOR}_COND_{PREP,CLEANUP}.
      	(check_for_return_continue): Handle {WHILE,FOR}_COND_PREP.
      	(potential_constant_expression_1): RECUR on
      	{WHILE,FOR}_COND_{PREP,CLEANUP}.
      gcc/testsuite/
      	* g++.dg/diagnostic/redeclaration-7.C: New test.
      	* g++.dg/expr/for3.C: New test.
      	* g++.dg/expr/for4.C: New test.
      	* g++.dg/expr/for5.C: New test.
      	* g++.dg/expr/for6.C: New test.
      	* g++.dg/expr/for7.C: New test.
      	* g++.dg/expr/for8.C: New test.
      	* g++.dg/ext/stmtexpr27.C: New test.
      	* g++.dg/cpp2a/constexpr-86769.C: New test.
      	* g++.dg/cpp26/name-independent-decl7.C: New test.
      	* g++.dg/cpp26/name-independent-decl8.C: New test.
    • jit/118780 - make sure to include dlfcn.h when plugin support is disabled · e2296253
      Richard Biener authored
      The following makes the inclusion of dlfcn.h explicitly requested, which avoids
      a build failure when JIT is enabled but plugin support is disabled, as
      currently the include is conditional on plugin support.
      
      	PR jit/118780
      gcc/
      	* system.h: Check INCLUDE_DLFCN_H for including dlfcn.h instead
      	of ENABLE_PLUGIN.
      	* plugin.cc: Define INCLUDE_DLFCN_H.
      
      gcc/jit/
      	* jit-playback.cc: Define INCLUDE_DLFCN_H.
      	* jit-result.cc: Likewise.
    • libstdc++: fix a dangling reference crash in ranges::is_permutation [PR118160] · 2a2bd96d
      Giuseppe D'Angelo authored
      
      The code was caching the result of `invoke(proj, *it)` in a local
      `auto &&` variable. The problem is that this may create dangling
      references, for instance in case `proj` is `std::identity` (the common
      case) and `*it` produces a prvalue: lifetime extension does not
      apply here due to the expressions involved.
      
      Instead, store (and lifetime-extend) the result of `*it` in a separate
      variable, then project that variable. While at it, also forward the
      result of the projection to the predicate, so that the predicate can
      act on the proper value category.
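The two-variable pattern described above can be sketched outside the library as follows. `Identity`, `PrvalueIter` and `projected_length` are hypothetical names made up for this example (`Identity` stands in for `std::identity` so that the code also builds as C++17); this is not the libstdc++ implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Stand-in for std::identity: returns a reference to its argument.
struct Identity {
  template <class T>
  constexpr T&& operator()(T&& t) const noexcept {
    return std::forward<T>(t);
  }
};

// Iterator-like type whose dereference yields a prvalue.
struct PrvalueIter {
  int n;
  std::string operator*() const { return std::string(n, 'x'); }
};

template <class It, class Proj>
std::size_t projected_length(It it, Proj proj) {
  // Dangling form: auto&& bad = std::invoke(proj, *it);
  // The temporary from *it dies at the end of that statement, while `bad`
  // still refers to it.
  auto&& value = *it;  // binding directly lifetime-extends the prvalue
  auto&& projected =
      std::invoke(proj, std::forward<decltype(value)>(value));
  return projected.size();
}
```

Storing the dereferenced value first keeps the temporary alive for the whole scope, so the reference returned by the projection stays valid.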
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/118160
      	PR libstdc++/100249
      	* include/bits/ranges_algo.h (__is_permutation_fn): Avoid a
      	dangling reference by storing the result of the iterator
      	dereference and the result of the projection in two distinct
      	variables, in order to lifetime-extend each one.
      	Forward the projected value to the predicate.
      	* testsuite/25_algorithms/is_permutation/constrained.cc: Add a
      	test with a range returning prvalues. Test it in a constexpr
      	context, in order to rely on the compiler to catch UB.
      
      Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
    • libstdc++: Handle exceptions in std::ostream::sentry destructor · 6e758f37
      Jonathan Wakely authored
      Because basic_ostream::sentry::~sentry is implicitly noexcept, we can't
      let any exceptions escape from it, or the program would terminate. If
      the streambuf's sync() function throws, or if it returns an error and
      setting badbit in the stream state throws, then the program would
      terminate.
      
      LWG 835 intended to prevent exceptions from being thrown by the
      std::basic_ostream::sentry destructor, but failed to cover the case
      where the streambuf's sync() member throws an exception. LWG 4188 is
      needed to fix that part. In any case, LWG 835 was never implemented for
      libstdc++ so this does that, as well as my proposed fix for 4188 (that
      badbit should be set if pubsync() exits via an exception).
      
      In order to avoid a second try-catch block to handle an exception that
      might be thrown by setting badbit, this introduces an RAII helper class
      that temporarily clears the stream's exceptions mask, then restores it
      afterwards.
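An RAII helper of this kind can be approximated with the public iostreams API. The sketch below is an assumption-laden illustration: `ExceptionsMaskGuard` and `set_badbit_quietly` are hypothetical names, and the real `_Disable_exceptions` helper can touch the stream's internals directly, whereas this version has to swallow the exception that `exceptions()` may raise when the mask is restored.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical guard: clears the exceptions mask, restores it on scope exit.
class ExceptionsMaskGuard {
 public:
  explicit ExceptionsMaskGuard(std::ios& stream)
      : stream_(stream), saved_(stream.exceptions()) {
    stream_.exceptions(std::ios_base::goodbit);  // nothing throws from now on
  }
  ~ExceptionsMaskGuard() {
    // Restoring the mask rethrows if a masked bit is already set, so the
    // exception is swallowed here to keep the destructor non-throwing.
    try {
      stream_.exceptions(saved_);
    } catch (...) {
    }
  }
  ExceptionsMaskGuard(const ExceptionsMaskGuard&) = delete;
  ExceptionsMaskGuard& operator=(const ExceptionsMaskGuard&) = delete;

 private:
  std::ios& stream_;
  std::ios_base::iostate saved_;
};

// Set badbit without letting an exception escape; returns true on success.
bool set_badbit_quietly(std::ios& stream) {
  try {
    ExceptionsMaskGuard guard(stream);
    stream.setstate(std::ios_base::badbit);
  } catch (...) {
    return false;
  }
  return true;
}
```

With the guard in place, a single try-catch around the risky call is enough, because setting badbit inside the guarded region cannot throw.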
      
      The try-catch block doesn't handle the forced_unwind exception
      explicitly, because catching and rethrowing that would just terminate
      when it reached the sentry's implicit noexcept(true) anyway.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/ostream.h (basic_ostream::_Disable_exceptions):
      	RAII helper type.
      	(basic_ostream::sentry::~sentry): Use _Disable_exceptions. Add
      	try-catch block around call to pubsync.
      	* testsuite/27_io/basic_ostream/exceptions/char/lwg4188.cc: New
      	test.
      	* testsuite/27_io/basic_ostream/exceptions/wchar_t/lwg4188.cc:
      	New test.
    • libstdc++: Add comment about use of always_inline attributes [PR111050] · 89f007c2
      Jonathan Wakely authored
      Add a comment referencing PR 111050, to ensure the fix made by
      r12-9903-g1be57348229666 doesn't get reverted.
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/111050
      	* include/bits/hashtable_policy.h (_Hash_node_value_base): Add
      	comment about always_inline attributes.
    • RISC-V: Make VXRM as global register [PR118103] · 1c8e6734
      Pan Li authored
      
      Inspired by PR118103, the VXRM register should be treated almost the
      same as the FRM register, i.e. as a cooperatively-managed global register.
      Thus, add VXRM to global_regs to avoid its elimination by the
      late-combine pass.

      For example, consider the code below:
      
      void compute ()
      {
        size_t vl = __riscv_vsetvl_e16m1 (N);
        vuint16m1_t va = __riscv_vle16_v_u16m1 (a, vl);
        vuint16m1_t vb = __riscv_vle16_v_u16m1 (b, vl);
        vuint16m1_t vc = __riscv_vaaddu_vv_u16m1 (va, vb, __RISCV_VXRM_RDN, vl);

        __riscv_vse16_v_u16m1 (c, vc, vl);
      }

      int main ()
      {
        initialize ();
        compute();

        return 0;
      }
      
      After compiling with -march=rv64gcv -O3, we get:
      
      compute:
          csrwi   vxrm,2
          lui a3,%hi(a)
          lui a4,%hi(b)
          addi    a4,a4,%lo(b)
          vsetivli    zero,4,e16,m1,ta,ma
          addi    a3,a3,%lo(a)
          vle16.v v2,0(a4)
          vle16.v v1,0(a3)
          lui a4,%hi(c)
          addi    a4,a4,%lo(c)
          vaaddu.vv   v1,v1,v2
          vse16.v v1,0(a4)
          ret
          .size   compute, .-compute
          .section    .text.startup,"ax",@progbits
          .align  1
          .globl  main
          .type   main, @function
      main:
          // csrwi   vxrm,2 deleted after inline
          addi    sp,sp,-16
          sd  ra,8(sp)
          call    initialize
          lui a3,%hi(a)
          lui a4,%hi(b)
          vsetivli    zero,4,e16,m1,ta,ma
          addi    a4,a4,%lo(b)
          addi    a3,a3,%lo(a)
          vle16.v v2,0(a4)
          vle16.v v1,0(a3)
          lui a4,%hi(c)
          addi    a4,a4,%lo(c)
          li  a0,0
          vaaddu.vv   v1,v1,v2
      
      The following test suite passed for this patch:
      * The full rv64gcv regression test.
      
      	PR target/118103
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv.cc (riscv_conditional_register_usage): Add
      	the VXRM as the global_regs.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/base/pr118103-2.c: New test.
      	* gcc.target/riscv/rvv/base/pr118103-run-2.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • [testsuite] tolerate later success [PR108357] · d790f013
      Alexandre Oliva authored
      On leon3-elf and presumably on other targets, the test fails due to
      differences in calling conventions and other reasons, that add extra
      gimple stmts that prevent the expected optimization at the expected
      point.  The optimization takes place anyway, just a little later, so
      tolerate that.
      
      
      for  gcc/testsuite/ChangeLog
      
      	PR tree-optimization/108357
      	* gcc.dg/tree-ssa/pr108357.c: Tolerate later optimization.
    • aarch64: Fix bootstrap with --enable-checking=release [PR118771] · ea4278b1
      Andrew Pinski authored
      
      With release checking we get an uninitialization warning
      inside aarch64_split_move because of jump threading for the case of `npieces==0`,
      but `npieces` is never 0 (there is no way the compiler can know that).
      So this fixes the issue by adding a `gcc_assert` to the function which asserts
      that `npieces > 0` and fixes the uninitialization warning.
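
      The idea can be sketched outside of GCC as follows.  This is only an
      illustration: 'first_piece' and its vector argument are hypothetical
      stand-ins, and the standard 'assert' is used in place of 'gcc_assert'.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the split logic: returns the first piece.
// The assert states the invariant that at least one piece exists before
// element 0 is read, so the "zero pieces" path is never silently taken.
int first_piece(const std::vector<int> &pieces)
{
  const std::size_t npieces = pieces.size();
  assert(npieces > 0);  // invariant: callers never pass an empty split
  return pieces[0];
}
```

      For example, 'first_piece({7, 8})' returns 7, while an empty vector
      trips the assertion in a build where assertions are enabled.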
      
      Bootstrapped and tested on aarch64-linux-gnu (with and without --enable-checking=release).
      
      The warning:
      
      aarch64.cc: In function 'void aarch64_split_move(rtx, rtx, machine_mode)':
      aarch64.cc:3418:31: error: '*(rtx_def**)((char*)&dst_pieces + offsetof(auto_vec<rtx_def*, 4>,auto_vec<rtx_def*, 4>::m_data[0]))' may be used uninitialized [-Werror=maybe-uninitialized]
       3418 |   if (reg_overlap_mentioned_p (dst_pieces[0], src))
            |       ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
      aarch64.cc:3408:20: note: 'dst_pieces' declared here
       3408 |   auto_vec<rtx, 4> dst_pieces, src_pieces;
            |                    ^~~~~~~~~~
      
      	PR target/118771
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.cc (aarch64_split_move): Assert that npieces is
      	greater than 0.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      ea4278b1
    • Thomas Schwinge's avatar
      Honor dump options for C/C++ '-fdump-tree-original' · 5fdcbe48
      Thomas Schwinge authored
      In addition to upcoming use of '-fdump-tree-original-lineno', this patch
      actually resolves XFAILs for 'c-c++-common/goacc/pr92793-1.c', which had
      gotten added as part of commit fa410314
      "[OpenACC] Elaborate testcases that verify column location information [PR92793]".
      
      	gcc/c-family/
      	* c-gimplify.cc (c_genericize): Pass 'local_dump_flags' to
      	'print_c_tree'.
      	* c-pretty-print.cc (c_pretty_printer::statement): Pass
      	'dump_flags' to 'dump_generic_node'.
      	(c_pretty_printer::c_pretty_printer): Initialize 'dump_flags'.
      	(print_c_tree): Add 'dump_flags_t' formal parameter.
      	(debug_c_tree): Adjust.
      	* c-pretty-print.h (c_pretty_printer): Add 'dump_flags_t
      	dump_flags'.
      	(c_pretty_printer::c_pretty_printer): Add 'dump_flags_t' formal
      	parameter.
      	(print_c_tree): Adjust.
      	gcc/testsuite/
      	* c-c++-common/goacc/pr92793-1.c: Remove
      	'-fdump-tree-original-lineno' XFAILs.
      5fdcbe48
    • Marek Polacek's avatar
      c++: ICE with unparsed noexcept [PR117106] · f5ef1f9e
      Marek Polacek authored
      
      In a member-specification of a class, a noexcept-specifier is
      a complete-class context.  Thus we delay parsing until the end of
      the class via our DEFERRED_PARSE mechanism; see cp_parser_save_noexcept
      and cp_parser_late_noexcept_specifier.
      
      We also attempt to defer instantiation of noexcept-specifiers in order
      to reduce the number of instantiations; this is done via DEFERRED_NOEXCEPT.
      
      We can even have both, as in noexcept65.C: a DEFERRED_PARSE wrapped in
      DEFERRED_NOEXCEPT, which uses the DEFPARSE_INSTANTIATIONS mechanism.
      noexcept65.C works, because when we really need the noexcept, which is
      when parsing the body of S::A::A(), the noexcept will have been parsed
      already; noexcepts are parsed before bodies of member functions.
      
      But in this test we have:
      
        struct A {
            int x;
            template<class>
            void foo() noexcept(noexcept(x)) {}
            auto bar() -> decltype(foo<int>()) {} // #1
        };
      
      and I think the decltype in #1 needs the unparsed noexcept before it
      could have been parsed.  clang++ rejects the test and I suppose we
      should reject it as well, rather than crashing on a DEFERRED_PARSE
      in tsubst_expr.
      
      	PR c++/117106
      	PR c++/118190
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (maybe_instantiate_noexcept): Give an error if the noexcept
      	hasn't been parsed yet.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp0x/noexcept89.C: New test.
      	* g++.dg/cpp0x/noexcept90.C: New test.
      
      Reviewed-by: Jason Merrill <jason@redhat.com>
      f5ef1f9e
    • Simon Martin's avatar
      c++: Properly support null pointer constants in conditional operators [PR118282] · 0b2f34ca
      Simon Martin authored
      We've been rejecting the following valid code since GCC 4:
      
      === cut here ===
      struct A {
        explicit A (int);
        operator void* () const;
      };
      void foo (const A& x) {
        auto res = 0 ? x : 0;
      }
      int main () {
        A a{5};
        foo(a);
      }
      === cut here ===
      
      The problem is that for COND_EXPR, add_builtin_candidate has an early
      return if the true and false values are not pointers, and that check does
      not take null pointer constants into account. As a result, no valid
      conversion is found and the code fails to compile.
      
      This patch fixes the condition to also pass if the true/false values are
      not pointers but null pointer constants, which resolves the PR.
      
      	PR c++/118282
      
      gcc/cp/ChangeLog:
      
      	* call.cc (add_builtin_candidate): Also check for null_ptr_cst_p
      	operands.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/conversion/op8.C: New test.
      0b2f34ca
    • Jakub Jelinek's avatar
      c++: Don't use CLEANUP_EH_ONLY for new expression cleanup [PR118763] · fcecc74c
      Jakub Jelinek authored
      The following testcase is miscompiled since r12-6325 stopped
      preevaluating the initializers for new expressions.
      If evaluating the initializers throws, there is a correct cleanup
      for that, but it is marked CLEANUP_EH_ONLY.  While in standard
      C++ that is just fine, if the initializer contains statement
      expressions, control can return or goto out of the expression and we
      should delete the pointer in that case too.
      
      There is already a sentry variable initialized to true and
      set to false after everything is initialized and used as a guard
      for the cleanup, so just removing the CLEANUP_EH_ONLY flag does
      everything we need.  In the normal case of the initializer not using
      statement expressions, at least with -O2, we get the same code: although
      the change turns
      try { sentry = true; ... sentry = false; } catch { if (sentry) delete ...; }
      into
      try { sentry = true; ... sentry = false; } finally { if (sentry) delete ...; }
      optimizations will see that sentry is false when reaching the finally
      other than through an exception.
      
      Though, wonder what other CLEANUP_EH_ONLY cleanups might be an issue
      with statement expressions.
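
      The sentry pattern described above can be sketched in plain C++ with a
      destructor playing the role of the 'finally' block.  This is a rough
      analogy, not the code GCC generates: 'SentryGuard', 'make_value' and
      'cleanups_run' are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Counts how many times the guarded cleanup actually released the object.
static int cleanups_run = 0;

// Hypothetical guard: its destructor runs on every exit from the scope
// (exception or fall-through), much like a 'finally' block.
struct SentryGuard {
  int *&ptr;
  bool &sentry;
  ~SentryGuard() {
    if (sentry) {  // still set: initialization did not complete
      delete ptr;
      ptr = nullptr;
      ++cleanups_run;
    }
  }
};

// Allocates an int and "initializes" it; 'fail' simulates a throwing
// initializer.  Returns the pointer on success.
int *make_value(bool fail) {
  int *p = new int(0);
  bool sentry = true;
  {
    SentryGuard guard{p, sentry};
    if (fail)
      throw std::runtime_error("initializer failed");
    *p = 42;
    sentry = false;  // fully initialized: the cleanup becomes a no-op
  }
  return p;
}
```

      On the successful path the guard sees 'sentry == false' and does
      nothing, which mirrors the observation that the optimized code is
      unchanged in the normal case.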
      
      2025-02-07  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/118763
      	* init.cc (build_new_1): Don't set CLEANUP_EH_ONLY.
      
      	* g++.dg/asan/pr118763.C: New test.
      fcecc74c
    • Jakub Jelinek's avatar
      c++: Use cplus_decl_attributes rather than decl_attributes in grokdecl [PR118773] · 801fb71f
      Jakub Jelinek authored
      My r15-3046 change regressed the first half of the following testcase.
      When it calls decl_attributes, it doesn't handle attributes with
      dependent arguments correctly, and so the testcase is now rejected during
      template parsing with a complaint that N is not a constant integer.
      
      I've actually followed the pointer/reference case which did that
      too and that one has been failing for a couple of years on the
      second part of the testcase.
      
      Note, there is also
                if (decl_context != PARM && decl_context != TYPENAME)
                  /* Assume that any attributes that get applied late to
                     templates will DTRT when applied to the declaration
                     as a whole.  */
                  late_attrs = splice_template_attributes (&attrs, type);
                returned_attrs = decl_attributes (&type,
                                                  attr_chainon (returned_attrs,
                                                                attrs),
                                                  attr_flags);
                returned_attrs = attr_chainon (late_attrs, returned_attrs);
      a direct call to decl_attributes in grokdeclarator, but this one handles
      the splicing manually, so maybe it is ok as is (and I don't have a testcase
      of anything misbehaving for that).
      
      2025-02-07  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/118773
      	* decl.cc (grokdeclarator): Use cplus_decl_attributes rather than
      	decl_attributes for std_attributes on pointer and array types.
      
      	* g++.dg/cpp0x/gen-attrs-87.C: New test.
      	* g++.dg/gomp/attrs-3.C: Adjust expected diagnostics.
      801fb71f
    • Jakub Jelinek's avatar
      c++: Allow constexpr reads from volatile std::nullptr_t objects [PR118661] · 6c8e6d6f
      Jakub Jelinek authored
      As mentioned in the PR, https://eel.is/c++draft/conv.lval#note-1
      says that even volatile reads from std::nullptr_t typed objects actually
      don't read anything and https://eel.is/c++draft/expr.const#10.9
      says that even those are ok in constant expressions.
      
      So, the following patch adjusts the r9-4793 changes to have an exception
      for NULLPTR_TYPE.
      As [conv.lval]/3 also talks about accessing an inactive member, I've added
      a testcase to cover that as well.
      
      2025-02-07  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/118661
      	* constexpr.cc (potential_constant_expression_1): Don't diagnose
      	lvalue-to-rvalue conversion of volatile lvalue if it has NULLPTR_TYPE.
      	* decl2.cc (decl_maybe_constant_var_p): Return true for constexpr
      	decls with NULLPTR_TYPE even if they are volatile.
      
      	* g++.dg/cpp0x/constexpr-volatile4.C: New test.
      	* g++.dg/cpp0x/constexpr-union9.C: New test.
      6c8e6d6f
    • Paul Thomas's avatar
      Fortran: Fix default init of finalizable derived args [PR116829] · 251aa524
      Paul Thomas authored
      2025-02-07  Tomáš Trnka  <trnka@scm.com>
      
      gcc/fortran
      	PR fortran/116829
      	* trans-decl.cc (init_intent_out_dt): Always call
      	gfc_init_default_dt() for BT_DERIVED to apply s->value if the
      	symbol isn't allocatable. Also simplify the logic a bit.
      
      gcc/testsuite/
      	PR fortran/116829
      	* gfortran.dg/derived_init_7.f90: New test.
      251aa524
    • Richard Biener's avatar
      tree-optimization/115538 - possible wrong-code with SLP conversion · 4931a637
      Richard Biener authored
      The following fixes a latent issue where we use ranges to verify
      correctness of a vector conversion optimization.  We rely on ranges
      from 'op0' which for SLP is extracted from the representative stmt
      which does not necessarily correspond to any actual scalar operation.
      We also do not verify that the ranges of all scalar lanes in the SLP
      operand match.  The following rectifies this, restricting the support
      to single-lane SLP nodes at this point - on branches we'd simply
      not perform this optimization with SLP.
      
      	PR tree-optimization/115538
      	* tree-vectorizer.h (vect_get_slp_scalar_def): Declare.
      	* tree-vect-slp.cc (vect_get_slp_scalar_def): New helper.
      	* tree-vect-generic.cc (expand_vector_conversion): Adjust.
      	* tree-vect-stmts.cc (vectorizable_conversion): For SLP
      	correctly look at ranges of the scalar defs of the SLP operand.
      	(supportable_indirect_convert_operation): Likewise.
      4931a637
    • Tobias Burnus's avatar
      [gcn] Fix the output amdhsa.version · 6aa3329b
      Tobias Burnus authored
      The amdhsa.version depends on the code object version: V3 had 1.0,
      V4 has 1.1, and V5 (and V6) have 1.2. GCC emitted 1.0 but has for
      a while generated either V4 or, with -march=gfx...-generic, V6. Now it
      emits the proper version again.
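
      The mapping stated above can be written down as a small lookup.  The
      function name 'amdhsa_version_for' is hypothetical and is not the code
      in gcn.cc; only the V3 to V6 values come from the text.

```cpp
#include <cassert>
#include <string>

// Maps a code object version to the amdhsa.version quoted above.
// Versions outside 3..6 are not covered by the text, so "" is returned.
std::string amdhsa_version_for(int code_object_version) {
  switch (code_object_version) {
    case 3: return "1.0";
    case 4: return "1.1";
    case 5:
    case 6: return "1.2";
    default: return "";
  }
}
```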
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Update
      	'amdhsa.version' output to match the code object version used.
      	* config/gcn/gen-gcn-device-macros.awk: Add a comment to
      	crosslink.
      6aa3329b
    • Tobias Burnus's avatar
      [GCN] Handle generic ISA names in libgomp's plugin-gcn.c · 8561e4e2
      Tobias Burnus authored
      libgomp/ChangeLog:
      
      	* plugin/plugin-gcn.c (ELFABIVERSION_AMDGPU_HSA_V6,
      	EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET,
      	GET_GENERIC_VERSION): New #define.
      	(elf_gcn_isa_is_generic): New.
      	(isa_matches_agent): Accept all generic code objects on the first
      	go; extend the diagnostic and handle runtime-failed case.
      	(create_and_finalize_hsa_program): Call it also after loading
      	the code failed, pass the status.
      8561e4e2
    • Xi Ruoyao's avatar
      LoongArch: Correct the mode for mask{eq,ne}z · bad9a730
      Xi Ruoyao authored
      For mask{eq,ne}z, rk is always compared with 0 in the full width, thus
      the mode for rk should be X.
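
      A small software model may help show why the width of the comparison
      matters.  It assumes a 64-bit register file; 'maskeqz_narrow' is a
      hypothetical helper showing what a 32-bit view of rk would compute, not
      something the hardware or GCC does.

```cpp
#include <cassert>
#include <cstdint>

// maskeqz: the result is 0 when rk == 0 (full width), otherwise rj.
std::uint64_t maskeqz(std::uint64_t rj, std::uint64_t rk) {
  return rk == 0 ? 0 : rj;
}

// masknez: the result is 0 when rk != 0 (full width), otherwise rj.
std::uint64_t masknez(std::uint64_t rj, std::uint64_t rk) {
  return rk != 0 ? 0 : rj;
}

// Hypothetical: what comparing only the low 32 bits of rk would give.
std::uint64_t maskeqz_narrow(std::uint64_t rj, std::uint64_t rk) {
  return static_cast<std::uint32_t>(rk) == 0 ? 0 : rj;
}
```

      With rk set to 1 << 32, 'maskeqz' keeps rj while the narrow variant
      yields 0, so the two views disagree whenever only the upper bits of rk
      are set.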
      
      I found the issue reviewing a patch fixing a similar issue for RISC-V
      XTheadCondMov [1], but interestingly I cannot find a test case really
      blowing up on LoongArch.  But as the issue is obvious enough let's fix
      it anyway so it won't blow up in the future.
      
      [1]: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674004.html
      
      gcc/ChangeLog:
      
      	* config/loongarch/loongarch.md
      	(*sel<code><GPR:mode>_using_<GPR2:mode>): Rename to ...
      	(*sel<code><GPR:mode>_using_<X:mode>): ... here.
      	(GPR2): Remove as nothing uses it now.
      bad9a730
    • Alexandre Oliva's avatar
      [ifcombine] avoid creating out-of-bounds BIT_FIELD_REFs [PR118514] · 075ddb52
      Alexandre Oliva authored
      If decode_field_reference finds a load that accesses past the inner
      object's size, bail out.
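
      As a rough sketch of the kind of test involved (the helper name and
      its bit-based interface are assumptions for illustration, not the
      interface of access_in_bounds_of_type_p):

```cpp
#include <cassert>
#include <cstdint>

// Returns true when the bit range [bitpos, bitpos + bitsize) lies entirely
// within an object of object_bits bits.  The comparison is arranged so
// that bitpos + bitsize is never computed and cannot overflow.
bool access_in_bounds(std::uint64_t bitpos, std::uint64_t bitsize,
                      std::uint64_t object_bits) {
  return bitsize <= object_bits && bitpos <= object_bits - bitsize;
}
```

      A caller following the description above would give up on merging as
      soon as this returns false for a candidate load.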
      
      Drop the too-strict assert.
      
      
      for  gcc/ChangeLog
      
      	PR tree-optimization/118514
      	PR tree-optimization/118706
      	* gimple-fold.cc (decode_field_reference): Refuse to consider
      	merging out-of-bounds BIT_FIELD_REFs.
      	(make_bit_field_load): Drop too-strict assert.
      	* tree-eh.cc (bit_field_ref_in_bounds_p): Rename to...
      	(access_in_bounds_of_type_p): ... this.  Change interface,
      	export.
      	(tree_could_trap_p): Adjust.
      	* tree-eh.h (access_in_bounds_of_type_p): Declare.
      
      for  gcc/testsuite/ChangeLog
      
      	PR tree-optimization/118514
      	PR tree-optimization/118706
      	* gcc.dg/field-merge-25.c: New.
      075ddb52
    • Tobias Burnus's avatar
      [gcn] Add gfx9-generic and generic-associated gfx* · b5a29a93
      Tobias Burnus authored
      This patch adds gfx9-generic, completing the gfx*-generic support.
      It also adds all gfx* devices that are part of any of the gfx*-generic,
      i.e. gfx902, gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034,
      gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn-devices.def (GCN_DEVICE): Add gfx9-generic,
      	gfx902, gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034,
      	gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153.
      	Add a currently unused column linking a specific ISA to a generic
      	one (if it exists).
      	* config/gcn/gcn-tables.opt: Regenerate.
      	* doc/invoke.texi (AMD GCN): Add the new gfx... and the older
      	gfx{10-3,11}-generic to -march= as 'experimental'.
      b5a29a93
    • Tobias Burnus's avatar
      [gcn] Fix gfx906's sramecc setting · fa554462
      Tobias Burnus authored
      When compiling with -g, mkoffload.cc creates a device object file itself;
      however, in order that the linker does not complain, the ELF flags must
      match what the compiler / linker does. For gfx906, the assembler defaults
      to sramecc = any, but gcn-devices.def contained unsupported, which is not
      the same - causing link errors. That's a regression caused by commit
      r15-4540-ga6b26e5ea09779 - which can be best seen by looking at the
      changes to mkoffload.cc.
      
      Additionally, this commit adds '...' to the GCN_DEVICE #define in gcn.cc
      to make it agnostic to the addition of fields.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn-devices.def (GCN_DEVICE): Change sramecc for
      	gfx906 to 'any'.
      	* config/gcn/gcn.cc (GCN_DEVICE): Add trailing ... to #define.
      fa554462
    • Alexandre Oliva's avatar
      [testsuite] [sparc] select ultrasparc for fsmuld test · 7722b65f
      Alexandre Oliva authored
      vis3move-3.c expects fsmuld, that is not available on all variants of
      sparc.  Select a cpu that supports it for the test.
      
      Now, -mfix-ut699 irrevocably disables fsmuld, so skip the test if the
      test configuration uses that option.
      
      
      for  gcc/testsuite/ChangeLog
      
      	* gcc.target/sparc/vis3move-3.c: Select ultrasparc.  Skip with
      	-mfix-ut699.
      7722b65f
    • Alexandre Oliva's avatar
      [testsuite] [sparc] skip tls tests if emulated · d1061212
      Alexandre Oliva authored
      A number of tls tests expect TLS-specific relocations, that are not
      present when tls is emulated, as on e.g. leon3-elf.  Skip the tests
      when tls is emulated.
      
      
      for  gcc/testsuite/ChangeLog
      
      	* gcc.target/sparc/tls-ld-int16.c: Skip when tls is emulated.
      	* gcc.target/sparc/tls-ld-int32.c: Likewise.
      	* gcc.target/sparc/tls-ld-int8.c: Likewise.
      	* gcc.target/sparc/tls-ld-uint16.c: Likewise.
      	* gcc.target/sparc/tls-ld-uint32.c: Likewise.
      	* gcc.target/sparc/tls-ld-uint8.c: Likewise.
      d1061212
    • Alexandre Oliva's avatar
      [testsuite] [sparc] skip sparc-ret-1 with -mfix-ut699 · 9a551d63
      Alexandre Oliva authored
      Option -mfix-ut699 changes the set of instructions that can be placed
      in the delay slot, preventing the expected insn placement.  Skip the
      test if the option is present.
      
      
      for  gcc/testsuite/ChangeLog
      
      	* gcc.target/sparc/sparc-ret-1.c: Skip on -mfix-ut699.
      9a551d63
    • Alexandre Oliva's avatar
      [testsuite] [sparc] use -mtune in alignment tuning test · 670f83c0
      Alexandre Oliva authored
      If -mcpu=leon3 is present in the command line for a test run,
      overriding it with -mcpu=niagara7 is not enough to override the tuning
      for leon3 selected by the previous -mcpu option.
      
      niagara7-align.c tests for niagara7 alignment tuning, so use -mtune
      rather than -mcpu.
      
      
      for  gcc/testsuite/ChangeLog
      
      	* gcc.target/sparc/niagara7-align.c: Use -mtune.
      670f83c0
    • H.J. Lu's avatar
      ira: Add a target hook for callee-saved register cost scale · d3ff498c
      H.J. Lu authored
      
      commit 3b9b8d6c
      Author: Surya Kumari Jangala <jskumari@linux.ibm.com>
      Date:   Tue Jun 25 08:37:49 2024 -0500
      
          ira: Scale save/restore costs of callee save registers with block frequency
      
      scales the cost of saving/restoring a callee-save hard register in epilogue
      and prologue with the entry block frequency, which, if not optimizing for
      size, is 10000, for all targets.  As a result, callee-saved registers
      may not be used to preserve local variable values across calls on some
      targets, like x86.  Add a target hook for the callee-saved register cost
      scale in epilogue and prologue used by IRA.  The default version of this
      target hook returns 1 if optimizing for size, otherwise returns the entry
      block frequency.  Add an x86 version of this target hook to restore the
      old behavior prior to the above commit.
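
      The default behavior described above can be modeled as follows.  All
      names here are hypothetical, and the constant-1 override is only an
      assumption about what "the old behavior" amounts to; neither function
      is the actual hook implementation.

```cpp
#include <cassert>

// Model of the default hook as described: 1 when optimizing for size,
// otherwise the entry block frequency.
int default_cost_scale(bool optimize_for_size, int entry_block_frequency) {
  return optimize_for_size ? 1 : entry_block_frequency;
}

// Assumed model of a target override that never scales the cost.
int unscaled_cost_scale(bool /*optimize_for_size*/,
                        int /*entry_block_frequency*/) {
  return 1;
}

// Save/restore cost of one callee-saved register after scaling.
int callee_save_cost(int base_cost, int scale) {
  return base_cost * scale;
}
```

      With an entry block frequency of 10000, the default model multiplies
      the save/restore cost by 10000, which illustrates why callee-saved
      registers can end up looking too expensive to use across calls.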
      
      	PR rtl-optimization/111673
      	PR rtl-optimization/115932
      	PR rtl-optimization/116028
      	PR rtl-optimization/117081
      	PR rtl-optimization/117082
      	PR rtl-optimization/118497
      	* ira-color.cc (assign_hard_reg): Call the target hook for the
      	callee-saved register cost scale in epilogue and prologue.
      	* target.def (ira_callee_saved_register_cost_scale): New target
      	hook.
      	* targhooks.cc (default_ira_callee_saved_register_cost_scale):
      	New.
      	* targhooks.h (default_ira_callee_saved_register_cost_scale):
      	Likewise.
      	* config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale):
      	New.
      	(TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Likewise.
      	* doc/tm.texi: Regenerated.
      	* doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE):
      	New.
      
      Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
      d3ff498c