  1. Feb 05, 2025
    • rdubner's avatar
      f63d7bf0
    • Jeff Law's avatar
      [committed] Disable ABS instruction on bfin port · 3e08a4ec
      Jeff Law authored
      I was looking at a regression on the bfin port with a recent change to the IRA
      and stumbled across this just doing a general port healthiness evaluation.
      
      The ABS instruction in the blackfin ISA is defined as saturating on INT_MIN,
      which is a bit unexpected.  We certainly can't use it when -fwrapv is enabled.
      Given the failures on the C23 uabs tests, I'm inclined to just disable the
      pattern completely.
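To illustrate the mismatch, here is a minimal sketch in portable C. The names toy_abs_saturating and toy_uabs are hypothetical models, not bfin or GCC code: the first clamps INT_MIN the way the commit describes the hardware ABS, the second returns the unsigned magnitude that a uabs-style operation is expected to produce.

```c
#include <assert.h> /* only needed by callers that assert on the results */
#include <limits.h>

/* Hypothetical model of a saturating ABS: INT_MIN is clamped to INT_MAX. */
int toy_abs_saturating(int x)
{
    if (x == INT_MIN)
        return INT_MAX;
    return x < 0 ? -x : x;
}

/* Hypothetical model of an unsigned absolute value: the negation is done
   in unsigned arithmetic, so INT_MIN maps to INT_MAX + 1 without overflow. */
unsigned int toy_uabs(int x)
{
    unsigned int u = (unsigned int)x;
    return x < 0 ? 0u - u : u;
}
```

The two models agree everywhere except at INT_MIN, where the saturating result is one less than the unsigned magnitude.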
      
      Fixes pr23047, uabs-2 and uabs-3.
      
      While it's not a regression, it's the blackfin port, so I think we've got a
      higher degree of freedom here.
      
      Pushing to the trunk.
      
      gcc/
      	* config/bfin/bfin.md (abssi): Disable pattern.
      3e08a4ec
    • Simon Martin's avatar
      c++: Reject default arguments for template class friend functions [PR118319] · 198f4df0
      Simon Martin authored
      We segfault upon the following invalid code:
      
      === cut here ===
      template <int> struct S {
        friend void foo (int a = []{}());
      };
      void foo (int a) {}
      int main () {
        S<0> t;
        foo ();
      }
      === cut here ===
      
      The problem is that we end up with a LAMBDA_EXPR callee in
      set_flags_from_callee, and dereference its NULL_TREE
      TREE_TYPE (TREE_TYPE (..)).
      
      This patch sets the default argument to error_mark_node and gives a hard
      error for template class friend functions that do not meet the
      requirement in C++17 11.3.6/4 (the change is restricted to templates per
      discussion with Jason).
      
      	PR c++/118319
      
      gcc/cp/ChangeLog:
      
      	* decl.cc (grokfndecl): Inspect all friend function parameters.
      	If it's not valid for them to have a default value and we're
      	processing a template, set the default value to error_mark_node
      	and give a hard error.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/parse/defarg18.C: New test.
      	* g++.dg/parse/defarg18a.C: New test.
      198f4df0
    • Vladimir N. Makarov's avatar
      [PR115568][LRA]: Use more strict output reload check in rematerialization · 98545441
      Vladimir N. Makarov authored
        In this PR case LRA rematerialized a value from an inheritance insn
      instead of an output reload one.  This resulted in considering a
      rematerialization candidate value available when it was actually
      not.  As a consequence, an insn after the rematerialization used the
      unexpected value and this use resulted in an fp exception.  The patch
      fixes this bug.
      
      gcc/ChangeLog:
      
      	PR rtl-optimization/115568
      	* lra-remat.cc (create_cands): Check that output reload insn is
      	adjacent to given insn.  Update a comment.
      
      gcc/testsuite/ChangeLog:
      
      	PR rtl-optimization/115568
      	* gcc.target/i386/pr115568.c: New.
      98545441
    • Ian Lance Taylor's avatar
      go: update builtin function attributes · 0006c07b
      Ian Lance Taylor authored
      	PR go/118746
      	* go-gcc.cc (class Gcc_backend): Define builtin_cold,
      	builtin_leaf, builtin_nonnull.  Alphabetize constants.
      	(Gcc_backend::Gcc_backend): Update attributes for builtin
      	functions to match builtins.def.
      	(Gcc_backend::define_builtin): Split out attribute setting into
      	set_attributes.
      	(Gcc_backend::set_attributes): New method split out of
      	define_builtin.  Support new flag values.
      0006c07b
    • Richard Sandiford's avatar
      aarch64: Fix sve/acle/general/ldff1_8.c failures · 50a31b67
      Richard Sandiford authored
      gcc.target/aarch64/sve/acle/general/ldff1_8.c and
      gcc.target/aarch64/sve/ptest_1.c were failing because the
      aarch64 port was giving a zero (unknown) cost to instructions
      that compute two results in parallel.  This was latent until
      r15-1575-gea8061f46a30, which fixed rtl-ssa to treat zero costs
      as unknown.
      
      A long-standing todo here is to make insn_cost derive costs from md
      information, rather than having to write a lot of matching code in
      aarch64_rtx_costs.  But that's not something we can do for GCC 15.
      
      This patch instead treats the cost of a PARALLEL as being the maximum
      cost of its constituent sets.  I don't like this very much, since it
      isn't really target-specific behaviour.  If it were stage 1, I'd be
      trying to change pattern_cost instead.
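As a rough sketch of the rule described above (not the actual aarch64_insn_cost code), the cost of a PARALLEL can be modelled as the maximum over the costs of its SETs. The function name toy_parallel_cost and the plain int array are assumptions made for illustration; a result of 0 still stands for "unknown".

```c
#include <assert.h> /* only needed by callers that assert on the results */
#include <stddef.h>

/* Hypothetical model: return the cost of the costliest SET, or 0 (unknown)
   when there are no SETs or every SET has unknown cost. */
int toy_parallel_cost(const int *set_costs, size_t n)
{
    int max = 0;
    for (size_t i = 0; i < n; i++)
        if (set_costs[i] > max)
            max = set_costs[i];
    return max;
}
```

With this rule, an instruction that computes two results in parallel no longer gets a zero cost as long as at least one of its SETs has a known cost.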
      
      gcc/
      	* config/aarch64/aarch64.cc (aarch64_insn_cost): Give PARALLELs
      	the same cost as the costliest SET.
      50a31b67
    • Tobias Burnus's avatar
      Fortran/OpenMP: Add location data to 'sorry' [PR118740] · 6f95af4f
      Tobias Burnus authored
      	PR fortran/118740
      
      gcc/fortran/ChangeLog:
      
      	* openmp.cc (gfc_match_omp_context_selector, match_omp_metadirective):
      	Change sorry to sorry_at and use gfc_current_locus as location.
      	* trans-openmp.cc (gfc_trans_omp_clauses): Likewise, but use n->where.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/gomp/append_args-2.f90: Update for line change.
      6f95af4f
    • Jakub Jelinek's avatar
      cselib: Fix up previous patch for SPARC [PR117239] · 6094801d
      Jakub Jelinek authored
      Sorry, our CI bot just notified me I broke the SPARC build.  There are two
       #ifdef STACK_ADDRESS_OFFSET
      guarded snippets and the macro is only defined on the SPARC target, so I
      didn't notice there was a syntax error.
      
      Fixed thusly.
      
      2025-02-05  Jakub Jelinek  <jakub@redhat.com>
      
      	PR rtl-optimization/117239
      	* cselib.cc (cselib_init): Remove spurious closing paren in
      	the #ifdef STACK_ADDRESS_OFFSET specific code.
      6094801d
    • James K. Lowden's avatar
      publish generated cobol documentation · 50fb0d56
      James K. Lowden authored
      50fb0d56
    • James K. Lowden's avatar
      c14a30f3
    • Jakub Jelinek's avatar
      cselib: For CALL_INSNs to const/pure fns invalidate memory below sp [PR117239] · 886ce970
      Jakub Jelinek authored
      The following testcase is miscompiled on x86_64 during postreload.
      After reload (with IPA-RA figuring out the calls don't modify any
      registers but %rax for return value) postreload sees
      (insn 14 12 15 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp)
                      (const_int 16 [0x10])) [0  S8 A64])
              (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":18:7 95 {*movdi_internal}
           (nil))
      (call_insn/i 15 14 16 2 (set (reg:SI 0 ax)
              (call (mem:QI (symbol_ref:DI ("baz") [flags 0x3]  <function_decl 0x7ffb2e2bdf00 baz>) [0 baz S1 A8])
                  (const_int 24 [0x18]))) "pr117239.c":18:7 1476 {*call_value}
           (expr_list:REG_CALL_DECL (symbol_ref:DI ("baz") [flags 0x3]  <function_decl 0x7ffb2e2bdf00 baz>)
              (expr_list:REG_EH_REGION (const_int 0 [0])
                  (nil)))
          (nil))
      (insn 16 15 18 2 (parallel [
                  (set (reg/f:DI 7 sp)
                      (plus:DI (reg/f:DI 7 sp)
                          (const_int 24 [0x18])))
                  (clobber (reg:CC 17 flags))
              ]) "pr117239.c":18:7 285 {*adddi_1}
           (expr_list:REG_ARGS_SIZE (const_int 0 [0])
              (nil)))
      ...
      (call_insn/i 19 18 21 2 (set (reg:SI 0 ax)
              (call (mem:QI (symbol_ref:DI ("foo") [flags 0x3]  <function_decl 0x7ffb2e2bdb00 foo>) [0 foo S1 A8])
                  (const_int 0 [0]))) "pr117239.c":19:3 1476 {*call_value}
           (expr_list:REG_CALL_DECL (symbol_ref:DI ("foo") [flags 0x3]  <function_decl 0x7ffb2e2bdb00 foo>)
              (expr_list:REG_EH_REGION (const_int 0 [0])
                  (nil)))
          (nil))
      (insn 21 19 26 2 (parallel [
                  (set (reg/f:DI 7 sp)
                      (plus:DI (reg/f:DI 7 sp)
                          (const_int -24 [0xffffffffffffffe8])))
                  (clobber (reg:CC 17 flags))
              ]) "pr117239.c":19:3 discrim 1 285 {*adddi_1}
           (expr_list:REG_ARGS_SIZE (const_int 24 [0x18])
              (nil)))
      (insn 26 21 24 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp)
                      (const_int 16 [0x10])) [0  S8 A64])
              (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":19:3 discrim 1 95 {*movdi_internal}
           (nil))
      i.e.
              movq    %rdx, 16(%rsp)
              call    baz
              addq    $24, %rsp
      ...
              call    foo
              subq    $24, %rsp
              movq    %rdx, 16(%rsp)
      Now, postreload uses cselib and cselib remembered that %rdx value has been
      stored into 16(%rsp).  Both baz and foo are pure calls.  If they weren't,
      when processing those CALL_INSNs cselib would invalidate all MEMs
            if (RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
                || !(RTL_CONST_OR_PURE_CALL_P (insn)))
              cselib_invalidate_mem (callmem);
      where callmem is (mem:BLK (scratch)).  But they are pure, so instead the
      code just invalidates the argument slots from CALL_INSN_FUNCTION_USAGE.
      The calls actually clobber more than that: even const/pure calls clobber
      all memory below the stack pointer, and that is something that hasn't been
      invalidated.  In this failing testcase, the call to baz is not a big deal,
      since we don't have anything remembered in memory below %rsp at that call.
      But then we increment %rsp by 24, so %rsp+16 is now 8 bytes below the
      stack pointer, and do the call to foo.  That call now actually, not just
      in theory, clobbers the memory below the stack pointer (in particular, it
      overwrites it with the return address).  But cselib does not invalidate.
      Then %rsp is decremented again (in preparation for another call, to bar)
      and cselib is processing the store of %rdx (which IPA-RA says has not been
      modified by either the baz or foo calls) to %rsp + 16.  It sees the memory
      already has that value, so it concludes the store is useless and removes it.
      But it is not useless: the call to foo has changed the memory, so the value
      needs to be stored again.
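The sequence above can be replayed with a toy model. This is only a sketch under stated assumptions: struct toy_cselib and the toy_* functions are hypothetical and track a single remembered stack slot by absolute address, with the stack growing towards lower addresses as on x86_64. It is not how cselib represents values.

```c
#include <assert.h> /* only needed by callers that assert on the results */
#include <stdbool.h>

struct toy_cselib {
    long sp;         /* current stack pointer (toy absolute address) */
    long slot_addr;  /* absolute address of the remembered slot */
    bool slot_known; /* true while the remembered value is trusted */
};

/* Remember that a register value was stored at sp + offset. */
void toy_store(struct toy_cselib *s, long offset)
{
    s->slot_addr = s->sp + offset;
    s->slot_known = true;
}

/* Model a const/pure call; optionally forget memory below the stack pointer. */
void toy_pure_call(struct toy_cselib *s, bool invalidate_below_sp)
{
    if (invalidate_below_sp && s->slot_known && s->slot_addr < s->sp)
        s->slot_known = false;
}

/* Would a new store of the same value to sp + offset be deleted as redundant? */
bool toy_store_is_redundant(const struct toy_cselib *s, long offset)
{
    return s->slot_known && s->slot_addr == s->sp + offset;
}

/* Replay: store, call baz, sp += 24, call foo, sp -= 24, store again. */
bool toy_replay(bool invalidate_below_sp)
{
    struct toy_cselib s = { 1000, 0, false };
    toy_store(&s, 16);                      /* movq %rdx, 16(%rsp) */
    toy_pure_call(&s, invalidate_below_sp); /* call baz */
    s.sp += 24;                             /* addq $24, %rsp */
    toy_pure_call(&s, invalidate_below_sp); /* call foo */
    s.sp -= 24;                             /* subq $24, %rsp */
    return toy_store_is_redundant(&s, 16);  /* second movq */
}
```

Without the invalidation, toy_replay reports the second store as redundant, which mirrors the miscompilation; with it, the store is kept.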
      
      The following patch adds targeted invalidation of memory below stack
      pointer (or on SPARC memory below stack pointer + 2047 when stack bias is
      used, or on PA memory above stack pointer instead).
      It does so only in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions,
      because in other functions the stack pointer should be constant from
      the end of prologue till start of epilogue and so nothing should be stored
      within the function below the stack pointer.
      
      Now, memory below the stack pointer is special.  Except for functions using
      alloca/VLAs, I believe no addressable memory should be there; it should be
      purely the outgoing function argument area.  If we take the address of some
      automatic variable, it should live all the time above the outgoing function
      argument area.  So on top of just trying to flush memory below the stack
      pointer (represented by %rsp - PTRDIFF_MAX with PTRDIFF_MAX size on most
      arches), the patch tries to optimize and only invalidate memory whose
      address is clearly derived from the stack pointer (memory with other bases
      is not invalidated).  If we can prove (we see the same SP_DERIVED_VALUE_P
      bases in both VALUEs) that the memory is above the current stack pointer,
      the patch also doesn't call canon_anti_dependence, which might just give up
      in certain cases.
      
      I've gathered statistics from x86_64-linux and i686-linux
      bootstraps/regtests.  During -m64 compilations from those, there were
      3718396 + 42634 + 27761 cases of processing MEMs in cselib_invalidate_mem
      (callmem[1]) calls.  The first number is the number of MEMs not invalidated
      because of the optimization, i.e.
      +             if (sp_derived_base == NULL_RTX)
      +               {
      +                 has_mem = true;
      +                 num_mems++;
      +                 p = &(*p)->next;
      +                 continue;
      +               }
      in the patch, the second number is the number of MEMs not invalidated because
      canon_anti_dependence returned false, and the last number is the number
      of MEMs actually invalidated (so that is what hasn't been invalidated
      before).  During -m32 compilations the numbers were
      1422412 + 39354 + 16509 with the same meaning.
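To put those counts in proportion, a small helper can compute what share of the processed MEMs was actually invalidated. The helper names toy_total3 and toy_share_percent are hypothetical; the inputs are the numbers quoted above.

```c
#include <assert.h> /* only needed by callers that assert on the results */

/* Sum of the three reported categories. */
unsigned long toy_total3(unsigned long a, unsigned long b, unsigned long c)
{
    return a + b + c;
}

/* Share of `part` in `total`, as a percentage. */
double toy_share_percent(unsigned long part, unsigned long total)
{
    return 100.0 * (double)part / (double)total;
}
```

For -m64 the total is 3788791 MEMs and for -m32 it is 1478275, so the newly invalidated MEMs are roughly 0.7% and 1.1% of the processed ones, respectively.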
      
      Note, when there is no red zone, in theory even the sp = sp + incr
      instruction invalidates memory below the new stack pointer, as a signal
      can come and overwrite the memory.  So maybe we should be invalidating
      something at those instructions as well.  But in leaf functions we certainly
      can have even addressable automatic vars in the red zone (which would make
      it harder to distinguish); on the other hand, they normally aren't storing
      anything below the red zone, and in non-leaf functions it should normally
      be just the outgoing arguments area.
      
      2025-02-05  Jakub Jelinek  <jakub@redhat.com>
      
      	PR rtl-optimization/117239
      	* cselib.cc: Include predict.h.
      	(callmem): Change type from rtx to rtx[2].
      	(cselib_preserve_only_values): Use callmem[0] rather than callmem.
      	(cselib_invalidate_mem): Optimize and don't try to invalidate
      	for the mem_rtx == callmem[1] case MEMs which clearly can't be
      	below the stack pointer.
      	(cselib_process_insn): Use callmem[0] rather than callmem.
      	For const/pure calls also call cselib_invalidate_mem (callmem[1])
      	in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions.
      	(cselib_init): Initialize callmem[0] rather than callmem and also
      	initialize callmem[1].
      
      	* gcc.dg/pr117239.c: New test.
      886ce970
    • Richard Earnshaw's avatar
      arm: Use POP {pc} to return when returning [PR118089] · 5163cf2a
      Richard Earnshaw authored
      When generating thumb2 code,
      	LDM SP!, {PC}
      is a two-byte instruction, whereas
      	LDR PC, [SP], #4
      needs 4 bytes.  When optimizing for size, or when there's no obvious
      performance benefit, prefer the former.
      
      gcc/ChangeLog:
      
      	PR target/118089
      	* config/arm/arm.cc (thumb2_expand_return): Use LDM SP!, {PC}
      	when optimizing for size, or when there's no performance benefit over
      	LDR PC, [SP], #4.
      	(arm_expand_epilogue): Likewise.
      5163cf2a
    • Richard Earnshaw's avatar
      arm: remove constraints from *pop_multiple_with_writeback_and_return · b47c7a5a
      Richard Earnshaw authored
      This pattern is intended to be used only by the epilogue generation
      code and will always use fixed hard registers.  As such, it does not
      need any register constraints, which might be misleading if a
      post-reload pass wanted to try renumbering various registers.  So
      remove the constraints.
      
      Furthermore, to permit this pattern to match when popping just the PC
      (which is not a valid register_operand), remove the match on the first
      transfer register: pop_multiple_return will validate everything it
      needs to.
      
      gcc/ChangeLog:
      
      	* config/arm/arm.md (*pop_multiple_with_writeback_and_return): Remove
      	constraints.  Don't validate the first transfer register here.
      b47c7a5a
    • Richard Earnshaw's avatar
      arm: cleanup code in ldm_stm_operation_p; relax limits on ldm/stm · aead1d44
      Richard Earnshaw authored
      I needed to make some adjustments to this function to permit a push or
      pop of a single register in thumb2 code, since ldm/stm can be a
      two-byte instruction instead of 4.  Trying to read the code as it was
      made me scratch my head as the logic was not very clear.  So this
      patch cleans up the code somewhat, fixes a couple of minor bugs and
      removes the limit of having to use multiple registers when using this
      form of the instruction (the shape of this pattern is such that I
      can't see it being generated automatically by the compiler, so there
      should be no adverse effects of this).
      
      Buglets fixed:
        - Validate that the first element contains RETURN if we're matching
          a return instruction.
        - Don't allow the base address register to be stored if saving regs
          and the address is being updated (this is unpredictable in the
          architecture).
        - Verify that the last register loaded in a RETURN insn is the PC.
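
Two of the checks above can be sketched on a simplified description of an LDM/STM operation. The struct toy_ldm_stm and toy_ldm_stm_ok below are hypothetical and do not mirror the RTL-based ldm_stm_operation_p; registers are a 16-bit mask with bit 15 standing for the PC, and the RETURN-in-first-element check is not modelled because it concerns the shape of the RTL PARALLEL.

```c
#include <assert.h> /* only needed by callers that assert on the results */
#include <stdbool.h>
#include <stdint.h>

#define TOY_PC_REGNUM 15

struct toy_ldm_stm {
    uint16_t reg_mask; /* bit n set: register n is transferred */
    unsigned base_reg; /* base address register number, 0..15 */
    bool is_load;      /* true for LDM, false for STM */
    bool writeback;    /* base register is updated */
    bool is_return;    /* the operation also returns */
};

bool toy_ldm_stm_ok(const struct toy_ldm_stm *op)
{
    if (op->reg_mask == 0 || op->base_reg > 15)
        return false;
    /* A single register is fine: no minimum register count is imposed. */
    bool base_in_list = ((op->reg_mask >> op->base_reg) & 1u) != 0;
    /* Storing the base register while updating it is unpredictable. */
    if (!op->is_load && op->writeback && base_in_list)
        return false;
    /* A return must be a load whose highest (last) register is the PC. */
    if (op->is_return
        && (!op->is_load || (op->reg_mask & (1u << TOY_PC_REGNUM)) == 0))
        return false;
    return true;
}
```

In this model a plain pop of just the PC (one register, write-back on SP, returning) is accepted, which is the case the patch wants to permit.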
      
      gcc/
      	* config/arm/arm.cc (decompose_addr_for_ldm_stm): New function.
      	(ldm_stm_operation_p): Rework to clarify logic.  Allow single
      	registers to be pushed or popped using LDM/STM.
      aead1d44
    • Xi Ruoyao's avatar
      vect: Fix wrong code with pr108692.c on targets with only non-widening ABD [PR118727] · da88e702
      Xi Ruoyao authored
      With things like
      
        // signed char a_14, a_16;
        a.0_4 = (unsigned char) a_14;
        _5 = (int) a.0_4;
        b.1_6 = (unsigned char) b_16;
        _7 = (int) b.1_6;
        c_17 = _5 - _7;
        _8 = ABS_EXPR <c_17>;
        r_18 = _8 + r_23;
      
      An ABD pattern will be recognized for _8:
      
        patt_31 = .ABD (a.0_4, b.1_6);
      
      It's still correct.  But then when the SAD pattern is recognized:
      
        patt_29 = SAD_EXPR <a_14, b_16, r_23>;
      
      This is not correct.  This only happens for targets with both uabd and
      sabd but not vec_widen_{s,u}abd, currently LoongArch is the only target
      affected.
      
      The problem is vect_look_through_possible_promotion will throw away a
      series of conversions if the effect is equivalent to a sign change and a
      promotion, but here the sign change is definitely relevant, and the
      promotion is also relevant for "mixed sign" cases like
      r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b)
      (we need to promote to HImode as the difference can exceed the range of
      QImode).
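
A few scalar helpers make both points concrete. The function names are hypothetical and the code only models the arithmetic on single elements, not the vectorizer patterns themselves.

```c
#include <assert.h> /* only needed by callers that assert on the results */
#include <stdlib.h>

/* Difference taken after viewing both inputs as unsigned char,
   as in the GIMPLE sequence above. */
int toy_abd_unsigned_view(signed char a, signed char b)
{
    return abs((int)(unsigned char)a - (int)(unsigned char)b);
}

/* Difference taken on the original signed char values, which is what
   results from dropping the sign-changing conversions. */
int toy_abd_signed_view(signed char a, signed char b)
{
    return abs((int)a - (int)b);
}

/* "Mixed sign" case: one input viewed as unsigned, the other as signed. */
int toy_abd_mixed(signed char a, signed char b)
{
    return abs((int)(unsigned char)a - (int)b);
}
```

For a = -1 and b = 1 the unsigned view gives 254 while the signed view gives 2, so the sign change cannot be discarded. For a = -1 and b = -128 the mixed case gives 383, which does not fit in 8 bits.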
      
      If there were any redundant promotion, it should have been stripped in
      vect_recog_abd_pattern (i.e. when patt_31 = .ABD (a.0_4, b.1_6) is
      recognized) instead of in vect_recog_sad_pattern, or we'd have a
      missed-optimization if the ABD output is not summed.  So anyway
      vect_recog_sad_pattern is just not a proper location to call
      vect_look_through_possible_promotion for the ABD inputs; remove the
      calls to fix the issue.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/118727
      	* tree-vect-patterns.cc (vect_recog_sad_pattern): Don't call
      	vect_look_through_possible_promotion on ABD inputs.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/118727
      	* gcc.dg/pr108692.c: Mention PR 118727 in the comment.
      	* gcc.dg/pr118727.c: New test case.
      da88e702
    • Richard Sandiford's avatar
      testsuite: Revert to the original version of pr100056.c · 754137d9
      Richard Sandiford authored
      r15-268-g9dbff9c05520 restored the original GCC 11 output for
      pr100056.c, so this patch reverts the changes made to the test
      in r12-7259-g25332d2325c7.  (The code parts of r12-7259 still
      seem useful, as a belt-and-braces thing.)
      
      gcc/testsuite/
      	* gcc.target/aarch64/pr100056.c: Restore the original version of
      	the scan-assemblers.
      754137d9
    • Rainer Orth's avatar
      libstdc++: Fix gnu.ver CXXABI_1.3.16 for Solaris [PR118701] · 6b49883e
      Rainer Orth authored
      This patch
      
      commit c6977f76
      Author: Andreas Schwab <schwab@suse.de>
      Date:   Tue Jan 21 23:50:15 2025 +0100
      
          libstdc++: correct symbol version of typeinfo for bfloat16_t on RISC-V
      
      broke the libstdc++-abi/abi_check test on Solaris: the log shows
      
      1 incompatible symbols
      0
      Argument "{CXXABI_1.3.15}" isn't numeric in numeric eq (==) at /vol/gcc/src/hg/master/local/libstdc++-v3/scripts/extract_symvers.pl line 129.
      version status: incompatible
      type: uncategorized
      status: added
      
      The problem has two parts:
      
      * The patch above introduced a new version in libstdc++.so,
        CXXABI_1.3.16, which everywhere but on RISC-V contains no symbols (a
        weak version).  This is the first time this happened in libstdc++.
      
      * Solaris uses scripts/extract_symvers.pl to determine the version info.
        The script currently chokes on the pvs output for weak versions:
      
        libstdc++.so.6.0.34 -	CXXABI_1.3.16 [WEAK]: {CXXABI_1.3.15};
      
        instead of
      
        libstdc++.so.6.0.34 -	CXXABI_1.3.16: {CXXABI_1.3.15};
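
The kind of tolerance needed can be sketched as follows. The real script is Perl; this C function, toy_parse_pvs_version, is a hypothetical illustration that assumes the line layout shown in the two samples above and nothing more.

```c
#include <assert.h> /* only needed by callers that assert on the results */
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Extract the version name that follows " -" and report whether it is
   tagged [WEAK].  Returns 0 on success, -1 if the line does not match. */
int toy_parse_pvs_version(const char *line, char *name, size_t name_size,
                          int *is_weak)
{
    const char *p = strstr(line, " -");
    if (p == NULL)
        return -1;
    p += 2;
    while (*p != '\0' && isspace((unsigned char)*p))
        p++;
    /* The name ends at the first blank or at the colon. */
    size_t len = strcspn(p, " \t:");
    if (len == 0 || len >= name_size)
        return -1;
    memcpy(name, p, len);
    name[len] = '\0';
    p += len;
    while (*p == ' ' || *p == '\t')
        p++;
    *is_weak = strncmp(p, "[WEAK]", 6) == 0;
    return 0;
}
```

Both sample lines yield the name CXXABI_1.3.16; only the first one sets the weak flag.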
      
      While this patch hardens the script to cope with weak versions, there's
      no reason to introduce them in the first place.  So the new version is
      only created on __riscv.
      
      Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
      x86_64-pc-linux-gnu.
      
      2025-01-29  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      	    Jonathan Wakely  <jwakely@redhat.com>
      
      	libstdc++-v3:
      	PR libstdc++/118701
      	* config/abi/pre/gnu.ver (CXXABI_1.3.16): Move __riscv guard
      	around version.
      	* scripts/extract_symvers.pl: Allow for weak versions.
      	* testsuite/util/testsuite_abi.cc (check_version): Wrap
      	CXXABI_1.3.16 in __riscv.
      6b49883e
    • Jin Ma's avatar
      MAINTAINERS: Add myself to write after approval · 884893ae
      Jin Ma authored
      ChangeLog:
      
      	* MAINTAINERS: Add myself.
      884893ae
    • Tobias Burnus's avatar
      fortran/trans-openmp.cc: Use the correct member in gfc_omp_namelist [PR118745] · 3a588270
      Tobias Burnus authored
      gcc/fortran/ChangeLog:
      
      	PR fortran/118745
      	* trans-openmp.cc (gfc_trans_omp_declare_variant): Use
      	append_args_list in the condition for the append_arg location.
      3a588270
    • Jerry DeLisle's avatar
      Fortran: Fix PR 47485. · e41a5a2a
      Jerry DeLisle authored
      
      The -MT and -MQ options should replace the default target in the
      generated dependency file. deps_add_target needs to be called before
      cpp_read_main_file, otherwise the original object name is added.
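
The ordering issue can be shown with a toy model. It assumes, as the description above implies, that reading the main file adds the default target only when no target has been registered yet; struct toy_deps and the toy_* functions are hypothetical stand-ins, not the libcpp API.

```c
#include <assert.h> /* only needed by callers that assert on the results */
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TARGETS 8

struct toy_deps {
    const char *targets[TOY_MAX_TARGETS];
    size_t count;
};

/* Stand-in for registering a target given with -MT or -MQ. */
void toy_add_target(struct toy_deps *d, const char *target)
{
    if (d->count < TOY_MAX_TARGETS)
        d->targets[d->count++] = target;
}

/* Stand-in for reading the main file: the default target is added
   only if nothing has been registered so far. */
void toy_read_main_file(struct toy_deps *d, const char *default_target)
{
    if (d->count == 0)
        toy_add_target(d, default_target);
}
```

Registering the user target first leaves exactly one target; doing it after reading the main file leaves two, the default plus the user one.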
      
      Contributed by Vincent Vanlaer <vincenttc@volkihar.be>
      
      	PR fortran/47485
      
      gcc/fortran/ChangeLog:
      
      	* cpp.cc: fix -MT/-MQ adding additional target instead of
      	replacing the default.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/dependency_generation_1.f90: New test.
      
      Signed-off-by: Vincent Vanlaer <vincenttc@volkihar.be>
      e41a5a2a