  1. Jul 31, 2024
    • middle-end/101478 - ICE with degenerate address during gimplification · 33ead640
      Richard Biener authored
      When we gimplify &MEM[0B + 4] we are re-folding the address in case
      types are not canonical which ends up with a constant address that
      recompute_tree_invariant_for_addr_expr ICEs on.  Properly guard
      that call.
      
      	PR middle-end/101478
      	* gimplify.cc (gimplify_addr_expr): Check we still have an
      	ADDR_EXPR before calling recompute_tree_invariant_for_addr_expr.
      
      	* gcc.dg/pr101478.c: New testcase.
      33ead640
    • i386: Mark target option with optimization when enabled with opt level [PR116065] · a59c4e49
      Hongyu Wang authored
      When munroll-only-small-loops was introduced, the option was marked as
      Target Save and added to the -O2 defaults, which makes attribute(optimize)
      reset the target option and causes an error when the command line has O1
      and the function attribute has O2 and other target options.  Mark this
      option as Optimization to fix this.
      
      gcc/ChangeLog
      
      	PR target/116065
      	* config/i386/i386.opt (munroll-only-small-loops): Mark as
      	Optimization instead of Save.
      
      gcc/testsuite/ChangeLog
      
      	PR target/116065
      	* gcc.target/i386/pr116065.c: New test.
      a59c4e49
    • recog: Disallow subregs in mode-punned value [PR115881] · d63b6d8b
      Richard Sandiford authored
      In g:9d20529d, I'd extended
      insn_propagation to handle simple cases of hard-reg mode punning.
      The punned "to" value was created using simplify_subreg rather
      than simplify_gen_subreg, on the basis that hard-coded subregs
      aren't generally useful after RA (where hard-reg propagation is
      expected to happen).
      
      This PR is about a case where the subreg gets pushed into the
      operands of a plus, but the subreg on one of the operands
      cannot be simplified.  Specifically, we have to generate
      (subreg:SI (reg:DI sp) 0) rather than (reg:SI sp), since all
      references to the stack pointer must be via stack_pointer_rtx.
      
      However, code in x86 (reasonably) expects no subregs of registers
      to appear after RA, except for special cases like strict_low_part.
      This leads to an awkward situation where we can't ban subregs of sp
      (because of the strict_low_part use), can't allow direct references
      to sp in other modes (because of the stack_pointer_rtx requirement),
      and can't allow rvalue uses of the subreg (because of the "no subregs
      after RA" assumption).  It all seems a bit of a mess...
      
      I sat on this for a while in the hope that a clean solution might
      become apparent, but in the end, I think we'll just have to check
      manually for nested subregs and punt on them.
      
      gcc/
      	PR rtl-optimization/115881
      	* recog.cc: Include rtl-iter.h.
      	(insn_propagation::apply_to_rvalue_1): Check that the result
      	of simplify_subreg does not include nested subregs.
      
      gcc/testsuite/
      	PR rtl-optimization/115881
      	* gcc.c-torture/compile/pr115881.c: New test.
      d63b6d8b
    • rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359] · 993a3c08
      Kewen Lin authored
      As PR105359 shows, we disable some FLOAT128 expanders for
      64-bit long double, but in fact IEEE float128 types like
      __ieee128 are only guarded with TARGET_FLOAT128_TYPE and
      TARGET_LONG_DOUBLE_128 is only checked when determining if
      we can reuse long_double_type_node.  So this patch is to
      relax all affected FLOAT128 expander conditions for
      FLOAT128_IEEE_P.  By the way, currently IBM double double
      type __ibm128 is guarded by TARGET_LONG_DOUBLE_128, so we
      have to use TARGET_LONG_DOUBLE_128 for it.  IMHO, it's not
      necessary and can be enhanced later.
      
      Btw, for all test cases mentioned in PR105359, I removed
      the xfails and tested them with explicit -mlong-double-64.
      Both pr79004.c and float128-hw.c pass, while
      float128-hw4.c isn't tested (unsupported, because 64-bit
      long double conflicts with -mabi=ieeelongdouble).
      
      	PR target/105359
      
      gcc/ChangeLog:
      
      	* config/rs6000/rs6000.md (@extenddf<FLOAT128:mode>2): Don't check
      	TARGET_LONG_DOUBLE_128 for FLOAT128_IEEE_P modes.
      	(extendsf<FLOAT128:mode>2): Likewise.
      	(trunc<FLOAT128:mode>df2): Likewise.
      	(trunc<FLOAT128:mode>sf2): Likewise.
      	(floatsi<FLOAT128:mode>2): Likewise.
      	(fix_trunc<FLOAT128:mode>si2): Likewise.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/powerpc/pr79004.c: Remove xfails.
      993a3c08
    • rs6000: Use standard name uabd for absdu insns · 169341f0
      Kewen Lin authored
      r14-1832 adds a recognition pattern, ifn and optab for ABD
      (ABsolute Difference).  We have had some vector absolute
      difference unsigned instructions since ISA 3.0, but as the
      associated test cases show, they are not exploited well
      because we don't define them with a standard name.  So this
      patch first renames them with the standard name.  It also
      merges the define_expand and define_insn, as a separate
      define_expand isn't needed.  Besides, it adjusts the RTL
      pattern to use generic umax and umin rather than
      UNSPEC_VADU, which is more meaningful and can catch umin/umax
      opportunities.
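
      The umax/umin formulation can be sketched in scalar form.  This is
      only an illustration: the helper name uabd_u8 is hypothetical and
      is not part of the patch, which works on vector RTL patterns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of an unsigned absolute difference:
// |a - b| computed as umax(a, b) - umin(a, b), so the subtraction
// never wraps around.
constexpr std::uint8_t uabd_u8(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>(std::max(a, b) - std::min(a, b));
}

// The operand order does not matter.
inline void check_uabd()
{
  assert(uabd_u8(3, 250) == 247);
  assert(uabd_u8(250, 3) == 247);
}
```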
      
      gcc/ChangeLog:
      
      	* config/rs6000/altivec.md (p9_vadu<mode>3): Rename to ...
      	(uabd<mode>3): ... this.  Update RTL pattern with umin and umax rather
      	than UNSPEC_VADU.
      	(vadu<mode>3): Remove.
      	(UNSPEC_VADU): Remove.
      	(usadv16qi): Replace gen_p9_vaduv16qi3 with gen_uabdv16qi3.
      	(usadv8hi): Replace gen_p9_vaduv8hi3 with gen_uabdv8hi3.
      	* config/rs6000/rs6000-builtins.def (__builtin_altivec_vadub): Replace
      	expander with uabdv16qi3.
      	(__builtin_altivec_vaduh): Adjust expander with uabdv8hi3.
      	(__builtin_altivec_vaduw): Adjust expander with uabdv4si3.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/powerpc/abd-vectorize-1.c: New test.
      	* gcc.target/powerpc/abd-vectorize-2.c: New test.
      169341f0
    • LoongArch: Expand some SImode operations through "si3_extend" instructions if TARGET_64BIT · b929083d
      Xi Ruoyao authored
      We already had "si3_extend" insns and we hoped the fwprop or combine
      passes could use them to remove unnecessary sign extensions.  But this
      does not always work: for cases like x << 1 | y, the compiler
      tends to do
      
          (sign_extend:DI
            (ior:SI (ashift:SI (reg:SI $r4)
                               (const_int 1))
                    (reg:SI $r5)))
      
      instead of
      
          (ior:DI (sign_extend:DI (ashift:SI (reg:SI $r4) (const_int 1)))
                  (sign_extend:DI (reg:SI $r5)))
      
      So we cannot match the ashlsi3_extend instruction here and we get:
      
          slli.w $r4,$r4,1
          or     $r4,$r5,$r4
          slli.w $r4,$r4,0    # <= redundant
          jr	   $r1
      
      To eliminate this redundant extension we need to turn SImode shifts etc.
      into DImode "si3_extend" operations earlier, when we expand the SImode
      operation.  We are already doing this for addition; now do it for
      shifts, rotates, subtraction, multiplication, division, and modulo as
      well.
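
      The two RTL shapes shown earlier compute the same 64-bit value,
      which is why extending each operand early loses nothing.  A small
      model of that identity, assuming hypothetical helper names
      (sext32, extend_after, extend_before) that do not appear in the
      patch:

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit value to 64 bits, like (sign_extend:DI (...:SI)).
constexpr std::int64_t sext32(std::uint32_t v)
{
  return static_cast<std::int64_t>(static_cast<std::int32_t>(v));
}

// First shape: compute x << 1 | y in 32 bits, then extend once.
constexpr std::int64_t extend_after(std::uint32_t x, std::uint32_t y)
{
  return sext32((x << 1) | y);
}

// Second shape: extend each 32-bit operand, then OR in 64 bits.
constexpr std::int64_t extend_before(std::uint32_t x, std::uint32_t y)
{
  return sext32(x << 1) | sext32(y);
}

// Both shapes agree, including when bit 31 of the result is set.
inline void check_extend()
{
  assert(extend_after(0x40000000u, 1u) == extend_before(0x40000000u, 1u));
  assert(extend_after(3u, 0x80000000u) == extend_before(3u, 0x80000000u));
}
```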
      
      The bytepick.w definition for TARGET_64BIT needs to be adjusted so it
      won't be undone by the shift expanding.
      
      gcc/ChangeLog:
      
      	* config/loongarch/loongarch.md (optab): Add (rotatert "rotr").
      	(<optab:any_shift><mode>3, <optab:any_div><mode>3,
      	sub<mode>3, rotr<mode>3, mul<mode>3): Add a "*" to the insn name
      	so we can redefine the names with define_expand.
      	(*<optab:any_shift>si3_extend): Remove "*" so we can use them
      	in expanders.
      	(*subsi3_extended, *mulsi3_extended): Likewise, also remove the
      	trailing "ed" for consistency.
      	(*<optab:any_div>si3_extended): Add mode for sign_extend to
      	prevent an ICE using it in expanders.
      	(shift_w, arith_w): New define_code_iterator.
      	(<optab:any_w><mode>3): New define_expand.  Expand with
      	<optab:any_w>si3_extend for SImode if TARGET_64BIT.
      	(<optab:arith_w><mode>3): Likewise.
      	(mul<mode>3): Expand to mulsi3_extended for SImode if
      	TARGET_64BIT and ISA_HAS_DIV32.
      	(<optab:any_div><mode>3): Expand to <optab:any_div>si3_extended
      	for SImode if TARGET_64BIT.
      	(rotl<mode>3): Expand to rotrsi3_extend for SImode if
      	TARGET_64BIT.
      	(bytepick_w_<bytepick_imm>): Add mode for lshiftrt and ashift.
      	(bitsize, bytepick_imm, bytepick_w_ashift_amount): New
      	define_mode_attr.
      	(bytepick_w_<bytepick_imm>_extend): Adjust for the RTL change
      	caused by 32-bit shift expanding.  Now bytepick_imm only covers
      	2 and 3, separate one remaining case to ...
      	(bytepick_w_1_extend): ... here, new define_insn.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/loongarch/bitwise_extend.c: New test.
      b929083d
    • Daily bump. · e7f6a5dc
      GCC Administrator authored
      e7f6a5dc
  2. Jul 30, 2024
    • libstdc++: Fix formatter for low-resolution chrono::zoned_time (LWG 4124) · 4883c957
      Jonathan Wakely authored
      This implements the proposed resolution of LWG 4124, so that
      low-resolution chrono::zoned_time objects can be formatted. The
      formatter for zoned_time<D, P> needs to account for get_local_time
      returning local_time<common_type_t<D, seconds>> not local_time<D>.
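
      The type mismatch can be checked at compile time with standard
      <chrono> facilities alone.  The variable names below are
      hypothetical and only serve the illustration:

```cpp
#include <chrono>
#include <type_traits>

using std::chrono::milliseconds;
using std::chrono::minutes;
using std::chrono::seconds;

// For a duration D coarser than seconds, common_type_t<D, seconds> is
// seconds, so the local time has a different duration type than D.
constexpr bool minutes_widen_to_seconds =
  std::is_same_v<std::common_type_t<minutes, seconds>, seconds>;

// For a duration D finer than seconds, D itself is kept.
constexpr bool milliseconds_are_kept =
  std::is_same_v<std::common_type_t<milliseconds, seconds>, milliseconds>;

static_assert(minutes_widen_to_seconds);
static_assert(milliseconds_are_kept);
```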
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/chrono_io.h (__local_time_fmt_for): New alias
      	template.
      	(formatter<zoned_time<D, P>>): Use __local_time_fmt_for.
      	* testsuite/std/time/zoned_time/io.cc: Check zoned_time<minutes>
      	can be formatted.
      4883c957
    • libstdc++: Fix std::format output for std::chrono::zoned_time · 8f05ada7
      Jonathan Wakely authored
      When formatting a chrono::zoned_time with an empty chrono-specs, we were
      only formatting its _M_time member, but the ostream insertion operator
      uses the format "{:L%F %T %Z}" which includes the time zone
      abbreviation. The %Z should also be used when formatting with an empty
      chrono-specs.
      
      This commit makes _M_format_to_ostream handle __local_time_fmt
      specializations directly, rather than calling itself recursively to
      format the _M_time member. We need to be able to customize the output of
      _M_format_to_ostream for __local_time_fmt, because we use that type for
      gps_time and tai_time as well as for zoned_time and __local_time_fmt.
      When formatting gps_time and tai_time we don't want to include the time
      zone abbreviation in the "{}" output, but for zoned_time we do want to.
      We can reuse the __is_neg flag passed to _M_format_to_ostream (via
      _M_format) to say that we want the time zone abbreviation.  Currently
      the __is_neg flag is only used for duration specializations, so it's
      available for __local_time_fmt to use.
      
      In addition to fixing the zoned_time output to use %Z, this commit also
      changes the __local_time_fmt output to use %Z. Previously it didn't use
      it, just like zoned_time.  The standard doesn't actually say how to
      format local-time-format-t for an empty chrono-specs, but this behaviour
      seems sensible and is what I'm proposing as part of LWG 4124.
      
      While testing this I noticed that some chrono types were not being
      tested with empty chrono-specs, so this adds more tests. I also noticed
      that std/time/clock/local/io.cc was testing tai_time instead of
      local_time, which was completely wrong. That's fixed now too.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/chrono_io.h (__local_fmt_t): Remove unused
      	declaration.
      	(__formatter_chrono::_M_format_to_ostream): Add explicit
      	handling for specializations of __local_time_fmt, including the
      	time zone abbreviation in the output if __is_neg is true.
      	(formatter<chrono::tai_time<D>>::format): Add comment.
      	(formatter<chrono::gps_time<D>>::format): Likewise.
      	(formatter<chrono::__detail::__local_time_fmt::format): Call
      	_M_format with true for the __is_neg flag.
      	* testsuite/std/time/clock/gps/io.cc: Remove unused variable.
      	* testsuite/std/time/clock/local/io.cc: Fix test error that
      	checked tai_time instead of local_time. Add tests for
      	local-time-format-t formatting.
      	* testsuite/std/time/clock/system/io.cc: Check empty
      	chrono-specs.
      	* testsuite/std/time/clock/tai/io.cc: Likewise.
      	* testsuite/std/time/zoned_time/io.cc: Likewise.
      8f05ada7
    • libstdc++: Implement LWG 3886 for std::optional and std::expected · a9e472c6
      Jonathan Wakely authored
      This uses remove_cv_t<T> for the default template argument used for
      deducing a type for a braced-init-list used with std::optional and
      std::expected.
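
      One observable effect of such a default template argument can be
      modelled without the library.  This is a sketch under assumptions:
      Tracker, make_old and make_new are hypothetical names, and the
      commit message does not itself state that this is the motivation.
      A braced-init-list is a non-deduced context, so U falls back to
      its default; with a const-qualified T the old default yields a
      const rvalue reference, which cannot be moved from.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type that records how it was constructed.
struct Tracker
{
  bool copied = false;
  bool moved = false;
  Tracker() = default;
  Tracker(const Tracker&) : copied(true) {}
  Tracker(Tracker&&) noexcept : moved(true) {}
};

// Old-style default: for T = const Tracker the parameter is
// const Tracker&&, so the conversion has to copy.
template<class T, class U = T>
T make_old(U&& v) { return static_cast<T>(std::forward<U>(v)); }

// remove_cv_t default: the parameter is Tracker&&, so it can move.
template<class T, class U = std::remove_cv_t<T>>
T make_new(U&& v) { return static_cast<T>(std::forward<U>(v)); }

// Requires C++17 so that the returned prvalue is not copied again.
inline void check_default_argument()
{
  const Tracker a = make_old<const Tracker>({});
  const Tracker b = make_new<const Tracker>({});
  assert(a.copied && !a.moved);
  assert(b.moved && !b.copied);
}
```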
      
      libstdc++-v3/ChangeLog:
      
      	* include/std/expected (expected(U&&), operator=(U&&))
      	(value_or): Use remove_cv_t on default template argument, as per
      	LWG 3886.
      	* include/std/optional (optional(U&&), operator=(U&&))
      	(value_or): Likewise.
      	* testsuite/20_util/expected/lwg3886.cc: New test.
      	* testsuite/20_util/optional/cons/lwg3886.cc: New test.
      a9e472c6
    • testsuite: fix 'dg-compile' typos · acc70606
      Sam James authored
      'dg-compile' is not a thing, replace it with 'dg-do compile'.
      
      	PR target/68015
      	PR c++/83979
      	* c-c++-common/goacc/loop-shape.c: Fix 'dg-compile' typo.
      	* g++.dg/pr83979.C: Likewise.
      	* g++.target/aarch64/sve/acle/general-c++/attributes_2.C: Likewise.
      	* gcc.dg/tree-ssa/builtin-sprintf-7.c: Likewise.
      	* gcc.dg/tree-ssa/builtin-sprintf-8.c: Likewise.
      	* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-char.c: Likewise.
      	* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-short.c: Likewise.
      	* gcc.target/s390/20181024-1.c: Likewise.
      	* gcc.target/s390/addr-constraints-1.c: Likewise.
      	* gcc.target/s390/arch12/aghsghmgh-1.c: Likewise.
      	* gcc.target/s390/arch12/mul-1.c: Likewise.
      	* gcc.target/s390/arch13/bitops-1.c: Likewise.
      	* gcc.target/s390/arch13/bitops-2.c: Likewise.
      	* gcc.target/s390/arch13/fp-signedint-convert-1.c: Likewise.
      	* gcc.target/s390/arch13/fp-unsignedint-convert-1.c: Likewise.
      	* gcc.target/s390/arch13/popcount-1.c: Likewise.
      	* gcc.target/s390/pr68015.c: Likewise.
      	* gcc.target/s390/vector/fp-signedint-convert-1.c: Likewise.
      	* gcc.target/s390/vector/fp-unsignedint-convert-1.c: Likewise.
      	* gcc.target/s390/vector/reverse-elements-1.c: Likewise.
      	* gcc.target/s390/vector/reverse-elements-2.c: Likewise.
      	* gcc.target/s390/vector/reverse-elements-3.c: Likewise.
      	* gcc.target/s390/vector/reverse-elements-4.c: Likewise.
      	* gcc.target/s390/vector/reverse-elements-5.c: Likewise.
      	* gcc.target/s390/vector/reverse-elements-6.c: Likewise.
      	* gcc.target/s390/vector/reverse-elements-7.c: Likewise.
      	* gnat.dg/alignment15.adb: Likewise.
      	* gnat.dg/debug4.adb: Likewise.
      	* gnat.dg/inline21.adb: Likewise.
      	* gnat.dg/inline22.adb: Likewise.
      	* gnat.dg/opt37.adb: Likewise.
      	* gnat.dg/warn13.adb: Likewise.
      acc70606
    • libstdc++: Fix name of source file in comment · df67f383
      Jonathan Wakely authored
      libstdc++-v3/ChangeLog:
      
      	* src/c++17/fs_ops.cc: Fix file name in comment.
      df67f383
    • i386/testsuite: Add testcase for fixed PR [PR51492] · 8b737ec2
      Uros Bizjak authored
      	PR target/51492
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/pr51492.c: New test.
      8b737ec2
    • RISC-V: Add configure check for B extension support · 7ef8a9d4
      Edwin Lu authored
      
      Binutils 2.42 and earlier don't recognize the b extension in the march
      string even though they support zba_zbb_zbs. Add a configure check to
      ignore the b in the march string if found.
      
      gcc/ChangeLog:
      
      	* common/config/riscv/riscv-common.cc (riscv_subset_list::to_string):
      	Skip b in march string.
      	* config.in: Regenerate.
      	* configure: Regenerate.
      	* configure.ac: Add B assembler check.
      
      Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
      7ef8a9d4
    • testsuite: fix whitespace in dg-require-effective-target directives · ee12a13d
      Sam James authored
      	PR middle-end/54400
      	PR target/98161
      	* gcc.dg/vect/bb-slp-layout-18.c: Fix whitespace in dg directive.
      	* gcc.dg/vect/bb-slp-pr54400.c: Likewise.
      	* gcc.target/i386/pr98161.c: Likewise.
      ee12a13d
    • gimple ssa: Teach switch conversion to optimize powers of 2 switches · 2b3533cd
      Filip Kastl authored
      
      Sometimes a switch has case numbers that are powers of 2.  Switch
      conversion usually isn't able to optimize these switches.  This patch
      adds an "exponential index transformation" to switch conversion.  After
      switch conversion applies this transformation to the switch, the index
      variable of the switch becomes the exponent instead of the whole value.
      For example:
      
      switch (i)
        {
          case (1 << 0): return 0;
          case (1 << 1): return 1;
          case (1 << 2): return 2;
          ...
          case (1 << 30): return 30;
          default: return 31;
        }
      
      gets transformed roughly into
      
      switch (log2(i))
        {
          case 0: return 0;
          case 1: return 1;
          case 2: return 2;
          ...
          case 30: return 30;
          default: return 31;
        }
      
      This enables switch conversion to further optimize the switch.
      
      This patch only enables this transformation if there are optabs for FFS
      so that the base 2 logarithm can be computed efficiently at runtime.
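
      A hand-written model of the transformation, for illustration only
      and reduced to four cases.  The function names by_value and
      by_exponent are hypothetical, __builtin_ffs is the GCC/Clang
      builtin, and the GIMPLE that gen_pow2p and gen_log2 actually emit
      may well look different:

```cpp
#include <cassert>

// Original form: the case labels are powers of 2.
inline int by_value(unsigned i)
{
  switch (i)
    {
    case 1u << 0: return 0;
    case 1u << 1: return 1;
    case 1u << 2: return 2;
    case 1u << 3: return 3;
    default: return 31;
    }
}

// Model of the transformed form: test for a power of 2, then switch
// on the exponent, which gives a dense range of case labels.
inline int by_exponent(unsigned i)
{
  // Values that are not powers of 2 take the default path.
  if (i == 0 || (i & (i - 1)) != 0)
    return 31;
  // Exact base 2 logarithm via find-first-set (1-based bit index).
  int exponent = __builtin_ffs(static_cast<int>(i)) - 1;
  switch (exponent)
    {
    case 0: return 0;
    case 1: return 1;
    case 2: return 2;
    case 3: return 3;
    default: return 31;
    }
}

// Both forms agree on every input in a small range.
inline void check_switch_forms()
{
  for (unsigned i = 0; i < 64; ++i)
    assert(by_value(i) == by_exponent(i));
}
```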
      
      gcc/ChangeLog:
      
      	* tree-switch-conversion.cc (can_log2): New static function to
      	check if gen_log2 can be used on current target.
      	(gen_log2): New static function to generate efficient GIMPLE
      	code for taking an exact base 2 log.
      	(gen_pow2p): New static function to generate efficient GIMPLE
      	code for checking if a value is a power of 2.
      	(switch_conversion::switch_conversion): Track if the
      	transformation happened.
      	(switch_conversion::is_exp_index_transform_viable): New function
      	to decide whether the transformation should be applied.
      	(switch_conversion::exp_index_transform): New function to
      	execute the transformation.
      	(switch_conversion::gen_inbound_check): Don't remove the default
      	BB if the transformation happened.
      	(switch_conversion::expand): Execute the transform if it is
      	viable.  Skip the "sufficiently small case range" test if the
      	transformation is going to be executed.
      	* tree-switch-conversion.h: Add is_exp_index_transform_viable
      	and exp_index_transform.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/switch-3.c: Disable switch conversion.
      	* gcc.target/i386/switch-exp-transform-1.c: New test.
      	* gcc.target/i386/switch-exp-transform-2.c: New test.
      	* gcc.target/i386/switch-exp-transform-3.c: New test.
      
      Signed-off-by: Filip Kastl <fkastl@suse.cz>
      2b3533cd
    • libbacktrace: fix syntax of Windows registration functions · 37aa98f7
      Ian Lance Taylor authored
      Adjust the syntax to keep MSVC happy.
      
      Fixes https://github.com/ianlancetaylor/libbacktrace/issues/131
      
      	* pecoff.c (LDR_DLL_NOTIFICATION): Put function modifier
      	inside parentheses.
      	(LDR_REGISTER_FUNCTION): Likewise.
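
      A portable sketch of the placement in question.  DEMO_CALLCONV,
      demo_callback, twice and call_demo are hypothetical names, not
      the pecoff.c declarations, and the claim that MSVC wants the
      modifier inside the parentheses reflects my understanding of that
      compiler rather than anything stated in this commit:

```cpp
#include <cassert>

// Hypothetical stand-in for a modifier such as CALLBACK or NTAPI.
// It expands to nothing outside MSVC so the sketch stays portable.
#if defined(_MSC_VER)
# define DEMO_CALLCONV __stdcall
#else
# define DEMO_CALLCONV
#endif

// The modifier sits inside the parentheses, next to the '*'.
typedef int (DEMO_CALLCONV *demo_callback) (int);

inline int DEMO_CALLCONV twice(int v) { return 2 * v; }

// Calls through the function pointer type declared above.
inline int call_demo(demo_callback cb, int v) { return cb(v); }

inline void check_callback()
{
  assert(call_demo(twice, 21) == 42);
}
```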
      37aa98f7
    • testsuite: fix whitespace in dg-do assemble directive · 2d105efd
      Sam James authored
      	* gcc.target/aarch64/simd/vmmla.c: Fix whitespace in dg directive.
      2d105efd
    • testsuite: fix whitespace in dg-do preprocess directive · 7f1aa73b
      Sam James authored
      	PR preprocessor/90581
      	* c-c++-common/cpp/fmax-include-depth.c: Fix whitespace in dg directive.
      7f1aa73b
    • testsuite: fix whitespace in dg-do compile directives · 2e662ded
      Sam James authored
      Nothing seems to change in practice, at least on x86_64-pc-linux-gnu,
      but it is important to fix nonetheless in case people copy it.
      
      	PR rtl-optimization/48633
      	PR tree-optimization/83072
      	PR tree-optimization/83073
      	PR tree-optimization/96542
      	PR tree-optimization/96707
      	PR tree-optimization/97567
      	PR target/69225
      	PR target/89929
      	PR target/96562
      	* g++.dg/pr48633.C: Fix whitespace in dg directive.
      	* g++.dg/pr96707.C: Likewise.
      	* g++.target/i386/mv28.C: Likewise.
      	* gcc.dg/Warray-bounds-flex-arrays-1.c: Likewise.
      	* gcc.dg/pr83072-2.c: Likewise.
      	* gcc.dg/pr83073.c: Likewise.
      	* gcc.dg/pr96542.c: Likewise.
      	* gcc.dg/pr97567-2.c: Likewise.
      	* gcc.target/i386/avx512fp16-11a.c: Likewise.
      	* gcc.target/i386/avx512fp16-13.c: Likewise.
      	* gcc.target/i386/avx512fp16-14.c: Likewise.
      	* gcc.target/i386/avx512fp16-conjugation-1.c: Likewise.
      	* gcc.target/i386/avx512fp16-neg-1a.c: Likewise.
      	* gcc.target/i386/avx512fp16-set1-pch-1a.c: Likewise.
      	* gcc.target/i386/avx512fp16vl-conjugation-1.c: Likewise.
      	* gcc.target/i386/avx512fp16vl-neg-1a.c: Likewise.
      	* gcc.target/i386/avx512fp16vl-set1-pch-1a.c: Likewise.
      	* gcc.target/i386/avx512vlfp16-11a.c: Likewise.
      	* gcc.target/i386/pr69225-1.c: Likewise.
      	* gcc.target/i386/pr69225-2.c: Likewise.
      	* gcc.target/i386/pr69225-3.c: Likewise.
      	* gcc.target/i386/pr69225-4.c: Likewise.
      	* gcc.target/i386/pr69225-5.c: Likewise.
      	* gcc.target/i386/pr69225-6.c: Likewise.
      	* gcc.target/i386/pr69225-7.c: Likewise.
      	* gcc.target/i386/pr96562-1.c: Likewise.
      	* gcc.target/riscv/rv32e_stack.c: Likewise.
      	* gfortran.dg/c-interop/removed-restrictions-3.f90: Likewise.
      	* gnat.dg/renaming1.adb: Likewise.
      2e662ded
    • RISC-V: Add basic support for the Zacas extension · 11c2453a
      Gianluca Guida authored
      This patch adds support for amocas.{b|h|w|d}. Support for amocas.q
      (64/128 bit cas for rv32/64) will be added in a future patch.
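
      For orientation, this is the kind of source-level operation that
      a compare-and-swap instruction serves.  A minimal sketch with a
      hypothetical helper name (try_claim); whether a given build
      lowers it to amocas.w depends on the -march string and has not
      been verified here:

```cpp
#include <atomic>
#include <cassert>

// Replaces the value in `slot` with `desired` only if it currently
// equals `expected`; returns true when the swap happened.
inline bool try_claim(std::atomic<int>& slot, int expected, int desired)
{
  return slot.compare_exchange_strong(expected, desired,
                                      std::memory_order_seq_cst);
}

inline void check_claim()
{
  std::atomic<int> slot{0};
  assert(try_claim(slot, 0, 5));   // 0 -> 5 succeeds
  assert(!try_claim(slot, 0, 7));  // value is 5, so this fails
  assert(slot.load() == 5);
}
```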
      
      Extension: https://github.com/riscv/riscv-zacas
      Ratification: https://jira.riscv.org/browse/RVS-680
      
      
      
      gcc/ChangeLog:
      
      	* common/config/riscv/riscv-common.cc: Add zacas extension.
      	* config/riscv/arch-canonicalize: Make zacas imply zaamo.
      	* config/riscv/riscv.opt: Add zacas.
      	* config/riscv/sync.md (zacas_atomic_cas_value<mode>): New pattern.
      	(atomic_compare_and_swap<mode>): Use new pattern for compare-and-swap ops.
      	(zalrsc_atomic_cas_value_strong<mode>): Rename atomic_cas_value_strong.
      	* doc/sourcebuild.texi: Add Zacas documentation.
      
      gcc/testsuite/ChangeLog:
      
      	* lib/target-supports.exp: Add zacas testsuite infra support.
      	* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c:
      	Remove zacas to continue to test the lr/sc pairs.
      	* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto.
      	* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto.
      	* gcc.target/riscv/amo/zabha-zacas-preferred-over-zalrsc.c: New test.
      	* gcc.target/riscv/amo/zacas-char-requires-zabha.c: New test.
      	* gcc.target/riscv/amo/zacas-char-requires-zacas.c: New test.
      	* gcc.target/riscv/amo/zacas-preferred-over-zalrsc.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acq-rel.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-acquire.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-relaxed.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-release.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-char-seq-cst.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping-no-fence.c:
      	New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-compatability-mapping.cc: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acq-rel.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-acquire.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-relaxed.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-release.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-int-seq-cst.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acq-rel.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-acquire.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-relaxed.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-release.c: New test.
      	* gcc.target/riscv/amo/zacas-rvwmo-compare-exchange-short-seq-cst.c: New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-char-seq-cst.c: New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-char.c: New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping-no-fence.c:
      	New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-compatability-mapping.cc: New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-int-seq-cst.c: New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-int.c: New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-short-seq-cst.c: New test.
      	* gcc.target/riscv/amo/zacas-ztso-compare-exchange-short.c: New test.
      
      Co-authored-by: Patrick O'Neill <patrick@rivosinc.com>
      Tested-by: Andrea Parri <andrea@rivosinc.com>
      Signed-Off-By: Gianluca Guida <gianluca@rivosinc.com>
      11c2453a
    • RISC-V: Remove configure check for zabha · c0af64af
      Patrick O'Neill authored
      
      This patch removes the zabha configure check since it's not a breaking change
      and updates the existing zaamo/zalrsc comment.
      
      gcc/ChangeLog:
      
      	* common/config/riscv/riscv-common.cc
      	(riscv_subset_list::to_string): Remove zabha configure check
      	handling and clarify zaamo/zalrsc comment.
      	* config.in: Regenerate.
      	* configure: Regenerate.
      	* configure.ac: Remove zabha configure check.
      
      Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
      c0af64af
    • libstdc++: Fix overwriting files with fs::copy_file on Windows · 017e3f89
      Jonathan Wakely authored
      There are no inode numbers on Windows filesystems, so stat_type::st_ino
      is always zero and the check for equivalent files in do_copy_file was
      incorrectly identifying distinct files as equivalent. This caused
      copy_file to incorrectly report errors when trying to overwrite existing
      files.
      
      The fs::equivalent function already does the right thing on Windows, so
      factor that logic out into a new function that can be reused by
      fs::copy_file.
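
      Why an inode comparison misfires when st_ino is always zero can
      be shown with a small model.  FileId and same_by_inode are
      hypothetical names; this is not the libstdc++ code, only the
      shape of the check it describes:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the (st_dev, st_ino) pair from stat().
struct FileId
{
  std::uint64_t dev;
  std::uint64_t ino;
};

// POSIX-style check: same device and same inode number.
constexpr bool same_by_inode(FileId a, FileId b)
{
  return a.dev == b.dev && a.ino == b.ino;
}

inline void check_inode_model()
{
  // With real inode numbers, two distinct files compare unequal.
  assert(!same_by_inode(FileId{1, 100}, FileId{1, 101}));
  // With st_ino always zero, two distinct files on the same volume
  // are wrongly reported as the same file.
  assert(same_by_inode(FileId{1, 0}, FileId{1, 0}));
}
```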
      
      The tests for fs::copy_file were quite inadequate, so this also adds
      checks for that function's error conditions.
      
      libstdc++-v3/ChangeLog:
      
      	* src/c++17/fs_ops.cc (auto_win_file_handle): Change constructor
      	parameter from const path& to const wchar_t*.
      	(fs::equiv_files): New function.
      	(fs::equivalent): Use equiv_files.
      	* src/filesystem/ops-common.h (fs::equiv_files): Declare.
      	(do_copy_file): Use equiv_files.
      	* src/filesystem/ops.cc (fs::equiv_files): Define.
      	(fs::copy, fs::equivalent): Use equiv_files.
      	* testsuite/27_io/filesystem/operations/copy.cc: Test
      	overwriting directory contents recursively.
      	* testsuite/27_io/filesystem/operations/copy_file.cc: Test
      	overwriting existing files.
      017e3f89
    • libstdc++: Fix fs::hard_link_count behaviour on MinGW [PR113663] · 65819365
      Lennox Shou Hao Ho authored
      std::filesystem::hard_link_count() always returns 1 on
      mingw-w64ucrt-11.0.1-r3 on Windows 10 19045.
      
      hard_link_count() queries _wstat64() on MinGW-w64.
      The MSFT documentation claims _wstat64() will always return 1 on *non*-NTFS volumes:
      https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2013/14h5k7ff(v=vs.120)
      
      My tests suggest that is not always true -
      hard_link_count()/_wstat64() still returns 1 on NTFS.
      GetFileInformationByHandle does return the correct result of 2.
      Please see the PR for a minimal repro.
      
      This patch changes the Windows implementation to always call
      GetFileInformationByHandle.
      
      	PR libstdc++/113663
      
      libstdc++-v3/ChangeLog:
      
      	* src/c++17/fs_ops.cc (fs::equivalent): Moved helper class
      	auto_handle to anonymous namespace as auto_win_file_handle.
      	(fs::hard_link_count): Changed Windows implementation to use
      	information provided by GetFileInformationByHandle which is more
      	reliable.
      	* testsuite/27_io/filesystem/operations/hard_link_count.cc: New
      	test.
      
      Signed-off-by: "Lennox" Shou Hao Ho <lennoxhoe@gmail.com>
      Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
      65819365
    • c++: diagnose usage of co_await and co_yield in default args [PR115906] · 0c382da0
      Arsen Arsenović authored
      This is a partial fix for PR115906.  Per [expr.await] 2s3, "An
      await-expression shall not appear in a default argument
      ([dcl.fct.default])".  This patch introduces the diagnostic in that
      case, and in the case of a co_yield (as co_yield is defined in terms of
      co_await, so prerequisites of co_await hold).
      
      PR c++/115906 - [coroutines] missing diagnostic and ICE when co_await used as default argument in function declaration
      
      gcc/cp/ChangeLog:
      
      	PR c++/115906
      	* parser.cc (cp_parser_unary_expression): Reject await
      	expressions if use of local variables is currently forbidden.
      	(cp_parser_yield_expression): Reject yield expressions if use of
      	local variables is currently forbidden.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/115906
      	* g++.dg/coroutines/pr115906-yield.C: New test.
      	* g++.dg/coroutines/pr115906.C: New test.
      	* g++.dg/coroutines/co-await-syntax-02-outside-fn.C: Don't rely
      	on default arguments.
      	* g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: Ditto.
      0c382da0
    • Arsen Arsenovic's avatar
      c++: fix ICE on FUNCTION_DECLs inside coroutines [PR115906] · a362c9ca
      Arsen Arsenovic authored
      When register_local_var_uses iterates a BIND_EXPR's BIND_EXPR_VARS, it
      fails to account for the fact that FUNCTION_DECLs might be present, and
      later passes it to DECL_HAS_VALUE_EXPR_P.  This leads to a tree check
      failure in DECL_HAS_VALUE_EXPR_P:
      
        tree check: expected var_decl or parm_decl or result_decl, have
        function_decl in register_local_var_uses
      
      We only care about PARM_DECL and VAR_DECL, so select only those.
      
      PR c++/115906 - [coroutines] missing diagnostic and ICE when co_await used as default argument in function declaration
      
      gcc/cp/ChangeLog:
      
      	PR c++/115906
      	* coroutines.cc (register_local_var_uses): Only process
      	PARM_DECL and VAR_DECLs.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/115906
      	* g++.dg/coroutines/coro-function-decl.C: New test.
      a362c9ca
    • Jennifer Schmitz's avatar
      SVE intrinsics: Add strength reduction for division by constant. · 7cde1408
      Jennifer Schmitz authored
      
      This patch folds SVE division where all divisor elements are the same
      power of 2 to svasrd (signed) or svlsr (unsigned).
      Tests were added to check
      1) whether the transform is applied (existing test harness was amended), and
      2) correctness using runtime tests for all input types of svdiv; for signed
      and unsigned integers, several corner cases were covered.
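      As a scalar illustration of the per-element identity behind this
      fold (a sketch only, not the SVE intrinsics or the code in
      svdiv_impl::fold): unsigned division by 2^k is a logical shift
      right, while signed division must round toward zero, so negative
      inputs are biased before the arithmetic shift. The function names
      below are hypothetical, and the signed version assumes that >> on a
      negative value is an arithmetic shift, as it is with GCC.

```cpp
#include <cassert>
#include <cstdint>

// Signed division by 2^k, rounding toward zero like C's '/'.
// A plain arithmetic shift rounds toward negative infinity, so negative
// inputs are biased by (2^k - 1) first.
std::int32_t sdiv_pow2(std::int32_t x, unsigned k)
{
  assert(k < 31);
  const std::int32_t bias = (x < 0) ? ((std::int32_t{1} << k) - 1) : 0;
  return (x + bias) >> k;  // assumes arithmetic shift for negative values
}

// Unsigned division by 2^k is exactly a logical shift right.
std::uint32_t udiv_pow2(std::uint32_t x, unsigned k)
{
  assert(k < 32);
  return x >> k;
}
```

      For example, sdiv_pow2(-7, 1) yields -3, matching -7 / 2, whereas a
      bare shift of -7 by 1 would give -4.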
      
      The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
      OK for mainline?
      
      Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
      
      gcc/
      
      	* config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
      	Implement strength reduction.
      
      gcc/testsuite/
      
      	* gcc.target/aarch64/sve/div_const_run.c: New test.
      	* gcc.target/aarch64/sve/acle/asm/div_s32.c: Likewise.
      	* gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
      	* gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
      	* gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
      7cde1408
    • Arsen Arsenović's avatar
      c++: make source_location follow DECL_RAMP_FN · 265aa320
      Arsen Arsenović authored
      This fixes the value of current_function in compiler generated coroutine
      code.
      
      PR c++/110855 - std::source_location doesn't work with C++20 coroutine
      
      gcc/cp/ChangeLog:
      
      	PR c++/110855
      	* cp-gimplify.cc (fold_builtin_source_location): Use the name of
      	the DECL_RAMP_FN of the current function if present.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/110855
      	* g++.dg/coroutines/pr110855.C: New test.
      265aa320
    • Sam James's avatar
      testsuite: fix dg-do run whitespace · 136f364e
      Sam James authored
      This caused the tests to not be run. I may do further passes for non-run
      next.
      
      Tested on x86_64-pc-linux-gnu and checked test logs before/after.
      
      	PR c/53548
      	PR target/101529
      	PR tree-optimization/102359
      	* c-c++-common/fam-in-union-alone-in-struct-1.c: Fix whitespace in dg directive.
      	* c-c++-common/fam-in-union-alone-in-struct-2.c: Likewise.
      	* c-c++-common/torture/builtin-shufflevector-2.c: Likewise.
      	* g++.dg/pr102359_2.C: Likewise.
      	* g++.target/i386/mvc1.C: Likewise.
      136f364e
    • Paul-Antoine Arras's avatar
      Fix warnings for tree formats in gfc_error · 0450a143
      Paul-Antoine Arras authored
      This enables proper warnings for formats like %qD.
      
      gcc/c-family/ChangeLog:
      
      	* c-format.cc (gcc_gfc_char_table): Add formats for tree objects.
      0450a143
    • Tobias Burnus's avatar
      gfortran.dg/compiler-directive_2.f: Update dg-error · 15158a88
      Tobias Burnus authored
      This is a fallout of commit r15-2378-g29b1587e7d3466
        OpenMP/Fortran: Fix handling of 'declare target' with 'link' clause [PR115559]
      where the '!GCC$' attributes were added in reverse order.
      Result: The error diagnostic for the stdcall/fastcall was reversed.
      Solution: Swap the order in dg-error.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/compiler-directive_2.f: Update dg-error.
      15158a88
    • Georg-Johann Lay's avatar
      AVR: Propose to use attribute signal(n) via AVR-LibC's ISR_N. · 92208369
      Georg-Johann Lay authored
      gcc/
      	* doc/extend.texi (AVR Function Attributes): Propose to use
      	attribute signal(n) via AVR-LibC's ISR_N from avr/interrupt.h
      92208369
    • Pan Li's avatar
      RISC-V: Take Xmode instead of Pmode for ussub expanding · 85cff6e4
      Pan Li authored
      
      Pmode is designed for pointers, so use Xmode instead when
      expanding ussub.
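      For readers unfamiliar with the operation, ussub is unsigned
      saturating subtraction: the result clamps at zero instead of
      wrapping. The scalar sketch below shows the semantics together with
      one common branchless formulation; both function names are
      hypothetical, and the branchless form is not claimed to be the exact
      sequence emitted by riscv_expand_ussub.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics: clamp at 0 instead of wrapping around.
std::uint32_t ussub_u32(std::uint32_t a, std::uint32_t b)
{
  return a >= b ? a - b : 0u;
}

// Branchless variant: the mask is all ones when a >= b, otherwise zero.
std::uint32_t ussub_u32_branchless(std::uint32_t a, std::uint32_t b)
{
  const std::uint32_t mask = 0u - static_cast<std::uint32_t>(a >= b);
  const std::uint32_t r = (a - b) & mask;
  assert(r == ussub_u32(a, b));  // debug check against the reference
  return r;
}
```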
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv.cc (riscv_expand_ussub): Promote to Xmode
      	instead of Pmode.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
      85cff6e4
    • Takayuki 'January June' Suwa's avatar
      xtensa: Add missing speed cost for TYPE_FARITH in TARGET_INSN_COST · c1d35de0
      Takayuki 'January June' Suwa authored
      According to the implemented pipeline model, this cost can be assumed to be
      1 clock cycle.
      
      gcc/ChangeLog:
      
      	* config/xtensa/xtensa.cc (xtensa_insn_cost):
      	Add a case statement for TYPE_FARITH.
      c1d35de0
    • Takayuki 'January June' Suwa's avatar
      xtensa: Fix suboptimal loading of pooled constant value into hardware single-precision FP register · fb7b8296
      Takayuki 'January June' Suwa authored
      We would like to implement the following to store a single-precision FP
      constant in a hardware FP register:
      
      - Load the bit-exact integer image of the pooled single-precision FP
        constant into an address (integer) register
      - Then, assign from that address register to a hardware single-precision
        FP register
      
      	.literal_position
      	.literal	.LC1, 0x3f800000
      ...
      	l32r	a9, .LC1
      	wfr	f0, a9
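
      The literal 0x3f800000 in the listing above is the IEEE 754
      single-precision bit pattern of 1.0f. A host-side sketch of what
      "bit-exact integer image" means, assuming a 32-bit IEEE 754 float
      (the helper names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the raw 32-bit image of a float (assumes IEEE 754 binary32).
std::uint32_t float_bits(float f)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Inverse direction: reinterpret a 32-bit image as a float, which is the
// effect of moving an integer register into an FP register unchanged.
float bits_to_float(std::uint32_t u)
{
  float f;
  std::memcpy(&f, &u, sizeof f);
  assert(float_bits(f) == u || f != f);  // round trip, NaN payloads aside
  return f;
}
```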
      
      However, it was emitted as follows:
      
      - Load the address of the FP constant entry in litpool into an address
        register
      - Then, dereference the address via that address register into a hardware
        single-precision FP register
      
      	.literal_position
      	.literal	.LC1, 0x3f800000
      	.literal	.LC2, .LC1
      ...
      	l32r	a9, .LC2
      	lsi	f0, a9, 0
      
      It is obviously inefficient to read the pool twice.
      
      gcc/ChangeLog:
      
      	* config/xtensa/xtensa.md (movsf_internal):
      	Reorder alternative that corresponds to L32R machine instruction,
      	and prefix alternatives that correspond to LSI/SSI instructions
      	with the constraint character '^' so that they are disparaged by
      	reload/LRA.
      fb7b8296
    • Takayuki 'January June' Suwa's avatar
      xtensa: Fix the regression introduced by r15-959-gbe9b3f4375e7 · 8ebb1d79
      Takayuki 'January June' Suwa authored
      It is not wrong, but also not optimal, to specify that sibcalls require
      register A0 in the RTX generation pass, because doing so misleads DFA
      into thinking A0 is being used in the function body.
      It would be better to specify it in pro_and_epilogue, as with the
      'return' insn, to avoid incorrectly removing the load that restores A0
      in subsequent passes. However, since it is not possible to modify each
      sibcall there, as a workaround we preface it with a 'use' as before.
      
      This patch effectively reverts commit r15-959-gbe9b3f4375e7.
      
      gcc/ChangeLog:
      
      	* config/xtensa/xtensa-protos.h (xtensa_expand_call):
      	Remove the third argument.
      	* config/xtensa/xtensa.cc (xtensa_expand_call):
      	Remove the third argument and the code that uses it.
      	* config/xtensa/xtensa.md (call, call_value, sibcall, sibcall_value):
      	Remove each Boolean constant specified in the third argument of
      	xtensa_expand_call.
      	(sibcall_epilogue): Add emitting '(use A0_REG)' after calling
      	xtensa_expand_epilogue.
      8ebb1d79
    • liuhongt's avatar
      Refine constraint "Bk" to define_special_memory_constraint. · bc1fda00
      liuhongt authored
      For the pattern below, RA may still allocate r162 as a v/k register and
      try to reload the address with leaq __libc_tsd_CTYPE_B@gottpoff(%rip), %rsi,
      which results in a linker error.
      
      (set (reg:DI 162)
           (mem/u/c:DI
             (const:DI (unspec:DI
      		 [(symbol_ref:DI ("a") [flags 0x60]  <var_decl 0x7f621f6e1c60 a>)]
      		 UNSPEC_GOTNTPOFF))
      
      Quote from H.J. on why the linker issues an error:
      >What do these do:
      >
      >        leaq    __libc_tsd_CTYPE_B@gottpoff(%rip), %rax
      >        vmovq   (%rax), %xmm0
      >
      >From x86-64 TLS psABI:
      >
      >The assembler generates for the x@gottpoff(%rip) expressions a
      >R_X86_64_GOTTPOFF relocation for the symbol x which requests the linker to
      >generate a GOT entry with a R_X86_64_TPOFF64 relocation. The offset of
      >the GOT entry relative to the end of the instruction is then used in
      >the instruction. The R_X86_64_TPOFF64 relocation is processed at
      >program startup time by the dynamic linker by looking up the symbol x
      >in the modules loaded at that point. The offset is written in the GOT
      >entry and later loaded by the addq instruction.
      >
      >The above code sequence looks wrong to me.
      
      gcc/ChangeLog:
      
      	PR target/116043
      	* config/i386/constraints.md (Bk): Refine to
      	define_special_memory_constraint.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/pr116043.c: New test.
      bc1fda00
    • Haochen Jiang's avatar
      i386: Add non-optimize prefetchi intrins · b4524c44
      Haochen Jiang authored
      Under -O0, with the "newly" introduced intrins, the variable will be
      transformed into a mem instead of the original symbol_ref. The compiler
      will then treat the operand as invalid and turn the operation into a nop,
      which is not expected. Use a macro for the non-optimized case to keep the
      variable as a symbol_ref, just as the prefetch intrin does.
      
      gcc/ChangeLog:
      
      	* config/i386/prfchiintrin.h
      	(_m_prefetchit0): Add macro for non-optimized option.
      	(_m_prefetchit1): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/prefetchi-1b.c: New test.
      b4524c44
    • GCC Administrator's avatar
      Daily bump. · 1caeabdb
      GCC Administrator authored
      1caeabdb
    • Takayuki 'January June' Suwa's avatar
      xtensa: Make use of scaled [U]FLOAT/TRUNC.S instructions · f9c7775f
      Takayuki 'January June' Suwa authored
      The [U]FLOAT.S machine instruction in the Xtensa ISA, which converts an
      integer to a hardware single-precision FP register, has the ability to
      divide the result by a power of two (2^0 to 2^15).
      
      Similarly, the [U]TRUNC.S instruction, which truncates single-precision
      FP to integer, can multiply the source value by a power of two in
      advance. Neither capability is currently used (the scale is always
      specified as 0, i.e. a scaling factor of 1).
      
      This patch unleashes the scaling ability of the above instructions.
      
           /* example */
           float test0(int a) {
             return a / 2.f;
           }
           float test1(unsigned int a) {
             return a / 32768.f;
           }
           int test2(float a) {
             return a * 2;
           }
           unsigned int test3(float a) {
             return a * 32768;
           }
      
           ;; before
           test0:
           	movi.n	a9, 0x3f
           	float.s	f0, a2, 0
           	slli	a9, a9, 24
           	wfr	f1, a9
           	mul.s	f0, f0, f1
           	rfr	a2, f0
           	ret.n
           test1:
           	movi.n	a9, 7
           	ufloat.s	f0, a2, 0
           	slli	a9, a9, 27
           	wfr	f1, a9
           	mul.s	f0, f0, f1
           	rfr	a2, f0
           	ret.n
           test2:
           	wfr	f1, a2
           	add.s	f0, f1, f1
           	trunc.s	a2, f0, 0
           	ret.n
           test3:
           	movi.n	a9, 0x47
           	slli	a9, a9, 24
           	wfr	f1, a2
           	wfr	f2, a9
           	mul.s	f0, f1, f2
           	utrunc.s	a2, f0, 0
           	ret.n
      
           ;; after
           test0:
           	float.s	f0, a2, 1
           	rfr	a2, f0
           	ret.n
           test1:
           	ufloat.s	f0, a2, 15
           	rfr	a2, f0
           	ret.n
           test2:
           	wfr	f0, a2
           	trunc.s	a2, f0, 1
           	ret.n
           test3:
           	wfr	f0, a2
           	utrunc.s	a2, f0, 15
           	ret.n
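
      The arithmetic identity behind the shorter sequences can be checked
      on a host. This is a sketch under the assumption of IEEE 754
      single-precision floats; the function names are hypothetical and only
      the signed variants (test0 and test2 style) are modelled. Scaling by
      a power of two is exact in binary floating point (barring overflow
      and underflow), so folding the multiply or divide into the conversion
      does not change the result.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// int -> float, then divide by 2^scale (the shape of test0 above).
float int_to_float_scaled(std::int32_t a, int scale)
{
  assert(scale >= 0 && scale <= 15);
  return std::ldexp(static_cast<float>(a), -scale);
}

// Multiply by 2^scale, then truncate toward zero (the shape of test2 above).
// The caller must keep the scaled value within the range of int32_t.
std::int32_t float_to_int_scaled(float a, int scale)
{
  assert(scale >= 0 && scale <= 15);
  return static_cast<std::int32_t>(std::ldexp(a, scale));
}
```

      For instance, int_to_float_scaled(7, 1) equals 7 / 2.f, and
      float_to_int_scaled(1.75f, 1) equals (int)(1.75f * 2).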
      
      gcc/ChangeLog:
      
      	* config/xtensa/predicates.md
      	(fix_scaling_operand, float_scaling_operand): New predicates.
      	* config/xtensa/xtensa.md
      	(any_fix/m_fix/s_fix, any_float/m_float/s_float):
      	New code iterators and their attributes.
      	(fix<s_fix>_truncsfsi2): Change from "fix_truncsfsi2".
      	(*fix<s_fix>_truncsfsi2_2x, *fix<s_fix>_truncsfsi2_scaled):
      	New insn definitions.
      	(float<s_float>sisf2): Change from "floatsisf2".
      	(*float<s_float>sisf2_scaled): New insn definition.
      f9c7775f