  1. Oct 22, 2024
    • c: Better fix for speed up compilation of large char array initializers when... · a6db5908
      Jakub Jelinek authored
      c: Better fix for speed up compilation of large char array initializers when not using #embed [PR117190]
      
      On Wed, Oct 16, 2024 at 11:09:32PM +0200, Jakub Jelinek wrote:
      > Apparently my
      > c: Speed up compilation of large char array initializers when not using #embed
      > patch broke building glibc.
      >
      > The issue is that when using CPP_EMBED, we are guaranteed by the
      > preprocessor that there is CPP_NUMBER CPP_COMMA before it and
      > CPP_COMMA CPP_NUMBER after it (or CPP_COMMA CPP_EMBED), so RAW_DATA_CST
      > never ends up at the end of arrays of unknown length.
      > Now, the c_parser_initval optimization attempted to preserve that property
      > rather than changing everything that e.g. infers array number of elements
      > from the initializer etc. to deal with RAW_DATA_CST at the end, but
      > it didn't take into account the possibility that there could be
      > CPP_COMMA followed by CPP_CLOSE_BRACE (where the CPP_COMMA is redundant).
      >
      > As we are peeking already at 4 tokens in that code, peeking more would
      > require using raw tokens and that seems to be expensive doing it for
      > every pair of tokens due to vec_free done when we are out of raw tokens.
      
      Sorry for rushing the previous patch too much, turns out I was wrong,
      given that the c_parser_peek_nth_token numbering is 1 based, we can peek
      also with c_parser_peek_nth_token (parser, 4) and the loop actually peeked
      just at 3 tokens, not 4.
      
      So, I think it is better to revert the previous patch (but keep the new
      test) and instead peek the 4th non-raw token, which is what the following
      patch does.
      
      Additionally, PR117190 shows one further spot which missed the peek of
      the token after CPP_COMMA, in case it is an incomplete array with exactly 65
      elements and a redundant comma after it, which this patch handles too.
      
      2024-10-22  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/117190
      gcc/c/
      	* c-parser.cc (c_parser_initval): Revert 2024-10-17 changes.
      	Instead peek the 4th token and if it is not CPP_NUMBER,
      	handle it like 3rd token CPP_CLOSE_BRACE for orig_len == INT_MAX.
      	Also, check (2 + 2 * i)th raw token for the orig_len == INT_MAX
      	case and punt if it is not CPP_NUMBER.
      gcc/testsuite/
      	* c-c++-common/init-5.c: New test.
      a6db5908
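To make the scenario concrete, here is a minimal sketch of the kind of initializer the message describes: an array of unknown length whose 65 elements are followed by a redundant comma. The array name and its values are made up for illustration and are not taken from init-5.c.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical data, not from the GCC testsuite. The array has no declared
// length, so the compiler infers it from the initializer.
// 5 rows x 13 values = 65 elements, then a redundant trailing comma.
const unsigned char table[] = {
   1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13,
  14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
  27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
  40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
  53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65,
};

// The trailing comma must not change the inferred element count.
static_assert(sizeof table == 65, "redundant comma adds no element");

// Number of elements the compiler inferred for the array.
std::size_t table_length() { return sizeof table / sizeof table[0]; }

// Runtime check that the last initializer landed in the last slot.
void check_table() { assert(table[table_length() - 1] == 65); }
```

Any conforming compiler should accept this; the point is only that the element count stays at 65 with or without the final comma.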
    • c-family: Fix up -Wsizeof-pointer-memaccess ICEs [PR117230] · 5fd1c0c1
      Jakub Jelinek authored
      In the following testcases, we ICE on all 4 function calls.
      The problem is using TYPE_PRECISION on vector types (though I guess it
      would be similarly problematic on structures/unions/arrays).
      The test only chooses between two suggestions: whether to supply an
      explicit size, because sizeof (*p) for {,{,un}signed }char *p is not
      very likely what the user wants, or to dereference the pointer. So I
      think limiting that suggestion to integral types is ok.
      
      2024-10-22  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/117230
      	* c-warn.cc (sizeof_pointer_memaccess_warning): Only compare
      	TYPE_PRECISION of TREE_TYPE (type) to precision of char if
      	TREE_TYPE (type) is integral type.
      
      	* c-c++-common/Wsizeof-pointer-memaccess5.c: New test.
      5fd1c0c1
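As a rough illustration of the two suggestions the warning chooses between, the sketch below shows the flagged pattern as a comment and the two usual corrections as code. The type Packet and both function names are hypothetical; the vector-type testcases from the PR are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Packet { int id; char payload[28]; };  // hypothetical type

// Pattern the warning targets (kept as a comment so this block compiles
// without diagnostics):
//   std::memset(p, 0, sizeof(p));   // size of the pointer, not of the object

// Non-char pointee: the likely intent is the pointee size, so dereference.
void clear_packet(Packet *p) {
  assert(p != nullptr);
  std::memset(p, 0, sizeof(*p));
}

// char pointee: sizeof(*buf) is 1, which is rarely the intent, so the
// caller passes the size explicitly.
void clear_buffer(char *buf, std::size_t len) {
  assert(buf != nullptr);
  std::memset(buf, 0, len);
}

// Exercises both corrected forms and reports whether everything was cleared.
bool demo() {
  Packet pkt = {42, "data"};
  clear_packet(&pkt);
  char buf[16] = "not empty";
  clear_buffer(buf, sizeof buf);
  return pkt.id == 0 && pkt.payload[0] == 0 && buf[0] == 0 && buf[15] == 0;
}
```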
    • varasm: Handle RAW_DATA_CST in compare_constant [PR117199] · f616bc41
      Jakub Jelinek authored
      On the following testcase without LTO we unnecessarily don't merge
      two identical .LC* constants (constant hashing computes the same hash,
      but as compare_constant returned false for the RAW_DATA_CST in it,
      it never compares equal), and with LTO fails to link because LTO assumes such
      constants have to be merged and so doesn't emit the other constant.
      
      2024-10-22  Jakub Jelinek  <jakub@redhat.com>
      
      	PR middle-end/117199
      	* varasm.cc (compare_constant): Handle RAW_DATA_CST.  Formatting fix
      	in the STRING_CST case.
      
      	* gcc.dg/lto/pr117199_0.c: New test.
      f616bc41
    • varasm: Fix up RAW_DATA_CST handling in array_size_for_constructor [PR117190] · 8f173da4
      Jakub Jelinek authored
      CONSTRUCTOR indices for arrays have bitsize type, and the r15-4375
      patch actually got it right in 6 other spots, but not in this function,
      where it used size_int rather than bitsize_int and so size_binop can ICE
      on type mismatch.
      
      This is covered by the init-5.c testcase I've just posted, though the ICE
      goes away when the C FE is fixed (and when it is not, there is another
      ICE).
      
      2024-10-22  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/117190
      	* varasm.cc (array_size_for_constructor): For RAW_DATA_CST,
      	use bitsize_int rather than size_int.
      8f173da4
    • GCN: Initial generic-target handling, add more GCN macro defines · 1bdeebe6
      Tobias Burnus authored
      Newer llvm-mc assemblers support the gfx*-generic targets, which permit
      generating code for all GPUs belonging to the same generation, even if
      the code is not optimal. This requires LLVM 19.
      
      This patch adds the compiler-side support for generic gfx and also
      adds -march=gfx10-3-generic and -march=gfx11-generic. However, those
      -march= values are neither documented nor used anywhere yet.
      
      Disclaimer: Not tested (as my ROCm does not support it); additionally,
      libgomp/plugin/plugin-gcn.c has to be updated before it becomes useful.
      
      For better compatibility with LLVM's Clang, this commit additionally adds
      the macro definitions __GFX<9|10|11>__ for the architecture family,
      __AMDGPU__ besides the existing __AMDGCN__ and the two strings-containing
      macros __amdgcn_processor__ and __amdgcn_target_id__, where the former has
      '-' replaced by '_' but otherwise both contain the lower case name. For the
      new generic targets, the same happens, yielding, e.g., __gfx10_3_generic__.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn-devices.def: Add generic version/flag as additional
      	value and architecture family entry; update; add gfx10-3-generic
      	and gfx11-generic.
      	* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove
      	(ASM_SPEC): Use generated ABI_VERSION_OPT instead.
      	* config/gcn/gcn-tables.opt: Regenerate
      	* config/gcn/gcn.h (gcn_device_def): Add generic_version and
      	arch_family members.
      	(TARGET_CPU_CPP_BUILTINS): Fix allocation bug, handle '-' in the
      	name and add additional macro defines.
      	* config/gcn/gcn.cc (gcn_devices): Handle it.
      	* config/gcn/gen-gcn-device-macros.awk: Likewise; use ELF name
      	for the macro name; generate ABI_VERSION_OPT.
      	* config/gcn/mkoffload.cc (ELFABIVERSION_AMDGPU_HSA_V6,
      	EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET,
      	GET_GENERIC_VERSION, SET_GENERIC_VERSION): Define.
      	(get_arch): Call SET_GENERIC_VERSION flag on elf_flags.
      	(copy_early_debug_info): If the arch sets the generic version,
      	use ELFABIVERSION_AMDGPU_HSA_V6.
      1bdeebe6
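The naming rule and the family macros described in this message can be sketched as follows. This is an illustration only: the helper names are hypothetical, and wrapping the device name in double underscores is inferred from the single example in the text (__gfx10_3_generic__).

```cpp
#include <cassert>
#include <string>

// Sketch of the naming rule: '-' in the device name becomes '_' in the
// macro name. The "__...__" wrapping is an assumption based on the one
// example given in the message.
std::string device_macro_name(const std::string &device) {
  assert(!device.empty());
  std::string name = "__";
  for (char c : device)
    name += (c == '-') ? '_' : c;
  name += "__";
  return name;
}

// Architecture family as seen from user code. Only macro names listed in
// the message are tested; on a non-GCN compiler none of them is defined
// and the function returns 0.
int gcn_arch_family() {
#if defined(__GFX11__)
  return 11;
#elif defined(__GFX10__)
  return 10;
#elif defined(__GFX9__)
  return 9;
#else
  return 0;
#endif
}
```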
    • testsuite: arm: Use check-function-bodies in fp16-aapcs-* tests · 205515da
      Torbjörn SVENSSON authored
      
      Converted the tests to use check-function-bodies in order to ensure that
      the sequence is correct.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/arm/fp16-aapcs-1.c: Use check-function-bodies.
      	* gcc.target/arm/fp16-aapcs-2.c: Likewise.
      	* gcc.target/arm/fp16-aapcs-3.c: Likewise.
      	* gcc.target/arm/fp16-aapcs-4.c: Likewise.
      
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
      205515da
    • testsuite: arm: Relax expected asm in bitfield* and union-2 tests · a79ca49b
      Torbjörn SVENSSON authored
      
      Below -O2, lsls/lsrs are preferred. For -O2 and above, lsl/lsr are
      preferred.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Allow lsl and
      	lsr instructions.
      	* gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/union-2.c: Likewise.
      
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
      a79ca49b
    • testsuite: arm: Use check-function-bodies in cmse-5 tests · 835ad52f
      Torbjörn SVENSSON authored
      
      Converted the tests to use check-function-bodies in order to ensure that
      the sequence is correct.
      This also allows both APSR_nzcvq and APSR_nzcvqg, as the target selector
      does not work when -march and/or -mcpu overrides the target to test.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/arm/cmse/mainline/8m/hard-sp/cmse-5.c: Use
      	check-function-bodies.
      	* gcc.target/arm/cmse/mainline/8m/hard/cmse-5.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8m/soft/cmse-5.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8m/softfp-sp/cmse-5.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8m/softfp/cmse-5.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-5.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/hard/cmse-5.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/soft/cmse-5.c: Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-5.c:
      	Likewise.
      	* gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-5.c: Likewise.
      
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
      835ad52f
    • libstdc++: Avoid using std::__to_address with iterators · 85e5b80e
      Jonathan Wakely authored
      In r12-3935-g82626be2d633a9 I added the partial specialization
      std::pointer_traits<__normal_iterator<It, Cont>> so that __to_address
      would work with __normal_iterator objects. Soon after that, François
      replaced it in r12-6004-g807ad4bc854cae with an overload of __to_address
      that served the same purpose, but was less complicated and less wrong.
      
      I now think that both commits were mistakes, and that instead of adding
      hacks to make __normal_iterator work with __to_address, we should not be
      using __to_address with iterators at all before C++20.
      
      The pre-C++20 std::__to_address function should only be used with
      pointer-like types, specifically allocator_traits<A>::pointer types.
      Those pointer-like types are guaranteed to be contiguous iterators, so
      that getting a raw memory address from them is OK.
      
      For arbitrary iterators, even random access iterators, we don't know
      that it's safe to lower the iterator to a pointer e.g. for std::deque
      iterators it's not, because (it + n) == (std::to_address(it) + n) only
      holds within the same block of the deque's storage.
      
      For C++20, std::to_address does work correctly for contiguous iterators,
      including __normal_iterator, and __to_address just calls std::to_address
      so also works. But we have to be sure we have an iterator that satisfies
      the std::contiguous_iterator concept for it to be safe, and we can't
      check that before C++20.
      
      So for pre-C++20 code the correct way to handle iterators that might be
      pointers or might be __normal_iterator is to call __niter_base, and if
      necessary use is_pointer to check whether __niter_base returned a real
      pointer.
      
      We currently have some uses of std::__to_address with iterators where
      we've checked that they're either pointers, or __normal_iterator
      wrappers around pointers, or satisfy std::contiguous_iterator. But this
      seems a little fragile, and it would be better to just use
      std::__niter_base for the pointers and __normal_iterator cases, and use
      C++20 std::to_address when the C++20 std::contiguous_iterator concept is
      satisfied. This patch does that.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/basic_string.h (basic_string::assign): Replace
      	use of __to_address with __niter_base or std::to_address as
      	appropriate.
      	* include/bits/ptr_traits.h (__to_address): Add comment.
      	* include/bits/shared_ptr_base.h (__shared_ptr): Qualify calls
      	to __to_address.
      	* include/bits/stl_algo.h (find): Replace use of __to_address
      	with __niter_base or std::to_address as appropriate. Only use
      	either of them when the range is not empty.
      	* include/bits/stl_iterator.h (__to_address): Remove overload
      	for __normal_iterator.
      	* include/debug/safe_iterator.h (__to_address): Remove overload
      	for _Safe_iterator.
      	* include/std/ranges (views::counted): Replace use of
      	__to_address with std::to_address.
      	* testsuite/24_iterators/normal_iterator/to_address.cc: Removed.
      85e5b80e
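The point about std::deque can be checked from user code without touching any libstdc++ internals. The sketch below is an illustration under that restriction: it compares element addresses as integers to see whether index arithmetic and address arithmetic agree. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Returns true if element i lives exactly i * sizeof(value_type) bytes after
// element 0 for every i, i.e. if index arithmetic and address arithmetic
// agree. Addresses are compared as integers so that no out-of-bounds pointer
// is ever formed.
template <typename Container>
bool addresses_follow_indices(const Container &c) {
  if (c.empty())
    return true;
  const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&c[0]);
  for (std::size_t i = 0; i < c.size(); ++i) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(&c[i]);
    if (addr != base + i * sizeof(typename Container::value_type))
      return false;
  }
  return true;
}

// Always true for std::vector, whose storage is one contiguous block.
bool vector_is_flat() {
  return addresses_follow_indices(std::vector<int>(100000, 1));
}

// Typically false for a large std::deque, whose storage is split into
// blocks; the result depends on the implementation, so it is not asserted.
bool deque_is_flat() {
  return addresses_follow_indices(std::deque<int>(100000, 1));
}

// The vector case is guaranteed by the standard, so it can be asserted.
void check_vector() { assert(vector_is_flat()); }
```

Calling deque_is_flat() on a given implementation shows whether lowering a deque iterator to a pointer and then adding n would have been safe there, which is exactly what cannot be assumed in general.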
    • testsuite: Add test directive checking removal of link_error · bf11ecbb
      Jennifer Schmitz authored
      
      This test needs a directive checking the removal of the link_error.
      Committed as obvious.
      
      Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
      
      gcc/testsuite/
      	* gcc.dg/tree-ssa/log_ident.c: Add scan for removal of
      	link_error in optimized tree dump.
      bf11ecbb
    • c++: redundant hashing in register_specialization · ae614b8a
      Patrick Palka authored
      
      After r15-4050-g5dad738c1dd164 register_specialization needs to set
      elt.hash to the (maybe) precomputed hash so that the lookup uses it
      rather than redundantly computing it from scratch.
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (register_specialization): Set elt.hash.
      
      Reviewed-by: Jason Merrill <jason@redhat.com>
      ae614b8a
    • testsuite: Skip pr112305.c for -O[01] on simulators · 4e80432c
      Richard Sandiford authored
      gcc.dg/torture/pr112305.c contains an inner loop that executes
      0x8000_0014 times and an outer loop that executes 5 times, giving about
      10 billion total executions of the inner loop body.  At -O2 and above we
      are able to remove the inner loop, but at -O1 we keep a no-op loop:
      
              dls     lr, r3
      .L3:
              subs    r3, r3, #1
              le      lr, .L3
      
      and at -O0 we of course don't optimise.
      
      This can lead to long execution times on simulators, possibly
      triggering a timeout.
      
      gcc/testsuite
      	* gcc.dg/torture/pr112305.c: Skip at -O0 and -O1 for simulators.
      4e80432c
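The iteration count quoted above can be worked out directly. The sketch below uses only the two figures from the message; the simulator speed passed to seconds_at is a made-up parameter for illustration, not a measured value.

```cpp
#include <cassert>
#include <cstdint>

// Figures taken from the description above.
constexpr std::uint64_t inner_iterations = 0x80000014ULL;  // 2147483668
constexpr std::uint64_t outer_iterations = 5;

// Total executions of the inner loop body.
constexpr std::uint64_t total_iterations() {
  return inner_iterations * outer_iterations;
}

static_assert(total_iterations() == 10737418340ULL, "about 10.7 billion");

// Rough run time for a hypothetical simulator speed, in iterations per second.
double seconds_at(double iterations_per_second) {
  assert(iterations_per_second > 0);
  return static_cast<double>(total_iterations()) / iterations_per_second;
}
```

At an assumed 10 million loop iterations per second, for example, the no-op loop alone would take roughly 18 minutes.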
    • c++/modules: Handle forward-declared class types · 9f9afc65
      Nathaniel Shead authored
      
      In some cases we can access members of a namespace-scope class without
      ever having performed name-lookup on it; this can occur when a
      forward-declaration of the class is used as a return type, for
      instance, or with PIMPL.
      
      One possible approach would be to do name lookup in complete_type to
      force lazy loading to occur, but this seems overly expensive for a
      relatively rare case.  Instead, this patch generalises the existing
      pending-entity support to handle this case as well.
      
      Unfortunately this does mean that almost every class definition will be
      added to the pending-entity table, and almost always unnecessarily, but
      I don't see a good way to avoid this.
      
      gcc/cp/ChangeLog:
      
      	* module.cc (depset::DB_IS_MEMBER_BIT): Rename to...
      	(depset::DB_IS_PENDING_BIT): ...this.
      	(depset::is_member): Remove.
      	(depset::is_pending_entity): New function.
      	(depset::hash::make_dependency): Mark definitions of
      	namespace-scope types as maybe-pending entities.
      	(depset::hash::add_class_entities): Rename DB_IS_MEMBER_BIT to
      	DB_IS_PENDING_BIT.
      	(depset::hash::find_dependencies): Use is_pending_entity
      	instead of is_member.
      	(module_state::write_pendings): Likewise; adjust comment.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/modules/inst-4_b.C: Adjust pending-entity count.
      	* g++.dg/modules/member-def-1_c.C: Likewise.
      	* g++.dg/modules/member-def-2_c.C: Likewise.
      	* g++.dg/modules/tpl-spec-3_b.C: Likewise.
      	* g++.dg/modules/tpl-spec-4_b.C: Likewise.
      	* g++.dg/modules/tpl-spec-5_b.C: Likewise.
      	* g++.dg/modules/class-9_a.H: New test.
      	* g++.dg/modules/class-9_b.H: New test.
      	* g++.dg/modules/class-9_c.C: New test.
      
      Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
      Reviewed-by: Jason Merrill <jason@redhat.com>
      9f9afc65
    • tree-optimization/117254 - ICE with access diagnostics · d464a52d
      Richard Biener authored
      The diagnostics code fails to handle non-constant domain max.
      
      	PR tree-optimization/117254
      	* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg):
      	Check the array domain max is constant before using it.
      
      	* gcc.dg/pr117254.c: New testcase.
      d464a52d
    • amdgcn: Refactor device settings into a def file · a6b26e5e
      Andrew Stubbs authored
      
      Almost all device-specific settings are now centralised into gcn-devices.def
      for the compiler, mkoffload, and libgomp.  No longer will we have to touch 10
      files in multiple places just to add another device without any exotic
      features.  (New ISAs and devices with incompatible metadata will continue to
      need a bit more.)
      
      In order to remove the device-specific conditionals in the code a new value
      HSACO_ATTR_UNSUPPORTED has been added, indicating that the assembler will
      reject any setting of that option.
      
      This incorporates some of Tobias's patch from March 2024.
      
      Co-Authored-By: Tobias Burnus <tburnus@baylibre.com>
      
      gcc/ChangeLog:
      
      	* config.gcc (amdgcn): Add gcn-device-macros.h to tm_file.
      	Add gcn-tables.opt to extra_options.
      	* config/gcn/gcn-hsa.h (NO_XNACK): Delete.
      	(NO_SRAM_ECC): Delete.
      	(SRAMOPT): Move definition to generated file gcn-device-macros.h.
      	(XNACKOPT): Likewise.
      	(ASM_SPEC): Redefine using generated values from gcn-device-macros.h.
      	* config/gcn/gcn-opts.h
      	(enum processor_type): Generate from gcn-devices.def.
      	(TARGET_VEGA10): Delete.
      	(TARGET_VEGA20): Delete.
      	(TARGET_GFX908): Delete.
      	(TARGET_GFX90a): Delete.
      	(TARGET_GFX90c): Delete.
      	(TARGET_GFX1030): Delete.
      	(TARGET_GFX1036): Delete.
      	(TARGET_GFX1100): Delete.
      	(TARGET_GFX1103): Delete.
      	(TARGET_XNACK): Redefine to allow for HSACO_ATTR_UNSUPPORTED.
      	(enum hsaco_attr_type): Add HSACO_ATTR_UNSUPPORTED.
      	(TARGET_TGSPLIT): New define.
      	* config/gcn/gcn.cc (gcn_devices): New constant table.
      	(gcn_option_override): Rework to use gcn_devices table.
      	(gcn_omp_device_kind_arch_isa): Likewise.
      	(output_file_start): Likewise.
      	(gcn_hsa_declare_function_name): Rework using TARGET_* macros.
      	* config/gcn/gcn.h (gcn_devices): Declare struct and table.
      	(TARGET_CPU_CPP_BUILTINS): Rework using gcn_devices.
      	* config/gcn/gcn.opt: Move enum data to generated file gcn-tables.opt.
      	Use new names for the default values.
      	* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX900): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX906): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX908): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX90a): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX90c): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX1030): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX1036): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX1100): Delete.
      	(EF_AMDGPU_MACH_AMDGCN_GFX1103): Delete.
      	(enum elf_arch_code): Define using gcn-devices.def.
      	(get_arch): Rework using gcn-devices.def.
      	(main): Rework using gcn-devices.def
      	* config/gcn/t-gcn-hsa (gcn-tables.opt): Generate file.
      	(gcn-device-macros.h): Generate file.
      	* config/gcn/t-omp-device: Generate isa list from gcn-devices.def.
      	* config/gcn/gcn-devices.def: New file.
      	* config/gcn/gcn-tables.opt: New file.
      	* config/gcn/gcn-tables.opt.urls: New file.
      	* config/gcn/gen-gcn-device-macros.awk: New file.
      	* config/gcn/gen-opt-tables.awk: New file.
      
      libgomp/ChangeLog:
      
      	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Generate from gcn-devices.def.
      	(gcn_gfx803_s): Delete.
      	(gcn_gfx900_s): Delete.
      	(gcn_gfx906_s): Delete.
      	(gcn_gfx908_s): Delete.
      	(gcn_gfx90a_s): Delete.
      	(gcn_gfx90c_s): Delete.
      	(gcn_gfx1030_s): Delete.
      	(gcn_gfx1036_s): Delete.
      	(gcn_gfx1100_s): Delete.
      	(gcn_gfx1103_s): Delete.
      	(gcn_isa_name_len): Delete.
      	(isa_hsa_name): Rename ...
      	(isa_name): ... to this, and rework using gcn-devices.def.
      	(isa_gcc_name): Delete.
      	(isa_code): Rework using gcn-devices.def.
      	(max_isa_vgprs): Rework using gcn-devices.def.
      	(isa_matches_agent): Update isa_name usage.
      	(GOMP_OFFLOAD_init_device): Improve diagnostic using the name.
      a6b26e5e
    • tree-optimization/117123 - missed PHI equivalence in VN · c33d8c55
      Richard Biener authored
      Value-numbering can use its set of equivalences to prove that
      a PHI node with args <a_1, 5, 10> is equal to a_1 iff a_1 == 5 and
      a_1 == 10 hold on the respective edges carrying the constants.  This
      breaks down when the order of PHI args is <5, 10, a_1>, as then
      we drop to VARYING early.  The following mitigates this by
      shuffling a copy of the edge vector to always process an SSA name
      argument first, which should also handle the special case of
      a two-argument <5, a_1> PHI we already had.
      
      	PR tree-optimization/117123
      	* tree-ssa-sccvn.cc (visit_phi): First process a non-constant
      	argument edge to handle more equivalences.  Remove the
      	two-arg special case.
      
      	* g++.dg/tree-ssa/pr117123.C: New testcase.
      c33d8c55
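A source-level shape that leads to such a PHI might look like the sketch below. It is a hypothetical example, not the pr117123.C testcase, and the order in which the compiler lists the PHI arguments is not something the source controls.

```cpp
#include <cassert>

// Hypothetical source shape: at the join point the result is a PHI whose
// arguments are 5, 10 and a. On the edge carrying 5 we know a == 5, and on
// the edge carrying 10 we know a == 10, so the PHI is equivalent to a.
int select_value(int a) {
  int r;
  if (a == 5)
    r = 5;
  else if (a == 10)
    r = 10;
  else
    r = a;
  return r;  // same value as a on every path
}

// Confirms the equivalence over a small range that includes both constants.
void check_equivalence() {
  for (int a = -20; a <= 20; ++a)
    assert(select_value(a) == a);
}
```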
    • testsuite: Fix typo in ext-floating19.C · 9263523b
      Stefan Schulze Frielinghaus authored
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp23/ext-floating19.C: Fix typo for bfloat16 guard.
      9263523b
    • RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = 1. · adf4ece4
      xuli authored
      
      Form 1:
       #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
      T __attribute__((noinline))             \
      sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
      {                                       \
        return (T)IMM >= y ? (T)IMM - y : 0;  \
      }
      
      Passed the rv64gcv regression test.
      
      Change-Id: I8805225b445cdbbc685f4f54a4d66c7ee8f748e1
      Signed-off-by: Li Xu <xuli1@eswincomputing.com>
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/sat_u_sub_imm-1_4.c: New test.
      	* gcc.target/riscv/sat_u_sub_imm-2_4.c: New test.
      	* gcc.target/riscv/sat_u_sub_imm-3_4.c: New test.
      	* gcc.target/riscv/sat_u_sub_imm-4_2.c: New test.
      adf4ece4
    • Match: Support IMM=1 for unsigned scalar .SAT_SUB IMM form 1 · 4e65e12a
      xuli authored
      
      This patch adds support for .SAT_SUB when one of the operands
      is the immediate IMM = 1 in form 1.
      
      Form 1:
       #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
       T __attribute__((noinline))             \
       sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
       {                                       \
         return IMM >= y ? IMM - y : 0;        \
       }
      
      Take below form 1 as example:
      DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 1)
      
      Before this patch:
      __attribute__((noinline))
      uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
      {
        uint8_t _1;
        uint8_t _3;
      
        <bb 2> [local count: 1073741824]:
        if (y_2(D) <= 1)
          goto <bb 3>; [41.00%]
        else
          goto <bb 4>; [59.00%]
      
        <bb 3> [local count: 440234144]:
        _3 = y_2(D) ^ 1;
      
        <bb 4> [local count: 1073741824]:
        # _1 = PHI <0(2), _3(3)>
        return _1;
      
      }
      
      After this patch:
      __attribute__((noinline))
      uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
      {
        uint8_t _1;
      
      ;;   basic block 2, loop depth 0
      ;;    pred:       ENTRY
        _1 = .SAT_SUB (1, y_2(D)); [tail call]
        return _1;
      ;;    succ:       EXIT
      
      }
      
      The following test suites pass with this patch:
      1. The full rv64gcv regression tests.
      2. The x86 bootstrap tests.
      3. The full x86 regression tests.
      
      Signed-off-by: Li Xu <xuli1@eswincomputing.com>
      gcc/ChangeLog:
      
      	* match.pd: Support IMM=1.
      4e65e12a
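The "before" dump above computes y ^ 1 where the source says 1 - y. That the two agree can be checked exhaustively for uint8_t; the sketch below writes both shapes out by hand, with hypothetical function names.

```cpp
#include <cassert>
#include <cstdint>

// Form 1 with T = uint8_t and IMM = 1, written out by hand.
std::uint8_t sat_u_sub_imm1(std::uint8_t y) {
  return static_cast<std::uint8_t>(1 >= y ? 1 - y : 0);
}

// Shape of the "before" dump: for y <= 1 the subtraction 1 - y is
// expressed as y ^ 1; for every other y the result is 0.
std::uint8_t before_patch_shape(std::uint8_t y) {
  return static_cast<std::uint8_t>(y <= 1 ? y ^ 1 : 0);
}

// Exhaustive comparison over all 256 inputs.
bool shapes_agree() {
  for (int y = 0; y <= 255; ++y) {
    const std::uint8_t v = static_cast<std::uint8_t>(y);
    if (sat_u_sub_imm1(v) != before_patch_shape(v))
      return false;
  }
  return true;
}

// Spot checks at the interesting inputs.
void check_imm1() {
  assert(sat_u_sub_imm1(0) == 1);
  assert(sat_u_sub_imm1(1) == 0);
  assert(sat_u_sub_imm1(2) == 0);
}
```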
    • RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = max -1. · 93b6f287
      xuli authored
      
      Form 1:
       #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
      T __attribute__((noinline))             \
      sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
      {                                       \
        return (T)IMM >= y ? (T)IMM - y : 0;  \
      }
      
      Passed the rv64gcv regression test.
      
      Change-Id: Idaa1ab41f2a5785112279ea8ee2c93236457b740
      Signed-off-by: Li Xu <xuli1@eswincomputing.com>
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/sat_u_sub_imm-1_3.c: New test.
      	* gcc.target/riscv/sat_u_sub_imm-2_3.c: New test.
      	* gcc.target/riscv/sat_u_sub_imm-3_3.c: New test.
      	* gcc.target/riscv/sat_u_sub_imm-4_1.c: New test.
      93b6f287
    • Match: Support IMM=max-1 for unsigned scalar .SAT_SUB IMM form 1 · 1dccec47
      xuli authored
      
      This patch adds support for .SAT_SUB when one of the operands
      is the immediate IMM = max - 1 in form 1.
      
      Form 1:
       #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
       T __attribute__((noinline))             \
       sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
       {                                       \
         return IMM >= y ? IMM - y : 0;        \
       }
      
      Take below form 1 as example:
      DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 254)
      
      Before this patch:
      __attribute__((noinline))
      uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
      {
        uint8_t _1;
        uint8_t _3;
      
        <bb 2> [local count: 1073741824]:
        if (y_2(D) != 255)
          goto <bb 3>; [66.00%]
        else
          goto <bb 4>; [34.00%]
      
        <bb 3> [local count: 708669600]:
        _3 = 254 - y_2(D);
      
        <bb 4> [local count: 1073741824]:
        # _1 = PHI <0(2), _3(3)>
        return _1;
      
      }
      
      After this patch:
      __attribute__((noinline))
      uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
      {
        uint8_t _1;
      
        <bb 2> [local count: 1073741824]:
        _1 = .SAT_SUB (254, y_2(D)); [tail call]
        return _1;
      
      }
      
      The following test suites pass with this patch:
      1. The full rv64gcv regression tests.
      2. The x86 bootstrap tests.
      3. The full x86 regression tests.
      
      Signed-off-by: Li Xu <xuli1@eswincomputing.com>
      
      gcc/ChangeLog:
      
      	* match.pd: Support IMM=max-1.
      1dccec47
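For IMM = 254 the "before" dump tests y != 255 instead of 254 >= y, since 255 is the only uint8_t input for which 254 - y would go below zero. A small sketch, with hypothetical function names, that compares this shape against a generic saturating subtraction for every input:

```cpp
#include <cassert>
#include <cstdint>

// Generic unsigned saturating subtraction: x - y, clamped at 0.
std::uint8_t sat_sub_u8(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>(x >= y ? x - y : 0);
}

// Shape of the "before" dump for IMM = 254.
std::uint8_t before_patch_shape_254(std::uint8_t y) {
  return static_cast<std::uint8_t>(y != 255 ? 254 - y : 0);
}

// Exhaustive comparison over all 256 inputs.
bool matches_sat_sub_254() {
  for (int y = 0; y <= 255; ++y) {
    const std::uint8_t v = static_cast<std::uint8_t>(y);
    if (before_patch_shape_254(v) != sat_sub_u8(254, v))
      return false;
  }
  return true;
}

// Spot checks around the boundary.
void check_imm254() {
  assert(sat_sub_u8(254, 255) == 0);
  assert(sat_sub_u8(254, 254) == 0);
  assert(sat_sub_u8(254, 253) == 1);
}
```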
    • Daily bump. · 52cc5f04
      GCC Administrator authored
      52cc5f04
  2. Oct 21, 2024
    • [committed][PR rtl-optimization/116488] Fix SIGN_EXTEND source handling in ext-dce · 36e91df7
      Jeff Law authored
      A while back I noticed that the code to call carry_backpropagate was being
      called after the optimization step.  Which seemed wrong, but at the time I
      didn't have a testcase showing it as a problem.  Now I have 4 :-)
      
      The way things used to work, the extension would be stripped away before
      calling carry_backpropagate, meaning carry_backpropagate would never see a
      SIGN_EXTEND.  Thus the code trying to account for the sign extended bit was
      never reached.
      
      Getting that bit marked live is what's needed to fix these testcases. Fallout
      is minor with just an adjustment needed to sensibly deal with vector modes in a
      place where we didn't have them before.
      
      I'm still somewhat concerned about this code.  Specifically whether or not we
      can get in here with arbitrarily complex RTL, and if so do we need to recurse
      down and look at those sub-expressions.
      
      So while this patch fixes the most pressing issue, I wouldn't be terribly
      surprised if we're back inside this code at some point.
      
      Bootstrapped and regression tested on x86_64, ppc64le, riscv64, s390x, mips64,
      loongarch, aarch64, m68k, alpha, hppa, sh4, sh4eb, perhaps something else that
      I've forgotten...  Also tested on all the crosses in my tester.
      
      	PR rtl-optimization/116488
      	PR rtl-optimization/116579
      	PR rtl-optimization/116915
      	PR rtl-optimization/117226
      gcc/
      	* ext-dce.cc (carry_backpropagate): Properly handle SIGN_EXTEND, add
      	ZERO_EXTEND handling as well.
      	(ext_dce_process_uses): Call carry_backpropagate before the optimization
      	step.
      
      gcc/testsuite/
      	* gcc.dg/torture/pr116488.c: New test.
      	* gcc.dg/torture/pr116579.c: New test.
      	* gcc.dg/torture/pr116915.c: New test.
      	* gcc.dg/torture/pr117226.c: New test.
      36e91df7
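Why the sign bit has to be marked live can be shown with a simplified bit-mask model. This is not the code in ext-dce.cc, only a sketch of the idea under the assumption that liveness is tracked as a 64-bit mask; the function name and parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model, not the GCC implementation: given the bits of a
// sign-extended result that are live, return the source bits that must be
// live. src_bits is the width of the value being extended (1..63).
std::uint64_t live_source_bits(std::uint64_t live_result_bits,
                               unsigned src_bits) {
  assert(src_bits >= 1 && src_bits <= 63);
  const std::uint64_t src_mask = (std::uint64_t{1} << src_bits) - 1;
  const std::uint64_t sign_bit = std::uint64_t{1} << (src_bits - 1);
  // Low result bits map one-to-one onto source bits.
  std::uint64_t live = live_result_bits & src_mask;
  // Every result bit above the source width is a copy of the source sign
  // bit, so if any of them is live the sign bit is live too.
  if (live_result_bits & ~src_mask)
    live |= sign_bit;
  return live;
}
```

In this model, a use that only reads bits 8 to 15 of a value sign-extended from 8 bits still keeps bit 7 of the source alive, which is the case the old ordering never reached.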
    • RISC-V: Add testcases for form 8 of vector signed SAT_TRUNC · cb131a40
      Pan Li authored
      
      Form 8:
        #define DEF_VEC_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_8 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN >= x || x >= (WT)NT_MAX                     \
                ? x < 0 ? NT_MIN : NT_MAX                                     \
                : trunc;                                                      \
            }                                                                 \
        }
      
      The following test passes with this patch.
      * The full rv64gcv regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-8-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-8-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Add testcases for form 7 of vector signed SAT_TRUNC · f1388068
      Pan Li authored
      
      Form 7:
        #define DEF_VEC_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_7 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN > x || x >= (WT)NT_MAX                      \
      	  ? x < 0 ? NT_MIN : NT_MAX                                     \
      	  : trunc;                                                      \
            }                                                                 \
        }
      
      The tests below passed for this patch.
      * The rv64gcv full regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-7-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-7-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Add testcases for form 6 of vector signed SAT_TRUNC · f411abe7
      Pan Li authored
      
      Form 6:
        #define DEF_VEC_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_6 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN >= x || x > (WT)NT_MAX                      \
      	  ? x < 0 ? NT_MIN : NT_MAX                                     \
      	  : trunc;                                                      \
            }                                                                 \
        }
      
      The tests below passed for this patch.
      * The rv64gcv full regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-6-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-6-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Add testcases for form 5 of vector signed SAT_TRUNC · 108c8ef0
      Pan Li authored
      
      Form 5:
        #define DEF_VEC_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_5 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN > x || x > (WT)NT_MAX                       \
      	  ? x < 0 ? NT_MIN : NT_MAX                                     \
      	  : trunc;                                                      \
            }                                                                 \
        }
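Forms 5 to 8 test the out-of-range condition first, while forms 1 to 4 test the in-range condition; the strict and non-strict comparisons differ only at the boundary values, where truncation and saturation give the same result. The sketch below checks this exhaustively for an int16_t to int8_t narrowing using scalar versions of forms 1, 5 and 8. The function names are hypothetical and the scalar bodies are a simplification of the vector loops.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar body of form 1: in-range test with <= on both sides.  */
int8_t
form1 (int16_t x)
{
  int8_t trunc = (int8_t)x;
  return (int16_t)INT8_MIN <= x && x <= (int16_t)INT8_MAX
    ? trunc
    : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Scalar body of form 5: out-of-range test with > on both sides.  */
int8_t
form5 (int16_t x)
{
  int8_t trunc = (int8_t)x;
  return (int16_t)INT8_MIN > x || x > (int16_t)INT8_MAX
    ? x < 0 ? INT8_MIN : INT8_MAX
    : trunc;
}

/* Scalar body of form 8: out-of-range test with >= on both sides.  */
int8_t
form8 (int16_t x)
{
  int8_t trunc = (int8_t)x;
  return (int16_t)INT8_MIN >= x || x >= (int16_t)INT8_MAX
    ? x < 0 ? INT8_MIN : INT8_MAX
    : trunc;
}

/* Count the int16_t inputs on which the three forms disagree.  */
int
count_form_mismatches (void)
{
  int mismatches = 0;
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    {
      int16_t x = (int16_t)v;
      int8_t expected = form1 (x);
      if (form5 (x) != expected || form8 (x) != expected)
        mismatches++;
    }
  return mismatches;
}
```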
      
      The tests below passed for this patch.
      * The rv64gcv full regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-5-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-5-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Add testcases for form 4 of vector signed SAT_TRUNC · f30ca986
      Pan Li authored
      
      Form 4:
        #define DEF_VEC_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN <= x && x < (WT)NT_MAX                      \
      	  ? trunc                                                       \
      	  : x < 0 ? NT_MIN : NT_MAX;                                    \
            }                                                                 \
        }
      
      The tests below passed for this patch.
      * The rv64gcv full regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-4-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-4-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Add testcases for form 3 of vector signed SAT_TRUNC · efa1617b
      Pan Li authored
      
      Form 3:
        #define DEF_VEC_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_3 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX                       \
      	  ? trunc                                                       \
      	  : x < 0 ? NT_MIN : NT_MAX;                                    \
            }                                                                 \
        }
      
      The tests below passed for this patch.
      * The rv64gcv full regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-3-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-3-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Add testcases for form 2 of vector signed SAT_TRUNC · 033900fc
      Pan Li authored
      
      Form 2:
        #define DEF_VEC_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_2 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN < x && x < (WT)NT_MAX                       \
      	  ? trunc                                                       \
      	  : x < 0 ? NT_MIN : NT_MAX;                                    \
            }                                                                 \
        }
      
      The tests below passed for this patch.
      * The rv64gcv full regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-2-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-2-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Add testcases for form 1 of vector signed SAT_TRUNC · 1f3a9c08
      Pan Li authored
      
      Form 1:
        #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
       	  ? trunc                                                       \
      	  : x < 0 ? NT_MIN : NT_MAX;                                    \
            }                                                                 \
        }
      
      The tests below passed for this patch.
      * The rv64gcv full regression test.
      
      This is a test-only patch and obvious up to a point; I will commit it
      directly if there are no comments in the next 48 hours.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_data.h: Add test data for
      	signed SAT_TRUNC.
      	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-1-i64-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i16-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i32-to-i8.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i16.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i32.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/vec_sat_s_trunc-run-1-i64-to-i8.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • RISC-V: Implement vector SAT_TRUNC for signed integer · b5a05815
      Pan Li authored
      
      This patch implements sstrunc for vector signed integers.
      
      Form 1:
        #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
      	  ? trunc                                                       \
      	  : x < 0 ? NT_MIN : NT_MAX;                                    \
            }                                                                 \
        }
      
      DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)
      
      Before this patch:
        vsetvli a5,a2,e64,m1,ta,ma
        vle64.v v1,0(a1)
        slli    a3,a5,3
        slli    a4,a5,2
        sub a2,a2,a5
        add a1,a1,a3
        vadd.vv v0,v1,v5
        vsetvli zero,zero,e32,mf2,ta,ma
        vnsrl.wx    v2,v1,a6
        vncvt.x.x.w v1,v1
        vsetvli zero,zero,e64,m1,ta,ma
        vmsgtu.vv   v0,v0,v4
        vsetvli zero,zero,e32,mf2,ta,mu
        vneg.v  v2,v2
        vxor.vv v1,v2,v3,v0.t
        vse32.v v1,0(a0)
        add a0,a0,a4
        bne a2,zero,.L3
      
      After this patch:
        vsetvli a5,a2,e32,mf2,ta,ma
        vle64.v v1,0(a1)
        slli    a3,a5,3
        slli    a4,a5,2
        sub a2,a2,a5
        add a1,a1,a3
        vnclip.wi   v1,v1,0
        vse32.v v1,0(a0)
        add a0,a0,a4
        bne a2,zero,.L3
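Per element, the single vnclip.wi with a shift amount of zero in the new sequence performs a signed saturating narrow. A minimal scalar model of that arithmetic, written as a plain clamp, might look like the following. The names `sat_trunc_i64_to_i32` and `check_sat_trunc` are hypothetical, and the sketch models only the per-element result, not the vector length handling.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference for a signed saturating int64_t -> int32_t
   truncation: clamp to the narrow range, then convert.  */
int32_t
sat_trunc_i64_to_i32 (int64_t x)
{
  if (x < INT32_MIN)
    return INT32_MIN;
  if (x > INT32_MAX)
    return INT32_MAX;
  return (int32_t)x;
}

/* Hypothetical self-check on a few representative inputs.  */
void
check_sat_trunc (void)
{
  assert (sat_trunc_i64_to_i32 (0) == 0);
  assert (sat_trunc_i64_to_i32 (-42) == -42);
  assert (sat_trunc_i64_to_i32 (INT64_MAX) == INT32_MAX);
  assert (sat_trunc_i64_to_i32 (INT64_MIN) == INT32_MIN);
}
```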
      
      The test suites below passed for this patch.
      * The rv64gcv full regression test.
      
      gcc/ChangeLog:
      
      	* config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add
      	new pattern sstrunc for double trunc.
      	(sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc.
      	(sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc.
      	* config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add
      	new func decl to expand double trunc.
      	(expand_vec_quad_sstrunc): Ditto but for quad trunc.
      	(expand_vec_oct_sstrunc): Ditto but for oct trunc.
      	* config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new
      	func to expand double trunc.
      	(expand_vec_quad_sstrunc): Ditto but for quad trunc.
      	(expand_vec_oct_sstrunc): Ditto but for oct trunc.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • Vect: Try the pattern of vector signed integer SAT_TRUNC · 2987ca61
      Pan Li authored
      
      Almost the same as vector unsigned integer SAT_TRUNC, try to match
      the signed version during the vector pattern matching.
      
      The test suites below passed for this patch.
      * The rv64gcv full regression test.
      * The x86 bootstrap test.
      * The x86 full regression test.
      
      gcc/ChangeLog:
      
      	* tree-vect-patterns.cc (gimple_signed_integer_sat_trunc): Add
      	new func decl for signed SAT_TRUNC.
      	(vect_recog_sat_trunc_pattern): Try signed match pattern for
      	the SAT_TRUNC.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • Match: Support form 1 for vector signed integer SAT_TRUNC · bdbb74e3
      Pan Li authored
      
      This patch supports form 1 of the vector signed integer SAT_TRUNC,
      as in the example below:
      
      Form 1:
        #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
        void __attribute__((noinline))                                        \
        vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
        {                                                                     \
          unsigned i;                                                         \
          for (i = 0; i < limit; i++)                                         \
            {                                                                 \
              WT x = in[i];                                                   \
              NT trunc = (NT)x;                                               \
              out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
      	  ? trunc                                                       \
      	  : x < 0 ? NT_MIN : NT_MAX;                                    \
            }                                                                 \
        }
      
      DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)
      
      Before this patch:
        _87 = .SELECT_VL (ivtmp_85, POLY_INT_CST [2, 2]);
        ivtmp_64 = _87 * 8;
        vect_x_14.10_67 = .MASK_LEN_LOAD (vectp_in.8_65, 64B, { -1, ... }, _87, 0);
        vect_trunc_15.21_78 = (vector([2,2]) int) vect_x_14.10_67;
        _61 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67);
        _32 = _61 >> 63;
        vect_patt_52.16_73 = (vector([2,2]) int) _32;
        vect__46.17_74 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>(vect_patt_52.16_73);
        vect__47.18_75 = -vect__46.17_74;
        vect__21.19_76 = VIEW_CONVERT_EXPR<vector([2,2]) int>(vect__47.18_75);
        vect_x.11_68 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned long>(vect_x_14.10_67);
        vect__5.12_69 = vect_x.11_68 + { 2147483648, ... };
        mask__34.13_70 = vect__5.12_69 > { 4294967295, ... };
        _25 = .COND_XOR (mask__34.13_70, vect__21.19_76, { 2147483647, ... }, vect_trunc_15.21_78);
        ivtmp_80 = _87 * 4;
        .MASK_LEN_STORE (vectp_out.23_81, 32B, { -1, ... }, _87, 0, _25);
        vectp_in.8_66 = vectp_in.8_65 + ivtmp_64;
        vectp_out.23_82 = vectp_out.23_81 + ivtmp_80;
        ivtmp_86 = ivtmp_85 - _87;
      
      After this patch:
        _77 = .SELECT_VL (ivtmp_75, POLY_INT_CST [2, 2]);
        ivtmp_65 = _77 * 8;
        vect_x_14.10_68 = .MASK_LEN_LOAD (vectp_in.8_66, 64B, { -1, ... }, _77, 0);
        vect_patt_53.11_69 = .SAT_TRUNC (vect_x_14.10_68);
        ivtmp_70 = _77 * 4;
        .MASK_LEN_STORE (vectp_out.12_71, 32B, { -1, ... }, _77, 0, vect_patt_53.11_69);
        vectp_in.8_67 = vectp_in.8_66 + ivtmp_65;
        vectp_out.12_72 = vectp_out.12_71 + ivtmp_70;
        ivtmp_76 = ivtmp_75 - _77;
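The "before" dump computes the saturated result without branches: an unsigned add-and-compare detects out-of-range inputs, and the negated sign bit XORed with 2147483647 yields INT32_MIN or INT32_MAX. The scalar sketch below mirrors those steps for one element. The function name is hypothetical, and the final conversion to int32_t relies on the modulo behaviour that GCC documents for out-of-range conversions to signed types.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar rendering of the pre-patch GIMPLE sequence for one element.  */
int32_t
sat_trunc_branch_free (int64_t x)
{
  uint64_t ux = (uint64_t)x;

  /* x + 2^31, taken as unsigned, exceeds 2^32 - 1 exactly when x lies
     outside [INT32_MIN, INT32_MAX].  */
  int out_of_range = ux + UINT64_C (2147483648) > UINT64_C (4294967295);

  /* 0 for non-negative x, 0xFFFFFFFF for negative x.  */
  uint32_t neg_sign = -(uint32_t)(ux >> 63);

  /* 0x7FFFFFFF (INT32_MAX) or 0x80000000 (INT32_MIN).  */
  uint32_t sat = neg_sign ^ UINT32_C (2147483647);

  /* Low 32 bits of x: the plain truncation.  */
  uint32_t low = (uint32_t)ux;

  /* Assumption: the conversion wraps modulo 2^32, as on GCC.  */
  return (int32_t)(out_of_range ? sat : low);
}
```

After the patch, the whole sequence is matched as a single .SAT_TRUNC call, so none of these intermediate steps appear in the dump.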
      
      The test suites below passed for this patch.
      * The rv64gcv full regression test.
      * The x86 bootstrap test.
      * The x86 full regression test.
      
      gcc/ChangeLog:
      
      	* match.pd: Refine matching for vector signed SAT_TRUNC form 1.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • aarch64: Fix costing of move to/from MOVEABLE_SYSREGS · 8193e71a
      Andrew Carlotti authored
      This is necessary to prevent reload assuming that a direct FP->FPMR move
      is valid.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.cc (aarch64_register_move_cost):
      	Increase costs involving MOVEABLE_SYSREGS.
    • amdgcn: silence warning · 0b6d94ce
      Andrew Stubbs authored
      FIRST_SGPR_REG is register zero so the compiler always claims this comparison
      is redundant.  It's right, of course, but I'd have preferred to keep the
      comparison for completeness.  Probably the "correct" solution is to use an enum
      for these values.
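The warning arises because a lower-bound check against zero is always true for an unsigned operand; GCC's -Wtype-limits is one option that reports such comparisons. The sketch below shows the shape of the problem in a comment and one alternative idiom, a single subtract-and-compare, that keeps the lower bound in the expression. All names and register numbers here are hypothetical, and this is not necessarily the change made in gcn.h.

```c
#include <assert.h>

/* Hypothetical register range for illustration only.  */
#define DEMO_FIRST_REG 0u
#define DEMO_LAST_REG  101u

/* The straightforward form would be
     ((N) >= DEMO_FIRST_REG && (N) <= DEMO_LAST_REG)
   but with DEMO_FIRST_REG == 0 and an unsigned N the first
   comparison is always true.  */

/* Alternative: subtract the lower bound and compare once.  Values
   below the lower bound wrap to large unsigned numbers and fail the
   test, so the lower bound stays meaningful if it ever changes.  */
#define DEMO_REGNO_P(N) \
  ((unsigned) (N) - DEMO_FIRST_REG <= DEMO_LAST_REG - DEMO_FIRST_REG)

/* Function wrapper so the macro can be called and tested.  */
int
demo_regno_p (unsigned n)
{
  return DEMO_REGNO_P (n);
}

/* Hypothetical self-check of both ends of the range.  */
void
check_demo_regno (void)
{
  assert (demo_regno_p (0));
  assert (demo_regno_p (101));
  assert (!demo_regno_p (102));
}
```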
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn.h (SGPR_REGNO_P): Silence warning.
    • pair-fusion: Assume alias conflict if common address reg changes [PR116783] · c0e54ce1
      Alex Coplan authored
      As the PR shows, pair-fusion was tricking memory_modified_in_insn_p into
      returning false when a common base register (in this case, x1) was
      modified between the mem and the store insn.  This led to wrong code as
      the accesses really did alias.
      
      To avoid this sort of problem, this patch avoids invoking RTL alias
      analysis altogether (and assumes an alias conflict) if the two insns to
      be compared share a common address register R, and the insns see different
      definitions of R (i.e. it was modified in between).
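The rule described above can be sketched in isolation as follows. This is a toy model under stated assumptions, not the code in pair-fusion.cc: the `addr_use` record, the integer definition ids, and both function names are hypothetical stand-ins for the real RTL-SSA data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record: an insn uses address register REGNO and sees
   the definition identified by DEF_ID.  */
struct addr_use
{
  int regno;
  int def_id;
};

/* Return true if the two insns share an address register but see
   different definitions of it.  In that case the caller would assume
   an alias conflict instead of asking alias analysis.  */
bool
addr_reg_conflict (const struct addr_use *a, size_t na,
                   const struct addr_use *b, size_t nb)
{
  for (size_t i = 0; i < na; i++)
    for (size_t j = 0; j < nb; j++)
      if (a[i].regno == b[j].regno && a[i].def_id != b[j].def_id)
        return true;
  return false;
}

/* Hypothetical helper: two insns that both address through register 1,
   seeing definitions DEF_A and DEF_B respectively.  */
bool
demo_conflict (int def_a, int def_b)
{
  struct addr_use a[1] = { { 1, def_a } };
  struct addr_use b[1] = { { 1, def_b } };
  return addr_reg_conflict (a, 1, b, 1);
}
```

When the register was redefined between the two insns (different ids), the check reports a conflict; when both see the same definition, the decision is left to the usual alias query.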
      
      gcc/ChangeLog:
      
      	PR rtl-optimization/116783
      	* pair-fusion.cc (def_walker::cand_addr_uses): New.
      	(def_walker::def_walker): Add parameter for candidate address
      	uses.
      	(def_walker::alias_conflict_p): Declare.
      	(def_walker::addr_reg_conflict_p): New.
      	(def_walker::conflict_p): New.
      	(store_walker::store_walker): Add parameter for candidate
      	address uses and pass to base ctor.
      	(store_walker::conflict_p): Rename to ...
      	(store_walker::alias_conflict_p): ... this.
      	(load_walker::load_walker): Add parameter for candidate
      	address uses and pass to base ctor.
      	(load_walker::conflict_p): Rename to ...
      	(load_walker::alias_conflict_p): ... this.
      	(pair_fusion_bb_info::try_fuse_pair): Collect address register
      	uses for candidate insns and pass down to alias walkers.
      
      gcc/testsuite/ChangeLog:
      
      	PR rtl-optimization/116783
      	* g++.dg/torture/pr116783.C: New test.
    • libstdc++: Improve 26_numerics/headers/cmath/types_std_c++0x_neg.cc · d0d99fc6
      Jonathan Wakely authored
      This test checks that the special functions in <cmath> are not declared
      prior to C++17. But we can remove the target selector and allow it to be
      tested for C++17 and later, and add target selectors to the individual
      dg-error directives instead.
      
      Also rename the test to match what it actually tests.
      
      libstdc++-v3/ChangeLog:
      
      	* testsuite/26_numerics/headers/cmath/types_std_c++0x_neg.cc:
      	Move to ...
      	* testsuite/26_numerics/headers/cmath/specfun_c++17.cc: here and
      	adjust test to be valid for all -std dialects.
    • libstdc++: Simplify C++98 std::vector::_M_data_ptr overload set · 1003a428
      Jonathan Wakely authored
      We don't need separate overloads for returning a const or non-const
      pointer. We can make the member function const and return a non-const
      pointer, and let vector::data() const convert it to const as needed.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/stl_vector.h (vector::_M_data_ptr): Remove
      	non-const overloads. Always return non-const pointer.
    • libstdc++: Fix order of [[...]] and __attribute__((...)) attrs [PR117220] · cba80691
      Jonathan Wakely authored
      GCC allows these in either order, but Clang doesn't like the C++11-style
      [[__nodiscard__]] coming after __attribute__((__always_inline__)).
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/117220
      	* include/bits/stl_iterator.h: Move _GLIBCXX_NODISCARD
      	annotations before __attribute__((__always_inline__)).