  1. Jun 06, 2023
    • openmp: Add support for the 'present' modifier · 4ede915d
      Tobias Burnus authored
      This implements support for the OpenMP 5.1 'present' modifier, which can be
      used in map clauses in the 'target', 'target data', 'target enter data' and
      'target exit data' constructs, and in the 'to' and 'from' clauses of the
      'target update' construct.  It is also supported in defaultmap.
      
      The modifier triggers a fatal runtime error if the data specified by the
      clause is not already present on the target device.  It can also be combined
      with 'always' in map clauses.
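As an illustration of the behaviour described above, here is a minimal sketch. The names `scale_on_device` and `run_present_sketch` are hypothetical, and the sketch assumes a compiler that accepts the OpenMP 5.1 'present' modifier when built with -fopenmp. Without -fopenmp the pragmas are ignored and the loop simply runs on the host.

```cpp
#include <cassert>

// Hypothetical helper: doubles each element inside a target region.
// 'present' asserts that data[0:n] was already mapped by the caller; if it
// were not present on the device, the runtime would abort with a fatal error.
void scale_on_device(int* data, int n) {
  #pragma omp target map(present, tofrom: data[0:n])
  for (int i = 0; i < n; ++i)
    data[i] *= 2;
}

// Hypothetical driver: maps the array first, so the 'present' check succeeds.
int run_present_sketch() {
  int data[4] = {1, 2, 3, 4};
  // Create the device copy up front.
  #pragma omp target enter data map(to: data[0:4])
  scale_on_device(data, 4);
  // Copy the result back and release the device copy.
  #pragma omp target exit data map(from: data[0:4])
  return data[0] + data[1] + data[2] + data[3];
}
```

Dropping the 'target enter data' line would be the case the modifier is meant to catch: on an offloading device the 'present' check is then expected to fail at run time rather than silently create a new mapping.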
      
      2023-06-06  Kwok Cheung Yeung  <kcy@codesourcery.com>
      	    Tobias Burnus  <tobias@codesourcery.com>
      
      gcc/c/
      	* c-parser.cc (c_parser_omp_clause_defaultmap,
      	c_parser_omp_clause_map): Parse 'present'.
      	(c_parser_omp_clause_to, c_parser_omp_clause_from): Remove.
      	(c_parser_omp_clause_from_to): New; parse to/from clauses with
      	optional 'present' modifier.
      	(c_parser_omp_all_clauses): Update call.
      	(c_parser_omp_target_data, c_parser_omp_target_enter_data,
      	c_parser_omp_target_exit_data): Handle new map enum values
      	for 'present' mapping.
      
      gcc/cp/
      	* parser.cc (cp_parser_omp_clause_defaultmap,
      	cp_parser_omp_clause_map): Parse 'present'.
      	(cp_parser_omp_clause_from_to): New; parse to/from
      	clauses with optional 'present' modifier.
      	(cp_parser_omp_all_clauses): Update call.
      	(cp_parser_omp_target_data, cp_parser_omp_target_enter_data,
      	cp_parser_omp_target_exit_data): Handle new enum value for
      	'present' mapping.
      	* semantics.cc (finish_omp_target): Likewise.
      
      gcc/fortran/
      	* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
      	modifier.
      	(show_omp_clauses): Display 'present' motion modifier for 'to'
      	and 'from' clauses.
      
      	* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
      	modifiers.
      	(struct gfc_omp_namelist): Add 'present_modifier'.
      	* openmp.cc (gfc_match_motion_var_list): New, handles optional
      	'present' modifier for to/from clauses.
      	(gfc_match_omp_clauses): Call it for to/from clauses; parse 'present'
      	in defaultmap and map clauses.
      	(resolve_omp_clauses): Allow 'present' modifiers on 'target',
      	'target data', 'target enter data' and 'target exit data' directives.
      	* trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers
      	to tree node for 'map', 'to' and 'from' clauses.  Apply 'present' for
      	defaultmap.
      
      gcc/
      	* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
      	and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
      	set.
      	(omp_get_attachment): Handle map clauses with 'present' modifier.
      	(omp_group_base): Likewise.
      	(gimplify_scan_omp_clauses): Reorder present maps to come first.
      	Set GOVD flags for present defaultmaps.
      	(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
      	* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
      	clauses.
      	(lower_omp_target): Handle map clauses with 'present' modifier.
      	Handle 'to' and 'from' clauses with 'present'.
      	* tree-core.h (enum omp_clause_defaultmap_kind): Add
      	OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
      	* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
      	'from' clauses with 'present' modifier.  Handle present defaultmap.
      	* tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define.
      
      include/
      	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
      	(GOMP_MAP_FLAG_FORCE): Redefine.
      	(GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
      	(enum gomp_map_kind): Add map kinds with 'present' modifiers.
      	(GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for
      	map variants with 'present'.
      	(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true
      	for map variants with 'always, present' modifiers.
      	(GOMP_MAP_ALWAYS): Redefine.
      	(GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New.
      
      libgomp/
      	* libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for
      	defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses.
      	* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
      	modifier.
      	(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
      	(gomp_map_vars_internal, gomp_update, gomp_target_rev):
      	Emit runtime error if memory region not present.
      	* testsuite/libgomp.c-c++-common/target-present-1.c: New test.
      	* testsuite/libgomp.c-c++-common/target-present-2.c: New test.
      	* testsuite/libgomp.c-c++-common/target-present-3.c: New test.
      	* testsuite/libgomp.fortran/target-present-1.f90: New test.
      	* testsuite/libgomp.fortran/target-present-2.f90: New test.
      	* testsuite/libgomp.fortran/target-present-3.f90: New test.
      
      gcc/testsuite/
      
      	* c-c++-common/gomp/map-6.c: Update dg-error, extend to test for
      	duplicated 'present' and extend scan-dump tests for 'present'.
      	* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
      	* gfortran.dg/gomp/map-7.f90: Extend parse and dump test for
      	'present'.
      	* gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present'
      	modifier checking.
      	* c-c++-common/gomp/defaultmap-4.c: New test.
      	* c-c++-common/gomp/map-9.c: New test.
      	* c-c++-common/gomp/target-update-1.c: New test.
      	* gfortran.dg/gomp/defaultmap-8.f90: New test.
      	* gfortran.dg/gomp/map-11.f90: New test.
      	* gfortran.dg/gomp/map-12.f90: New test.
      	* gfortran.dg/gomp/target-update-1.f90: New test.
    • libstdc++: Avoid vector casts while still avoiding PR90424 · 9165ede5
      Matthias Kretz authored
      
      Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/109822
      	* include/experimental/bits/simd_builtin.h (_S_store): Rewrite
      	to avoid casts to other vector types. Implement store as
      	succession of power-of-2 sized memcpy to avoid PR90424.
    • libstdc++: Replace use of incorrect non-temporal store · 27e45b75
      Matthias Kretz authored
      
      The call to the base implementation sometimes didn't find a matching
      signature because the _Abi parameter of _SimdImpl* was "wrong" after
      conversion. It has to call into <new ABI tag>::_SimdImpl instead of the
      current ABI tag's _SimdImpl. This also reduces the number of possible
      template instantiations.
      
      Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/110054
      	* include/experimental/bits/simd_builtin.h (_S_masked_store):
      	Call into deduced ABI's SimdImpl after conversion.
      	* include/experimental/bits/simd_x86.h (_S_masked_store_nocvt):
      	Don't use _mm_maskmoveu_si128. Use the generic fall-back
      	implementation. Also fix masked stores without SSE2, which
      	were not doing anything before.
    • rs6000: genfusion: Delete dead code · a3df359f
      Segher Boessenkool authored
      2023-06-06  Segher Boessenkool  <segher@kernel.crashing.org>
      
      	* config/rs6000/genfusion.pl: Delete some dead code.
    • rs6000: genfusion: Rewrite load/compare code · 19e5bf1d
      Segher Boessenkool authored
      This makes the code more readable, more digestible, more maintainable,
      more extensible.  That kind of thing.  It does that by pulling things
      apart a bit, but also making what stays together more cohesive lumps.
      
      The original function was a bunch of loops and early-outs, and then
      quite a bit of stuff done per iteration, with the iterations essentially
      independent of each other.  This patch moves the stuff done for one
      iteration to a new _one function.
      
      The second big thing is the stuff printed to the .md file is done in
      "here documents" now, which is a lot more readable than having to quote
      and escape and double-escape pieces of text.  Whitespace inside the
      here-document is significant (will be printed as-is), which is a bit
      awkward sometimes, or might take some getting used to, but it is also
      one of the benefits of using them.
      
      Local variables are declared at first use (or close to first use).
      There also shouldn't be many at all, often you can write easier to
      read and manage code by omitting to name something that is hard to name
      in the first place.
      
      Finally some things are done in more typical, more modern, and tighter
      Perl style, for example REs in "if"s or "qw" for lists of constants.
      
      2023-06-06  Segher Boessenkool  <segher@kernel.crashing.org>
      
      	* config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): New, rewritten and
      	split out from...
      	(gen_ld_cmpi_p10): ... this.
    • libstdc++: Protect against macros · ce2188e4
      Matthias Kretz authored
      
      Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
      
      libstdc++-v3/ChangeLog:
      
      	* include/experimental/bits/simd.h (__bit_cast): Use
      	__gnu__::__vector_size__ instead of gnu::vector_size.
    • libstdc++: Fix ambiguous expression in std::array<T, 0>::front() [PR110139] · 56001fad
      Jonathan Wakely authored
      On 32-bit targets, using -pedantic (or using Clang) makes the expression
      _M_elems[0] ambiguous.  The overloaded operator[] that we want to call
      has a size_t parameter, but 0 is type ptrdiff_t for many ILP32 targets,
      so using the implicit conversion from _M_elems to T* and then
      subscripting that is also viable.
      
      Change the 0 to (size_type)0 and also make the conversion to T*
      explicit, so that it's not viable here. The latter change requires a
      static_cast in data() where we really do want to convert _M_elems to a
      pointer.
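The shape of the fix can be sketched as follows. This is not the libstdc++ code: `EmptyElemsSketch`, `front_sketch` and `data_sketch` are hypothetical stand-ins, and the element type is fixed to `int` for brevity.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the zero-size storage type.
struct EmptyElemsSketch {
  // The overload that front() is meant to call: it takes an unsigned index.
  int& operator[](std::size_t) const noexcept { return dummy(); }

  // Explicit conversion: "convert to int* and use the built-in subscript"
  // is no longer a candidate for an expression such as elems[0].
  explicit operator int*() const noexcept { return nullptr; }

  // Placeholder object so the sketch has something to return.
  static int& dummy() { static int value = 42; return value; }
};

// Same shape as the fixed front(): the index already has the parameter type.
inline int& front_sketch(const EmptyElemsSketch& elems) {
  return elems[static_cast<std::size_t>(0)];
}

// Same shape as the fixed data(): the pointer conversion is now spelled out.
inline int* data_sketch(const EmptyElemsSketch& elems) {
  return static_cast<int*>(elems);
}
```

With the conversion operator marked explicit, only the member operator[] remains viable for the subscript, which is why data() needs the static_cast.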
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/110139
      	* include/std/array (__array_traits<T, 0>::operator T*()): Make
      	conversion operator explicit.
      	(array::front): Use size_type as subscript operand.
      	(array::data): Use static_cast to make conversion explicit.
      	* testsuite/23_containers/array/element_access/110139.cc: New
      	test.
    • libstdc++: Do not assume existence of char8_t codecvt facet · 3d9b3ddb
      Joseph Faulls authored
      It is not required that codecvt<char8_t, char, mbstate_t> facet be
      supported by the locale, nor is it added as part of the default locale.
      This can lead to dangerous behaviour when the facet is accessed via static_cast.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/locale_classes.tcc: Remove check for
      	codecvt<char8_t, char, mbstate_t> facet.
    • libstdc++: Use close-on-exec for file descriptors in filesystem::copy_file · 7e8e071c
      Jonathan Wakely authored
      libstdc++-v3/ChangeLog:
      
      	* src/filesystem/ops-common.h (do_copy_file) [O_CLOEXEC]: Set
      	close-on-exec flag on file descriptors.
    • libstdc++: Make std::filesystem::copy_file work for procfs [PR108178] · 07a0e108
      Jonathan Wakely authored
      The size reported by stat is always zero for some special files such as
      those under /proc, which means the current copy_file implementation
      thinks there is nothing to copy. Instead of trusting the stat value, try
      to read a character from a streambuf and check for EOF.
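The idea of probing for content instead of trusting the reported size can be sketched like this. `is_empty_by_reading` is a hypothetical helper, not the function used in ops-common.h, and its error handling is deliberately simplistic.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: decides emptiness by peeking at the first character
// rather than by looking at the size reported by stat.
inline bool is_empty_by_reading(const std::string& path) {
  std::filebuf in;
  if (in.open(path, std::ios_base::in | std::ios_base::binary) == nullptr)
    return true;  // sketch only: a real implementation would report the error
  using traits = std::filebuf::traits_type;
  // sgetc() peeks at the next character without consuming it and returns
  // eof() when there is nothing to read.
  return traits::eq_int_type(in.sgetc(), traits::eof());
}
```

On Linux, a file such as /proc/self/status typically reports a size of zero through stat yet still yields characters when read, which is the situation this check is meant to handle.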
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/108178
      	* src/filesystem/ops-common.h (do_copy_file): Check for empty
      	files by trying to read a character.
      	* testsuite/27_io/filesystem/operations/copy_file_108178.cc:
      	New test.
    • libstdc++: Use copy_file_range for filesystem::copy_file · d87caacf
      Jannik Glückert authored
      
      copy_file_range is a recent-ish syscall for copying files. It is similar
      to sendfile but allows filesystem-specific optimizations. Common are:
      Reflinks: BTRFS, XFS, ZFS (does not implement the syscall yet)
      Server-side copy: NFS, SMB, Ceph
      
      If copy_file_range is not available for the given files, fall back to
      sendfile / userspace copy.
      
      libstdc++-v3/ChangeLog:
      
      	* acinclude.m4 (_GLIBCXX_USE_COPY_FILE_RANGE): Define.
      	* config.h.in: Regenerate.
      	* configure: Regenerate.
      	* src/filesystem/ops-common.h (copy_file_copy_file_range):
      	Define new function.
      	(do_copy_file): Use it.
      
      Signed-off-by: Jannik Glückert <jannik.glueckert@gmail.com>
    • libstdc++: Also use sendfile for big files · f80a8b42
      Jannik Glückert authored
      
      We were previously only using sendfile for files smaller than 2GB, as
      sendfile needs to be called repeatedly for files bigger than that.
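The repeated-call pattern mentioned above can be sketched as below. This is a Linux-only illustration with a hypothetical helper name (`sendfile_all_sketch`); it is not the copy_file_sendfile function added by the patch.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: copies 'length' bytes from fd_in to fd_out by calling
// sendfile in a loop, since one call may transfer fewer bytes than requested.
inline bool sendfile_all_sketch(int fd_out, int fd_in, off_t length) {
  const off_t max_chunk = 0x7ffff000;  // per-call limit documented for Linux
  off_t offset = 0;                    // sendfile advances this on success
  while (offset < length) {
    const off_t remaining = length - offset;
    const size_t count =
        static_cast<size_t>(remaining < max_chunk ? remaining : max_chunk);
    const ssize_t n = ::sendfile(fd_out, fd_in, &offset, count);
    if (n < 0) {
      if (errno == EINTR)
        continue;      // interrupted: retry from the current offset
      return false;    // any other error: the caller may fall back to read/write
    }
    if (n == 0)
      break;           // input ended earlier than expected
  }
  return offset == length;
}
```

Passing an explicit offset keeps the loop state in one variable and leaves the input descriptor's own file position untouched.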
      
      Some quick numbers, copying a 16GB file, average of 10 repetitions:
          old:
              real: 13.4s
              user: 0.14s
              sys : 7.43s
          new:
              real: 8.90s
              user: 0.00s
              sys : 3.68s
      
      libstdc++-v3/ChangeLog:
      
      	* acinclude.m4 (_GLIBCXX_HAVE_LSEEK): Define.
      	* config.h.in: Regenerate.
      	* configure: Regenerate.
      	* src/filesystem/ops-common.h (copy_file_sendfile): Define new
      	function for sendfile logic. Loop to support large files. Skip
      	zero-length files.
      	(do_copy_file): Use it.
      
      Signed-off-by: Jannik Glückert <jannik.glueckert@gmail.com>
    • rs6000: Remove duplicate expression [PR106907] · c4deccd4
      Jeevitha Palanisamy authored
      PR106907 lists a few warnings spotted by cppcheck.  This patch addresses the
      duplicate expression issue: the same expression was used twice in a logical
      AND (&&) operation, which yields the same result, so the duplicate is removed.
      
      2023-06-06  Jeevitha Palanisamy  <jeevitha@linux.ibm.com>
      
      gcc/
      	PR target/106907
      	* config/rs6000/rs6000.cc (vec_const_128bit_to_bytes): Remove
      	duplicate expression.
    • aarch64: Improve representation of vpaddd intrinsics · 6be5d852
      Kyrylo Tkachov authored
      The aarch64_addpdi pattern is redundant as the reduc_plus_scal_<mode> pattern can already generate
      the required form of the ADDP instruction, and is mostly folded to GIMPLE early on so can benefit from more optimisations.
      Though it turns out that we were missing the folding for the unsigned variants.
      This patch adds that and wires up the vpaddd_u64 and vpaddd_s64 intrinsics through the above pattern instead
      so that we can remove a redundant pattern and get more optimisation earlier.
      
      Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-builtins.cc (aarch64_general_gimple_fold_builtin):
      	Handle unsigned reduc_plus_scal_ builtins.
      	* config/aarch64/aarch64-simd-builtins.def (addp): Delete DImode instances.
      	* config/aarch64/aarch64-simd.md (aarch64_addpdi): Delete.
      	* config/aarch64/arm_neon.h (vpaddd_s64): Reimplement with
      	__builtin_aarch64_reduc_plus_scal_v2di.
      	(vpaddd_u64): Reimplement with __builtin_aarch64_reduc_plus_scal_v2di_uu.
    • aarch64: Reimplement URSHR,SRSHR patterns with standard RTL codes · 93716409
      Kyrylo Tkachov authored
      Having converted the patterns for the URSRA,SRSRA instructions to standard RTL codes we can also
      easily convert the non-accumulating forms URSHR,SRSHR.
      This patch does that, reusing the various helpers and predicates from that patch in a straightforward way.
      This allows GCC to perform the optimisations in the testcase, matching what Clang does.
      
      Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-simd.md (aarch64_<sur>shr_n<mode>): Delete.
      	(aarch64_<sra_op>rshr_n<mode><vczle><vczbe>_insn): New define_insn.
      	(aarch64_<sra_op>rshr_n<mode>): New define_expand.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/simd/vrshr_1.c: New test.
    • aarch64: Simplify SHRN, RSHRN expanders and patterns · d2cdfafd
      Kyrylo Tkachov authored
      Now that we've got the <vczle><vczbe> annotations we can get rid of explicit
      !BYTES_BIG_ENDIAN and BYTES_BIG_ENDIAN patterns for the narrowing shift instructions.
      This allows us to clean up the expanders as well.
      
      Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-simd.md (aarch64_shrn<mode>_insn_le): Delete.
      	(aarch64_shrn<mode>_insn_be): Delete.
      	(*aarch64_<srn_op>shrn<mode>_vect):  Rename to...
      	(*aarch64_<srn_op>shrn<mode><vczle><vczbe>): ... This.
      	(aarch64_shrn<mode>): Remove reference to the above deleted patterns.
      	(aarch64_rshrn<mode>_insn_le): Delete.
      	(aarch64_rshrn<mode>_insn_be): Delete.
      	(aarch64_rshrn<mode><vczle><vczbe>_insn): New define_insn.
      	(aarch64_rshrn<mode>): Remove references to the above deleted patterns.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/simd/pr99195_5.c: Add testing for shrn_n, rshrn_n
      	intrinsics.
    • aarch64: Improve representation of ADDLV instructions · b327cbe8
      Kyrylo Tkachov authored
      We've received requests to optimise the attached intrinsics testcase.
      We currently generate:
      foo_1:
              uaddlp  v0.4s, v0.8h
              uaddlv  d31, v0.4s
              fmov    x0, d31
              ret
      foo_2:
              uaddlp  v0.4s, v0.8h
              addv    s31, v0.4s
              fmov    w0, s31
              ret
      foo_3:
              saddlp  v0.4s, v0.8h
              addv    s31, v0.4s
              fmov    w0, s31
              ret
      
      The widening pair-wise addition addlp instructions can be omitted if we're just doing an ADDV afterwards.
      Making this optimisation would be quite simple if we had a standard RTL PLUS vector reduction code.
      As we don't, we can use UNSPEC_ADDV as a stand in.
      This patch expresses the SADDLV and UADDLV instructions as an UNSPEC_ADDV over a widened input, thus removing
      the need for separate UNSPEC_SADDLV and UNSPEC_UADDLV codes.
      To optimise the testcases involved we add two splitters that match a vector addition where all participating elements
      are taken and widened from the same vector and then fed into an UNSPEC_ADDV. In that case we can just remove the
      vector PLUS and just emit the simple RTL for SADDLV/UADDLV.
      
      Bootstrapped and tested on aarch64-none-linux-gnu.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-protos.h (aarch64_parallel_select_half_p):
      	Define prototype.
      	(aarch64_pars_overlap_p): Likewise.
      	* config/aarch64/aarch64-simd.md (aarch64_<su>addlv<mode>):
      	Express in terms of UNSPEC_ADDV.
      	(*aarch64_<su>addlv<VDQV_L:mode>_ze<GPI:mode>): Likewise.
      	(*aarch64_<su>addlv<mode>_reduction): Define.
      	(*aarch64_uaddlv<mode>_reduction_2): Likewise.
      	* config/aarch64/aarch64.cc	(aarch64_parallel_select_half_p): Define.
      	(aarch64_pars_overlap_p): Likewise.
      	* config/aarch64/iterators.md (UNSPEC_SADDLV, UNSPEC_UADDLV): Delete.
      	(VQUADW): New mode attribute.
      	(VWIDE2X_S): Likewise.
      	(USADDLV): Delete.
      	(su): Delete handling of UNSPEC_SADDLV, UNSPEC_UADDLV.
      	* config/aarch64/predicates.md (vect_par_cnst_select_half): Define.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/simd/addlv_1.c: New test.
    • middle-end/110055 - avoid CLOBBERing static variables · 84eec291
      Richard Biener authored
      The gimplifier can elide initialized constant automatic variables
      to static storage in which case TARGET_EXPR gimplification needs
      to avoid emitting a CLOBBER for them since their lifetime is no
      longer limited.  Failing to do so causes spurious dangling-pointer
      diagnostics on the added testcase for some targets.
      
      	PR middle-end/110055
      	* gimplify.cc (gimplify_target_expr): Do not emit
      	CLOBBERs for variables which have static storage duration
      	after gimplifying their initializers.
      
      	* g++.dg/warn/Wdangling-pointer-pr110055.C: New testcase.
    • tree-optimization/109143 - improve PTA compile time · 21bf2b2f
      Richard Biener authored
      The following improves solution_set_expand to require one less
      iteration over the bitmap and avoid changing the bitmap we iterate
      over.  Plus we handle adjacent subvars in the ID space (the common case)
      and use bitmap_set_range.  This cuts a bit less than 10% off the PTA
      time from the testcase in the PR.
      
      	PR tree-optimization/109143
      	* tree-ssa-structalias.cc (solution_set_expand): Avoid
      	one bitmap iteration and optimize bit range setting.
    • libiberty: writeargv: Simplify function error mode. · 4d1e4ce9
      Costas Argyris authored
      writeargv can be simplified by getting rid of the error exit mode
      that was only relevant many years ago when the function used
      to open the file descriptor internally.
      
      0001-libiberty-writeargv-Simplify-function-error-mode.patch
      
      From 1271552baee5561fa61652f4ca7673c9667e4f8f Mon Sep 17 00:00:00 2001
      From: Costas Argyris <costas.argyris@gmail.com>
      Date: Mon, 5 Jun 2023 15:02:06 +0100
      Subject: [PATCH] libiberty: writeargv: Simplify function error mode.
      
      The goto-based error mode was based on a previous version
      of the function where it was responsible for opening the
      file, so it had to close it upon any exit:
      
      https://inbox.sourceware.org/gcc-patches/20070417200340.GM9017@sparrowhawk.codesourcery.com/
      
      
      
      (thanks pinskia)
      
      This is no longer the case though since now the function
      takes the file descriptor as input, so the exit mode on
      error can be just a simple return 1 statement.
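The simplified control flow can be sketched as follows. `write_args_sketch` is a hypothetical, reduced stand-in that assumes the caller passes an already open FILE*; unlike the real writeargv it does no quoting or escaping of the arguments.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical reduced writer: one argument per line, returns 0 on success
// and 1 on error.  The caller owns the stream, so there is nothing to clean
// up here and every failure can simply return.
inline int write_args_sketch(const char* const* argv, std::FILE* f) {
  if (f == nullptr)
    return 1;
  for (; *argv != nullptr; ++argv) {
    if (std::fputs(*argv, f) == EOF)
      return 1;
    if (std::fputc('\n', f) == EOF)
      return 1;
  }
  return 0;
}
```

Because opening and closing the stream stay with the caller, no shared cleanup label is needed, which is what makes the goto-based exit path redundant.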
      
      libiberty/
      	* argv.c (writeargv): Simplify & remove gotos.
      
      Signed-off-by: Costas Argyris <costas.argyris@gmail.com>
    • bootstrap rtl-checking: Fix XVEC vs XVECEXP in postreload.cc · 9677cc74
      Hans-Peter Nilsson authored
      	PR bootstrap/110120
      	* postreload.cc (reload_cse_move2add, move2add_use_add2_insn): Use
      	XVECEXP, not XEXP, to access first item of a PARALLEL.
    • [RISC-V] add TC for save-restore cfi directives. · d1344c41
      Fei Gao authored
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/save-restore-cfi.c: New test to check save-restore
      	cfi directives.
    • RISC-V: Support RVV FP16 ZVFH Reduction floating-point intrinsic API · 78058904
      Pan Li authored
      
      This patch supports the intrinsic API for FP16 ZVFH floating-point reduction,
      i.e. SEW=16 for the instructions below:
      
      vfredosum vfredusum
      vfredmax vfredmin
      vfwredosum vfwredusum
      
      Users can then leverage the intrinsic APIs to perform FP16-related
      reduction operations. Please note that not all the intrinsic APIs are covered
      in the test files; only some typical ones were picked because there are too
      many. We will test the FP16-related intrinsic APIs in full soon.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv-vector-builtins-types.def
      	(vfloat16mf4_t): Add vfloat16mf4_t to WF operations.
      	(vfloat16mf2_t): Likewise.
      	(vfloat16m1_t): Likewise.
      	(vfloat16m2_t): Likewise.
      	(vfloat16m4_t): Likewise.
      	(vfloat16m8_t): Likewise.
      	* config/riscv/vector-iterators.md: Add FP=16 to VWF, VWF_ZVE64,
      	VWLMUL1, VWLMUL1_ZVE64, vwlmul1 and vwlmul1_zve64.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Add new test cases.
    • [RISC-V] correct machine mode in save-restore cfi RTL. · 17c796c7
      Fei Gao authored
      gcc/ChangeLog:
      
      	* config/riscv/riscv.cc (riscv_adjust_libcall_cfi_prologue): Use Pmode
      	for cfi reg/mem machmode.
      	(riscv_adjust_libcall_cfi_epilogue): Use Pmode for cfi reg machmode.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/save-restore-cfi-2.c: New test to check machmode
      	for cfi reg/mem.
    • RISC-V: Fix 'REQUIREMENT' for machine_mode 'MODE' in vector-iterators.md. · da2d75af
      Li Xu authored
      gcc/ChangeLog:
      
      	* config/riscv/vector-iterators.md:
      	Fix 'REQUIREMENT' for machine_mode 'MODE'.
      	* config/riscv/vector.md (@pred_indexed_<order>store<VNX16_QHS:mode>
      	<VNX16_QHSI:mode>): change VNX16_QHSI to VNX16_QHSDI.
      	(@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>): Ditto.
    • RISC-V: Fix some typo in vector-iterators.md · 6d4b6f7b
      Pan Li authored
      
      This patch fixes some typos in vector-iterators.md, namely:
      
      [-"vnx1DI")-]{+"vnx1di")+}
      [-"vnx2SI")-]{+"vnx2si")+}
      [-"vnx1SI")-]{+"vnx1si")+}
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
      
      gcc/ChangeLog:
      
      	* config/riscv/vector-iterators.md: Fix typo in mode attr.
    • Daily bump. · 14da7648
      GCC Administrator authored
  2. Jun 05, 2023
    • Remove widen_plus/minus_expr tree codes · 8ebd1d9a
      Andre Vieira authored
      This patch removes the old widen plus/minus tree codes which have been
      replaced by internal functions.
      
      2023-06-05  Andre Vieira  <andre.simoesdiasvieira@arm.com>
      	    Joel Hutton  <joel.hutton@arm.com>
      
      gcc/ChangeLog:
      
      	* doc/generic.texi: Remove old tree codes.
      	* expr.cc (expand_expr_real_2): Remove old tree code cases.
      	* gimple-pretty-print.cc (dump_binary_rhs): Likewise.
      	* optabs-tree.cc (optab_for_tree_code): Likewise.
      	(supportable_half_widening_operation): Likewise.
      	* tree-cfg.cc (verify_gimple_assign_binary): Likewise.
      	* tree-inline.cc (estimate_operator_cost): Likewise.
      	(op_symbol_code): Likewise.
      	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise.
      	(vect_analyze_data_ref_accesses): Likewise.
      	* tree-vect-generic.cc (expand_vector_operations_1): Likewise.
      	* cfgexpand.cc (expand_debug_expr): Likewise.
      	* tree-vect-stmts.cc (vectorizable_conversion): Likewise.
      	(supportable_widening_operation): Likewise.
      	* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
      	Likewise.
      	* optabs.def (vec_widen_ssubl_hi_optab, vec_widen_ssubl_lo_optab,
      	vec_widen_saddl_hi_optab, vec_widen_saddl_lo_optab,
      	vec_widen_usubl_hi_optab, vec_widen_usubl_lo_optab,
      	vec_widen_uaddl_hi_optab, vec_widen_uaddl_lo_optab): Remove optabs.
      	* tree-pretty-print.cc (dump_generic_node): Remove tree code definition.
      	* tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR,
      	VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR,
      	VEC_WIDEN_MINUS_LO_EXPR): Likewise.
    • internal-fn,vect: Refactor widen_plus as internal_fn · 2f482a07
      Andre Vieira authored
      DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN
      are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN
      respectively, except that they provide convenience wrappers
      for a single vector to vector conversion, a hi/lo split or an even/odd
      split.  Each definition for <NAME> will require either signed optabs
      named <UOPTAB> and <SOPTAB> (for widening) or a single <OPTAB> (for
      narrowing) for each of the five functions it creates.
      
      For example, for widening addition the
      DEF_INTERNAL_WIDENING_OPTAB_FN will create five internal functions:
      IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
      IFN_VEC_WIDEN_PLUS_EVEN and IFN_VEC_WIDEN_PLUS_ODD, each requiring two
      optabs, one for signed and one for unsigned.
      AArch64 implements the hi/lo split optabs:
            IFN_VEC_WIDEN_PLUS_HI  -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
            IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode> -> (u/s)addl
      
      This gives the same functionality as the previous
      WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into
      VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
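The "one definition creates five functions" idea can be pictured with a small macro sketch. Everything here is hypothetical (`WIDENING_FN_SKETCH`, the `SKETCH_` enumerators, `hi_of`, `lo_of`); the real macros in internal-fn.def also tie each function to its optabs, which this sketch leaves out.

```cpp
#include <cassert>

// Hypothetical macro: expands one NAME into five enumerators, mimicking the
// base function plus its hi/lo and even/odd variants.
#define WIDENING_FN_SKETCH(NAME) \
  NAME, NAME##_HI, NAME##_LO, NAME##_EVEN, NAME##_ODD,

enum ifn_sketch {
  WIDENING_FN_SKETCH(SKETCH_VEC_WIDEN_PLUS)
  WIDENING_FN_SKETCH(SKETCH_VEC_WIDEN_MINUS)
  SKETCH_IFN_LAST  // number of enumerators generated above
};

// Hypothetical lookups that rely on the fixed layout produced by the macro:
// the _HI variant follows the base entry, and _LO follows _HI.
inline ifn_sketch hi_of(ifn_sketch base) {
  return static_cast<ifn_sketch>(base + 1);
}
inline ifn_sketch lo_of(ifn_sketch base) {
  return static_cast<ifn_sketch>(base + 2);
}
```

Keeping the variants in a fixed order next to their base entry is one simple way to make a hi/lo lookup cheap; whether the actual lookup function works this way is not stated in the commit message.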
      
      2023-06-05  Andre Vieira  <andre.simoesdiasvieira@arm.com>
      	    Joel Hutton  <joel.hutton@arm.com>
      	    Tamar Christina  <tamar.christina@arm.com>
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename
      	this ...
      	(vec_widen_<su>add_lo_<mode>): ... to this.
      	(vec_widen_<su>addl_hi_<mode>): Rename this ...
      	(vec_widen_<su>add_hi_<mode>): ... to this.
      	(vec_widen_<su>subl_lo_<mode>): Rename this ...
      	(vec_widen_<su>sub_lo_<mode>): ... to this.
      	(vec_widen_<su>subl_hi_<mode>): Rename this ...
      	(vec_widen_<su>sub_hi_<mode>): ...to this.
      	* doc/generic.texi: Document new IFN codes.
      	* internal-fn.cc (lookup_hilo_internal_fn): Add lookup function.
      	(commutative_binary_fn_p): Add widen_plus fn's.
      	(widening_fn_p): New function.
      	(narrowing_fn_p): New function.
      	(direct_internal_fn_optab): Change visibility.
      	* internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
      	internal_fn that expands into multiple internal_fns for widening.
      	(IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
      	IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
      	IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI,
      	IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_ODD,
      	IFN_VEC_WIDEN_MINUS_EVEN): Define widening  plus,minus functions.
      	* internal-fn.h (direct_internal_fn_optab): Declare new prototype.
      	(lookup_hilo_internal_fn): Likewise.
      	(widening_fn_p): Likewise.
      	(narrowing_fn_p): Likewise.
      	* optabs.cc (commutative_optab_p): Add widening plus optabs.
      	* optabs.def (OPTAB_D): Define widen add, sub optabs.
      	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
      	patterns with a hi/lo or even/odd split.
      	(vect_recog_sad_pattern): Refactor to use new IFN codes.
      	(vect_recog_widen_plus_pattern): Likewise.
      	(vect_recog_widen_minus_pattern): Likewise.
      	(vect_recog_average_pattern): Likewise.
      	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
      	_HILO IFNs.
      	(supportable_widening_operation): Likewise.
      	* tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/vect-widen-add.c: Test that new
      	IFN_VEC_WIDEN_PLUS is being used.
      	* gcc.target/aarch64/vect-widen-sub.c: Test that new
      	IFN_VEC_WIDEN_MINUS is being used.
      2f482a07
    • Andre Vieira's avatar
      vect: Refactor to allow internal_fn's · fe29963d
      Andre Vieira authored
      Refactor vect-patterns to allow patterns to be internal_fns, starting
      with the widening_plus/minus patterns.
      
      2023-06-05  Andre Vieira  <andre.simoesdiasvieira@arm.com>
      	    Joel Hutton  <joel.hutton@arm.com>
      
      gcc/ChangeLog:
      	* tree-vect-patterns.cc: Add include for gimple-iterator.
      	(vect_recog_widen_op_pattern): Refactor to use code_helper.
      	(vect_gimple_build): New function.
      	* tree-vect-stmts.cc (simple_integer_narrowing): Refactor to use
      	code_helper.
      	(vectorizable_call): Likewise.
      	(vect_gen_widened_results_half): Likewise.
      	(vect_create_vectorized_demotion_stmts): Likewise.
      	(vect_create_vectorized_promotion_stmts): Likewise.
      	(vect_create_half_widening_stmts): Likewise.
      	(vectorizable_conversion): Likewise.
      	(supportable_widening_operation): Likewise.
      	(supportable_narrowing_operation): Likewise.
      	* tree-vectorizer.h (supportable_widening_operation): Change
      	prototype to use code_helper.
      	(supportable_narrowing_operation): Likewise.
      	(vect_gimple_build): New function prototype.
      	* tree.h (code_helper::safe_as_tree_code): New function.
      	(code_helper::safe_as_fn_code): New function.
      fe29963d
    • Iain Buclaw's avatar
      d: Warn when declared size of a special enum does not match its intrinsic type. · 3ad9313a
      Iain Buclaw authored
      All special enums have declarations in the D runtime library, but the
      compiler will recognize and treat them specially if declared in any
      module.  When the underlying base type of a special enum is a different
      size to its matched intrinsic, then this can cause undefined behavior at
      runtime.  Detect and warn about when such a mismatch occurs.
      
      gcc/d/ChangeLog:
      
      	* gdc.texi (Warnings): Document -Wextra and -Wmismatched-special-enum.
      	* implement-d.texi (Special Enums): Add reference to warning option
      	-Wmismatched-special-enum.
      	* lang.opt: Add -Wextra and -Wmismatched-special-enum.
      	* types.cc (TypeVisitor::visit (TypeEnum *)): Warn when declared
      	special enum size mismatches its intrinsic type.
      
      gcc/testsuite/ChangeLog:
      
      	* gdc.dg/Wmismatched_enum.d: New test.
      3ad9313a
    • Roger Sayle's avatar
      New wi::bitreverse function. · 108ff03b
      Roger Sayle authored
      This patch provides a wide-int implementation of bitreverse, that
      implements both of Richard Sandiford's suggestions from the review at
      https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html of an
      improved API (as a stand-alone function matching the bswap refactoring),
      and an implementation that works with any bit-width precision.
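
What "works with any bit-width precision" means can be shown with a small model. This is a sketch of the operation only, not the wide-int implementation; the function name and signature are assumptions made for illustration.

```python
def bitreverse(value, precision):
    """Reverse the low `precision` bits of `value`.

    Bit i of the input becomes bit (precision - 1 - i) of the result.
    Bits at or above `precision` are ignored.
    """
    if precision <= 0:
        raise ValueError("precision must be positive")
    value &= (1 << precision) - 1
    result = 0
    for i in range(precision):
        if (value >> i) & 1:
            result |= 1 << (precision - 1 - i)
    return result
```

Because the precision is a parameter rather than a fixed word size, the same routine handles odd widths such as 13 or 37 bits, and applying it twice with the same precision returns the original value.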
      
      2023-06-05  Roger Sayle  <roger@nextmovesoftware.com>
      
      gcc/ChangeLog
      	* wide-int.cc (wi::bitreverse_large): New function implementing
      	bit reversal of an integer.
      	* wide-int.h (wi::bitreverse): New (template) function prototype.
      	(bitreverse_large): Prototype helper function/implementation.
      	(wi::bitreverse): New template wrapper around bitreverse_large.
      108ff03b
    • Liao Shihua's avatar
      Testsuite: Fix a fail about xtheadcondmov-indirect-rv64.c · f7f12f0b
      Liao Shihua authored
      I found a failure in the xtheadcondmov-indirect-rv64.c test case and provide a fix for it.
      Following Kito's advice, this patch changes the form of the expected function bodies so that
      they match registers with patterns like *[a-x0-9].
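
The idea of making an expected-assembly pattern less sensitive to register allocation can be sketched as follows. The commit message abbreviates the pattern, so the regular expressions and the sample instruction lines here are assumptions for illustration, not the contents of the test file.

```python
import re

# Hypothetical strict pattern: hard-codes the registers the compiler chose.
STRICT = re.compile(r"th\.mveqz\s+a0,\s*a1,\s*a2")

# Hypothetical generalized pattern: any register name built from the
# character class [a-x0-9] is accepted in each operand position.
GENERAL = re.compile(r"th\.mveqz\s+[a-x0-9]+,\s*[a-x0-9]+,\s*[a-x0-9]+")


def line_matches(pattern, line):
    """Return True if the whole (stripped) assembly line matches."""
    return pattern.fullmatch(line.strip()) is not None
```

With the generalized form, a line such as `th.mveqz a5,t1,s2` still matches even though the register allocator picked different registers, whereas the strict form accepts only `a0,a1,a2`.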
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Generalize to be
      	less sensitive to register allocation choices.
      	* gcc.target/riscv/xtheadcondmov-indirect-rv64.c: Similarly.
      f7f12f0b
    • Uros Bizjak's avatar
      print-rtl: Change return type of two print functions from int to void · 8e1e1fc4
      Uros Bizjak authored
      Also change one internal variable to bool.
      
      gcc/ChangeLog:
      
      	* rtl.h (print_rtl_single): Change return type from int to void.
      	(print_rtl_single_with_indent): Ditto.
      	* print-rtl.h (class rtx_writer): Ditto.  Change m_sawclose to bool.
      	* print-rtl.cc (rtx_writer::rtx_writer): Update for m_sawclose change.
      	(rtx_writer::print_rtx_operand_code_0): Ditto.
      	(rtx_writer::print_rtx_operand_codes_E_and_V): Ditto.
      	(rtx_writer::print_rtx_operand_code_i): Ditto.
      	(rtx_writer::print_rtx_operand_code_u): Ditto.
      	(rtx_writer::print_rtx_operand): Ditto.
      	(rtx_writer::print_rtx): Ditto.
      	(rtx_writer::finish_directive): Ditto.
      	(print_rtl_single): Change return type from int to void
      	and adjust function body accordingly.
      	(rtx_writer::print_rtl_single_with_indent): Ditto.
      8e1e1fc4
    • Uros Bizjak's avatar
      reginfo: Change return type of predicate functions from int to bool · d015c658
      Uros Bizjak authored
      gcc/ChangeLog:
      
      	* rtl.h (reg_classes_intersect_p): Change return type from int to bool.
      	(reg_class_subset_p): Ditto.
      	* reginfo.cc (reg_classes_intersect_p): Ditto.
      	(reg_class_subset_p): Ditto.
      d015c658
    • Costas Argyris's avatar
      libiberty: pex-win32.c: Fix some typos. · 7ee22dc8
      Costas Argyris authored
      
      libiberty/ChangeLog:
      
      	* pex-win32.c: Fix typos.
      
      Signed-off-by: Costas Argyris <costas.argyris@gmail.com>
      Signed-off-by: Jonathan Yong <10walls@gmail.com>
      7ee22dc8
    • Pan Li's avatar
      RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API · 71ea7a30
      Pan Li authored
      
      This patch supports the intrinsic API for FP16 ZVFH floating-point, i.e.
      SEW=16, for the instructions below:
      
      vfadd vfsub vfrsub vfwadd vfwsub
      vfmul vfdiv vfrdiv vfwmul
      vfmacc vfnmacc vfmsac vfnmsac vfmadd
      vfnmadd vfmsub vfnmsub vfwmacc vfwnmacc vfwmsac vfwnmsac
      vfsqrt vfrsqrt7 vfrec7
      vfmin vfmax
      vfsgnj vfsgnjn vfsgnjx
      vmfeq vmfne vmflt vmfle vmfgt vmfge
      vfclass vfmerge
      vfmv
      vfcvt vfwcvt vfncvt
      
      Users can then use the intrinsic APIs to perform FP16-related
      operations. Please note that not all of the intrinsic APIs are covered in
      the test files; only some typical ones were picked because there are too
      many. Complete tests for the FP16-related intrinsic APIs will follow soon.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv-vector-builtins-types.def
      	(vfloat32mf2_t): New type for DEF_RVV_WEXTF_OPS.
      	(vfloat32m1_t): Ditto.
      	(vfloat32m2_t): Ditto.
      	(vfloat32m4_t): Ditto.
      	(vfloat32m8_t): Ditto.
      	(vint16mf4_t): New type for DEF_RVV_CONVERT_I_OPS.
      	(vint16mf2_t): Ditto.
      	(vint16m1_t): Ditto.
      	(vint16m2_t): Ditto.
      	(vint16m4_t): Ditto.
      	(vint16m8_t): Ditto.
      	(vuint16mf4_t): New type for DEF_RVV_CONVERT_U_OPS.
      	(vuint16mf2_t): Ditto.
      	(vuint16m1_t): Ditto.
      	(vuint16m2_t): Ditto.
      	(vuint16m4_t): Ditto.
      	(vuint16m8_t): Ditto.
      	(vint32mf2_t): New type for DEF_RVV_WCONVERT_I_OPS.
      	(vint32m1_t): Ditto.
      	(vint32m2_t): Ditto.
      	(vint32m4_t): Ditto.
      	(vint32m8_t): Ditto.
      	(vuint32mf2_t): New type for DEF_RVV_WCONVERT_U_OPS.
      	(vuint32m1_t): Ditto.
      	(vuint32m2_t): Ditto.
      	(vuint32m4_t): Ditto.
      	(vuint32m8_t): Ditto.
      	* config/riscv/vector-iterators.md: Add FP=16 support for V,
      	VWCONVERTI, VCONVERT, VNCONVERT, VMUL1 and vlmul1.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
      71ea7a30
    • Costas Argyris's avatar
      libiberty: On Windows, pass a >32k cmdline through a response file. · 180ebb8a
      Costas Argyris authored
      
      pex-win32.c (win32_spawn): If the command line for CreateProcess
      exceeds the 32k Windows limit, try to store it in a temporary
      response file and call CreateProcess with @file instead (PR71850).
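
The fallback described above can be sketched like this. It is a simplified model, not the libiberty code: the helper names, the exact limit constant, the `.rsp` suffix and the escaping rules are assumptions for illustration.

```python
import os
import tempfile

# Assumed limit, taken from the "32k" figure in the commit message.
CMDLINE_LIMIT = 32768


def escape_arg(arg):
    """Backslash-escape whitespace, quotes and backslashes (assumed format)."""
    if arg == "":
        return "''"
    return "".join("\\" + c if c.isspace() or c in "\\'\"" else c for c in arg)


def build_command(program, args, limit=CMDLINE_LIMIT):
    """Return (command_line, response_file_path_or_None).

    If the joined command line fits within `limit`, it is returned as is.
    Otherwise the arguments are written to a temporary response file, one
    per line, and the command line becomes `program @file`.
    The caller is responsible for deleting the response file.
    """
    cmdline = " ".join([program] + list(args))
    if len(cmdline) <= limit:
        return cmdline, None
    fd, path = tempfile.mkstemp(suffix=".rsp", text=True)
    with os.fdopen(fd, "w") as rsp:
        for arg in args:
            rsp.write(escape_arg(arg) + "\n")
    return f"{program} @{path}", path
```

The point of the design is that the spawned program only ever sees a short command line; it is expected to expand `@file` itself, so nothing changes for command lines that already fit.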
      
      Signed-off-by: Costas Argyris <costas.argyris@gmail.com>
      Signed-off-by: Jonathan Yong <10walls@gmail.com>
      
      libiberty/ChangeLog:
      
      	* pex-win32.c (win32_spawn): Check command line length
      	and generate a response file if necessary.
      	(spawn_script): Adjust parameters.
      	(pex_win32_exec_child): Ditto.
      
      Signed-off-by: Jonathan Yong <10walls@gmail.com>
      180ebb8a
    • Andrew Pinski's avatar
      Fix PR 110085: `make clean` in GCC directory on sh target causes a failure · afd87299
      Andrew Pinski authored
      On the sh target, there is a MULTILIB_DIRNAMES (or is it MULTILIB_OPTIONS) entry named m2,
      which conflicts with the language m2. So when you do a `make clean`, it removes
      the m2 directory and a subsequent build fails. Since r0-78222-gfa9585134f6f58,
      the multilib directories are no longer created in the gcc directory, because libgcc
      was moved to the toplevel. So we can remove the part of clean that removes those
      directories.
      
      Tested on x86_64-linux-gnu and a cross to sh-elf that `make clean` followed by
      `make` works again.
      
      OK?
      
      gcc/ChangeLog:
      
      	PR bootstrap/110085
      	* Makefile.in (clean): Remove the removing of
      	MULTILIB_DIR/MULTILIB_OPTIONS directories.
      afd87299
    • Kewen Lin's avatar
      libgcc: Use initarray section type for .init_array · 83c3550e
      Kewen Lin authored
      One of my workmates found there is a warning like:
      
        libgcc/config/rs6000/morestack.S:402: Warning: ignoring
          incorrect section type for .init_array.00000
      
      when compiling libgcc/config/rs6000/morestack.S.
      
      Since commit r13-6545 touched that file recently and was
      suspected to be responsible for this warning, I did some
      investigation and found that the warning has been present for
      a long time.  For section .init_array*, it's preferred to use
      section type SHT_INIT_ARRAY.  So this patch uses
      "@init_array" in place of "@progbits".
      
      Although the warning is trivial, Segher suggested that I
      post this fix, in order to avoid any possible
      misunderstanding/confusion about the warning.
      
      As Alan confirmed, this doesn't require first checking
      whether the existing binutils supports "@init_array" or not,
      "because if you want split-stack to work, you must link
      with gold, any version of binutils that has gold has an
      assembler that understands @init_array". (Thanks Alan!)
      
      libgcc/ChangeLog:
      
      	* config/i386/morestack.S: Use @init_array rather than
      	@progbits for section type of section .init_array.
      	* config/rs6000/morestack.S: Likewise.
      	* config/s390/morestack.S: Likewise.
      83c3550e