  1. Jan 28, 2023
  2. Jan 27, 2023
  3. Jan 24, 2023
  4. Jan 23, 2023
  5. Jan 20, 2023
  6. Jan 19, 2023
    • Jakub Jelinek's avatar
      openmp: Fix up OpenMP expansion of non-rectangular loops [PR108459] · 46644ec9
      Jakub Jelinek authored
      expand_omp_for_init_counts was using fold_unary (NEGATE_EXPR, ...) for
      the case where the collapse(2) inner loop has an init expression
      dependent on a non-constant multiple of the outer iterator and the
      condition's upper bound expression doesn't depend on the outer
      iterator.  fold_unary just returns NULL if the expression can't be
      folded; we need fold_build1 instead.
      
      2023-01-19  Jakub Jelinek  <jakub@redhat.com>
      
      	PR middle-end/108459
      	* omp-expand.cc (expand_omp_for_init_counts): Use fold_build1 rather
      	than fold_unary for NEGATE_EXPR.
      
      	* testsuite/libgomp.c/pr108459.c: New test.
      46644ec9
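The loop shape described in this commit message can be sketched in C as follows. This is an illustrative example only, not the contents of pr108459.c; the function names and the parameters `n`, `m` and `ub` are made up for the sketch. Compiled without `-fopenmp`, the pragma is ignored and the nest runs serially with the same result.

```c
#include <assert.h>

/* Non-rectangular collapse(2) nest: the inner lower bound is a
   non-constant multiple (m) of the outer iterator, while the inner
   upper bound (ub) does not depend on the outer iterator.  */
int count_nonrect(int n, int m, int ub)
{
  int count = 0;
#pragma omp parallel for collapse(2) reduction(+ : count)
  for (int i = 0; i < n; i++)
    for (int j = m * i; j < ub; j++)
      count++;
  return count;
}

/* Serial reference with no OpenMP directive at all.  */
int count_reference(int n, int m, int ub)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    for (int j = m * i; j < ub; j++)
      count++;
  return count;
}

/* Both versions must agree on the number of iterations.  */
void check_nonrect(void)
{
  assert(count_nonrect(4, 2, 8) == count_reference(4, 2, 8));
}
```

For `n = 4`, `m = 2`, `ub = 8` the inner loop runs 8, 6, 4 and 2 times, so both functions return 20.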
  7. Jan 18, 2023
  8. Jan 17, 2023
    • Martin Liska's avatar
      Regenerate Makefile.in files. · 42bf66e4
      Martin Liska authored
      libbacktrace/ChangeLog:
      
      	* Makefile.in: Regenerate.
      
      libgomp/ChangeLog:
      
      	* Makefile.in: Regenerate.
      	* configure: Regenerate.
      
      libphobos/ChangeLog:
      
      	* Makefile.in: Regenerate.
      	* libdruntime/Makefile.in: Regenerate.
      
      libstdc++-v3/ChangeLog:
      
      	* src/libbacktrace/Makefile.in: Regenerate.
      42bf66e4
  9. Jan 16, 2023
  10. Jan 08, 2023
  11. Jan 07, 2023
    • LIU Hao's avatar
      Always define `WIN32_LEAN_AND_MEAN` before <windows.h> · 902c7559
      LIU Hao authored
      mingw-w64 recently received an updated <msxml.h> from Wine, which is included
      indirectly by <windows.h> if `WIN32_LEAN_AND_MEAN` is not defined. The
      `IXMLDOMDocument` class has a member function named `abort()`, which gets
      affected by our `abort()` macro in "system.h".
      
      `WIN32_LEAN_AND_MEAN` should, nevertheless, always be defined. This
      can exclude 'APIs such as Cryptography, DDE, RPC, Shell, and Windows
      Sockets' [1], and speed up compilation of these files a bit.
      
      [1] https://learn.microsoft.com/en-us/windows/win32/winprog/using-the-windows-headers
      
      gcc/
      
      	PR middle-end/108300
      	* config/xtensa/xtensa-dynconfig.c: Define `WIN32_LEAN_AND_MEAN`
      	before <windows.h>.
      	* diagnostic-color.cc: Likewise.
      	* plugin.cc: Likewise.
      	* prefix.cc: Likewise.
      
      gcc/ada/
      
      	PR middle-end/108300
      	* adaint.c: Define `WIN32_LEAN_AND_MEAN` before `#include
      	<windows.h>`.
      	* cio.c: Likewise.
      	* ctrl_c.c: Likewise.
      	* expect.c: Likewise.
      	* gsocket.h: Likewise.
      	* mingw32.h: Likewise.
      	* mkdir.c: Likewise.
      	* rtfinal.c: Likewise.
      	* rtinit.c: Likewise.
      	* seh_init.c: Likewise.
      	* sysdep.c: Likewise.
      	* terminals.c: Likewise.
      	* tracebak.c: Likewise.
      
      gcc/jit/
      
      	PR middle-end/108300
      	* jit-w32.h: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
      
      libatomic/
      
      	PR middle-end/108300
      	* config/mingw/lock.c: Define `WIN32_LEAN_AND_MEAN` before
      	<windows.h>.
      
      libffi/
      
      	PR middle-end/108300
      	* src/aarch64/ffi.c: Define `WIN32_LEAN_AND_MEAN` before
      	<windows.h>.
      
      libgcc/
      
      	PR middle-end/108300
      	* config/i386/enable-execute-stack-mingw32.c: Define
      	`WIN32_LEAN_AND_MEAN` before <windows.h>.
      	* libgcc2.c: Likewise.
      	* unwind-generic.h: Likewise.
      
      libgfortran/
      
      	PR middle-end/108300
      	* intrinsics/sleep.c: Define `WIN32_LEAN_AND_MEAN` before
      	<windows.h>.
      
      libgomp/
      
      	PR middle-end/108300
      	* config/mingw32/proc.c: Define `WIN32_LEAN_AND_MEAN` before
      	<windows.h>.
      
      libiberty/
      
      	PR middle-end/108300
      	* make-temp-file.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
      	* pex-win32.c: Likewise.
      
      libssp/
      
      	PR middle-end/108300
      	* ssp.c: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
      
      libstdc++-v3/
      
      	PR middle-end/108300
      	* src/c++11/system_error.cc: Define `WIN32_LEAN_AND_MEAN` before
      	<windows.h>.
      	* src/c++11/thread.cc: Likewise.
      	* src/c++17/fs_ops.cc: Likewise.
      	* src/filesystem/ops.cc: Likewise.
      
      libvtv/
      
      	PR middle-end/108300
      	* vtv_malloc.cc: Define `WIN32_LEAN_AND_MEAN` before <windows.h>.
      	* vtv_rts.cc: Likewise.
      	* vtv_utils.cc: Likewise.
      902c7559
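The pattern applied across these files can be sketched as below. The `#ifndef` guard and the helper function are assumptions added for illustration; the commit message only states that the macro is defined before `<windows.h>` is included.

```c
#include <assert.h>

/* Define the macro before <windows.h> is pulled in.  The #ifndef guard
   (an assumption of this sketch) avoids a redefinition warning when the
   build system already passes -DWIN32_LEAN_AND_MEAN.  */
#ifndef WIN32_LEAN_AND_MEAN
# define WIN32_LEAN_AND_MEAN
#endif

#ifdef _WIN32
# include <windows.h>  /* rarely used sub-headers are now left out */
#endif

/* Hypothetical helper: reports whether the macro is visible here.  */
int lean_and_mean_defined(void)
{
#ifdef WIN32_LEAN_AND_MEAN
  return 1;
#else
  return 0;
#endif
}

void check_lean_and_mean(void)
{
  assert(lean_and_mean_defined());
}
```

On non-Windows hosts the define is harmless and the include is simply skipped.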
  12. Jan 06, 2023
  13. Jan 05, 2023
    • Jakub Jelinek's avatar
      openmp: Fix up finish_omp_target_clauses [PR108286] · 29c32186
      Jakub Jelinek authored
      The comment in the loop says that we shouldn't add a map clause if such
      a clause exists already, but the loop was actually using OMP_CLAUSE_DECL
      on any clause.  Target construct can have various clauses which don't
      have OMP_CLAUSE_DECL at all (e.g. nowait, device or if) or clause
      where it means something different (e.g. privatization clauses, allocate,
      depend).
      
      So, only check OMP_CLAUSE_DECL on OMP_CLAUSE_MAP clauses.
      
      2023-01-05  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/108286
      	* semantics.cc (finish_omp_target_clauses): Ignore clauses other than
      	OMP_CLAUSE_MAP.
      
      	* testsuite/libgomp.c++/pr108286.C: New test.
      29c32186
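The idea behind this fix can be sketched with a toy clause list. Everything here (`struct clause`, `enum clause_kind`, `already_mapped`) is hypothetical and much simpler than GCC's tree representation; it only shows why the declaration field should be inspected on map clauses alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for OMP_CLAUSE_* codes.  */
enum clause_kind { CLAUSE_MAP, CLAUSE_NOWAIT, CLAUSE_FIRSTPRIVATE };

struct clause {
  enum clause_kind kind;
  const void *decl;      /* means "mapped variable" only for CLAUSE_MAP */
  struct clause *next;
};

/* Return 1 if VAR already has a map clause in CLAUSES.  */
int already_mapped(const struct clause *clauses, const void *var)
{
  for (const struct clause *c = clauses; c != NULL; c = c->next)
    {
      if (c->kind != CLAUSE_MAP)
        continue;        /* skip clauses whose decl means something else */
      if (c->decl == var)
        return 1;
    }
  return 0;
}

/* firstprivate(x) nowait map(y): only y counts as mapped.  */
int demo_already_mapped(void)
{
  int x = 0, y = 0;
  struct clause map_y = { CLAUSE_MAP, &y, NULL };
  struct clause nowait = { CLAUSE_NOWAIT, NULL, &map_y };
  struct clause fp_x = { CLAUSE_FIRSTPRIVATE, &x, &nowait };
  assert(!already_mapped(&fp_x, &x));
  return already_mapped(&fp_x, &y);
}
```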
  14. Jan 03, 2023
  15. Jan 02, 2023
    • Jakub Jelinek's avatar
      Update copyright dates. · 74d5206f
      Jakub Jelinek authored
      Manual part of copyright year updates.
      
      2023-01-02  Jakub Jelinek  <jakub@redhat.com>
      
      gcc/
      	* gcc.cc (process_command): Update copyright notice dates.
      	* gcov-dump.cc (print_version): Ditto.
      	* gcov.cc (print_version): Ditto.
      	* gcov-tool.cc (print_version): Ditto.
      	* gengtype.cc (create_file): Ditto.
      	* doc/cpp.texi: Bump @copying's copyright year.
      	* doc/cppinternals.texi: Ditto.
      	* doc/gcc.texi: Ditto.
      	* doc/gccint.texi: Ditto.
      	* doc/gcov.texi: Ditto.
      	* doc/install.texi: Ditto.
      	* doc/invoke.texi: Ditto.
      gcc/ada/
      	* gnat_ugn.texi: Bump @copying's copyright year.
      	* gnat_rm.texi: Likewise.
      gcc/d/
      	* gdc.texi: Bump @copyrights-d year.
      gcc/fortran/
      	* gfortranspec.cc (lang_specific_driver): Update copyright notice
      	dates.
      	* gfc-internals.texi: Bump @copying's copyright year.
      	* gfortran.texi: Ditto.
      	* intrinsic.texi: Ditto.
      	* invoke.texi: Ditto.
      gcc/go/
      	* gccgo.texi: Bump @copyrights-go year.
      libgomp/
      	* libgomp.texi: Bump @copying's copyright year.
      libitm/
      	* libitm.texi: Bump @copying's copyright year.
      libquadmath/
      	* libquadmath.texi: Bump @copying's copyright year.
      74d5206f
    • Jakub Jelinek's avatar
      Update Copyright year in ChangeLog files · 68127a8e
      Jakub Jelinek authored
      2022 -> 2023
      68127a8e
  16. Dec 22, 2022
  17. Dec 21, 2022
    • Chung-Lin Tang's avatar
      nvptx: reimplement libgomp barriers [PR99555] · fdc7469c
      Chung-Lin Tang authored
      Instead of trying to have the GPU do CPU-with-OS-like things, this new barriers
      implementation for NVPTX uses simplistic bar.* synchronization instructions.
      Tasks are processed after threads have joined, and only if team->task_count != 0.
      
      It is noted that: there might be a little bit of performance forfeited for
      cases where earlier arriving threads could've been used to process tasks ahead
      of other threads, but that has the requirement of implementing complex
      futex-wait/wake like behavior, which is what we're trying to avoid with this patch.
      It is deemed that task processing is not what GPU target offloading is usually
      used for.
      
      Implementation highlight notes:
      1. gomp_team_barrier_wake() is now an empty function (threads never "wake" in
         the usual manner)
      2. gomp_team_barrier_cancel() now uses the "exit" PTX instruction.
      3. gomp_barrier_wait_last() now is implemented using "bar.arrive"
      
      4. gomp_team_barrier_wait_end()/gomp_team_barrier_wait_cancel_end():
         The main synchronization is done using a 'bar.red' instruction. This reduces
         across all threads the condition (team->task_count != 0), to enable the task
         processing down below if any thread created a task.
         (this bar.red usage means that this patch is dependent on the prior NVPTX
         bar.red GCC patch)
      
      	PR target/99555
      
      libgomp/ChangeLog:
      
      	* config/nvptx/bar.c (generation_to_barrier): Remove.
      	(futex_wait,futex_wake,do_spin,do_wait): Remove.
      	(GOMP_WAIT_H): Remove.
      	(#include "../linux/bar.c"): Remove.
      	(gomp_barrier_wait_end): New function.
      	(gomp_barrier_wait): Likewise.
      	(gomp_barrier_wait_last): Likewise.
      	(gomp_team_barrier_wait_end): Likewise.
      	(gomp_team_barrier_wait): Likewise.
      	(gomp_team_barrier_wait_final): Likewise.
      	(gomp_team_barrier_wait_cancel_end): Likewise.
      	(gomp_team_barrier_wait_cancel): Likewise.
      	(gomp_team_barrier_cancel): Likewise.
      	* config/nvptx/bar.h (gomp_barrier_t): Remove waiters, lock fields.
      	(gomp_barrier_init): Remove init of waiters, lock fields.
      	(gomp_team_barrier_wake): Remove prototype, add new static inline
      	function.
      fdc7469c
    • Jakub Jelinek's avatar
      openmp: Don't try to destruct DECL_OMP_PRIVATIZED_MEMBER vars [PR108180] · 1119902b
      Jakub Jelinek authored
      DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR
      of this->field used just during gimplification and omp lowering/expansion
      to privatize individual fields in methods when needed.
      As the following testcase shows, when not in templates, they were handled
      right, but in templates we actually called cp_finish_decl on them and
      that can result in their destruction, which is obviously undesirable,
      we should only destruct the privatized copies of them created in omp
      lowering.
      
      Fixed thusly.
      
      2022-12-21  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/108180
      	* pt.cc (tsubst_expr): Don't call cp_finish_decl on
      	DECL_OMP_PRIVATIZED_MEMBER vars.
      
      	* testsuite/libgomp.c++/pr108180.C: New test.
      1119902b
  18. Dec 17, 2022
  19. Dec 16, 2022
  20. Dec 15, 2022
    • Tobias Burnus's avatar
      libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056] · e205ec03
      Tobias Burnus authored
      Since GCC 12, the conversion between the array descriptors formats - the
      internal (GFC) and the C binding one (CFI) - moved to the compiler itself
      such that the cfi_desc_to_gfc_desc/gfc_desc_to_cfi_desc functions are only
      used with older code (GCC 9 to 11).  The newly added checks caused asserts
      as older code did not pass the proper values (e.g. real(4) as effective
      argument arrived as BT_ASSUME type as the effective type got lost in between).
      
      As proposed in the PR, revert to the GCC 11 version - known bugs are better
      than some fixes and new issues. Still, GCC 12 is much better in terms of
      TS29113 support and should really be used.
      
      This patch uses the current libgomp version of the GCC 11 branch, except
      it fixes the GFC version number (which is 0), uses calloc instead of malloc,
      and sets the lower bound to 1 instead of keeping it as is for
      CFI_attribute_other.
      
      libgfortran/ChangeLog:
      
      	PR libfortran/108056
      	* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc,
      	gfc_desc_to_cfi_desc): Mostly revert to GCC 11 version for
      	those backward-compatibility-only functions.
      e205ec03
    • GCC Administrator's avatar
      Daily bump. · 26f4aefa
      GCC Administrator authored
      26f4aefa
  21. Dec 14, 2022
    • Julian Brown's avatar
      OpenMP/Fortran: Combined directives with map/firstprivate of same symbol · 9316ad3b
      Julian Brown authored
      This patch fixes a case where a combined directive (e.g. "!$omp target
      parallel ...") contains both a map and a firstprivate clause for the
      same variable.  When the combined directive is split into two nested
      directives, the outer "target" gets the "map" clause, and the inner
      "parallel" gets the "firstprivate" clause, like so:
      
        !$omp target parallel map(x) firstprivate(x)
      
        -->
      
        !$omp target map(x)
          !$omp parallel firstprivate(x)
            ...
      
      When there is no map of the same variable, the firstprivate is distributed
      to both directives, e.g. for 'y' in:
      
        !$omp target parallel map(x) firstprivate(y)
      
        -->
      
        !$omp target map(x) firstprivate(y)
          !$omp parallel firstprivate(y)
            ...
      
      This is not a recent regression, but appears to fix a long-standing ICE.
      (The included testcase is based on one by Tobias.)
      
      2022-12-06  Julian Brown  <julian@codesourcery.com>
      
      gcc/fortran/
      	* trans-openmp.cc (gfc_add_firstprivate_if_unmapped): New function.
      	(gfc_split_omp_clauses): Call above.
      
      libgomp/
      	* testsuite/libgomp.fortran/combined-directive-splitting-1.f90: New
      	test.
      9316ad3b
  22. Dec 11, 2022
  23. Dec 10, 2022
    • Tobias Burnus's avatar
      libgomp: Handle OpenMP's reverse offloads · ea4b23d9
      Tobias Burnus authored
      This commit enables reverse offload for nvptx such that gomp_target_rev
      actually gets called.  And it fills the latter function to do all of
      the following: finding the host function to the device func ptr and
      copying the arguments to the host, processing the mapping/firstprivate,
      calling the host function, copying back the data and freeing as needed.
      
      The data handling is made easier by assuming that all host variables
      either existed before (and are in the mapping) or that those are
      device variables not yet available on the host. Thus, the reverse
      mapping can do without refcounts etc. Note that the spec disallows
      inside a target region device-affecting constructs other than target
      plus ancestor device-modifier and it also limits the clauses permitted
      on this construct.
      
      For the function addresses, an additional splay tree is used; for
      the lookup of mapped variables, the existing splay-tree is used.
      Unfortunately, its data structure requires a full walk of the tree.
      Additionally, the just-mapped variables are recorded in a separate
      data structure, which requires an extra lookup. While the lookup is slow, assuming
      that only few variables get mapped in each reverse offload construct
      and that reverse offload is the exception and not performance critical,
      this seems to be acceptable.
      
      libgomp/ChangeLog:
      
      	* libgomp.h (struct target_mem_desc): Predeclare; move
      	below after 'reverse_splay_tree_node' and add rev_array
      	member.
      	(struct reverse_splay_tree_key_s, reverse_splay_compare): New.
      	(reverse_splay_tree_node, reverse_splay_tree,
      	reverse_splay_tree_key): New typedef.
      	(struct gomp_device_descr): Add mem_map_rev member.
      	* oacc-host.c (host_dispatch): NULL init .mem_map_rev.
      	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim
      	support for GOMP_REQUIRES_REVERSE_OFFLOAD.
      	* splay-tree.h (splay_tree_callback_stop): New typedef; like
      	splay_tree_callback but returning int not void.
      	(splay_tree_foreach_lazy): Define; like splay_tree_foreach but
      	taking splay_tree_callback_stop as argument.
      	* splay-tree.c (splay_tree_foreach_internal_lazy,
      	splay_tree_foreach_lazy): New; but early exit if callback returns
      	nonzero.
      	* target.c: Instantiate splay_tree_c with splay_tree_prefix 'reverse'.
      	(gomp_map_lookup_rev): New.
      	(gomp_load_image_to_device): Handle reverse-offload function
      	lookup table.
      	(gomp_unload_image_from_device): Free devicep->mem_map_rev.
      	(struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup,
      	gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int,
      	gomp_map_cdata_lookup): New auxiliary structs and functions for
      	gomp_target_rev.
      	(gomp_target_rev): Implement reverse offloading and its mapping.
      	(gomp_target_init): Init current_device.mem_map_rev.root.
      	* testsuite/libgomp.fortran/reverse-offload-2.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-3.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-4.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-5.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without
      	mapping of on-device allocated variables.
      ea4b23d9
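The ChangeLog above describes `splay_tree_foreach_lazy` as a walk that exits early once the callback returns nonzero. A minimal sketch of that idea on a plain binary tree follows; the node layout, the in-order traversal and all names (`foreach_lazy`, `callback_stop`, `struct lookup`) are assumptions of this example, not libgomp's actual code.

```c
#include <assert.h>
#include <stddef.h>

struct node { int key; struct node *left, *right; };

/* Like a foreach callback, but returns int: nonzero means "stop".  */
typedef int (*callback_stop)(int key, void *data);

/* In-order walk that returns 1 as soon as the callback asks to stop.  */
static int foreach_lazy(const struct node *n, callback_stop cb, void *data)
{
  if (n == NULL)
    return 0;
  if (foreach_lazy(n->left, cb, data))
    return 1;
  if (cb(n->key, data))
    return 1;
  return foreach_lazy(n->right, cb, data);
}

struct lookup { int wanted; int visited; int found; };

static int lookup_cb(int key, void *data)
{
  struct lookup *l = data;
  l->visited++;
  if (key == l->wanted)
    {
      l->found = 1;
      return 1;          /* match: no need to visit further nodes */
    }
  return 0;
}

/* Tree with keys 1, 2, 3; looking for 2 stops before node 3 is visited.  */
int demo_lazy_walk(void)
{
  struct node a = { 1, NULL, NULL }, c = { 3, NULL, NULL };
  struct node b = { 2, &a, &c };
  struct lookup l = { 2, 0, 0 };
  foreach_lazy(&b, lookup_cb, &l);
  assert(l.found);
  return l.visited;
}
```

In this sketch `demo_lazy_walk` returns 2: nodes 1 and 2 are visited, node 3 is not.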
    • GCC Administrator's avatar
      Daily bump. · 40ce6485
      GCC Administrator authored
      40ce6485
  24. Dec 09, 2022
    • Tobias Burnus's avatar
      Fortran/OpenMP: align/allocator modifiers to the allocate clause · b2e1c49b
      Tobias Burnus authored
      gcc/fortran/ChangeLog:
      
      	* dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE
      	output.
      	* gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'.
      	(gfc_free_omp_namelist): Add bool arg.
      	* match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'.
      	* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction,
      	gfc_match_omp_flush): Update call.
      	(gfc_match_omp_clauses): Match 'align'/'allocator' modifiers in
      	'allocate' clause.
      	(resolve_omp_clauses): Resolve align.
      	* st.cc (gfc_free_statement): Update call.
      	* trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'.
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (5.1 Impl. Status): Split allocate clause/directive
      	item about 'align'; mark clause as 'Y' and directive as 'N'.
      	* testsuite/libgomp.fortran/allocate-2.f90: New test.
      	* testsuite/libgomp.fortran/allocate-3.f90: New test.
      b2e1c49b
  25. Dec 07, 2022
  26. Dec 06, 2022
    • Marcel Vollweiler's avatar
      OpenMP: omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices · 81476bc4
      Marcel Vollweiler authored
      This patch adds support for omp_get_max_teams, omp_set_num_teams, and
      omp_{gs}et_teams_thread_limit on offload devices. That includes the usage of
      device-specific ICV values (specified as environment variables or changed on a
      device). In order to reuse device-specific ICV values, a copy back mechanism is
      implemented that copies ICV values back from device to the host.
      
      Additionally, a limitation of the number of teams on gcn offload devices is
      implemented.  The number of teams is limited by twice the number of compute
      units (one team is executed on one compute unit).  This avoids queueing
      unnecessarily many teams and a corresponding allocation of large amounts of
      memory.  Without that limitation the memory allocation for a large number of
      user-specified teams can result in a "memory access fault".
      A limitation of the number of teams is already also implemented for nvptx
      devices (see nvptx_adjust_launch_bounds in libgomp/plugin/plugin-nvptx.c).
      
      gcc/ChangeLog:
      
      	* gimplify.cc (optimize_target_teams): Set initial num_teams_upper
      	to "-2" instead of "1" for non-existing num_teams clause in order to
      	disambiguate from the case of an existing num_teams clause with value 1.
      
      libgomp/ChangeLog:
      
      	* config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to
      	allow processing of device-specific values.
      	(omp_set_teams_thread_limit): Likewise.
      	(ialias): Likewise.
      	* config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise.
      	(omp_set_teams_thread_limit): Likewise.
      	(ialias): Likewise.
      	* icv-device.c (omp_get_teams_thread_limit): Likewise.
      	(ialias): Likewise.
      	(omp_set_teams_thread_limit): Likewise.
      	* icv.c (omp_set_teams_thread_limit): Removed.
      	(omp_get_teams_thread_limit): Likewise.
      	(ialias): Likewise.
      	* libgomp.texi: Updated documentation for nvptx and gcn corresponding
      	to the limitation of the number of teams.
      	* plugin/plugin-gcn.c (limit_teams): New helper function that limits
      	the number of teams by twice the number of compute units.
      	(parse_target_attributes): Limit the number of teams on gcn offload
      	devices.
      	* target.c (get_gomp_offload_icvs): Added teams_thread_limit_var
      	handling.
      	(gomp_load_image_to_device): Added a size check for the ICVs struct
      	variable.
      	(gomp_copy_back_icvs): New function that is used in GOMP_target_ext to
      	copy back the ICV values from device to host.
      	(GOMP_target_ext): Update the number of teams and threads in the kernel
      	args also considering device-specific values.
      	* testsuite/libgomp.c-c++-common/icv-4.c: Fixed an error in the reading
      	of OMP_TEAMS_THREAD_LIMIT from the environment.
      	* testsuite/libgomp.c-c++-common/icv-5.c: Extended.
      	* testsuite/libgomp.c-c++-common/icv-6.c: Extended.
      	* testsuite/libgomp.c-c++-common/icv-7.c: Extended.
      	* testsuite/libgomp.c-c++-common/icv-9.c: New test.
      	* testsuite/libgomp.fortran/icv-5.f90: New test.
      	* testsuite/libgomp.fortran/icv-6.f90: New test.
      
      gcc/testsuite/ChangeLog:
      
      	* c-c++-common/gomp/target-teams-1.c: Adapt expected values for
      	num_teams from "1" to "-2" in cases without num_teams clause.
      	* g++.dg/gomp/target-teams-1.C: Likewise.
      	* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
      	* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
      	* gfortran.dg/gomp/defaultmap-6.f90: Likewise.
      81476bc4
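The team limit described in this commit (at most twice the number of compute units) amounts to a simple clamp. The sketch below is an assumption about the arithmetic only; the name `limit_teams_sketch` and its signature are hypothetical and do not reproduce the `limit_teams` helper in plugin-gcn.c.

```c
#include <assert.h>

/* Clamp a requested team count to twice the number of compute units.  */
int limit_teams_sketch(int requested_teams, int compute_units)
{
  assert(compute_units > 0);
  int max_teams = 2 * compute_units;
  return requested_teams > max_teams ? max_teams : requested_teams;
}
```

With 60 compute units, a request for 1000 teams would be reduced to 120, while a request for 8 teams is left unchanged.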
    • Tobias Burnus's avatar
      libgomp.texi: Fix an OpenMP 5.2 and a TR11 impl-status item · 9f80367e
      Tobias Burnus authored
      libgomp/
      	* libgomp.texi (OpenMP 5.2): Add missing 'the'.
      	(TR11): Add missing '@tab N @tab'.
      9f80367e
  27. Dec 01, 2022
  28. Nov 30, 2022
    • Tobias Burnus's avatar
      libgomp.texi: List GCN's 'gfx803' under OpenMP Context Selectors · e0b95c2e
      Tobias Burnus authored
      libgomp/ChangeLog:
      
      	* libgomp.texi (OpenMP Context Selectors): Add 'gfx803' to gcn's isa.
      e0b95c2e
    • Paul-Antoine Arras's avatar
      amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors · 1fd50874
      Paul-Antoine Arras authored
      Add support for gfx803 as an alias for fiji.
      Add test cases for all supported 'isa' values.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn.cc (gcn_omp_device_kind_arch_isa): Add gfx803.
      	* config/gcn/t-omp-device: Add gfx803.
      
      libgomp/ChangeLog:
      
      	* testsuite/libgomp.c/declare-variant-4-fiji.c: New test.
      	* testsuite/libgomp.c/declare-variant-4-gfx803.c: New test.
      	* testsuite/libgomp.c/declare-variant-4-gfx900.c: New test.
      	* testsuite/libgomp.c/declare-variant-4-gfx906.c: New test.
      	* testsuite/libgomp.c/declare-variant-4-gfx908.c: New test.
      	* testsuite/libgomp.c/declare-variant-4-gfx90a.c: New test.
      	* testsuite/libgomp.c/declare-variant-4.h: New header file.
      1fd50874
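A context selector using the new 'isa' value might look like the sketch below. It is not taken from the declare-variant-4 tests; the function names are invented. On the host, or when compiled without `-fopenmp`, the base function is the one that runs.

```c
#include <assert.h>

/* Variant intended for code generated for the gfx803 ISA.  */
int work_gfx803(void) { return 803; }

/* Base function; the variant is selected only in a matching context.  */
#pragma omp declare variant(work_gfx803) match(device = {isa("gfx803")})
int work(void) { return 0; }

/* Called from host code, so the base version is expected here.  */
int call_work_on_host(void)
{
  int r = work();
  assert(r == 0 || r == 803);
  return r;
}
```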
  29. Nov 29, 2022
  30. Nov 28, 2022
    • Tobias Burnus's avatar
      OpenMP/Fortran: Permit end-clause on directive · 091b6dbc
      Tobias Burnus authored
      gcc/fortran/ChangeLog:
      
      	* openmp.cc (OMP_DO_CLAUSES, OMP_SCOPE_CLAUSES,
      	OMP_SECTIONS_CLAUSES): Add 'nowait'.
      	(OMP_SINGLE_CLAUSES): Add 'nowait' and 'copyprivate'.
      	(gfc_match_omp_distribute_parallel_do,
      	gfc_match_omp_distribute_parallel_do_simd,
      	gfc_match_omp_parallel_do,
      	gfc_match_omp_parallel_do_simd,
      	gfc_match_omp_parallel_sections,
      	gfc_match_omp_teams_distribute_parallel_do,
      	gfc_match_omp_teams_distribute_parallel_do_simd): Disallow 'nowait'.
      	(gfc_match_omp_workshare): Match 'nowait' clause.
      	(gfc_match_omp_end_single): Use clause matcher for 'nowait'.
      	(resolve_omp_clauses): Reject 'nowait' + 'copyprivate'.
      	* parse.cc (decode_omp_directive): Break too long line.
      	(parse_omp_do, parse_omp_structured_block): Diagnose duplicated
      	'nowait' clause.
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (OpenMP 5.2): Mark end-directive as Y.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/gomp/copyprivate-1.f90: New test.
      	* gfortran.dg/gomp/copyprivate-2.f90: New test.
      	* gfortran.dg/gomp/nowait-2.f90: Move dg-error tests ...
      	* gfortran.dg/gomp/nowait-4.f90: ... to this new file.
      	* gfortran.dg/gomp/nowait-5.f90: New test.
      	* gfortran.dg/gomp/nowait-6.f90: New test.
      	* gfortran.dg/gomp/nowait-7.f90: New test.
      	* gfortran.dg/gomp/nowait-8.f90: New test.
      091b6dbc
  31. Nov 26, 2022
  32. Nov 25, 2022
    • Sandra Loosemore's avatar
      OpenMP: Generate SIMD clones for functions with "declare target" · 309e2d95
      Sandra Loosemore authored
      This patch causes the IPA simdclone pass to generate clones for
      functions with the "omp declare target" attribute as if they had
      "omp declare simd", provided the function appears to be suitable for
      SIMD execution.  The filter is conservative, rejecting functions
      that write memory or that call other functions not known to be safe.
      A new option -fopenmp-target-simd-clone is added to control this
      transformation; it's enabled for offload processing at -O2 and higher.
      
      gcc/ChangeLog:
      
      	* common.opt (fopenmp-target-simd-clone): New option.
      	(target_simd_clone_device): New enum to go with it.
      	* doc/invoke.texi (-fopenmp-target-simd-clone): Document.
      	* flag-types.h (enum omp_target_simd_clone_device_kind): New.
      	* omp-simd-clone.cc (auto_simd_fail): New function.
      	(auto_simd_check_stmt): New function.
      	(plausible_type_for_simd_clone): New function.
      	(ok_for_auto_simd_clone): New function.
      	(simd_clone_create): Add force_local argument, make the symbol
      	have internal linkage if it is true.
      	(expand_simd_clones): Also check for cloneable functions with
      	"omp declare target".  Pass explicit_p argument to
      	simd_clone.compute_vecsize_and_simdlen target hook.
      	* opts.cc (default_options_table): Add -fopenmp-target-simd-clone.
      	* target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN):
      	Add bool explicit_p argument.
      	* doc/tm.texi: Regenerated.
      	* config/aarch64/aarch64.cc
      	(aarch64_simd_clone_compute_vecsize_and_simdlen): Update.
      	* config/gcn/gcn.cc
      	(gcn_simd_clone_compute_vecsize_and_simdlen): Update.
      	* config/i386/i386.cc
      	(ix86_simd_clone_compute_vecsize_and_simdlen): Update.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/gomp/target-simd-clone-1.C: New.
      	* g++.dg/gomp/target-simd-clone-2.C: New.
      	* gcc.dg/gomp/target-simd-clone-1.c: New.
      	* gcc.dg/gomp/target-simd-clone-2.c: New.
      	* gcc.dg/gomp/target-simd-clone-3.c: New.
      	* gcc.dg/gomp/target-simd-clone-4.c: New.
      	* gcc.dg/gomp/target-simd-clone-5.c: New.
      	* gcc.dg/gomp/target-simd-clone-6.c: New.
      	* gcc.dg/gomp/target-simd-clone-7.c: New.
      	* gcc.dg/gomp/target-simd-clone-8.c: New.
      	* lib/scanoffloadipa.exp: New.
      
      libgomp/ChangeLog:
      
      	* testsuite/lib/libgomp.exp: Load scanoffloadipa.exp library.
      	* testsuite/libgomp.c/target-simd-clone-1.c: New.
      	* testsuite/libgomp.c/target-simd-clone-2.c: New.
      	* testsuite/libgomp.c/target-simd-clone-3.c: New.
      309e2d95
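The kind of function this transformation targets can be sketched as follows. The example is hypothetical and not one of the target-simd-clone tests; whether a clone is actually generated depends on the filter described above and on options such as `-fopenmp-target-simd-clone`. Without `-fopenmp` the pragmas are ignored and the loop runs serially.

```c
#include <assert.h>

#pragma omp declare target
/* Reads only its argument, writes no memory and calls nothing else,
   so it looks like a plausible candidate for an automatic SIMD clone
   (an assumption of this sketch).  */
int scale_add(int x) { return 3 * x + 1; }
#pragma omp end declare target

/* Apply scale_add to every element inside a target region.  */
void apply_scale_add(int *a, int n)
{
#pragma omp target parallel for simd map(tofrom : a[0:n])
  for (int i = 0; i < n; i++)
    a[i] = scale_add(a[i]);
}

int demo_scale_add(void)
{
  int a[4] = { 0, 1, 2, 3 };
  apply_scale_add(a, 4);
  assert(a[0] == 1 && a[3] == 10);
  return a[0] + a[1] + a[2] + a[3];
}
```

The array becomes 1, 4, 7, 10, so `demo_scale_add` returns 22.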
    • Tobias Burnus's avatar
      libgomp: Add no-target-region rev offload test + fix plugin-nvptx · 9f9d128f
      Tobias Burnus authored
      OpenMP permits a 'target device(ancestor:1)' construct that is not
      enclosed in a target region; the current device (i.e. the host) is used in
      that case.  This commit adds a testcase for this.
      
      In the case of nvptx, the missing on-device 'GOMP_target_ext' call means that
      neither it nor the associated on-device GOMP_REV_OFFLOAD_VAR variable is
      linked in from nvptx's libgomp.a. Thus, handle the failing cuModuleGetGlobal
      gracefully by disabling reverse offload and assuming that the failure is fine.
      
      libgomp/ChangeLog:
      
      	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Use unsigned int
      	for 'i' to match 'fn_entries'; regard absent GOMP_REV_OFFLOAD_VAR
      	as valid and the code having no reverse-offload code.
      	* testsuite/libgomp.c-c++-common/reverse-offload-2.c: New test.
      9f9d128f
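The no-target-region case described in this commit can be sketched as below. This is an illustrative example, not the reverse-offload-2.c testcase; the function name is made up, and it assumes a compiler that accepts `requires reverse_offload`. Without `-fopenmp` the pragmas are ignored and the assignment simply runs on the host.

```c
#include <assert.h>

#pragma omp requires reverse_offload

int host_side_answer(void)
{
  int x = 0;
  /* Not nested inside any target region: the "ancestor" is taken to
     be the current device, i.e. the host.  */
#pragma omp target device(ancestor : 1) map(tofrom : x)
  x = 42;
  assert(x == 42);
  return x;
}
```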