- Dec 21, 2022
-
-
Jakub Jelinek authored
DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR of this->field used just during gimplification and omp lowering/expansion to privatize individual fields in methods when needed. As the following testcase shows, when not in templates, they were handled right, but in templates we actually called cp_finish_decl on them and that can result in their destruction, which is obviously undesirable, we should only destruct the privatized copies of them created in omp lowering. Fixed thusly. 2022-12-21 Jakub Jelinek <jakub@redhat.com> PR c++/108180 * pt.cc (tsubst_expr): Don't call cp_finish_decl on DECL_OMP_PRIVATIZED_MEMBER vars. * testsuite/libgomp.c++/pr108180.C: New test.
-
- Dec 17, 2022
-
-
GCC Administrator authored
-
- Dec 16, 2022
-
-
Tobias Burnus authored
Commit r13-4716-ge205ec03f0794aeac3e8a89e947c12624d5a274e accidentally included a testcase of another patch that is pending review: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608401.html libgomp/ PR libfortran/108056 * testsuite/libgomp.fortran/allocate-4.f90: Remove accidentally added file.
-
GCC Administrator authored
-
- Dec 15, 2022
-
-
Tobias Burnus authored
Since GCC 12, the conversion between the array descriptors formats - the internal (GFC) and the C binding one (CFI) - moved to the compiler itself such that the cfi_desc_to_gfc_desc/gfc_desc_to_cfi_desc functions are only used with older code (GCC 9 to 11). The newly added checks caused asserts as older code did not pass the proper values (e.g. real(4) as effective argument arrived as BT_ASSUME type as the effective type got lost inbetween). As proposed in the PR, revert to the GCC 11 version - known bugs is better than some fixes and new issues. Still, GCC 12 is much better in terms of TS29113 support and should really be used. This patch uses the current libgomp version of the GCC 11 branch, except it fixes the GFC version number (which is 0), uses calloc instead of malloc, and sets the lower bound to 1 instead of keeping it as is for CFI_attribute_other. libgfortran/ChangeLog: PR libfortran/108056 * runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc, gfc_desc_to_cfi_desc): Mostly revert to GCC 11 version for those backward-compatiblity-only functions.
-
GCC Administrator authored
-
- Dec 14, 2022
-
-
Julian Brown authored
This patch fixes a case where a combined directive (e.g. "!$omp target parallel ...") contains both a map and a firstprivate clause for the same variable. When the combined directive is split into two nested directives, the outer "target" gets the "map" clause, and the inner "parallel" gets the "firstprivate" clause, like so: !$omp target parallel map(x) firstprivate(x) --> !$omp target map(x) !$omp parallel firstprivate(x) ... When there is no map of the same variable, the firstprivate is distributed to both directives, e.g. for 'y' in: !$omp target parallel map(x) firstprivate(y) --> !$omp target map(x) firstprivate(y) !$omp parallel firstprivate(y) ... This is not a recent regression, but appear to fix a long-standing ICE. (The included testcase is based on one by Tobias.) 2022-12-06 Julian Brown <julian@codesourcery.com> gcc/fortran/ * trans-openmp.cc (gfc_add_firstprivate_if_unmapped): New function. (gfc_split_omp_clauses): Call above. libgomp/ * testsuite/libgomp.fortran/combined-directive-splitting-1.f90: New test.
-
- Dec 11, 2022
-
-
GCC Administrator authored
-
- Dec 10, 2022
-
-
Tobias Burnus authored
This commit enabled reverse offload for nvptx such that gomp_target_rev actually gets called. And it fills the latter function to do all of the following: finding the host function to the device func ptr and copying the arguments to the host, processing the mapping/firstprivate, calling the host function, copying back the data and freeing as needed. The data handling is made easier by assuming that all host variables either existed before (and are in the mapping) or that those are devices variables not yet available on the host. Thus, the reverse mapping can do without refcounts etc. Note that the spec disallows inside a target region device-affecting constructs other than target plus ancestor device-modifier and it also limits the clauses permitted on this construct. For the function addresses, an additional splay tree is used; for the lookup of mapped variables, the existing splay-tree is used. Unfortunately, its data structure requires a full walk of the tree; Additionally, the just mapped variables are recorded in a separate data structure an extra lookup. While the lookup is slow, assuming that only few variables get mapped in each reverse offload construct and that reverse offload is the exception and not performance critical, this seems to be acceptable. libgomp/ChangeLog: * libgomp.h (struct target_mem_desc): Predeclare; move below after 'reverse_splay_tree_node' and add rev_array member. (struct reverse_splay_tree_key_s, reverse_splay_compare): New. (reverse_splay_tree_node, reverse_splay_tree, reverse_splay_tree_key): New typedef. (struct gomp_device_descr): Add mem_map_rev member. * oacc-host.c (host_dispatch): NULL init .mem_map_rev. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim support for GOMP_REQUIRES_REVERSE_OFFLOAD. * splay-tree.h (splay_tree_callback_stop): New typedef; like splay_tree_callback but returning int not void. (splay_tree_foreach_lazy): Define; like splay_tree_foreach but taking splay_tree_callback_stop as argument. * splay-tree.c (splay_tree_foreach_internal_lazy, splay_tree_foreach_lazy): New; but early exit if callback returns nonzero. * target.c: Instatiate splay_tree_c with splay_tree_prefix 'reverse'. (gomp_map_lookup_rev): New. (gomp_load_image_to_device): Handle reverse-offload function lookup table. (gomp_unload_image_from_device): Free devicep->mem_map_rev. (struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup, gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int, gomp_map_cdata_lookup): New auxiliary structs and functions for gomp_target_rev. (gomp_target_rev): Implement reverse offloading and its mapping. (gomp_target_init): Init current_device.mem_map_rev.root. * testsuite/libgomp.fortran/reverse-offload-2.f90: New test. * testsuite/libgomp.fortran/reverse-offload-3.f90: New test. * testsuite/libgomp.fortran/reverse-offload-4.f90: New test. * testsuite/libgomp.fortran/reverse-offload-5.f90: New test. * testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without mapping of on-device allocated variables.
-
GCC Administrator authored
-
- Dec 09, 2022
-
-
Tobias Burnus authored
gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE output. * gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'. (gfc_free_omp_namelist): Add bool arg. * match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'. * openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction, gfc_match_omp_flush): Update call. (gfc_match_omp_clauses): Match 'align/allocate modifers in 'allocate' clause. (resolve_omp_clauses): Resolve align. * st.cc (gfc_free_statement): Update call * trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'. libgomp/ChangeLog: * libgomp.texi (5.1 Impl. Status): Split allocate clause/directive item about 'align'; mark clause as 'Y' and directive as 'N'. * testsuite/libgomp.fortran/allocate-2.f90: New test. * testsuite/libgomp.fortran/allocate-3.f90: New test.
-
- Dec 07, 2022
-
-
GCC Administrator authored
-
- Dec 06, 2022
-
-
Marcel Vollweiler authored
This patch adds support for omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices. That includes the usage of device-specific ICV values (specified as environment variables or changed on a device). In order to reuse device-specific ICV values, a copy back mechanism is implemented that copies ICV values back from device to the host. Additionally, a limitation of the number of teams on gcn offload devices is implemented. The number of teams is limited by twice the number of compute units (one team is executed on one compute unit). This avoids queueing unnessecary many teams and a corresponding allocation of large amounts of memory. Without that limitation the memory allocation for a large number of user-specified teams can result in an "memory access fault". A limitation of the number of teams is already also implemented for nvptx devices (see nvptx_adjust_launch_bounds in libgomp/plugin/plugin-nvptx.c). gcc/ChangeLog: * gimplify.cc (optimize_target_teams): Set initial num_teams_upper to "-2" instead of "1" for non-existing num_teams clause in order to disambiguate from the case of an existing num_teams clause with value 1. libgomp/ChangeLog: * config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to allow processing of device-specific values. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise. (omp_set_teams_thread_limit): Likewise. (ialias): Likewise. * icv-device.c (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. (omp_set_teams_thread_limit): Likewise. * icv.c (omp_set_teams_thread_limit): Removed. (omp_get_teams_thread_limit): Likewise. (ialias): Likewise. * libgomp.texi: Updated documentation for nvptx and gcn corresponding to the limitation of the number of teams. * plugin/plugin-gcn.c (limit_teams): New helper function that limits the number of teams by twice the number of compute units. (parse_target_attributes): Limit the number of teams on gcn offload devices. * target.c (get_gomp_offload_icvs): Added teams_thread_limit_var handling. (gomp_load_image_to_device): Added a size check for the ICVs struct variable. (gomp_copy_back_icvs): New function that is used in GOMP_target_ext to copy back the ICV values from device to host. (GOMP_target_ext): Update the number of teams and threads in the kernel args also considering device-specific values. * testsuite/libgomp.c-c++-common/icv-4.c: Fixed an error in the reading of OMP_TEAMS_THREAD_LIMIT from the environment. * testsuite/libgomp.c-c++-common/icv-5.c: Extended. * testsuite/libgomp.c-c++-common/icv-6.c: Extended. * testsuite/libgomp.c-c++-common/icv-7.c: Extended. * testsuite/libgomp.c-c++-common/icv-9.c: New test. * testsuite/libgomp.fortran/icv-5.f90: New test. * testsuite/libgomp.fortran/icv-6.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-teams-1.c: Adapt expected values for num_teams from "1" to "-2" in cases without num_teams clause. * g++.dg/gomp/target-teams-1.C: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise.
-
Tobias Burnus authored
libgomp/ * libgomp.texi (OpenMP 5.2): Add missing 'the'. (TR11): Add missing '@tab N @tab'.
-
- Dec 01, 2022
-
-
GCC Administrator authored
-
- Nov 30, 2022
-
-
Tobias Burnus authored
libgomp/ChangeLog: * libgomp.texi (OpenMP Context Selectors): Add 'gfx803' to gcn's isa.
-
Paul-Antoine Arras authored
Add support for gfx803 as an alias for fiji. Add test cases for all supported 'isa' values. gcc/ChangeLog: * config/gcn/gcn.cc (gcn_omp_device_kind_arch_isa): Add gfx803. * config/gcn/t-omp-device: Add gfx803. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4-fiji.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx803.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx900.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx906.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx908.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx90a.c: New test. * testsuite/libgomp.c/declare-variant-4.h: New header file.
-
- Nov 29, 2022
-
-
GCC Administrator authored
-
- Nov 28, 2022
-
-
Tobias Burnus authored
gcc/fortran/ChangeLog: * openmp.cc (OMP_DO_CLAUSES, OMP_SCOPE_CLAUSES, OMP_SECTIONS_CLAUSES): Add 'nowait'. (OMP_SINGLE_CLAUSES): Add 'nowait' and 'copyprivate'. (gfc_match_omp_distribute_parallel_do, gfc_match_omp_distribute_parallel_do_simd, gfc_match_omp_parallel_do, gfc_match_omp_parallel_do_simd, gfc_match_omp_parallel_sections, gfc_match_omp_teams_distribute_parallel_do, gfc_match_omp_teams_distribute_parallel_do_simd): Disallow 'nowait'. (gfc_match_omp_workshare): Match 'nowait' clause. (gfc_match_omp_end_single): Use clause matcher for 'nowait'. (resolve_omp_clauses): Reject 'nowait' + 'copyprivate'. * parse.cc (decode_omp_directive): Break too long line. (parse_omp_do, parse_omp_structured_block): Diagnose duplicated 'nowait' clause. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.2): Mark end-directive as Y. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/copyprivate-1.f90: New test. * gfortran.dg/gomp/copyprivate-2.f90: New test. * gfortran.dg/gomp/nowait-2.f90: Move dg-error tests ... * gfortran.dg/gomp/nowait-4.f90: ... to this new file. * gfortran.dg/gomp/nowait-5.f90: New test. * gfortran.dg/gomp/nowait-6.f90: New test. * gfortran.dg/gomp/nowait-7.f90: New test. * gfortran.dg/gomp/nowait-8.f90: New test.
-
- Nov 26, 2022
-
-
GCC Administrator authored
-
- Nov 25, 2022
-
-
Sandra Loosemore authored
This patch causes the IPA simdclone pass to generate clones for functions with the "omp declare target" attribute as if they had "omp declare simd", provided the function appears to be suitable for SIMD execution. The filter is conservative, rejecting functions that write memory or that call other functions not known to be safe. A new option -fopenmp-target-simd-clone is added to control this transformation; it's enabled for offload processing at -O2 and higher. gcc/ChangeLog: * common.opt (fopenmp-target-simd-clone): New option. (target_simd_clone_device): New enum to go with it. * doc/invoke.texi (-fopenmp-target-simd-clone): Document. * flag-types.h (enum omp_target_simd_clone_device_kind): New. * omp-simd-clone.cc (auto_simd_fail): New function. (auto_simd_check_stmt): New function. (plausible_type_for_simd_clone): New function. (ok_for_auto_simd_clone): New function. (simd_clone_create): Add force_local argument, make the symbol have internal linkage if it is true. (expand_simd_clones): Also check for cloneable functions with "omp declare target". Pass explicit_p argument to simd_clone.compute_vecsize_and_simdlen target hook. * opts.cc (default_options_table): Add -fopenmp-target-simd-clone. * target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): Add bool explicit_p argument. * doc/tm.texi: Regenerated. * config/aarch64/aarch64.cc (aarch64_simd_clone_compute_vecsize_and_simdlen): Update. * config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen): Update. * config/i386/i386.cc (ix86_simd_clone_compute_vecsize_and_simdlen): Update. gcc/testsuite/ChangeLog: * g++.dg/gomp/target-simd-clone-1.C: New. * g++.dg/gomp/target-simd-clone-2.C: New. * gcc.dg/gomp/target-simd-clone-1.c: New. * gcc.dg/gomp/target-simd-clone-2.c: New. * gcc.dg/gomp/target-simd-clone-3.c: New. * gcc.dg/gomp/target-simd-clone-4.c: New. * gcc.dg/gomp/target-simd-clone-5.c: New. * gcc.dg/gomp/target-simd-clone-6.c: New. * gcc.dg/gomp/target-simd-clone-7.c: New. * gcc.dg/gomp/target-simd-clone-8.c: New. * lib/scanoffloadipa.exp: New. libgomp/ChangeLog: * testsuite/lib/libgomp.exp: Load scanoffloadipa.exp library. * testsuite/libgomp.c/target-simd-clone-1.c: New. * testsuite/libgomp.c/target-simd-clone-2.c: New. * testsuite/libgomp.c/target-simd-clone-3.c: New.
-
Tobias Burnus authored
OpenMP permits that a 'target device(ancestor:1)' is called without being enclosed in a target region - using the current device (i.e. the host) in that case. This commit adds a testcase for this. In case of nvptx, the missing on-device 'GOMP_target_ext' call causes that it and also the associated on-device GOMP_REV_OFFLOAD_VAR variable are not linked in from nvptx's libgomp.a. Thus, handle the failing cuModuleGetGlobal gracefully by disabling reverse offload and assuming that the failure is fine. libgomp/ChangeLog: * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Use unsigned int for 'i' to match 'fn_entries'; regard absent GOMP_REV_OFFLOAD_VAR as valid and the code having no reverse-offload code. * testsuite/libgomp.c-c++-common/reverse-offload-2.c: New test.
-
Tobias Burnus authored
libgomp/ChangeLog: * libgomp.texi (OpenMP Implementation Status): Add three 5.1 items and status for Technical Report (TR) 11.
-
- Nov 22, 2022
-
-
GCC Administrator authored
-
- Nov 21, 2022
-
-
Tobias Burnus authored
output.printf_data.(value union) contains text[128], which has the size of 128 bytes, sufficient for 16 uint64_t variables; hence value_u64[2] could be extended to value_u64[6] - sufficient for all required arguments to gomp_target_rev. Additionally, next_output.printf_data.(msg union) contained msg_u64 which then is no longer needed and also caused 32bit vs 64bit alignment issues. libgomp/ * config/gcn/libgomp-gcn.h (struct output): Remove 'msg_u64' from the union, change value_u64[2] to value_u64[6]. * config/gcn/target.c (GOMP_target_ext): Update accordingly. * plugin/plugin-gcn.c (process_reverse_offload, console_output): Likewise.
-
Martin Liska authored
-
GCC Administrator authored
-
- Nov 19, 2022
-
-
Tobias Burnus authored
libgomp/ChangeLog: * config/gcn/libgomp-gcn.h: New file; contains struct output, declared previously in plugin-gcn.c. * config/gcn/target.c: Include it. (GOMP_ADDITIONAL_ICVS): Declare as extern var. (GOMP_target_ext): Handle reverse offload. * plugin/plugin-gcn.c: Include libgomp-gcn.h. (struct kernargs): Replace struct def by the one from libgomp-gcn.h for output_data. (process_reverse_offload): New. (console_output): Call it.
-
- Nov 17, 2022
-
-
GCC Administrator authored
-
- Nov 16, 2022
-
-
Tobias Burnus authored
Add __builtin_gcn_kernarg_ptr to avoid using hard-coded register values and permit future ABI changes while keeping the API. gcc/ChangeLog: * config/gcn/gcn-builtins.def (KERNARG_PTR): Add. * config/gcn/gcn.cc (gcn_init_builtin_types): Change siptr_type_node, sfptr_type_node and voidptr_type_node from FLAT to ADDR_SPACE_DEFAULT. (gcn_expand_builtin_1): Handle GCN_BUILTIN_KERNARG_PTR. (gcn_oacc_dim_size): Return in ADDR_SPACE_FLAT. libgomp/ChangeLog: * config/gcn/team.c (gomp_gcn_enter_kernel): Use __builtin_gcn_kernarg_ptr instead of asm ("s8"). Co-Authored-By:
Andrew Stubbs <ams@codesourcery.com>
-
- Nov 15, 2022
-
-
GCC Administrator authored
-
- Nov 14, 2022
-
-
Martin Liska authored
This reverts commit c63539ff.
-
Martin Liska authored
This reverts commit 41a45cba.
-
Martin Liska authored
This reverts commit 54ca4eef.
-
Martin Liska authored
This reverts commit 1f5a932e.
-
Martin Liska authored
This reverts commit 1f9c7936.
-
Martin Liska authored
This reverts commit 3ed1b4ce.
-
Martin Liska authored
This reverts commit 8f5aa130.
-
Martin Liska authored
This reverts commit bd044dae.
-
Martin Liska authored
This reverts commit 5e749ee3.
-