- Jul 30, 2023
-
-
GCC Administrator authored
-
- Jul 29, 2023
-
-
Tobias Burnus authored
Fixes for commit r14-2792-g25072a477a56a727b369bf9b20f4d18198ff5894 "OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect", namely: In that commit, the code was changed to handle shared-memory devices; however, as pointed out, omp_target_memcpy_check already set the pointer to NULL in that case. Hence, this commit reverts to the prior version. In cuda.h, it adds cuMemcpyPeer{,Async} for symmetry with cuMemcpy3DPeer (all currently unused) and, in three structs, fixes reserved-member names and removes a bogus 'const'. And it changes a DLSYM to DLSYM_OPT as not all plugins support the new functions yet. include/ChangeLog: * cuda/cuda.h (CUDA_MEMCPY2D, CUDA_MEMCPY3D, CUDA_MEMCPY3D_PEER): Remove bogus 'const' from 'const void *dst' and fix reserved-member names in those structs. (cuMemcpyPeer, cuMemcpyPeerAsync): Add. libgomp/ChangeLog: * target.c (omp_target_memcpy_rect_worker): Undo dim=1 change for GOMP_OFFLOAD_CAP_SHARED_MEM. (omp_target_memcpy_rect_copy): Likewise for lock condition. (gomp_load_plugin_for_device): Use DLSYM_OPT not DLSYM for memcpy3d/memcpy2d. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): Use memset 0 to nullify reserved and unused src/dst fields for that mem type; remove '{src,dst}LOD = 0'.
-
- Jul 27, 2023
-
-
GCC Administrator authored
-
- Jul 26, 2023
-
-
Tobias Burnus authored
When copying a 2D or 3D rectangular memory block, the performance is better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the data one by one. That's what this commit does. Additionally, it permits device-to-device copies, if necessary using a temporary variable on the host. include/ChangeLog: * cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE. (CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D, CUDA_MEMCPY3D_PEER): New typedefs. (cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned, cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer, cuMemcpy3DPeerAsync): New prototypes. libgomp/ChangeLog: * libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New prototypes. * libgomp.h (struct gomp_device_descr): Add memcpy2d_func and memcpy3d_func. * libgomp.texi (nvptx): Document when cuMemcpy2D/cuMemcpy3D is used. * oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL. * plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned, cuMemcpy3D): Invoke via CUDA_ONE_CALL. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New. * target.c (omp_target_memcpy_rect_worker): (omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy): Permit all device-to-device copies; invoke new plugins for 2D and 3D copying when available. (gomp_load_plugin_for_device): DLSYM the new plugin functions. * testsuite/libgomp.c/target-12.c: Fix dimension bug. * testsuite/libgomp.fortran/target-12.f90: Likewise. * testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
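To make the notion of a rectangular copy concrete, here is a minimal host-only sketch in plain C. It is not the libgomp or CUDA code from this commit: the function name `copy_rect_2d` and its parameters are hypothetical, and it assumes row-major arrays whose pitch is given in elements. It copies a sub-rectangle one contiguous row at a time, which is the kind of strided transfer that a single 2D copy call can perform in one go on a device.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libgomp): copy a rows x cols
   sub-rectangle between two row-major 2D arrays.  Offsets and pitches
   are counted in elements; elem_size is the element size in bytes.  */
void
copy_rect_2d (void *dst, const void *src, size_t elem_size,
              size_t rows, size_t cols,
              size_t dst_row, size_t dst_col, size_t dst_pitch,
              size_t src_row, size_t src_col, size_t src_pitch)
{
  /* Byte address of the first element of the rectangle on each side.  */
  char *d = (char *) dst + (dst_row * dst_pitch + dst_col) * elem_size;
  const char *s
    = (const char *) src + (src_row * src_pitch + src_col) * elem_size;

  /* Each row of the rectangle is contiguous, so one memcpy per row.  */
  for (size_t r = 0; r < rows; r++)
    memcpy (d + r * dst_pitch * elem_size,
            s + r * src_pitch * elem_size,
            cols * elem_size);
}
```

For example, copying a 2x2 block that starts at row 2, column 3 of a 4x5 source into row 1, column 1 of a 3x3 destination would use `copy_rect_2d (dst, src, sizeof (int), 2, 2, 1, 1, 3, 2, 3, 5)`.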
-
Tobias Burnus authored
libgomp/ChangeLog: * libgomp.texi (OpenMP 5.2 features): Add 'all' for 'defaultmap' as 'N'. (Tasking Routines): Document omp_in_explicit_task. (Implementation-defined ICV Initialization): Use @ref not @code.
-
- Jul 21, 2023
-
-
GCC Administrator authored
-
- Jul 20, 2023
-
-
Tobias Burnus authored
The previous list of OpenMP routines was rather lengthy and the order seemed to be rather random - especially for outputs which did not have @menu as then the sectioning was not visible. The OpenMP specification split in 5.1 the lengthy list by adding sections to the chapter and grouping the routines under them. This patch follows suit and uses the same sections and order. The commit also prepares for adding not-yet-documented routines by listing those in the @menu (@c commented - both for just undocumented and for also unimplemented routines). See also PR 110364. libgomp/ChangeLog: * libgomp.texi (OpenMP Runtime Library Routines): Split long list by adding sections and moving routines there. (OMP_ALLOCATORS): Fix typo.
-
GCC Administrator authored
-
- Jul 19, 2023
-
-
Tobias Burnus authored
Before this commit, gfortran produced with OpenMP for 'do i = 1,10,2' the code for (count.0 = 0; count.0 < 5; count.0 = count.0 + 1) i = count.0 * 2 + 1; While such an inner loop can be collapsed, a non-rectangular loop could not. With this commit and for all constant loop steps, a simple loop such as 'for (i = 1; i <= 10; i = i + 2)' is created. (Before only for the constant steps of 1 and -1.) The constant step makes it possible to know the direction (increasing/decreasing) that is required for the loop condition. The new code is only valid if one assumes no overflow of the loop variable. However, the Fortran standard can be read as requiring that the user ensures this. Namely, the Fortran standard requires (F2023, 10.1.5.2.4): "The execution of any numeric operation whose result is not defined by the arithmetic used by the processor is prohibited." And, for DO loops, F2023's "11.1.7.4.3 The execution cycle" has the following: The number of loop iterations is handled by an iteration count, which would permit code like 'do i = huge(i)-5, huge(i),4'. However, in step (3), this count is not only decremented by one but also: "... The DO variable, if any, is incremented by the value of the incrementation parameter m3." And for the example above, 'i' would be 'huge(i)+3' in the last execution cycle, which exceeds the largest model number and should render the example as invalid. PR fortran/107424 gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_nonrect_loop_expr): Accept all constant loop steps. (gfc_trans_omp_do): Likewise; use sign to determine loop direction. libgomp/ChangeLog: * libgomp.texi (Impl. Status 5.0): Add link to new PR110735. * testsuite/libgomp.fortran/non-rectangular-loop-1.f90: Enable commented tests. * testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: Remove test file; tests are in non-rectangular-loop-1.f90. * testsuite/libgomp.fortran/non-rectangular-loop-5.f90: Change testcase to use a non-constant step to retain the 'sorry' test. 
* testsuite/libgomp.fortran/non-rectangular-loop-6.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/linear-2.f90: Update dump to remove the additional count variable.
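The two loop shapes described in this commit message can be compared with a small C sketch. This is an illustration only, not the code gfortran generates: the function names `loop_with_count` and `loop_direct` are hypothetical, the step is assumed to be nonzero, and the inputs are assumed small enough that no signed overflow occurs (the same no-overflow assumption the commit message discusses). Both functions record the values the loop variable takes and return how many were stored.

```c
#include <stddef.h>

/* Old shape: a normalized counter runs from 0 to count-1 and the loop
   variable is recomputed from it.  The count follows the Fortran rule
   MAX ((end - start + step) / step, 0).  */
size_t
loop_with_count (int start, int end, int step, int *out, size_t cap)
{
  int count = (end - start + step) / step;
  if (count < 0)
    count = 0;
  size_t n = 0;
  for (int c = 0; c < count && n < cap; c++)
    out[n++] = c * step + start;
  return n;
}

/* New shape: iterate on the variable itself.  The sign of the constant
   step selects the comparison used in the loop condition.  */
size_t
loop_direct (int start, int end, int step, int *out, size_t cap)
{
  size_t n = 0;
  if (step > 0)
    for (int i = start; i <= end && n < cap; i += step)
      out[n++] = i;
  else
    for (int i = start; i >= end && n < cap; i += step)
      out[n++] = i;
  return n;
}
```

For 'do i = 1,10,2' both variants visit 1, 3, 5, 7, 9; only the second keeps the loop variable visible in the loop header, which is what the commit message says non-rectangular loop handling needs.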
-
- Jul 18, 2023
-
-
GCC Administrator authored
-
- Jul 17, 2023
-
-
Tobias Burnus authored
The 'uses_allocators' clause to the 'target' construct accepts predefined allocators and can also be used to define a new allocator for a target region. As predefined allocators in GCC do not require special handling, those can be and are ignored after parsing, such that this feature now works. On the other hand, defining a new allocator will fail for now with a 'sorry, unimplemented'. Note that both the OpenMP 5.0/5.1 and 5.2 syntax for uses_allocators is supported by this commit. 2023-07-17 Tobias Burnus <tobias@codesourcery.com> Chung-Lin Tang <cltang@codesourcery.com> gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_namelist, show_omp_clauses): Dump uses_allocators clause. * gfortran.h (gfc_free_omp_namelist): Add memspace_sym to u union and traits_sym to u2 union. (OMP_LIST_USES_ALLOCATORS): New enum value. (gfc_free_omp_namelist): Add 'bool free_mem_traits_space' arg. * match.cc (gfc_free_omp_namelist): Likewise. * openmp.cc (gfc_free_omp_clauses, gfc_match_omp_variable_list, gfc_match_omp_to_link, gfc_match_omp_doacross_sink, gfc_match_omp_clause_reduction, gfc_match_omp_allocate, gfc_match_omp_flush): Update call. (gfc_match_omp_clauses): Likewise. Parse uses_allocators clause. (gfc_match_omp_clause_uses_allocators): New. (enum omp_mask2): Add new OMP_CLAUSE_USES_ALLOCATORS. (OMP_TARGET_CLAUSES): Accept it. (resolve_omp_clauses): Resolve uses_allocators clause. * st.cc (gfc_free_statement): Update gfc_free_omp_namelist call. * trans-openmp.cc (gfc_trans_omp_clauses): Handle OMP_LIST_USES_ALLOCATORS; fail with sorry unless predefined allocator. (gfc_split_omp_clauses): Handle uses_allocators. libgomp/ChangeLog: * testsuite/libgomp.fortran/uses_allocators_1.f90: New test. * testsuite/libgomp.fortran/uses_allocators_2.f90: New test. Co-authored-by:
Chung-Lin Tang <cltang@codesourcery.com>
-
- Jul 15, 2023
-
-
GCC Administrator authored
-
- Jul 14, 2023
-
-
Tobias Burnus authored
libgomp/ * libgomp.texi (OMP_ALLOCATOR): Document the default values for the traits. Add crossref to 'Memory allocation'. (Memory allocation): Refer to OMP_ALLOCATOR for the available traits and allocators/mem spaces; document the default value for the pool_size trait.
-
Tobias Burnus authored
Follow up to r14-2462-g450b05ce54d3f0. The case that libnuma was not available at runtime was not properly handled; now it falls back to the normal malloc. libgomp/ * allocator.c (omp_init_allocator): Check whether symbol from dlopened libnuma is available before using libnuma for allocations.
-
GCC Administrator authored
-
- Jul 13, 2023
-
-
David Edelsohn authored
Some test cases in libgomp testsuite pass -flto as an option, but the testcases do not require LTO target support. This patch adds the necessary DejaGNU requirement for LTO support to the testcases. libgomp/ChangeLog: * testsuite/libgomp.c++/target-map-class-2.C: Require LTO. * testsuite/libgomp.c-c++-common/requires-4.c: Require LTO. * testsuite/libgomp.c-c++-common/requires-4a.c: Require LTO. Signed-off-by:
David Edelsohn <dje.gcc@gmail.com>
-
GCC Administrator authored
-
- Jul 12, 2023
-
-
Tobias Burnus authored
libgomp/ * libgomp.texi (OpenMP 5.0): Replace '... stub' by @ref to 'Memory allocation' section which contains the full status. (TR11): Remove differently worded duplicated entry.
-
Tobias Burnus authored
As with the memkind library, it is only used when found at runtime; it does not need to be present when building GCC. The included testcase does not check whether the memory has been placed on the nearest node as the Linux kernel memory handling too often ignores that hint, using a different node for the allocation. However, when running with 'numactl --preferred=<node> ./executable', it is clearly visible that the feature works by comparing malloc/default vs. nearest placement (using get_mempolicy to obtain the node for a mem addr). libgomp/ChangeLog: * allocator.c: Add ifdef for LIBGOMP_USE_LIBNUMA. (enum gomp_numa_memkind_kind): Renamed from gomp_memkind_kind; add GOMP_MEMKIND_LIBNUMA. (struct gomp_libnuma_data, gomp_init_libnuma, gomp_get_libnuma): New. (omp_init_allocator): Handle partition=nearest with libnuma if avail. (omp_aligned_alloc, omp_free, omp_aligned_calloc, omp_realloc): Add numa_alloc_local (+ memset), numa_free, and numa_realloc calls as needed. * config/linux/allocator.c (LIBGOMP_USE_LIBNUMA): Define * libgomp.texi: Fix a typo; use 'fi' instead of its ligature char. (Memory allocation): Renamed from 'Memory allocation with libmemkind'; updated for libnuma usage. * testsuite/libgomp.c-c++-common/alloc-11.c: New test. * testsuite/libgomp.c-c++-common/alloc-12.c: New test.
-
GCC Administrator authored
-
- Jul 11, 2023
-
-
Tobias Burnus authored
libgomp/ * allocator.c (omp_init_allocator): Use malloc for omp_high_bw_mem_space when the memkind lib is unavailable instead of returning omp_null_allocator. * libgomp.texi (OpenMP 5.0): Fix typo. (Memory allocation with libmemkind): Document implementation in more detail.
-
- Jun 23, 2023
-
-
GCC Administrator authored
-
- Jun 22, 2023
-
-
Tobias Burnus authored
Use @var{} instead of @emph{} - for semantic texinfo formatting; the result is similar: slanted instead of italic in PDF, still italic in HTML, albeit in info it is now uppercase instead of '_' as pre/suffix. The patch also documents the newer _ALL/_DEV/_DEV_<no> env var suffixes and as it refers to the ICV vars and their scope, those were added to the OMP_ env vars for reference. For OMP_NESTING, a note that those were deprecated was added plus a bunch of cross references. For OMP_ALLOCATOR, add note about the lack of per-device env vars support. A new section, consisting mostly of cross references, was added to document the implementation-defined ICV initialization, especially as OpenMP demands that implementations document what they do for 'implementation defined'. For nvptx, the implementation-defined used stack size was documented. libgomp/ * libgomp.texi: Use @var for ICV vars. (OpenMP Environment Variables): Mention _ALL/_DEV/_DEV_<no> variants, document which ICV is set and which scope the ICV has; extend/cleanup some @ref. (Implementation-defined ICV Initialization): New. (nvptx): Document the implementation-defined used per-warp stack size.
-
- Jun 20, 2023
-
-
GCC Administrator authored
-
- Jun 19, 2023
-
-
Thomas Schwinge authored
ERROR: libgomp.c/target-51.c: unknown dg option: \} for "}" Fix-up for recent commit 01fe115b "libgomp.c/target-51.c: Accept more error-msg variants in dg-output". libgomp/ * testsuite/libgomp.c/target-51.c: Fix DejaGnu directive syntax error.
-
Tobias Burnus authored
Depending on the details, the testcase can fail with different but related messages; all of the following could be observed for this testcase: libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device cannot be used for offloading libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device not found libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but only the host device is available Before, the last two were tested for with 'target offload_device' and '! offload_device', respectively. Now, all three are accepted by matching '.*' already after 'but' and without distinguishing whether the effective target is an offload_device or not. (For completeness, there is a fourth error that follows this pattern: 'OMP_TARGET_OFFLOAD is set to MANDATORY, but device is finalized'.) libgomp/ * testsuite/libgomp.c/target-51.c: Accept more error msg variants as expected dg-output.
-
Tobias Burnus authored
For C/C++ pointers, default implicit mapping firstprivatizes the pointer but if the memory it points to is mapped, then it is updated to point to the device memory (by attaching a zero sized array section of the pointed-to storage). However, if the pointed-to storage wasn't mapped, the pointer was set to NULL on the device side (OpenMP 5.0/5.1 semantic). With this commit, the pointer retains the on-host address in that case (OpenMP 5.2 semantic). The new semantic avoids an explicit map/firstprivate/is_device_ptr in the following sensible cases: Special values (e.g. pointer or 0x1, 0x2 etc.), explicitly device allocated memory (e.g. omp_target_alloc), and with (unified) shared memory. (Note: With (U)SM, mappings still must be tracked, at least when omp_target_associate_ptr does not fail when passing in two distinct pointers.) libgomp/ PR middle-end/110270 * target.c (gomp_map_vars_internal): Copy host value instead of NULL for GOMP_MAP_ZERO_LEN_ARRAY_SECTION if not mapped. * libgomp.texi (OpenMP 5.2 Impl.): Mark as 'Y'. * testsuite/libgomp.c/target-19.c: Update expected value. * testsuite/libgomp.c++/target-18.C: Likewise. * testsuite/libgomp.c++/target-19.C: Likewise. * testsuite/libgomp.c-c++-common/requires-unified-addr-2.c: New test. * testsuite/libgomp.c-c++-common/target-implicit-map-3.c: New test. * testsuite/libgomp.c-c++-common/target-implicit-map-4.c: New test.
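The "special values" case mentioned above can be sketched with a few lines of C. This is an illustrative example, not one of the testcases added by the commit; the function name `sentinel_survives_target` is hypothetical. The pointer holds the sentinel value 0x1, is never dereferenced, and points to nothing that is mapped. Under the OpenMP 5.2 semantic described in the commit message, the implicitly firstprivatized pointer should keep its host value inside the target region; under the 5.0/5.1 semantic it would have been NULL on a non-host device.

```c
#include <stdint.h>

/* Hypothetical example: returns nonzero if the sentinel pointer value
   is still visible inside the target region.  When compiled without
   OpenMP support the pragma is ignored and the block runs on the host. */
int
sentinel_survives_target (void)
{
  int *p = (int *) (uintptr_t) 0x1;   /* special value, never dereferenced */
  int same = 0;

  #pragma omp target map(tofrom: same)
  {
    /* 'p' is not listed in any clause, so it is implicitly
       firstprivatized; its pointee is not mapped.  */
    same = (p == (int *) (uintptr_t) 0x1);
  }
  return same;
}
```

Without the new behavior, the same effect would need an explicit clause such as `firstprivate(p)` on the target construct, which is what the commit message says the 5.2 semantic makes unnecessary.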
-
- Jun 17, 2023
-
-
GCC Administrator authored
-
- Jun 16, 2023
-
-
Tobias Burnus authored
It turned out that gomp_init_targets_once() was not run when directly calling 'omp target' or 'omp target (enter/exit) data', causing an abort with OMP_TARGET_OFFLOAD=mandatory wrongly claiming that no device is available. It was called a tiny bit later but a few lines too late for updating the default-device-var. libgomp/ChangeLog: * target.c (resolve_device): Call gomp_get_num_devices early to ensure gomp_init_targets_once was called before using default-device-var. * testsuite/libgomp.c/target-55.c: New test. * testsuite/libgomp.c/target-55a.c: New test.
-
GCC Administrator authored
-
- Jun 15, 2023
-
-
Tobias Burnus authored
Support OpenMP 5.1's syntax for OMP_ALLOCATOR as well, which permits besides predefined allocators also predefined memspaces optionally followed by traits. Additionally, this commit adds the previously lacking documentation for OMP_ALLOCATOR, OMP_AFFINITY_FORMAT and OMP_DISPLAY_AFFINITY. libgomp/ChangeLog: * env.c (gomp_def_allocator_envvar): New var. (parse_allocator): Handle OpenMP 5.1 syntax. (cleanup_env): New. (omp_display_env): Output gomp_def_allocator_envvar for an allocator with traits. * libgomp.texi (OMP_ALLOCATOR, OMP_AFFINITY_FORMAT, OMP_DISPLAY_AFFINITY): New. * testsuite/libgomp.c/allocator-1.c: New test. * testsuite/libgomp.c/allocator-2.c: New test. * testsuite/libgomp.c/allocator-3.c: New test. * testsuite/libgomp.c/allocator-4.c: New test. * testsuite/libgomp.c/allocator-5.c: New test. * testsuite/libgomp.c/allocator-6.c: New test.
-
GCC Administrator authored
-
- Jun 14, 2023
-
-
Thomas Schwinge authored
On 2023-06-14T11:42:22+0200, Tobias Burnus <tobias@codesourcery.com> wrote: > On 14.06.23 10:09, Thomas Schwinge wrote: >> Let me know if I should also adjust the new 'target { ! offload_device }' >> diagnostic "[...] MANDATORY but only the host device is available" to >> include a comma before 'but', for consistency with the other existing >> diagnostics (cited above)? > > I think it makes sense to be consistent. Thus: Yes, please add the commas. Fix-up for recent commit 18c8b56c "OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory". libgomp/ * target.c (resolve_device): Align a 'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others. * testsuite/libgomp.c/target-51.c: Adjust.
-
Thomas Schwinge authored
..., so that users don't manually need to specify '-foffload-options=-lgfortran', '-foffload-options=-lm' in addition to '-lgfortran', '-lm' (specified manually, or implicitly by the driver). gcc/ * gcc.cc (driver_handle_option): Forward host '-lgfortran', '-lm' to offloading compilation. * config/gcn/mkoffload.cc (main): Adjust. * config/nvptx/mkoffload.cc (main): Likewise. * doc/invoke.texi (foffload-options): Update example. libgomp/ * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Don't set. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Likewise. * testsuite/libgomp.c/simd-math-1.c: Remove '-foffload-options=-lm'. * testsuite/libgomp.fortran/fortran-torture_execute_math.f90: Likewise. * testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90: Likewise.
-
Thomas Schwinge authored
..., via 'include'ing the existing 'gfortran.fortran-torture/execute/math.f90', which therefore is enhanced for optional OpenACC 'serial', OpenMP 'target' usage. gcc/testsuite/ * gfortran.fortran-torture/execute/math.f90: Enhance for optional OpenACC 'serial', OpenMP 'target' usage. libgomp/ * testsuite/libgomp.fortran/fortran-torture_execute_math.f90: New. * testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90: Likewise.
-
Thomas Schwinge authored
..., and therefore, given 'target offload_device': PASS: libgomp.c/target-51.c (test for excess errors) PASS: libgomp.c/target-51.c execution test [-FAIL:-]{+PASS:+} libgomp.c/target-51.c output pattern test Fix-up for recent commit 18c8b56c "OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory". libgomp/ * testsuite/libgomp.c/target-51.c: Fix typo.
-
Tobias Burnus authored
OMP_TARGET_OFFLOAD=mandatory handling was previously inconsistent. Hence, in OpenMP 5.2 it was clarified/extended by having implications on the default-device-var; additionally, omp_initial_device and omp_invalid_device enum values/PARAMETERs were added; support for it was added in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and non-conforming device numbers. Only the mandatory handling was missing. Namely, while the default-device-var is usually initialized to value 0, with 'mandatory' it must have the value 'omp_invalid_device' if and only if zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var overrides this as it comes semantically after the initialization.) To achieve this, default-device-var is now initialized to INT_MIN. If there is no 'mandatory', it is set to 0 directly after env var parsing. Otherwise, it is updated in gomp_target_init to either 0 or omp_invalid_device. To ensure INT_MIN is never seen by the user, both the omp_get_default_device API routine and omp_display_env (user call and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case. libgomp/ChangeLog: * env.c (gomp_default_icv_values): Init default_device_var to a nonconforming value - INT_MIN. (initialize_env): After env-var parsing, set default_device_var to device 0 unless OMP_TARGET_OFFLOAD=mandatory. (omp_display_env): If default_device_var is INT_MIN, call gomp_init_targets_once. * icv-device.c (omp_get_default_device): Likewise. * libgomp.texi (OMP_DEFAULT_DEVICE): Update init description. (OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'. * target.c (resolve_device): Improve error message device-num < 0 with 'mandatory' and no non-host devices available. (gomp_target_init): Set default-device-var if INT_MIN. * testsuite/libgomp.c/target-48.c: New test. * testsuite/libgomp.c/target-49.c: New test. * testsuite/libgomp.c/target-50.c: New test. * testsuite/libgomp.c/target-50a.c: New test. 
* testsuite/libgomp.c/target-51.c: New test. * testsuite/libgomp.c/target-52.c: New test. * testsuite/libgomp.c/target-53.c: New test. * testsuite/libgomp.c/target-54.c: New test.
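The two-stage initialization of default-device-var described in this commit message can be summarized as a small C sketch. It is a simplified model, not the libgomp source: the function names `after_env_parsing` and `after_target_init`, the macro `SKETCH_UNSET`, and the constant `SKETCH_INVALID_DEVICE` (a stand-in for omp_invalid_device, with an arbitrarily chosen value) are all assumptions made for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in for omp_invalid_device; the value is illustrative only.  */
enum { SKETCH_INVALID_DEVICE = -4 };

/* Placeholder meaning "default-device-var not decided yet".  */
#define SKETCH_UNSET INT_MIN

/* Stage 1, right after environment parsing: without 'mandatory' the
   ICV becomes 0.  A value set through OMP_DEFAULT_DEVICE is not
   SKETCH_UNSET and is therefore left alone.  */
int
after_env_parsing (int icv, bool mandatory)
{
  if (icv == SKETCH_UNSET && !mandatory)
    return 0;
  return icv;
}

/* Stage 2, during target initialization: with 'mandatory' the ICV is
   still unset here and is resolved from the number of non-host
   devices that were found.  */
int
after_target_init (int icv, int num_nonhost_devices)
{
  if (icv == SKETCH_UNSET)
    return num_nonhost_devices > 0 ? 0 : SKETCH_INVALID_DEVICE;
  return icv;
}
```

In this model, `after_target_init (after_env_parsing (SKETCH_UNSET, true), 0)` yields the invalid-device stand-in, while any value that came from OMP_DEFAULT_DEVICE passes through both stages unchanged.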
-
GCC Administrator authored
-
- Jun 13, 2023
-
-
Tobias Burnus authored
Add a testcase for 'omp requires unified_address' that is currently supported by all devices but was not tested for. libgomp/ PR libgomp/109837 * testsuite/libgomp.c-c++-common/requires-unified-addr-1.c: New test. * testsuite/libgomp.fortran/requires-unified-addr-1.f90: New test.
-
GCC Administrator authored
-