Commits · bede80555065fb1cf553f9484700faf4e96e1da7 · COBOLworx / gcc-cobol

Dec 06, 2023
- Daily bump. · ad80a69d
  GCC Administrator authored 1 year ago
  
  ad80a69d
Dec 05, 2023

libgomp: Handle NULL environ like pointer to NULL pointer [PR111413] · c128ad8e

Jakub Jelinek authored 1 year ago

clearenv function just sets environ to NULL (after sometimes freeing it),
rather than setting it to a pointer to NULL, and our code was assuming
it is always non-NULL.

Fixed thusly, the change seems to be large but actually is just
+  if (environ)
     for (env = environ; *env != 0; env++)
plus reindentation.  I've also noticed the block after this for loop
was badly indented (too much) and fixed that too.

No testcase added, as it needs clearenv + dlopen.

2023-09-19  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/111413
	* env.c (initialize_env): Don't dereference environ if it is NULL.
	Reindent.

(cherry picked from commit 15345980)

c128ad8e

Aug 25, 2023
- Daily bump. · 765bf89b
  GCC Administrator authored 1 year ago
  
  765bf89b
Aug 24, 2023

omp-expand.cc: Fix wrong code with non-rectangular loop nest [PR111017] · d4648a00

Tobias Burnus authored 1 year ago

Before commit r12-5295-g47de0b56ee455e, all gimple_build_cond in
expand_omp_for_* were inserted with
  gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT);
except the one dealing with the multiplicative factor that was
  gsi_insert_after (gsi, cond_stmt, GSI_CONTINUE_LINKING);

That commit for PR103208 fixed the issue of some missing regimplify of
operands of GIMPLE_CONDs by moving the condition handling to the new function
expand_omp_build_cond. While that function has an 'bool after = false'
argument to switch between the two variants.

However, all callers ommited this argument. This commit reinstates the
prior behavior by passing 'true' for the factor != 0 condition, fixing
the included testcase.

	PR middle-end/111017
gcc/
	* omp-expand.cc (expand_omp_for_init_vars): Pass after=true
	to expand_omp_build_cond for 'factor != 0' condition, resulting
	in pre-r12-5295-g47de0b56ee455e code for the gimple insert.

libgomp/
	* testsuite/libgomp.c-c++-common/non-rect-loop-1.c: New test.

(cherry picked from commit 1dc65003)

d4648a00

Jul 27, 2023
- Update ChangeLog and version files for release · c891d8dc
  Jakub Jelinek authored 1 year ago
  
  View commits for tag releases/gcc-13.2.0 releases/gcc-13.2.0
  
  c891d8dc
Jun 29, 2023
- Daily bump. · e0f07bc9
  GCC Administrator authored 1 year ago
  
  e0f07bc9
Jun 28, 2023

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005] · 09124b7e

Thomas Schwinge authored 1 year ago

Follow-up to commit 6c3b30ef
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
>     $ uname -srvi
>     Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
>     $ grep '^model name' < /proc/cpuinfo | uniq -c
>          12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
>     $ nvidia-smi -L
>     GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

>     $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
>     1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
>     1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
>     -j12 GCC_TEST_PARALLEL_SLOTS=12
>     2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
>     2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

    2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k
    2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k

	PR testsuite/66005
	gcc/
	* doc/install.texi: Document (optional) Perl usage for parallel
	testing of libgomp.
	libgomp/
	* testsuite/lib/libgomp.exp: 'flock' through stdout.
	* testsuite/flock: New.
	* configure.ac (FLOCK): Point to that if no 'flock' available, but
	'perl' is.
	* configure: Regenerate.

(cherry picked from commit 04abe194)

09124b7e

Support parallel testing in libgomp, part II [PR66005] · 3840d5cc

Thomas Schwinge authored 1 year ago

..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

    $ uname -srvi
    Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64
    $ grep '^model name' < /proc/cpuinfo | uniq -c
         32 model name      : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

    $ \time make check-target-libgomp RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

    6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata 505044maxresident)k
    6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata 505172maxresident)k

This is what people have been complaining about, rightly so, in
<https://gcc.gnu.org/PR66005> "libgomp make check time is excessive" and
elsewhere.

Case (a), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=10
    3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata 505188maxresident)k
    -j15 GCC_TEST_PARALLEL_SLOTS=15
    3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata 505360maxresident)k
    -j17 GCC_TEST_PARALLEL_SLOTS=17
    3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata 505112maxresident)k
    -j18 GCC_TEST_PARALLEL_SLOTS=18
    3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata 505360maxresident)k
    -j19 GCC_TEST_PARALLEL_SLOTS=19
    3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata 505128maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20
    3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata 505100maxresident)k
    -j23 GCC_TEST_PARALLEL_SLOTS=23
    4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata 505200maxresident)k
    -j26 GCC_TEST_PARALLEL_SLOTS=26
    3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata 505160maxresident)k
    -j32 GCC_TEST_PARALLEL_SLOTS=32
    4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata 505160maxresident)k

Yay!

Case (b), baseline; 2+ h:

    7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata 994264maxresident)k

Case (b), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=10
    7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata 994344maxresident)k
    -j15 GCC_TEST_PARALLEL_SLOTS=15
    8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata 994228maxresident)k
    -j17 GCC_TEST_PARALLEL_SLOTS=17
    8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata 994176maxresident)k
    -j18 GCC_TEST_PARALLEL_SLOTS=18
    8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata 994248maxresident)k
    -j19 GCC_TEST_PARALLEL_SLOTS=19
    9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata 994260maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20
    9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata 994284maxresident)k
    -j23 GCC_TEST_PARALLEL_SLOTS=23
    10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata 994208maxresident)k
    -j26 GCC_TEST_PARALLEL_SLOTS=26
    11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata 994256maxresident)k
    -j32 GCC_TEST_PARALLEL_SLOTS=32
    11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata 994240maxresident)k

On my Dell Precision 7530 laptop:

    $ uname -srvi
    Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
    $ grep '^model name' < /proc/cpuinfo | uniq -c
         12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
    $ nvidia-smi -L
    GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

    $ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

    1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
    1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k

Case (c), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=2
    1143.83user 110.76system 10:20.46elapsed 202%CPU (0avgtext+0avgdata 505216maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=6
    1737.08user 143.94system 4:59.48elapsed 628%CPU (0avgtext+0avgdata 505200maxresident)k
    1730.31user 143.02system 4:58.75elapsed 627%CPU (0avgtext+0avgdata 505152maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=8
    2192.63user 169.34system 4:52.96elapsed 806%CPU (0avgtext+0avgdata 505216maxresident)k
    2219.04user 167.67system 4:53.19elapsed 814%CPU (0avgtext+0avgdata 505152maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=10
    2463.93user 184.98system 4:48.39elapsed 918%CPU (0avgtext+0avgdata 505200maxresident)k
    2455.62user 183.68system 4:47.40elapsed 918%CPU (0avgtext+0avgdata 505216maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=12
    2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
    2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
    2613.18user 199.51system 4:44.06elapsed 990%CPU (0avgtext+0avgdata 505216maxresident)k

Case (d), baseline (compared to case (b): only nvptx offloading compilation,
but also nvptx offloading execution); ~1 h:

    2841.93user 653.68system 1:02:26elapsed 93%CPU (0avgtext+0avgdata 909792maxresident)k
    2842.03user 654.39system 1:02:24elapsed 93%CPU (0avgtext+0avgdata 909880maxresident)k

Case (d), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=2
    2856.39user 606.87system 33:58.64elapsed 169%CPU (0avgtext+0avgdata 909948maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=6
    3444.90user 666.86system 18:37.57elapsed 367%CPU (0avgtext+0avgdata 909856maxresident)k
    3462.13user 667.13system 18:36.87elapsed 369%CPU (0avgtext+0avgdata 909872maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=8
    3929.74user 716.22system 18:02.36elapsed 429%CPU (0avgtext+0avgdata 909832maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=10
    4152.84user 736.16system 17:43.05elapsed 459%CPU (0avgtext+0avgdata 909872maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=12
    4209.60user 749.00system 17:35.20elapsed 469%CPU (0avgtext+0avgdata 909840maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
    4255.54user 756.78system 17:29.06elapsed 477%CPU (0avgtext+0avgdata 909868maxresident)k

Worth noting is that with nvptx offloading, there is one execution test case
that times out ('libgomp.fortran/reverse-offload-5.f90').  This effectively
stalls progress for almost 5 min: quickly other executions test cases queue up
on the lock for all parallel slots.  That's working as expected; just noting
this as it accordingly does skew the wall time numbers.

	PR testsuite/66005
	libgomp/
	* configure.ac: Look for 'flock'.
	* testsuite/Makefile.am (gcc_test_parallel_slots): Enable parallel testing.
	* testsuite/config/default.exp: Don't 'load_lib "standard.exp"' here...
	* testsuite/lib/libgomp.exp: ... but here, instead.
	(libgomp_load): Override for parallel testing.
	* testsuite/libgomp-site-extra.exp.in (FLOCK): Set.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.

(cherry picked from commit 6c3b30ef)

3840d5cc

Support parallel testing in libgomp, part I [PR66005] · 2aa6135e

Rainer Orth authored 9 years ago


..., while still hard-coding the number of parallel slots to one.

	PR testsuite/66005
	libgomp/
	* testsuite/Makefile.am (PWD_COMMAND): New variable.
	(%/site.exp): New target.
	(check_p_numbers0, check_p_numbers1, check_p_numbers2)
	(check_p_numbers3, check_p_numbers4, check_p_numbers5)
	(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
	(check_p_subdirs)
	(check_DEJAGNU_libgomp_targets): New variables.
	($(check_DEJAGNU_libgomp_targets)): New target.
	($(check_DEJAGNU_libgomp_targets)): New dependency.
	(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
	* testsuite/Makefile.in: Regenerate.
	* testsuite/lib/libgomp.exp: For parallel testing,
	'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
(cherry picked from commit e797db5c)

2aa6135e

libgomp C++ testsuite: Use 'lang_include_flags' instead of 'libstdcxx_includes' · 4b9af57e

Thomas Schwinge authored 1 year ago

With nvptx offloading configured, and supported, and CUDA available:

    $ make check-target-libgomp RUNTESTFLAGS="--all c.exp=context-1.c c++.exp=context-1.c"
    [...]
    Running [...]/libgomp.oacc-c/c.exp ...
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
    UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    Running [...]/libgomp.oacc-c++/c++.exp ...
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    [...]

..., but for 'c++.exp=context-1.c' alone, we currently get all-UNSUPPORTED:

    $ make check-target-libgomp RUNTESTFLAGS_="--all c++.exp=context-1.c"
    [...]
    Running [...]/libgomp.oacc-c++/c++.exp ...
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    [...]

That is, if 'c.exp' executes first, it does successfully evaluate
'dg-require-effective-target openacc_cublas' -- and does cache this result (so
it isn't reevaluated for 'c++.exp').  However, for 'c++.exp' alone (that is,
without the 'c.exp' result cached), we run into:

    spawn -ignore SIGHUP [xgcc] [...] -x c++ openacc_cublas2311907.c [...]
    In file included from /usr/include/cuda_fp16.h:3673,
                     from /usr/include/cublas_api.h:75,
                     from /usr/include/cublas_v2.h:65,
                     from openacc_cublas2311907.c:3:
    /usr/include/cuda_fp16.hpp:67:10: fatal error: utility: No such file or directory

We're missing include paths to C++/libstdc++ build-tree headers.

Fix this by using the mechanism introduced for Fortran in
r212268 (commit f707da16) re
"libgomp.fortran/fortran.exp - add -fintrinsic-modules-path ${blddir}".

	libgomp/
	* testsuite/libgomp.c++/c++.exp: Use 'lang_include_flags' instead
	of 'libstdcxx_includes'.
	* testsuite/libgomp.oacc-c++/c++.exp: Likewise.

(cherry picked from commit 1b93b919)

4b9af57e

May 17, 2023
- Daily bump. · c2bfd2c3
  GCC Administrator authored 1 year ago
  
  c2bfd2c3
May 16, 2023

LTO: Fix writing of toplevel asm with offloading [PR109816] · 7fb7d49b

Tobias Burnus authored 1 year ago

When offloading was enabled, top-level 'asm' were added to the offloading
section, confusing assemblers which did not support the syntax. Additionally,
with offloading and -flto, the top-level assembler code did not end up
in the host files.

As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file,
the issue became more apparent, causing fails with nvptx for some
C++ testcases.

	PR libstdc++/109816

gcc/ChangeLog:

	* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
	'!lto_stream_offload_p'.

libgomp/ChangeLog:

	* testsuite/libgomp.c++/target-map-class-1.C: New test.
	* testsuite/libgomp.c++/target-map-class-2.C: New test.

(cherry picked from commit a835f046)

7fb7d49b

May 06, 2023
- Daily bump. · 843854ac
  GCC Administrator authored 1 year ago
  
  843854ac
May 05, 2023

OpenACC: Further attach/detach clause fixes for Fortran [PR109622] · a4cc474b

Julian Brown authored 1 year ago

This patch moves several tests introduced by the following patch:

  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616939.html
  commit r14-325-gcacf65d74463600815773255e8b82b4043432bd7

into the proper location for OpenACC testing (thanks to Thomas for
spotting my mistake!), and also fixes a few additional problems --
missing diagnostics for non-pointer attaches, and a case where a pointer
was incorrectly dereferenced. Tests are also adjusted for vector-length
warnings on nvidia accelerators.

2023-04-29  Julian Brown  <julian@codesourcery.com>

	PR fortran/109622

gcc/fortran/
	* openmp.cc (resolve_omp_clauses): Add diagnostic for
	non-pointer/non-allocatable attach/detach.
	* trans-openmp.cc (gfc_trans_omp_clauses): Remove dereference for
	pointer-to-scalar derived type component attach/detach.  Fix
	attach/detach handling for descriptors.

gcc/testsuite/
	* gfortran.dg/goacc/pr109622-5.f90: New test.
	* gfortran.dg/goacc/pr109622-6.f90: New test.

libgomp/
	* testsuite/libgomp.fortran/pr109622.f90: Move test...
	* testsuite/libgomp.oacc-fortran/pr109622.f90: ...to here. Ignore
	vector length warning.
	* testsuite/libgomp.fortran/pr109622-2.f90: Move test...
	* testsuite/libgomp.oacc-fortran/pr109622-2.f90: ...to here.  Add
	missing copyin/copyout variable. Ignore vector length warnings.
	* testsuite/libgomp.fortran/pr109622-3.f90: Move test...
	* testsuite/libgomp.oacc-fortran/pr109622-3.f90: ...to here.  Ignore
	vector length warnings.
	* testsuite/libgomp.oacc-fortran/pr109622-4.f90: New test.

(cherry picked from commit 0a26a42b)

a4cc474b

OpenACC: Stand-alone attach/detach clause fixes for Fortran [PR109622] · fa7c4ab3

Julian Brown authored 1 year ago

This patch fixes several cases where multiple attach or detach mapping
nodes were being created for stand-alone attach or detach clauses
in Fortran.  After the introduction of stricter checking later during
compilation, these extra nodes could cause ICEs, as seen in the PR.

The patch also fixes cases that "happened to work" previously where
the user attaches/detaches a pointer to array using a descriptor, and
(I think!) the "_data" field has offset zero, hence the same address as
the descriptor as a whole.

2023-04-27  Julian Brown  <julian@codesourcery.com>

	PR fortran/109622

gcc/fortran/
	* trans-openmp.cc (gfc_trans_omp_clauses): Attach/detach clause fixes.

gcc/testsuite/
	* gfortran.dg/goacc/attach-descriptor.f90: Adjust expected output.

libgomp/
	* testsuite/libgomp.fortran/pr109622.f90: New test.
	* testsuite/libgomp.fortran/pr109622-2.f90: New test.
	* testsuite/libgomp.fortran/pr109622-3.f90: New test.

(cherry picked from commit cacf65d7)

fa7c4ab3

Apr 26, 2023
- Update ChangeLog and version files for release · cc035c5d
  Jakub Jelinek authored 1 year ago
  
  View commits for tag releases/gcc-13.1.0 releases/gcc-13.1.0
  
  cc035c5d
Mar 29, 2023
- Daily bump. · 579cdc1e
  GCC Administrator authored 1 year ago
  
  579cdc1e
Mar 28, 2023

testsuite: Fix weak_undefined handling on Darwin · 8443f42f

Rainer Orth authored 1 year ago

The patch that introduced the weak_undefined effective-target keyword
and corresponding dg-add-options support

commit 378ec7b8
Author: Alexandre Oliva <oliva@adacore.com>
Date:   Thu Mar 23 00:45:05 2023 -0300

    [testsuite] test for weak_undefined support and add options

badly broke the affected tests on macOS like so:

ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined "
ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined "

add_options_for_weak_undefined tries to call an non-existant proc "89".
Even after fixing this by escaping the brackets, two tests still failed to
link since they lacked the corresponding calls do dg-add-options
weak_undefined.

Tested on x86_64-apple-darwin20.6.0 and i386-pc-solaris2.11.

2023-03-27  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	* lib/target-supports.exp (add_options_for_weak_undefined): Escape
	brackets.
	* gcc.dg/visibility-22.c: Add weak_undefined options.

	libgomp:
	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Add
	weak_undefined options.

8443f42f

Mar 25, 2023
- Daily bump. · 13ec81eb
  GCC Administrator authored 1 year ago
  
  13ec81eb
Mar 24, 2023

libgomp.texi: Fix wording in GCN offload specifics · 243fa488
Tobias Burnus authored 1 year ago
```
libgomp/
	* libgomp.texi (Offload-Target Specifics): Grammar fix.
```
243fa488

Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949] · e8fec699

Thomas Schwinge authored 1 year ago

Follow-up to commit 49d1a2f9
"OpenMP: Handle descriptors in target's firstprivate [PR104949]".

	PR fortran/104949
	libgomp/
	* target.c (gomp_map_vars_internal) <GOMP_MAP_FIRSTPRIVATE>: Add
	caveat/safeguard.

e8fec699

Mar 11, 2023
- Daily bump. · c8065441
  GCC Administrator authored 2 years ago
  
  c8065441
Mar 10, 2023

Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596] · f8332e52

Thomas Schwinge authored 2 years ago

Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec',
'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in
particular conceptually: no more device memory allocation, host to device data
copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for
us.

This depends on commit 2b2340e2
"Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data",
where I said that "a use will emerge later", which is this one here.

	PR libgomp/90596
	libgomp/
	* target.c (gomp_map_vars_internal): Allow for
	'param_kind == GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_TARGET'.
	* oacc-parallel.c (GOACC_parallel_keyed): Pass
	'GOMP_MAP_VARS_TARGET' to 'goacc_map_vars'.
	* plugin/plugin-gcn.c (alloc_by_agent, gcn_exec)
	(GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec):
	Adjust, simplify.
	(gomp_offload_free): Remove.
	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify.
	(cuda_free_argmem): Remove.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Adjust.

f8332e52

Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data · 2b2340e2

Thomas Schwinge authored 2 years ago

This does *allow*, but under no circumstances is this currently going to be
used: all potentially applicable data is non-'ephemeral', and thus not
considered for 'gomp_coalesce_buf_add' for OpenACC 'async'.  (But a use will
emerge later.)

Follow-up to commit r12-2530-gd88a6951586c7229b25708f4486eaaf4bf4b5bbe
"Don't use libgomp 'cbuf' buffering with OpenACC 'async'", addressing this
TODO comment:

    TODO ... but we could allow CBUF usage for EPHEMERAL data?  (Open question:
    is it more performant to use libgomp CBUF buffering or individual device
    asyncronous copying?)

Ephemeral data is small, and therefore individual device asyncronous copying
does seem dubious -- in particular given that for all those, we'd individually
have to allocate and queue for deallocation a temporary buffer to capture the
ephemeral data.  Instead, just let the 'cbuf' *be* the temporary buffer.

	libgomp/
	* target.c (gomp_copy_host2dev, gomp_map_vars_internal): Allow
	libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral'
	data.

2b2340e2

Simplify OpenACC 'no_create' clause implementation · 199867d0

Thomas Schwinge authored 2 years ago

For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then
simplify the device plugins accordingly.

This is a follow-up to
Subversion r279551 (Git commit a6163563)
"Add OpenACC 2.6's no_create",
Subversion r279622 (Git commit 5bcd470b)
"Use gomp_map_val for OpenACC host-to-device address translation".

	libgomp/
	* target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for
	'GOMP_MAP_IF_PRESENT'.
	* plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Adjust.
	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async'
	testing.
	* testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise.

199867d0

OpenACC: Remove 'acc_async_test' -> skip shortcut in 'libgomp/oacc-async.c:goacc_wait' · b5037d4a

Thomas Schwinge authored 2 years ago

We're not taking such a shortcut anywhere else, and (with future changes) it
has potential to confuse things if synchronization in a libgomp plugin happens
to have side effects even if an async queue currently is empty.

	libgomp/
	* oacc-async.c (goacc_wait): Remove 'acc_async_test' -> skip
	shortcut.

b5037d4a

Document/verify another aspect of OpenACC 'async' semantics in 'libgomp.oacc-c-c++-common/data-3.c' · 442d51a2

Thomas Schwinge authored 2 years ago

... that I almost broke with later implementation changes.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/data-3.c: Document/verify
	another aspect of OpenACC 'async' semantics.

442d51a2

Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position · 649f1939

Thomas Schwinge authored 2 years ago

For an OpenACC compute construct, we've currently got:

  - [...]
  - acc_ev_enqueue_launch_start
  - launch kernel
  - free memory
  - acc_ev_free
  - acc_ev_enqueue_launch_end

This confused another thing that I'm working on, so I adjusted that to:

  - [...]
  - acc_ev_enqueue_launch_start
  - launch kernel
  - acc_ev_enqueue_launch_end
  - free memory
  - acc_ev_free

Correspondingly, verify 'acc_ev_alloc', 'acc_ev_free' in
'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.

	libgomp/
	* plugin/plugin-gcn.c (gcn_exec): Fix 'acc_ev_enqueue_launch_end'
	position.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Verify 'acc_ev_alloc', 'acc_ev_free'.

649f1939

Daily bump. · da2b9c6e
GCC Administrator authored 2 years ago

da2b9c6e

Mar 09, 2023

libgomp: Fix default value of GOMP_SPINCOUNT [PR 109062] · 288bc7b5

Hongyu Wang authored 2 years ago

When OMP_WAIT_POLICY is not specified, current implementation will cause
icv flag GOMP_ICV_WAIT_POLICY unset, so global variable wait_policy
will remain its uninitialized value. Initialize it to -1 to make
GOMP_SPINCOUNT behavior consistent with its description.

libgomp/ChangeLog:

	PR libgomp/109062
	* env.c (wait_policy): Initialize to -1.
	(initialize_icvs): Initialize icvs->wait_policy to -1.
	* testsuite/libgomp.c-c++-common/pr109062.c: New test.

288bc7b5

Daily bump. · 6a87fdd3
GCC Administrator authored 2 years ago

6a87fdd3

Mar 08, 2023
- libgomp.texi: Mention GCN_STACK_SIZE in Offload-Target Specifics · 2e3dd14d
  Tobias Burnus authored 2 years ago
  
  libgomp/ChangeLog: * libgomp.texi (Offload-Target Specifics): Mention GCN_STACK_SIZE.
  2e3dd14d
Mar 03, 2023
- Daily bump. · 14db9ed5
  GCC Administrator authored 2 years ago
  
  14db9ed5
Mar 02, 2023

amdgcn: Enable SIMD vectorization of math functions · ce9cd725

Kwok Cheung Yeung authored 2 years ago

Calls to vectorized versions of routines in the math library will now
be inserted when vectorizing code containing supported math functions.

2023-03-02  Kwok Cheung Yeung  <kcy@codesourcery.com>
	    Paul-Antoine Arras  <pa@codesourcery.com>

	gcc/
	* builtins.cc (mathfn_built_in_explicit): New.
	* config/gcn/gcn.cc: Include case-cfn-macros.h.
	(mathfn_built_in_explicit): Add prototype.
	(gcn_vectorize_builtin_vectorized_function): New.
	(gcn_libc_has_function): New.
	(TARGET_LIBC_HAS_FUNCTION): Define.
	(TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define.

	gcc/testsuite/
	* gcc.target/gcn/simd-math-1.c: New testcase.
	* gcc.target/gcn/simd-math-2.c: New testcase.

	libgomp/
	* testsuite/libgomp.c/simd-math-1.c: New testcase.

ce9cd725

Daily bump. · c88a7c63
GCC Administrator authored 2 years ago

c88a7c63

Mar 01, 2023

OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546] · 96ff97ff

Tobias Burnus authored 2 years ago

For is_device_ptr, optional checks should only be done before calling
libgomp, afterwards they are NULL either because of absent or, by
chance, because it is unallocated or unassociated (for pointers/allocatables).

Additionally, it fixes an issue with explicit mapping for 'type(c_ptr)'.

	PR middle-end/108546

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Fix mapping of
	type(C_ptr) variables.

gcc/ChangeLog:

	* omp-low.cc (lower_omp_target): Remove optional handling
	on the receiver side, i.e. inside target (data), for
	use_device_ptr.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/is_device_ptr-3.f90: New test.
	* testsuite/libgomp.fortran/use_device_ptr-optional-4.f90: New test.

96ff97ff

Feb 23, 2023
- Daily bump. · b6f98991
  GCC Administrator authored 2 years ago
  
  b6f98991
Feb 22, 2023

Add '-Wno-complain-wrong-lang', and use it in... · 320dc51c

Thomas Schwinge authored 2 years ago

Add '-Wno-complain-wrong-lang', and use it in 'gcc/testsuite/lib/target-supports.exp:check_compile' and elsewhere

I noticed that GCC/Rust recently lost all LTO variants in torture testing:

     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
    -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
    -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)

Etc.

The reason is that when probing for availability of LTO, we run into:

    spawn [...]/build-gcc/gcc/testsuite/rust/../../gccrs -B[...]/build-gcc/gcc/testsuite/rust/../../ -fdiagnostics-plain-output -frust-incomplete-and-experimental-compiler-do-not-use -flto -c -o lto8274.o lto8274.c
    cc1: warning: command-line option '-frust-incomplete-and-experimental-compiler-do-not-use' is valid for Rust but not for C

For GCC/Rust testing, this flag is (as of recently) defaulted in
'gcc/testsuite/lib/rust.exp:rust_init':

    lappend ALWAYS_RUSTFLAGS "additional_flags=-frust-incomplete-and-experimental-compiler-do-not-use"

A few more "command-line option [...] is valid for [...] but not for [...]"
instances were found in the test suite logs, when more than one language is
involved.

With '-Wno-complain-wrong-lang' used in
'gcc/testsuite/lib/target-supports.exp:check_compile', we get back:

     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
    +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
    +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)

Etc., and in total:

                    === rust Summary for unix ===

    # of expected passes            [-4990-]{+6718+}
    # of expected failures          [-39-]{+51+}

Anything that 'gcc/opts-global.cc:complain_wrong_lang' might do is cut
short by '-Wno-complain-wrong-lang', not just the one 'warning'
diagnostic.  This corresponds to what already exists via
'lang_hooks.complain_wrong_lang_p'.

The 'gcc/opts-common.cc:prune_options' changes follow the same rationale
as PR67640 "driver passes -fdiagnostics-color= always last": we need to
process '-Wno-complain-wrong-lang' early, so that it properly affects
other options appearing before it on the command line.

	gcc/
	* common.opt (-Wcomplain-wrong-lang): New.
	* doc/invoke.texi (-Wno-complain-wrong-lang): Document it.
	* opts-common.cc (prune_options): Handle it.
	* opts-global.cc (complain_wrong_lang): Use it.
	gcc/testsuite/
	* gcc.dg/Wcomplain-wrong-lang-1.c: New.
	* gcc.dg/Wcomplain-wrong-lang-2.c: Likewise.
	* gcc.dg/Wcomplain-wrong-lang-3.c: Likewise.
	* gcc.dg/Wcomplain-wrong-lang-4.c: Likewise.
	* gcc.dg/Wcomplain-wrong-lang-5.c: Likewise.
	* lib/target-supports.exp (check_compile): Use
	'-Wno-complain-wrong-lang'.
	* g++.dg/abi/empty12.C: Likewise.
	* g++.dg/abi/empty13.C: Likewise.
	* g++.dg/abi/empty14.C: Likewise.
	* g++.dg/abi/empty15.C: Likewise.
	* g++.dg/abi/empty16.C: Likewise.
	* g++.dg/abi/empty17.C: Likewise.
	* g++.dg/abi/empty18.C: Likewise.
	* g++.dg/abi/empty19.C: Likewise.
	* g++.dg/abi/empty22.C: Likewise.
	* g++.dg/abi/empty25.C: Likewise.
	* g++.dg/abi/empty26.C: Likewise.
	* gfortran.dg/bind-c-contiguous-1.f90: Likewise.
	* gfortran.dg/bind-c-contiguous-4.f90: Likewise.
	* gfortran.dg/bind-c-contiguous-5.f90: Likewise.
	libgomp/
	* testsuite/libgomp.fortran/alloc-10.f90: Use
	'-Wno-complain-wrong-lang'.
	* testsuite/libgomp.fortran/alloc-11.f90: Likewise.
	* testsuite/libgomp.fortran/alloc-7.f90: Likewise.
	* testsuite/libgomp.fortran/alloc-9.f90: Likewise.
	* testsuite/libgomp.fortran/allocate-1.f90: Likewise.
	* testsuite/libgomp.fortran/depend-4.f90: Likewise.
	* testsuite/libgomp.fortran/depend-5.f90: Likewise.
	* testsuite/libgomp.fortran/depend-6.f90: Likewise.
	* testsuite/libgomp.fortran/depend-7.f90: Likewise.
	* testsuite/libgomp.fortran/depend-inoutset-1.f90: Likewise.
	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90:
	Likewise.
	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90:
	Likewise.
	* testsuite/libgomp.fortran/order-reproducible-1.f90: Likewise.
	* testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
	* testsuite/libgomp.fortran/task-detach-6.f90: Remove left-over
	'dg-prune-output'.

320dc51c

Feb 17, 2023
- Daily bump. · 88cc4495
  GCC Administrator authored 2 years ago
  
  88cc4495
Feb 16, 2023

libgomp: Fix up some typos in libgomp.texi · 0b9bd33d

Jakub Jelinek authored 2 years ago

I decided to check for repeated the the in libgomp and noticed
there are several occurrences of a typo theads rather than threads
in libgomp.texi.

2023-02-16  Jakub Jelinek  <jakub@redhat.com>

	* libgomp.texi: Fix typos - theads -> threads.

0b9bd33d