  1. Dec 06, 2023
  2. Dec 05, 2023
    • libgomp: Handle NULL environ like pointer to NULL pointer [PR111413] · c128ad8e
      Jakub Jelinek authored
      The clearenv function just sets environ to NULL (after sometimes freeing it),
      rather than setting it to a pointer to NULL, and our code was assuming
      that environ is always non-NULL.
      
      Fixed thusly, the change seems to be large but actually is just
      +  if (environ)
           for (env = environ; *env != 0; env++)
      plus reindentation.  I've also noticed the block after this for loop
      was badly indented (too much) and fixed that too.
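
A minimal sketch of the guarded iteration pattern described above, assuming a hypothetical helper (`count_env_with_prefix` is not part of libgomp). It treats a NULL environment pointer, as left behind by clearenv, the same as an empty environment:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not libgomp code: count the entries of ENVP whose
   text starts with PREFIX.  ENVP may be NULL (as after clearenv) or point
   to a NULL-terminated array of strings.  */
size_t
count_env_with_prefix (char **envp, const char *prefix)
{
  size_t n = 0;
  size_t len;

  assert (prefix != NULL);
  len = strlen (prefix);

  /* The guard: only walk the array if there is one.  */
  if (envp)
    for (char **env = envp; *env != NULL; env++)
      if (strncmp (*env, prefix, len) == 0)
        n++;
  return n;
}
```

Calling it with `environ` directly after `clearenv ()` would then return 0 instead of dereferencing a NULL pointer.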
      
      No testcase added, as it needs clearenv + dlopen.
      
      2023-09-19  Jakub Jelinek  <jakub@redhat.com>
      
      	PR libgomp/111413
      	* env.c (initialize_env): Don't dereference environ if it is NULL.
      	Reindent.
      
      (cherry picked from commit 15345980)
  3. Aug 25, 2023
  4. Aug 24, 2023
    • omp-expand.cc: Fix wrong code with non-rectangular loop nest [PR111017] · d4648a00
      Tobias Burnus authored
      Before commit r12-5295-g47de0b56ee455e, all gimple_build_cond in
      expand_omp_for_* were inserted with
        gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT);
      except the one dealing with the multiplicative factor that was
        gsi_insert_after (gsi, cond_stmt, GSI_CONTINUE_LINKING);
      
      That commit for PR103208 fixed the issue of some missing regimplify of
      operands of GIMPLE_CONDs by moving the condition handling to the new function
      expand_omp_build_cond. That function has a 'bool after = false'
      argument to switch between the two variants.
      
      However, all callers omitted this argument. This commit reinstates the
      prior behavior by passing 'true' for the factor != 0 condition, fixing
      the included testcase.
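
For context, a non-rectangular loop nest is one where an inner loop bound depends on an outer iteration variable. The following is an illustrative sketch only; it is not the testcase added by this commit, and the function name is made up. Collapsing such a nest needs OpenMP 5.0 support in the compiler; without '-fopenmp' the pragma is ignored and the loop runs sequentially with the same result.

```c
#include <assert.h>

/* Count the iterations of a triangular (non-rectangular) loop nest: the
   inner lower bound 'j = i' depends on the outer iteration variable.  */
long
triangular_iterations (int n)
{
  long count = 0;

  assert (n >= 0);
#pragma omp parallel for collapse(2) reduction(+ : count)
  for (int i = 0; i < n; i++)
    for (int j = i; j < n; j++)
      count++;
  return count;
}
```

For n = 4 the nest executes 4 + 3 + 2 + 1 = 10 iterations.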
      
      	PR middle-end/111017
      gcc/
      	* omp-expand.cc (expand_omp_for_init_vars): Pass after=true
      	to expand_omp_build_cond for 'factor != 0' condition, resulting
      	in pre-r12-5295-g47de0b56ee455e code for the gimple insert.
      
      libgomp/
      	* testsuite/libgomp.c-c++-common/non-rect-loop-1.c: New test.
      
      (cherry picked from commit 1dc65003)
  5. Jul 27, 2023
  6. Jun 29, 2023
  7. Jun 28, 2023
    • Support parallel testing in libgomp: fallback Perl 'flock' [PR66005] · 09124b7e
      Thomas Schwinge authored
      Follow-up to commit 6c3b30ef
      "Support parallel testing in libgomp, part II [PR66005]"
      ("..., and enable if 'flock' is available for serializing execution testing"),
      where we saw:
      
      > On my Dell Precision 7530 laptop:
      >
      >     $ uname -srvi
      >     Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
      >     $ grep '^model name' < /proc/cpuinfo | uniq -c
      >          12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
      >     $ nvidia-smi -L
      >     GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
      >
      > ... [...]: case (c) standard configuration, no offloading
      > configured, [...]
      
      >     $ \time make check-target-libgomp
      >
      > Case (c), baseline; [...]:
      >
      >     1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
      >     1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
      >
      > Case (c), parallelized [using 'flock']:
      >
      > [...]
      >     -j12 GCC_TEST_PARALLEL_SLOTS=12
      >     2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
      >     2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k
      
      The results are much the same when using this fallback Perl 'flock' instead of 'flock':
      
          2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k
          2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k
      
      	PR testsuite/66005
      	gcc/
      	* doc/install.texi: Document (optional) Perl usage for parallel
      	testing of libgomp.
      	libgomp/
      	* testsuite/lib/libgomp.exp: 'flock' through stdout.
      	* testsuite/flock: New.
      	* configure.ac (FLOCK): Point to that if no 'flock' available, but
      	'perl' is.
      	* configure: Regenerate.
      
      (cherry picked from commit 04abe194)
    • Support parallel testing in libgomp, part II [PR66005] · 3840d5cc
      Thomas Schwinge authored
      ..., and enable if 'flock' is available for serializing execution testing.
      
      Regarding the default of 19 parallel slots, this turned out to be a local
      minimum for wall time when testing this on:
      
          $ uname -srvi
          Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64
          $ grep '^model name' < /proc/cpuinfo | uniq -c
               32 model name      : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
      
      ... in two configurations: case (a) standard configuration, no offloading
      configured, case (b) offloading for GCN and nvptx configured but no devices
      available.  For both cases, default plus '-m32' variant.
      
          $ \time make check-target-libgomp RUNTESTFLAGS="--target_board=unix\{,-m32\}"
      
      Case (a), baseline:
      
          6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata 505044maxresident)k
          6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata 505172maxresident)k
      
      This is what people have been complaining about, rightly so, in
      <https://gcc.gnu.org/PR66005> "libgomp make check time is excessive" and
      elsewhere.
      
      Case (a), parallelized:
      
          -j12 GCC_TEST_PARALLEL_SLOTS=10
          3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata 505188maxresident)k
          -j15 GCC_TEST_PARALLEL_SLOTS=15
          3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata 505360maxresident)k
          -j17 GCC_TEST_PARALLEL_SLOTS=17
          3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata 505112maxresident)k
          -j18 GCC_TEST_PARALLEL_SLOTS=18
          3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata 505360maxresident)k
          -j19 GCC_TEST_PARALLEL_SLOTS=19
          3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata 505128maxresident)k
          -j20 GCC_TEST_PARALLEL_SLOTS=20
          3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata 505100maxresident)k
          -j23 GCC_TEST_PARALLEL_SLOTS=23
          4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata 505200maxresident)k
          -j26 GCC_TEST_PARALLEL_SLOTS=26
          3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata 505160maxresident)k
          -j32 GCC_TEST_PARALLEL_SLOTS=32
          4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata 505160maxresident)k
      
      Yay!
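
To put the case (a) numbers in perspective, the wall-time speedup can be computed from the 'elapsed' fields. This is a small arithmetic sketch with made-up helper names, not part of the testsuite:

```c
#include <assert.h>

/* Convert a 'time'-style elapsed value (minutes plus seconds) to seconds.  */
double
elapsed_seconds (int minutes, double seconds)
{
  assert (minutes >= 0 && seconds >= 0.0);
  return minutes * 60.0 + seconds;
}

/* Wall-time speedup of a parallelized run over the baseline run.  */
double
wall_time_speedup (double baseline_s, double parallel_s)
{
  assert (parallel_s > 0.0);
  return baseline_s / parallel_s;
}
```

Taking the first baseline run (47:32.28) against the '-j19 GCC_TEST_PARALLEL_SLOTS=19' run (5:13.22) gives 2852.28 s / 313.22 s, a speedup of roughly 9.1.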
      
      Case (b), baseline; 2+ h:
      
          7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata 994264maxresident)k
      
      Case (b), parallelized:
      
          -j12 GCC_TEST_PARALLEL_SLOTS=10
          7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata 994344maxresident)k
          -j15 GCC_TEST_PARALLEL_SLOTS=15
          8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata 994228maxresident)k
          -j17 GCC_TEST_PARALLEL_SLOTS=17
          8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata 994176maxresident)k
          -j18 GCC_TEST_PARALLEL_SLOTS=18
          8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata 994248maxresident)k
          -j19 GCC_TEST_PARALLEL_SLOTS=19
          9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata 994260maxresident)k
          -j20 GCC_TEST_PARALLEL_SLOTS=20
          9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata 994284maxresident)k
          -j23 GCC_TEST_PARALLEL_SLOTS=23
          10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata 994208maxresident)k
          -j26 GCC_TEST_PARALLEL_SLOTS=26
          11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata 994256maxresident)k
          -j32 GCC_TEST_PARALLEL_SLOTS=32
          11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata 994240maxresident)k
      
      On my Dell Precision 7530 laptop:
      
          $ uname -srvi
          Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
          $ grep '^model name' < /proc/cpuinfo | uniq -c
               12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
          $ nvidia-smi -L
          GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
      
      ... in two configurations: case (c) standard configuration, no offloading
      configured, case (d) offloading for nvptx configured and device available.
      For both cases, only default variant, no '-m32'.
      
          $ \time make check-target-libgomp
      
      Case (c), baseline; roughly half of case (a) (just one variant):
      
          1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
          1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
      
      Case (c), parallelized:
      
          -j12 GCC_TEST_PARALLEL_SLOTS=2
          1143.83user 110.76system 10:20.46elapsed 202%CPU (0avgtext+0avgdata 505216maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=6
          1737.08user 143.94system 4:59.48elapsed 628%CPU (0avgtext+0avgdata 505200maxresident)k
          1730.31user 143.02system 4:58.75elapsed 627%CPU (0avgtext+0avgdata 505152maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=8
          2192.63user 169.34system 4:52.96elapsed 806%CPU (0avgtext+0avgdata 505216maxresident)k
          2219.04user 167.67system 4:53.19elapsed 814%CPU (0avgtext+0avgdata 505152maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=10
          2463.93user 184.98system 4:48.39elapsed 918%CPU (0avgtext+0avgdata 505200maxresident)k
          2455.62user 183.68system 4:47.40elapsed 918%CPU (0avgtext+0avgdata 505216maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=12
          2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
          2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k
          -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
          2613.18user 199.51system 4:44.06elapsed 990%CPU (0avgtext+0avgdata 505216maxresident)k
      
      Case (d), baseline (compared to case (b): only nvptx offloading compilation,
      but also nvptx offloading execution); ~1 h:
      
          2841.93user 653.68system 1:02:26elapsed 93%CPU (0avgtext+0avgdata 909792maxresident)k
          2842.03user 654.39system 1:02:24elapsed 93%CPU (0avgtext+0avgdata 909880maxresident)k
      
      Case (d), parallelized:
      
          -j12 GCC_TEST_PARALLEL_SLOTS=2
          2856.39user 606.87system 33:58.64elapsed 169%CPU (0avgtext+0avgdata 909948maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=6
          3444.90user 666.86system 18:37.57elapsed 367%CPU (0avgtext+0avgdata 909856maxresident)k
          3462.13user 667.13system 18:36.87elapsed 369%CPU (0avgtext+0avgdata 909872maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=8
          3929.74user 716.22system 18:02.36elapsed 429%CPU (0avgtext+0avgdata 909832maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=10
          4152.84user 736.16system 17:43.05elapsed 459%CPU (0avgtext+0avgdata 909872maxresident)k
          -j12 GCC_TEST_PARALLEL_SLOTS=12
          4209.60user 749.00system 17:35.20elapsed 469%CPU (0avgtext+0avgdata 909840maxresident)k
          -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
          4255.54user 756.78system 17:29.06elapsed 477%CPU (0avgtext+0avgdata 909868maxresident)k
      
      Worth noting is that with nvptx offloading, there is one execution test case
      that times out ('libgomp.fortran/reverse-offload-5.f90').  This effectively
      stalls progress for almost 5 min: other execution test cases quickly queue up
      on the lock for all parallel slots.  That's working as expected; just noting
      this as it accordingly does skew the wall time numbers.
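
The serialization idea can be illustrated in C with the flock(2) system call. This is only a sketch of the concept: the testsuite itself goes through the 'flock' command from its Tcl code, and the names `run_serialized` and `bump_counter` are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Example callback: increment the int that ARG points to and return it.  */
int
bump_counter (void *arg)
{
  int *counter = arg;
  return ++*counter;
}

/* Run FN (ARG) while holding an exclusive advisory lock on LOCK_PATH.
   Returns FN's result, or -1 if the lock file cannot be opened or locked
   (so FN should not use -1 as a regular result).  */
int
run_serialized (const char *lock_path, int (*fn) (void *), void *arg)
{
  assert (lock_path != NULL && fn != NULL);

  int fd = open (lock_path, O_RDWR | O_CREAT, 0600);
  if (fd < 0)
    return -1;
  /* Blocks until no other process holds the lock; this is where waiting
     execution tests would queue up.  */
  if (flock (fd, LOCK_EX) != 0)
    {
      close (fd);
      return -1;
    }
  int result = fn (arg);
  flock (fd, LOCK_UN);
  close (fd);
  return result;
}
```

With several processes calling `run_serialized` on the same lock file, only one callback runs at a time, so a single slow holder delays everyone waiting on the lock.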
      
      	PR testsuite/66005
      	libgomp/
      	* configure.ac: Look for 'flock'.
      	* testsuite/Makefile.am (gcc_test_parallel_slots): Enable parallel testing.
      	* testsuite/config/default.exp: Don't 'load_lib "standard.exp"' here...
      	* testsuite/lib/libgomp.exp: ... but here, instead.
      	(libgomp_load): Override for parallel testing.
      	* testsuite/libgomp-site-extra.exp.in (FLOCK): Set.
      	* configure: Regenerate.
      	* Makefile.in: Regenerate.
      	* testsuite/Makefile.in: Regenerate.
      
      (cherry picked from commit 6c3b30ef)
    • Support parallel testing in libgomp, part I [PR66005] · 2aa6135e
      Rainer Orth authored
      
      ..., while still hard-coding the number of parallel slots to one.
      
      	PR testsuite/66005
      	libgomp/
      	* testsuite/Makefile.am (PWD_COMMAND): New variable.
      	(%/site.exp): New target.
      	(check_p_numbers0, check_p_numbers1, check_p_numbers2)
      	(check_p_numbers3, check_p_numbers4, check_p_numbers5)
      	(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
      	(check_p_subdirs)
      	(check_DEJAGNU_libgomp_targets): New variables.
      	($(check_DEJAGNU_libgomp_targets)): New target.
      	($(check_DEJAGNU_libgomp_targets)): New dependency.
      	(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
      	* testsuite/Makefile.in: Regenerate.
      	* testsuite/lib/libgomp.exp: For parallel testing,
      	'load_file ../libgomp-test-support.exp'.
      
      Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
      (cherry picked from commit e797db5c)
    • libgomp C++ testsuite: Use 'lang_include_flags' instead of 'libstdcxx_includes' · 4b9af57e
      Thomas Schwinge authored
      With nvptx offloading configured, and supported, and CUDA available:
      
          $ make check-target-libgomp RUNTESTFLAGS="--all c.exp=context-1.c c++.exp=context-1.c"
          [...]
          Running [...]/libgomp.oacc-c/c.exp ...
          PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
          PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
          PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
          PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
          UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
          Running [...]/libgomp.oacc-c++/c++.exp ...
          PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
          PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
          PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
          PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
          UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
          [...]
      
      ..., but for 'c++.exp=context-1.c' alone, we currently get all-UNSUPPORTED:
      
          $ make check-target-libgomp RUNTESTFLAGS="--all c++.exp=context-1.c"
          [...]
          Running [...]/libgomp.oacc-c++/c++.exp ...
          UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
          UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
          UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
          [...]
      
      That is, if 'c.exp' executes first, it does successfully evaluate
      'dg-require-effective-target openacc_cublas' -- and does cache this result (so
      it isn't reevaluated for 'c++.exp').  However, for 'c++.exp' alone (that is,
      without the 'c.exp' result cached), we run into:
      
          spawn -ignore SIGHUP [xgcc] [...] -x c++ openacc_cublas2311907.c [...]
          In file included from /usr/include/cuda_fp16.h:3673,
                           from /usr/include/cublas_api.h:75,
                           from /usr/include/cublas_v2.h:65,
                           from openacc_cublas2311907.c:3:
          /usr/include/cuda_fp16.hpp:67:10: fatal error: utility: No such file or directory
      
      We're missing include paths to C++/libstdc++ build-tree headers.
      
      Fix this by using the mechanism introduced for Fortran in
      r212268 (commit f707da16) re
      "libgomp.fortran/fortran.exp - add -fintrinsic-modules-path ${blddir}".
      
      	libgomp/
      	* testsuite/libgomp.c++/c++.exp: Use 'lang_include_flags' instead
      	of 'libstdcxx_includes'.
      	* testsuite/libgomp.oacc-c++/c++.exp: Likewise.
      
      (cherry picked from commit 1b93b919)
  8. May 17, 2023
  9. May 16, 2023
    • LTO: Fix writing of toplevel asm with offloading [PR109816] · 7fb7d49b
      Tobias Burnus authored
      When offloading was enabled, top-level 'asm' statements were added to the
      offloading section, confusing assemblers which did not support the syntax.
      Additionally, with offloading and -flto, the top-level assembler code did not
      end up in the host files.
      
      As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file,
      the issue became more apparent, causing fails with nvptx for some
      C++ testcases.
      
      	PR libstdc++/109816
      
      gcc/ChangeLog:
      
      	* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
      	'!lto_stream_offload_p'.
      
      libgomp/ChangeLog:
      
      	* testsuite/libgomp.c++/target-map-class-1.C: New test.
      	* testsuite/libgomp.c++/target-map-class-2.C: New test.
      
      (cherry picked from commit a835f046)
  10. May 06, 2023
  11. May 05, 2023
    • OpenACC: Further attach/detach clause fixes for Fortran [PR109622] · a4cc474b
      Julian Brown authored
      This patch moves several tests introduced by the following patch:
      
        https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616939.html
        commit r14-325-gcacf65d74463600815773255e8b82b4043432bd7
      
      into the proper location for OpenACC testing (thanks to Thomas for
      spotting my mistake!), and also fixes a few additional problems --
      missing diagnostics for non-pointer attaches, and a case where a pointer
      was incorrectly dereferenced. Tests are also adjusted for vector-length
      warnings on nvidia accelerators.
      
      2023-04-29  Julian Brown  <julian@codesourcery.com>
      
      	PR fortran/109622
      
      gcc/fortran/
      	* openmp.cc (resolve_omp_clauses): Add diagnostic for
      	non-pointer/non-allocatable attach/detach.
      	* trans-openmp.cc (gfc_trans_omp_clauses): Remove dereference for
      	pointer-to-scalar derived type component attach/detach.  Fix
      	attach/detach handling for descriptors.
      
      gcc/testsuite/
      	* gfortran.dg/goacc/pr109622-5.f90: New test.
      	* gfortran.dg/goacc/pr109622-6.f90: New test.
      
      libgomp/
      	* testsuite/libgomp.fortran/pr109622.f90: Move test...
      	* testsuite/libgomp.oacc-fortran/pr109622.f90: ...to here. Ignore
      	vector length warning.
      	* testsuite/libgomp.fortran/pr109622-2.f90: Move test...
      	* testsuite/libgomp.oacc-fortran/pr109622-2.f90: ...to here.  Add
      	missing copyin/copyout variable. Ignore vector length warnings.
      	* testsuite/libgomp.fortran/pr109622-3.f90: Move test...
      	* testsuite/libgomp.oacc-fortran/pr109622-3.f90: ...to here.  Ignore
      	vector length warnings.
      	* testsuite/libgomp.oacc-fortran/pr109622-4.f90: New test.
      
      (cherry picked from commit 0a26a42b)
    • OpenACC: Stand-alone attach/detach clause fixes for Fortran [PR109622] · fa7c4ab3
      Julian Brown authored
      This patch fixes several cases where multiple attach or detach mapping
      nodes were being created for stand-alone attach or detach clauses
      in Fortran.  After the introduction of stricter checking later during
      compilation, these extra nodes could cause ICEs, as seen in the PR.
      
      The patch also fixes cases that "happened to work" previously where
      the user attaches/detaches a pointer to array using a descriptor, and
      (I think!) the "_data" field has offset zero, hence the same address as
      the descriptor as a whole.
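
The "offset zero" observation can be shown with a hypothetical descriptor struct in C. This is NOT gfortran's actual descriptor layout; it only illustrates why code that confuses the descriptor's address with the address of its first member can appear to work:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical array descriptor for illustration; not gfortran's layout.  */
struct array_descriptor
{
  void *data;       /* Pointer to the array payload; first member.  */
  size_t offset;
  size_t elem_len;
};

/* Return 1 if the address of D's 'data' member equals the address of D
   itself, which holds whenever 'data' sits at offset zero.  */
int
data_member_aliases_descriptor (const struct array_descriptor *d)
{
  assert (d != NULL);
  return (const void *) &d->data == (const void *) d;
}
```

In C, a pointer to a struct and a pointer to its first member always compare equal after conversion, so mixing up the two addresses goes unnoticed until a layout or check changes.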
      
      2023-04-27  Julian Brown  <julian@codesourcery.com>
      
      	PR fortran/109622
      
      gcc/fortran/
      	* trans-openmp.cc (gfc_trans_omp_clauses): Attach/detach clause fixes.
      
      gcc/testsuite/
      	* gfortran.dg/goacc/attach-descriptor.f90: Adjust expected output.
      
      libgomp/
      	* testsuite/libgomp.fortran/pr109622.f90: New test.
      	* testsuite/libgomp.fortran/pr109622-2.f90: New test.
      	* testsuite/libgomp.fortran/pr109622-3.f90: New test.
      
      (cherry picked from commit cacf65d7)
  12. Apr 26, 2023
  13. Mar 29, 2023
  14. Mar 28, 2023
    • testsuite: Fix weak_undefined handling on Darwin · 8443f42f
      Rainer Orth authored
      The patch that introduced the weak_undefined effective-target keyword
      and corresponding dg-add-options support
      
      commit 378ec7b8
      Author: Alexandre Oliva <oliva@adacore.com>
      Date:   Thu Mar 23 00:45:05 2023 -0300
      
          [testsuite] test for weak_undefined support and add options
      
      badly broke the affected tests on macOS like so:
      
      ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined "
      ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined "
      
      add_options_for_weak_undefined tries to call a non-existent proc "89".
      Even after fixing this by escaping the brackets, two tests still failed to
      link since they lacked the corresponding calls to dg-add-options
      weak_undefined.
      
      Tested on x86_64-apple-darwin20.6.0 and i386-pc-solaris2.11.
      
      2023-03-27  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      
      	gcc/testsuite:
      	* lib/target-supports.exp (add_options_for_weak_undefined): Escape
      	brackets.
      	* gcc.dg/visibility-22.c: Add weak_undefined options.
      
      	libgomp:
      	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Add
      	weak_undefined options.
  15. Mar 25, 2023
  16. Mar 24, 2023
  17. Mar 11, 2023
  18. Mar 10, 2023
    • Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596] · f8332e52
      Thomas Schwinge authored
      Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec',
      'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in
      particular conceptually: no more device memory allocation, host to device data
      copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for
      us.
      
      This depends on commit 2b2340e2
      "Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data",
      where I said that "a use will emerge later", which is this one here.
      
      	PR libgomp/90596
      	libgomp/
      	* target.c (gomp_map_vars_internal): Allow for
      	'param_kind == GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_TARGET'.
      	* oacc-parallel.c (GOACC_parallel_keyed): Pass
      	'GOMP_MAP_VARS_TARGET' to 'goacc_map_vars'.
      	* plugin/plugin-gcn.c (alloc_by_agent, gcn_exec)
      	(GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec):
      	Adjust, simplify.
      	(gomp_offload_free): Remove.
      	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
      	(GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify.
      	(cuda_free_argmem): Remove.
      	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
      	Adjust.
    • Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data · 2b2340e2
      Thomas Schwinge authored
      This does *allow*, but under no circumstances is this currently going to be
      used: all potentially applicable data is non-'ephemeral', and thus not
      considered for 'gomp_coalesce_buf_add' for OpenACC 'async'.  (But a use will
      emerge later.)
      
      Follow-up to commit r12-2530-gd88a6951586c7229b25708f4486eaaf4bf4b5bbe
      "Don't use libgomp 'cbuf' buffering with OpenACC 'async'", addressing this
      TODO comment:
      
          TODO ... but we could allow CBUF usage for EPHEMERAL data?  (Open question:
          is it more performant to use libgomp CBUF buffering or individual device
          asyncronous copying?)
      
      Ephemeral data is small, and therefore individual device asynchronous copying
      does seem dubious -- in particular given that for all those, we'd individually
      have to allocate and queue for deallocation a temporary buffer to capture the
      ephemeral data.  Instead, just let the 'cbuf' *be* the temporary buffer.
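
The idea of letting one buffer serve as the temporary storage can be sketched as follows. This is a simplified, hypothetical staging buffer; it is not libgomp's 'cbuf' data structure, and the "device" here is plain host memory:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAGING_CAPACITY 256

/* Hypothetical staging buffer; not libgomp's actual 'cbuf'.  */
struct staging_buf
{
  unsigned char bytes[STAGING_CAPACITY];
  size_t used;       /* Bytes queued so far.  */
  size_t transfers;  /* Number of bulk copies performed.  */
};

/* Queue LEN bytes from SRC.  The bytes are copied right away, so SRC may be
   short-lived ("ephemeral").  Returns the offset inside the staging buffer,
   or (size_t) -1 if the chunk does not fit.  */
size_t
staging_add (struct staging_buf *sb, const void *src, size_t len)
{
  assert (sb != NULL && src != NULL);
  if (len > STAGING_CAPACITY - sb->used)
    return (size_t) -1;
  size_t off = sb->used;
  memcpy (sb->bytes + off, src, len);
  sb->used += len;
  return off;
}

/* Stand-in for one bulk host-to-device transfer: copy everything queued to
   DEVICE and reset the buffer.  Returns the number of bytes transferred.  */
size_t
staging_flush (struct staging_buf *sb, void *device)
{
  assert (sb != NULL && device != NULL);
  size_t n = sb->used;
  memcpy (device, sb->bytes, n);
  sb->used = 0;
  sb->transfers++;
  return n;
}
```

Each small item needs no temporary buffer of its own: the staging buffer captures the value at queue time, and a single flush moves all queued items at once.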
      
      	libgomp/
      	* target.c (gomp_copy_host2dev, gomp_map_vars_internal): Allow
      	libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral'
      	data.
    • Simplify OpenACC 'no_create' clause implementation · 199867d0
      Thomas Schwinge authored
      For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then
      simplify the device plugins accordingly.
      
      This is a follow-up to
      Subversion r279551 (Git commit a6163563)
      "Add OpenACC 2.6's no_create",
      Subversion r279622 (Git commit 5bcd470b)
      "Use gomp_map_val for OpenACC host-to-device address translation".
      
      	libgomp/
      	* target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for
      	'GOMP_MAP_IF_PRESENT'.
      	* plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec)
      	(GOMP_OFFLOAD_openacc_async_exec): Adjust.
      	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
      	(GOMP_OFFLOAD_openacc_async_exec): Likewise.
      	* testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async'
      	testing.
      	* testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise.
    • OpenACC: Remove 'acc_async_test' -> skip shortcut in 'libgomp/oacc-async.c:goacc_wait' · b5037d4a
      Thomas Schwinge authored
      We're not taking such a shortcut anywhere else, and (with future changes) it
      has potential to confuse things if synchronization in a libgomp plugin happens
      to have side effects even if an async queue currently is empty.
      
      	libgomp/
      	* oacc-async.c (goacc_wait): Remove 'acc_async_test' -> skip
      	shortcut.
    • Document/verify another aspect of OpenACC 'async' semantics in 'libgomp.oacc-c-c++-common/data-3.c' · 442d51a2
      Thomas Schwinge authored
      ... that I almost broke with later implementation changes.
      
      	libgomp/
      	* testsuite/libgomp.oacc-c-c++-common/data-3.c: Document/verify
      	another aspect of OpenACC 'async' semantics.
    • Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position · 649f1939
      Thomas Schwinge authored
      For an OpenACC compute construct, we've currently got:
      
        - [...]
        - acc_ev_enqueue_launch_start
        - launch kernel
        - free memory
        - acc_ev_free
        - acc_ev_enqueue_launch_end
      
      This confused another thing that I'm working on, so I adjusted that to:
      
        - [...]
        - acc_ev_enqueue_launch_start
        - launch kernel
        - acc_ev_enqueue_launch_end
        - free memory
        - acc_ev_free
      
      Correspondingly, verify 'acc_ev_alloc', 'acc_ev_free' in
      'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.
      
      	libgomp/
      	* plugin/plugin-gcn.c (gcn_exec): Fix 'acc_ev_enqueue_launch_end'
      	position.
      	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
      	Verify 'acc_ev_alloc', 'acc_ev_free'.
    • Daily bump. · da2b9c6e
      GCC Administrator authored
  19. Mar 09, 2023
    • libgomp: Fix default value of GOMP_SPINCOUNT [PR 109062] · 288bc7b5
      Hongyu Wang authored
      When OMP_WAIT_POLICY is not specified, the current implementation leaves the
      ICV flag GOMP_ICV_WAIT_POLICY unset, so the global variable wait_policy
      keeps its uninitialized value. Initialize it to -1 to make the
      GOMP_SPINCOUNT behavior consistent with its description.
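
A sketch of why the sentinel matters, assuming a hypothetical encoding (-1 = not set, 0 = passive, 1 = active) and the defaults as I read them in the GOMP_SPINCOUNT documentation (0 for passive, 300,000 when OMP_WAIT_POLICY is undefined, 30 billion for active). The function is illustrative and ignores other factors the real runtime may consider:

```c
#include <assert.h>

/* Hypothetical encoding for this sketch only.  */
enum
{
  WAIT_POLICY_UNSET = -1,
  WAIT_POLICY_PASSIVE = 0,
  WAIT_POLICY_ACTIVE = 1
};

/* Pick a default spin count from the wait policy.  */
unsigned long long
default_spin_count (int wait_policy)
{
  assert (wait_policy >= WAIT_POLICY_UNSET
          && wait_policy <= WAIT_POLICY_ACTIVE);
  switch (wait_policy)
    {
    case WAIT_POLICY_PASSIVE:
      return 0ULL;
    case WAIT_POLICY_ACTIVE:
      return 30000000000ULL;
    default: /* WAIT_POLICY_UNSET */
      return 300000ULL;
    }
}
```

If the "not set" state were stored as 0, it would be indistinguishable from passive and select a spin count of 0 rather than the documented default for an undefined policy.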
      
      libgomp/ChangeLog:
      
      	PR libgomp/109062
      	* env.c (wait_policy): Initialize to -1.
      	(initialize_icvs): Initialize icvs->wait_policy to -1.
      	* testsuite/libgomp.c-c++-common/pr109062.c: New test.
    • Daily bump. · 6a87fdd3
      GCC Administrator authored
  20. Mar 08, 2023
  21. Mar 03, 2023
  22. Mar 02, 2023
    • amdgcn: Enable SIMD vectorization of math functions · ce9cd725
      Kwok Cheung Yeung authored
      Calls to vectorized versions of routines in the math library will now
      be inserted when vectorizing code containing supported math functions.
      
      2023-03-02  Kwok Cheung Yeung  <kcy@codesourcery.com>
      	    Paul-Antoine Arras  <pa@codesourcery.com>
      
      	gcc/
      	* builtins.cc (mathfn_built_in_explicit): New.
      	* config/gcn/gcn.cc: Include case-cfn-macros.h.
      	(mathfn_built_in_explicit): Add prototype.
      	(gcn_vectorize_builtin_vectorized_function): New.
      	(gcn_libc_has_function): New.
      	(TARGET_LIBC_HAS_FUNCTION): Define.
      	(TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define.
      
      	gcc/testsuite/
      	* gcc.target/gcn/simd-math-1.c: New testcase.
      	* gcc.target/gcn/simd-math-2.c: New testcase.
      
      	libgomp/
      	* testsuite/libgomp.c/simd-math-1.c: New testcase.
      ce9cd725
    • GCC Administrator's avatar
      Daily bump. · c88a7c63
      GCC Administrator authored
      c88a7c63
  23. Mar 01, 2023
    • Tobias Burnus's avatar
      OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546] · 96ff97ff
      Tobias Burnus authored
      For is_device_ptr, optional checks should only be done before calling
      libgomp; afterwards the pointers are NULL either because the argument is
      absent or, by chance, because it is unallocated or unassociated (for
      pointers/allocatables).
      
      Additionally, it fixes an issue with explicit mapping for 'type(c_ptr)'.
      
      	PR middle-end/108546
      
      gcc/fortran/ChangeLog:
      
      	* trans-openmp.cc (gfc_trans_omp_clauses): Fix mapping of
      	type(C_ptr) variables.
      
      gcc/ChangeLog:
      
      	* omp-low.cc (lower_omp_target): Remove optional handling
      	on the receiver side, i.e. inside target (data), for
      	use_device_ptr.
      
      libgomp/ChangeLog:
      
      	* testsuite/libgomp.fortran/is_device_ptr-3.f90: New test.
      	* testsuite/libgomp.fortran/use_device_ptr-optional-4.f90: New test.
      96ff97ff
  24. Feb 23, 2023
  25. Feb 22, 2023
    • Thomas Schwinge's avatar
      Add '-Wno-complain-wrong-lang', and use it in 'gcc/testsuite/lib/target-supports.exp:check_compile' and elsewhere · 320dc51c
      Thomas Schwinge authored
      
      I noticed that GCC/Rust recently lost all LTO variants in torture testing:
      
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
          -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
          -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)
      
      Etc.
      
      The reason is that when probing for availability of LTO, we run into:
      
          spawn [...]/build-gcc/gcc/testsuite/rust/../../gccrs -B[...]/build-gcc/gcc/testsuite/rust/../../ -fdiagnostics-plain-output -frust-incomplete-and-experimental-compiler-do-not-use -flto -c -o lto8274.o lto8274.c
          cc1: warning: command-line option '-frust-incomplete-and-experimental-compiler-do-not-use' is valid for Rust but not for C
      
      For GCC/Rust testing, this flag is (as of recently) defaulted in
      'gcc/testsuite/lib/rust.exp:rust_init':
      
          lappend ALWAYS_RUSTFLAGS "additional_flags=-frust-incomplete-and-experimental-compiler-do-not-use"
      
      A few more "command-line option [...] is valid for [...] but not for [...]"
      instances were found in the test suite logs, when more than one language is
      involved.
      
      With '-Wno-complain-wrong-lang' used in
      'gcc/testsuite/lib/target-supports.exp:check_compile', we get back:
      
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
          +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
          +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
           PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)
      
      Etc., and in total:
      
                          === rust Summary for unix ===
      
          # of expected passes            [-4990-]{+6718+}
          # of expected failures          [-39-]{+51+}
      
      Anything that 'gcc/opts-global.cc:complain_wrong_lang' might do is cut
      short by '-Wno-complain-wrong-lang', not just the one 'warning'
      diagnostic.  This corresponds to what already exists via
      'lang_hooks.complain_wrong_lang_p'.
      
      The 'gcc/opts-common.cc:prune_options' changes follow the same rationale
      as PR67640 "driver passes -fdiagnostics-color= always last": we need to
      process '-Wno-complain-wrong-lang' early, so that it properly affects
      other options appearing before it on the command line.
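
      The need for early processing can be sketched with a two-pass scan
      over the command line. This is a simplified illustration, not the code
      in 'gcc/opts-common.cc' or 'gcc/opts-global.cc'. The 'struct option'
      type and both function names are hypothetical, and the "last occurrence
      wins" rule is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified option record (not GCC's decoded-option type). */
struct option {
    const char *text;             /* option as written on the command line */
    bool valid_for_current_lang;  /* false if it belongs to another language */
};

/* Pass 1: decide up front whether wrong-language complaints are enabled.
   Assumption for this sketch: the last occurrence of the flag wins. */
static bool complaints_enabled(const struct option *opts, size_t n)
{
    bool enabled = true;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i].text, "-Wno-complain-wrong-lang") == 0)
            enabled = false;
        else if (strcmp(opts[i].text, "-Wcomplain-wrong-lang") == 0)
            enabled = true;
    }
    return enabled;
}

/* Pass 2: count the options that would be diagnosed.  Because pass 1 ran
   first, the flag also covers options that appear before it. */
static size_t count_complaints(const struct option *opts, size_t n)
{
    if (!complaints_enabled(opts, n))
        return 0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!opts[i].valid_for_current_lang)
            count++;
    return count;
}
```

      A single left-to-right pass would already have complained about a
      wrong-language option by the time it reached a later
      '-Wno-complain-wrong-lang', which is why the flag has to be handled
      before the other options.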
      
      	gcc/
      	* common.opt (-Wcomplain-wrong-lang): New.
      	* doc/invoke.texi (-Wno-complain-wrong-lang): Document it.
      	* opts-common.cc (prune_options): Handle it.
      	* opts-global.cc (complain_wrong_lang): Use it.
      	gcc/testsuite/
      	* gcc.dg/Wcomplain-wrong-lang-1.c: New.
      	* gcc.dg/Wcomplain-wrong-lang-2.c: Likewise.
      	* gcc.dg/Wcomplain-wrong-lang-3.c: Likewise.
      	* gcc.dg/Wcomplain-wrong-lang-4.c: Likewise.
      	* gcc.dg/Wcomplain-wrong-lang-5.c: Likewise.
      	* lib/target-supports.exp (check_compile): Use
      	'-Wno-complain-wrong-lang'.
      	* g++.dg/abi/empty12.C: Likewise.
      	* g++.dg/abi/empty13.C: Likewise.
      	* g++.dg/abi/empty14.C: Likewise.
      	* g++.dg/abi/empty15.C: Likewise.
      	* g++.dg/abi/empty16.C: Likewise.
      	* g++.dg/abi/empty17.C: Likewise.
      	* g++.dg/abi/empty18.C: Likewise.
      	* g++.dg/abi/empty19.C: Likewise.
      	* g++.dg/abi/empty22.C: Likewise.
      	* g++.dg/abi/empty25.C: Likewise.
      	* g++.dg/abi/empty26.C: Likewise.
      	* gfortran.dg/bind-c-contiguous-1.f90: Likewise.
      	* gfortran.dg/bind-c-contiguous-4.f90: Likewise.
      	* gfortran.dg/bind-c-contiguous-5.f90: Likewise.
      	libgomp/
      	* testsuite/libgomp.fortran/alloc-10.f90: Use
      	'-Wno-complain-wrong-lang'.
      	* testsuite/libgomp.fortran/alloc-11.f90: Likewise.
      	* testsuite/libgomp.fortran/alloc-7.f90: Likewise.
      	* testsuite/libgomp.fortran/alloc-9.f90: Likewise.
      	* testsuite/libgomp.fortran/allocate-1.f90: Likewise.
      	* testsuite/libgomp.fortran/depend-4.f90: Likewise.
      	* testsuite/libgomp.fortran/depend-5.f90: Likewise.
      	* testsuite/libgomp.fortran/depend-6.f90: Likewise.
      	* testsuite/libgomp.fortran/depend-7.f90: Likewise.
      	* testsuite/libgomp.fortran/depend-inoutset-1.f90: Likewise.
      	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90:
      	Likewise.
      	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90:
      	Likewise.
      	* testsuite/libgomp.fortran/order-reproducible-1.f90: Likewise.
      	* testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise.
      	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
      	* testsuite/libgomp.fortran/task-detach-6.f90: Remove left-over
      	'dg-prune-output'.
      320dc51c
  26. Feb 17, 2023
  27. Feb 16, 2023
    • Jakub Jelinek's avatar
      libgomp: Fix up some typos in libgomp.texi · 0b9bd33d
      Jakub Jelinek authored
      I decided to check for a repeated 'the the' in libgomp and noticed
      several occurrences of the typo 'theads' instead of 'threads'
      in libgomp.texi.
      
      2023-02-16  Jakub Jelinek  <jakub@redhat.com>
      
      	* libgomp.texi: Fix typos - theads -> threads.
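
      The commit does not say how the repeated-word check was done. As one
      possible approach, a small scanner can count places where a word is
      immediately followed by the same word. The function name below is
      hypothetical, and words are split on whitespace only.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Count positions where a whitespace-delimited word is immediately
   followed by an identical word (for example "the the"). */
static size_t count_repeated_words(const char *text)
{
    const char *prev = NULL;  /* start of the previous word, if any */
    size_t prev_len = 0;
    size_t count = 0;
    const char *p = text;

    while (*p != '\0') {
        /* Skip whitespace between words. */
        while (*p != '\0' && isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;

        /* Find the end of the current word. */
        const char *start = p;
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        size_t len = (size_t)(p - start);

        if (prev != NULL && len == prev_len && memcmp(prev, start, len) == 0)
            count++;

        prev = start;
        prev_len = len;
    }
    return count;
}
```

      The comparison is case-sensitive and treats punctuation as part of a
      word, so "the. The" is not reported.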
      0b9bd33d