  1. Oct 12, 2021
    • Mark certain subdirectories as missing TAGS targets. · 4ca446a4
      Eric Gallager authored
      The subdirectories in question are libcody,
      libdecnumber, c++tools, libgcc, and libobjc.
      This is progress towards allowing "make tags" to
      work from the top-level directory; a few additional
      changes may also be necessary, though.
      
      ChangeLog:
      
      	* Makefile.def: Mark libcody, libdecnumber,
      	c++tools, libgcc, and libobjc as missing TAGS
      	targets.
      	* Makefile.in: Regenerate.
      4ca446a4
    • i386: Improve workaround for PR82524 LRA limitation [PR85730] · b37351e3
      Uros Bizjak authored
      As explained in PR82524, LRA is not able to reload strict_low_part inout
      operand with matched input operand. The patch introduces a workaround,
      where we allow LRA to generate an instruction with non-matched input operand
      which is split post reload to an instruction that inserts non-matched input
      operand to an inout operand and the instruction that uses matched operand.
      
      The generated code improves from:
      
              movsbl  %dil, %edx
              movl    %edi, %eax
              sall    $3, %edx
              movb    %dl, %al
      
      to:
      
              movl    %edi, %eax
              movb    %dil, %al
              salb    $3, %al
      
      which is still not optimal, but the code is one instruction shorter and
      does not use a temporary register.
      
      2021-10-12  Uroš Bizjak  <ubizjak@gmail.com>
      
      gcc/
      	PR target/85730
      	PR target/82524
      	* config/i386/i386.md (*add<mode>_1_slp): Rewrite as
      	define_insn_and_split pattern.  Add alternative 1 and split it
      	post reload to insert operand 1 into the low part of operand 0.
      	(*sub<mode>_1_slp): Ditto.
      	(*and<mode>_1_slp): Ditto.
      	(*<any_or:code><mode>_1_slp): Ditto.
      	(*ashl<mode>3_1_slp): Ditto.
      	(*<any_shiftrt:insn><mode>3_1_slp): Ditto.
      	(*<any_rotate:insn><mode>3_1_slp): Ditto.
      	(*neg<mode>_1_slp): New insn_and_split pattern.
      	(*one_cmpl<mode>_1_slp): Ditto.
      
      gcc/testsuite/
      	PR target/85730
      	PR target/82524
      	* gcc.target/i386/pr85730.c: New test.
      b37351e3
    • doc: Update MinGW and mingw-64 download links. · 640ae312
      David Edelsohn authored
      gcc/ChangeLog:
      
      	* doc/install.texi: Update MinGW and mingw-64 Binaries
      	download links.
      640ae312
    • libstdc++: Fix test that fails for C++20 · 727137d6
      Jonathan Wakely authored
      Also restore the test for 'a < a' that was removed by r12-2537 because
      it is ill-formed. We still want to test operator< for tuple, we just
      need to not use std::nullptr_t in that tuple type.
      
      libstdc++-v3/ChangeLog:
      
      	* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
      	Restore test for operator<.
      	* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
      	Adjust expected errors for C++20.
      727137d6
    • libstdc++: Fix move construction of std::tuple with array elements [PR101960] · 74810213
      Jonathan Wakely authored
      The r12-3022 commit only fixed the case where an array is the last
      element of the tuple. This fixes the other cases too. We can just define
      the move constructor as defaulted, which does the right thing. Changing
      the move constructor to be trivial would be an ABI break, but since the
      last base class still has a non-trivial move constructor, defining the
      derived ones as defaulted doesn't change anything.
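
      A minimal sketch of the idea, using hypothetical stand-in types (Leaf and
      Node are not the libstdc++ classes): a defaulted move constructor moves an
      array member element by element, and it stays non-trivial as long as a
      base class has a user-provided move constructor.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the last base class: its move constructor is
// user-provided and therefore non-trivial.
struct Leaf {
  int value = 0;
  Leaf() = default;
  Leaf(const Leaf&) = default;
  Leaf(Leaf&& other) noexcept : value(other.value) {}
};

// Hypothetical stand-in for a derived level that holds an array element.
struct Node : Leaf {
  int arr[2] = {0, 0};
  Node() = default;
  Node(const Node&) = default;
  // Defaulted: moves the Leaf base, then each array element in turn.
  Node(Node&&) = default;
};

static_assert(std::is_move_constructible<Node>::value,
              "an array member does not prevent move construction");
static_assert(!std::is_trivially_move_constructible<Node>::value,
              "still non-trivial because Leaf's move constructor is user-provided");

// Takes a copy, then move-constructs the result from it.
inline Node move_node(Node n) { return Node(std::move(n)); }
```

      The second static_assert mirrors the ABI argument in the message:
      defaulting the derived constructor does not make the type trivially
      move constructible.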
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/101960
      	* include/std/tuple (_Tuple_impl(_Tuple_impl&&)): Define as
      	defaulted.
      	* testsuite/20_util/tuple/cons/101960.cc: Check tuples with
      	array elements before the last element.
      74810213
    • libstdc++: Improve diagnostics for misuses of output iterators · d9dfd7ad
      Jonathan Wakely authored
      This adds deleted overloads so that the errors for invalid uses of
      std::advance and std::distance are easier to understand (see for example
      PR 102181).
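
      The technique can be sketched as follows. The names sketch::my_distance
      and sketch::distance_impl are hypothetical and only illustrate tag
      dispatch with a deleted overload; they are not the libstdc++ functions.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Input (and forward/bidirectional) iterators: count the steps.
template <typename It>
typename std::iterator_traits<It>::difference_type
distance_impl(It first, It last, std::input_iterator_tag) {
  typename std::iterator_traits<It>::difference_type n = 0;
  for (; first != last; ++first)
    ++n;
  return n;
}

// Random access iterators: subtract.
template <typename It>
typename std::iterator_traits<It>::difference_type
distance_impl(It first, It last, std::random_access_iterator_tag) {
  return last - first;
}

// Output iterators: selecting this overload is a hard error that names the
// deleted function, instead of a long "no matching function" candidate list.
template <typename It>
void distance_impl(It, It, std::output_iterator_tag) = delete;

template <typename It>
typename std::iterator_traits<It>::difference_type
my_distance(It first, It last) {
  return distance_impl(
      first, last, typename std::iterator_traits<It>::iterator_category());
}

}  // namespace sketch
```

      With this in place, calling sketch::my_distance on two
      std::back_insert_iterator objects should be rejected with a "use of
      deleted function" style error.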
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/stl_iterator_base_funcs.h (__advance): Add
      	deleted overload to improve diagnostics.
      	(__distance): Likewise.
      d9dfd7ad
    • doc: Fix typos in alloc_size documentation · 8226f638
      Daniel Le Duc Khoi Nguyen authored
      gcc/
      	* doc/extend.texi (Common Variable Attributes): Fix typos in
      	alloc_size documentation.
      8226f638
    • [PATCH v2] libiberty: d-demangle: remove parenthesis where it is not needed · 98c0ac7e
      Luís Ferreira authored
      libiberty/
      	* d-demangle.c (dlang_parse_qualified): Remove redundant parentheses
      	around lhs and rhs of assignments.
      98c0ac7e
    • libgomp: Release device lock on cbuf error path · ccfcf08e
      Julian Brown authored
      This patch releases the device lock on a sanity-checking error path in
      transfer combining (cbuf) handling in libgomp:target.c.  This shouldn't
      happen when handling well-formed mapping clauses, but erroneous clauses
      can currently cause a hang if the condition triggers.
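
      The general pattern, as a hedged sketch in C++ rather than libgomp's C:
      a checking routine that is entered with a lock held must drop that lock
      before it reports a fatal condition. Device, copy_checked and try_copy
      are hypothetical names, and the exception stands in for a fatal-error
      routine.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stdexcept>

// Hypothetical device record holding the lock that guards transfers.
struct Device {
  std::mutex lock;
};

// Called with dev.lock held. On a failed sanity check the lock is released
// before the error is reported, so later cleanup cannot block on it.
inline void copy_checked(Device& dev, std::size_t offset, std::size_t len,
                         std::size_t buf_size) {
  if (offset > buf_size || len > buf_size - offset) {
    dev.lock.unlock();  // release on the error path
    throw std::runtime_error("transfer outside of combining buffer");
  }
  // ... the actual copy would happen here, still under the lock ...
}

// Takes the lock, runs the checked copy, and reports success or failure.
inline bool try_copy(Device& dev, std::size_t offset, std::size_t len,
                     std::size_t buf_size) {
  dev.lock.lock();
  try {
    copy_checked(dev, offset, len, buf_size);
  } catch (const std::runtime_error&) {
    return false;  // the lock was already released by copy_checked
  }
  dev.lock.unlock();
  return true;
}
```

      In either outcome the lock is free again afterwards, which is the
      property the patch restores for the error path.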
      
      2021-12-10  Julian Brown  <julian@codesourcery.com>
      
      libgomp/
      	* target.c (gomp_copy_host2dev): Release device lock on cbuf
      	error path.
      ccfcf08e
    • tree-optimization/102696 - fix SLP discovery for failed BIT_FIELD_REF · d1dcaa31
      Richard Biener authored
      This fixes a forgotten adjustment of matches[] when we fail SLP
      discovery.
      
      2021-10-12  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/102696
      	* tree-vect-slp.c (vect_build_slp_tree_2): Properly mark
      	the tree fatally failed when we reject a BIT_FIELD_REF.
      
      	* g++.dg/vect/pr102696.cc: New testcase.
      d1dcaa31
    • tree-optimization/102572 - fix gathers with invariant mask · 9f12a45e
      Richard Biener authored
      This fixes the vector def gathering for invariant masks, which
      failed to pass in the desired vector type, resulting in a non-mask
      type being generated.
      
      2021-10-12  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/102572
      	* tree-vect-stmts.c (vect_build_gather_load_calls): When
      	gathering the vectorized defs for the mask pass in the
      	desired mask vector type so invariants will be handled
      	correctly.
      
      	* g++.dg/vect/pr102572.cc: New testcase.
      9f12a45e
    • sve: combine inverted masks into NOTs · e36206c9
      Tamar Christina authored
      The following example
      
      void f10(double * restrict z, double * restrict w, double * restrict x,
      	 double * restrict y, int n)
      {
          for (int i = 0; i < n; i++) {
              z[i] = (w[i] > 0) ? x[i] + w[i] : y[i] - w[i];
          }
      }
      
      generates currently:
      
              ld1d    z1.d, p1/z, [x1, x5, lsl 3]
              fcmgt   p2.d, p1/z, z1.d, #0.0
              fcmgt   p0.d, p3/z, z1.d, #0.0
              ld1d    z2.d, p2/z, [x2, x5, lsl 3]
              bic     p0.b, p3/z, p1.b, p0.b
              ld1d    z0.d, p0/z, [x3, x5, lsl 3]
      
      where a BIC is generated between p1 and p0.  A NOT would be better here,
      since it would not require the use of p3 and would open the pattern up to
      being CSEd.
      
      After this patch using a 2 -> 2 split we generate:
      
              ld1d    z1.d, p0/z, [x1, x5, lsl 3]
              fcmgt   p2.d, p0/z, z1.d, #0.0
              not     p1.b, p0/z, p2.b
      
      The additional scratch is needed such that we can CSE the two operations.  If
      both statements wrote to the same register then CSE won't be able to CSE the
      values if there are other statements in between that use the register.
      
      A second pattern is needed to capture the nor case as combine will match the
      longest sequence first.  So without this pattern we end up de-optimizing nor
      and instead emit two nots.  I did not find a better way to do this.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-sve.md (*fcm<cmp_op><mode>_bic_combine,
      	*fcm<cmp_op><mode>_nor_combine, *fcmuo<mode>_bic_combine,
      	*fcmuo<mode>_nor_combine): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/sve/pred-not-gen-1.c: New test.
      	* gcc.target/aarch64/sve/pred-not-gen-2.c: New test.
      	* gcc.target/aarch64/sve/pred-not-gen-3.c: New test.
      	* gcc.target/aarch64/sve/pred-not-gen-4.c: New test.
      e36206c9
    • Fix PR target/102588 · a1a7d094
      Eric Botcazou authored
      We need a 32-byte wide integer mode (OImode) in order to handle structure
      returns in the 64-bit ABI.
      
      gcc/
      	PR target/102588
      	* config/sparc/sparc-modes.def (OI): New integer mode.
      a1a7d094
    • Fortran version of libgomp.c-c++-common/icv-{3,4}.c · f5a538e1
      Tobias Burnus authored
      This adds the Fortran testsuite coverage of
      omp_{get_max,set_num}_teams and omp_{s,g}et_teams_thread_limit
      
      libgomp/
      	* testsuite/libgomp.fortran/icv-3.f90: New.
      	* testsuite/libgomp.fortran/icv-4.f90: New.
      f5a538e1
    • Fortran: Various CLASS + assumed-rank fixed [PR102541] · eb92cd57
      Tobias Burnus authored
      Starting point was PR102541, where a previous patch caused an invalid
      e->ref access for class. When testing, it turned out that for
      CLASS to CLASS the code was never executed - additionally, issues
      appeared for optional and a bogus error for -fcheck=all. In particular:
      
      There were a bunch of issues related to optional CLASS, which can have
      'attr.dummy' set in CLASS_DATA (sym) - but sometimes also in 'sym'!?!
      Additionally, gfc_variable_attr could return pointer = 1 for nonpointers
      when the expr is no longer "var" but "var%_data".
      
      	PR fortran/102541
      
      gcc/fortran/ChangeLog:
      
      	* check.c (gfc_check_present): Handle optional CLASS.
      	* interface.c (gfc_compare_actual_formal): Likewise.
      	* trans-array.c (gfc_trans_g77_array): Likewise.
      	* trans-decl.c (gfc_build_dummy_array_decl): Likewise.
      	* trans-types.c (gfc_sym_type): Likewise.
      	* primary.c (gfc_variable_attr): Fixes for dummy and
      	pointer when 'class%_data' is passed.
      	* trans-expr.c (set_dtype_for_unallocated, gfc_conv_procedure_call):
      	For assumed-rank dummy, fix setting rank for dealloc/notassoc actual
      	and setting ubound to -1 for assumed-size actuals.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/assumed_rank_24.f90: New test.
      eb92cd57
    • openmp: Avoid calling clear_type_padding_in_mask in the common case where there can't be any padding · 8e1fe3f7
      Jakub Jelinek authored
      We can use the clear_padding_type_may_have_padding_p function, which
      is conservative for e.g. RECORD_TYPE/UNION_TYPE, but for the floating and
      complex floating types is accurate.  clear_type_padding_in_mask is
      more expensive because we need to allocate memory, fill it, call the function
      which itself is more expensive and then analyze the memory, so for the
      common case of float/double atomics or even long double on most targets
      we can avoid that.
      
      2021-10-12  Jakub Jelinek  <jakub@redhat.com>
      
      gcc/
      	* gimple-fold.h (clear_padding_type_may_have_padding_p): Declare.
      	* gimple-fold.c (clear_padding_type_may_have_padding_p): No longer
      	static.
      gcc/c-family/
      	* c-omp.c (c_finish_omp_atomic): Use
      	clear_padding_type_may_have_padding_p.
      8e1fe3f7
    • openmp: Add documentation for omp_{get_max,set_num}_teams and omp_{s,g}et_teams_thread_limit · 4096bf82
      Jakub Jelinek authored
      This patch adds documentation for these new OpenMP 5.1 APIs as well as
      two new environment variables - OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT.
      
      2021-10-12  Jakub Jelinek  <jakub@redhat.com>
      
      	* libgomp.texi (omp_get_max_teams, omp_get_teams_thread_limit,
      	omp_set_num_teams, omp_set_teams_thread_limit, OMP_NUM_TEAMS,
      	OMP_TEAMS_THREAD_LIMIT): Document.
      4096bf82
    • openmp: Fix up warnings on libgomp.info build · de7fa706
      Jakub Jelinek authored
      When building libgomp documentation, I see
      makeinfo --split-size=5000000  -I ../../../libgomp/../gcc/doc/include -I ../../../libgomp -o libgomp.info ../../../libgomp/libgomp.texi
      ../../../libgomp/libgomp.texi:503: warning: node next `omp_get_default_device' in menu `omp_get_device_num' and in sectioning `omp_get_dynamic' differ
      ../../../libgomp/libgomp.texi:528: warning: node prev `omp_get_dynamic' in menu `omp_get_device_num' and in sectioning `omp_get_default_device' differ
      ../../../libgomp/libgomp.texi:560: warning: node next `omp_get_initial_device' in menu `omp_get_level' and in sectioning `omp_get_device_num' differ
      ../../../libgomp/libgomp.texi:587: warning: node next `omp_get_device_num' in menu `omp_get_dynamic' and in sectioning `omp_get_level' differ
      ../../../libgomp/libgomp.texi:587: warning: node prev `omp_get_device_num' in menu `omp_get_default_device' and in sectioning `omp_get_initial_device' differ
      ../../../libgomp/libgomp.texi:615: warning: node prev `omp_get_level' in menu `omp_get_initial_device' and in sectioning `omp_get_device_num' differ
      warnings.  This patch fixes those.
      
      2021-10-12  Jakub Jelinek  <jakub@redhat.com>
      
      	* libgomp.texi (omp_get_device_num): Move @node before omp_get_dynamic
      	to avoid makeinfo warnings.
      de7fa706
    • openmp: Add testsuite coverage for omp_{get_max,set_num}_teams and omp_{s,g}et_teams_thread_limit · 88f5ad52
      Jakub Jelinek authored
      This adds (C/C++ only) testsuite coverage for these new OpenMP 5.1 APIs.
      
      2021-10-12  Jakub Jelinek  <jakub@redhat.com>
      
      	* testsuite/libgomp.c-c++-common/icv-3.c: New test.
      	* testsuite/libgomp.c-c++-common/icv-4.c: New test.
      88f5ad52
    • libgomp: alloc* test fixes [PR102628, PR102668] · 342aedf0
      Jakub Jelinek authored
      As reported, the alloc-9.c test and alloc-{1,2,3}.F* and alloc-11.f90
      tests fail on powerpc64-linux with -m32.
      The reason why it fails just there is that malloc doesn't guarantee there
      128-bit alignment (historically glibc guaranteed 2 * sizeof (void *)
      alignment from malloc).
      
      There are two separate issues.
      One is a thinko on my side.
      In this part of alloc-9.c test (copied to alloc-11.f90), we have
      2 allocators, a with pool size 1024B and alignment 16B and default fallback
      and a2 with pool size 512B and alignment 32B and a as fallback allocator.
      We start at no allocations in both at line 194 and do:
        p = (int *) omp_alloc (sizeof (int), a2);
      // This succeeds in a2 and needs 4+overhead bytes (which includes the 32B alignment)
        p = (int *) omp_realloc (p, 420, a, a2);
      // This allocates 420 bytes+overhead in a, with 16B alignment and deallocates the above
        q = (int *) omp_alloc (sizeof (int), a);
      // This allocates 4+overhead bytes in a, with 16B alignment
        q = (int *) omp_realloc (q, 420, a2, a);
      // This allocates 420+overhead in a2 with 32B alignment
        q = (int *) omp_realloc (q, 768, a2, a2);
      // This attempts to reallocate, but as there are elevated alignment
      // requirements doesn't try to just realloc (even if it wanted to try that
      // a2 is almost full, with 512-420-overhead bytes left in it), so it
      // tries to alloc in a2, but there is no space left in the pool, falls
      // back to a, which already has 420+overhead bytes allocated in it and
      // 1024-420-overhead bytes left and so fails too and falls back to the default
      // non-pool allocator that allocates it, but doesn't guarantee alignment
      // higher than malloc guarantees.
      // But, the test expected 16B alignment.
      
      So, I've slightly lowered the allocation sizes in that part of the test
      420->320 and 768 -> 568, so that the last test still fails to allocate
      in a2 (568 > 512-320-overhead) but succeeds in a as fallback, which was
      the intent of the test.
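
      The arithmetic can be checked with a small sketch. The per-allocation
      overhead is not stated in this message, so kOverhead below is an assumed
      value chosen only for illustration, and fits() is a hypothetical helper,
      not a libgomp function.

```cpp
#include <cassert>
#include <cstddef>

// ASSUMPTION: bookkeeping bytes per allocation; the real value is an
// implementation detail of the allocator.
constexpr std::size_t kOverhead = 32;

// True if a request of `request` bytes fits into a pool of `pool_size` bytes
// that already holds one live allocation of `live` bytes.
constexpr bool fits(std::size_t pool_size, std::size_t live,
                    std::size_t request) {
  return (live + kOverhead) + (request + kOverhead) <= pool_size;
}

// Old sizes: 768 bytes fit neither in a2 (512B) nor in a (1024B), so the
// request ended up in the default allocator.
static_assert(!fits(512, 420, 768), "old sizes: a2 is too small");
static_assert(!fits(1024, 420, 768), "old sizes: a is too small as well");

// New sizes: 568 bytes still do not fit in a2, but do fit in a.
static_assert(!fits(512, 320, 568), "new sizes: a2 is still too small");
static_assert(fits(1024, 320, 568), "new sizes: the fallback a succeeds");
```

      Under this assumption the new sizes hit the intended path: failure in a2
      followed by success in the fallback allocator a.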
      
      Another thing is that alloc-1.F90 seems to be a transcription of
      libgomp.c-c++-common/alloc-1.c into Fortran, but alloc-1.c had:
        q = (int *) omp_alloc (768, a2);
        if ((((uintptr_t) q) % 16) != 0)
          abort ();
        q[0] = 7;
        q[767 / sizeof (int)] = 8;
        r = (int *) omp_alloc (512, a2);
        if ((((uintptr_t) r) % __alignof (int)) != 0)
          abort ();
      there but Fortran has:
              cq = omp_alloc (768_c_size_t, a2)
              if (mod (transfer (cq, intptr), 16_c_intptr_t) /= 0) stop 12
              call c_f_pointer (cq, q, [768 / c_sizeof (i)])
              q(1) = 7
              q(768 / c_sizeof (i)) = 8
              cr = omp_alloc (512_c_size_t, a2)
              if (mod (transfer (cr, intptr), 16_c_intptr_t) /= 0) stop 13
      I'm changing the latter to 4_c_intptr_t because other spots in the
      testcase do that, Fortran sadly doesn't have c_alignof, but strictly
      speaking it isn't correct, __alignof (int) could be on some architectures
      smaller than 4.
      So probably alloc-1.F90 etc. should also have
      ! { dg-additional-sources alloc-7.c }
      ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
      and use get__alignof_int.
      
      2021-10-12  Jakub Jelinek  <jakub@redhat.com>
      
      	PR libgomp/102628
      	PR libgomp/102668
      	* testsuite/libgomp.c-c++-common/alloc-9.c (main): Decrease
      	allocation sizes from 420 to 320 and from 768 to 568.
      	* testsuite/libgomp.fortran/alloc-11.f90: Likewise.
      	* testsuite/libgomp.fortran/alloc-1.F90: Change expected alignment
      	for cr from 16 to 4.
      342aedf0
    • vectorizer: Fix up -fsimd-cost-model= handling · fab2f61d
      Jakub Jelinek authored
      >	* testsuite/libgomp.c++/scan-10.C: Add option -fvect-cost-model=cheap.
      
      I don't think this is the right thing to do.
      This just means that at some point between 2013, when -fsimd-cost-model
      was introduced, and now, the -fsimd-cost-model= option at least partially
      stopped working properly.
      As documented, -fsimd-cost-model= overrides the -fvect-cost-model= setting
      for OpenMP simd loops (loop->force_vectorize is true) if specified differently
      from default.
      In tree-vectorizer.h we have:
      static inline bool
      unlimited_cost_model (loop_p loop)
      {
        if (loop != NULL && loop->force_vectorize
            && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
          return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
        return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED);
      }
      and use it in various places, but we also just use flag_vect_cost_model
      in lots of places (and in one spot use flag_simd_cost_model, not sure if
      we are sure it is a force_vectorize loop or what).
      
      So, IMHO we should change the above inline function to
      loop_cost_model and let it return the cost model and then just
      reimplement unlimited_cost_model as
      return loop_cost_model (loop) == VECT_COST_MODEL_UNLIMITED;
      and then adjust the direct uses of the flag and revert these changes.
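
      A self-contained sketch of that refactoring follows. The enum values,
      the loop struct and the two flag variables are simplified stand-ins for
      the GCC internals, declared here only so that the sketch compiles on its
      own.

```cpp
#include <cassert>

// Simplified stand-ins for the GCC definitions (assumed, not copied).
enum vect_cost_model {
  VECT_COST_MODEL_UNLIMITED,
  VECT_COST_MODEL_CHEAP,
  VECT_COST_MODEL_DYNAMIC,
  VECT_COST_MODEL_DEFAULT
};

struct loop {
  bool force_vectorize;  // set for OpenMP simd loops
};
typedef loop *loop_p;

vect_cost_model flag_vect_cost_model = VECT_COST_MODEL_DEFAULT;
vect_cost_model flag_simd_cost_model = VECT_COST_MODEL_DEFAULT;

// Return the cost model that applies to LOOP: the simd setting wins for
// force_vectorize loops unless it was left at its default.
inline vect_cost_model loop_cost_model(loop_p loop) {
  if (loop != nullptr && loop->force_vectorize
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
    return flag_simd_cost_model;
  return flag_vect_cost_model;
}

// Reimplemented on top of loop_cost_model, as proposed above.
inline bool unlimited_cost_model(loop_p loop) {
  return loop_cost_model(loop) == VECT_COST_MODEL_UNLIMITED;
}
```

      Callers that previously read flag_vect_cost_model directly would call
      loop_cost_model (loop) instead, so that simd loops pick up the
      -fsimd-cost-model= setting everywhere.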
      
      2021-10-12  Jakub Jelinek  <jakub@redhat.com>
      
      gcc/
      	* tree-vectorizer.h (loop_cost_model): New function.
      	(unlimited_cost_model): Use it.
      	* tree-vect-loop.c (vect_analyze_loop_costing): Use loop_cost_model
      	call instead of flag_vect_cost_model.
      	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise.
      	(vect_prune_runtime_alias_test_list): Likewise.  Also use it instead
      	of flag_simd_cost_model.
      gcc/testsuite/
      	* gcc.dg/gomp/simd-2.c: Remove option -fvect-cost-model=cheap.
      	* gcc.dg/gomp/simd-3.c: Likewise.
      libgomp/
      	* testsuite/libgomp.c/scan-11.c: Remove option -fvect-cost-model=cheap.
      	* testsuite/libgomp.c/scan-12.c: Likewise.
      	* testsuite/libgomp.c/scan-13.c: Likewise.
      	* testsuite/libgomp.c/scan-14.c: Likewise.
      	* testsuite/libgomp.c/scan-15.c: Likewise.
      	* testsuite/libgomp.c/scan-16.c: Likewise.
      	* testsuite/libgomp.c/scan-17.c: Likewise.
      	* testsuite/libgomp.c/scan-18.c: Likewise.
      	* testsuite/libgomp.c/scan-19.c: Likewise.
      	* testsuite/libgomp.c/scan-20.c: Likewise.
      	* testsuite/libgomp.c/scan-21.c: Likewise.
      	* testsuite/libgomp.c/scan-22.c: Likewise.
      	* testsuite/libgomp.c++/scan-9.C: Likewise.
      	* testsuite/libgomp.c++/scan-10.C: Likewise.
      	* testsuite/libgomp.c++/scan-11.C: Likewise.
      	* testsuite/libgomp.c++/scan-12.C: Likewise.
      	* testsuite/libgomp.c++/scan-13.C: Likewise.
      	* testsuite/libgomp.c++/scan-14.C: Likewise.
      	* testsuite/libgomp.c++/scan-15.C: Likewise.
      	* testsuite/libgomp.c++/scan-16.C: Likewise.
      fab2f61d
    • Support reduc_{plus,smax,smin,umax,umin}_scal_v4qi. · 73c535a0
      liuhongt authored
      gcc/ChangeLog
      
      	PR target/102483
      	* config/i386/i386-expand.c (emit_reduc_half): Handle
      	V4QImode.
      	* config/i386/mmx.md (reduc_<code>_scal_v4qi): New expander.
      	(reduc_plus_scal_v4qi): Ditto.
      
      gcc/testsuite/ChangeLog
      
      	* gcc.target/i386/pr102483.c: New test.
      	* gcc.target/i386/pr102483-2.c: New test.
      73c535a0
    • Adjust testcase for O2 vectorization enabling · d61ce6ab
      liuhongt authored
      This issue was observed in rs6000 specific PR102658 as well.
      
      I've looked into it a bit; it's caused by the "conditional store replacement", which
      was originally disabled without vectorization, as in the code below.
      
        /* If either vectorization or if-conversion is disabled then do
           not sink any stores.  */
        if (param_max_stores_to_sink == 0
            || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize)
            || !flag_tree_loop_if_convert)
          return false;
      
      The new change makes the innermost loop look like
      
      for (int c1 = 0; c1 <= 1499; c1 += 1) {
        if (c1 <= 500) {
           S_10(c0, c1);
        } else {
            S_9(c0, c1);
        }
        S_11(c0, c1);
      }
      
      which cannot be split into:
      
      for (int c1 = 0; c1 <= 500; c1 += 1)
        S_10(c0, c1);
      
      for (int c1 = 501; c1 <= 1499; c1 += 1)
        S_9(c0, c1);
      
      So instead of disabling vectorization, could we just disable this cs replacement
      with parameter "--param max-stores-to-sink=0"?
      
      I tested this proposal on ppc64le, it should work as well.
      
      2021-10-11  Kewen Lin  <linkw@linux.ibm.com>
      
      libgomp/ChangeLog:
      
      	* testsuite/libgomp.graphite/force-parallel-8.c: Add --param max-stores-to-sink=0.
      d61ce6ab
    • rs6000: Correct several errant dg-require-effective-target · 82bc9355
      Paul A. Clarke authored
      I misspelled the dg-require-effective-target attribute "vsx_hw" in
      recent commits, causing the affected tests to fail.  Correct the spelling.
      
      2021-10-11  Paul A. Clarke  <pc@us.ibm.com>
      
      gcc/testsuite
      	* gcc.target/powerpc/pr78102.c: Fix dg-require-effective-target.
      	* gcc.target/powerpc/sse4_1-packusdw.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmaxsb.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmaxsd.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmaxud.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmaxuw.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pminsb.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pminsd.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pminud.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pminuw.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovsxbd.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovsxbw.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovsxwd.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovzxbd.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovzxbq.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovzxbw.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovzxdq.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovzxwd.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmovzxwq.c: Likewise.
      	* gcc.target/powerpc/sse4_1-pmulld.c: Likewise.
      	* gcc.target/powerpc/sse4_2-pcmpgtq.c: Likewise.
      	* gcc.target/powerpc/sse4_1-phminposuw.c: Use correct
      	dg-require-effective-target.
      82bc9355
    • rs6000: Support more SSE4 "cmp", "mul", "pack" intrinsics · 29fb1e83
      Paul A. Clarke authored
      Function signatures and decorations match gcc/config/i386/smmintrin.h.
      
      Also, copy tests for:
      - _mm_cmpeq_epi64
      - _mm_mullo_epi32, _mm_mul_epi32
      - _mm_packus_epi32
      - _mm_cmpgt_epi64 (SSE4.2)
      
      from gcc/testsuite/gcc.target/i386.
      
      2021-10-11  Paul A. Clarke  <pc@us.ibm.com>
      
      gcc
      	* config/rs6000/smmintrin.h (_mm_cmpeq_epi64, _mm_cmpgt_epi64,
      	_mm_mullo_epi32, _mm_mul_epi32, _mm_packus_epi32): New.
      	* config/rs6000/nmmintrin.h: Copy from i386, tweak to suit.
      
      gcc/testsuite
      	* gcc.target/powerpc/pr78102.c: Copy from gcc.target/i386,
      	adjust dg directives to suit.
      	* gcc.target/powerpc/sse4_1-packusdw.c: Same.
      	* gcc.target/powerpc/sse4_1-pcmpeqq.c: Same.
      	* gcc.target/powerpc/sse4_1-pmuldq.c: Same.
      	* gcc.target/powerpc/sse4_1-pmulld.c: Same.
      	* gcc.target/powerpc/sse4_2-pcmpgtq.c: Same.
      	* gcc.target/powerpc/sse4_2-check.h: Copy from gcc.target/i386,
      	tweak to suit.
      29fb1e83
    • rs6000: Support SSE4.1 "cvt" intrinsics · 285d75a4
      Paul A. Clarke authored
      Function signatures and decorations match gcc/config/i386/smmintrin.h.
      
      Also, copy tests for:
      - _mm_cvtepi8_epi16, _mm_cvtepi8_epi32, _mm_cvtepi8_epi64
      - _mm_cvtepi16_epi32, _mm_cvtepi16_epi64
      - _mm_cvtepi32_epi64
      - _mm_cvtepu8_epi16, _mm_cvtepu8_epi32, _mm_cvtepu8_epi64
      - _mm_cvtepu16_epi32, _mm_cvtepu16_epi64
      - _mm_cvtepu32_epi64
      
      from gcc/testsuite/gcc.target/i386.
      
      sse4_1-pmovsxbd.c, sse4_1-pmovsxbq.c, and sse4_1-pmovsxbw.c were
      modified from using "char" types to "signed char" types, because
      the default is unsigned on powerpc.
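
      Why the spelling of the type matters can be shown with a short sketch.
      The helper names are hypothetical; only the standard
      std::numeric_limits<char>::is_signed query is relied upon.

```cpp
#include <cassert>
#include <limits>

// Whether plain char is signed is implementation-defined; this reports what
// the current target does.
constexpr bool plain_char_is_signed = std::numeric_limits<char>::is_signed;

// Widening a plain char sign-extends only where char is signed.
inline int widen_plain(char c) { return c; }

// Widening a signed char sign-extends on every target.
inline int widen_signed(signed char c) { return c; }
```

      A test that feeds negative byte values through "char" therefore sees
      different widened values depending on the target, while "signed char"
      behaves the same everywhere.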
      
      2021-10-11  Paul A. Clarke  <pc@us.ibm.com>
      
      gcc
      	* config/rs6000/smmintrin.h (_mm_cvtepi8_epi16, _mm_cvtepi8_epi32,
      	_mm_cvtepi8_epi64, _mm_cvtepi16_epi32, _mm_cvtepi16_epi64,
      	_mm_cvtepi32_epi64, _mm_cvtepu8_epi16, _mm_cvtepu8_epi32,
      	_mm_cvtepu8_epi64, _mm_cvtepu16_epi32, _mm_cvtepu16_epi64,
      	_mm_cvtepu32_epi64): New.
      
      gcc/testsuite
      	* gcc.target/powerpc/sse4_1-pmovsxbd.c: Copy from gcc.target/i386,
      	adjust dg directives to suit.
      	* gcc.target/powerpc/sse4_1-pmovsxbq.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovsxbw.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovsxdq.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovsxwd.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovsxwq.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovzxbd.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovzxbq.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovzxbw.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovzxdq.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovzxwd.c: Same.
      	* gcc.target/powerpc/sse4_1-pmovzxwq.c: Same.
      285d75a4
    • rs6000: Simplify some SSE4.1 "test" intrinsics · 1ec08caf
      Paul A. Clarke authored
      Copy some simple redirections from i386 <smmintrin.h>, for:
      - _mm_test_all_zeros
      - _mm_test_all_ones
      - _mm_test_mix_ones_zeros
      
      2021-10-11  Paul A. Clarke  <pc@us.ibm.com>
      
      gcc
      	* config/rs6000/smmintrin.h (_mm_test_all_zeros,
      	_mm_test_all_ones, _mm_test_mix_ones_zeros): Rewrite as macro.
      1ec08caf
    • rs6000: Support SSE4.1 "min" and "max" intrinsics · 2be6f6d4
      Paul A. Clarke authored
      Function signatures and decorations match gcc/config/i386/smmintrin.h.
      
      Also, copy tests for _mm_min_epi8, _mm_min_epu16, _mm_min_epi32,
      _mm_min_epu32, _mm_max_epi8, _mm_max_epu16, _mm_max_epi32, _mm_max_epu32
      from gcc/testsuite/gcc.target/i386.
      
      sse4_1-pmaxsb.c and sse4_1-pminsb.c were modified from using
      "char" types to "signed char" types, because the default is unsigned on
      powerpc.
      
      2021-10-11  Paul A. Clarke  <pc@us.ibm.com>
      
      gcc
      	* config/rs6000/smmintrin.h (_mm_min_epi8, _mm_min_epu16,
      	_mm_min_epi32, _mm_min_epu32, _mm_max_epi8, _mm_max_epu16,
      	_mm_max_epi32, _mm_max_epu32): New.
      
      gcc/testsuite
      	* gcc.target/powerpc/sse4_1-pmaxsb.c: Copy from gcc.target/i386.
      	* gcc.target/powerpc/sse4_1-pmaxsd.c: Same.
      	* gcc.target/powerpc/sse4_1-pmaxud.c: Same.
      	* gcc.target/powerpc/sse4_1-pmaxuw.c: Same.
      	* gcc.target/powerpc/sse4_1-pminsb.c: Same.
      	* gcc.target/powerpc/sse4_1-pminsd.c: Same.
      	* gcc.target/powerpc/sse4_1-pminud.c: Same.
      	* gcc.target/powerpc/sse4_1-pminuw.c: Same.
      2be6f6d4
    • Daily bump. · 732d7638
      GCC Administrator authored
      732d7638
  2. Oct 11, 2021
    • Add obj-c++.srcman target to gcc/objcp/Makefile. · 30cce6f6
      Eric Gallager authored
      
      Closes #56604
      
      Signed-off-by: Eric Gallager <egallager@gcc.gnu.org>
      
      gcc/objcp/ChangeLog:
      	PR objc++/56604
      	* Make-lang.in: Add obj-c++.srcman: line.
      30cce6f6
    • Revert accidental change in ipa-modref-tree.h · 150493d1
      Jan Hubicka authored
      	* ipa-modref-tree.h (struct modref_access_node): Revert
      	accidental change.
      	(struct modref_ref_node): Likewise.
      150493d1
    • libstdc++: Add wrapper for internal uses of std::terminate · 250ddf4c
      Jonathan Wakely authored
      This adds an inline wrapper for std::terminate that doesn't add the
      declaration of std::terminate to namespace std. This allows the
      library to terminate without including all of <exception>.
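
      One way such a wrapper can be written is with a block-scope function
      declaration, sketched below. The demo namespace, stop() and call_stop()
      are hypothetical stand-ins, and stop() only counts calls so that the
      sketch can be exercised without ending the process.

```cpp
#include <cassert>

namespace demo {

// "Header" part: the wrapper declares stop() at block scope, so the name is
// usable inside the wrapper but is not made visible to code that merely
// includes this header.
inline void call_stop() noexcept {
  void stop() noexcept;  // block-scope declaration of demo::stop
  stop();
}

}  // namespace demo

namespace demo {

// "Library" part: the real definition lives elsewhere. Here it only counts
// how often it was called.
int stop_calls = 0;
void stop() noexcept { ++stop_calls; }

}  // namespace demo
```

      Code that includes only the "header" part can call demo::call_stop()
      without ever seeing a namespace-scope declaration of demo::stop.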
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/atomic_timed_wait.h: Remove unused header.
      	* include/bits/c++config (std::__terminate): Define.
      	* include/bits/semaphore_base.h: Remove <exception> and use
      	__terminate instead of terminate.
      	* include/bits/std_thread.h: Likewise.
      	* libsupc++/eh_terminate.cc (std::terminate): Use qualified-id
      	to call __cxxabiv1::__terminate.
      250ddf4c
    • Jonathan Wakely's avatar
      libstdc++: Simplify std::basic_regex::assign · 247bac50
      Jonathan Wakely authored
      We know that if __is_contiguous_iterator is true then we have a pointer
      or a __normal_iterator that wraps a pointer, so we don't need to use
      std::__to_address.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/regex.h (basic_regex::assign(Iter, Iter)): Avoid
      	std::__to_address by using pointer directly or using base()
      	member of __normal_iterator.
      247bac50
    • Jonathan Wakely's avatar
      libstdc++: Fix std::numeric_limits::lowest() test for strict modes · 45ba5426
      Jonathan Wakely authored
      This test uses std::is_integral to decide whether we are testing an
      integral or floating-point type. But that fails for __int128 because
      is_integral<__int128> is false in strict modes. By using
      numeric_limits::is_integer instead we get the right answer for all types
      that have a numeric_limits specialization.
      
      We can also simplify the test by removing the unnecessary tag
      dispatching.
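
The idea above can be sketched as follows. This is a minimal illustration, not the testsuite file itself: `expected_lowest`, `lowest_matches` and `check_lowest` are hypothetical names introduced here. Because `numeric_limits<T>::is_integer` is a constant member, a plain conditional is enough and no tag dispatching is required.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (not from the patch): the value lowest() should
// return for any T that has a numeric_limits specialization.
template<typename T>
constexpr T expected_lowest()
{
  // For integer types lowest() equals min(); for floating-point types
  // it equals -max().
  return std::numeric_limits<T>::is_integer
    ? std::numeric_limits<T>::min()
    : -std::numeric_limits<T>::max();
}

// True if numeric_limits<T>::lowest() agrees with the expectation above.
template<typename T>
constexpr bool lowest_matches()
{
  return std::numeric_limits<T>::lowest() == expected_lowest<T>();
}

// Runs the check for a few standard arithmetic types.
inline void check_lowest()
{
  assert(lowest_matches<int>());
  assert(lowest_matches<unsigned long>());
  assert(lowest_matches<float>());
  assert(lowest_matches<double>());
}
```

Selecting on `numeric_limits<T>::is_integer` rather than `is_integral<T>` means the same check applies to any type with a `numeric_limits` specialization, which is the property the commit message relies on.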
      
      libstdc++-v3/ChangeLog:
      
      	* testsuite/18_support/numeric_limits/lowest.cc: Use
      	numeric_limits<T>::is_integer instead of is_integral<T>::value.
      45ba5426
    • Jonathan Wakely's avatar
      libstdc++: Add valid range assertions to std::basic_regex [PR89927] · 6b6788f8
      Jonathan Wakely authored
      This adds some debug assertions to basic_regex. They don't actually
      diagnose the error in the PR yet, but I have another patch to make them
      more effective.
      
      Also change the __glibcxx_assert(false) consistency checks to include a
      string literal that tells the user a bit more about why the process
      aborted. We could consider adding a __glibcxx_bug or
      __glibcxx_internal_error macro for this purpose, but ideally we'll never
      hit such bugs anyway so it shouldn't be needed.
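
One common way to attach a message to a "cannot happen" assertion is to negate a string literal. The sketch below uses the standard `assert` macro rather than `__glibcxx_assert`, and the `State` enum and `advance` function are hypothetical names chosen for illustration, not the scanner code from the patch.

```cpp
#include <cassert>

// Hypothetical scanner states, for illustration only.
enum class State { normal, in_bracket, in_brace };

// Returns a small code for each known state. Reaching the end of the
// function indicates an internal inconsistency.
inline int advance(State s)
{
  switch (s)
    {
    case State::normal:     return 0;
    case State::in_bracket: return 1;
    case State::in_brace:   return 2;
    }
  // A string literal converts to a non-null pointer, so !"..." is false.
  // The failed-assertion output then includes the text of the literal,
  // which says more than a bare assert(false).
  assert(!"unexpected scanner state");
  return -1;
}
```

When assertions are enabled and the final line is reached, the diagnostic includes the asserted expression, so the literal acts as a short explanation of why the process aborted.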
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/89927
      	* include/bits/regex.h (basic_regex(const _Ch_type*, size_t)):
      	Add __glibcxx_requires_string_len assertion.
      	(basic_regex::assign(InputIterator, InputIterator)): Add
      	__glibcxx_requires_valid_range assertion.
      	* include/bits/regex_scanner.tcc (_Scanner::_M_advance())
      	(_Scanner::_M_scan_normal()): Use string literal in assertions.
      6b6788f8
    • Jonathan Wakely's avatar
      libstdc++: Fix std::match_results::end() for failed matches [PR102667] · 84088dc4
      Jonathan Wakely authored
      The end() function needs to consider whether the underlying vector is
      empty, not whether the match_results object is empty. That's because the
      underlying vector will always contain at least three elements for a
      match_results object that is "ready". It contains three extra elements
      which are stored in the vector but are not considered part of the sequence,
      and so should not be part of the [begin(),end()) range.
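
The behaviour described above can be observed from user code with a small probe. This is a sketch under stated assumptions: `FailedMatchReport` and `inspect_failed_match` are hypothetical names, and the pattern is chosen only so that the match fails.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical report type describing a match_results object after a
// failed match.
struct FailedMatchReport
{
  bool matched;                // result of regex_match
  bool ready;                  // match_results::ready()
  bool empty;                  // match_results::empty()
  std::size_t size;            // match_results::size()
  std::ptrdiff_t range_length; // distance(begin(), end())
};

inline FailedMatchReport inspect_failed_match()
{
  const std::string text = "abc";
  const std::regex pattern("x+");   // cannot match "abc"
  std::smatch m;

  FailedMatchReport r;
  r.matched = std::regex_match(text, m, pattern);
  assert(!r.matched);               // the pattern was chosen to fail
  r.ready = m.ready();
  r.empty = m.empty();
  r.size = m.size();
  // The [begin(), end()) range should cover only the size() visible
  // sub-matches, not the extra elements kept in the underlying vector.
  r.range_length = std::distance(m.begin(), m.end());
  return r;
}
```

For a failed match the object is ready, `empty()` is true and `size()` is 0. With the fix, `range_length` should also be 0; a library without the fix may report a non-empty range.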
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/102667
      	* include/bits/regex.h (match_results::empty()): Optimize by
      	calling the base function directly.
      	(match_results::end()): Check _Base_type::empty() not empty().
      	* testsuite/28_regex/match_results/102667.C: New test.
      84088dc4
    • Jan Hubicka's avatar
      Commonize ipa-pta constraint generation for calls · 008e7397
      Jan Hubicka authored
      Commonize the three paths that produce constraints for function calls
      and make them more flexible, so we can implement new features more easily.  The
      main idea is to not special-case pure and const, since we can now describe all of
      pure/const via their EAF flags (implicit_const_eaf_flags and
      implicit_pure_eaf_flags) and info on the existence of global memory loads/stores
      in the function, which is readily available in the modref tree.
      
      While rewriting the function, I dropped some of the optimizations in the way we
      generate constraints. Some of them we may want to add back, but I think the
      constraint solver should get rid of them quickly, so this looks like
      a bit of premature optimization.
      
      We now always produce one additional PTA variable (callescape) for things that
      escape into a function call and thus can be stored to parameters or global memory
      (if modified). This is no longer the same as global escape in case the function
      is not reading global memory. It is also not the same as call use, since we now
      understand the fact that interposable functions may use a parameter in a way that
      is not relevant for PTA (so we cannot optimize out stores initializing the
      memory, but we can safely assume that the stored pointers do not escape).
      
      Compared to the previous code, we now handle EAF_NOT_RETURNED correctly in all
      cases (previously we did so only when all parameters had the flag) and also handle
      NOCLOBBER in more cases (since we distinguish between global escape and
      call escape). Because I commonized the code handling args and static chains, we
      could now easily extend modref to also track flags for the static chain and
      return slot, which I plan to do next.
      
      Otherwise I put some effort into producing constraints that produce similar
      solutions as before (so it is harder to debug differences). For example if
      global memory is written one can simply move callescape to escape rather than
      making everything escape by its own constraints, but it affects ipa-pta
      testcases.
      
      gcc/ChangeLog:
      
      	* ipa-modref-tree.h (modref_tree::global_access_p): New member
      	function.
      	* ipa-modref.c:
      	(implicit_const_eaf_flags, implicit_pure_eaf_flags,
      	ignore_stores_eaf_flags): Move to ipa-modref.h.
      	(remove_useless_eaf_flags): Remove early exit on NOCLOBBER.
      	(modref_summary::global_memory_read_p): New member function.
      	(modref_summary::global_memory_written_p): New member function.
      	* ipa-modref.h (modref_summary::global_memory_read_p,
      	modref_summary::global_memory_written_p): Declare.
      	(implicit_const_eaf_flags, implicit_pure_eaf_flags,
      	ignore_stores_eaf_flags): Move here.
      	* tree-ssa-structalias.c: Include ipa-modref-tree.h, ipa-modref.h
      	and attr-fnspec.h.
      	(handle_rhs_call): Rewrite.
      	(handle_call_arg): New function.
      	(determine_global_memory_access): New function.
      	(handle_const_call): Remove.
      	(handle_pure_call): Remove.
      	(find_func_aliases_for_call): Update use of handle_rhs_call.
      	(compute_points_to_sets): Handle global memory accesses
      	selectively.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/torture/ssa-pta-fn-1.c: Fix template; add noipa.
      	* gcc.dg/tree-ssa/pta-callused.c: Fix template.
      008e7397
    • Patrick Palka's avatar
      c++: Add testcase for already-fixed PR [PR102643] · 0de8c2f8
      Patrick Palka authored
      Fixed with r12-1744.
      
      	PR c++/102643
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/class-deduction-alias11.C: New test.
      0de8c2f8
    • Diane Meirowitz's avatar
      doc: improve -fsanitize=undefined description · 1c0a83ef
      Diane Meirowitz authored
      gcc/ChangeLog:
      	* doc/invoke.texi: Add link to UndefinedBehaviorSanitizer
      	documentation, mention UBSAN_OPTIONS, similar to what is done
      	for AddressSanitizer.
      1c0a83ef
    • Jonathan Wakely's avatar
      f8582398