  1. Sep 14, 2022
    • tree-optimization/106938 - cleanup abnormal edges after inlining · cd14c97c
      Richard Biener authored
      After inlining and IPA transforms we run fixup_cfg to fix up CFG
      effects in other functions.  But that fails to clean abnormal
      edges from non-pure/const calls which might no longer be necessary
      when ->calls_setjmp is false.  The following ensures this happens
      and refactors things so we call EH/abnormal cleanup only on the
      last stmt in a block.
      
      	PR tree-optimization/106938
      	* tree-cfg.cc (execute_fixup_cfg): Purge dead abnormal
      	edges for all last stmts in a block.  Do EH cleanup
      	only on the last stmt in a block.
      
      	* gcc.dg/pr106938.c: New testcase.
      cd14c97c
    • [PR106936] Remove assert from get_value_range. · 12a8d5e2
      Aldy Hernandez authored
      This assert was put here to make sure that the legacy
      get_value_range() wasn't being called on stuff that legacy couldn't
      handle (floats, etc), because the result would ultimately be copied
      into a value_range_equiv.
      
      In this case, simplify_casted_cond() is calling it on an offset_type
      which is neither an integer nor a pointer.  However, range_of_expr
      happily punted on it, and then the fallthru code set the range to
      VARYING.  As value_range_equiv can store VARYING types of anything
      (including types it can't handle), this is fine.
      
      The easiest thing to do is remove the assert.  If someone from the non
      legacy world tries to get a non integer/pointer range here, it's going
      to blow up anyhow because the temporary in get_value_range is
      int_range_max.
      
      	PR tree-optimization/106936
      
      gcc/ChangeLog:
      
      	* value-query.cc (range_query::get_value_range): Remove assert.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/tree-ssa/pr106936.C: New test.
      12a8d5e2
    • Drop unused variable · 1457be6d
      Jan-Benedict Glaw authored
      With the "STABS: remove -gstabs and -gxcoff functionality" patch, a left-over
      `start` variable remained unused:
      
      /usr/lib/gcc-snapshot/bin/g++  -fno-PIE -c   -g -O2   -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libcody  -I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I../../gcc/gcc/../libbacktrace   -o mips.o -MT mips.o -MMD -MP -MF ./.deps/mips.TPo ../../gcc/gcc/config/mips/mips.cc
      ../../gcc/gcc/config/mips/mips.cc: In function 'void mips_option_override()':
      ../../gcc/gcc/config/mips/mips.cc:20021:10: error: unused variable 'start' [-Werror=unused-variable]
      20021 |   int i, start, regno, mode;
            |          ^~~~~
      
      2022-09-14  Jan-Benedict Glaw  <jbglaw@lug-owl.de>
      
      gcc/
      	* config/mips/mips.cc (mips_option_override): Drop unused variable.
      1457be6d
    • OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer) · b57abd07
      Julian Brown authored
      This patch reimplements the omp_target_reorder_clauses function in
      anticipation of supporting "deeper" struct mappings (that is, with
      several structure dereference operators, or similar).
      
      The idea is that in place of the (possibly quadratic) algorithm in
      omp_target_reorder_clauses that greedily moves clauses containing
      addresses that are subexpressions of other addresses before those other
      addresses, we employ a topological sort algorithm to calculate a proper
      order for map clauses. This should run in linear time, and hopefully
      handles degenerate cases where multiple "levels" of indirect accesses
      are present on a given directive.
      
      The new method also takes care to keep clause groups together, addressing
      the concerns raised in:
      
        https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570501.html
      
      To figure out if some given clause depends on a base pointer in another
      clause, we strip off the outer layers of the address expression, and check
      (via a tree_operand_hash hash table we have built) if the result is a
      "base pointer" as defined in OpenMP 5.0 (1.2.6 Data Terminology). There
      are some subtleties involved, however:
      
       - We must treat MEM_REF with zero offset the same as INDIRECT_REF.
         This should probably be fixed in the front ends instead so we always
         use a canonical form (probably INDIRECT_REF). The following patch
         shows one instance of the problem, but there may be others:
      
         https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571382.html
      
       - Mapping a whole struct implies mapping each of that struct's
         elements, which may be base pointers. Because those base pointers
         aren't necessarily explicitly referenced in the directive in question,
         we treat the whole-struct mapping as a dependency instead.
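
The ordering idea described above can be sketched as a depth-first topological sort. This is a minimal illustration under stated assumptions, not the code in gimplify.cc: `Group`, `Mark`, `visit` and `tsort_groups` are hypothetical names, and each group's dependencies (the groups that map its base pointers) are assumed to have been computed already.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one group of map clauses (not GCC's
// omp_mapping_group).  The three marks drive the depth-first search.
enum class Mark { Unvisited, Temporary, Permanent };

struct Group {
  int id = 0;             // position in the original clause list
  std::vector<int> deps;  // ids of groups that map this group's base pointers
  Mark mark = Mark::Unvisited;
};

// Emit every dependency of groups[id] before the group itself.
// Returns false if a dependency cycle is found.
bool visit(std::vector<Group> &groups, int id, std::vector<int> &order) {
  Group &g = groups[id];
  if (g.mark == Mark::Permanent)
    return true;   // already emitted
  if (g.mark == Mark::Temporary)
    return false;  // reached a group that is still being visited: cycle
  g.mark = Mark::Temporary;
  for (int dep : g.deps)
    if (!visit(groups, dep, order))
      return false;
  g.mark = Mark::Permanent;
  order.push_back(id);
  return true;
}

// Each group and each dependency edge is looked at once, so the running
// time is linear in the number of groups plus dependencies.
bool tsort_groups(std::vector<Group> &groups, std::vector<int> &order) {
  order.clear();
  for (std::size_t i = 0; i < groups.size(); ++i)
    if (!visit(groups, static_cast<int>(i), order))
      return false;
  return true;
}
```

With group 0 depending on group 1 and group 1 depending on group 2, the sketch emits 2, 1, 0, so a base pointer is always mapped before the clauses that dereference it.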
      
      2022-09-13  Julian Brown  <julian@codesourcery.com>
      
      gcc/
      	* gimplify.cc (is_or_contains_p, omp_target_reorder_clauses): Delete
      	functions.
      	(omp_tsort_mark): Add enum.
      	(omp_mapping_group): Add struct.
      	(debug_mapping_group, omp_get_base_pointer, omp_get_attachment,
      	omp_group_last, omp_gather_mapping_groups, omp_group_base,
      	omp_index_mapping_groups, omp_containing_struct,
      	omp_tsort_mapping_groups_1, omp_tsort_mapping_groups,
      	omp_segregate_mapping_groups, omp_reorder_mapping_groups): New
      	functions.
      	(gimplify_scan_omp_clauses): Call above functions instead of
      	omp_target_reorder_clauses, unless we've seen an error.
      	* omp-low.cc (scan_sharing_clauses): Avoid strict test if we haven't
      	sorted mapping groups.
      
      gcc/testsuite/
      	* g++.dg/gomp/target-lambda-1.C: Adjust expected output.
      	* g++.dg/gomp/target-this-3.C: Likewise.
      	* g++.dg/gomp/target-this-4.C: Likewise.
      b57abd07
    • testsuite/s390: Add -mzarch to ifcvt test cases. · 2aa5f880
      Robin Dapp authored
      Add missing -mzarch to ifcvt test cases.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/s390/ifcvt-one-insn-bool.c: Add -mzarch.
      	* gcc.target/s390/ifcvt-one-insn-char.c: Ditto.
      	* gcc.target/s390/ifcvt-two-insns-bool.c: Ditto.
      	* gcc.target/s390/ifcvt-two-insns-int.c: Ditto.
      	* gcc.target/s390/ifcvt-two-insns-long.c: Add -mzarch and change
      	long into long long.
      2aa5f880
    • testsuite/s390: Fix vperm-rev testcases. · 48970cba
      Robin Dapp authored
      Add -save-temps and tabs for matching.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/s390/vector/vperm-rev-z14.c: Add -save-temps.
      	* gcc.target/s390/vector/vperm-rev-z15.c: Likewise.
      48970cba
    • Disallow pointer operands for |, ^ and partly & [PR106878] · 645ef01a
      Jakub Jelinek authored
      My change to match.pd (that added the two simplifications this patch
      touches) results in more |/^/& assignments with pointer arguments,
      but since r12-1608 we reject pointer operands for BIT_NOT_EXPR.
      
      Disallowing them for BIT_NOT_EXPR and allowing for BIT_{IOR,XOR,AND}_EXPR
      leads to a match.pd maintenance nightmare (see one of the patches in the
      PR), so either we want to allow pointer operands on BIT_NOT_EXPR (but then
      we run into issues e.g. with the ranger which expects it can emulate
      BIT_NOT_EXPR ~X as -1 - X which doesn't work for pointers which don't
      support MINUS_EXPR), or the following patch disallows pointer arguments
      for all of BIT_{IOR,XOR,AND}_EXPR with the exception of BIT_AND_EXPR
      with INTEGER_CST last operand (for simpler pointer realignment).
      I had to tweak one reassoc optimization and the two match.pd
      simplifications.
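
As a source-level illustration only (the patch itself concerns GIMPLE, not C++ source), the sketch below shows the two facts the message relies on: realigning a pointer needs just a bitwise AND with a constant mask, and for integers ~X equals -1 - X. The helper names are hypothetical, and `align` is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Round an address down to a multiple of `align` (assumed to be a power
// of two).  At the source level the pointer goes through an integer type;
// the only bitwise operation needed is an AND with a constant mask.
inline void *align_down(void *p, std::uintptr_t align) {
  std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<void *>(bits & ~(align - 1));
}

// For integers, bitwise NOT can be emulated as -1 - x.  Unsigned
// arithmetic is used here so that the wrap-around is well defined.
inline bool not_equals_minus_one_minus_x(std::uint32_t x) {
  return ~x == static_cast<std::uint32_t>(-1) - x;
}
```

The second helper has no pointer counterpart, since pointers have no subtraction from -1, which is the difficulty the message describes for BIT_NOT_EXPR.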
      
      2022-09-14  Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/106878
      	* tree-cfg.cc (verify_gimple_assign_binary): Disallow pointer,
      	reference or OFFSET_TYPE BIT_IOR_EXPR, BIT_XOR_EXPR or, unless
      	the second argument is INTEGER_CST, BIT_AND_EXPR.
      	* match.pd ((type) X op CST -> (type) (X op ((type-x) CST)),
      	(type) (((type2) X) op Y) -> (X op (type) Y)): Punt for
      	POINTER_TYPE_P or OFFSET_TYPE.
      	* tree-ssa-reassoc.cc (optimize_range_tests_cmp_bitwise): For
      	pointers cast them to pointer sized integers first.
      
      	* gcc.c-torture/compile/pr106878.c: New test.
      645ef01a
    • tree-optimization/106934 - avoid BIT_FIELD_REF of bitfields · 05f5c42c
      Richard Biener authored
      The following avoids creating BIT_FIELD_REF of bitfields in
      update-address-taken.  The patch doesn't implement punning to
      a full precision integer type but leaves a comment to that
      effect.
      
      	PR tree-optimization/106934
      	* tree-ssa.cc (non_rewritable_mem_ref_base): Avoid BIT_FIELD_REFs
      	of bitfields.
      	(maybe_rewrite_mem_ref_base): Likewise.
      
      	* gfortran.dg/pr106934.f90: New testcase.
      05f5c42c
    • Check another epilog variable peeling case in vectorizable_nonlinear_induction. · 93b09bf3
      liuhongt authored
      In vectorizable_nonlinear_induction, r13-2503-gc13223b790bbc5 prevents
      variable peeling by only checking LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo).
      But when
      "!vect_use_loop_mask_for_alignment_p (loop_vinfo) &&
      LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0", the vectorizer will
      still do variable peeling for the epilog, and it hits the gcc_assert in
      vect_peel_nonlinear_iv_init.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/106905
      	* tree-vect-loop.cc (vectorizable_nonlinear_induction): Return
      	false when !vect_use_loop_mask_for_alignment_p (loop_vinfo) &&
      	LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/pr106905.c: New test.
      	* gcc.target/ia64/pr106905.c: New test.
      93b09bf3
    • testsuite: gluefile file need to be prefixed · 9d503515
      Torbjörn SVENSSON authored
      
      When the status wrapper is used, the gluefile needs to be prefixed with
      -Wl, in order for the test cases to have the dump files with the
      expected names.
      
      2022-09-14  Torbjörn SVENSSON  <torbjorn.svensson@foss.st.com>
      
      gcc/testsuite/
      	PR target/95720
      	* lib/g++.exp: Moved gluefile block to after flags have been
      	  prefixed for the target_compile call.
      	* lib/gcc.exp: Likewise.
      	* lib/wrapper.exp: Reset adjusted state flag.
      
      Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
      9d503515
    • Daily bump. · 1995a022
      GCC Administrator authored
      1995a022
  2. Sep 13, 2022
    • PR target/106877: Robustify reg-stack to malformed asm. · ff85f0af
      Roger Sayle authored
      This patch resolves PR target/106877, an ICE-on-invalid inline-asm
      regression.  An innocent upstream change means that the test case
      from PR inline-asm/84683 now hits a different assert in reg-stack.cc's
      move_for_stack_reg.  Fixed by duplicating Jakub's solution to PR 84683
      https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495193.html at
      this second (similar) gcc_assert.
      
      2022-09-13  Roger Sayle  <roger@nextmovesoftware.com>
      
      gcc/ChangeLog
      	PR target/106877
      	* reg-stack.cc (move_for_stack_reg): Check for any_malformed_asm
      	in gcc_assert.
      
      gcc/testsuite/ChangeLog
      	PR target/106877
      	* g++.dg/ext/pr106877.C: New test case.
      ff85f0af
    • libgomp: Appease some static analyzers [PR106906] · e11babbf
      Jakub Jelinek authored
      While icv_addr[1] = false; assignments where icv_addr has void *
      element type are correct and match how it is used (in those cases
      the void * pointer is then cast to bool and used that way), there is no
      reason not to add explicit (void *) casts there, which are there already
      for (void *) true.  And there is in fact no point in actually
      doing those stores at all, because we set that pointer to NULL a few
      lines earlier.  So this patch adds the explicit casts and then
      comments those out to show intent.
      
      2022-09-13  Jakub Jelinek  <jakub@redhat.com>
      
      	PR libgomp/106906
      	* env.c (get_icv_member_addr): Cast false to void * before assigning
      	it to icv_addr[1], and comment the whole assignment out.
      e11babbf
    • libstdc++: Implement ranges::slide_view from P2442R1 · 7d7e2149
      Patrick Palka authored
      This also implements the LWG 3711 and 3712 changes to slide_view.
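
For readers unfamiliar with the adaptor, here is an eager C++17 sketch of what views::slide(n) produces: every window of n consecutive elements, each starting one element after the previous one. It is an illustration of the semantics only, not the lazy libstdc++ implementation, and `slide_windows` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Eager stand-in for views::slide(n): return all overlapping windows of
// exactly n consecutive elements.  A range shorter than n has no windows.
std::vector<std::vector<int>> slide_windows(const std::vector<int> &v,
                                            std::size_t n) {
  assert(n > 0);
  std::vector<std::vector<int>> out;
  for (std::size_t i = 0; i + n <= v.size(); ++i) {
    auto first = v.begin() + static_cast<std::ptrdiff_t>(i);
    out.emplace_back(first, first + static_cast<std::ptrdiff_t>(n));
  }
  return out;
}
```

For the input {1, 2, 3, 4} and n = 2 the sketch returns {1, 2}, {2, 3}, {3, 4}, that is, size - n + 1 windows.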
      
      libstdc++-v3/ChangeLog:
      
      	* include/std/ranges (__detail::__slide_caches_nothing): Define.
      	(__detail::__slide_caches_last): Define.
      	(__detail::__slide_caches_first): Define.
      	(slide_view): Define.
      	(enable_borrowed_range<slide_view>): Define.
      	(slide_view::_Iterator): Define.
      	(slide_view::_Sentinel): Define.
      	(views::__detail::__can_slide_view): Define.
      	(views::_Slide, views::slide): Define.
      	* testsuite/std/ranges/adaptors/slide/1.cc: New test.
      7d7e2149
    • libstdc++: Implement ranges::chunk_view from P2442R1 · 5d84a441
      Patrick Palka authored
      This also implements the LWG 3707, 3710 and 3712 changes to chunk_view.
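
As a rough picture of the adaptor's semantics, the eager C++17 sketch below splits a sequence into consecutive pieces of n elements, with a possibly shorter last piece; the number of pieces is the ceiling of size / n. It is not the lazy libstdc++ implementation, and `div_ceil` and `chunk_pieces` are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: ceiling division, denom must be non-zero.
constexpr std::size_t div_ceil(std::size_t num, std::size_t denom) {
  return num / denom + (num % denom != 0 ? 1 : 0);
}

// Eager stand-in for views::chunk(n): consecutive, non-overlapping pieces
// of n elements; the last piece holds whatever remains.
std::vector<std::vector<int>> chunk_pieces(const std::vector<int> &v,
                                           std::size_t n) {
  assert(n > 0);
  std::vector<std::vector<int>> out;
  out.reserve(div_ceil(v.size(), n));
  for (std::size_t i = 0; i < v.size(); i += n) {
    std::size_t end = i + n < v.size() ? i + n : v.size();
    out.emplace_back(v.begin() + static_cast<std::ptrdiff_t>(i),
                     v.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return out;
}
```

For the input {1, 2, 3, 4, 5} and n = 2 the sketch returns {1, 2}, {3, 4}, {5}, which matches div_ceil(5, 2) == 3 pieces.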
      
      libstdc++-v3/ChangeLog:
      
      	* include/std/ranges (__detail::__div_ceil): Define.
      	(chunk_view): Define.
      	(chunk_view::_OuterIter): Define.
      	(chunk_view::_OuterIter::value_type): Define.
      	(chunk_view::_InnerIter): Define.
      	(chunk_view<_Vp>): Define partial specialization for forward
      	ranges.
      	(enable_borrowed_range<chunk_view>): Define.
      	(chunk_view<_Vp>::_Iterator): Define.
      	(views::__detail::__can_chunk_view): Define.
      	(views::_Chunk, views::chunk): Define.
      	* testsuite/std/ranges/adaptors/chunk/1.cc: New test.
      5d84a441
    • libstdc++: Implement LWG 3569 changes to join_view::_Iterator · 7aa80c82
      Patrick Palka authored
      libstdc++-v3/ChangeLog:
      
      	* include/std/ranges (join_view::_Iterator::_M_satisfy):
      	Adjust resetting _M_inner as per LWG 3569.
      	(join_view::_Iterator::_M_inner): Wrap in std::optional
      	as per LWG 3569.
      	(join_view::_Iterator::_Iterator): Relax constraints as
      	per LWG 3569.
      	(join_view::_Iterator::operator*): Adjust as per LWG 3569.
      	(join_view::_Iterator::operator->): Likewise.
      	(join_view::_Iterator::operator++): Likewise.
      	(join_view::_Iterator::operator--): Likewise.
      	(join_view::_Iterator::iter_move): Likewise.
      	(join_view::_Iterator::iter_swap): Likewise.
      	* testsuite/std/ranges/adaptors/join.cc (test14): New test.
      7aa80c82
    • libstdc++: Avoid -Wparentheses warning with debug iterators · edf6fe78
      Patrick Palka authored
      I noticed that compiling e.g. std/ranges/adaptors/join.cc with
      -D_GLIBCXX_DEBUG -Wsystem-headers -Wall gives the warning:
      
        gcc/libstdc++-v3/include/debug/safe_iterator.h:477:9: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
      
      libstdc++-v3/ChangeLog:
      
      	* include/debug/safe_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS):
      	Add parentheses to avoid -Wparentheses warning.
      edf6fe78
    • c++: remove single-parameter version of mark_used · 5e1031ff
      Patrick Palka authored
      gcc/cp/ChangeLog:
      
      	* cp-tree.h (mark_used): Remove single-parameter overload.  Add
      	default argument to the two-parameter overload.
      	* decl2.cc (mark_used): Likewise.
      5e1031ff
    • c++: two-parameter version of cxx_constant_value · fea6ae0e
      Patrick Palka authored
      Since some callers need the complain parameter but not the object
      parameter, let's introduce and use an overload of cxx_constant_value
      that omits the latter.
      
      gcc/cp/ChangeLog:
      
      	* cp-tree.h (cxx_constant_value): Define two-parameter version
      	that omits the object parameter.
      	* decl.cc (build_explicit_specifier): Omit NULL_TREE object
      	argument to cxx_constant_value.
      	* except.cc (build_noexcept_spec): Likewise.
      	* pt.cc (expand_integer_pack): Likewise.
      	(fold_targs_r): Likewise.
      	* semantics.cc (finish_if_stmt_cond): Likewise.
      fea6ae0e
    • c++: some missing-SFINAE fixes · 441a4880
      Patrick Palka authored
      It looks like we aren't respecting SFINAE for:
      
        * an invalid/non-constant conditional explicit-specifier
        * a non-constant conditional noexcept-specifier
        * a non-constant argument to __integer_pack
      
      This patch fixes these in the usual way, by passing complain and
      propagating error_mark_node appropriately.
      
      gcc/cp/ChangeLog:
      
      	* decl.cc (build_explicit_specifier): Pass complain to
      	cxx_constant_value.
      	* except.cc (build_noexcept_spec): Likewise.
      	* pt.cc (expand_integer_pack): Likewise.
      	(tsubst_function_decl): Propagate error_mark_node returned
      	from build_explicit_specifier.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp1z/noexcept-type26.C: New test.
      	* g++.dg/cpp2a/explicit19.C: New test.
      	* g++.dg/ext/integer-pack6.C: New test.
      441a4880
    • Max Filippov's avatar
      48e40d0b
    • rs6000: Fix the check of bif argument number [PR104482] · 38db4834
      Kewen Lin authored
      As PR104482 shows, there is a regression in how we handle a call
      that passes more arguments than the built-in function prototype
      has.  The new bif support only catches the case where there are
      fewer arguments than in the prototype and misses the case where
      there are more.  That is because it checks "n != expected_args",
      where n is updated in

         for (n = 0; !VOID_TYPE_P (TREE_VALUE (fnargs)) && n < nargs;
              fnargs = TREE_CHAIN (fnargs), n++)

      and the guard !VOID_TYPE_P (TREE_VALUE (fnargs)) keeps n less than
      or equal to expected_args, so the check can never fire for extra
      arguments.

      The fix is to compare nargs instead, and to move the check earlier
      so that we avoid useless further scanning when the counts
      mismatch.
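
The reasoning above can be reproduced with a small model. This is a sketch under stated assumptions rather than the rs6000 code: the prototype's parameter chain is reduced to a count (`expected_args`), the VOID_TYPE_P guard is modelled as `n < expected_args`, and all function names are hypothetical.

```cpp
#include <cassert>

// Model of the scanning loop: n advances while both a prototype parameter
// and an actual argument remain, so it ends at min(expected_args, nargs).
int scan_args(int expected_args, int nargs) {
  int n = 0;
  while (n < expected_args && n < nargs)
    ++n;
  return n;
}

// Old check: compares the loop counter, which can never exceed
// expected_args, so extra arguments go unnoticed.
bool old_check_reports_mismatch(int expected_args, int nargs) {
  return scan_args(expected_args, nargs) != expected_args;
}

// New check: compares the actual argument count directly, and can run
// before any scanning takes place.
bool new_check_reports_mismatch(int expected_args, int nargs) {
  return nargs != expected_args;
}
```

With a two-parameter prototype, the old check reports a mismatch for one argument but not for three, while the new check reports both.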
      
      	PR target/104482
      
      gcc/ChangeLog:
      
      	* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin): Fix
      	the equality check for argument number, and move this hunk ahead.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/powerpc/pr104482.c: New test.
      38db4834
    • rs6000: Handle unresolved overloaded builtin [PR105485] · 94504c9a
      Kewen.Lin authored
      PR105485 exposes that the new builtin function framework doesn't
      handle unresolved overloaded builtin functions well.  With the new
      builtin function support, we don't have builtin info for any
      overloaded rs6000_gen_builtins enum, since they are expected to be
      resolved to one specific instance.  So when
      rs6000_gimple_fold_builtin encounters an unresolved overloaded
      builtin, the access to the builtin info goes out of bounds and
      causes an ICE.

      We should not try to fold an unresolved overloaded builtin there;
      instead, as with the previous support, we should emit an error
      message such as "unresolved overload for builtin ..." during the
      expansion phase.
      
      	PR target/105485
      
      gcc/ChangeLog:
      
      	* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Add
      	the handling for unresolved overloaded builtin function.
      	(rs6000_expand_builtin): Likewise.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.target/powerpc/pr105485.C: New test.
      94504c9a
    • rs6000: Suggest unroll factor for loop vectorization · 0ee1548d
      Kewen Lin authored
      Commit r12-6679-g7ca1582ca60dc8 made the vectorizer accept an
      unroll factor to be applied to the vectorization factor when
      vectorizing the main loop; the factor is suggested by the target
      when doing costing.

      This patch introduces the function determine_suggested_unroll_factor
      for the rs6000 port, so that it can suggest the unroll factor
      for a given loop being vectorized.  Modelled on the aarch64 port
      and based on the analysis of SPEC2017 performance evaluation
      results, it mainly considers these aspects:
        1) unroll option and pragma which can disable unrolling for the
           given loop;
        2) simple hardware resource model with issued non memory access
           vector insn per cycle;
        3) aggressive heuristics when iteration count is unknown:
           - reduction case to break cross iteration dependency;
           - emulated gather load;
        4) estimated iteration count when iteration count is unknown;
      
      With this patch, SPEC2017 performance evaluation results on
      Power8/9/10 are listed below (speedup pct.):
      
        * Power10
          - O2: all are neutral (excluding some noise);
          - Ofast: 510.parest_r +6.67%, the others are neutral
                   (abbreviated as ... below);
          - Ofast + unroll: 510.parest_r +5.91%, ...
          - Ofast + LTO + PGO: 510.parest_r +3.00%, ...
          - Ofast + cheap vect cost: 510.parest_r +6.23%, ...
          - Ofast + very-cheap vect cost: all are neutral;
      
        * Power9
          - Ofast: 510.parest_r +8.73%, 538.imagick_r +11.18%
                   (likely noise), 500.perlbench_r +1.84%, ...
      
        * Power8
          - Ofast: 510.parest_r +5.43%, ...;
      
      This patch also introduces one documented parameter,
      rs6000-vect-unroll-limit=, similar to what aarch64 proposes.
      Evaluated on P8/P9/P10, the default value 4 is slightly
      better than other choices such as 2 and 8.

      It also parameterizes two other values as undocumented
      parameters for future tweaking.  One parameter is
      rs6000-vect-unroll-issue, which simply models the hardware
      resource for non memory access vector instructions to avoid
      excessive unrolling.  Initially I tried to use the value from
      the hook rs6000_issue_rate, but the evaluation showed it's
      bad, so I evaluated the values 2/4/6/8 on P8/P9/P10 at
      Ofast; the results showed the default value 4 is good enough
      on these different architectures.  For the record, choice 8
      could make 510.parest_r's gain smaller or disappear on
      P8/P9/P10; choice 6 could make 503.bwaves_r degrade by more
      than 1% on P8/P10; and choice 2 could make 538.imagick_r
      degrade by 3.8%.  The other parameter is
      rs6000-vect-unroll-reduc-threshold.  It's mainly inspired by
      510.parest_r and tuned for it; evaluating the values
      0/1/2/3 for the threshold showed that value 1 is the
      best choice.  For the record, choice 0 could make 525.x264_r
      degrade by 2% and 527.cam4_r degrade by 2.95% on P10, and
      548.exchange2_r degrade by 1.41% and 527.cam4_r degrade by
      2.54% on P8; choice 2 and bigger values could make
      510.parest_r's gain smaller.
      
      gcc/ChangeLog:
      
      	* config/rs6000/rs6000.cc (class rs6000_cost_data): Add new members
      	m_nstores, m_reduc_factor, m_gather_load and member function
      	determine_suggested_unroll_factor.
      	(rs6000_cost_data::update_target_cost_per_stmt): Update for m_nstores,
      	m_reduc_factor and m_gather_load.
      	(rs6000_cost_data::determine_suggested_unroll_factor): New function.
      	(rs6000_cost_data::finish_cost): Use determine_suggested_unroll_factor.
      	* config/rs6000/rs6000.opt (rs6000-vect-unroll-limit): New parameter.
      	(rs6000-vect-unroll-issue): Likewise.
      	(rs6000-vect-unroll-reduc-threshold): Likewise.
      	* doc/invoke.texi (rs6000-vect-unroll-limit): Document new parameter.
      0ee1548d
    • middle-end/106909 - CTRL altering flag after folding · 2c867232
      Richard Biener authored
      The following makes sure to clear the CTRL altering flag when
      folding emits a __builtin_unreachable in place of a virtual call
      which now might become a trap.
      
      	PR middle-end/106909
      	* gimple-fold.cc (gimple_fold_call): Clear the ctrl-altering
      	flag of an unreachable call.
      2c867232
    • tree-optimization/106913 - ICE with -da and -Wuninitialized · ad08894e
      Richard Biener authored
      The following avoids setting and not clearing an auto_bb_flag
      on EXIT_BLOCK which we don't verify for such stale flags but
      dump_bb_info still asserts on them.
      
      	PR tree-optimization/106913
      	* tree-ssa-uninit.cc (warn_uninitialized_vars): Do not set
      	ft_reachable on EXIT_BLOCK.
      ad08894e
    • aarch64: Vector move fixes for +nosimd · 721c0fb3
      Richard Sandiford authored
      This patch fixes various issues around the handling of vectors
      and (particularly) vector structures with +nosimd.  Previously,
      passing and returning structures would trigger an ICE, since:
      
      * we didn't allow the structure modes to be stored in FPRs
      
      * we didn't provide +nosimd move patterns
      
      * splitting the moves into word-sized pieces (the default
        strategy without move patterns) doesn't work because the
        registers are doubleword sized.
      
      The patch is a bit of a hodge-podge since a lot of the handling of
      moves, register costs, and register legitimacy is so interconnected.
      It didn't seem feasible to split things further.
      
      Some notes:
      
      * The patch recognises vector and tuple modes based on TARGET_FLOAT
        rather than TARGET_SIMD, and instead adds TARGET_SIMD to places
        that really do need the vector ISA.  This is necessary for the
        modes to be handled correctly in register arguments and returns.
      
      * The 64-bit (DREG) STP peephole required TARGET_SIMD but the
        LDP peephole didn't.  I think the LDP one is right, since
        DREG moves could involve GPRs as well as FPRs.
      
      * The patch keeps the existing choices of instructions for
        TARGET_SIMD, just in case they happen to be better than FMOV
        on some uarches.
      
      * Before the patch, +nosimd Q<->Q moves of 128-bit scalars went via
        a GPR, thanks to a secondary reload pattern.  This approach might
        not be ideal, but there's no reason that 128-bit vectors should
        behave differently from 128-bit scalars.  The patch therefore
        extends the current scalar approach to vectors.
      
      * Multi-vector LD1 and ST1 require TARGET_SIMD, so the TARGET_FLOAT
        structure moves need to use LDP/STP and LDR/STR combinations
        instead.  That's also what we do for big-endian even with
        TARGET_SIMD, so most of the code was already there.  The patterns
        for structures of 64-bit vectors are identical, but the patterns
        for structures of 128-bit vectors need to cope with the lack of
        128-bit Q<->Q moves.
      
        It isn't feasible to move multi-vector tuples via GPRs, so the
        patch moves them via memory instead.  This contaminates the port
        with its first secondary memory reload.
      
      gcc/
      
      	* config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Use
      	TARGET_FLOAT instead of TARGET_SIMD.
      	(aarch64_vectorize_related_mode): Restrict ADVSIMD handling to
      	TARGET_SIMD.
      	(aarch64_hard_regno_mode_ok): Don't allow tuples of 2 64-bit vectors
      	in GPRs.
      	(aarch64_classify_address): Treat little-endian structure moves
      	like big-endian for TARGET_FLOAT && !TARGET_SIMD.
      	(aarch64_secondary_memory_needed): New function.
      	(aarch64_secondary_reload): Handle 128-bit Advanced SIMD vectors
      	in the same way as TF, TI and TD.
      	(aarch64_rtx_mult_cost): Restrict ADVSIMD handling to TARGET_SIMD.
      	(aarch64_rtx_costs): Likewise.
      	(aarch64_register_move_cost): Treat a pair of 64-bit vectors
      	separately from a single 128-bit vector.  Handle the cost implied
      	by aarch64_secondary_memory_needed.
      	(aarch64_simd_valid_immediate): Restrict ADVSIMD handling to
      	TARGET_SIMD.
      	(aarch64_expand_vec_perm_const_1): Likewise.
      	(TARGET_SECONDARY_MEMORY_NEEDED): New macro.
      	* config/aarch64/iterators.md (VTX): New iterator.
      	* config/aarch64/aarch64.md (arches): Add fp_q as a synonym of simd.
      	(arch_enabled): Adjust accordingly.
      	(@aarch64_reload_mov<TX:mode>): Extend to...
      	(@aarch64_reload_mov<VTX:mode>): ...this.
      	* config/aarch64/aarch64-simd.md (mov<mode>): Require TARGET_FLOAT
      	rather than TARGET_SIMD.
      	(movmisalign<mode>): Likewise.
      	(load_pair<DREG:mode><DREG2:mode>): Likewise.
      	(vec_store_pair<DREG:mode><DREG2:mode>): Likewise.
      	(load_pair<VQ:mode><VQ2:mode>): Likewise.
      	(vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
      	(@aarch64_split_simd_mov<mode>): Likewise.
      	(aarch64_get_low<mode>): Likewise.
      	(aarch64_get_high<mode>): Likewise.
      	(aarch64_get_half<mode>): Likewise.  Canonicalize to a move for
      	lowpart extracts.
      	(*aarch64_simd_mov<VDMOV:mode>): Require TARGET_FLOAT rather than
      	TARGET_SIMD.  Use different w<-w and r<-w instructions for
      	!TARGET_SIMD.  Disable immediate moves for !TARGET_SIMD but
      	add an alternative specifically for w<-Z.
      	(*aarch64_simd_mov<VQMOV:mode>): Require TARGET_FLOAT rather than
      	TARGET_SIMD.  Likewise for the associated define_splits.  Disable
      	FPR moves and immediate moves for !TARGET_SIMD but add an alternative
      	specifically for w<-Z.
      	(aarch64_simd_mov_from_<mode>high): Require TARGET_FLOAT rather than
      	TARGET_SIMD.  Restrict the existing alternatives to TARGET_SIMD
      	but add a new r<-w one for !TARGET_SIMD.
      	(*aarch64_get_high<mode>): New pattern.
      	(load_pair_lanes<mode>): Require TARGET_FLOAT rather than TARGET_SIMD.
      	(store_pair_lanes<mode>): Likewise.
      	(*aarch64_combine_internal<mode>): Likewise.  Restrict existing
      	w<-w, w<-r and w<-m alternatives to TARGET_SIMD but add a new w<-r
      	alternative for !TARGET_SIMD.
      	(*aarch64_combine_internal_be<mode>): Likewise.
      	(aarch64_combinez<mode>): Require TARGET_FLOAT rather than TARGET_SIMD.
      	Remove bogus arch attribute.
      	(*aarch64_combinez_be<mode>): Likewise.
      	(@aarch64_vec_concat<mode>): Require TARGET_FLOAT rather than
      	TARGET_SIMD.
      	(aarch64_combine<mode>): Likewise.
      	(aarch64_rev_reglist<mode>): Likewise.
      	(mov<mode>): Likewise.
      	(*aarch64_be_mov<VSTRUCT_2D:mode>): Extend to TARGET_FLOAT &&
      	!TARGET_SIMD, regardless of endianness.  Extend associated
      	define_splits in the same way, both for this pattern and the
      	ones below.
      	(*aarch64_be_mov<VSTRUCT_2Q:mode>): Likewise.  Restrict w<-w
      	alternative to TARGET_SIMD.
      	(*aarch64_be_movoi): Likewise.
      	(*aarch64_be_movci): Likewise.
      	(*aarch64_be_movxi): Likewise.
      	(*aarch64_be_mov<VSTRUCT_3QD:mode>): Extend to TARGET_FLOAT
      	&& !TARGET_SIMD, regardless of endianness.  Restrict w<-w alternative
      	to TARGET_SIMD for tuples of 128-bit vectors.
      	(*aarch64_be_mov<VSTRUCT_4QD:mode>): Likewise.
      	* config/aarch64/aarch64-ldpstp.md: Remove TARGET_SIMD condition
      	from DREG STP peephole.  Change TARGET_SIMD to TARGET_FLOAT in
      	the VQ and VP_2E LDP and STP peepholes.
      
      gcc/testsuite/
      	* gcc.target/aarch64/ldp_stp_20.c: New test.
      	* gcc.target/aarch64/ldp_stp_21.c: Likewise.
      	* gcc.target/aarch64/ldp_stp_22.c: Likewise.
      	* gcc.target/aarch64/ldp_stp_23.c: Likewise.
      	* gcc.target/aarch64/ldp_stp_24.c: Likewise.
      	* gcc.target/aarch64/movv16qi_1.c (gpr_to_gpr): New function.
      	* gcc.target/aarch64/movv8qi_1.c (gpr_to_gpr): Likewise.
      	* gcc.target/aarch64/movv16qi_2.c: New test.
      	* gcc.target/aarch64/movv16qi_3.c: Likewise.
      	* gcc.target/aarch64/movv2di_1.c: Likewise.
      	* gcc.target/aarch64/movv2x16qi_1.c: Likewise.
      	* gcc.target/aarch64/movv2x8qi_1.c: Likewise.
      	* gcc.target/aarch64/movv3x16qi_1.c: Likewise.
      	* gcc.target/aarch64/movv3x8qi_1.c: Likewise.
      	* gcc.target/aarch64/movv4x16qi_1.c: Likewise.
      	* gcc.target/aarch64/movv4x8qi_1.c: Likewise.
      	* gcc.target/aarch64/movv8qi_2.c: Likewise.
      	* gcc.target/aarch64/movv8qi_3.c: Likewise.
      	* gcc.target/aarch64/vect_unary_2.c: Likewise.
      721c0fb3
    • Richard Sandiford's avatar
      aarch64: Disassociate ls64 from simd · 91061fd5
      Richard Sandiford authored
      The ls64-related move expanders and splits required TARGET_SIMD.
      That isn't necessary, since the 64-byte values are stored entirely
      in GPRs.  (The associated define_insn was already correct.)
      
      I wondered about moving the patterns to aarch64.md, but it wasn't
      clear-cut.
      
      gcc/
      	* config/aarch64/aarch64-simd.md (movv8di): Remove TARGET_SIMD
      	condition.  Likewise for the related define_split.  Tweak formatting.
      
      gcc/testsuite/
      	* gcc.target/aarch64/acle/ls64_asm_2.c: New test.
      91061fd5
    • Tobias Burnus's avatar
      libgomp.texi: move item from gcn to nvptx · eec36f27
      Tobias Burnus authored
      I misplaced one remark into 'gcn' instead of 'nvptx' in
      commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (gcn): Move misplaced -march=sm_30 remark to ...
      	(nvptx): ... here.
      eec36f27
    • GCC Administrator's avatar
      Daily bump. · b5f09bd7
      GCC Administrator authored
      b5f09bd7
  3. Sep 12, 2022
    • Patrick Palka's avatar
      c++: remove '_sfinae' suffix from functions · c17fa0f2
      Patrick Palka authored
      The functions
      
        abstract_virtuals_error
        cxx_constant_value
        get_target_expr
        instantiate_non_dependent_expr
        require_complete_type
      
      are each just a non-SFINAE-enabled wrapper for the corresponding
      SFINAE-enabled version that's suffixed by '_sfinae'.  But this suffix is
      at best redundant since a 'complain' parameter already broadly conveys
      that a function is SFINAE-enabled, and having two such versions of a
      function is less concise than just using a default argument for 'complain'
      (and arguably no less mistake prone).
      
      So this patch squashes the two versions of each of the above functions
      by adding a default 'complain' argument to the SFINAE-enabled version
      whose '_sfinae' suffix we then remove.
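
      The shape of this refactoring can be sketched in isolation.  The
      following is a minimal illustration only, not GCC's code: the names
      complain_flags, complain_none, complain_all, result and
      require_positive are hypothetical stand-ins for tsubst_flags_t and
      the functions listed above.

```cpp
// Hypothetical stand-in for GCC's tsubst_flags_t; not the real type.
enum complain_flags { complain_none, complain_all };

// Hypothetical outcome codes, used only to make the behavior observable.
enum result { ok, failed_quietly, failed_with_diagnostic };

// Before the change there would be two entry points:
//   result require_positive_sfinae (int value, complain_flags complain);
//   result require_positive (int value);  // forwards with complain_all
//
// After the change a single function remains.  Its 'complain' parameter
// has a default argument, so callers that want diagnostics can omit it.
constexpr result
require_positive (int value, complain_flags complain = complain_all)
{
  return value > 0 ? ok
         : complain == complain_none ? failed_quietly
         : failed_with_diagnostic;
}
```

      A caller in a SFINAE context would pass complain_none explicitly,
      while every other caller keeps the short spelling
      require_positive (value).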
      
      gcc/cp/ChangeLog:
      
      	* call.cc (build_conditional_expr): Adjust calls to
      	'_sfinae'-suffixed functions.
      	(build_temp): Likewise.
      	(convert_like_internal): Likewise.
      	(convert_arg_to_ellipsis): Likewise.
      	(build_over_call): Likewise.
      	(build_cxx_call): Likewise.
      	(build_new_method_call): Likewise.
      	* constexpr.cc (cxx_eval_outermost_constant_expr): Likewise.
      	(cxx_constant_value_sfinae): Rename to ...
      	(cxx_constant_value): ... this.  Document its default arguments.
      	(fold_non_dependent_expr): Adjust function comment.
      	* cp-tree.h (instantiate_non_dependent_expr_sfinae): Rename to ...
      	(instantiate_non_dependent_expr): ... this.  Give its 'complain'
      	parameter a default argument.
      	(get_target_expr_sfinae, get_target_expr): Likewise.
      	(require_complete_type_sfinae, require_complete_type): Likewise.
      	(abstract_virtuals_error_sfinae, abstract_virtuals_error):
      	Likewise.
      	(cxx_constant_value_sfinae, cxx_constant_value): Likewise.
      	* cvt.cc (build_up_reference): Adjust calls to '_sfinae'-suffixed
      	functions.
      	(ocp_convert): Likewise.
      	* decl.cc (build_explicit_specifier): Likewise.
      	* except.cc (build_noexcept_spec): Likewise.
      	* init.cc (build_new_1): Likewise.
      	* pt.cc (expand_integer_pack): Likewise.
      	(instantiate_non_dependent_expr_internal): Adjust function
      	comment.
      	(instantiate_non_dependent_expr_sfinae): Rename to ...
      	(instantiate_non_dependent_expr): ... this.  Document its
      	default argument.
      	(tsubst_init): Adjust calls to '_sfinae'-suffixed functions.
      	(fold_targs_r): Likewise.
      	* semantics.cc (finish_compound_literal): Likewise.
      	(finish_decltype_type): Likewise.
      	(cp_build_bit_cast): Likewise.
      	* tree.cc (build_cplus_new): Likewise.
      	(get_target_expr_sfinae): Rename to ...
      	(get_target_expr): ... this.  Document its default
      	argument.
      	* typeck.cc (require_complete_type_sfinae): Rename to ...
      	(require_complete_type): ... this.  Document its default
      	argument.
      	(cp_build_array_ref): Adjust calls to '_sfinae'-suffixed
      	functions.
      	(convert_arguments): Likewise.
      	(cp_build_binary_op): Likewise.
      	(build_static_cast_1): Likewise.
      	(cp_build_modify_expr): Likewise.
      	(convert_for_initialization): Likewise.
      	* typeck2.cc (abstract_virtuals_error_sfinae): Rename to ...
      	(abstract_virtuals_error): ... this.  Document its default
      	argument.
      	(build_functional_cast_1): Adjust calls to '_sfinae'-suffixed
      	functions.
      c17fa0f2
    • Patrick Palka's avatar
      c++: template-id arguments are evaluated [PR101906] · c3ba0eaa
      Patrick Palka authored
      Here we're neglecting to clear cp_unevaluated_operand when substituting
      into the arguments of the alias template-id 'skip<(T(), 0), T>' with T=A,
      which means cp_unevaluated_operand remains set during mark_used for
      A::A() and so we don't synthesize it.  Later constant evaluation for
      the substituted template argument '(A(), 0)' (from coerce_template_parms)
      fails with "'constexpr A::A()' used before its definition" since it was
      never synthesized.
      
      This doesn't happen with a class template because tsubst_aggr_type
      clears cp_unevaluated_operand during substitution thereof.  But since
      template arguments are generally manifestly constant-evaluated, which in
      turn are evaluated even in an unevaluated operand, we should be clearing
      cp_unevaluated_operand more broadly whenever substituting into any set
      of template arguments.  To that end this patch makes us clear it during
      tsubst_template_args.
      
      	PR c++/101906
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (tsubst_template_args): Set cp_evaluated here.
      	(tsubst_aggr_type): Not here.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/template/evaluated1.C: New test.
      	* g++.dg/template/evaluated1a.C: New test.
      	* g++.dg/template/evaluated1b.C: New test.
      	* g++.dg/template/evaluated1c.C: New test.
      c3ba0eaa
    • Jason Merrill's avatar
      c++: auto member function and auto variable [PR106893] · 03381bec
      Jason Merrill authored
      As with PR105623, we need to call mark_single_function sooner to
      resolve the type of a BASELINK.
      
      	PR c++/106893
      	PR c++/90451
      
      gcc/cp/ChangeLog:
      
      	* decl.cc (cp_finish_decl): Call mark_single_function.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp1y/auto-fn65.C: New test.
      03381bec
    • Jason Merrill's avatar
      c++: cast to array of unknown bound [PR93259] · 6bcca5f6
      Jason Merrill authored
      We already know to treat a variable of array-of-unknown-bound type as
      dependent; we should do the same for arr{}.
      
      	PR c++/93259
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (type_dependent_expression_p): Treat a compound
      	literal of array-of-unknown-bound type like a variable.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp0x/initlist-array17.C: New test.
      6bcca5f6
    • Takayuki 'January June' Suwa's avatar
      xtensa: Implement new target hook: TARGET_CONSTANT_OK_FOR_CPROP_P · 936efcac
      Takayuki 'January June' Suwa authored
      This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in
      order to exclude CONST_INTs that cannot fit into a MOVI machine instruction
      from cprop.
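
      The kind of range check such a hook performs can be sketched as
      follows.  This is an illustration only, not the code in xtensa.cc:
      the name fits_movi_immediate is hypothetical, and the bounds assume
      that MOVI takes a signed 12-bit immediate (-2048 to 2047).

```cpp
// Hypothetical helper, not part of GCC.  Returns true when VALUE can be
// encoded as the immediate of a single MOVI, assuming a signed 12-bit
// immediate field.
constexpr bool
fits_movi_immediate (long value)
{
  return value >= -2048 && value <= 2047;
}
```

      Under that assumption, a constant such as 2047 would remain a
      candidate for propagation, while 2048 would be left alone.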
      
      gcc/ChangeLog:
      
      	* config/xtensa/xtensa.cc (TARGET_CONSTANT_OK_FOR_CPROP_P):
      	New macro definition.
      	(xtensa_constant_ok_for_cprop_p):
      	Implement the hook as mentioned above.
      936efcac
    • Patrick Palka's avatar
      libstdc++: Add already-accepted <ranges> testcase [PR106320] · db19cfda
      Patrick Palka authored
      Although PR106320 affected only the 10 and 11 branches, and the testcase
      from there is already correctly accepted on trunk and the 12 branch, we
      still should add the testcase to trunk/12 too for inter-branch consistency.
      
      	PR libstdc++/106320
      
      libstdc++-v3/ChangeLog:
      
      	* testsuite/std/ranges/adaptors/join.cc (test13): New test.
      db19cfda
    • Jason Merrill's avatar
      c++: lambda capture of array with deduced bounds [PR106567] · 7c989a8e
      Jason Merrill authored
      We can't use the type of an array variable directly if we haven't deduced
      its length yet.
      
      	PR c++/106567
      
      gcc/cp/ChangeLog:
      
      	* lambda.cc (type_deducible_expression_p): Check
      	array_of_unknown_bound_p.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp0x/lambda/lambda-array4.C: New test.
      7c989a8e
    • Jonathan Wakely's avatar
      c++: Refer to internal linkage for -Wsubobject-linkage [PR86491] · 8ef5fa4c
      Jonathan Wakely authored
      Since C++11 relaxed the requirement for template arguments to have
      external linkage, it's possible to get -Wsubobject-linkage warnings
      without using any anonymous namespaces. This confuses users when they
      get diagnostics that refer to an anonymous namespace that doesn't exist
      in their code.
      
      This changes the diagnostic to say "has internal linkage" for C++11 and
      later, if the type isn't actually a member of the anonymous namespace.
      Making that distinction involved renaming the current decl_anon_ns_mem_p to
      something that better expresses its semantics.
      
      For C++98 template arguments declared with 'static' are ill-formed
      anyway, so the only way this warning can arise is via anonymous
      namespaces. That means the existing wording is accurate for C++98 and so
      we can keep it.
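
      A minimal sketch of the situation described in the first paragraph:
      a class with a field whose type depends on an entity with internal
      linkage, with no anonymous namespace involved.  The names counter,
      Holder and Widget are made up for illustration, and the sketch
      assumes C++11 or later.  The warning is only expected when such a
      class is defined in a header rather than in the main source file.

```cpp
// 'static' gives the variable internal linkage without using an
// anonymous namespace.
static constexpr int counter = 7;

// Since C++11 a template argument may refer to an object with internal
// linkage.
template <const int *P>
struct Holder
{
  static constexpr int get () { return *P; }
};

// Widget has a field whose type depends on 'counter', so the type of
// that field is tied to this translation unit.
struct Widget
{
  Holder<&counter> member;
};
```

      Here a diagnostic that says "has internal linkage" describes the
      code accurately, whereas one that mentions an anonymous namespace
      would refer to something the user never wrote.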
      
      	PR c++/86491
      
      gcc/cp/ChangeLog:
      
      	* decl2.cc (constrain_class_visibility): Adjust wording of
      	-Wsubobject-linkage for cases where anonymous
      	namespaces aren't used.
      	* tree.cc (decl_anon_ns_mem_p): Now only true for actual anonymous
      	namespace members, rename old semantics to...
      	(decl_internal_context_p): ...this.
      	* cp-tree.h, name-lookup.cc, pt.cc: Adjust.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/warn/anonymous-namespace-3.C: Use separate dg-warning
      	directives for C++98 and everything else.
      	* g++.dg/warn/Wsubobject-linkage-5.C: New test.
      8ef5fa4c
    • Joseph Myers's avatar
      stdatomic.h: Do not define ATOMIC_VAR_INIT for C2x · 2e7bc76d
      Joseph Myers authored
      The <stdatomic.h> macro ATOMIC_VAR_INIT, previously declared obsolete,
      is removed completely in C2x; disable it for C2x in GCC's
      implementation.  (Although ATOMIC_* are reserved names for this
      header, disabling the macro for C2x still seems appropriate.)
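
      The guard can be sketched as a header fragment.  This is a hedged
      illustration rather than the text of stdatomic.h: the macro
      DEMO_ATOMIC_VAR_INIT and the constant demo_macro_available are
      hypothetical names, chosen so the fragment does not clash with a
      real header.

```cpp
// Define the macro only when the language version is not newer than
// C17.  The condition is the negation of the one quoted in the
// ChangeLog entry.
#if !(defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L)
# define DEMO_ATOMIC_VAR_INIT(VALUE) (VALUE)
#endif

// Record the outcome so that it can be inspected.  A C++ translation
// unit normally leaves __STDC_VERSION__ undefined, so the macro is
// expected to be available there.
#ifdef DEMO_ATOMIC_VAR_INIT
constexpr bool demo_macro_available = true;
#else
constexpr bool demo_macro_available = false;
#endif
```

      Compiled as C with a C2x language version, the same fragment would
      leave DEMO_ATOMIC_VAR_INIT undefined.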
      
      Bootstrapped with no regressions for x86_64-pc-linux-gnu.
      
      gcc/
      	* ginclude/stdatomic.h [defined __STDC_VERSION__ &&
      	__STDC_VERSION__ > 201710L] (ATOMIC_VAR_INIT): Do not define.
      
      gcc/testsuite/
      	* gcc.dg/atomic/c2x-stdatomic-var-init-1.c: New test.
      2e7bc76d
    • Tobias Burnus's avatar
      nvptx/mkoffload.cc: Warn instead of error when reverse offload is not possible · 6b43f556
      Tobias Burnus authored
      Reverse offload requires at least -misa=sm_35; with this patch, a warning
      is shown instead of an error, still permitting reverse offload for all
      other configured device types. This is achieved by not calling
      GOMP_offload_register_ver (and by no longer generating pointless
      'static const char' variables once this is known).
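
      The decision described above can be modelled in a few lines.  This
      is a hypothetical sketch, not mkoffload's code: the enumeration
      action and the function decide are invented for illustration, and
      the sm version is passed as a plain integer.

```cpp
// Hypothetical model of the choice made for the nvptx image.
enum class action { register_image, warn_and_skip };

// Reverse offload needs at least sm_35.  When it is requested for an
// older ISA, warn and skip registration instead of failing outright.
constexpr action
decide (int sm_version, bool needs_reverse_offload)
{
  return (needs_reverse_offload && sm_version < 35)
         ? action::warn_and_skip
         : action::register_image;
}
```

      In this model decide (30, true) yields warn_and_skip, so the build
      continues and other configured device types are unaffected.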
      
      Using tool_name as progname adds "nvptx " and "gcn " to the
      "mkoffload: warning/error:" diagnostic.
      
      gcc/ChangeLog:
      
      	* config/nvptx/mkoffload.cc (process): Replace a fatal_error by
      	a warning + not enabling offloading if -misa=sm_30 prevents
      	reverse offload.
      	(main): Use tool_name as progname for diagnostic.
      	* config/gcn/mkoffload.cc (main): Likewise.
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (Offload-Target Specifics: nvptx): Document
      	that reverse offload requires >= -march=sm_35.
      	* testsuite/libgomp.c-c++-common/requires-4.c: Build for nvptx
      	with -misa=sm_35.
      	* testsuite/libgomp.c-c++-common/requires-5.c: Likewise.
      	* testsuite/libgomp.c-c++-common/requires-6.c: Likewise.
      	* testsuite/libgomp.c-c++-common/reverse-offload-1.c: Likewise.
      	* testsuite/libgomp.fortran/reverse-offload-1.f90: Likewise.
      	* testsuite/libgomp.c/reverse-offload-sm30.c: New test.
      6b43f556