  1. Aug 09, 2022
    • Use PTEST to perform AND in TImode STV of (A & B) != 0 on x86_64. · a56c1641
      Roger Sayle authored
      This x86_64 backend patch allows TImode STV to take advantage of the
      fact that the PTEST instruction performs an AND operation.  Previously
      PTEST was (mostly) used for comparison against zero, by using the same
      operands.  The benefits are demonstrated by the new test case:
      
      __int128 a,b;
      int foo()
      {
        return (a & b) != 0;
      }
      
      Currently with -O2 -msse4 we generate:
      
              movdqa  a(%rip), %xmm0
              pand    b(%rip), %xmm0
              xorl    %eax, %eax
              ptest   %xmm0, %xmm0
              setne   %al
              ret
      
      with this patch we now generate:
      
              movdqa  a(%rip), %xmm0
              xorl    %eax, %eax
              ptest   b(%rip), %xmm0
              setne   %al
              ret
      
      Technically, the magic happens using new define_insn_and_split patterns.
      Using two patterns allows this transformation to be performed independently
      of whether TImode STV is run before or after combine.  The one tricky
      case is that immediate constant operands of the AND behave slightly
      differently between TImode and V1TImode: all V1TImode immediate operands
      become loads, but for TImode only values that are not hilo_operands
      need to be loaded.  Hence the new *testti_doubleword accepts any
      general_operand, but internally during split calls force_reg whenever
      the second operand is not x86_64_hilo_general_operand.  This required
      (benefits from) some tweaks to TImode STV to support CONST_WIDE_INT in
      more places, using CONST_SCALAR_INT_P instead of just CONST_INT_P.
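
      As a rough illustration of why the separate pand is redundant, the
      following portable C sketch models the zero flag that PTEST computes.
      The u128_model type and all function names are hypothetical stand-ins
      for illustration; they are not GCC internals or intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word stand-in for a 128-bit value (not a GCC type).  */
typedef struct { uint64_t lo, hi; } u128_model;

/* Model of the flag PTEST computes: ZF is set when (a AND b) is all zeros.  */
int ptest_zf(u128_model a, u128_model b)
{
  return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}

/* Old sequence: explicit AND (pand), then PTEST of the result with itself.  */
int and_nonzero_old(u128_model a, u128_model b)
{
  u128_model t = { a.lo & b.lo, a.hi & b.hi };
  return !ptest_zf(t, t);
}

/* New sequence: PTEST performs the AND itself, so no temporary is needed.  */
int and_nonzero_new(u128_model a, u128_model b)
{
  return !ptest_zf(a, b);
}
```

      Under this model both functions agree for every input, e.g.
      assert(and_nonzero_old(a, b) == and_nonzero_new(a, b)).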
      
      2022-08-09  Roger Sayle  <roger@nextmovesoftware.com>
      
      gcc/ChangeLog
      	* config/i386/i386-features.cc (scalar_chain::convert_compare):
      	Create new pseudos only when/if needed.  Add support for TEST,
      	i.e. (COMPARE (AND x y) (const_int 0)), using UNSPEC_PTEST.
      	When broadcasting V2DImode and V4SImode use new pseudo register.
      	(timode_scalar_chain::convert_op): Do nothing if operand is
      	already V1TImode.  Avoid generating useless SUBREG conversions,
      	i.e. (SUBREG:V1TImode (REG:V1TImode) 0).  Handle CONST_WIDE_INT
      	in addition to CONST_INT by using CONST_SCALAR_INT_P.
      	(convertible_comparison_p): Use CONST_SCALAR_INT_P to match both
      	CONST_WIDE_INT and CONST_INT.  Recognize new *testti_doubleword
      	pattern as an STV candidate.
      	(timode_scalar_to_vector_candidate_p): Allow CONST_SCALAR_INT_P
      	operands in binary logic operations.
      
      	* config/i386/i386.cc (ix86_rtx_costs) <case UNSPEC>: Add costs
      	for UNSPEC_PTEST; a PTEST that performs an AND has the same cost
      	as regular PTEST, i.e. cost->sse_op.
      
      	* config/i386/i386.md (*testti_doubleword): New pre-reload
      	define_insn_and_split that recognizes comparison of TI mode AND
      	against zero.
      	* config/i386/sse.md (*ptest<mode>_and): New pre-reload
      	define_insn_and_split that recognizes UNSPEC_PTEST of identical
      	AND operands.
      
      gcc/testsuite/ChangeLog
      	* gcc.target/i386/sse4_1-stv-8.c: New test case.
    • middle-end: Optimize ((X >> C1) & C2) != C3 for more cases. · 6fc14f19
      Roger Sayle authored
      Following my middle-end patch for PR tree-optimization/94026, I'd promised
      Jeff Law that I'd clean up the dead-code in fold-const.cc now that these
      optimizations are handled in match.pd.  Alas, I discovered things aren't
      quite that simple, as the transformations I'd added avoided cases where
      C2 overlapped with the new bits introduced by the shift, but the original
      code handled any value of C2 provided that it had a single-bit set (under
      the condition that C3 was always zero).
      
      This patch upgrades the transformations supported by match.pd to cover
      any values of C2 and C3, provided that C1 is a valid bit shift constant,
      for all three shift types (logical right, arithmetic right and left).
      This then makes the code in fold-const.cc fully redundant, and adds
      support for some new (corner) cases not previously handled.  If the
      constant C1 is valid for the type's precision, the shift is now always
      eliminated (with C2 and C3 possibly updated to test the sign bit).
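
      A minimal sketch of the idea for the logical right shift case only,
      on a 32-bit unsigned value.  This is not the match.pd code; the
      function names are made up for illustration, and the arithmetic
      right shift and left shift cases are not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: ((x >> c1) & c2) == c3, with 0 <= c1 < 32.  */
int shifted_test(uint32_t x, unsigned c1, uint32_t c2, uint32_t c3)
{
  return ((x >> c1) & c2) == c3;
}

/* Shift-free form of the same test.  */
int unshifted_test(uint32_t x, unsigned c1, uint32_t c2, uint32_t c3)
{
  /* The top c1 bits of (x >> c1) are always zero, so only these mask
     bits can ever survive the AND.  */
  uint32_t live = c2 & (UINT32_MAX >> c1);

  /* If c3 needs a bit the masked value can never have, the test is
     always false.  */
  if (c3 & ~live)
    return 0;

  /* Otherwise move the mask and the constant to x's bit positions.  */
  return (x & (live << c1)) == (c3 << c1);
}
```

      The != form is simply the negation of the result.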
      
      Interestingly, the fold-const.cc code that I'm now deleting was originally
      added by me back in 2006 to resolve PR middle-end/21137.  I've confirmed
      that those testcase(s) remain resolved with this patch (and I'll close
      21137 in Bugzilla).  This patch also implements most (but not all) of the
      examples mentioned in PR tree-optimization/98954, for which I have some
      follow-up patches.
      
      2022-08-09  Roger Sayle  <roger@nextmovesoftware.com>
      	    Richard Biener  <rguenther@suse.de>
      
      gcc/ChangeLog
      	PR middle-end/21137
      	PR tree-optimization/98954
      	* fold-const.cc (fold_binary_loc): Remove optimizations to
      	optimize ((X >> C1) & C2) ==/!= 0.
      	* match.pd (cmp (bit_and (lshift @0 @1) @2) @3): Remove wi::ctz
      	check, and handle all values of INTEGER_CSTs @2 and @3.
      	(cmp (bit_and (rshift @0 @1) @2) @3): Likewise, remove wi::clz
      	checks, and handle all values of INTEGER_CSTs @2 and @3.
      
      gcc/testsuite/ChangeLog
      	PR middle-end/21137
      	PR tree-optimization/98954
      	* gcc.dg/fold-eqandshift-4.c: New test case.
    • libgccjit.h: Uncomment macro definition for testing gcc_jit_context_new_bitcast support · 9385cd9c
      Vibhav Pant authored
      
      The macro definition for LIBGCCJIT_HAVE_gcc_jit_context_new_bitcast
      was earlier located in the documentation comment for
      gcc_jit_context_new_bitcast, making it unavailable to code that
      consumed libgccjit.h. This commit moves the definition out of the
      comment, making it effective.
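
      The effect can be reproduced with a small stand-alone sketch.  The
      macro names below are hypothetical and are not the ones used in
      libgccjit.h.

```c
#include <assert.h>

/* A #define that sits inside a comment is never seen by the preprocessor:

   #define HYPOTHETICAL_HAVE_feature_in_comment
*/

/* Outside the comment the definition takes effect.  */
#define HYPOTHETICAL_HAVE_feature_outside

/* Returns 1 if the macro "defined" inside the comment is visible.  */
int have_feature_in_comment(void)
{
#ifdef HYPOTHETICAL_HAVE_feature_in_comment
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if the macro defined outside the comment is visible.  */
int have_feature_outside(void)
{
#ifdef HYPOTHETICAL_HAVE_feature_outside
  return 1;
#else
  return 0;
#endif
}
```

      Client code that guards a call with #ifdef therefore silently takes
      the fallback branch while the definition remains inside the comment.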
      
      gcc/jit/ChangeLog:
      	* libgccjit.h (LIBGCCJIT_HAVE_gcc_jit_context_new_bitcast): Move
      	definition out of comment.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
    • docs: add notes on which functions -fanalyzer has hardcoded knowledge of · 16877cc2
      David Malcolm authored
      
      gcc/ChangeLog:
      	* doc/invoke.texi (Static Analyzer Options): Add notes on which
      	functions the analyzer has hardcoded knowledge of.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
    • d: Fix undefined reference to pragma(inline) symbol (PR106563) · 04284176
      Iain Buclaw authored
      Functions that are declared `pragma(inline)' should be treated as if
      they are defined in every translation unit they are referenced from,
      regardless of visibility protection.  Ensure they always get
      DECL_ONE_ONLY linkage, and start emitting them into other modules that
      import them.
      
      	PR d/106563
      
      gcc/d/ChangeLog:
      
      	* decl.cc (DeclVisitor::visit (FuncDeclaration *)): Set semanticRun
      	before generating its symbol.
      	(function_defined_in_root_p): New function.
      	(function_needs_inline_definition_p): New function.
      	(maybe_build_decl_tree): New function.
      	(get_symbol_decl): Call maybe_build_decl_tree before returning symbol.
      	(start_function): Use function_defined_in_root_p instead of inline
      	test for locally defined symbols.
      	(set_linkage_for_decl): Check for inline functions before private or
      	protected symbols.
      
      gcc/testsuite/ChangeLog:
      
      	* gdc.dg/torture/torture.exp (srcdir): New proc.
      	* gdc.dg/torture/imports/pr106563math.d: New test.
      	* gdc.dg/torture/imports/pr106563regex.d: New test.
      	* gdc.dg/torture/imports/pr106563uni.d: New test.
      	* gdc.dg/torture/pr106563.d: New test.
    • amdgcn: Vector procedure call ABI · 4e191462
      Andrew Stubbs authored
      Adjust the (unofficial) procedure calling ABI such that vector arguments are
      passed in vector registers, not on the stack.  Scalar arguments continue to
      be passed in scalar registers, making a total of 12 argument registers.
      
      The return value is also moved to a vector register (even for scalars; it
      would be possible to retain the scalar location, using untyped_call, but
      there's no obvious advantage in doing so).
      
      After this change the ABI is as follows:
      
      s0-s13  : Reserved for kernel launch parameters.
      s14-s15 : Frame pointer.
      s16-s17 : Stack pointer.
      s18-s19 : Link register.
      s20-s21 : Exec Save.
      s22-s23 : CC Save.
      s24-s25 : Scalar arguments.          NO LONGER RETURN VALUE.
      s26-s29 : Additional scalar arguments (makes 6 total).
      s30-s31 : Static Chain.
      v0      : Prologue/epilogue scratch.
      v1      : Constant 0, 1, 2, 3, 4, ... 63.
      v2-v7   : Prologue/epilogue scratch.
      v8-v9   : Return value & vector arguments.              NEW.
      v10-v13 : Additional vector arguments (makes 6 total).  NEW.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn.cc (gcn_function_value): Allow vector return values.
      	(num_arg_regs): Allow vector arguments.
      	(gcn_function_arg): Likewise.
      	(gcn_function_arg_advance): Likewise.
      	(gcn_arg_partial_bytes): Likewise.
      	(gcn_return_in_memory): Likewise.
      	(gcn_expand_epilogue): Get return value from v8.
      	* config/gcn/gcn.h (RETURN_VALUE_REG): Set to v8.
      	(FIRST_PARM_REG): Use FIRST_SGPR_REG for clarity.
      	(FIRST_VPARM_REG): New.
      	(FUNCTION_ARG_REGNO_P): Allow vector parameters.
      	(struct gcn_args): Add vnum field.
      	(LIBCALL_VALUE): Allow vector return values.
      	* config/gcn/gcn.md (gcn_call_value): Add vector constraints.
      	(gcn_call_value_indirect): Likewise.
    • autopar TLC · 9aa08cd4
      Richard Biener authored
      The following removes all excessive update_ssa calls from OMP
      expansion, thereby rewriting the atomic load and store cases to
      GIMPLE code generation.  I don't think autopar ever exercises the
      atomics code though.
      
      There's not much test coverage overall so I've built SPEC 2k17
      with -floop-parallelize-all -ftree-parallelize-loops=2 with and
      without LTO (and otherwise -Ofast plus -march=haswell) without
      fallout.
      
      If there's any fallout it's not OK to update SSA form for
      each and every OMP stmt lowered.
      
      	* omp-expand.cc (expand_omp_atomic_load): Emit GIMPLE
      	directly.  Avoid update_ssa when in SSA form.
      	(expand_omp_atomic_store): Likewise.
      	(expand_omp_atomic_fetch_op): Avoid update_ssa when in SSA
      	form.
      	(expand_omp_atomic_pipeline): Likewise.
      	(expand_omp_atomic_mutex): Likewise.
      	* tree-parloops.cc (gen_parallel_loop): Use
      	TODO_update_ssa_no_phi after loop_version.
    • Remove --param max-fsm-thread-length · c64ef5cd
      Richard Biener authored
      This removes max-fsm-thread-length which is obsoleted by
      max-jump-thread-paths.
      
      	* doc/invoke.texi (max-fsm-thread-length): Remove.
      	* params.opt (max-fsm-thread-length): Likewise.
      	* tree-ssa-threadbackward.cc
      	(back_threader_profitability::profitable_path_p): Do not
      	check max-fsm-thread-length.
    • tree-optimization/106514 - add --param max-jump-thread-paths · 409978d5
      Richard Biener authored
      The following adds a limit for the exponential greedy search of
      the backwards jump threader.  The idea is to limit the search
      space in a way that the paths considered are the same if the search
      were in BFS order rather than DFS.  In particular it stops considering
      incoming edges into a block if the product of the in-degrees of
      blocks on the path exceeds the specified limit.
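
      The cutoff can be pictured with the following simplified sketch.  It
      is not the code in tree-ssa-threadbackward.cc; the function name and
      the flat array of in-degrees are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Walk a path of blocks and multiply their in-degrees.  Return how many
   blocks have their incoming edges followed before the running product
   exceeds LIMIT and the search stops.  */
size_t blocks_before_cutoff(const unsigned *in_degree, size_t n,
                            unsigned limit)
{
  unsigned long long product = 1;
  size_t i;

  for (i = 0; i < n; i++)
    {
      product *= in_degree[i];
      if (product > limit)
        break;   /* Do not consider incoming edges of this block.  */
    }
  return i;
}
```

      With every block having two predecessors, a limit of 64 lets the walk
      follow six blocks (2^6 == 64) and stops at the seventh.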
      
      When considering the low stmt copying limit of 7 (or 1 in the size
      optimize case) this means the degenerate case with maximum search
      space is a sequence of conditions with no actual code
      
        B1
         |\
         | empty
         |/
        B2
         |\
         ...
        Bn
         |\
      
      GIMPLE_CONDs are costed 2, an equivalent GIMPLE_SWITCH already 4, so
      we reach 7 already with 3 middle conditions (B1 and Bn do not count).
      The search space would be 2^4 == 16 to reach this.  FSM threading
      historically allowed for a thread length of 10 but is really looking
      for a single multiway branch threaded across the backedge.  I've
      chosen 64 as the default of the new parameter, which effectively
      limits the outdegree of the switch statement (the cases reaching the
      backedge) to that number (divided by 2 until I add some special
      pruning for FSM threads due to the loop header indegree).  The
      testcase ssa-dom-thread-7.c requires 56 at the moment (as said,
      some special FSM thread pruning of considered edges would bring
      it down to half of that), but we now get one more threading
      and quite some more in later threadfull.  This testcase seems to
      be difficult to check for expected transforms.
      
      The new testcases add the degenerate case we currently thread
      (without deciding whether that's a good idea ...) plus one with
      an appropriate limit that should prevent the threading.
      
      This obsoletes the mentioned --param max-fsm-thread-length but
      I am not removing it as part of this patch.  When the search
      space is limited the thread stmt size limit effectively provides
      max-fsm-thread-length.
      
      The param with its default does not help PR106514 enough to unleash
      path searching with the higher FSM stmt count limit.
      
      	PR tree-optimization/106514
      	* params.opt (max-jump-thread-paths): New.
      	* doc/invoke.texi (max-jump-thread-paths): Document.
      	* tree-ssa-threadbackward.cc (back_threader::find_paths_to_names):
      	Honor max-jump-thread-paths, take overall_path argument.
      	(back_threader::find_paths): Pass 1 as initial overall_path.
      
      	* gcc.dg/tree-ssa/ssa-thread-16.c: New testcase.
      	* gcc.dg/tree-ssa/ssa-thread-17.c: Likewise.
      	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust.
    • OpenMP: Fix folding with simd's linear clause [PR106492] · 8a16b9f9
      Tobias Burnus authored
      gcc/ChangeLog:
      
      	PR middle-end/106492
      	* omp-low.cc (lower_rec_input_clauses): Add missing folding
      	to data type of linear-clause list item.
      
      gcc/testsuite/ChangeLog:
      
      	PR middle-end/106492
      	* g++.dg/gomp/pr106492.C: New test.
    • Daily bump. · 5f17badb
      GCC Administrator authored
  2. Aug 08, 2022
    • Evaluate condition arguments with the correct type. · ef623bb5
      Andrew MacLeod authored
      Processing of a cond_expr requires that a range of the correct type for the
      operands of the cond_expr is passed in.
      
      	PR tree-optimization/106556
      	gcc/
      	* gimple-range-gori.cc (gori_compute::condexpr_adjust): Use the
      	  type of the cond_expr operands being evaluated.
      
      	gcc/testsuite/
      	* gfortran.dg/pr106556.f90: New.
    • preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes. · 053876cd
      Tom Honermann authored
      This patch corrects handling of UTF-8 character literals in preprocessing
      directives so that they are treated as unsigned types in char8_t enabled
      C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
      UTF-8 character literals were always treated as having the same type as
      ordinary character literals (signed or unsigned dependent on target or use
      of the -fsigned-char or -funsigned-char options).
      
      	PR preprocessor/106426
      
      gcc/c-family/ChangeLog:
      	* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
      	subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.
      
      gcc/testsuite/ChangeLog:
      	* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
      	* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.
      
      libcpp/ChangeLog:
      	* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
      	literals based on unsigned_utf8char.
      	* include/cpplib.h (cpp_options): Add unsigned_utf8char.
      	* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
    • C: Implement C2X N2653 char8_t and UTF-8 string literal changes · 703837b2
      Tom Honermann authored
      This patch implements the core language and compiler dependent library
      changes adopted for C2X via WG14 N2653.  The changes include:
      - Change of type for UTF-8 string literals from array of const char to
        array of const char8_t (unsigned char).
      - A new atomic_char8_t typedef.
      - A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
        __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.
      
      gcc/ChangeLog:
      
      	* ginclude/stdatomic.h (atomic_char8_t,
      	ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.
      
      gcc/c/ChangeLog:
      
      	* c-parser.cc (c_parser_string_literal): Use char8_t as the type
      	of CPP_UTF8STRING when char8_t support is enabled.
      	* c-typeck.cc (digest_init): Allow initialization of an array
      	of character type by a string literal with type array of
      	char8_t.
      
      gcc/c-family/ChangeLog:
      
      	* c-lex.cc (lex_string, lex_charconst): Use char8_t as the type
      	of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
      	enabled.
      	* c-opts.cc (c_common_post_options): Set flag_char8_t if
      	targeting C2x.
      
      gcc/testsuite/ChangeLog:
      	* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
      	* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
      	* gcc.dg/c11-utf8str-type.c: New test.
      	* gcc.dg/c17-utf8str-type.c: New test.
      	* gcc.dg/c2x-utf8str-type.c: New test.
      	* gcc.dg/c2x-utf8str.c: New test.
      	* gcc.dg/gnu2x-utf8str-type.c: New test.
      	* gcc.dg/gnu2x-utf8str.c: New test.
    • d: Fix ICE in add_stack_var, at cfgexpand.cc:476 · 4b0253b0
      Iain Buclaw authored
      The type that triggers the ICE never got completed by the semantic
      analysis pass.  Checking for size forces it to be done, or issue a
      compile-time error.
      
      	PR d/106555
      
      gcc/d/ChangeLog:
      
      	* d-target.cc (Target::isReturnOnStack): Check for return type size.
      
      gcc/testsuite/ChangeLog:
      
      	* gdc.dg/imports/pr106555.d: New test.
      	* gdc.dg/pr106555.d: New test.
    • libstdc++: [_GLIBCXX_DEBUG] Do not consider detached iterators as value-initialized · 01b1afdc
      François Dumont authored
      An attached iterator has its _M_version set to something != 0, the container version. This
      value shall be preserved when detaching it so that the iterator does not look like a
      value-initialized one.
      
      libstdc++-v3/ChangeLog:
      
      	* include/debug/formatter.h (__singular_value_init): New _Iterator_state enum entry.
      	(_Parameter<>(const _Safe_iterator<>&, const char*, _Is_iterator)): Check if iterator
      	parameter is value-initialized.
      	(_Parameter<>(const _Safe_local_iterator<>&, const char*, _Is_iterator)): Likewise.
      	* include/debug/safe_iterator.h (_Safe_iterator<>::_M_value_initialized()): New. Adapt
      	checks.
      	* include/debug/safe_local_iterator.h (_Safe_local_iterator<>::_M_value_initialized()): New.
      	Adapt checks.
      	* src/c++11/debug.cc (_Safe_iterator_base::_M_reset): Do not reset _M_version.
      	(print_field(PrintContext&, const _Parameter&, const char*)): Adapt state_names.
      	* testsuite/23_containers/deque/debug/iterator1_neg.cc: New test.
      	* testsuite/23_containers/deque/debug/iterator2_neg.cc: New test.
      	* testsuite/23_containers/forward_list/debug/iterator1_neg.cc: New test.
      	* testsuite/23_containers/forward_list/debug/iterator2_neg.cc: New test.
      	* testsuite/23_containers/forward_list/debug/iterator3_neg.cc: New test.
    • Fix middle-end/103645: empty struct store not removed when using compound literal · 21c7aab0
      Andrew Pinski authored
      For compound literals empty struct stores are not removed as they go down a
      different path of the gimplifier; trying to optimize the init constructor.
      This fixes the problem by not adding the gimple assignment at the end
      of gimplify_init_constructor if it was an empty type.
      
      Note this updates gcc.dg/pr87052.c where we had:
      const char d[0] = { };
      And was expecting a store to d but after this, there is no store
      as the decl's type is zero in size.
      
      OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
      
      gcc/ChangeLog:
      
      	PR middle-end/103645
      	* gimplify.cc (gimplify_init_constructor): Don't build/add
      	gimple assignment of an empty type.
      
      gcc/testsuite/ChangeLog:
      	* gcc.dg/pr87052.c: Update d var to expect nothing.
    • AArch32: Fix 128-bit sequential consistency atomic operations. · 5471f55f
      Tamar Christina authored
      Similar to AArch64 the Arm implementation of 128-bit atomics is broken.
      
      For 128-bit atomics we rely on pthread barriers to correctly guard the address
      in the pointer to get correct memory ordering.  However for 128-bit atomics the
      address under the lock is different from the original pointer.
      
      This means that one of the values under the atomic operation is not protected
      properly and so we fail when the user has requested sequential
      consistency as there's no barrier to enforce this requirement.
      
      As such users have resorted to adding an
      
      #ifdef GCC
      <emit barrier>
      #endif
      
      around the use of these atomics.
      
      This corrects the issue by issuing a barrier only when __ATOMIC_SEQ_CST was
      requested.  I have hand verified that the barriers are inserted
      for atomic seq cst.
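
      The shape of the fix can be sketched in portable C11 as follows.
      This is not the libatomic host-config.h code; the function name and
      its return value are assumptions added purely for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Issue a full fence only when sequential consistency was requested.
   Returns 1 if a fence was issued and 0 otherwise; the return value
   exists only so the behaviour can be observed in a test.  */
int seq_barrier_sketch(memory_order model)
{
  if (model == memory_order_seq_cst)
    {
      atomic_thread_fence(memory_order_seq_cst);
      return 1;
    }
  return 0;
}
```

      Weaker orderings such as memory_order_relaxed or memory_order_acquire
      skip the fence, so only seq cst users pay for the extra barrier.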
      
      libatomic/ChangeLog:
      
      	PR target/102218
      	* config/arm/host-config.h (pre_seq_barrier, post_seq_barrier,
      	pre_post_seq_barrier): Require barrier on __ATOMIC_SEQ_CST.
    • AArch64: Fix 128-bit sequential consistency atomic operations. · e6a8ae90
      Tamar Christina authored
      The AArch64 implementation of 128-bit atomics is broken.
      
      For 128-bit atomics we rely on pthread barriers to correctly guard the address
      in the pointer to get correct memory ordering.  However for 128-bit atomics the
      address under the lock is different from the original pointer.
      
      This means that one of the values under the atomic operation is not protected
      properly and so we fail when the user has requested sequential
      consistency as there's no barrier to enforce this requirement.
      
      As such users have resorted to adding an
      
      #ifdef GCC
      <emit barrier>
      #endif
      
      around the use of these atomics.
      
      This corrects the issue by issuing a barrier only when __ATOMIC_SEQ_CST was
      requested.  To remedy this performance hit I think we should revisit using a
      similar approach to out-line-atomics for the 128-bit atomics.
      
      Note that I believe I need the empty file due to the include_next chain but
      I am not entirely sure.  I have hand verified that the barriers are inserted
      for atomic seq cst.
      
      libatomic/ChangeLog:
      
      	PR target/102218
      	* config/aarch64/aarch64-config.h: New file.
      	* config/aarch64/host-config.h: New file.
    • lto/106540 - fix LTO tree input wrt dwarf2out_register_external_die · 2a1448f2
      Richard Biener authored
      I've revisited the earlier two workarounds for dwarf2out_register_external_die
      getting duplicate entries.  It turns out that r11-525-g03d90a20a1afcb
      added dref_queue pruning to lto_input_tree but decl reading uses that
      to stream in DECL_INITIAL even when in the middle of SCC streaming.
      When that SCC then gets thrown away we can end up with debug nodes
      registered which isn't supposed to happen.  The following adjusts
      the DECL_INITIAL streaming to go the in-SCC way, using lto_input_tree_1,
      since no SCCs are expected at this point, just refs.
      
      	PR lto/106540
      	PR lto/106334
      	* dwarf2out.cc (dwarf2out_register_external_die): Restore
      	original assert.
      	* lto-streamer-in.cc (lto_read_tree_1): Use lto_input_tree_1
      	to input DECL_INITIAL, to avoid committing drefs.
    • Move testcase gcc.dg/tree-ssa/pr93776.c to gcc.c-torture/compile/pr93776.c · 2633c8d8
      Andrew Pinski authored
      Since this testcase is not exactly SSA specific and it would
      be a good idea to compile this at more than just -O1, moving
      it to gcc.c-torture/compile would do that.
      
      Committed as obvious after a test on x86_64-linux-gnu.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/pr93776.c: Moved to...
      	* gcc.c-torture/compile/pr93776.c: ...here.
    • Daily bump. · 37e8e63d
      GCC Administrator authored
  3. Aug 07, 2022
    • [Committed] Add -mno-stv to new gcc.target/i386/cmpti2.c test case. · ef54eb74
      Roger Sayle authored
      Adding -march=cascadelake to the command line options of the new cmpti2.c
      testcase triggers TImode STV and produces vector code that doesn't match
      the scalar implementation that this test was intended to check.  Adding
      -mno-stv to the options fixes this.  Committed as obvious.
      
      2022-08-07  Roger Sayle  <roger@nextmovesoftware.com>
      
      gcc/testsuite/ChangeLog
      	* gcc.target/i386/cmpti2.c: Add -mno-stv to dg-options.
    • c++: Add support for __real__/__imag__ modifications in constant expressions [PR88174] · 19077677
      Jakub Jelinek authored
      We claim we support P0415R1 (constexpr complex), but e.g.
      #include <complex>
      
      constexpr bool
      foo ()
      {
        std::complex<double> a (1.0, 2.0);
        a += 3.0;
        a.real (6.0);
        return a.real () == 6.0 && a.imag () == 2.0;
      }
      
      static_assert (foo ());
      
      fails with
      test.C:12:20: error: non-constant condition for static assertion
         12 | static_assert (foo ());
            |                ~~~~^~
      test.C:12:20:   in ‘constexpr’ expansion of ‘foo()’
      test.C:8:10:   in ‘constexpr’ expansion of ‘a.std::complex<double>::real(6.0e+0)’
      test.C:12:20: error: modification of ‘__real__ a.std::complex<double>::_M_value’ is not a constant expression
      
      The problem is we don't handle REALPART_EXPR and IMAGPART_EXPR
      in cxx_eval_store_expression.
      The following patch attempts to support it (with a requirement
      that those are the outermost expressions; ARRAY_REF/COMPONENT_REF
      etc. are just not possible on the result of these, and BIT_FIELD_REF
      would be theoretically possible if trying to extract some bits
      from one part of a complex int, but I don't see how it could appear
      in the FE trees).
      
      For these references, the code handles value being COMPLEX_CST,
      COMPLEX_EXPR or CONSTRUCTOR_NO_CLEARING empty CONSTRUCTOR (what we use
      to represent uninitialized values for C++20 and later) and the
      code starts by rewriting it to COMPLEX_EXPR, so that we can freely
      adjust the individual parts and later on possibly optimize it back
      to COMPLEX_CST if both halves are constant.
      
      2022-08-07  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/88174
      	* constexpr.cc (cxx_eval_store_expression): Handle REALPART_EXPR
      	and IMAGPART_EXPR.  Change ctors from releasing_vec to
      	auto_vec<tree *>, adjust all uses.  For !preeval, update ctors
      	vector.
      
      	* g++.dg/cpp1y/constexpr-complex1.C: New test.
    • Allow any immediate constant in *cmp<dwi>_doubleword splitter on x86_64. · a46bca36
      Roger Sayle authored
      This patch tweaks i386.md's *cmp<dwi>_doubleword splitter's predicate to
      allow general_operand, not just x86_64_hilo_general_operand, to improve
      code generation.  As a general rule, i386.md's _doubleword splitters should
      be post-reload splitters that require integer immediate operands to be
      x86_64_hilo_int_operand, so that each part is a valid word mode immediate
      constant.  As an exception to this rule, doubleword patterns that must be
      split before reload, because they require additional scratch registers,
      can take advantage of this ability to create new pseudos, to accept
      any immediate constant, and call force_reg on the high and/or low parts
      if they are not suitable immediate operands in word mode.
      
      The benefit is shown in the new cmpti3.c test case below.
      
      __int128 x;
      int foo()
      {
          __int128 t = 0x1234567890abcdefLL;
          return x == t;
      }
      
      where GCC with -O2 currently generates:
      
              movabsq $1311768467294899695, %rax
              xorl    %edx, %edx
              xorq    x(%rip), %rax
              xorq    x+8(%rip), %rdx
              orq     %rdx, %rax
              sete    %al
              movzbl  %al, %eax
              ret
      
      but with this patch now generates:
      
              movabsq $1311768467294899695, %rax
              xorq    x(%rip), %rax
              orq     x+8(%rip), %rax
              sete    %al
              movzbl  %al, %eax
              ret
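
      In C terms, the two instruction sequences above correspond roughly to
      the sketch below.  The u128_model type and the function names are
      hypothetical; they only mirror the word-by-word operations visible in
      the assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word stand-in for a 128-bit value (not a GCC type).  */
typedef struct { uint64_t lo, hi; } u128_model;

/* General form: XOR each half with the matching half of the constant,
   OR the two results together and test for zero.  */
int eq_general(u128_model x, u128_model c)
{
  return ((x.lo ^ c.lo) | (x.hi ^ c.hi)) == 0;
}

/* When the high word of the constant is zero, the XOR with zero is a
   no-op, so the high half of x can be ORed in directly.  */
int eq_high_zero(u128_model x, uint64_t c_lo)
{
  return ((x.lo ^ c_lo) | x.hi) == 0;
}
```

      For a constant such as 0x1234567890abcdef, whose high word is zero,
      both functions return the same result while the second saves the
      xorl that zeroes a register and one xorq.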
      
      2022-08-07  Roger Sayle  <roger@nextmovesoftware.com>
      
      gcc/ChangeLog
      	* config/i386/i386.md (*cmp<dwi>_doubleword): Change predicate
      	from x86_64_hilo_general_operand to general_operand.  Call
      	force_reg on parts that are not x86_64_immediate_operand.
      
      gcc/testsuite/ChangeLog
      	* gcc.target/i386/cmpti1.c: New test case.
      	* gcc.target/i386/cmpti2.c: Likewise.
      	* gcc.target/i386/cmpti3.c: Likewise.
    • Daily bump. · 019a41a7
      GCC Administrator authored
  4. Aug 06, 2022
  5. Aug 05, 2022
    • New warning: -Wanalyzer-jump-through-null [PR105947] · e1a91681
      David Malcolm authored
      
      This patch adds a new warning to -fanalyzer for jumps through NULL
      function pointers.
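
      A minimal sketch of the kind of code such a warning is aimed at, assuming
      the file is compiled with -fanalyzer.  The function names are hypothetical
      and this is not the contents of the new test; the exact diagnostic text is
      not reproduced here.

```c
#include <stddef.h>

typedef int (*unary_fn)(int);

/* The pointer is known to be NULL at the call site, so the call jumps
   through NULL.  Shown only as the problematic pattern; never call it. */
int call_known_null(void)
{
    unary_fn fn = NULL;
    return fn(1);
}

/* Guarded variant: the NULL case is handled before the indirect call. */
int call_or_default(unary_fn fn, int arg, int fallback)
{
    if (fn == NULL)
        return fallback;
    return fn(arg);
}

/* Small callee used to exercise the guarded variant. */
int twice(int v)
{
    return 2 * v;
}
```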
      
      gcc/analyzer/ChangeLog:
      	PR analyzer/105947
      	* analyzer.opt (Wanalyzer-jump-through-null): New option.
      	* engine.cc (class jump_through_null): New.
      	(exploded_graph::process_node): Complain about jumps through NULL
      	function pointers.
      
      gcc/ChangeLog:
      	PR analyzer/105947
      	* doc/invoke.texi: Add -Wanalyzer-jump-through-null.
      
      gcc/testsuite/ChangeLog:
      	PR analyzer/105947
      	* gcc.dg/analyzer/function-ptr-5.c: New test.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
    • Roger Sayle's avatar
      middle-end: Allow backend to expand/split double word compare to 0/-1. · cc01a27d
      Roger Sayle authored
      This patch to the middle-end's RTL expansion reorders the code in
      emit_store_flag_1 so that the backend has more control over how best
      to expand/split double word equality/inequality comparisons against
      zero or minus one.  With the current implementation, the middle-end
      always decides to lower this idiom during RTL expansion using SUBREGs
      and word mode instructions, without ever consulting the backend's
      machine description.  Hence on x86_64, a TImode comparison against zero
      is always expanded as:
      
      (parallel [
        (set (reg:DI 91)
             (ior:DI (subreg:DI (reg:TI 88) 0)
                     (subreg:DI (reg:TI 88) 8)))
        (clobber (reg:CC 17 flags))])
      (set (reg:CCZ 17 flags)
           (compare:CCZ (reg:DI 91)
                        (const_int 0 [0])))
      
      This patch, which makes no changes to the code itself, simply reorders
      the clauses in emit_store_flag_1 so that the middle-end first attempts
      expansion using the target's doubleword mode cstore optab/expander,
      and only if this fails, falls back to lowering to word mode operations.
      On x86_64, this allows the expander to produce:
      
      (set (reg:CCZ 17 flags)
           (compare:CCZ (reg:TI 88)
                        (const_int 0 [0])))
      
      which is a candidate for scalar-to-vector transformations (and
      combine simplifications etc.).  On targets that don't define a cstore
      pattern for doubleword integer modes, there should be no change in
      behaviour.  For those that do, the current behaviour can be restored
      (if desired) by restricting the expander/insn to not apply when the
      comparison is EQ or NE, and operand[2] is either const0_rtx or
      constm1_rtx.
      
      This change just keeps RTL expansion more consistent (in philosophy).
      For other doubleword comparisons, such as with operators LT and GT,
      or with constants other than zero or -1, the wishes of the backend
      are respected, and only if the optab expansion fails are the default
      fall-back implementations using narrower integer mode operations
      (and conditional jumps) used.
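
      For readers less familiar with RTL, the word-mode fall-back for these two
      special cases can be sketched in C (an illustration, not code from
      expmed.cc).  The zero test ORs the two halves, as in the first RTL
      listing above; the -1 test is shown using AND, which is one natural
      word-mode formulation.  The function names are hypothetical and
      unsigned __int128 is a GCC extension assumed to be available.

```c
#include <stdint.h>

/* x == 0 iff no bit is set in either 64-bit half. */
int ti_is_zero_by_words(unsigned __int128 x)
{
    uint64_t lo = (uint64_t)x, hi = (uint64_t)(x >> 64);
    return (lo | hi) == 0;
}

/* x == -1 iff every bit is set in both 64-bit halves. */
int ti_is_minus_one_by_words(unsigned __int128 x)
{
    uint64_t lo = (uint64_t)x, hi = (uint64_t)(x >> 64);
    return (lo & hi) == UINT64_MAX;
}
```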
      
      2022-08-05  Roger Sayle  <roger@nextmovesoftware.com>
      
      gcc/ChangeLog
      	* expmed.cc (emit_store_flag_1): Move code to expand double word
      	equality and inequality against zero or -1, using word operations,
      	to after trying to use the backend's cstore<mode>4 optab/expander.
    • Jonathan Wakely's avatar
      libstdc++: Add feature test macro for <experimental/scope> · 58a644cf
      Jonathan Wakely authored
      libstdc++-v3/ChangeLog:
      
      	* include/experimental/scope (__cpp_lib_experimental_scope):
      	Define.
      	* testsuite/experimental/scopeguard/uniqueres.cc: Check macro.
    • Jonathan Wakely's avatar
      libstdc++: Implement <experimental/scope> from LFTSv3 · 29fc5075
      Jonathan Wakely authored
      libstdc++-v3/ChangeLog:
      
      	* include/Makefile.am: Add new header.
      	* include/Makefile.in: Regenerate.
      	* include/experimental/scope: New file.
      	* testsuite/experimental/scopeguard/uniqueres.cc: New test.
      	* testsuite/experimental/scopeguard/exit.cc: New test.
    • Tamar Christina's avatar
      middle-end: Guard value_replacement and store_elim from seeing diamonds. · 1878ab36
      Tamar Christina authored
      This excludes value_replacement and store_elim from diamonds as they don't
      handle the form properly.
      
      gcc/ChangeLog:
      
      	PR middle-end/106534
      	* tree-ssa-phiopt.cc (tree_ssa_phiopt_worker): Guard the
      	value_replacement and store_elim from diamonds.
    • Richard Biener's avatar
      backthreader dump fix · 6ca94826
      Richard Biener authored
      This fixes odd SUCCEEDED dumps from the backthreader registry that
      can happen even though register_jump_thread cancelled the thread
      as invalid.
      
      	* tree-ssa-threadbackward.cc (back_threader::maybe_register_path):
      	Check whether the registry register_path rejected the path.
      	(back_threader_registry::register_path): Return whether
      	register_jump_thread succeeded.
    • Aldy Hernandez's avatar
      Inline unsupported_range constructor. · 47964e76
      Aldy Hernandez authored
      An unsupported_range temporary is instantiated in every Value_Range
      for completeness' sake and should be mostly a NOP.  However, it's
      showing up in the callgrind stats, because it's not inline.  This
      fixes the oversight.
      
      	PR tree-optimization/106514
      
      gcc/ChangeLog:
      
      	* value-range.cc (unsupported_range::unsupported_range): Move...
      	* value-range.h (unsupported_range::unsupported_range): ...here.
      	(unsupported_range::set_undefined): New.
    • Richard Biener's avatar
      tree-optimization/106533 - loop distribution of inner loop of nest · 36bc2a8f
      Richard Biener authored
      Loop distribution currently gives up if the outer loop of a loop
      nest it analyzes contains a stmt with side-effects instead of
      continuing to analyze the innermost loop.  The following fixes that
      by continuing anyway.
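
      A sketch of the loop shape described, under the assumption that a
      volatile store counts as a statement with side effects.  The names are
      hypothetical and this is not the new testcase; whether the inner loop is
      actually turned into a library call depends on the compiler version and
      options.

```c
#define ROWS 4
#define COLS 64

int grid[ROWS][COLS];
volatile int rows_started;

/* The outer loop contains a side effect (the volatile store), while the
   inner loop is a plain zeroing loop that could be distributed by itself. */
void clear_grid(void)
{
    for (int i = 0; i < ROWS; i++) {
        rows_started = i + 1;
        for (int j = 0; j < COLS; j++)
            grid[i][j] = 0;
    }
}
```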
      
      	PR tree-optimization/106533
      	* tree-loop-distribution.cc (loop_distribution::execute): Continue
      	analyzing the inner loops when find_seed_stmts_for_distribution
      	fails.
      
      	* gcc.dg/tree-ssa/ldist-39.c: New testcase.
    • Haochen Gui's avatar
      rs6000: Correct return value of check_p9modulo_hw_available. · 4574dad4
      Haochen Gui authored
      Set the return value to 0 when modulo is supported, and to 1 when not supported.
      
      gcc/testsuite/
      	* lib/target-supports.exp (check_p9modulo_hw_available): Correct return
      	value.
    • Andrew Pinski's avatar
      [RSIC-V] Fix 32bit riscv with zbs extension enabled · ffe4f55a
      Andrew Pinski authored
      The problem here was a disconnect between splittable_const_int_operand
      predicate and the function riscv_build_integer_1 for 32bits with zbs enabled.
      The splittable_const_int_operand predicate had a check for TARGET_64BIT which
      was not needed so this patch removed it.
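
      For context, a "single bit" constant is taken here to mean a value with
      exactly one bit set (an assumption about the wording in the ChangeLog
      entry).  A minimal C check for that property on a 32-bit value, using a
      hypothetical helper name and not the code in predicates.md, might look
      like this:

```c
#include <stdint.h>

/* True when exactly one bit of v is set: v is nonzero and clearing its
   lowest set bit (v & (v - 1)) leaves nothing behind. */
int has_single_bit_set(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```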
      
      Committed as obvious after a build for risc32-elf configured with --with-arch=rv32imac_zba_zbb_zbc_zbs.
      
      Thanks,
      Andrew Pinski
      
      gcc/ChangeLog:
      
      	* config/riscv/predicates.md (splittable_const_int_operand):
      	Remove the check for TARGET_64BIT for single bit const values.
    • GCC Administrator's avatar
      Daily bump. · 4ad52740
      GCC Administrator authored
  6. Aug 04, 2022
    • Eugene Rozenfeld's avatar
      Add myself as AutoFDO maintainer · cd093ee4
      Eugene Rozenfeld authored
      ChangeLog:
      
      	* MAINTAINERS: Add myself as AutoFDO maintainer.
    • Jonathan Wakely's avatar
      libstdc++: Make std::string_view(Range&&) constructor explicit · 2678386d
      Jonathan Wakely authored
      The P2499R0 paper was recently approved for C++23.
      
      libstdc++-v3/ChangeLog:
      
      	* include/std/string_view (basic_string_view(Range&&)): Add
      	explicit as per P2499R0.
      	* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc:
      	Adjust implicit conversions. Check implicit conversions fail.
      	* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc:
      	Likewise.