  1. Dec 12, 2022
    • Fortran: improve checking of assumed-size array spec [PR102180] · cf5327b8
      Harald Anlauf authored
      gcc/fortran/ChangeLog:
      
      	PR fortran/102180
      	* array.cc (match_array_element_spec): Add check for bad
      	assumed-implied-spec.
      	(gfc_match_array_spec): Reorder logic so that the first bad array
      	element spec may trigger an error.
      
      gcc/testsuite/ChangeLog:
      
      	PR fortran/102180
      	* gfortran.dg/pr102180.f90: New test.
      cf5327b8
    • d: Fix undefined reference to nested lambda in template (PR108055) · 9fe7d3de
      Iain Buclaw authored
      Sometimes, nested lambdas of templated functions get no code generation
      due to them being marked as instantiated outside of all modules being
      compiled in the current compilation unit.  This despite enclosing
      template instances being marked as instantiated inside the current
      compilation unit.  To fix, all enclosing templates are now checked in
      `function_defined_in_root_p'.
      
      Because of this change, `function_needs_inline_definition_p' has also
      been fixed up to only check whether the regular function definition
      itself is to be emitted in the current compilation unit.
      
      	PR d/108055
      
      gcc/d/ChangeLog:
      
      	* decl.cc (function_defined_in_root_p): Check all enclosing template
      	instances for definition in a root module.
      	(function_needs_inline_definition_p): Replace call to
      	function_defined_in_root_p with test for outer module `isRoot'.
      
      gcc/testsuite/ChangeLog:
      
      	* gdc.dg/torture/imports/pr108055conv.d: New.
      	* gdc.dg/torture/imports/pr108055spec.d: New.
      	* gdc.dg/torture/imports/pr108055write.d: New.
      	* gdc.dg/torture/pr108055.d: New test.
      9fe7d3de
    • AArch64: Enable TARGET_CONST_ANCHOR · 2d7c73ee
      Wilco Dijkstra authored
      Enable TARGET_CONST_ANCHOR to allow complex constants to be created via
      immediate add/sub.  Use a 24-bit range as that enables a 3 or 4-instruction
      immediate to be replaced by 2 add/sub instructions.  Fix the costing of
      add/sub to support 24-bit and 12-bit shifted immediates.
      The generated code for the testcase is now the same or better than LLVM.
      It also results in a small codesize reduction on SPEC.
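
      The 24-bit range follows from the add/sub immediate encoding: a 12-bit
      value, optionally shifted left by 12.  The sketch below is plain C, not
      GCC code; the helper name split_anchor_offset is hypothetical and only
      illustrates how a distance of up to 24 bits from an existing "anchor"
      constant splits into two such immediates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: split the distance between an
   already-materialized anchor constant and a target constant into the two
   immediates an add/sub pair can encode (a 12-bit value shifted left by 12,
   plus a plain 12-bit value).  Returns false if the distance needs more
   than 24 bits.  */
static bool
split_anchor_offset (uint64_t anchor, uint64_t target,
                     uint32_t *hi12_shifted, uint32_t *lo12, bool *subtract)
{
  *subtract = target < anchor;
  uint64_t diff = *subtract ? anchor - target : target - anchor;
  if (diff > 0xffffffu)
    return false;
  *hi12_shifted = (uint32_t) (diff & 0xfff000u);  /* add/sub #imm, lsl #12 */
  *lo12 = (uint32_t) (diff & 0xfffu);             /* add/sub #imm */
  return true;
}
```

      Either immediate may be zero, in which case a single add/sub would be
      enough.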
      
      gcc/
      	* config/aarch64/aarch64.cc (aarch64_rtx_costs): Add correct costs
      	for 24-bit and 12-bit shifted immediate add/sub.
      	(TARGET_CONST_ANCHOR): Define.
      	* config/aarch64/predicates.md (aarch64_pluslong_immediate):
      	Fix range check.
      
      gcc/testsuite/
      	* gcc.target/aarch64/movk_3.c: New test.
      2d7c73ee
    • middle-end: simplify complex if expressions where comparisons are inverse of one another. · 4d9db4bd
      Tamar Christina authored
      This optimizes the following sequence
      
        ((a < b) & c) | ((a >= b) & d)
      
      into
      
        (a < b ? c : d) & 1
      
      for scalars; for vectors we can omit the & 1.
      
      Also recognizes
      
        (-(a < b) & c) | (-(a >= b) & d)
      
      into
      
        a < b ? c : d
      
      This changes the code generation from
      
      zoo2:
      	cmp     w0, w1
      	cset    w0, lt
      	cset    w1, ge
      	and     w0, w0, w2
      	and     w1, w1, w3
      	orr     w0, w0, w1
      	ret
      
      into
      
      	cmp	w0, w1
      	csel	w0, w2, w3, lt
      	and	w0, w0, 1
      	ret
      
      and significantly reduces the number of selects we have to do in the vector
      code.
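
      The scalar identities can be checked in plain C.  This is only an
      illustration of the arithmetic, not the match.pd rule itself; the
      function names are made up for the example.

```c
#include <stdint.h>

/* Original form: two comparison results, two ANDs and an OR.  */
static int32_t
select_before (int32_t a, int32_t b, int32_t c, int32_t d)
{
  return ((a < b) & c) | ((a >= b) & d);
}

/* Simplified form: one conditional select, masked to the low bit.  */
static int32_t
select_after (int32_t a, int32_t b, int32_t c, int32_t d)
{
  return (a < b ? c : d) & 1;
}

/* Negated form: -(cond) is all-ones or zero, so each AND keeps either the
   whole operand or nothing.  */
static int32_t
mask_before (int32_t a, int32_t b, int32_t c, int32_t d)
{
  return (-(a < b) & c) | (-(a >= b) & d);
}

/* Simplified form: a plain conditional select, no masking needed.  */
static int32_t
mask_after (int32_t a, int32_t b, int32_t c, int32_t d)
{
  return a < b ? c : d;
}
```

      Because exactly one of (a < b) and (a >= b) is true, one side of the OR
      is always zero, which is what makes the select equivalent.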
      
      gcc/ChangeLog:
      
      	* match.pd: Add new rule.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/if-compare_1.c: New test.
      	* gcc.target/aarch64/if-compare_2.c: New test.
      4d9db4bd
    • AArch64: Fix vector re-interpretation between partial SIMD modes · 594264e9
      Tamar Christina authored
      While writing a patch series I started getting incorrect codegen out from
      VEC_PERM on partial struct types.
      
      It turns out that this was happening because the TARGET_CAN_CHANGE_MODE_CLASS
      implementation has a slight bug in it.  The hook only checked for SIMD to
      Partial but never Partial to SIMD.  This resulted in incorrect subregs being
      generated from the fallback code in VEC_PERM_EXPR expansions.
      
      I have unfortunately not been able to trigger it using a standalone testcase as
      the mid-end optimizes away the permute every time I try to describe a permute
      that would result in the bug.
      
      The patch now rejects any conversion of partial SIMD struct types, unless they
      are both partial structures of the same number of registers or one is a SIMD
      type whose size is less than 8 bytes.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.cc (aarch64_can_change_mode_class): Restrict
      	conversions between partial struct types properly.
      594264e9
    • AArch64: Support new tbranch optab. · 17ae956c
      Tamar Christina authored
      This implements the new tbranch optab for AArch64.
      
      We cannot emit one big RTL for the final instruction immediately.
      The reason that all comparisons in the AArch64 backend expand to separate CC
      compares, and separate testing of the operands is for ifcvt.
      
      The separate CC compare is needed so ifcvt can produce csel, cset etc from the
      compares.  Unlike say combine, ifcvt can not do recog on a parallel with a
      clobber.  Should we emit the instruction directly then ifcvt will not be able
      to say, make a csel, because we have no patterns which handle zero_extract and
      compare. (unlike combine ifcvt cannot transform the extract into an AND).
      
      While you could provide various patterns for this (and I did try) you end up
      with broken patterns because you can't add the clobber to the CC register.  If
      you do, ifcvt recog fails.
      
      i.e.
      
      int
      f1 (int x)
      {
        if (x & 1)
          return 1;
        return x;
      }
      
      We lose csel here.
      
      Secondly the reason the compare with an explicit CC mode is needed is so that
      ifcvt can transform the operation into a version that doesn't require the flags
      to be set.  But it only does so if it knows the explicit usage of the CC reg.
      
      For instance
      
      int
      foo (int a, int b)
      {
        return ((a & (1 << 25)) ? 5 : 4);
      }
      
      Doesn't require a comparison, the optimal form is:
      
      foo(int, int):
              ubfx    x0, x0, 25, 1
              add     w0, w0, 4
              ret
      
      and no compare is actually needed.  If you represent the instruction using an
      ANDS instead of a zero_extract then you get close, but you end up with an ands
      followed by an add, which is a slower operation.
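
      The equivalence behind the ubfx/add sequence can be written out in C.
      This is a sketch of the arithmetic only; foo_select and foo_extract are
      illustrative names, not part of the patch or its testcase.

```c
/* The form from the example above: test bit 25, then select 5 or 4.  */
static int
foo_select (int a)
{
  return (a & (1 << 25)) ? 5 : 4;
}

/* The same value without any comparison: extract bit 25 as 0 or 1 (what
   ubfx does) and add 4.  */
static int
foo_extract (int a)
{
  return (int) (((unsigned int) a >> 25) & 1u) + 4;
}
```

      Both functions return 5 when bit 25 is set and 4 otherwise, so no flags
      need to be set to compute the result.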
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64.md (*tb<optab><mode>1): Rename to...
      	(*tb<optab><ALLI:mode><GPI:mode>1): ... this.
      	(tbranch_<code><mode>4): New.
      	* config/aarch64/iterators.md(ZEROM, zerom): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/tbz_1.c: New test.
      17ae956c
    • middle-end: Add new tbranch optab to add support for bit-test-and-branch operations · dc582d2e
      Tamar Christina authored
      This adds a new test-and-branch optab that can be used to do a conditional test
      of a bit and branch.   This is similar to the cbranch optab but instead can
      test any arbitrary bit inside the register.
      
      This patch recognizes boolean comparisons and single bit mask tests.
      
      gcc/ChangeLog:
      
      	* dojump.cc (do_jump): Pass along value.
      	(do_jump_by_parts_greater_rtx): Likewise.
      	(do_jump_by_parts_zero_rtx): Likewise.
      	(do_jump_by_parts_equality_rtx): Likewise.
      	(do_compare_rtx_and_jump): Likewise.
      	(do_compare_and_jump): Likewise.
      	* dojump.h (do_compare_rtx_and_jump): New.
      	* optabs.cc (emit_cmp_and_jump_insn_1): Refactor to take optab to check.
      	(validate_test_and_branch): New.
      	(emit_cmp_and_jump_insns): Optionally take a value, and when value is
      	supplied then check if it's suitable for tbranch.
      	* optabs.def (tbranch_eq$a4, tbranch_ne$a4): New.
      	* doc/md.texi (tbranch_@var{op}@var{mode}4): Document it.
      	* optabs.h (emit_cmp_and_jump_insns): New.
      	* tree.h (tree_zero_one_valued_p): New.
      dc582d2e
    • aarch64: Make existing V2HF be usable. · 2cba118e
      Tamar Christina authored
      The backend has an existing V2HFmode that is used by pairwise operations.
      This mode was however never made fully functional.  Amongst other things it was
      never declared as a vector type which made it unusable from the mid-end.
      
      It's also lacking an implementation for load/stores so reload ICEs if this mode
      is ever used.  This finishes the implementation by providing the above.
      
      Note that I have created a new iterator VHSDF_P instead of extending VHSDF
      because the previous iterator is used in far more things than just load/stores.
      
      It's also used for instance in intrinsics and extending this would force me to
      provide support for mangling the type while we never expose it through
      intrinsics.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): New.
      	(mov<mode>, movmisalign<mode>, aarch64_dup_lane<mode>,
      	aarch64_store_lane0<mode>, aarch64_simd_vec_set<mode>,
      	@aarch64_simd_vec_copy_lane<mode>, vec_set<mode>,
      	reduc_<optab>_scal_<mode>, reduc_<fmaxmin>_scal_<mode>,
      	aarch64_reduc_<optab>_internal<mode>, aarch64_get_lane<mode>,
      	vec_init<mode><Vel>, vec_extract<mode><Vel>): Support V2HF.
      	(aarch64_simd_dupv2hf): New.
      	* config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
      	Add E_V2HFmode.
      	* config/aarch64/iterators.md (VHSDF_P): New.
      	(V2F, VMOVE, nunits, Vtype, Vmtype, Vetype, stype, VEL,
      	Vel, q, vp): Add V2HF.
      	* config/arm/types.md (neon_fp_reduc_add_h): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/sve/slp_1.c: Update testcase.
      2cba118e
    • libstdc++: Add a test checking for chrono::duration overflows · dc94eaab
      Jonathan Wakely authored
      This test fails if chrono::days::rep or chrono::years::rep is a 32-bit
      type, because a large days or years value silently overflows a 32-bit
      integer when converted to seconds. It would be conforming to implement
      chrono::days as chrono::duration<int32_t, ratio<86400>>, but would make
      this overflow case more likely. Similarly for chrono::years,
      chrono::months and chrono::weeks. This test is here to remind us not to
      make that change lightly.
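
      The size of the problem can be shown with plain integer arithmetic.
      The sketch below is C, not std::chrono, and does not model the
      library's conversion rules; it only shows where a 32-bit count of
      seconds runs out.  The helper names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECONDS_PER_DAY 86400

/* True if a day count can be expressed as a 32-bit count of seconds.  */
static bool
days_fit_in_seconds_i32 (int32_t days)
{
  return days <= INT32_MAX / SECONDS_PER_DAY
         && days >= INT32_MIN / SECONDS_PER_DAY;
}

/* The same conversion done in 64 bits, which cannot overflow for any
   32-bit day count.  */
static int64_t
days_to_seconds_i64 (int32_t days)
{
  return (int64_t) days * SECONDS_PER_DAY;
}
```

      With these definitions, 24855 days is the largest positive count that
      still fits in 32-bit seconds, which is only about 68 years.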
      
      libstdc++-v3/ChangeLog:
      
      	* testsuite/20_util/duration/arithmetic/overflow_c++20.cc: New
      	test.
      dc94eaab
    • libstdc++: Fix constraint on std::basic_format_string [PR108024] · 6c0f9584
      Jonathan Wakely authored
      Also remove some redundant std::move calls for return statements.
      
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/108024
      	* include/std/format (basic_format_string): Fix constraint.
      	* testsuite/std/format/format_string.cc: New test.
      6c0f9584
    • libstdc++: Change names that clash with Win32 or Clang · cb363fd9
      Jonathan Wakely authored
      Clang now defines an __is_unsigned built-in, and Windows defines an
      _Out_ macro. Replace uses of those as identifiers.
      
      There might also be a problem with __is_signed, which we use in several
      places.
      
      libstdc++-v3/ChangeLog:
      
      	* include/std/chrono (hh_mm_ss): Rename __is_unsigned member to
      	_S_is_unsigned.
      	* include/std/format (basic_format_context): Rename _Out_
      	template parameter to _Out2.
      	* testsuite/17_intro/names.cc: Add Windows SAL annotation
      	macros.
      cb363fd9
    • libstdc++: Define atomic lock-free type aliases for C++20 [PR98034] · 320ac807
      Jonathan Wakely authored
      libstdc++-v3/ChangeLog:
      
      	PR libstdc++/98034
      	* include/std/atomic (__cpp_lib_atomic_lock_free_type_aliases):
      	Define macro.
      	(atomic_signed_lock_free, atomic_unsigned_lock_free): Define
      	aliases.
      	* include/std/version (__cpp_lib_atomic_lock_free_type_aliases):
      	Define macro.
      	* testsuite/29_atomics/atomic/lock_free_aliases.cc: New test.
      320ac807
    • libstdc++: Make operator<< for stacktraces less templated (LWG 3515) · 2327d933
      Jonathan Wakely authored
      This change was approved for C++23 last month.
      
      libstdc++-v3/ChangeLog:
      
      	* include/std/stacktrace (operator<<): Only output to narrow
      	ostreams (LWG 3515).
      	* testsuite/19_diagnostics/stacktrace/synopsis.cc:
      2327d933
    • mklog: do not parse binary file for PR entry · 14d0f82c
      Martin Liska authored
      contrib/ChangeLog:
      
      	* mklog.py: Do not search PR entry in a file that is binary.
      14d0f82c
    • aarch64: Add __ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI ACLE defines · 688f4eb2
      Kyrylo Tkachov authored
      Recent ACLE additions specified the __ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI macros [1] that the compiler
      should define when the pointer authentication and BTI instructions are available (and don't act as NOPs).
      We've received requests to enable them in GCC for aarch64, similar to clang [2].
      It's a fairly simple patch and should be non-intrusive at this stage.
      Pointer authentication has its own "pauth" feature flag, whereas BTI depends on an architecture level
      of Armv8.5-a or later.
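
      User code would typically test these macros with the preprocessor.  A
      minimal sketch, assuming nothing beyond the two macro names given
      above; the function names are illustrative:

```c
/* Returns 1 if the compiler reports that pointer authentication
   instructions are available for the current target options, else 0.  */
static int
have_pauth (void)
{
#ifdef __ARM_FEATURE_PAUTH
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if the compiler reports that BTI instructions are available
   for the current target options, else 0.  */
static int
have_bti (void)
{
#ifdef __ARM_FEATURE_BTI
  return 1;
#else
  return 0;
#endif
}
```

      On targets or option sets where the macros are not defined, both
      functions simply return 0.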
      
      Bootstrapped and tested on aarch64-none-linux-gnu.
      
      [1] https://github.com/ARM-software/acle/blob/main/main/acle.md#pointer-authentication
      [2] https://reviews.llvm.org/rG7d40baa82b1f272f68de63f3c4f68d970bdcd6ed
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define
      	__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI when appropriate.
      	* config/aarch64/aarch64.h (TARGET_BTI): Define.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/acle/bti_def.c: New test.
      	* gcc.target/aarch64/acle/pauth_def.c: New test.
      688f4eb2
    • Revert parts of ADDR_EXPR/CONSTRUCTOR treatment change in match.pd · 49bf49bb
      Richard Biener authored
      This reverts the part that substitutes from the definition of an
      SSA name to the capture, thus ADDR_EXPR@0 eventually yielding
      &y_1->a[i_2] instead of _3.  That's because I didn't think of
      how to deal with substituting @0 in the result pattern.  So
      the following re-instantiates the SSA def CONSTRUCTOR handling
      and in the ADDR_EXPR helpers used by match.pd handles SSA names
      defined to ADDR_EXPRs transparently.
      
      	* genmatch.cc (dt_simplify::gen): Revert last change.
      	* match.pd: Revert simplification of CONSTRUCTOR leaf handling.
      	(&x cmp SSA_NAME): Handle ADDR_EXPR in SSA defs.
      	* fold-const.cc (split_address_to_core_and_offset): Handle
      	ADDR_EXPRs in SSA defs.
      	(address_compare): Likewise.
      49bf49bb
    • tree-optimization/89317 - another pattern for &p->x != p + 4 · 2dc5d6b1
      Richard Biener authored
      As seen in the original testcase for PR89317 we are missing
      comparison simplification patterns for &p->x != p + 4.  Fixed
      by making an existing one apply.  To make the pattern apply
      during CCP we need to simplify ccp_fold to not use GENERIC
      folding of conditions but also use GIMPLE folding.
      
      	PR tree-optimization/89317
      	* tree-ssa-ccp.cc (ccp_fold): Handle GIMPLE_COND via
      	gimple_fold_stmt_to_constant_1.
      	* match.pd (&a != &a + c): Apply to pointer_plus with non-ADDR_EXPR
      	base as well.
      
      	* gcc.dg/tree-ssa/pr89317.c: Amend.
      2dc5d6b1
    • Daily bump. · 324e9953
      GCC Administrator authored
      324e9953
  2. Dec 11, 2022
    • Fortran: fix ICE on bad use of statement function [PR107995] · 8f72249f
      Steve Kargl authored
      gcc/fortran/ChangeLog:
      
      	PR fortran/107995
      	* interface.cc (gfc_check_dummy_characteristics): Reject statement
      	function dummy arguments.
      
      gcc/testsuite/ChangeLog:
      
      	PR fortran/107995
      	* gfortran.dg/pr107995.f90: New test.
      8f72249f
    • d: Fix internal compiler error: in visit, at d/imports.cc:72 (PR108050) · d9d8c967
      Iain Buclaw authored
      The visitor for lowering IMPORTED_DECLs did not have an override for
      dealing with importing OverloadSet symbols.  This has now been
      implemented in the code generator.
      
      	PR d/108050
      
      gcc/d/ChangeLog:
      
      	* decl.cc (DeclVisitor::visit (Import *)): Handle build_import_decl
      	returning a TREE_LIST.
      	* imports.cc (ImportVisitor::visit (OverloadSet *)): New override.
      
      gcc/testsuite/ChangeLog:
      
      	* gdc.dg/imports/pr108050/mod1.d: New.
      	* gdc.dg/imports/pr108050/mod2.d: New.
      	* gdc.dg/imports/pr108050/package.d: New.
      	* gdc.dg/pr108050.d: New test.
      d9d8c967
    • unidiff: use newline='\n' argument · b0451799
      Martin Liska authored
      In order to support CR on a line, we need to open files
      with newline='\n' as our line endings are supposed to be of UNIX style.
      
      contrib/ChangeLog:
      
      	* check_GNU_style.py: Use newline=\n.
      	* check_GNU_style_lib.py: Simplify.
      	* gcc-changelog/git_commit.py: Fix issues seen in
      	Rust patchset.
      	* gcc-changelog/git_email.py: Use newline argument.
      	* gcc-changelog/test_email.py: New test.
      	* gcc-changelog/test_patches.txt: New test.
      	* mklog.py: Use newline argument.
      b0451799
    • d: Merge upstream dmd, druntime c8ae4adb2e, phobos 792c8b7c1. · 6d799f0a
      Iain Buclaw authored
      D front-end changes:
      
      	- Import dmd v2.101.0.
      	- Deprecate the ability to call `__traits(getAttributes)' on
      	  overload sets.
      	- Deprecate non-empty `for' statement increment clause with no
      	  effect.
      	- Array literals assigned to `scope' array variables can now be
      	  allocated on the stack.
      
      D runtime changes:
      
      	- Import druntime v2.101.0.
      
      Phobos changes:
      
      	- Import phobos v2.101.0.
      
      gcc/d/ChangeLog:
      
      	* dmd/MERGE: Merge upstream dmd c8ae4adb2e.
      	* typeinfo.cc (check_typeinfo_type): Update for new front-end
      	interface.
      	(TypeInfoVisitor::visit (TypeInfoStructDeclaration *)): Remove warning
      	that toHash() must be declared 'nothrow @safe'.
      
      libphobos/ChangeLog:
      
      	* libdruntime/MERGE: Merge upstream druntime c8ae4adb2e.
      	* src/MERGE: Merge upstream phobos 792c8b7c1.
      6d799f0a
    • d: Expand bsr intrinsic as `clz(arg) ^ (argsize - 1)' · cc7f509d
      Iain Buclaw authored
      As well as removing unnecessary casts, this results in fewer temporaries
      being generated during the initial gimple lowering pass.  Otherwise the
      code generated is identical to the former intrinsic expansion.
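
      The XOR form works because argsize - 1 is all ones in its low bits and
      the leading-zero count never exceeds it, so subtracting from it and
      XOR-ing with it give the same result.  A C sketch using the GCC/Clang
      builtin __builtin_clz; the function names are illustrative, and the
      argument must be nonzero because __builtin_clz (0) is undefined:

```c
#include <limits.h>

/* Index of the highest set bit, written as a subtraction.  */
static int
bsr_minus (unsigned int x)
{
  return (int) (sizeof x * CHAR_BIT - 1) - __builtin_clz (x);
}

/* The same index, written as an XOR.  */
static int
bsr_xor (unsigned int x)
{
  return __builtin_clz (x) ^ (int) (sizeof x * CHAR_BIT - 1);
}
```

      Both functions agree for every nonzero input on targets where the bit
      width of unsigned int is a power of two.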
      
      gcc/d/ChangeLog:
      
      	* intrinsics.cc (expand_intrinsic_bsf): Fix comment.
      	(expand_intrinsic_bsr): Use BIT_XOR_EXPR instead of MINUS_EXPR.
      cc7f509d
    • tree-optimization/89317 - missed folding of (p + 4) - &p->d · d13b86f9
      Richard Biener authored
      The PR notices we fail to simplify
      
        a_4 = &x_3(D)->data;
        b_5 = x_3(D) + 16;
        _1 = b_5 - a_4;
      
      Together with the enabling change that handles ADDR_EXPR leaves in separate
      stmts in match.pd, the suggested patterns work.
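
      In source terms, the folding says that a byte offset from the object
      minus the address of one of its members depends only on the member's
      offset.  A C sketch with a hypothetical struct layout (struct node and
      both function names are made up for the example):

```c
#include <stddef.h>

/* Hypothetical layout; only the existence of a 'data' member matters.  */
struct node
{
  long tag;
  char data[32];
};

/* The unsimplified computation: (x + 16 bytes) - &x->data.  */
static ptrdiff_t
diff_direct (struct node *x)
{
  char *a = (char *) &x->data;
  char *b = (char *) x + 16;
  return b - a;
}

/* The folded form: 16 - offsetof (data), with no pointer involved.  */
static ptrdiff_t
diff_folded (void)
{
  return 16 - (ptrdiff_t) offsetof (struct node, data);
}
```

      The two functions return the same value for any valid object, which is
      why the pointer can be dropped from the expression.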
      
      	PR tree-optimization/89317
      	* match.pd ((p + b) - &p->c -> b - offsetof(c)): New patterns.
      
      	* gcc.dg/tree-ssa/pr89317.c: New testcase.
      d13b86f9
    • Treat ADDR_EXPR and CONSTRUCTOR as GIMPLE/GENERIC magically · 26295a06
      Richard Biener authored
      The following allows to match ADDR_EXPR for both the invariant
      &a.b case as well as the &p->d case in a separate definition
      transparently.  This also allows to remove the hack we employ
      for CONSTRUCTOR which we handle for example with
      
       (match vec_same_elem_p
        CONSTRUCTOR@0
        (if (TREE_CODE (@0) == SSA_NAME
             && uniform_vector_p (gimple_assign_rhs1 (SSA_NAME_DEF_STMT (@0))))))
      
      Note CONSTRUCTORs always appear as separate definition in GIMPLE,
      but I continue to play safe and ADDR_EXPRs are now matched in
      both places where previously ADDR_EXPR@0 would have missed
      the &p->x case.
      
      This is a prerequisite for the PR89317 fix.
      
      	* genmatch.cc (dt_node::gen_kids): Handle ADDR_EXPR in both
      	the GENERIC and GIMPLE op position.
      	(dt_simplify::gen): Capture both GENERIC and GIMPLE op
      	position for ADDR_EXPR and CONSTRUCTOR.
      	* match.pd: Simplify CONSTRUCTOR leaf handling.
      
      	* gcc.dg/tree-ssa/forwprop-3.c: Adjust.
      	* g++.dg/tree-ssa/pr31146-2.C: Likewise.
      26295a06
    • tree-optimization/106904 - bogus -Wstringop-overflow with vectors · f8d136e5
      Richard Biener authored
      The following avoids CSE of &ps->wp to &ps->wp.hwnd confusing
      -Wstringop-overflow by making sure to produce addresses to the
      biggest container from vectorization.  For this I introduce
      strip_zero_offset_components which turns &ps->wp.hwnd into
      &(*ps) and use that to base the vector data references on.
      That will also work for addresses with variable components,
      alternatively emitting pointer arithmetic via calling
      get_inner_reference and gimplifying that would be possible
      but likely more intrusive.
      
      This is by no means a complete fix for all of those issues
      (avoiding ADDR_EXPRs in favor of pointer arithmetic might be).
      Other passes will have similar issues.
      
      In theory that might now cause false negatives.
      
      	PR tree-optimization/106904
      	* tree.h (strip_zero_offset_components): Declare.
      	* tree.cc (strip_zero_offset_components): Define.
      	* tree-vect-data-refs.cc (vect_create_addr_base_for_vector_ref):
      	Strip zero offset components before building the address.
      
      	* gcc.dg/Wstringop-overflow-pr106904.c: New testcase.
      f8d136e5
    • fortran/openmp.cc: Remove 's' that slipped in during %<..%> replacement · 045592f6
      Tobias Burnus authored
      Seemingly, 's' (in VI that's the 's'ubstitute command) appeared verbatim in
      a gfc_error message when doing the '...' to %<...%> replacements in commit
      r13-4590-g84f6f8a2a97f88be01e223c9c9dbab801a4f501f
      
      gcc/fortran/
      	* openmp.cc (gfc_match_omp_context_selector_specification):
      	Remove spurious 's' in an error message.
      045592f6
    • Daily bump. · c6b12b80
      GCC Administrator authored
      c6b12b80
  3. Dec 10, 2022
    • Fortran: reject bad SIZE argument while simplifying ISHFTC [PR106911] · ae443853
      Harald Anlauf authored
      gcc/fortran/ChangeLog:
      
      	PR fortran/106911
      	* simplify.cc (gfc_simplify_ishftc): If the SIZE argument is known
      	to be outside the allowed range, terminate simplification.
      
      gcc/testsuite/ChangeLog:
      
      	PR fortran/106911
      	* gfortran.dg/pr106911.f90: New test.
      ae443853
    • ivopts: Fix IP_END handling for asm goto [PR107997] · 7676235f
      Jakub Jelinek authored
      The following testcase ICEs, because the latch bb ends with
      asm goto which has both fallthrough to the header and one or more labels
      in the header too.  In that case there is just a single edge out of the
      latch block, but still the asm goto is stmt_ends_bb_p statement, yet
      ivopts decides to emit an IV bump at the IP_END position and inserts
      it into the same bb as the asm goto after it, which then fails verification
      (control flow in the middle of bb).
      
      The following patch fixes it by splitting the latch -> header edge in that
      case and inserting into the newly created bb, where split_edge ->
      redirect_edge_and_branch is able to deal with this case correctly.
      
      2022-12-10  Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/107997
      	* tree-ssa-loop-ivopts.cc: Include cfganal.h.
      	(create_new_iv) <case IP_END>: If ip_end_pos bb is non-empty and ends
      	with a stmt which ends bb, instead of adding iv update after it split
      	the latch edge and insert iterator into the new latch bb.
      
      	* gcc.c-torture/compile/pr107997.c: New test.
      7676235f
    • libgomp: Handle OpenMP's reverse offloads · ea4b23d9
      Tobias Burnus authored
      This commit enabled reverse offload for nvptx such that gomp_target_rev
      actually gets called.  And it fills the latter function to do all of
      the following: finding the host function to the device func ptr and
      copying the arguments to the host, processing the mapping/firstprivate,
      calling the host function, copying back the data and freeing as needed.
      
      The data handling is made easier by assuming that all host variables
      either existed before (and are in the mapping) or that those are
      device variables not yet available on the host. Thus, the reverse
      mapping can do without refcounts etc. Note that the spec disallows
      inside a target region device-affecting constructs other than target
      plus ancestor device-modifier and it also limits the clauses permitted
      on this construct.
      
      For the function addresses, an additional splay tree is used; for
      the lookup of mapped variables, the existing splay-tree is used.
      Unfortunately, its data structure requires a full walk of the tree;
      additionally, the just-mapped variables are recorded in a separate
      data structure, which needs an extra lookup. While the lookup is slow, assuming
      that only few variables get mapped in each reverse offload construct
      and that reverse offload is the exception and not performance critical,
      this seems to be acceptable.
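
      The full-walk lookup with early exit can be pictured as a tree
      traversal whose callback returns nonzero to stop.  The sketch below is
      not the libgomp code: the node type, the traversal order and all names
      are hypothetical, and a plain binary tree stands in for the splay tree.

```c
#include <stddef.h>

/* Hypothetical node type standing in for a splay-tree node.  */
struct node
{
  int key;
  struct node *left, *right;
};

/* Callback type: return nonzero to stop the walk.  */
typedef int (*visit_stop_fn) (struct node *, void *);

/* Visit nodes in order; stop as soon as the callback returns nonzero.
   Returns 1 if the walk was cut short, 0 if every node was visited.  */
static int
foreach_lazy (struct node *n, visit_stop_fn fn, void *data)
{
  if (n == NULL)
    return 0;
  if (foreach_lazy (n->left, fn, data))
    return 1;
  if (fn (n, data))
    return 1;
  return foreach_lazy (n->right, fn, data);
}

/* Lookup state passed through the walk.  */
struct lookup
{
  int wanted;
  struct node *found;
  int visited;
};

/* Callback: record a match and request an early exit.  */
static int
match_key (struct node *n, void *data)
{
  struct lookup *l = data;
  l->visited++;
  if (n->key == l->wanted)
    {
      l->found = n;
      return 1;
    }
  return 0;
}
```

      A miss still visits every node, which is the cost the message above
      accepts for the rare reverse-offload case.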
      
      libgomp/ChangeLog:
      
      	* libgomp.h (struct target_mem_desc): Predeclare; move
      	below after 'reverse_splay_tree_node' and add rev_array
      	member.
      	(struct reverse_splay_tree_key_s, reverse_splay_compare): New.
      	(reverse_splay_tree_node, reverse_splay_tree,
      	reverse_splay_tree_key): New typedef.
      	(struct gomp_device_descr): Add mem_map_rev member.
      	* oacc-host.c (host_dispatch): NULL init .mem_map_rev.
      	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim
      	support for GOMP_REQUIRES_REVERSE_OFFLOAD.
      	* splay-tree.h (splay_tree_callback_stop): New typedef; like
      	splay_tree_callback but returning int not void.
      	(splay_tree_foreach_lazy): Define; like splay_tree_foreach but
      	taking splay_tree_callback_stop as argument.
      	* splay-tree.c (splay_tree_foreach_internal_lazy,
      	splay_tree_foreach_lazy): New; but early exit if callback returns
      	nonzero.
      	* target.c: Instantiate splay_tree_c with splay_tree_prefix 'reverse'.
      	(gomp_map_lookup_rev): New.
      	(gomp_load_image_to_device): Handle reverse-offload function
      	lookup table.
      	(gomp_unload_image_from_device): Free devicep->mem_map_rev.
      	(struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup,
      	gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int,
      	gomp_map_cdata_lookup): New auxiliary structs and functions for
      	gomp_target_rev.
      	(gomp_target_rev): Implement reverse offloading and its mapping.
      	(gomp_target_init): Init current_device.mem_map_rev.root.
      	* testsuite/libgomp.fortran/reverse-offload-2.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-3.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-4.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-5.f90: New test.
      	* testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without
      	mapping of on-device allocated variables.
      ea4b23d9
    • Add initial ChangeLogs for modula2. · 68ee8a64
      Gaius Mulley authored
      
      Add initial ChangeLog file in libgm2 and gcc/m2.
      
      ChangeLog:
      
      	* libgm2: (New directory).
      	* libgm2/ChangeLog: (New file).
      
      gcc/ChangeLog:
      
      	* m2: (New directory).
      	* m2/ChangeLog: (New file).
      
      Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
      68ee8a64
    • Add stub 'gcc/rust/ChangeLog' · 24ff0b3e
      Thomas Schwinge authored
      24ff0b3e
    • Fortran: Replace simple '.' quotes by %<.%> · 84f6f8a2
      Tobias Burnus authored
      Using %qs instead of '%s', or %<=%> instead of '=', looks nicer:
      it gives nicer quotes and bold text if the terminal supports it;
      otherwise, plain quotes are used.
      
      gcc/fortran/ChangeLog:
      
      	* match.cc (gfc_match_member_sep): Use %<...%> in gfc_error.
      	* openmp.cc (gfc_match_oacc_routine, gfc_match_omp_context_selector,
      	gfc_match_omp_context_selector_specification,
      	gfc_match_omp_declare_variant, resolve_omp_clauses): Likewise;
      	use %qs instead of '%s'.
      	* primary.cc (match_real_constant, gfc_match_varspec): Likewise.
      	* resolve.cc (gfc_resolve_formal_arglist, resolve_operator,
      	resolve_ordinary_assign): Likewise.
      84f6f8a2
    • Prepare 'contrib/gcc-changelog/git_commit.py' for GCC/Rust · 325529e2
      Thomas Schwinge authored
      	contrib/
      	* gcc-changelog/git_commit.py (default_changelog_locations): Add
      	'gcc/rust'.
      	(bug_components): Add 'rust'.
      325529e2
    • Add ChangeLog directories for modula2 into git_commit.py. · 7e4aa710
      Gaius Mulley authored
      
      Prepare to add changelogs for the Modula2 front end by changing
      the contrib git_commit.py script.
      
      contrib/ChangeLog:
      
      	* gcc-changelog/git_commit.py (default_changelog_locations):
      	New entry for gcc/m2.  New entry for libgm2.
      
      Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
      7e4aa710
    • libbacktrace: rewrite and simplify main zstd loop · 1bdba731
      Ian Lance Taylor authored
      	* elf.c (ZSTD_TABLE_*): Use elf_zstd_fse_baseline_entry.
      	(ZSTD_ENCODE_BASELINE_BITS): Define.
      	(ZSTD_DECODE_BASELINE, ZSTD_DECODE_BASEBITS): Define.
      	(elf_zstd_literal_length_base): New static const array.
      	(elf_zstd_match_length_base): Likewise.
      	(struct elf_zstd_fse_baseline_entry): Define.
      	(elf_zstd_make_literal_baseline_fse): New static function.
      	(elf_zstd_make_offset_baseline_fse): Likewise.
      	(elf_zstd_make_match_baseline_fse): Likewise.
      	(print_table, main): Use elf_zstd_fse_baseline_entry.
      	(elf_zstd_lit_table, elf_zstd_match_table): Likewise.
      	(elf_zstd_offset_table): Likewise.
      	(struct elf_zstd_seq_decode): Likewise.  Remove use_rle and rle
      	fields.
      	(elf_zstd_unpack_seq_decode): Use elf_zstd_fse_baseline_entry,
      	taking a conversion function.  Convert RLE to FSE.
      	(elf_zstd_literal_length_baseline): Remove.
      	(elf_zstd_literal_length_bits): Remove.
      	(elf_zstd_match_length_baseline): Remove.
      	(elf_zstd_match_length_bits): Remove.
      	(elf_zstd_decompress): Use elf_zstd_fse_baseline_entry.  Rewrite
      	and simplify main loop.
      1bdba731
    • Daily bump. · 40ce6485
      GCC Administrator authored
      40ce6485
  4. Dec 09, 2022
    • Fortran: ICE on recursive derived types with allocatable components [PR107872] · 01254aa2
      Paul Thomas authored
      gcc/fortran/ChangeLog:
      
      	PR fortran/107872
      	* resolve.cc (derived_inaccessible): Skip over allocatable components
      	to prevent an infinite loop.
      
      gcc/testsuite/ChangeLog:
      
      	PR fortran/107872
      	* gfortran.dg/pr107872.f90: New test.
      01254aa2
    • Fortran/OpenMP: align/allocator modifiers to the allocate clause · b2e1c49b
      Tobias Burnus authored
      gcc/fortran/ChangeLog:
      
      	* dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE
      	output.
      	* gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'.
      	(gfc_free_omp_namelist): Add bool arg.
      	* match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'.
      	* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction,
      	gfc_match_omp_flush): Update call.
      	(gfc_match_omp_clauses): Match 'align/allocate modifiers in
      	'allocate' clause.
      	(resolve_omp_clauses): Resolve align.
      	* st.cc (gfc_free_statement): Update call.
      	* trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'.
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (5.1 Impl. Status): Split allocate clause/directive
      	item about 'align'; mark clause as 'Y' and directive as 'N'.
      	* testsuite/libgomp.fortran/allocate-2.f90: New test.
      	* testsuite/libgomp.fortran/allocate-3.f90: New test.
      b2e1c49b