  1. Nov 16, 2021
    • tree-optimization: [PR103245] Improve detection of abs pattern using multiplication · 3200de91
      Andrew Pinski authored
      So while working on PR 103228 (and a few others), I noticed the testcase for PR 94785
      was failing. The problem is that the nop_convert moved from being inside the IOR to be
      outside of it. I also noticed the patch for PR 103228 was not needed to reproduce the
      issue either.
      This patch combines the two patterns together for the abs match when using multiplication
      and adds a few places where nop_convert is optional.
      
      OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
      
      	PR tree-optimization/103245
      
      gcc/ChangeLog:
      
      	* match.pd: Combine the abs pattern matching using multiplication.
      	Adding optional nop_convert too.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/pr103245-1.c: New test.
      3200de91
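To make the idea of an "abs pattern using multiplication" concrete, here is a minimal sketch of one such shape: a sign factor built with a shift and an IOR, then multiplied by the value. The helper name `abs_via_mul` is hypothetical, and this is not claimed to be the exact form matched in match.pd or used in the PR testcases. It assumes an arithmetic right shift of negative values, which is what GCC implements.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>

// Hypothetical helper, not taken from the GCC testsuite.
// (x >> (N-1)) is -1 for negative x and 0 otherwise (arithmetic shift),
// so OR-ing with 1 yields a factor of -1 or +1.
int abs_via_mul(int x) {
    const int sign_bit = static_cast<int>(sizeof(int) * CHAR_BIT) - 1;
    int factor = (x >> sign_bit) | 1;
    return factor * x;  // overflows for INT_MIN, just like abs()
}

// Compares the multiplication form against the library abs().
bool abs_via_mul_matches(int x) { return abs_via_mul(x) == std::abs(x); }
```

A compiler that recognizes this shape can replace the shift, IOR and multiply with a single abs operation.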
    • Add a missing return when transforming atomic bit test and operations · 074ee8d9
      H.J. Lu authored
      When failing to transform equivalent, but slightly different cases of
      atomic bit test and operations to their canonical forms, return
      immediately.
      
      gcc/
      
      	PR middle-end/103268
      	* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Add a missing
      	return.
      
      gcc/testsuite/
      
      	PR middle-end/103268
      	* gcc.dg/pr103268-1.c: New test.
      	* gcc.dg/pr103268-2.c: Likewise.
      074ee8d9
    • Update my email address. · a031aaa2
      Jim Wilson authored
      	* MAINTAINERS: Update my address.
      a031aaa2
    • Daily bump. · e2b57363
      GCC Administrator authored
      e2b57363
  2. Nov 15, 2021
    • c++: Add -fimplicit-constexpr · 87c2080b
      Jason Merrill authored
      With each successive C++ standard the restrictions on the use of the
      constexpr keyword for functions get weaker and weaker; it recently occurred
      to me that it is heading toward the same fate as the C register keyword,
      which was once useful for optimization but became obsolete.  Similarly, it
      seems to me that we should be able to just treat inlines as constexpr
      functions and not make people add the extra keyword everywhere.
      
      There were a lot of testcase changes needed; many disabling errors about
      non-constexpr functions that are now constexpr, and many disabling implicit
      constexpr so that the tests can check the same thing as before, whether
      that's mangling or whatever.
      
      gcc/c-family/ChangeLog:
      
      	* c.opt: Add -fimplicit-constexpr.
      	* c-cppbuiltin.c: Define __cpp_implicit_constexpr.
      	* c-opts.c (c_common_post_options): Disable below C++14.
      
      gcc/cp/ChangeLog:
      
      	* cp-tree.h (struct lang_decl_fn): Add implicit_constexpr.
      	(decl_implicit_constexpr_p): New.
      	* class.c (type_maybe_constexpr_destructor): Use
      	TYPE_HAS_TRIVIAL_DESTRUCTOR and maybe_constexpr_fn.
      	(finalize_literal_type_property): Simplify.
      	* constexpr.c (is_valid_constexpr_fn): Check for dtor.
      	(maybe_save_constexpr_fundef): Try to set DECL_DECLARED_CONSTEXPR_P
      	on inlines.
      	(cxx_eval_call_expression): Use maybe_constexpr_fn.
      	(maybe_constexpr_fn): Handle flag_implicit_constexpr.
      	(var_in_maybe_constexpr_fn): Use maybe_constexpr_fn.
      	(potential_constant_expression_1): Likewise.
      	(decl_implicit_constexpr_p): New.
      	* decl.c (validate_constexpr_redeclaration): Allow change with
      	-fimplicit-constexpr.
      	(grok_special_member_properties): Use maybe_constexpr_fn.
      	* error.c (dump_function_decl): Don't print 'constexpr'
      	if it's implicit.
      	* Make-lang.in (check-c++-all): Update.
      
      libstdc++-v3/ChangeLog:
      
      	* testsuite/20_util/to_address/1_neg.cc: Adjust error.
      	* testsuite/26_numerics/random/concept.cc: Adjust asserts.
      
      gcc/testsuite/ChangeLog:
      
      	* lib/g++-dg.exp: Handle "impcx".
      	* lib/target-supports.exp
      	(check_effective_target_implicit_constexpr): New.
      	* g++.dg/abi/abi-tag16.C:
      	* g++.dg/abi/abi-tag18a.C:
      	* g++.dg/abi/guard4.C:
      	* g++.dg/abi/lambda-defarg1.C:
      	* g++.dg/abi/mangle26.C:
      	* g++.dg/cpp0x/constexpr-diag3.C:
      	* g++.dg/cpp0x/constexpr-ex1.C:
      	* g++.dg/cpp0x/constexpr-ice5.C:
      	* g++.dg/cpp0x/constexpr-incomplete2.C:
      	* g++.dg/cpp0x/constexpr-memfn1.C:
      	* g++.dg/cpp0x/constexpr-neg3.C:
      	* g++.dg/cpp0x/constexpr-specialization.C:
      	* g++.dg/cpp0x/inh-ctor19.C:
      	* g++.dg/cpp0x/inh-ctor30.C:
      	* g++.dg/cpp0x/lambda/lambda-mangle3.C:
      	* g++.dg/cpp0x/lambda/lambda-mangle5.C:
      	* g++.dg/cpp1y/auto-fn12.C:
      	* g++.dg/cpp1y/constexpr-loop5.C:
      	* g++.dg/cpp1z/constexpr-lambda7.C:
      	* g++.dg/cpp2a/constexpr-dtor3.C:
      	* g++.dg/cpp2a/constexpr-new13.C:
      	* g++.dg/cpp2a/constinit11.C:
      	* g++.dg/cpp2a/constinit12.C:
      	* g++.dg/cpp2a/constinit14.C:
      	* g++.dg/cpp2a/constinit15.C:
      	* g++.dg/cpp2a/spaceship-constexpr1.C:
      	* g++.dg/cpp2a/spaceship-eq3.C:
      	* g++.dg/cpp2a/udlit-class-nttp-neg2.C:
      	* g++.dg/debug/dwarf2/auto1.C:
      	* g++.dg/debug/dwarf2/cdtor-1.C:
      	* g++.dg/debug/dwarf2/lambda1.C:
      	* g++.dg/debug/dwarf2/pr54508.C:
      	* g++.dg/debug/dwarf2/pubnames-2.C:
      	* g++.dg/debug/dwarf2/pubnames-3.C:
      	* g++.dg/ext/is_literal_type3.C:
      	* g++.dg/ext/visibility/template7.C:
      	* g++.dg/gcov/gcov-12.C:
      	* g++.dg/gcov/gcov-2.C:
      	* g++.dg/ipa/devirt-35.C:
      	* g++.dg/ipa/devirt-36.C:
      	* g++.dg/ipa/devirt-37.C:
      	* g++.dg/ipa/devirt-44.C:
      	* g++.dg/ipa/imm-devirt-1.C:
      	* g++.dg/lookup/builtin5.C:
      	* g++.dg/lto/inline-crossmodule-1_0.C:
      	* g++.dg/modules/enum-1_a.C:
      	* g++.dg/modules/fn-inline-1_c.C:
      	* g++.dg/modules/pmf-1_b.C:
      	* g++.dg/modules/used-1_c.C:
      	* g++.dg/tls/thread_local11.C:
      	* g++.dg/tls/thread_local11a.C:
      	* g++.dg/tm/pr46653.C:
      	* g++.dg/ubsan/pr70035.C:
      	* g++.old-deja/g++.other/delete6.C:
      	* g++.dg/modules/pmf-1_a.H:
      	Adjust for implicit constexpr.
      87c2080b
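As a rough illustration of what -fimplicit-constexpr is meant to allow, the sketch below uses an ordinary inline function in a constant expression. It relies only on the `__cpp_implicit_constexpr` macro named in the ChangeLog above. The function names are hypothetical, and the assumption that the macro is defined exactly when the option is active is mine, not something the commit message states.

```cpp
#include <cassert>

// An ordinary inline function with no constexpr keyword.
inline int square(int x) { return x * x; }

#ifdef __cpp_implicit_constexpr
// Only compiled when the feature macro is defined, presumably when
// building with -fimplicit-constexpr.
static_assert(square(3) == 9, "inline function usable in a constant expression");
constexpr bool implicit_constexpr_enabled = true;
#else
constexpr bool implicit_constexpr_enabled = false;
#endif

// Works either way: at run time square() is just a normal function.
int square_at_runtime(int x) { return square(x); }
```

Without the option the `static_assert` is skipped and the file still compiles, which keeps the example portable across compilers.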
    • c++: split_nonconstant_init and flexarrays · 29e4163a
      Jason Merrill authored
      split_nonconstant_init was doing the wrong thing for both the initialization
      and cleanup here; we know the size from the initializer, and we can pass it
      along.  This doesn't make the testcase work, since the y destructor is still
      broken, but it removes the wrong error for the aggregate initialization.
      
      gcc/cp/ChangeLog:
      
      	* typeck2.c (split_nonconstant_init_1): Handle flexarrays better.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/ext/flexary37.C: Remove expected error.
      29e4163a
    • gimple-fold: Use ranges to simplify strncat and snprintf · 323026c7
      Siddhesh Poyarekar authored
      
      Use ranges for lengths and object sizes in strncat and snprintf to
      determine if they can be transformed into simpler operations.
      
      gcc/ChangeLog:
      
      	* gimple-fold.c (gimple_fold_builtin_strncat): Use ranges to
      	determine if it is safe to transform to strcat.
      	(gimple_fold_builtin_snprintf): Likewise.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/fold-stringops-2.c: Define size_t.
      	(safe1): Adjust.
      	(safe4): New test.
      	* gcc.dg/fold-stringops-3.c: New test.
      
      Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
      323026c7
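The strncat case can be pictured with a small sketch. Here the bound `n` is not a constant, but after the early return it is known to be at least the length of the source string, so the call behaves like strcat. The function names are hypothetical and the example only demonstrates the semantic equivalence. It does not show how GCC performs the fold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical example: n is only known to lie in a range, not as a constant.
void append_abc(char *dst, std::size_t n) {
    if (n < 3)
        return;                   // from here on, n >= 3 == strlen("abc")
    std::strncat(dst, "abc", n);  // the bound can never truncate the source
}

// Checks that the bounded append gives the same result as plain strcat.
bool same_as_strcat(std::size_t n) {
    char a[16] = "x";
    char b[16] = "x";
    append_abc(a, n);
    std::strcat(b, "abc");
    return std::strcmp(a, b) == 0;
}
```

For any `n` of 3 or more the two buffers end up identical, which is the property a range-based fold can rely on.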
    • gimple-fold: Use ranges to simplify _chk calls · cea4dab8
      Siddhesh Poyarekar authored
      
      Instead of comparing LEN and SIZE only if they are constants, use their
      ranges to decide if LEN will always be lower than or same as SIZE.
      
      This change ends up putting the stringop-overflow warning line number
      against the strcpy implementation, so adjust the warning check to be
      line number agnostic.
      
      gcc/ChangeLog:
      
      	* gimple-fold.c (known_lower): New function.
      	(gimple_fold_builtin_strncat_chk,
      	gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
      	gimple_fold_builtin_stxncpy_chk,
      	gimple_fold_builtin_snprintf_chk,
      	gimple_fold_builtin_sprintf_chk): Use it.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
      	* gcc.dg/fold-stringops-2.c: New test.
      
      Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
      cea4dab8
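The range comparison described in this commit can be modelled in a few lines. This is a toy model under stated assumptions: `Range` and `always_lower_or_equal` are hypothetical names, and GCC's own range classes and its `known_lower` function have different types and signatures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical closed interval [lo, hi]; GCC's real range classes differ.
struct Range {
    std::size_t lo;
    std::size_t hi;
};

// LEN is lower than or the same as SIZE on every execution
// exactly when the largest possible LEN fits the smallest possible SIZE.
bool always_lower_or_equal(Range len, Range size) {
    return len.hi <= size.lo;
}
```

With constants, both ranges collapse to single points and the test reduces to the old `LEN <= SIZE` comparison.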
    • gimple-fold: Transform stp*cpy_chk to str*cpy directly · d1753b4b
      Siddhesh Poyarekar authored
      
      Avoid going through another folding cycle and use the ignore flag to
      directly transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY when set,
      likewise for BUILT_IN_STPNCPY_CHK to BUILT_IN_STRNCPY.
      
      Dump the transformation in dump_file so that we can verify in tests that
      the direct transformation actually happened.
      
      gcc/ChangeLog:
      
      	* gimple-fold.c (dump_transformation): New function.
      	(gimple_fold_builtin_stxcpy_chk,
      	gimple_fold_builtin_stxncpy_chk): Use it.  Simplify to
      	BUILT_IN_STRNCPY if return value is not used.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/fold-stringops-1.c: New test.
      
      Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
      d1753b4b
    • Check optab before transforming atomic bit test and operations · 4c19122b
      H.J. Lu authored
      Check optab before transforming equivalent, but slightly different cases
      of atomic bit test and operations to their canonical forms.
      
      gcc/
      
      	PR middle-end/103184
      	* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
      	before transforming equivalent, but slightly different cases to
      	their canonical forms.
      
      gcc/testsuite/
      
      	PR middle-end/103184
      	* gcc.dg/pr103184-1.c: New test.
      	* gcc.dg/pr103184-2.c: Likewise.
      4c19122b
    • IPA: Provide a mechanism to register static DTORs via cxa_atexit. · fabe8cc4
      Iain Sandoe authored
      
      For at least one target (Darwin) the platform convention is to
      register static destructors (i.e. __attribute__((destructor)))
      with __cxa_atexit rather than placing them into a list that is
      run by some other mechanism.
      
      This patch provides a target hook that allows a target to opt
      into this and handling for the process in ipa_cdtor_merge ().
      
      When the mode is enabled (dtors_from_cxa_atexit is set) we:
      
       * Generate new CTORs to register static destructors with
         __cxa_atexit and add them to the existing list of CTORs;
         we then process the revised CTORs list.
      
       * We sort the DTORs into priority and then TU order, this
         means that they are registered in that order with
         __cxa_atexit () and therefore will be run in the reverse
         order.
      
       * Likewise, CTORs are sorted into priority and then TU order,
         which means that they will run in that order.
      
      This matches the behavior of using init/fini (or
      mod_init_func/mod_term_func) sections.
      
      This also fixes a bug where Fortran needs a DTOR to be run to
      close IO.
      
      Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
      
      	PR fortran/102992
      
      gcc/ChangeLog:
      
      	* config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
      	* doc/tm.texi: Regenerated.
      	* doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
      	* ipa.c (cgraph_build_static_cdtor_1): Return the built
      	function decl.
      	(build_cxa_atexit_decl): New.
      	(build_dso_handle_decl): New.
      	(build_cxa_dtor_registrations): New.
      	(compare_cdtor_tu_order): New.
      	(build_cxa_atexit_fns): New.
      	(ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
      	process the DTORs/CTORs accordingly.
      	(pass_ipa_cdtor_merge::gate): Also run if
      	dtors_from_cxa_atexit is set.
      	* target.def (dtors_from_cxa_atexit): New hook.
      fabe8cc4
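The ordering rule in this commit (register DTORs in priority and then TU order, so that they run in the reverse order) can be sketched as a small model. The `Dtor` struct and `run_order` function are hypothetical, and the ascending sort direction is an assumption. The sketch only models the ordering and does not call __cxa_atexit.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one static destructor.
struct Dtor {
    int priority;
    int tu_order;
    std::string name;
};

// Returns the names in the order the destructors would run.
std::vector<std::string> run_order(std::vector<Dtor> dtors) {
    // Registration order: by priority, then by position in the TU.
    std::stable_sort(dtors.begin(), dtors.end(),
                     [](const Dtor &a, const Dtor &b) {
                         if (a.priority != b.priority)
                             return a.priority < b.priority;
                         return a.tu_order < b.tu_order;
                     });
    // Handlers registered with an atexit-style mechanism run
    // last-registered-first, so walk the list backwards.
    std::vector<std::string> order;
    for (auto it = dtors.rbegin(); it != dtors.rend(); ++it)
        order.push_back(it->name);
    return order;
}
```

For example, destructors registered as (100, 0), (100, 1), (200, 0) would run as (200, 0), (100, 1), (100, 0).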
    • configure, Darwin: Check ld64 support for -platform-version. · d3cc82dc
      Iain Sandoe authored
      
      Newer versions of ld64 allow specifying the OS target (e.g.
      macos or ios), the version and the SDK version all in a single
      command.  This checks the availability of the command for the
      current toolchain.
      
      Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
      
      gcc/ChangeLog:
      
      	* config.in: Regenerate.
      	* configure: Regenerate.
      	* configure.ac: Test ld64 for -platform-version support.
      d3cc82dc
    • testsuite, Darwin: In tsvc.h, use malloc for Darwin <= 9. · bd5159bd
      Iain Sandoe authored
      
      Earlier Darwin versions do not have posix_memalign() but the
      malloc implementation is guaranteed to produce memory suitably
      aligned for the largest vector type.
      
      Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/vect/tsvc/tsvc.h: Use malloc for Darwin 9 and
      	earlier.
      bd5159bd
    • Ada, Darwin : Use DSYMUTIL_FOR_TARGET in libgnat/gnarl builds. · b7f01478
      Iain Sandoe authored
      
      Most of the time we get away with using the dsymutil that is
      installed with the latest Xcode, however for some cross-compilation
      cases that does not work.
      
      We now have the ability to specify the correct dsymutil to use for
      the toolchain (--with-dsymutil=) and we should use that specified
      tool for debug link.  Fixes cross-compilers from x86-64 to powerpc.
      
      Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
      
      gcc/ada/ChangeLog:
      
      	* gcc-interface/Makefile.in: Use DSYMUTIL_FOR_TARGET in
      	libgnat/libgnarl recipes.
      b7f01478
    • libstdc++: Unordered containers merge re-use hash code · d10b863f
      François Dumont authored
      When merging two unordered containers that use the same hasher, we can re-use the
      cached hash code, if there is one.
      
      Also, when merging into a multi-container, use the previous insert iterator as a hint
      for the next insert.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/hashtable_policy.h:
      	(_Hash_code_base<>::_M_hash_code(const _Hash&, const _Hash_node_value<_Value, true>&)): New.
      	(_Hash_code_base<>::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): New.
      	* include/bits/hashtable.h (_Hashtable<>::_M_merge_unique): Use latter.
      	(_Hashtable<>::_M_merge_multi): Likewise.
      	* testsuite/23_containers/unordered_multiset/modifiers/merge.cc (test05): New test.
      	* testsuite/23_containers/unordered_set/modifiers/merge.cc (test04): New test.
      d10b863f
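For context, the user-visible operation this commit optimizes is the C++17 `merge` member of the unordered containers. The sketch below shows only the observable behavior when both containers use the same hasher. The helper name is hypothetical, and the hash-code re-use itself is internal to libstdc++ and not visible here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Moves every element of src that is not already in dst into dst,
// and reports how many elements were left behind in src.
std::size_t merge_and_count_leftovers(std::unordered_set<std::string> &dst,
                                      std::unordered_set<std::string> &src) {
    dst.merge(src);  // C++17: nodes are spliced, not copied
    return src.size();
}
```

Elements whose keys already exist in the destination stay in the source container, so the return value counts the duplicates.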
    • Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map' · f861ed8b
      Thomas Schwinge authored
      Instead of hard-coded '0'/'UINT_MAX', we now use the 'RESERVED_LOCATION_P'
      values 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' as spare values for
      'Empty'/'Deleted', and generally simplify the code.
      
      	gcc/
      	* diagnostic-spec.h (typedef xint_hash_t)
      	(typedef xint_hash_map_t): Replace with...
      	(typedef nowarn_map_t): ... this.
      	(nowarn_map): Adjust.
      	* diagnostic-spec.c (nowarn_map, suppress_warning_at): Likewise.
      f861ed8b
    • Use 'location_hash' for 'seen_locations' in 'gcc/profile.c:branch_prob' · bcebd057
      Thomas Schwinge authored
      Follow-up to commit 102fcf94
      "Fix GCOV CFG related issues": considering the current
      'int_hash <location_t, 0, 2>', per 'libcpp/include/line-map.h':
      
            Actual     | Value                         | Meaning
            -----------+-------------------------------+-------------------------------
            0x00000000 | UNKNOWN_LOCATION (gcc/input.h)| Unknown/invalid location.
            -----------+-------------------------------+-------------------------------
            0x00000001 | BUILTINS_LOCATION             | The location for declarations
                       |   (gcc/input.h)               | in "<built-in>"
            -----------+-------------------------------+-------------------------------
            0x00000002 | RESERVED_LOCATION_COUNT       | The first location to be
                       | (also                         | handed out, and the
                       |  ordmap[0]->start_location)   | first line in ordmap 0
      
      ... this currently uses value '0' ('UNKNOWN_LOCATION') as the spare value for
      'Empty', and value '2' ('RESERVED_LOCATION_COUNT') as the spare value for
      'Deleted', which is questionable?
      
      What actually does get put into 'seen_locations' is (mostly...)
      restricted/gated by '!RESERVED_LOCATION_P' (which is true unless
      'UNKNOWN_LOCATION' or 'BUILTINS_LOCATION'), thus we may simply use
      'location_hash'.
      
      	gcc/
      	* profile.c (branch_prob): Use 'location_hash' for
      	'seen_locations'.
      bcebd057
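The concern about the spare values can be illustrated with a simplified model. The numeric values come from the table quoted in the commit message. The lowercase helper functions are hypothetical simplifications, not GCC's `RESERVED_LOCATION_P` macro or its hash traits.

```cpp
#include <cassert>

using location_t = unsigned int;

// Values as listed in the table quoted from libcpp/include/line-map.h.
constexpr location_t UNKNOWN_LOCATION = 0;
constexpr location_t BUILTINS_LOCATION = 1;
constexpr location_t RESERVED_LOCATION_COUNT = 2;

// Simplified stand-in for RESERVED_LOCATION_P.
constexpr bool reserved_location_p(location_t loc) {
    return loc < RESERVED_LOCATION_COUNT;
}

// Old scheme, int_hash <location_t, 0, 2>: 0 is Empty, 2 is Deleted,
// so neither value can be stored as a real key.
constexpr bool storable_with_old_sentinels(location_t loc) {
    return loc != 0 && loc != 2;
}

// Scheme using the two reserved locations as Empty/Deleted:
// every non-reserved location remains storable.
constexpr bool storable_with_reserved_sentinels(location_t loc) {
    return !reserved_location_p(loc);
}
```

In this model, location 2 (the first location that can be handed out) is unusable as a key under the old sentinels, while the reserved-value scheme only excludes values that are gated out anyway.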
    • Drop tree overflow in irange setter. · 6c29c9d6
      Aldy Hernandez authored
      Drop meaningless overflow that may creep into the IL.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/103207
      	* value-range.cc (irange::set): Drop overflow.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/pr103207.c: New test.
      6c29c9d6
    • Fortran: openmp: Add support for thread_limit clause on target · 82ec4cb3
      Tobias Burnus authored
      gcc/fortran/ChangeLog:
      
      	* openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
      	* trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
      	teams.
      
      libgomp/ChangeLog:
      
      	* testsuite/libgomp.fortran/thread-limit-1.f90: New test.
      82ec4cb3
    • testsuite: Add testcase for already fixed PR [PR100469] · b2e1ac54
      Jakub Jelinek authored
      This bug introduced in r11-7448-gff92ede8d269375f800e1b347a48f4698874b4a3
      has been fixed already by r12-1354-g2d2ed777b23ab6503027039e0adbfe1162f52b2f
      aka PR100852 fix.
      
      2021-11-15  Jakub Jelinek  <jakub@redhat.com>
      
      	PR debug/100469
      	* g++.dg/opt/pr100469.C: New test.
      b2e1ac54
    • x86: Add gcc.target/i386/pr103205-2.c · 65010897
      H.J. Lu authored
      	PR target/103205
      	* gcc.target/i386/pr103205-2.c: New test.
      65010897
    • libffi: Update LOCAL_PATCHES · 7d768a9d
      H.J. Lu authored
      Add
      
      commit a91f844e
      Author: Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
      Date:   Mon Nov 15 10:24:27 2021 +0100
      
          libffi: Use #define instead of .macro in  src/x86/win64.S [PR102874]
      
      to LOCAL_PATCHES.
      
      	* LOCAL_PATCHES: Add commit a91f844e.
      7d768a9d
    • openmp: Add support for thread_limit clause on target · aea72386
      Jakub Jelinek authored
      OpenMP 5.1 says that thread_limit clause can also appear on target,
      and similarly to teams should affect the thread-limit-var ICV.
      On combined target teams, the clause goes to both.
      
      We actually passed thread_limit internally on target already before,
      but only used it for gcn/ptx offloading to hint how many threads should be
      created and for ptx didn't set thread_limit_var in that case.
      Similarly for host fallback.
      Also, I found that we weren't copying the args array that contains encoded
      thread_limit and num_teams clause for target (etc.) for async target.
      
      2021-11-15  Jakub Jelinek  <jakub@redhat.com>
      
      gcc/
      	* gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT
      	to OMP_TARGET_CLAUSES if it isn't there already.
      gcc/c-family/
      	* c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>:
      	Duplicate to both OMP_TARGET and OMP_TEAMS.
      gcc/c/
      	* c-parser.c (OMP_TARGET_CLAUSE_MASK): Add
      	PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
      gcc/cp/
      	* parser.c (OMP_TARGET_CLAUSE_MASK): Add
      	PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
      libgomp/
      	* task.c (gomp_create_target_task): Copy args array as well.
      	* target.c (gomp_target_fallback): Add args argument.
      	Set gomp_icv (true)->thread_limit_var if thread_limit is present.
      	(GOMP_target): Adjust gomp_target_fallback caller.
      	(GOMP_target_ext): Likewise.
      	(gomp_target_task_fn): Likewise.
      	* config/nvptx/team.c (gomp_nvptx_main): Set
      	gomp_global_icv.thread_limit_var.
      	* testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.
      aea72386
    • Fix PHI ordering problems in the path solver. · fcdf49a0
      Aldy Hernandez authored
      After auditing the PHI range calculations, I'm not convinced we've
      caught all the corner cases.  They haven't shown up in the wild (yet),
      but better safe than sorry.
      
      We shouldn't write anything to the cache or trigger additional
      lookups while calculating a PHI, as this may cause ordering problems.
      We should resolve the PHI with either the cache as it stands, or by
      asking for ranges on entry to the path.  I've documented this.
      
      There was one dubious case where we called fold_range in
      ssa_range_in_phi, which mostly by luck wasn't triggering lookups,
      because fold_range solves a PHI by calling range_on_edge, which is set
      to pick up global ranges by default in path_range_query.  This is
      fragile, so I've rewritten the call to explicitly use cached or global
      ranges.
      
      Also, the cache should be avoided in ssa_range_in_phi when the arg is
      defined in the PHI's block, as not doing so could create an ordering
      problem.  We have a similar check when calculating relations in PHIs.
      
      Tested on x86-64 & ppc64le Linux.
      
      gcc/ChangeLog:
      
      	* gimple-range-path.cc (path_range_query::internal_range_of_expr):
      	Remove useless code.
      	(path_range_query::ssa_defined_in_bb): New.
      	(path_range_query::ssa_range_in_phi): Avoid fold_range call that
      	could trigger additional lookups.
      	Do not use the cache for ARGs defined in this block.
      	(path_range_query::compute_ranges_in_block): Use ssa_defined_in_bb.
      	(path_range_query::maybe_register_phi_relation): Same.
      	(path_range_query::range_of_stmt): Adjust comment.
      	* gimple-range-path.h (ssa_defined_in_bb): New.
      fcdf49a0
    • path solver: Default to global range if nothing found. · 540d92ae
      Aldy Hernandez authored
      This has been a long time coming, but we weren't able to make the
      change because of some unrelated regressions.
      
      Tested on x86-64 & ppc64le Linux.
      
      gcc/ChangeLog:
      
      	* gimple-range-path.cc (path_range_query::internal_range_of_expr):
      	Default to global range if nothing found.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/tree-ssa/pr31146-2.C: Add -fno-thread-jumps.
      540d92ae
    • tree-optimization/103237 - avoid vectorizing unhandled double reductions · 220bd618
      Richard Biener authored
      Double reductions which have multiple LC PHIs in the inner loop
      are not handled correctly during transformation since those PHIs
      are not properly classified as reduction.  The following disables
      vectorizing them.
      
      2021-11-15  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/103237
      	* tree-vect-loop.c (vect_is_simple_reduction): Fail for
      	double reductions with multiple inner loop LC PHI nodes.
      
      	* gcc.dg/torture/pr103237.c: New testcase.
      220bd618
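For readers unfamiliar with the term, a double reduction is a reduction whose accumulator is carried through an inner loop nested inside an outer loop. The sketch below shows only the plain single-accumulator shape. It is a hypothetical illustration and does not reproduce the multiple-LC-PHI case from the PR.

```cpp
#include <cassert>

constexpr int ROWS = 3;
constexpr int COLS = 4;

// The accumulator 'sum' is updated in the inner loop and its value is
// carried across iterations of the outer loop as well.
int sum_all(const int (&a)[ROWS][COLS]) {
    int sum = 0;
    for (int i = 0; i < ROWS; ++i)
        for (int j = 0; j < COLS; ++j)
            sum += a[i][j];
    return sum;
}
```

The case the commit disables has more than one value carried out of the inner loop.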
    • PR target/103069: Relax cmpxchg loop for x86 target · 4d281ff7
      Hongyu Wang authored
      From the CPU's point of view, getting a cache line for writing is more
      expensive than reading.  See Appendix A.2 Spinlock in:
      
      https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf
      
      The full compare and swap grabs the cache line in exclusive state and causes
      excessive cache line bouncing.
      
      The atomic_fetch_{or,xor,and,nand} builtins generate a cmpxchg loop under
      -march=x86-64 like:
      
      	movl	v(%rip), %eax
      .L2:
      	movl	%eax, %ecx
      	movl	%eax, %edx
      	orl	$1, %ecx
      	lock cmpxchgl	%ecx, v(%rip)
      	jne	.L2
      	movl	%edx, %eax
      	andl	$1, %eax
      	ret
      
      To relax the above loop, GCC should first emit a normal load, compare, and jump to
      .L2 if cmpxchgl may fail.  Before jumping to .L2, PAUSE should be inserted to
      yield the CPU to another hyperthread and to save power, so the code looks
      like
      
      .L84:
              movl    (%rdi), %ecx
              movl    %eax, %edx
              orl     %esi, %edx
              cmpl    %eax, %ecx
              jne     .L82
              lock cmpxchgl   %edx, (%rdi)
              jne     .L84
      .L82:
              rep nop
              jmp     .L84
      
      This patch adds corresponding atomic_fetch_op expanders to insert the load,
      compare and pause for all the atomic logic fetch builtins.  It also adds the flag
      -mrelax-cmpxchg-loop to control whether the relaxed loop is generated.
      
      gcc/ChangeLog:
      
      	PR target/103069
      	* config/i386/i386-expand.c (ix86_expand_atomic_fetch_op_loop):
      	New expand function.
      	* config/i386/i386-options.c (ix86_target_string): Add
      	-mrelax-cmpxchg-loop flag.
      	(ix86_valid_target_attribute_inner_p): Likewise.
      	* config/i386/i386-protos.h (ix86_expand_atomic_fetch_op_loop):
      	New expand function prototype.
      	* config/i386/i386.opt: Add -mrelax-cmpxchg-loop.
      	* config/i386/sync.md (atomic_fetch_<logic><mode>): New expander
      	for SI,HI,QI modes.
      	(atomic_<logic>_fetch<mode>): Likewise.
      	(atomic_fetch_nand<mode>): Likewise.
      	(atomic_nand_fetch<mode>): Likewise.
      	(atomic_fetch_<logic><mode>): New expander for DI,TI modes.
      	(atomic_<logic>_fetch<mode>): Likewise.
      	(atomic_fetch_nand<mode>): Likewise.
      	(atomic_nand_fetch<mode>): Likewise.
      	* doc/invoke.texi: Document -mrelax-cmpxchg-loop.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/103069
      	* gcc.target/i386/pr103069-1.c: New test.
      	* gcc.target/i386/pr103069-2.c: Ditto.
      4d281ff7
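The relaxed loop described in this commit can be approximated at source level. This is a sketch of the idea only, written with std::atomic. The names `cpu_relax` and `fetch_or_relaxed_loop` are hypothetical, and the code is not what the new expanders emit. It reads the value with a plain load first, pauses when the compare-and-swap would fail, and only then attempts the locked operation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: issue PAUSE on x86, do nothing elsewhere.
static inline void cpu_relax() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

// Returns the value held before the OR, like atomic fetch_or.
unsigned fetch_or_relaxed_loop(std::atomic<unsigned> &v, unsigned mask) {
    unsigned expected = v.load(std::memory_order_relaxed);
    for (;;) {
        // Plain read first: it does not need the cache line exclusively.
        unsigned current = v.load(std::memory_order_relaxed);
        if (current != expected) {
            // The compare-and-swap would fail, so pause and retry
            // with the freshly observed value.
            cpu_relax();
            expected = current;
            continue;
        }
        // On failure, compare_exchange_weak refreshes 'expected'.
        if (v.compare_exchange_weak(expected, expected | mask))
            return expected;
    }
}
```

The design choice mirrors the commit's rationale: the cheap read filters out attempts that are bound to fail, so the expensive exclusive cache-line request happens less often under contention.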
    • tree-optimization/103219 - avoid ICE in unroll-and-jam · d1ca8aea
      Richard Biener authored
      For no particularly good reason unroll-and-jam uses single_dom_exit
      to determine the exit for the region it wants to run VN on.  That
      happens to ICE because of the dominance restriction.  Use single_exit
      instead.
      
      2021-11-15  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/103219
      	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Use single_exit
      	to determine the exit for the VN region.
      
      	* gcc.dg/torture/pr103219.c: New testcase.
      d1ca8aea
    • [tree-vectorizer.c] Merge pass_vectorize::execute with vectorize_loops and... · 2551cd4f
      Prathamesh Kulkarni authored
      [tree-vectorizer.c] Merge pass_vectorize::execute with vectorize_loops and replace occurrences of cfun with function param.
      
      gcc/ChangeLog:
      	* tree-ssa-loop.c (pass_vectorize): Move to tree-vectorizer.c.
      	(pass_data_vectorize): Likewise.
      	(make_pass_vectorize): Likewise.
      	* tree-vectorizer.c (vectorize_loops): Merge with
      	pass_vectorize::execute and replace cfun occurrences with fun param.
      	(adjust_simduid_builtins): Add fun param, replace cfun occurrences with
      	fun, and adjust callers appropriately.
      	(note_simd_array_uses): Likewise.
      	(vect_loop_dist_alias_call): Likewise.
      	(set_uid_loop_bbs): Likewise.
      	(vect_transform_loops): Likewise.
      	(try_vectorize_loop_1): Likewise.
      	(try_vectorize_loop): Likewise.
      2551cd4f
    • libffi: Use #define instead of .macro in src/x86/win64.S [PR102874] · a91f844e
      Rainer Orth authored
      The libffi 3.4.2 import badly broke Solaris/x86 bootstrap with the native
      assembler:
      
      Assembler:
              "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 :
      Illegal mnemonic
              Near line: ".macro epilogue"
              "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 : Syntax
      error
              Near line: ".macro epilogue"
              "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 :
      Illegal mnemonic
              Near line: ".endm"
              "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 : Syntax
      error
              Near line: ".endm"
              "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 :
      Illegal mnemonic
              Near line: " epilogue"
              "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 :
      Syntax error
              Near line: "epilogue"
      
      Solaris as doesn't support .macro/.endm.
      
      Fixed by using #define instead of the unportable .macro.
      
      Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
      
      The bug has been reported upstream
      (https://github.com/libffi/libffi/issues/665); a corresponding pull
      request is also pending (https://github.com/libffi/libffi/pull/669).
      
      
      2021-10-21  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      
      	libffi:
      	PR libffi/102874
      	* src/x86/win64.S (epilogue): Use #define instead of .macro.
      a91f844e
    • testsuite: i386: Require dfp in gcc.target/i386/pr101346.c · a68933da
      Rainer Orth authored
      gcc.target/i386/pr101346.c currently FAILs on Solaris/x86:
      
      FAIL: gcc.target/i386/pr101346.c (test for excess errors)
      
      Excess errors:
      /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:6:1:
      error: decimal floating-point not supported for this target
      /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:7:6:
      error: decimal floating-point not supported for this target
      /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:9:12:
      warning: implicit declaration of function '__builtin_fabsd128'; did you
      mean '__builtin_fabsf128'? [-Wimplicit-function-declaration]
      
      Fixed by requiring dfp support.  Tested on i386-pc-solaris2.11 and
      x86_64-pc-linux-gnu.
      
      
      2021-10-20  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      
      	gcc/testsuite:
      	* gcc.target/i386/pr101346.c: Require dfp support.
      a68933da
    • Jakub Jelinek's avatar
      i386: Fix up x86 atomic_bit_test* expanders for !TARGET_HIMODE_MATH [PR103205] · 625eef42
      Jakub Jelinek authored
      With !TARGET_HIMODE_MATH, the OPTAB_DIRECT expand_simple_binop calls fail and
      so we ICE.  We don't really care if the operations are done promoted in SImode
      instead.
      
      2021-11-15  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/103205
      	* config/i386/sync.md (atomic_bit_test_and_set<mode>,
      	atomic_bit_test_and_complement<mode>,
      	atomic_bit_test_and_reset<mode>): Use OPTAB_WIDEN instead of
      	OPTAB_DIRECT.
      
      	* gcc.target/i386/pr103205.c: New test.
      625eef42
    • Jakub Jelinek's avatar
      libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound · 9fa72756
      Jakub Jelinek authored
      Here is a PTX implementation of what I was talking about.  For
      num_teams_upper 0, or whenever num_teams_lower <= num_blocks, the current
      implementation is fine.  But if the user explicitly asks for more
      teams than we can provide in hardware, we need to stop assuming that
      omp_get_team_num () is equal to the hw team id.  Instead we need to use some
      team specific memory (it is .shared for PTX) or, if none is
      provided, an array indexed by the hw team id, and run some teams serially
      within the same hw thread.
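
The serial execution described above can be modelled on the host as a small
simulation.  This is only a sketch, not the libgomp code: the function name
and parameters are hypothetical, and it assumes that each hardware block
starts at the team number equal to its own id and advances by num_blocks
until it runs out of teams.

```cpp
// Simulation only; teams_run_by_block is a hypothetical helper, not part
// of libgomp.  Counts how many logical teams one hardware block executes
// when more teams are requested than hardware blocks exist.
// Assumption: the logical team number starts at the hardware block id and
// is bumped by num_blocks after each team finishes.
// Precondition: num_blocks > 0.
constexpr unsigned teams_run_by_block(unsigned hw_block_id,
                                      unsigned num_blocks,
                                      unsigned num_teams) {
  unsigned count = 0;
  for (unsigned team = hw_block_id; team < num_teams; team += num_blocks)
    ++count;  // this block runs logical team number `team`
  return count;
}

// With 4 hardware blocks and 10 requested teams, block 0 runs teams
// 0, 4 and 8, while block 3 runs teams 3 and 7.
static_assert(teams_run_by_block(0, 4, 10) == 3, "block 0 runs three teams");
static_assert(teams_run_by_block(3, 4, 10) == 2, "block 3 runs two teams");
```

Under this model omp_get_team_num () can no longer be derived from the
hardware id alone, which is why a per-team variable is needed.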
      
      2021-11-15  Jakub Jelinek  <jakub@redhat.com>
      
      	* config/nvptx/team.c (__gomp_team_num): Define as
      	__attribute__((shared)) var.
      	(gomp_nvptx_main): Initialize __gomp_team_num to 0.
      	* config/nvptx/target.c (__gomp_team_num): Declare as
      	extern __attribute__((shared)) var.
      	(GOMP_teams4): Use __gomp_team_num as the team number instead of
      	%ctaid.x.  If first, initialize it to %ctaid.x.  If num_teams_lower
      	is bigger than num_blocks, use num_teams_lower teams and arrange for
      	bumping of __gomp_team_num if !first and returning false once we run
      	out of teams.
      	* config/nvptx/teams.c (__gomp_team_num): Declare as
      	extern __attribute__((shared)) var.
      	(omp_get_team_num): Return __gomp_team_num value instead of %ctaid.x.
      9fa72756
    • Jakub Jelinek's avatar
      libgomp: Add a testcase for omp_get_num_teams inside of target inside of host teams · d2944597
      Jakub Jelinek authored
      This is https://github.com/OpenMP/spec/issues/3183
      There is an agreement that we should return 1 team inside of target,
      even if that target is inside of host teams.  We were doing that
      when offloading but not during host fallback; r12-5151 should fix that
      even for host fallback.
      
      2021-11-15  Jakub Jelinek  <jakub@redhat.com>
      
      	* testsuite/libgomp.c/teams-5.c: New test.
      d2944597
    • Jason Merrill's avatar
      c++: location of lambda object and conversion call · 2317082c
      Jason Merrill authored
      Two things that had poor location info: we weren't giving the TARGET_EXPR
      for a lambda object any location, and the call to a conversion function was
      getting whatever input_location happened to be.
      
      gcc/cp/ChangeLog:
      
      	* call.c (perform_implicit_conversion_flags): Use the location of
      	the argument.
      	* lambda.c (build_lambda_object): Set location on the TARGET_EXPR.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp0x/lambda/lambda-switch.C: Adjust expected location.
      2317082c
    • Jason Merrill's avatar
      c++: check constexpr constructor body · 37326651
      Jason Merrill authored
      The implicit constexpr patch revealed that our check that a constexpr
      constructor could possibly produce a constant value (otherwise the
      program is IFNDR) was failing to look at most of the function body.
      Fixing that required some library tweaks.
      
      gcc/cp/ChangeLog:
      
      	* constexpr.c (maybe_save_constexpr_fundef): Also check whether the
      	body of a constructor is potentially constant.
      
      libstdc++-v3/ChangeLog:
      
      	* src/c++17/memory_resource.cc: Add missing constexpr.
      	* include/experimental/internet: Only mark copy constructor
      	as constexpr with __cpp_constexpr_dynamic_alloc.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp1y/constexpr-89285-2.C: Expect error.
      	* g++.dg/cpp1y/constexpr-89285.C: Adjust error.
      37326651
    • Jason Merrill's avatar
      c++: is_this_parameter and coroutines proxies · daa9c6b0
      Jason Merrill authored
      Compiling coroutines/pr95736.C with the implicit constexpr patch broke
      because is_this_parameter didn't recognize the coroutines proxy for 'this'.
      
      gcc/cp/ChangeLog:
      
      	* semantics.c (is_this_parameter): Check DECL_HAS_VALUE_EXPR_P
      	instead of is_capture_proxy.
      daa9c6b0
    • Jason Merrill's avatar
      c++: c++20 constexpr default ctor and array init · bd95d75f
      Jason Merrill authored
      The implicit constexpr patch revealed that marking the constructor in the
      PR70690 testcase as constexpr made the bug reappear, because build_vec_init
      assumed that a constexpr default constructor initialized the whole object,
      so it was equivalent to value-initialization.  But this is no longer true in
      C++20.
      
      	PR c++/70690
      
      gcc/cp/ChangeLog:
      
      	* init.c (build_vec_init): Check default_init_uninitialized_part in
      	C++20.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/init/array41a.C: New test.
      bd95d75f
    • Jason Merrill's avatar
      c++: don't do constexpr folding in unevaluated context · 4df7f8c7
      Jason Merrill authored
      The implicit constexpr patch revealed that we were doing constant evaluation
      of arbitrary expressions in unevaluated contexts, leading to failure when we
      tried to evaluate e.g. a call to declval.  This is wrong more generally;
      only manifestly-constant-evaluated expressions should be evaluated within
      an unevaluated operand.
      
      Making this change revealed a case we were failing to mark as manifestly
      constant-evaluated.
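
A minimal sketch of the distinction, using a hypothetical Widget type: the
operand of decltype is unevaluated, so only its type is inspected and the
call to std::declval (which has no definition) must never be evaluated.  An
array bound, by contrast, is manifestly constant-evaluated even when it
appears inside an unevaluated operand such as sizeof.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical type used only for illustration.
struct Widget {
  constexpr int size() const { return 3; }
};

// Unevaluated operand: only the type of the expression is needed, so the
// call to std::declval must not be constant-evaluated.
using SizeType = decltype(std::declval<Widget>().size());
static_assert(std::is_same<SizeType, int>::value, "size() returns int");

// The array bound is manifestly constant-evaluated, so Widget{}.size() is
// evaluated at compile time even though sizeof's operand is unevaluated.
constexpr std::size_t buffer_bytes = sizeof(char[Widget{}.size()]);
static_assert(buffer_bytes == 3, "array bound was evaluated");
```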
      
      gcc/cp/ChangeLog:
      
      	* constexpr.c (maybe_constant_value): Don't evaluate
      	in an unevaluated operand unless manifestly const-evaluated.
      	(fold_non_dependent_expr_template): Likewise.
      	* decl.c (compute_array_index_type_loc): This context is
      	manifestly constant-evaluated.
      4df7f8c7
    • Jason Merrill's avatar
      c++: constexpr virtual and vbase thunk · 267318a2
      Jason Merrill authored
      C++20 allows virtual functions to be constexpr.  I don't think that calling
      through a pointer to a vbase subobject is supposed to work in a constant
      expression, since an object with virtual bases can't be constant, but the
      call shouldn't ICE.
      
      gcc/cp/ChangeLog:
      
      	* constexpr.c (cxx_eval_thunk_call): Error instead of ICE
      	on vbase thunk to constexpr function.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/constexpr-virtual20.C: New test.
      267318a2