- Mar 06, 2025
-
-
Jonathan Wakely authored
We need to include <bits/stl_pair.h> in C++23 and later, so that __pair_like_convertible_from can use __pair_like, and so that __is_tuple_like_v is declared before we define a partial specialization. libstdc++-v3/ChangeLog: * include/bits/ranges_util.h: Include <bits/stl_pair.h>. Reviewed-by:
Patrick Palka <ppalka@redhat.com>
-
Jonathan Wakely authored
This destructor declaration serves no purpose, as pointed out by LWG 3903 which was approved at Varna, June 2023. libstdc++-v3/ChangeLog: * include/std/span (span::~span): Remove, as per LWG 3903. Reviewed-by:
Patrick Palka <ppalka@redhat.com>
-
Jonathan Wakely authored
This test fails due to duplicate explicit instantiations on targets where size_t and unsigned int are the same type. It also fails with -D_GLIBCXX_USE_CXX11_ABI=0 due to using std::string in constexpr functions, and with --disable-libstdcxx-pch due to not including <algorithm> for ranges::fold_left. libstdc++-v3/ChangeLog: PR libstdc++/119144 * testsuite/26_numerics/complex/tuple_like.cc: Include <algorithm>, replace std::string with std::string_view, instantiate tests for long instead of size_t.
-
Richard Sandiford authored
This reverts commit e836d803.
-
Richard Biener authored
The following makes sure to also walk CONSTRUCTOR element indexes which can be FIELD_DECLs, referencing otherwise unused types we need to clean. walk_tree only walks CONSTRUCTOR element data. PR lto/114501 * ipa-free-lang-data.cc (find_decls_types_r): Explicitly handle CONSTRUCTORs as walk_tree handling of those is incomplete. * g++.dg/pr114501_0.C: New testcase.
-
Jonathan Wakely authored
libstdc++-v3/ * src/c++20/tzdb.cc [__GTHREADS && !__GTHREADS_CXX0X]: Use '__gnu_cxx::__mutex'. Co-authored-by:
Thomas Schwinge <tschwinge@baylibre.com>
-
Thomas Schwinge authored
libstdc++: Avoid '-Wunused-parameter' for 'out' in member function 'std::codecvt_base::result std::__format::{anonymous}::__encoding::conv(std::string_view, std::string&) const' In a newlib configuration: ../../../../../source-gcc/libstdc++-v3/src/c++20/format.cc: In member function ‘std::codecvt_base::result std::__format::{anonymous}::__encoding::conv(std::string_view, std::string&) const’: ../../../../../source-gcc/libstdc++-v3/src/c++20/format.cc:100:35: error: unused parameter ‘out’ [-Werror=unused-parameter] 100 | conv(string_view input, string& out) const | ~~~~~~~~^~~ libstdc++-v3/ * src/c++20/format.cc (conv): Tag 'out' as '[[maybe_unused]]'.
-
Thomas Schwinge authored
libstdc++: Avoid '-Wunused-parameter' for 'is_directory' in member function 'bool std::filesystem::__cxx11::_Dir::do_unlink(bool, std::error_code&) const' In a newlib configuration: ../../../../../source-gcc/libstdc++-v3/src/c++17/fs_dir.cc: In member function ‘bool std::filesystem::__cxx11::_Dir::do_unlink(bool, std::error_code&) const’: ../../../../../source-gcc/libstdc++-v3/src/c++17/fs_dir.cc:147:18: error: unused parameter ‘is_directory’ [-Werror=unused-parameter] 147 | do_unlink(bool is_directory, error_code& ec) const noexcept | ~~~~~^~~~~~~~~~~~ libstdc++-v3/ * src/c++17/fs_dir.cc (do_unlink): Tag 'is_directory' as '[[maybe_unused]]'.
-
Thomas Schwinge authored
libstdc++: Avoid '-Wunused-parameter' for 'nofollow' in static member function 'static std::filesystem::__gnu_posix::DIR* std::filesystem::_Dir_base::openat(const _At_path&, bool)' In a newlib configuration: In file included from ../../../../../source-gcc/libstdc++-v3/src/c++17/fs_dir.cc:37, from ../../../../../source-gcc/libstdc++-v3/src/c++17/cow-fs_dir.cc:26: ../../../../../source-gcc/libstdc++-v3/src/c++17/../filesystem/dir-common.h: In static member function ‘static std::filesystem::__gnu_posix::DIR* std::filesystem::_Dir_base::openat(const _At_path&, bool)’: ../../../../../source-gcc/libstdc++-v3/src/c++17/../filesystem/dir-common.h:210:36: error: unused parameter ‘nofollow’ [-Werror=unused-parameter] 210 | openat(const _At_path& atp, bool nofollow) | ~~~~~^~~~~~~~ libstdc++-v3/ * src/filesystem/dir-common.h (openat): Tag 'nofollow' as '[[maybe_unused]]'.
-
Thomas Schwinge authored
libstdc++: Avoid '-Wunused-parameter' for '__what' in function 'void std::__throw_format_error(const char*)' In a '-fno-exceptions' configuration: In file included from ../../../../../source-gcc/libstdc++-v3/src/c++20/format.cc:29: [...]/build-gcc/[...]/libstdc++-v3/include/format: In function ‘void std::__throw_format_error(const char*)’: [...]/build-gcc/[...]/libstdc++-v3/include/format:200:36: error: unused parameter ‘__what’ [-Werror=unused-parameter] 200 | __throw_format_error(const char* __what) | ~~~~~~~~~~~~^~~~~~ libstdc++-v3/ * include/bits/c++config [!__cpp_exceptions] (_GLIBCXX_THROW_OR_ABORT): Reference '_EXC'. Co-authored-by:
Jonathan Wakely <jwakely@redhat.com>
-
Jonathan Wakely authored
The old COW std::string is not usable in constant expressions, so these new tests fail with -D_GLIBCXX_USE_CXX11_ABI=0. The parts of the tests using std::string can be conditionally skipped. libstdc++-v3/ChangeLog: * testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc: Do not test COW std::string in constexpr contexts. * testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc: Likewise. * testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc: Likewise. * testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc: Likewise. * testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc: Likewise. Reviewed-by:
Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
-
Alex Coplan authored
The PR claims that pair-fusion has invalid uses of gcc_assert (such that the pass will misbehave with --disable-checking). As noted in the comments, in the case of the calls to restrict_movement, the only way we can possibly depend on the side effects is if we call it with a non-singleton move range. However, the intent is that we always have a singleton move range here, and thus we do not rely on the side effects. This patch therefore adds asserts to check for a singleton move range before calling restrict_movement, thus clarifying the intent and hopefully dispelling any concerns that having the calls wrapped in asserts is problematic here. gcc/ChangeLog: PR rtl-optimization/114492 * pair-fusion.cc (pair_fusion_bb_info::fuse_pair): Check for singleton move range before calling restrict_movement. (pair_fusion::try_promote_writeback): Likewise.
-
Giuseppe D'Angelo authored
This commit implements P2819R2 for C++26, making std::complex destructurable and tuple-like (see [complex.tuple]). std::get needs to get forward declared in stl_pair.h (following the existing precedent for the implementation of P2165R4, cf. r14-8710-g65b4cba9d6a9ff), and implemented in <complex>. Also, std::get(complex<T>) needs to return *references* to the real and imaginary parts of a std::complex object, honoring the value category and constness of the argument. In principle a straightforward task, it gets a bit convoluted by the fact that: 1) std::complex does not have existing getters that one can use for this (real() and imag() return values, not references); 2) there are specializations for language/extended floating-point types, which requires some duplication -- need to amend the primary and all the specializations; 3) these specializations use a `__complex__ T`, but the primary template uses two non-static data members, making generic code harder to write. The implementation choice used here is to add the overloads of std::get for complex as declared in [complex.tuple]. In turn they dispatch to a newly added getter that extracts references to the real/imaginary parts of a complex<T>. This getter is private API, and the implementation depends on whether it's the primary (bind the data member) or a specialization (use the GCC language extensions for __complex__). To avoid duplication and minimize template instantiations, the getter uses C++23's deducing this (this avoids const overloads). The value category is dealt with by the std::get overloads. Add a test that covers the aspects of the tuple protocol, as well as the tuple-like interface. While at it, add a test for the existing tuple-like feature-testing macro. PR libstdc++/113310 libstdc++-v3/ChangeLog: * include/bits/stl_pair.h (get): Forward-declare std::get for std::complex. * include/bits/version.def (tuple_like): Bump the value of the feature-testing macro in C++26. * include/bits/version.h: Regenerate. * include/std/complex: Implement the tuple protocol for std::complex. (tuple_size): Specialize for std::complex. (tuple_element): Ditto. (__is_tuple_like_v): Ditto. (complex): Add a private getter to obtain references to the real and the imaginary part, on the primary class template and on its specializations. (get): Add overloads of std::get for std::complex. * testsuite/20_util/tuple/tuple_like_ftm.cc: New test. * testsuite/26_numerics/complex/tuple_like.cc: New test.
-
Richard Sandiford authored
Following on from the discussion in: https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675256.html this patch removes TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE and replaces it with two hooks: one that controls the cost of using an extra callee-saved register and one that controls the cost of allocating a frame for the first spill. (The patch does not attempt to address the shrink-wrapping part of the thread above.) On AArch64, this is enough to fix PR117477, as verified by the new tests. The patch does not change the SPEC2017 scores significantly. (I saw a slight improvement in fotonik3d and roms, but I'm not convinced that the improvements are real.) The patch makes IRA use caller saves for gcc.target/aarch64/pr103350-1.c, which is a scan-dump correctness test that relies on not using caller saves. The decision to use caller saves looks appropriate, and saves an instruction, so I've just added -fno-caller-saves to the test options. The x86 parts were written by Honza. gcc/ PR rtl-optimization/117477 * config/aarch64/aarch64.cc (aarch64_count_saves): New function. (aarch64_count_above_hard_fp_saves, aarch64_callee_save_cost) (aarch64_frame_allocation_cost): Likewise. (TARGET_CALLEE_SAVE_COST): Define. (TARGET_FRAME_ALLOCATION_COST): Likewise. * config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale): Replace with... (ix86_callee_save_cost): ...this new hook. (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete. (TARGET_CALLEE_SAVE_COST): Define. * target.h (spill_cost_type, frame_cost_type): New enums. * target.def (callee_save_cost, frame_allocation_cost): New hooks. (ira_callee_saved_register_cost_scale): Delete. * doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete. (TARGET_CALLEE_SAVE_COST, TARGET_FRAME_ALLOCATION_COST): New hooks. * doc/tm.texi: Regenerate. * hard-reg-set.h (hard_reg_set_popcount): New function. * ira-color.cc (allocated_memory_p): New variable. (allocated_callee_save_regs): Likewise. (record_allocation): New function. (assign_hard_reg): Use targetm.frame_allocation_cost to model the cost of the first spill or first caller save. Use targetm.callee_save_cost to model the cost of using new callee-saved registers. Apply the exit rather than entry frequency to the cost of restoring a register or deallocating the frame. Update the new variables above. (improve_allocation): Use record_allocation. (color): Initialize allocated_callee_save_regs. (ira_color): Initialize allocated_memory_p. * targhooks.h (default_callee_save_cost): Declare. (default_frame_allocation_cost): Likewise. * targhooks.cc (default_callee_save_cost): New function. (default_frame_allocation_cost): Likewise. gcc/testsuite/ PR rtl-optimization/117477 * gcc.target/aarch64/callee_save_1.c: New test. * gcc.target/aarch64/callee_save_2.c: Likewise. * gcc.target/aarch64/callee_save_3.c: Likewise. * gcc.target/aarch64/pr103350-1.c: Add -fno-caller-saves. Co-authored-by:
Jan Hubicka <hubicka@ucw.cz>
-
Michal Jires authored
Incremental LTO disabled cleanup of output_files since they have to persist in ltrans cache. This unintetionally also kept temporary early debug "*.debug.temp.o" files. Bootstrapped/regtested on x86_64-linux. Ok for trunk? lto-plugin/ChangeLog: * lto-plugin.c (cleanup_handler): Keep only files in ltrans cache.
-
Richard Biener authored
The following testcase runs into a re-gimplification issue during inlining when processing MEM[(struct e *)this_2(D)].a = {}; where re-gimplification does not handle assignments in the same way than the gimplifier but instead relies on rhs_predicate_for and gimplifying the RHS standalone. This fails to handle special-casing of CTORs. The is_gimple_mem_rhs_or_call predicate already handles clobbers but not empty CTORs so we end up in the fallback code trying to force the CTOR into a separate stmt using a temporary - but as we have a non-copyable type here that ICEs. The following generalizes empty CTORs in is_gimple_mem_rhs_or_call since those need no additional re-gimplification. PR middle-end/119119 * gimplify.cc (is_gimple_mem_rhs_or_call): All empty CTORs are OK when not a register type. * g++.dg/torture/pr11911.C: New testcase.
-
Simon Martin authored
We have been miscompiling the following valid code since GCC8, and r8-3497-g281e6c1d8f1b4c === cut here === struct span { span (const int (&__first)[1]) : _M_ptr (__first) {} int operator[] (long __i) { return _M_ptr[__i]; } const int *_M_ptr; }; void foo () { constexpr int a_vec[]{1}; auto vec{[&a_vec]() -> span { return a_vec; }()}; } === cut here === The problem is that perform_implicit_conversion_flags (via mark_rvalue_use) replaces "a_vec" in the return statement by a CONSTRUCTOR representing a_vec's constant value, and then takes its address when invoking span's constructor. So we end up with an instance that points to garbage instead of a_vec's storage. As per Jason's suggestion, this patch simply removes the calls to mark_*_use from perform_implicit_conversion_flags, which fixes the PR. PR c++/117504 gcc/cp/ChangeLog: * call.cc (perform_implicit_conversion_flags): Don't call mark_{l,r}value_use. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constexpr-117504.C: New test. * g++.dg/cpp2a/constexpr-117504a.C: New test.
-
Pan Li authored
The changes to vsetvl pass since 14 result in the asm check failure, update the asm check to meet the newest behavior. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c: Tweak the asm check for vsetvl. Signed-off-by:
Pan Li <pan2.li@intel.com>
-
Jeff Law authored
Inspired by Liao Shihua, this adjusts two tests in the RISC-V testsuite to get more coverage. Drop the -O1 argument and replace it with -fext-dce. That way the test gets run across the full set of flags. We just need to make sure to skip -O0. gcc/testsuite/ChangeLog: * gcc.target/riscv/core_list_init.c: Use -fext-dce rather than -O1. Skip for -O0. * gcc.target/riscv/pr111384.c: Ditto.
-
GCC Administrator authored
-
- Mar 05, 2025
-
-
Gaius Mulley authored
This patch allow a packedset to be rotated by the system module intrinsic procedure function. It ensures that both operands to the tree rotate are of the same type. In turn the result will be the same type and the assignment into the designator (of the same set type) will succeed. gcc/m2/ChangeLog: PR modula2/118998 * gm2-gcc/m2expr.cc (m2expr_BuildLRotate): Convert nBits to the return type. (m2expr_BuildRRotate): Ditto. (m2expr_BuildLogicalRotate): Convert op3 to an integer type. Replace op3 aith rotateCount. Negate rotateCount if it is negative and call rotate right. * gm2-gcc/m2pp.cc (m2pp_bit_and_expr): New function. (m2pp_binary_function): Ditto. (m2pp_simple_expression): BIT_AND_EXPR new case clause. LROTATE_EXPR ditto. RROTATE_EXPR ditto. gcc/testsuite/ChangeLog: PR modula2/118998 * gm2/iso/pass/testrotate.mod: New test. * gm2/pim/fail/tinyconst.mod: New test. * gm2/sets/run/pass/simplepacked.mod: New test. Signed-off-by:
Gaius Mulley <gaiusmod2@gmail.com>
-
Jonathan Wakely authored
Implement LWG 3912, approved in Varna, June 2023. libstdc++-v3/ChangeLog: * include/std/ranges (enumerate_view::_Iterator::operator-): Add noexcept, as per LWG 3912. * testsuite/std/ranges/adaptors/enumerate/1.cc: Check iterator difference is noexcept.
-
yxj-github-437 authored
I notice std::timespec and std::timespec_get are used in preprocessor condition _GLIBCXX_HAVE_TIMESPEC_GET. So in module std, it should be the same. libstdc++-v3: * src/c++23/std-clib.cc.in (timespec): Move within preprocessor group guarded by _GLIBCXX_HAVE_TIMESPEC_GET.
-
Jonathan Wakely authored
The new test functions I added in r15-7765-g3866ca796d5281 are causing those tests to FAIL on Solaris and arm-thumb due to the linker complaining about undefined functions. The new test functions are not called, so it shouldn't matter that they call undefined member functions, but it does. Move those functions to separate { dg-do compile } files so the linker isn't used and won't complain. libstdc++-v3/ChangeLog: PR libstdc++/119110 * testsuite/25_algorithms/move/constrained.cc: Move test06 function to ... * testsuite/25_algorithms/move/105609.cc: New test. * testsuite/25_algorithms/move_backward/constrained.cc: Move test04 function to ... * testsuite/25_algorithms/move_backward/105609.cc: New test.
-
Mark Wielaard authored
fortran added a new -Wexternal-argument-mismatch option, but the lang.opt.urls file wasn't regenerated. Fixes: 21ca9153 ("C prototypes for external arguments; add warning for mismatch.") gcc/fortran/ChangeLog: * lang.opt.urls: Regenerated.
-
Patrick Palka authored
libstdc++-v3/ChangeLog: * include/bits/version.def (ranges_cache_latest): Define. * include/bits/version.h: Regenerate. * include/std/ranges (__detail::__non_propagating_cache::_M_reset): Export from base class _Optional_base. (cache_latest_view): Define for C++26. (cache_latest_view::_Iterator): Likewise. (cache_latest_view::_Sentinel): Likewise. (views::__detail::__can_cache_latest): Likewise. (views::_CacheLatest, views::cache_latest): Likewise. * testsuite/std/ranges/adaptors/cache_latest/1.cc: New test. Reviewed-by:
Jonathan Wakely <jwakely@redhat.com>
-
Marek Polacek authored
This PR complains that we issue a -Wnonnull even in a decltype. This fix disables even -Wformat and -Wrestrict. I think that's fine. PR c++/115580 gcc/c-family/ChangeLog: * c-common.cc (check_function_arguments): Return early if c_inhibit_evaluation_warnings. gcc/testsuite/ChangeLog: * g++.dg/warn/Wnonnull16.C: New test. Reviewed-by:
Jason Merrill <jason@redhat.com>
-
Giuseppe D'Angelo authored
This is a C++ >= 26 codepath for supporting constexpr stable_sort, so we know that we have if consteval available; it just needs protection with the feature-testing macro. Also merge the return in the same statement. Amends r15-7708-gff43f9853d3b10. libstdc++-v3/ChangeLog: * include/bits/stl_algo.h (__stable_sort): Use if consteval instead of is_constant_evaluated. Reviewed-by:
Jonathan Wakely <jwakely@redhat.com>
-
Jason Merrill authored
Because coroutines insert a call to the resumer between the initialization of the return value and the actual return to the caller, we need to duplicate the work of gimplify_return_expr for the !aggregate_value_p case. PR c++/117364 PR c++/118874 gcc/cp/ChangeLog: * coroutines.cc (cp_coroutine_transform::build_ramp_function): For !aggregate_value_p return force return value into a local temp. gcc/testsuite/ChangeLog: * g++.dg/coroutines/torture/pr118874.C: New test. Co-authored-by:
Jakub Jelinek <jakub@redhat.com>
-
Hannes Braun authored
vld1q_s8_x3, vld1q_s16_x3, vld1q_s8_x4 and vld1q_s16_x4 were expecting pointers to unsigned integers. These parameters should be pointers to signed integers. gcc/ChangeLog: PR target/118942 * config/arm/arm_neon.h (vld1q_s8_x3): Use int8_t instead of uint16_t. (vld1q_s16_x3): Use int16_t instead of uint16_t. (vld1q_s8_x4): Likewise. (vld1q_s16_x4): Likewise. gcc/testsuite/ChangeLog: PR target/118942 * gcc.target/arm/simd/vld1q_base_xN_1.c: Add -Wpointer-sign. Signed-off-by:
Hannes Braun <hannes@hannesbraun.net>
-
Patrick Palka authored
- Use __builtin_unreachable to suppress a false-positive "control reaches end of non-void function" warning in the recursive lambda (which the existing tests failed to notice since test01 wasn't being called at runtime) - Relax the constraints on views::concat in the single-argument case as per PR115215 - Add an input_range requirement to that same case as per LWG 4082 - In the const-converting constructor of concat_view's iterator, don't require the first iterator to be default constructible PR libstdc++/115215 PR libstdc++/115218 libstdc++-v3/ChangeLog: * include/std/ranges (concat_view::iterator::_S_invoke_with_runtime_index): Use __builtin_unreachable in recursive lambda to certify it always exits via 'return'. (concat_view::iterator::iterator): In the const-converting constructor, direct initialize _M_it. (views::_Concat::operator()): Adjust constraints in the single-argument case as per LWG 4082. * testsuite/std/ranges/concat/1.cc (test01): Call it at runtime too. (test04): New test. Reviewed-by:
Tomasz Kamiński <tkaminsk@redhat.com> Reviewed-by:
Jonathan Wakely <jwakely@redhat.com>
-
Da Xie authored
Add check for constrained auto type specifier in function declaration or function type declaration with trailing return type. Issue error if such usage is detected. Test file renamed, and added a new test for type declaration. Successfully bootstrapped and regretested on x86_64-pc-linux-gnu: Added 6 passed and 4 unsupported tests. PR c++/100589 gcc/cp/ChangeLog: * decl.cc (grokdeclarator): Issue an error for a declarator with constrained auto type specifier and trailing return types. Include function names if available. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-pr100589.C: New test. Signed-off-by:
Da Xie <xxie_xd@163.com> Reviewed-by:
Patrick Palka <ppalka@redhat.com> Reviewed-by:
Jason Merrill <jason@redhat.com>
-
Kyrylo Tkachov authored
The PARALLEL created in aarch64_evpc_dup is used to hold the lane number. It is not appropriate for it to have a vector mode. Other such uses use VOIDmode. Do this here as well. This avoids the risk of generic code treating the PARALLEL as trapping when it has floating-point mode. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by:
Kyrylo Tkachov <ktkachov@nvidia.com> PR rtl-optimization/119046 * config/aarch64/aarch64.cc (aarch64_evpc_dup): Use VOIDmode for PARALLEL.
-
Kyrylo Tkachov authored
In this testcase late-combine was failing to merge: dup v31.4s, v31.s[3] fmla v30.4s, v31.4s, v29.4s into the lane-wise fmla form. This is because late-combine checks may_trap_p under the hood on the dup insn. This ended up returning true for the insn: (set (reg:V4SF 152 [ _32 ]) (vec_duplicate:V4SF (vec_select:SF (reg:V4SF 111 [ rhs_panel.8_31 ]) (parallel:V4SF [ (const_int 3 [0x3])])))) Although mem_trap_p correctly reasoned that vec_duplicate and vec_select of floating-point modes can't trap, it assumed that the V4SF parallel can trap. The correct behaviour is to recurse into vector inside the PARALLEL and check the sub-expression. This patch adjusts may_trap_p_1 to do just that. With this check the above insn is not deemed to be trapping and is propagated into the FMLA giving: fmla vD.4s, vA.4s, vB.s[3] Bootstrapped and tested on aarch64-none-linux-gnu. Apparently this also fixes a regression in gcc.target/aarch64/vmul_element_cost.c that I observed. Signed-off-by:
Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ PR rtl-optimization/119046 * rtlanal.cc (may_trap_p_1): Don't mark FP-mode PARALLELs as trapping. gcc/testsuite/ PR rtl-optimization/119046 * gcc.target/aarch64/pr119046.c: New test.
-
Jakub Jelinek authored
The following testcase is miscompiled during evrp. Before vrp, we have (from ccp): # RANGE [irange] long long unsigned int [0, +INF] MASK 0xffffffffffffc000 VALUE 0x2d _3 = _2 + 18446744073708503085; ... # RANGE [irange] long long unsigned int [0, +INF] MASK 0xffffffffffffc000 VALUE 0x59 _6 = (long long unsigned int) _5; # RANGE [irange] int [-INF, +INF] MASK 0xffffc000 VALUE 0x34 _7 = k_11 + -1048524; switch (_7) <default: <L5> [33.33%], case 8: <L7> [33.33%], case 24: <L6> [33.33%], case 32: <L6> [33.33%]> ... # RANGE [irange] long long unsigned int [0, +INF] MASK 0xffffffffffffc07d VALUE 0x0 # i_20 = PHI <_3(4), 0(3), _6(2)> and evrp is now trying to figure out range for i_20 in range_of_phi. All the ranges and MASK/VALUE pairs above are correct for the testcase, k_11 and _2 based on it is a result of multiplication by a constant with low 14 bits cleared and then some numbers are added to it. There is an obvious missed optimization for which I've filed PR119039, simplify_switch_using_ranges could see that all the labels but default are unreachable because the controlling expression has MASK 0xffffc000 VALUE 0x34 and none of 8, 24 and 32 satisfy that. Anyway, during range_of_phi for i_20, we process the PHI arguments in order. For the _3(4) case, we figure out that it is reachable through the case 24: case 32: labels only of the switch and that 0x34 - 0x2d is 7, so derive [irange] long long unsigned int [17, 17][25, 25] MASK 0xffffffffffffc000 VALUE 0x2d (the MASK/VALUE just got inherited from the _3 earlier range). Now (not suprisingly because those labels aren't actually reachable), that range is inconsistent, 0x2d is 45, so there is conflict between the values and the irange_bitmask. value-range.{h,cc} code differentiates between actually stored irange_bitmask, which is that MASK 0xffffffffffffc000 VALUE 0x2d, and semantic bitmask, which is what get_bitmask returns. That is // The mask inherent in the range is calculated on-demand. For // example, [0,255] does not have known bits set by default. This // saves us considerable time, because setting it at creation incurs // a large penalty for irange::set. At the time of writing there // was a 5% slowdown in VRP if we kept the mask precisely up to date // at all times. Instead, we default to -1 and set it when // explicitly requested. However, this function will always return // the correct mask. // // This also means that the mask may have a finer granularity than // the range and thus contradict it. Think of the mask as an // enhancement to the range. For example: // // [3, 1000] MASK 0xfffffffe VALUE 0x0 // // 3 is in the range endpoints, but is excluded per the known 0 bits // in the mask. // // See also the note in irange_bitmask::intersect. irange_bitmask bm = get_bitmask_from_range (type (), lower_bound (), upper_bound ()); if (!m_bitmask.unknown_p ()) bm.intersect (m_bitmask); Now, get_bitmask_from_range here is MASK 0x1f VALUE 0x0 and it intersects that with that MASK 0xffffffffffffc000 VALUE 0x2d. Which triggers the ugly special case in irange_bitmask::intersect: // If we have two known bits that are incompatible, the resulting // bit is undefined. It is unclear whether we should set the entire // range to UNDEFINED, or just a subset of it. For now, set the // entire bitmask to unknown (VARYING). if (wi::bit_and (~(m_mask | src.m_mask), m_value ^ src.m_value) != 0) { unsigned prec = m_mask.get_precision (); m_mask = wi::minus_one (prec); m_value = wi::zero (prec); } so the semantic bitmask is actually MASK 0xffffffffffffffff VALUE 0x0. Next, range_of_phi attempts to union it with the 0(3) PHI argument, and during irange::union_ first adds the [0,0] to the subranges, so [irange] long long unsigned int [0, 0][17, 17][25, 25] MASK 0xffffffffffffc000 VALUE 0x2d and then goes on to irange::union_bitmask which does if (m_bitmask == r.m_bitmask) return false; irange_bitmask bm = get_bitmask (); irange_bitmask save = bm; bm.union_ (r.get_bitmask ()); if (save == bm) return false; m_bitmask = bm; if (save == get_bitmask ()) return false; m_bitmask MASK 0xffffffffffffc000 VALUE 0x2d isn't the same as r.m_bitmask MASK 0x0 VALUE 0x0, so we compute the semantic bitmask (but note, not from the original range before union, but the modified one, dunno if that isn't a problem as well), which is still the VARYING/unknown_p one, union_ that with MASK 0x0 VALUE 0x0 and get still MASK 0xffffffffffffffff VALUE 0x0, so don't update anything, the semantic bitmask didn't change, so we are fine (not!, see later). Except then we try to union with the third PHI argument. And, because the edge to that comes only from case 8: label and there is a known difference between the two, the argument is actually already from earlier replaced by 45(2) constant. So, irange::union_ adds the [45, 45] range to the list of subranges, but voila, 45 is 0x2d and satisfies the stored MASK 0xffffffffffffc000 VALUE 0x2d and so the semantic bitmask changed to from MASK 0xffffffffffffffff VALUE 0x0 to MASK 0xffffffffffffc000 VALUE 0x2d by that addition. Eventually, we just optimize this to [irange] long long unsigned int [45, 45] because that is the only range which satisfies the bitmask. And that is wrong, at runtime i_20 has value 0. The following patch attempts to detect this case where get_bitmask turns some non-VARYING m_bitmask into VARYING one because of a conflict and in that case makes sure m_bitmask is actually updated rather than unmodified, so that later union_ doesn't cause problems. I also wonder whether e.g. get_bitmask couldn't have special case for this and if bm.intersect (m_bitmask); yields unknown_p from something not originally unknown_p, perhaps chooses to just use get_bitmask_from_range value and ignore the stored m_bitmask. Though, dunno how union_bitmask in that case would figure out it needs to update m_bitmask. 2025-03-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/118953 * value-range.cc (irange::union_bitmask): Update m_bitmask if get_bitmask () is unknown_p and m_bitmask is not even when the semantic bitmask didn't change and returning false. * gcc.dg/torture/pr118953.c: New test.
-
Tomasz Kamiński authored
Fix regression introduced by r14-8710-g65b4cba9d6a9ff PR libstdc++/119121 libstdc++-v3/ChangeLog: * include/bits/ranges_util.h (__detail::__pair_like_convertible_from): Use `_Tp` in `is_reference_v` check * testsuite/std/ranges/subrange/tuple_like.cc: New tests for pair-like conversion Reviewed-by:
Jonathan Wakely <jwakely@redhat.com>
-
Richard Biener authored
For strict-alignment targets we can end up with BLKmode single-element array types when the element type is unaligned. This confuses type checking since the canonical type would have an aligned element type and a non-BLKmode mode. The following simply ignores the mode we assign to array types for this purpose, like we already do for record and union types. PR middle-end/97323 * tree.cc (gimple_canonical_types_compatible_p): Ignore TYPE_MODE also for ARRAY_TYPE. (verify_type): Likewise. * gcc.dg/pr97323.c: New testcase.
-
Tomasz Kamiński authored
ChangeLog: * MAINTAINERS: Add myself.
-
Giuseppe D'Angelo authored
This commit adds support for C++26's constexpr specialized memory algorithms, introduced by P2283R2, P3508R0, P3369R0. The uninitialized_default, value, copy, move and fill algorithms are affected, in all of their variants (iterator-based, range-based and _n versions.) The changes are mostly mechanical -- add `constexpr` to a number of signatures when compiling in C++26 and above modes. The internal helper guard class for range algorithms instead can be marked unconditionally. uninitialized_default_construct is implemented in terms of the _Construct_novalue helper, which requires support for C++26's constexpr placement new from the compiler (P2747R2, which GCC implements). We can simply mark it as constexpr in C++26 language modes, even if the compiler does not support P2747R2 (e.g. Clang 17/18), because C++23's P2448R2 makes it OK to mark functions as constexpr even if they never qualify, and other compilers implement this. The only "real" change to the implementation of the algorithms is that during constant evaluation I need to dispatch to a constexpr-friendly version of them. For each algorithm family I've added only one test to cover it and its variants; the idea is to avoid too much repetition and simplify future maintenance. libstdc++-v3/ChangeLog: * include/bits/ranges_uninitialized.h: Mark the specialized memory algorithms as constexpr in C++26. Also mark the members of the _DestroyGuard helper class. * include/bits/stl_uninitialized.h: Ditto. * include/bits/stl_construct.h: (_Construct_novalue) Mark it as constexpr in C++26. * include/bits/version.def (raw_memory_algorithms): Bump the feature-testing macro for C++26. * include/bits/version.h: Regenerate. * testsuite/20_util/headers/memory/synopsis.cc: Add constexpr to the uninitialized_* algorithms (when in C++26) in the test. * testsuite/20_util/specialized_algorithms/feature_test_macro.cc: New test. * testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc: New test. * testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc: New test. * testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc: New test. * testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc: New test. * testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc: New test. Reviewed-by:
Patrick Palka <ppalka@redhat.com>
-
Andre Vehreschild authored
PR fortran/104684 gcc/fortran/ChangeLog: * trans-array.cc (gfc_conv_expr_descriptor): Look at the lang-specific akind and do a view convert when only the akind attribute differs between pointer and allocatable array. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/ptr_comp_6.f08: New test.
-