- Mar 10, 2023
-
-
Ju-Zhe Zhong authored
Hi, maybe_gen_insn can currently expand at most 9 operands (nops). For the RVV intrinsics I need to extend it to 10; otherwise I would have to use GEN_FCN directly. This patch is a fairly obvious change. OK for trunk? Thanks. gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (function_expander::use_ternop_insn): Use maybe_gen_insn instead. (function_expander::use_widen_ternop_insn): Ditto. * optabs.cc (maybe_gen_insn): Extend nops handling.
-
Ju-Zhe Zhong authored
gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Split indexed load patterns according to RVV ISA. * config/riscv/vector-iterators.md: New iterators. * config/riscv/vector.md (@pred_indexed_<order>load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Remove. (@pred_indexed_<order>load<mode>_same_eew): New pattern. (@pred_indexed_<order>load<mode>_x2_greater_eew): Ditto. (@pred_indexed_<order>load<mode>_x4_greater_eew): Ditto. (@pred_indexed_<order>load<mode>_x8_greater_eew): Ditto. (@pred_indexed_<order>load<mode>_x2_smaller_eew): Ditto. (@pred_indexed_<order>load<mode>_x4_smaller_eew): Ditto. (@pred_indexed_<order>load<mode>_x8_smaller_eew): Ditto. (@pred_indexed_<order>load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Remove. (@pred_indexed_<order>load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto. (@pred_indexed_<order>load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto. (@pred_indexed_<order>load<VNX16_QHS:mode><VNX16_QHSI:mode>): Ditto. (@pred_indexed_<order>load<VNX32_QH:mode><VNX32_QHI:mode>): Ditto. (@pred_indexed_<order>load<VNX64_Q:mode><VNX64_Q:mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/merge_constraint-1.c: New test.
-
Michael Collison authored
* tree-vect-loop-manip.cc (vect_do_peeling): Use result of constant_lower_bound instead of vf for the lower bound of the epilog loop trip count.
-
Jason Merrill authored
The code for handling signed + typedef was breaking on __int128_t, because it isn't a proper typedef: it doesn't have DECL_ORIGINAL_TYPE. PR c++/108099 gcc/cp/ChangeLog: * decl.cc (grokdeclarator): Handle non-typedef typedef_decl. gcc/testsuite/ChangeLog: * g++.dg/ext/int128-7.C: New test.
-
Jason Merrill authored
PR c++/108542 gcc/cp/ChangeLog: * class.cc (instantiate_type): Strip location wrapper. gcc/testsuite/ChangeLog: * g++.dg/contracts/contracts-err1.C: New test.
-
GCC Administrator authored
-
- Mar 09, 2023
-
-
Jason Merrill authored
The optimization to reuse the same allocator temporary for all string constructor calls was breaking on this testcase, because the temps were already in the argument to build_vec_init, and replacing them with references to one slot got confused with calls at multiple levels (for the initializer_list backing array, and then again for the array member of the std::array). Fixed by reusing the whole TARGET_EXPR instead of pulling out the slot; gimplification ensures that it's only initialized once. I also moved the check for initializing a std:: class down into the tree walk, and handle multiple temps within a single array element initialization. PR c++/108773 gcc/cp/ChangeLog: * init.cc (find_allocator_temps_r): New. (combine_allocator_temps): Replace find_allocator_temp. (build_vec_init): Adjust. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/initlist-array18.C: New test. * g++.dg/cpp0x/initlist-array19.C: New test.
-
David Malcolm authored
There are various -Wanalyzer-null-dereference false positives in bugzilla that I've been attempting to fix. Unfortunately I haven't made much progress, but it seems worth at least capturing the reduced reproducers as test cases, to make it easier to spot changes in behavior. gcc/testsuite/ChangeLog: PR analyzer/102671 PR analyzer/105755 PR analyzer/108251 PR analyzer/108400 * gcc.dg/analyzer/null-deref-pr102671-1.c: New test, reduced from Emacs. * gcc.dg/analyzer/null-deref-pr102671-2.c: Likewise. * gcc.dg/analyzer/null-deref-pr105755.c: Likewise. * gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c: New test, reduced from haproxy's src/ssl_sample.c. * gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c: Likewise. * gcc.dg/analyzer/null-deref-pr108400-SoftEtherVPN-WebUi.c: New test, reduced from SoftEtherVPN's src/Cedar/WebUI.c. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
Tamar Christina authored
When doing an emergency dump, the CFG graph dumps are corrupted because the closing "}" is missing. Normally the pass manager calls finish_graph_dump_file when it finishes, which emits this brace; the call lives there because each pass can dump multiple digraphs. During an emergency dump, however, we only dump the current function and never go back to the pass manager afterwards. As such, we need to call finish_graph_dump_file manually in order to properly finish off graph generation. With this, -fdump-tree-*-graph works properly during a crash dump. gcc/ChangeLog: * passes.cc (emergency_dump_function): Finish graph generation.
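The missing-brace problem can be sketched in a few lines. This is only an illustration under stated assumptions: `GraphDump`, `start`, `dump_function`, `finish` and `emergency_dump` are hypothetical names, not GCC's dump API. The point is that whichever path opens the digraph also has to close it when control never returns to the normal owner.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a graph dump file; not GCC's real interface.
class GraphDump {
public:
    // Open the top-level digraph once per dump file.
    void start() {
        out_ << "digraph \"dump\" {\n";
        open_ = true;
    }

    // Each dumped function contributes one cluster; several may share a file.
    void dump_function(const std::string &name) {
        out_ << "  subgraph \"cluster_" << name << "\" { }\n";
    }

    // Emit the closing brace exactly once.
    void finish() {
        if (open_) {
            out_ << "}\n";
            open_ = false;
        }
    }

    std::string text() const { return out_.str(); }

private:
    std::ostringstream out_;
    bool open_ = false;
};

// Emergency path: nothing else will close the graph later, so the dump
// has to be finished here before returning.
std::string emergency_dump(const std::string &current_function) {
    GraphDump g;
    g.start();
    g.dump_function(current_function);
    g.finish();  // without this call the output lacks its final "}"
    return g.text();
}
```

Calling `emergency_dump("main")` yields text whose braces balance and whose last line is the closing `}`.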
-
Tamar Christina authored
We were analyzing code quality after recent changes and noticed that the tbz support somehow increased the overall number of branches rather than decreasing it. While investigating, we found that the problem arises when an existing & <constant> is present in gimple and the instruction is generated because of the range information obtained from the ANDed constant: we then end up with a NOP AND in the RTL expansion. This is normally not a problem, as CSE will take care of it. The issue is when the original AND was done in a location where PRE or FRE "lift" the AND to a different basic block. This triggers a problem when the resulting value is not single use. Instead of having an AND and tbz, we end up generating an AND + TST + BR if the mode is HI or QI. This CSE across BBs was a problem before, but this change made it worse. Our branch patterns rely on combine being able to fold ANDs or zero_extends into the instructions. To work around this (since a proper fix is outside the scope of stage 4) we are limiting the new tbranch optab to HI and QI mode values only. This isn't a problem because these are the two modes for which we don't have CBZ support, so they are the problematic cases to begin with. Additionally, booleans are QI. The second thing we're doing is limiting the only legal bitpos to pos 0, i.e. only the bottom bit. This is so that we prevent the double ANDs as much as possible. Most other cases, i.e. where we had an explicit & in the source code, are still handled correctly by the anonymous (*tb<optab><ALLI:mode><GPI:mode>1) pattern that was added along with tbranch support. This means we don't expand the superfluous AND here, and while it doesn't fix the problem that in the cross-BB case we lose tbz, it also doesn't make things worse. With these tweaks we've now reduced the number of insns uniformly, as was originally expected.
gcc/ChangeLog: * config/aarch64/aarch64.md (tbranch_<code><mode>3): Restrict to SHORT and bottom bit only. gcc/testsuite/ChangeLog: * gcc.target/aarch64/tbz_2.c: New test. * gcc.target/aarch64/tbz_3.c: New test.
-
Patrick Palka authored
The LWG 3820 testcase revealed a bug in _M_advance, which this patch also fixes. libstdc++-v3/ChangeLog: * include/std/ranges (cartesian_product_view::_Iterator::_Iterator): Remove constraint on default constructor as per LWG 3849. (cartesian_product_view::_Iterator::_M_prev): Adjust position of _Nm > 0 test as per LWG 3820. (cartesian_product_view::_Iterator::_M_advance): Perform bounds checking only on sized cartesian products. * testsuite/std/ranges/cartesian_product/1.cc (test08): New test.
-
Patrick Palka authored
PR libstdc++/109024 libstdc++-v3/ChangeLog: * include/std/ranges (chunk_by_view::_M_pred): Remove DMI as per LWG 3796. (repeat_view::_M_pred): Likewise. * testsuite/std/ranges/adaptors/chunk_by/1.cc (test03): New test. * testsuite/std/ranges/repeat/1.cc (test05): New test.
-
Patrick Palka authored
PR libstdc++/108362 libstdc++-v3/ChangeLog: * include/std/ranges (__detail::__can_single_view): New concept. (_Single::operator()): Constrain it. Move [[nodiscard]] to the end of the function declarator. (__detail::__can_iota_view): New concept. (_Iota::operator()): Constrain it. Move [[nodiscard]] to the end of the function declarator. (__detail::__can_istream_view): New concept. (_Istream::operator()): Constrain it. Move [[nodiscard]] to the end of the function declarator. * testsuite/std/ranges/iota/iota_view.cc (test07): New test. * testsuite/std/ranges/istream_view.cc (test08): New test. * testsuite/std/ranges/single_view.cc (test07): New test.
-
Andrew Pinski authored
The problem here is that after r13-4748-g2a27ae32fabf85, in some cases we were calling inform without a corresponding warning. This changes the logic so that we only call inform if a warning was issued beforehand. Changes since v1: * Fix formatting and dump message as suggested by Jakub. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/108980 * gimple-array-bounds.cc (array_bounds_checker::check_array_ref): Reorganize the call to warning for not strict flexible arrays to be before the check of warned.
-
Patrick Palka authored
ranges::begin() isn't guaranteed to be equality-preserving for non-forward ranges, so in cartesian_product_view::end we need to avoid needlessly calling begin() on the first range (which could be non-forward) in the case where __empty_tail is false as per its specification. Since we're already using a variadic lambda to compute __empty_tail, we might as well use that same lambda to build up the tuple of iterators instead of building it separately via e.g. std::apply or __tuple_transform. PR libstdc++/107572 libstdc++-v3/ChangeLog: * include/std/ranges (cartesian_product_view::end): When building the tuple of iterators, avoid calling ranges::begin on the first range if __empty_tail is false. * testsuite/std/ranges/cartesian_product/1.cc (test07): New test.
-
Jonathan Wakely authored
libstdc++-v3/ChangeLog: PR libstdc++/108882 * config/os/gnu-linux/ldbl-ieee128-extra.ver: Fix incorrect patterns.
-
Jason Merrill authored
The standard was unclear what happens with the transformation of a deduction guide if the initial template argument deduction fails for a reason other than not deducing all the arguments; my implementation assumed that the right thing was to give up on the deduction guide. But in consideration of CWG2664 this week I realized that we get a better result by just continuing with an empty set of deductions, so the alias deduction guide is the same as the original deduction guide plus the deducible constraint. DR 2664 PR c++/102529 gcc/cp/ChangeLog: * pt.cc (alias_ctad_tweaks): Continue after deduction failure. gcc/testsuite/ChangeLog: * g++.dg/DRs/dr2664.C: New test. * g++.dg/cpp2a/class-deduction-alias15.C: New test.
-
Jason Merrill authored
In my initial implementation of alias CTAD, I described a couple of differences from the specification that I thought would not have a practical effect; this testcase demonstrates that I was wrong. One difference is resolved by the CPTK_IS_DEDUCIBLE commit; the other (adding too many of the alias template parameters to the new deduction guide) is fixed by this patch. PR c++/105841 gcc/cp/ChangeLog: * pt.cc (corresponding_template_parameter_list): Split out... (corresponding_template_parameter): ...from here. (find_template_parameters): Factor out... (find_template_parameter_info::find_in): ...this function. (find_template_parameter_info::find_in_recursive): New. (find_template_parameter_info::found): New. (alias_ctad_tweaks): Only add parms used in the deduced args. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/class-deduction-alias14.C: New test. Co-authored-by:
Michael Spertus <mike@spertus.com>
-
Jason Merrill authored
I want to have more discussion about the interface before claiming the __is_deducible name, so for GCC 13 make it internal-only. gcc/ChangeLog: * doc/extend.texi: Comment out __is_deducible docs. gcc/cp/ChangeLog: * cp-trait.def (IS_DEDUCIBLE): Add space to name. gcc/testsuite/ChangeLog: * g++.dg/ext/is_deducible1.C: Guard with __has_builtin (__is_deducible).
-
Jason Merrill authored
C++20 class template argument deduction for an alias template involves adding a constraint that the template arguments for the alias template can be deduced from the return type of the deduction guide for the underlying class template. In the standard, this is modeled as defining a class template with a partial specialization, but it's much more efficient to implement with a trait that directly tries to perform the deduction. The first argument to the trait is a template rather than a type, so various places needed to be adjusted to accommodate that. PR c++/105841 gcc/ChangeLog: * doc/extend.texi (Type Traits):: Document __is_deducible. gcc/cp/ChangeLog: * cp-trait.def (IS_DEDUCIBLE): New. * cxx-pretty-print.cc (pp_cxx_trait): Handle non-type. * parser.cc (cp_parser_trait): Likewise. * tree.cc (cp_tree_equal): Likewise. * pt.cc (tsubst_copy_and_build): Likewise. (type_targs_deducible_from): New. (alias_ctad_tweaks): Use it. * semantics.cc (trait_expr_value): Handle CPTK_IS_DEDUCIBLE. (finish_trait_expr): Likewise. * constraint.cc (diagnose_trait_expr): Likewise. * cp-tree.h (type_targs_deducible_from): Declare. gcc/testsuite/ChangeLog: * g++.dg/ext/is_deducible1.C: New test.
-
Costas Argyris authored
Compile a resource object that contains the utf8 manifest. Then link that object into the driver and compiler proper. For compiler proper the link has to be forced because the resource object file gets into a static library (libbackend.a) and gets eventually dropped because it has no symbols of its own and nothing is referencing it inside the library. Therefore, an artificial symbol is planted to force the link. gcc/ChangeLog: PR driver/108865 * config.host: add object for x86_64-*-mingw*. * config/i386/sym-mingw32.cc: dummy file to attach symbol. * config/i386/utf8-mingw32.rc: windres resource file. * config/i386/winnt-utf8.manifest: XML manifest to enable UTF-8. * config/i386/x-mingw32: reference to x-mingw32-utf8. * config/i386/x-mingw32-utf8: Makefile fragment to embed UTF-8 manifest. Signed-off-by:
Jonathan Yong <10walls@gmail.com>
-
Vladimir N. Makarov authored
LRA is too conservative in calculation of conflicts with clobbered regs by using the biggest access mode. This results in failure of possible reg coalescing and worse code. This patch solves the problem. PR rtl-optimization/108999 gcc/ChangeLog: * lra-constraints.cc (process_alt_operands): Use operand modes for clobbered regs instead of the biggest access mode. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr108999.c: New.
-
Richard Biener authored
The following plugs one place in extract_muldiv where it should avoid folding when sanitizing overflow. PR middle-end/108995 * fold-const.cc (extract_muldiv_1): Avoid folding (CST * b) / CST2 when sanitizing overflow and we rely on overflow being undefined. * gcc.dg/ubsan/pr108995.c: New testcase.
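Why such a fold matters under overflow sanitizing can be sketched with concrete constants. The choice of `(x * 4) / 2` folding to `x * 2` is an assumption made for illustration, and the helper names are hypothetical. The multiply in the source expression can overflow for inputs where the folded multiply cannot, so folding would hide an overflow the sanitizer is supposed to report.

```cpp
#include <cassert>
#include <climits>

// True if x * c does not fit in int. The product is computed in a wider
// type so the check itself has no undefined behavior (assumes long long
// is wider than int).
bool mul_overflows_int(int x, int c) {
    long long wide = static_cast<long long>(x) * c;
    return wide > INT_MAX || wide < INT_MIN;
}

// Source form: (x * 4) / 2. The multiply by 4 is what can overflow.
bool original_overflows(int x) { return mul_overflows_int(x, 4); }

// Folded form: x * 2. Valid only if signed overflow is assumed not to happen.
bool folded_overflows(int x) { return mul_overflows_int(x, 2); }
```

For `x = INT_MAX / 4 + 1` the original multiply overflows while the folded one does not, which is exactly the case an overflow sanitizer would miss after folding.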
-
Jakub Jelinek authored
The following testcase is reduced from a miscompilation of the scipy package. If we have say lhs = [1., 1.] - [1., 1.] and want to compute the range of lhs from it, we correctly determine it is [0., 0.] (if computations are exact, we generally don't try to round them further in frange_arithmetic). In the testcase it is about a reverse operation, [1., 1.] = op1 + [1., 1.] and we want to compute the range of op1 from that. Right now we just perform the inverse operation (there are some corner cases about NaN and infinities handling) and so arrive at range [0., 0.] as well, and because it is a singleton, optimize return eps; to return 0. That is incorrect though: for the reverse ops we also need to take rounding into account, and the right exact range is [-0x1.0p-54, 0x1.0p-53] in this case when rounding to nearest, i.e. all numbers which added to 1. with round to nearest still produce 1. The problem isn't limited to singleton ranges, nor to results around zero. We basically need to consider also values where the result is up to 0.5ulp away from the lhs range boundaries in each direction. The following patch fixes it by extending the lhs range for the reverse operations by 1ulp in each direction. The PR contains a pseudo-random test generator I've used to generate 300000 tests of + and - and then used the same test with * and / instead of + and - together with a hack to print the discovered ranges by the patch in a form that another test could then verify the range is conservatively correct and how far it is from a minimal range. I believe the results are good enough for now, though I plan to look incrementally into trying to do something better on the -XXX_MAX or XXX_MAX boundaries (where I think frange_nextafter will use -inf or +inf) and also try to increase the range just by 0.5ulp rather than 1ulp if !flag_rounding_math. But I don't know if either of those will be doable and will pass the testing, so I think it is worth committing this fix first.
2023-03-09 Jakub Jelinek <jakub@redhat.com> Richard Biener <rguenther@suse.de> PR tree-optimization/109008 * range-op-float.cc (float_widen_lhs_range): New function. (foperator_plus::op1_range, foperator_minus::op1_range, foperator_minus::op2_range, foperator_mult::op1_range, foperator_div::op1_range, foperator_div::op2_range): Use it. * gcc.c-torture/execute/ieee/pr109008.c: New test.
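The quoted range [-0x1.0p-54, 0x1.0p-53] can be checked directly. The sketch below assumes IEEE 754 binary64 `double` and the default round-to-nearest-even mode; the helper names are made up for illustration. Both endpoints are exact ties that round back to 1.0, while values one binade further out no longer do.

```cpp
#include <cassert>
#include <cmath>

// True if 1.0 + eps rounds back to exactly 1.0.
// Assumes IEEE double and round-to-nearest-even.
bool absorbed_by_one(double eps) {
    volatile double sum = 1.0 + eps;  // volatile forces rounding to double
    return sum == 1.0;
}

// Endpoints of the range quoted in the commit message.
double upper_endpoint() { return std::ldexp(1.0, -53); }   // 0x1.0p-53
double lower_endpoint() { return -std::ldexp(1.0, -54); }  // -0x1.0p-54

// Nearby values that do change the sum.
double above_range() { return std::ldexp(1.0, -52); }   // one ulp of 1.0
double below_range() { return -std::ldexp(1.0, -53); }  // one ulp below 1.0
```

So `op1` in `[1., 1.] = op1 + [1., 1.]` is not forced to be zero: `absorbed_by_one` holds at both endpoints, which is why treating the inverse result [0., 0.] as a singleton is wrong.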
-
Hongyu Wang authored
When OMP_WAIT_POLICY is not specified, the current implementation leaves the ICV flag GOMP_ICV_WAIT_POLICY unset, so the global variable wait_policy keeps its uninitialized value. Initialize it to -1 to make the GOMP_SPINCOUNT behavior consistent with its description. libgomp/ChangeLog: PR libgomp/109062 * env.c (wait_policy): Initialize to -1. (initialize_icvs): Initialize icvs->wait_policy to -1. * testsuite/libgomp.c-c++-common/pr109062.c: New test.
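The role of the -1 sentinel can be sketched as a tri-state. This is a hedged illustration, not libgomp's code: `default_spin_count` is a hypothetical helper, the encoding (-1 unset, 0 passive, 1 active) is an assumption, and the constants follow the GOMP_SPINCOUNT defaults as I understand the libgomp documentation (30 billion for active, 300,000 when unset, 0 for passive).

```cpp
#include <cassert>

// Assumed encoding: -1 = OMP_WAIT_POLICY not specified, 0 = passive, 1 = active.
constexpr int kWaitUnset = -1;
constexpr int kWaitPassive = 0;
constexpr int kWaitActive = 1;

// Hypothetical helper choosing the default spin count from the wait policy.
unsigned long long default_spin_count(int wait_policy) {
    if (wait_policy > 0)
        return 30000000000ULL;  // active: spin for a long time
    if (wait_policy < 0)
        return 300000ULL;       // unset: spin briefly before blocking
    return 0ULL;                // passive: do not spin
}
```

Under this encoding a variable left at 0 is indistinguishable from an explicit passive policy, which is why "unset" needs its own value.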
-
GCC Administrator authored
-
- Mar 08, 2023
-
-
Tobias Burnus authored
libgomp/ChangeLog: * libgomp.texi (Offload-Target Specifics): Mention GCN_STACK_SIZE.
-
Kewen Lin authored
As PR108727 shows, when cleanup code called by the stack unwinder calls the function _Unwind_Resume, it goes via a PLT stub like:
  function 00000000.plt_call._Unwind_Resume:
  => 0x0000000010003580 <+0>: std r2,40(r1)
     0x0000000010003584 <+4>: ld r12,-31760(r2)
     0x0000000010003588 <+8>: mtctr r12
     0x000000001000358c <+12>: ld r2,-31752(r2)
     0x0000000010003590 <+16>: cmpldi r2,0
     0x0000000010003594 <+20>: bnectr+
     0x0000000010003598 <+24>: b 0x100031a4 <_Unwind_Resume@plt>
It wants to save the TOC base (r2) to r1 + 40, but we only bump the stack segment by 32 bytes, as follows:
  stdu %r29,-32(%r3)
This means the access is outside the stack segment allocated by __generic_morestack; if the touched area isn't writable, as in this failure, it causes a segmentation fault. So fix the bump size with a reasonable value, PARAMS. PR libgcc/108727 libgcc/ChangeLog: * config/rs6000/morestack.S (__morestack): Use PARAMS for new stack bump size.
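The arithmetic behind the failure is small enough to write down. This is a simplified model under stated assumptions: `store_fits` is a hypothetical helper, and the region is treated as just the `bump` bytes above the new stack pointer. `std` stores a doubleword, so the access covers bytes 40 through 47.

```cpp
#include <cassert>
#include <cstddef>

// True if a store of `size` bytes at `offset` from the new stack pointer
// stays within a region of `bump` bytes (simplified model).
constexpr bool store_fits(std::size_t bump, std::size_t offset, std::size_t size) {
    return offset + size <= bump;
}

// std r2,40(r1): an 8-byte store at offset 40 does not fit in a 32-byte bump.
static_assert(!store_fits(32, 40, 8), "TOC save at r1+40 overruns a 32-byte bump");
```

Any bump of at least 48 bytes satisfies `store_fits(bump, 40, 8)`; the value the patch actually uses is PARAMS, which is not restated here.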
-
Kewen Lin authored
According to Haochen's finding in [1], ppc-fortran.exp currently doesn't support Fortran-specific warning or error messages well. Looking into it, this is because gfortran uses different warning/error prefixes:
  set gcc_warning_prefix "\[Ww\]arning:"
  set gcc_error_prefix "(Fatal )?\[Ee\]rror:"
compared to:
  set gcc_warning_prefix "warning:"
  set gcc_error_prefix "(fatal )?error:"
So this overrides these two prefixes to make dg-{warning,error} checks work. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613302.html gcc/testsuite/ChangeLog: * gcc.target/powerpc/ppc-fortran/ppc-fortran.exp: Override gcc_{warning,error}_prefix with Fortran specific one used in gfortran_init.
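The effect of the two prefix sets can be tried out with ordinary regular expressions. This sketch uses `std::regex` (ECMAScript syntax) rather than Tcl, which is an assumption that holds for these simple patterns; `matches` and the sample diagnostic lines are made up for illustration. In the Tcl source the brackets are backslash-escaped only to stop command substitution, so the regex itself is `[Ww]arning:`.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if `pattern` matches anywhere in `line`.
bool matches(const std::string &pattern, const std::string &line) {
    return std::regex_search(line, std::regex(pattern));
}

// Prefixes used for C-family diagnostics.
const char *const kDefaultWarning = "warning:";
const char *const kDefaultError = "(fatal )?error:";

// Prefixes quoted above for gfortran diagnostics.
const char *const kFortranWarning = "[Ww]arning:";
const char *const kFortranError = "(Fatal )?[Ee]rror:";
```

A capitalized line such as `Warning: Unused variable` is matched by `kFortranWarning` but not by `kDefaultWarning`, which is why the checks failed before the override.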
-
Kewen Lin authored
Test cases scalar-test-data-class-1[45].c use the type __int128, which requires checking the int128 effective target; otherwise they fail at -m32. This patch adds the int128 effective target requirement. gcc/testsuite/ChangeLog: * gcc.target/powerpc/bfp/scalar-test-data-class-14.c: Adjust with int128 effective target requirement. * gcc.target/powerpc/bfp/scalar-test-data-class-15.c: Likewise.
-
Kewen Lin authored
Two test cases, scalar-test-data-class-12.c and vec-test-data-class-9.c, fail on Power9 BE testing at -m32 because they use the built-in function scalar_insert_exp, which requires powerpc64 support. This patch makes them check the has_arch_ppc64 effective target requirement. PR testsuite/108729 gcc/testsuite/ChangeLog: * gcc.target/powerpc/bfp/scalar-test-data-class-12.c: Adjust with has_arch_ppc64 effective target. * gcc.target/powerpc/bfp/vec-test-data-class-9.c: Likewise.
-
Kewen Lin authored
The built-in function scalar_test_neg_qp is under stanza ieee128-hw, that is TARGET_FLOAT128_HW. We don't have float128 hardware support on 32-bit, as the following shows:
  if (TARGET_FLOAT128_HW && !TARGET_64BIT)
    {
      if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0)
        error ("%qs requires %qs", "%<-mfloat128-hardware%>", "-m64");
      rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
    }
So adjust the case with the lp64 effective target accordingly. PR testsuite/108730 gcc/testsuite/ChangeLog: * gcc.target/powerpc/bfp/scalar-test-neg-8.c: Adjust with lp64 effective target requirement.
-
Kewen Lin authored
Compiled with cpu type Power9 or later, GCC generates xxspltib rather than vspltis*, so adjust the test case scanning content accordingly. PR testsuite/108813 gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr101384-2.c: Adjust with xxspltib.
-
Kewen Lin authored
On BE, the extracted index for the leftmost element is 0 rather than 1, adjust the test case accordingly. PR testsuite/108810 gcc/testsuite/ChangeLog: * gcc.target/powerpc/fold-vec-extract-double.p9.c (testd_cst): Adjust the extracted index for BE.
-
Jeff Law authored
The mips msa-ds.c test is trying to ensure that MSA branches can have their delay slots filled. The regexp it used looked for the function name, a nop, then the function name again. If it found that sequence, the test failed. The problem is that with Vlad's recent IRA work there's simply less code in the test (good) and as a result one of the *other* branches in the test had an unfilled delay slot -- the delay slot for the MSA branch was still being filled. This patch tightens up the regexp. In particular it looks for the MSA branch and a nop on the next line (avoiding the over-eager .* construct). That indicates that the MSA branch did not have its delay slot filled. When that sequence is found, the test fails. This fixes the recent regressions for mips64 and mips64el in the tester. Installing on the trunk. gcc/testsuite: * gcc.target/mips/msa-ds.c: Fix over eager pattern matching.
-
Hans-Peter Nilsson authored
The recently added tests missed checking for "fopenmp" (see other tests where "-fopenmp" is passed), which makes them fail on non-openmp systems. * gcc.dg/analyzer/omp-parallel-for-get-min.c, gcc.dg/analyzer/omp-parallel-for-1.c: Require effective target fopenmp.
-
GCC Administrator authored
-
- Mar 07, 2023
-
-
Jonathan Grant authored
gcc/ChangeLog PR sanitizer/81649 * doc/invoke.texi (Instrumentation Options): Clarify LeakSanitizer behavior.
-
Benson Muite authored
gcc/ChangeLog * doc/install.texi (Prerequisites): Add link to gmplib.org.
-
Jason Merrill authored
A missed piece of the patch for static operator(): in tsubst_function_decl, we don't want to replace the first parameter with a new closure pointer if operator() is static. PR c++/108526 PR c++/106651 gcc/cp/ChangeLog: * pt.cc (tsubst_function_decl): Don't replace the closure parameter if DECL_STATIC_FUNCTION_P. gcc/testsuite/ChangeLog: * g++.dg/cpp23/static-operator-call5.C: Pass -g.
-