- Aug 24, 2023
-
-
Tobias Burnus authored
Before commit r12-5295-g47de0b56ee455e, all gimple_build_cond in expand_omp_for_* were inserted with gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT); except the one dealing with the multiplicative factor, which was inserted with gsi_insert_after (gsi, cond_stmt, GSI_CONTINUE_LINKING); That commit for PR103208 fixed the issue of some missing regimplify of operands of GIMPLE_CONDs by moving the condition handling to the new function expand_omp_build_cond. That function has a 'bool after = false' argument to switch between the two variants; however, all callers omitted this argument. This commit reinstates the prior behavior by passing 'true' for the factor != 0 condition, fixing the included testcase. PR middle-end/111017 gcc/ * omp-expand.cc (expand_omp_for_init_vars): Pass after=true to expand_omp_build_cond for 'factor != 0' condition, resulting in pre-r12-5295-g47de0b56ee455e code for the gimple insert. libgomp/ * testsuite/libgomp.c-c++-common/non-rect-loop-1.c: New test. (cherry picked from commit 1dc65003)
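For readers unfamiliar with the term, the sketch below shows the kind of non-rectangular loop nest that expand_omp_for_* has to handle. It is not the committed testcase; the function name count_triangular_iterations is hypothetical and the loop shape is only an assumed, minimal instance of the feature.

```cpp
#include <cassert>

// Counts the iterations of a triangular (non-rectangular) loop nest.
// The inner loop's lower bound depends on the outer iterator, which
// OpenMP 5.0 allows under collapse(2). Without -fopenmp the pragma is
// ignored and the loops run serially with the same result.
long count_triangular_iterations(int n)
{
  assert(n >= 0);
  long count = 0;
#pragma omp parallel for collapse(2) reduction(+ : count)
  for (int i = 0; i < n; ++i)
    for (int j = i; j < n; ++j)
      ++count;
  return count;
}
```

For a given n the nest runs n * (n + 1) / 2 times, so comparing the returned count against that closed form is a simple way to check that the collapsed iteration space was computed correctly.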
-
Richard Biener authored
We now got test coverage for non-SSA name bits so the following amends the SSA_NAME_OCCURS_IN_ABNORMAL_PHI checks. PR tree-optimization/111070 * tree-ssa-ifcombine.cc (ifcombine_ifandif): Check we have an SSA name before checking SSA_NAME_OCCURS_IN_ABNORMAL_PHI. * gcc.dg/pr111070.c: New testcase. (cherry picked from commit 966b0a96)
-
Richard Biener authored
The following guards the bit test merging code in if-combine against the appearance of SSA names used in abnormal PHIs. PR tree-optimization/111039 * tree-ssa-ifcombine.cc (ifcombine_ifandif): Check for SSA_NAME_OCCURS_IN_ABNORMAL_PHI. * gcc.dg/pr111039.c: New testcase. (cherry picked from commit 482551a7)
-
Richard Biener authored
The following fixes a bad choice in representing things to the alias oracle by LIM which while correct in pieces is inconsistent with itself. When canonicalizing a ref to a bare deref instead of leaving the base object and the extracted offset the same and just substituting an alternate ref the following replaces the base and the offset as well, avoiding the confusion that otherwise will arise in aliasing_matching_component_refs_p. PR tree-optimization/111019 * tree-ssa-loop-im.cc (gather_mem_refs_stmt): When canonicalizing also scrap base and offset in case the ref is indirect. * g++.dg/torture/pr111019.C: New testcase. (cherry picked from commit 745ec213)
-
Richard Biener authored
Sometimes IVOPTs chooses a weird induction variable which downstream leads to issues. Most of the time we can fend those off during costing by rejecting the candidate, but it looks like the address description that costing synthesizes is different from what we end up generating, so the following fixes things up at code generation time. Specifically we avoid the create_mem_ref_raw fallback which uses a literal zero address base with the actual base in index2. For the case in question we have the address

    type = unsigned long
    offset = 0
    elements = {
      [0] = &e * -3,
      [1] = (sizetype) a.9_30 * 232,
      [2] = ivtmp.28_44 * 4
    }

from which we code generate the problematic

    _3 = MEM[(long int *)0B + ivtmp.36_9 + ivtmp.28_44 * 4];

which references the object at address zero. The patch below recognizes the fallback after the fact and transforms the TARGET_MEM_REF memory reference into a LEA for which this form isn't problematic:

    _24 = &MEM[(long int *)0B + ivtmp.36_34 + ivtmp.28_44 * 4];
    _3 = *_24;

thereby avoiding the correctness issue. Otherwise we'd later conclude the program terminates at the null pointer dereference and make the function pure, miscompiling the main function of the testcase. PR tree-optimization/110702 * tree-ssa-loop-ivopts.cc (rewrite_use_address): When we created a NULL pointer based access rewrite that to a LEA. * gcc.dg/torture/pr110702.c: New testcase. (cherry picked from commit 13dfb01e)
-
Andrew Pinski authored
The patterns that were added in r13-4620-g4d9db4bdd458 missed that (a > b) and (a <= b) are not inverses of each other for floating point comparisons (if NaNs are supported). Even though there was a check for integral types, it was only for the result of the cond rather than for the type of what is being compared. The fix is to check whether cmp and icmp are inverses of each other by using the invert_tree_comparison function. OK for trunk and GCC 13 branch? Bootstrapped and tested on x86_64-linux-gnu with no regressions. I added the testcase to execute/ieee as it requires support for NAN. PR tree-optimization/111109 gcc/ChangeLog: * match.pd (ior(cond,cond), ior(vec_cond,vec_cond)): Add check to make sure cmp and icmp are inverse. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/ieee/fp-cmp-cond-1.c: New test. (cherry picked from commit 4aa14ec7)
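The floating point point made above can be checked directly. The following is a small sketch, not the committed testcase; the helper names are hypothetical, and it assumes IEEE semantics (no -ffast-math).

```cpp
#include <cassert>
#include <limits>

// Returns true when exactly one of (a > b) and (a <= b) holds,
// i.e. when the two comparisons behave as inverses for these operands.
bool gt_and_le_are_inverse(double a, double b)
{
  const bool gt = a > b;
  const bool le = a <= b;
  return gt != le;
}

// Usage sketch: ordinary operands behave as inverses, a NaN operand
// makes both comparisons false, so they are no longer inverses.
void check_nan_breaks_inverse()
{
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(gt_and_le_are_inverse(1.0, 2.0));
  assert(gt_and_le_are_inverse(2.0, 1.0));
  assert(!gt_and_le_are_inverse(nan, 2.0));
}
```

Because every ordered comparison with a NaN operand is false, a transformation that treats (a <= b) as !(a > b) is only safe when the compared type cannot hold a NaN, which is why the check has to look at the operand type and not at the result type.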
-
Richard Biener authored
The following applies some maintenance with respect to type qualifiers and kinds added by later DWARF standards to prune_unused_types_walk. The particular case in the bug is not handling (thus marking required) all restrict qualified type DIEs. I've found more DW_TAG_*_type that are unhandled, looked up the DWARF docs and added them as well based on common sense. PR debug/111080 * dwarf2out.cc (prune_unused_types_walk): Handle DW_TAG_restrict_type, DW_TAG_shared_type, DW_TAG_atomic_type, DW_TAG_immutable_type, DW_TAG_coarray_type, DW_TAG_unspecified_type and DW_TAG_dynamic_type as to only output them when referenced. * gcc.dg/debug/dwarf2/pr111080.c: New testcase. (cherry picked from commit bd2c4d6d)
-
liuhongt authored
Both "graniterapids-d" and "graniterapids" are attached with PROCESSOR_GRANITERAPIDS in processor_alias_table but mapped to different __cpu_subtype in get_intel_cpu. And get_builtin_code_for_version will try to match the first PROCESSOR_GRANITERAPIDS in processor_alias_table, which maps to "graniterapids" here.

    else if (new_target->arch_specified && new_target->arch > 0)
      for (i = 0; i < pta_size; i++)
        if (processor_alias_table[i].processor == new_target->arch)
          {
            const pta *arch_info = &processor_alias_table[i];
            switch (arch_info->priority)
              {
              default:
                arg_str = arch_info->name;

This mismatch makes dispatch_function_versions check the predicate of __builtin_cpu_is ("graniterapids") for "graniterapids-d" and causes the issue. The patch explicitly adds PROCESSOR_GRANITERAPIDS_D to make a distinction. "alderlake", "raptorlake" and "meteorlake" share the same ISA, cost and tuning, and are mapped to the same __cpu_type/__cpu_subtype in get_intel_cpu, so there is no need to add PROCESSOR_RAPTORLAKE and others. gcc/ChangeLog: * common/config/i386/i386-common.cc (processor_names): Add new member graniterapids-d. * config/i386/i386-options.cc (processor_alias_table): Update PROCESSOR_GRANITERAPIDS_D. (m_GRANITERAPIDS_D): New macro. (m_CORE_AVX512): Add m_GRANITERAPIDS_D. (processor_cost_table): Add icelake_cost for PROCESSOR_GRANITERAPIDS_D. * config/i386/i386.h (enum processor_type): Add new member PROCESSOR_GRANITERAPIDS_D. * config/i386/i386-c.cc (ix86_target_macros_internal): Handle PROCESSOR_GRANITERAPIDS_D. (cherry picked from commit afe15e97)
-
GCC Administrator authored
-
- Aug 23, 2023
-
-
Uros Bizjak authored
Disable (=&r,m,m) alternative for 32-bit targets. The combination of two memory operands (possibly with complex addressing mode), early clobbered output, frame pointer and PIC registers uses too many registers on a register constrained 32-bit target. PR target/111010 gcc/ChangeLog: * config/i386/i386.md (*concat<any_or_plus:mode><dwi>3_3): Disable (=&r,m,m) alternative for 32-bit targets. (*concat<any_or_plus:mode><dwi>3_4): Ditto.
-
GCC Administrator authored
-
- Aug 22, 2023
-
-
Juzhe-Zhong authored
This patch will be backported to GCC 13 and committed to trunk. gcc/ChangeLog: * config/riscv/t-riscv: Add riscv-vsetvl.def
-
Jakub Jelinek authored
As mentioned in the PR, these types are supported in C++ since GCC 13, so we shouldn't confuse users. 2023-08-22 Jakub Jelinek <jakub@redhat.com> PR c++/106652 * doc/extend.texi (_Float<n>): Drop obsolete sentence that the types aren't supported in C++. (cherry picked from commit 145da6a8)
-
Jonathan Wakely authored
std::format was treating {:f} and {:F} identically on the basis that for the fixed 1.234567 format there are no alphabetical characters that need to be in uppercase. But that's wrong for infinities and NaNs, which should be formatted as "INF" and "NAN" for {:F}. libstdc++-v3/ChangeLog: * include/std/format (__format::_Pres_type): Add _Pres_F. (__formatter_fp::parse): Use _Pres_F for 'F'. (__formatter_fp::format): Set __upper for _Pres_F. * testsuite/std/format/functions/format.cc: Check formatting of infinity and NaN for each presentation type. (cherry picked from commit d07bce47)
-
Jonathan Wakely authored
When the library is built with --disable-libstdcxx-dual-abi the only type of std::string supported is the COW string, and the two global std::string objects in tzdb.cc have to allocate memory. I added them thinking they would fit in the SSO string buffer, but that's not the case when the library only uses COW strings. Replace them with string_view objects to avoid any allocations. libstdc++-v3/ChangeLog: * src/c++20/tzdb.cc (tzdata_file, leaps_file): Change type to std::string_view. (cherry picked from commit d82a85b6)
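A minimal sketch of the idea, assuming C++17: a std::string_view that refers to a string literal is just a pointer and a length, so a namespace-scope object of that type needs no allocation regardless of which std::string ABI is in use. The variable names come from the ChangeLog entry, but the literal values and the path_in helper are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Constant-initialized views of string literals: no dynamic allocation
// happens at program startup. The file names here are illustrative.
constexpr std::string_view tzdata_file = "/tzdata.zi";
constexpr std::string_view leaps_file = "/leapseconds";

// Builds a full path on demand. Any allocation happens here, at the
// point of use, and not during static initialization.
std::string path_in(std::string_view dir, std::string_view file)
{
  assert(!dir.empty());
  std::string path(dir);
  path += file;
  return path;
}
```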
-
Jonathan Wakely authored
libstdc++-v3/ChangeLog: * include/experimental/internet (address_v4::to_string): Remove unused parameter name. (cherry picked from commit 8ee74c5a)
-
Jonathan Wakely authored
This constructor should only ever be used with a literal 0 as the argument, so we can make it consteval. This has the nice advantage that it is expanded immediately in the front end, and so GDB will never step into the __cmp_cat::__unseq::__unseq(__unseq*) constructor that is uninteresting and probably confusing to users. libstdc++-v3/ChangeLog: * libsupc++/compare (__cmp_cat::__unseq): Make ctor consteval. * testsuite/18_support/comparisons/categories/zero_neg.cc: Prune excess errors caused by invalid consteval calls. (cherry picked from commit 84cff28f)
-
xuli authored
This patch fixes an issue that happens on GCC-13: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111074 This patch should be backported to GCC-13. GCC-14 has rewritten the propagate_avl function, so there is no issue there. PR target/111074 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (extract_single_source): Fix bug.
-
GCC Administrator authored
-
- Aug 21, 2023
-
-
liuhongt authored
Alderlake-N is E-core only, add it as an alias of Alderlake. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Detect Alderlake-N. * common/config/i386/i386-common.cc (alias_table): Support -march=gracemont as an alias of -march=alderlake. (cherry picked from commit f847e019)
-
GCC Administrator authored
-
- Aug 20, 2023
-
-
GCC Administrator authored
-
- Aug 19, 2023
-
-
Guo Jie authored
gcc/ChangeLog: * config/loongarch/t-loongarch: Add loongarch-driver.h into TM_H. Add loongarch-def.h and loongarch-tune.h into OPTIONS_H_EXTRA. Co-authored-by: Lulu Cheng <chenglulu@loongson.cn> (cherry picked from commit 3e315736)
-
GCC Administrator authored
-
- Aug 18, 2023
-
-
GCC Administrator authored
-
- Aug 17, 2023
-
-
GCC Administrator authored
-
- Aug 16, 2023
-
-
Patrick Palka authored
Here we're incorrectly rejecting the first type-requirement at parse time with concepts-requires35.C:14:56: error: ‘typename A<T>::B’ is not a template [-fpermissive] We also incorrectly reject the second type-requirement at satisfaction time with concepts-requires35.C:17:34: error: ‘typename A<int>::B’ names ‘template<class U> struct A<int>::B’, which is not a type and similarly for the third type-requirement. This seems to happen only within a type-requirement; if we instead use e.g. an alias template then it works as expected. The difference ultimately seems to be that during parsing of a using-decl, we pass check_dependency_p=true to cp_parser_nested_name_specifier_opt whereas for a type-requirement we pass check_dependency_p=false. Passing =false causes cp_parser_template_id for the dependently-scoped template-id B<bool> to create a TYPE_DECL of TYPENAME_TYPE (with TYPENAME_IS_CLASS_P unexpectedly set in the last two cases) whereas passing =true causes it to return a TEMPLATE_ID_EXPR. We then call make_typename_type on this TYPE_DECL which does the wrong thing. Since there seems to be no justification for using check_dependency_p=false here, the simplest fix seems to be to pass check_dependency_p=true instead, matching the behavior of cp_parser_elaborated_type_specifier. PR c++/110927 gcc/cp/ChangeLog: * parser.cc (cp_parser_type_requirement): Pass check_dependency_p=true instead of =false. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-requires35.C: New test. (cherry picked from commit 63bd36be)
-
liuhongt authored
Support -m[no-]gather and -m[no-]scatter to enable/disable vectorization for all gather/scatter instructions. Rename the original use_gather to use_gather_8parts. Support -mtune-ctrl={,^}use_gather to set/clear the tune features use_gather_{2parts, 4parts, 8parts}. Support the new option -mgather as an alias of -mtune-ctrl=use_gather (and -mno-gather as an alias of -mtune-ctrl=^use_gather). Similar for use_scatter. gcc/ChangeLog: * config/i386/i386-builtins.cc (ix86_vectorize_builtin_gather): Adjust for use_gather_8parts. * config/i386/i386-options.cc (parse_mtune_ctrl_str): Set/Clear tune features use_{gather,scatter}_{2parts, 4parts, 8parts} for -mtune-ctrl={,^}{use_gather,use_scatter}. * config/i386/i386.cc (ix86_vectorize_builtin_scatter): Adjust for use_scatter_8parts. * config/i386/i386.h (TARGET_USE_GATHER): Rename to .. (TARGET_USE_GATHER_8PARTS): .. this. (TARGET_USE_SCATTER): Rename to .. (TARGET_USE_SCATTER_8PARTS): .. this. * config/i386/x86-tune.def (X86_TUNE_USE_GATHER): Rename to .. (X86_TUNE_USE_GATHER_8PARTS): .. this. (X86_TUNE_USE_SCATTER): Rename to .. (X86_TUNE_USE_SCATTER_8PARTS): .. this. * config/i386/i386.opt: Add new options mgather, mscatter. (cherry picked from commit b2a927fb)
-
liuhongt authored
For more details of GDS (Gather Data Sampling), refer to https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/gather-data-sampling.html After the microcode update, there's a performance regression. To avoid that, the patch disables gather generation in autovectorization and uses gather scalar emulation instead. gcc/ChangeLog: * config/i386/i386-options.cc (m_GDS): New macro. * config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Don't enable for m_GDS. (X86_TUNE_USE_GATHER_4PARTS): Ditto. (X86_TUNE_USE_GATHER): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx2-gather-2.c: Adjust options to keep gather vectorization. * gcc.target/i386/avx2-gather-6.c: Ditto. * gcc.target/i386/avx512f-pr88464-1.c: Ditto. * gcc.target/i386/avx512f-pr88464-5.c: Ditto. * gcc.target/i386/avx512vl-pr88464-1.c: Ditto. * gcc.target/i386/avx512vl-pr88464-11.c: Ditto. * gcc.target/i386/avx512vl-pr88464-3.c: Ditto. * gcc.target/i386/avx512vl-pr88464-9.c: Ditto. * gcc.target/i386/pr88531-1b.c: Ditto. * gcc.target/i386/pr88531-1c.c: Ditto. (cherry picked from commit 3064d1f5)
-
GCC Administrator authored
-
- Aug 15, 2023
-
-
Jonathan Wakely authored
This backported patch was from Paul Dreik.
-
GCC Administrator authored
-
- Aug 14, 2023
-
-
Jonathan Wakely authored
If abs(__v) is smaller than one, the result will be of the form 0.xxxxx. It is only if the magnitude is large that more digits are needed before the decimal dot. This uses frexp instead of log10 which should be less expensive and have sufficient precision for the desired purpose. It removes the problematic cases where log10 will be negative or not fit in an int. Signed-off-by: Paul Dreik <gccpatches@pauldreik.se> libstdc++-v3/ChangeLog: PR libstdc++/110860 * include/std/format (__formatter_fp::format): Use frexp instead of log10. (cherry picked from commit 2d2b05f0)
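The frexp approach can be sketched as follows. This is not the libstdc++ implementation; max_integer_digits is a hypothetical helper, and the rational 4004/13301 is an assumed approximation chosen because it is slightly larger than log10(2), so the estimate never comes out too small.

```cpp
#include <cassert>
#include <cmath>

// Upper bound on the number of decimal digits before the decimal point
// when a finite v is printed in fixed notation.
int max_integer_digits(double v)
{
  assert(std::isfinite(v));
  int binary_exp = 0;
  std::frexp(v, &binary_exp);   // |v| < 2^binary_exp
  if (binary_exp <= 0)
    return 1;                   // |v| < 1: only the leading "0"
  // Integer arithmetic only: digits <= 1 + floor(binary_exp * log10(2)).
  return 1 + binary_exp * 4004 / 13301;
}
```

Unlike a log10-based estimate, the binary exponent is always a small integer for finite inputs, so there is no negative or out-of-range intermediate value to guard against.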
-
Jonathan Wakely authored
When writing to a contiguous iterator, std::format_to_n(out, n, ...) always returns out + n, even if it wrote fewer than n characters to the iterator. The problem is in the _M_finish() member function of the _Iter_sink specialization for contiguous iterators. _M_finish() calls _M_overflow() to update its count of characters written, so it can return the count of characters that would be written if there was room. But _M_overflow() assumes it's only called when the buffer is full, and so switches to the internal buffer. _M_finish() then thinks that if the internal buffer is in use, we already wrote at least n characters and so returns out+n as the output position. We can fix the problem by adding a check in _M_overflow() so that we don't update the count and switch to the internal buffer unless we've run out of room, i.e. _M_unused().size() is zero. The caller then needs to be prepared for _M_count not being the final total, and so add _M_used().size() to it. However, there's not actually any need for _M_finish() to call _M_overflow() to get the count. We now need to use _M_count and _M_used().size() to get the total anyway so _M_overflow() doesn't help with that. And we don't need to use _M_overflow() to flush unwritten characters to the output, because the specialization for contiguous iterators always writes directly to the output without buffering (except when we've exceeded the maximum number of characters, in which case we want to discard the buffered characters anyway). So _M_finish() can be simplified and can avoid calling _M_overflow(). This change also fixes some member functions of other sink classes to only call _M_overflow() when there are characters in the buffer, which is needed to meet _M_overflow's precondition that _M_used().size()!=0. libstdc++-v3/ChangeLog: PR libstdc++/110990 * include/std/format (_Seq_sink::get): Only call _M_overflow if its precondition is met. (_Iter_sink::_M_finish): Likewise. (_Iter_sink<C, ContigIter>::_M_overflow): Only switch to the internal buffer after running out of space. (_Iter_sink<C, ContigIter>::_M_finish): Do not use _M_overflow. (_Counting_sink::count): Likewise. * testsuite/std/format/functions/format_to_n.cc: Check cases where the output fits into the buffer. (cherry picked from commit 003016a4)
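The contract that this fix restores can be modeled with a much simpler function. The sketch below is not the libstdc++ sink code; BoundedResult and write_bounded are hypothetical names, and the model only captures what the caller should observe: the returned position advances by the number of characters actually written, while the reported size is the full length that would have been written with unlimited room.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified stand-in for the result of a bounded write.
struct BoundedResult
{
  char* out;          // one past the last character actually written
  std::size_t size;   // total characters the full text would need
};

// Copies at most n characters of text (length len) to out.
BoundedResult write_bounded(char* out, std::size_t n,
                            const char* text, std::size_t len)
{
  assert(out != nullptr && text != nullptr);
  const std::size_t written = len < n ? len : n;
  std::memcpy(out, text, written);
  return BoundedResult{out + written, len};
}
```

With an 8-character buffer and the 3-character text "abc", the returned position is the buffer start plus 3, not plus 8, which is the case the new checks in format_to_n.cc are described as covering.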
-
Cui, Lili authored
Update model values for Raptorlake according to SDM. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Add model value 0xba to Raptorlake.
-
GCC Administrator authored
-
- Aug 13, 2023
-
-
Iain Sandoe authored
This corrects some typos in the suffix of the m2rte plugin that lead to a bootstrap failure on Darwin, where the suffix is not '.so'. On some versions of Darwin, the linker complains if libSystem is not linked, so we disable all the default libs, but add libc back. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/m2/ChangeLog: * Make-lang.in: Update suffix spellings to use 'soext'. Add libc to the plugin link. (cherry picked from commit 25be11e9)
-
GCC Administrator authored
-