- Aug 18, 2021
-
-
GCC Administrator authored
-
- Aug 17, 2021
-
-
Andrew MacLeod authored
Debugging range-ops and gori unwinding needed some help. * gimple-range-gori.cc (gori_compute::gori_compute): Enable tracing. (gori_compute::compute_operand_range): Add tracing. (gori_compute::logical_combine): Ditto. (gori_compute::compute_logical_operands): Ditto. (gori_compute::compute_operand1_range): Ditto. (gori_compute::compute_operand2_range): Ditto. (gori_compute::outgoing_edge_range_p): Ditto. * gimple-range-gori.h (class gori_compute): Add range_tracer.
-
Andrew MacLeod authored
Remove tracing in hybrid mode. Add trace/gori/cache tracing options. tracing options are now 'trace', 'gori', 'cache', or all combined in 'debug' * flag-types.h (enum evrp_mode): Adjust evrp-mode values. * gimple-range-cache.cc (DEBUG_RANGE_CACHE): Relocate from. * gimple-range-trace.h (DEBUG_RANGE_CACHE): Here. * params.opt (--param=evrp-mode): Adjust options.
-
Andrew MacLeod authored
Generalize range tracing into a class and integrae it with gimple_ranger. Remove the old derived trace_ranger class. * Makefile.in (OBJS): Add gimple-range-trace.o. * gimple-range-cache.h (enable_new_values): Remove unused prototype. * gimple-range-fold.cc: Adjust headers. * gimple-range-trace.cc: New. * gimple-range-trace.h: New. * gimple-range.cc (gimple_ranger::gimple_ranger): Enable tracer. (gimple_ranger::range_of_expr): Add tracing. (gimple_ranger::range_on_entry): Ditto. (gimple_ranger::range_on_exit): Ditto. (gimple_ranger::range_on_edge): Ditto. (gimple_ranger::fold_range_internal): Ditto. (gimple_ranger::dump_bb): Do not calculate edge range twice. (trace_ranger::*): Remove. (enable_ranger): Never create a trace_ranger. (debug_seed_ranger): Move to gimple-range-trace.cc. (dump_ranger): Ditto. (debug_ranger): Ditto. * gimple-range.h: Include gimple-range-trace.h. (range_on_entry, range_on_exit): No longer virtual. (class trace_ranger): Remove. (DEBUG_RANGE_CACHE): Move to gimple-range-trace.h.
-
Martin Sebor authored
Also resolves: PR middle-end/101854 - Invalid warning -Wstringop-overflow wrong argument gcc/ChangeLog: PR middle-end/101854 * builtins.c (expand_builtin_alloca): Move warning code to check_alloca in gimple-ssa-warn-access.cc. * calls.c (alloc_max_size): Move code to check_alloca. (get_size_range): Move to pointer-query.cc. (maybe_warn_alloc_args_overflow): Move to gimple-ssa-warn-access.cc. (get_attr_nonstring_decl): Move to tree.c. (fntype_argno_type): Move to gimple-ssa-warn-access.cc. (append_attrname): Same. (maybe_warn_rdwr_sizes): Same. (initialize_argument_information): Move code to gimple-ssa-warn-access.cc. * calls.h (maybe_warn_alloc_args_overflow): Move to gimple-ssa-warn-access.h. (get_attr_nonstring_decl): Move to tree.h. (maybe_warn_nonstring_arg): Move to gimple-ssa-warn-access.h. (enum size_range_flags): Move to pointer-query.h. (get_size_range): Same. * gimple-ssa-warn-access.cc (has_location): Remove unused overload to avoid Clang -Wunused-function. (get_size_range): Declare static. (maybe_emit_free_warning): Rename... (maybe_check_dealloc_call): ...to this for consistency. (class pass_waccess): Add members. (pass_waccess::~pass_waccess): Defined. (alloc_max_size): Move here from calls.c. (maybe_warn_alloc_args_overflow): Same. (check_alloca): New function. (check_alloc_size_call): New function. (check_strncat): Handle another warning flag. (pass_waccess::check_builtin): Handle alloca. (fntype_argno_type): Move here from calls.c. (append_attrname): Same. (maybe_warn_rdwr_sizes): Same. (pass_waccess::check_call): Define. (check_nonstring_args): New function. (pass_waccess::check): Call new member functions. (pass_waccess::execute): Enable ranger. * gimple-ssa-warn-access.h (get_size_range): Move here from calls.h. (maybe_warn_nonstring_arg): Same. * gimple-ssa-warn-restrict.c: Remove #include. * pointer-query.cc (get_size_range): Move here from calls.c. * pointer-query.h (enum size_range_flags): Same. (get_size_range): Same. * tree.c (get_attr_nonstring_decl): Move here from calls.c. * tree.h (get_attr_nonstring_decl): Move here from calls.h. gcc/testsuite/ChangeLog: * gcc.dg/attr-alloc_size-5.c: Adjust optimization to -O1. * gcc.dg/attr-alloc_size-7.c: Use #pragmas to adjust optimization. * gcc.dg/attr-alloc_size-8.c: Adjust optimization to -O1. PR middle-end/101854 * gcc.dg/Wstringop-overflow-72.c: New test.
-
Jakub Jelinek authored
The following patch implements __is_layout_compatible trait and __builtin_is_corresponding_member helper function for the std::is_corresponding_member template function. As the current definition of layout compatible type has various problems, which result e.g. in corresponding members in layout compatible types having different member offsets, the patch anticipates some changes to the C++ standard: 1) class or enumeral types aren't layout compatible if they have different alignment or size 2) if two members have different offsets, they can't be corresponding members ([[no_unique_address]] with empty types can change that, or alignas on the member decls) 3) in unions, bitfields can't correspond to non-unions, or bitfields can't correspond to bitfields with different widths, or members with [[no_unique_address]] can't correspond to members without that attribute __builtin_is_corresponding_member for anonymous structs (GCC extension) will recurse into the anonymous structs. For anonymous unions it will emit a sorry if it can't prove such member types can't appear in the anonymous unions or anonymous aggregates in that union, because corresponding member is defined only using common initial sequence which is only defined for std-layout non-union class types and so I have no idea what to do otherwise in that case. 2021-08-17 Jakub Jelinek <jakub@redhat.com> PR c++/101539 gcc/c-family/ * c-common.h (enum rid): Add RID_IS_LAYOUT_COMPATIBLE. * c-common.c (c_common_reswords): Add __is_layout_compatible. gcc/cp/ * cp-tree.h (enum cp_trait_kind): Add CPTK_IS_LAYOUT_COMPATIBLE. (enum cp_built_in_function): Add CP_BUILT_IN_IS_CORRESPONDING_MEMBER. (fold_builtin_is_corresponding_member, next_common_initial_seqence, layout_compatible_type_p): Declare. * parser.c (cp_parser_primary_expression): Handle RID_IS_LAYOUT_COMPATIBLE. (cp_parser_trait_expr): Likewise. * cp-objcp-common.c (names_builtin_p): Likewise. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_LAYOUT_COMPATIBLE. * decl.c (cxx_init_decl_processing): Register __builtin_is_corresponding_member builtin. * constexpr.c (cxx_eval_builtin_function_call): Handle CP_BUILT_IN_IS_CORRESPONDING_MEMBER builtin. * semantics.c (is_corresponding_member_union, is_corresponding_member_aggr, fold_builtin_is_corresponding_member): New functions. (trait_expr_value): Handle CPTK_IS_LAYOUT_COMPATIBLE. (finish_trait_expr): Likewise. * typeck.c (next_common_initial_seqence, layout_compatible_type_p): New functions. * cp-gimplify.c (cp_gimplify_expr): Fold CP_BUILT_IN_IS_CORRESPONDING_MEMBER. (cp_fold): Likewise. * tree.c (builtin_valid_in_constant_expr_p): Handle CP_BUILT_IN_IS_CORRESPONDING_MEMBER. * cxx-pretty-print.c (pp_cxx_trait_expression): Handle CPTK_IS_LAYOUT_COMPATIBLE. * class.c (remove_zero_width_bit_fields): Remove. (layout_class_type): Don't call it. gcc/testsuite/ * g++.dg/cpp2a/is-corresponding-member1.C: New test. * g++.dg/cpp2a/is-corresponding-member2.C: New test. * g++.dg/cpp2a/is-corresponding-member3.C: New test. * g++.dg/cpp2a/is-corresponding-member4.C: New test. * g++.dg/cpp2a/is-corresponding-member5.C: New test. * g++.dg/cpp2a/is-corresponding-member6.C: New test. * g++.dg/cpp2a/is-corresponding-member7.C: New test. * g++.dg/cpp2a/is-corresponding-member8.C: New test. * g++.dg/cpp2a/is-layout-compatible1.C: New test. * g++.dg/cpp2a/is-layout-compatible2.C: New test. * g++.dg/cpp2a/is-layout-compatible3.C: New test.
-
Matt Jacobson authored
Signed-off-by:
Matt Jacobson <mhjacobson@me.com> gcc/c-family/ChangeLog: * c-opts.c (c_common_post_options): Default to flag_objc_sjlj_exceptions = 1 only when flag_objc_abi < 2. gcc/objc/ChangeLog: * objc-next-runtime-abi-02.c (objc_next_runtime_abi_02_init): Warn about and reset flag_objc_sjlj_exceptions regardless of flag_objc_exceptions. (next_runtime_02_initialize): Use a checking assert that flag_objc_sjlj_exceptions is off.
-
Thomas Schwinge authored
This is a follow-up to commit 697b94cf "libstdc++: Avoid illegal argument to verbose in dg-test callback". I'm confirming the original problem, but on one system, it's not resolved by this change, because instead we get: extra_tool_flags are: ERROR: tcl error sourcing [...]/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp. ERROR: usage: send [args] string while executing "send_log "$message\n"" (procedure "verbose" line 48) invoked from within "verbose -log -- $extra_tool_flags" (procedure "libstdc++-dg-test" line 45) invoked from within "${tool}-dg-test $prog [lindex ${dg-do-what} 0] "$tool_flags ${dg-extra-tool-flags}"" (procedure "saved-dg-test" line 115) invoked from within [...] That's Ubuntu's dejagnu 1.5-3ubuntu1 being so old that it doesn't include DejaGnu commit 57c22601afe43d2c2b8819df4f2ecacb034516fd "Protect from leading dash in message". (I suppose that's what'd make this work, but have not verified.) libstdc++-v3/ * testsuite/lib/libstdc++.exp: Avoid illegal argument to verbose, continued.
-
Iain Sandoe authored
The default for building host-side binaries for mdynamic-no-pic hosts is to enable this. However, it is not compatible with dynamic libraries, so must be switched off for libcc1. Signed-off-by:
Iain Sandoe <iain@sandoe.co.uk> libcc1/ChangeLog: * Makefile.am: Switch mdynamic-no-pic to fPIC. * Makefile.in: Regenerated.
-
Thomas Schwinge authored
This simplifies the interface and gets us rid of a global variable. No change in behavior. Clean-up for 2004-09-02 CVS commit (Subversion r86974, Git commit 07724022) "Better memory statistics, take 2". gcc/ * ggc.h (ggc_collect): Add 'force_collect' parameter. * ggc-page.c (ggc_collect): Use that one instead of global 'ggc_force_collect'. Adjust all users. * doc/gty.texi (Invoking the garbage collector): Update. * ggc-internal.h (ggc_force_collect): Remove. * ggc-common.c (ggc_force_collect): Likewise. * selftest.h (forcibly_ggc_collect): Remove. * ggc-tests.c (selftest::forcibly_ggc_collect): Likewise. * read-rtl-function.c (test_loading_labels): Adjust. * selftest-run-tests.c (run_tests): Likewise.
-
Thomas Schwinge authored
... after it had gotten disabled in r243681 (Git commit ecfc21ff) "Introduce selftest::locate_file". gcc/testsuite/ * gcc.dg/pr78213.c: Restore testing.
-
Iain Sandoe authored
For a single use (typical compile) this vector will be reclaimed as GGC. For JIT this is not sufficient since it does not reset the pointer to NULL (and thus we think the the vector is already allocated when a context is reused). The clears the vector and sets the pointer to NULL at the end of object output. Signed-off-by:
Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * config/darwin.c (darwin_file_end): Reset and reclaim the section names table at the end of compile.
-
Iain Sandoe authored
Versions of the assembler using clang from XCode 12.5/12.5.1 have a bug which produces different code layout between debug and non-debug input, leading to a compare fail for default configure parameters. This is a workaround fix to disable the optimisation that is responsible for the bug. Signed-off-by:
Iain Sandoe <iain@sandoe.co.uk> PR target/100340 - Bootstrap fails with Clang 12.0.5 (XCode 12.5) PR target/100340 gcc/ChangeLog: * config.in: Regenerate. * config/i386/darwin.h (EXTRA_ASM_OPTS): New (ASM_SPEC): Pass options to disable branch shortening where needed. * configure: Regenerate. * configure.ac: Detect versions of 'as' that support the optimisation which has the bug.
-
Richard Biener authored
This adds a fallback to the masked_ variants for gather_load and scatter_store if the latter are not available. 2021-08-17 Richard Biener <rguenther@suse.de> * optabs-query.c (supports_vec_gather_load_p): Also check for masked optabs. (supports_vec_scatter_store_p): Likewise. * tree-vect-data-refs.c (vect_gather_scatter_fn_p): Fall back to masked variants if non-masked are not supported. * tree-vect-patterns.c (vect_recog_gather_scatter_pattern): When we need to use masked gather/scatter but do not have a mask set up a constant true one. * tree-vect-stmts.c (vect_check_scalar_mask): Also allow non-SSA_NAME masks.
-
Luc Michel authored
This fixes an incorrect invocation of gdb on remote targets where DejaGNU would try to run host's gdb in remote target simulator. gdb-test skips the testing when target is remote or non native but the gdb version check function does not. Suggested-by:
Jonathan Wakely <jwakely@redhat.com> Signed-off-by:
Luc Michel <lmichel@kalray.eu> Co-authored-by:
Marc Poulhies <mpoulhies@kalrayinc.com> libstdc++-v3/ChangeLog: * testsuite/lib/gdb-test.exp (gdb_version_check) (gdb_version_check_xmethods): Only check the GDB version for local native targets.
-
Antony Polukhin authored
When std::seed_seq is constructed from random access iterators we can detect the internal vector size in O(1). Reserving memory for elements in such cases may avoid multiple memory allocations. Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/random.tcc (seed_seq::seed_seq): Reserve capacity if distance is O(1). * testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error line number. Co-authored-by:
Jonathan Wakely <jwakely@redhat.com>
-
Roger Sayle authored
This patch improves the bit bounds for MINUS_EXPR during tree-ssa's conditional constant propagation (CCP) pass (and as an added bonus adds support for POINTER_DIFF_EXPR). The pessimistic assumptions made by the current algorithm are demonstrated by considering 1 - (x&1). Intuitively this should have possible values 0 and 1, and therefore an unknown mask of 1. Alas by treating subtraction as a negation followed by addition, the second operand first becomes 0 or -1, with an unknown mask of all ones, which results in the addition containing no known bits. Improved bounds are achieved by using the same approach used for PLUS_EXPR, determining the result with the minimum number of borrows, the result from the maximum number of borrows, and examining the bits they have in common. One additional benefit of this approach is that it is applicable to POINTER_DIFF_EXPR, where previously the negation of a pointer didn't/doesn't make sense. A more convincing example, where a transformation missed by .032t.cpp isn't caught a few passes later by .038t.evrp, is the expression (7 - (x&5)) & 2, which (in the new test case) currently survives the tree-level optimizers but with this patch is now simplified to the constant value 2. 2021-08-17 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * tree-ssa-ccp.c (bit_value_binop) [MINUS_EXPR]: Use same algorithm as PLUS_EXPR to improve subtraction bit bounds. [POINTER_DIFF_EXPR]: Treat as synonymous with MINUS_EXPR. gcc/testsuite/ChangeLog * gcc.dg/tree-ssa/ssa-ccp-40.c: New test case.
-
Roger Sayle authored
This patch allows GCC to constant fold (i | (i<<16)) | ((i<<24) | (i<<8)), where i is an unsigned char, or the equivalent (i*65537) | (i*16777472), to i*16843009. The trick is to teach tree_nonzero_bits which bits may be set in the result of a multiplication by a constant given which bits are potentially set in the operands. This allows the optimizations recently added to match.pd to catch more cases. The required mask/value pair from a multiplication may be calculated using a classical shift-and-add algorithm, given we already have implementations for both addition and shift by constant. To keep this optimization "cheap", this functionality is only used if the constant multiplier has a few bits set (unless flag_expensive_optimizations), and we provide a special case fast-path implementation for the common case where the (non-constant) operand has no bits that are guaranteed to be set. I have no evidence that this functionality causes performance issues, it's just that sparse multipliers provide the largest benefit to CCP. 2021-08-17 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * tree-ssa-ccp.c (bit_value_mult_const): New helper function to calculate the mask-value pair result of a multiplication by an unsigned constant. (bit_value_binop) [MULT_EXPR]: Call it from here for multiplications by (sparse) non-negative constants. gcc/testsuite/ChangeLog * gcc.dg/fold-ior-5.c: New test case.
-
Tobias Burnus authored
Fortran version to commit e45483c7, which implemented OpenMP's scope construct for C and C++. Most testcases are based on the C testcases; it also contains some testcases which existed previously but had no Fortran equivalent. gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_node, show_code_node): Handle EXEC_OMP_SCOPE. * gfortran.h (enum gfc_statement): Add ST_OMP_(END_)SCOPE. (enum gfc_exec_op): Add EXEC_OMP_SCOPE. * match.h (gfc_match_omp_scope): New. * openmp.c (OMP_SCOPE_CLAUSES): Define (gfc_match_omp_scope): New. (gfc_match_omp_cancellation_point, gfc_match_omp_end_nowait): Improve error diagnostic. (omp_code_to_statement): Handle ST_OMP_SCOPE. (gfc_resolve_omp_directive): Handle EXEC_OMP_SCOPE. * parse.c (decode_omp_directive, next_statement, gfc_ascii_statement, parse_omp_structured_block, parse_executable): Handle OpenMP's scope construct. * resolve.c (gfc_resolve_blocks): Likewise * st.c (gfc_free_statement): Likewise * trans-openmp.c (gfc_trans_omp_scope): New. (gfc_trans_omp_directive): Call it. * trans.c (trans_code): handle EXEC_OMP_SCOPE. libgomp/ChangeLog: * testsuite/libgomp.fortran/scope-1.f90: New test. * testsuite/libgomp.fortran/task-reduction-16.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/scan-1.f90: * gfortran.dg/gomp/cancel-1.f90: New test. * gfortran.dg/gomp/cancel-4.f90: New test. * gfortran.dg/gomp/loop-4.f90: New test. * gfortran.dg/gomp/nesting-1.f90: New test. * gfortran.dg/gomp/nesting-2.f90: New test. * gfortran.dg/gomp/nesting-3.f90: New test. * gfortran.dg/gomp/nowait-1.f90: New test. * gfortran.dg/gomp/reduction-task-1.f90: New test. * gfortran.dg/gomp/reduction-task-2.f90: New test. * gfortran.dg/gomp/reduction-task-2a.f90: New test. * gfortran.dg/gomp/reduction-task-3.f90: New test. * gfortran.dg/gomp/scope-1.f90: New test. * gfortran.dg/gomp/scope-2.f90: New test.
-
Jonathan Wakely authored
Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * testsuite/26_numerics/random/seed_seq/cons/range.cc: Check construction from input iterators.
-
Jonathan Wakely authored
The std::error_category printer wasn't meant to be part of the commit adding std::error_code and std::error_condition printers. Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * python/libstdcxx/v6/printers.py (StdErrorCatPrinter): Remove.
-
Jonathan Wakely authored
PR 101923 points out that the unconditional swap in the std::function move constructor makes it slower than copying an empty std::function. The copy constructor has to check for the empty case before doing anything, and that makes it very fast for the empty case. Adding the same check to the move constructor avoids copying the _Any_data POD when we don't need to. We can also inline the effects of swap, by copying each member and then zeroing the pointer members. This makes moving an empty object at least as fast as copying an empty object. Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: PR libstdc++/101923 * include/bits/std_function.h (function(function&&)): Check for non-empty parameter before doing any work.
-
Jonathan Wakely authored
The new contains member of the COW string is defined for non-strict gnu++20 mode as well as for C++23 modes. I think that was left in the committed patch unintentionally. It is inconsistent with the SSO string, and doesn't actually compile because it uses the basic_string_view::contains member which only defined for C++23. This makes it only defined for C++23. Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/cow_string.h (basic_string::contains): Do not define for -std=gnu++20.
-
Jonathan Wakely authored
This is done to match an editorial change in the working draft, to rename the exposition-only not-same-as helper to different-from. Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/ranges_util.h (__not_same_as): Rename to __different_from. * include/std/ranges (__not_same_as): Likewise.
-
Jonathan Wakely authored
This is not required by the standard, but seems useful. Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/std/utility (exchange): Add noexcept-specifier. * testsuite/20_util/exchange/noexcept.cc: New test.
-
Jonathan Wakely authored
Signed-off-by:
Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * python/libstdcxx/v6/printers.py (StdErrorCodePrinter): Define. (build_libstdcxx_dictionary): Register printer for std::error_code and std::error_condition. * testsuite/libstdc++-prettyprinters/cxx11.cc: Test it.
-
Christophe Lyon authored
Commit r12-1328 enabled DT_INIT_ARRAY/DT_FINI_ARRAY for all Linux targets, but this does not work for arm-none-uclinuxfdpiceabi: it makes all the execution tests fail. This patch restores the original behavior for uclinuxfdpiceabi. 2021-08-12 Christophe Lyon <christophe.lyon@foss.st.com> gcc/ PR target/100896 * config.gcc (gcc_cv_initfini_array): Leave undefined for uclinuxfdpiceabi targets.
-
Alexandre Oliva authored
We iterate over debug stmts from the last one in new_bb, and we insert them before the first post-label stmt in each dest block, without moving the insertion iterator, so they end up reversed. Moving the insertion iterator fixes this. for gcc/ChangeLog * tree-inline.c (maybe_move_debug_stmts_to_successors): Don't reverse debug stmts.
-
Alexandre Oliva authored
dump_function_to_file takes the function to dump as a parameter, and parts of it use the local fun variable where cfun would be used elsewhere. Others use cfun, presumably in error. Fixed to use fun uniformly. Added a few more tests for non-NULL fun before dereferencing it. for gcc/ChangeLog * tree-cfg.c (dump_function_to_file): Use fun, not cfun.
-
Jonathan Wright authored
Remove macros for vld4[q]_lane Neon intrinsics. This is a preparatory step before adding new modes for structures of Advanced SIMD vectors. gcc/ChangeLog: 2021-08-16 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (__LD4_LANE_FUNC): Delete. (__LD4Q_LANE_FUNC): Likewise. (vld4_lane_u8): Define without macro. (vld4_lane_u16): Likewise. (vld4_lane_u32): Likewise. (vld4_lane_u64): Likewise. (vld4_lane_s8): Likewise. (vld4_lane_s16): Likewise. (vld4_lane_s32): Likewise. (vld4_lane_s64): Likewise. (vld4_lane_f16): Likewise. (vld4_lane_f32): Likewise. (vld4_lane_f64): Likewise. (vld4_lane_p8): Likewise. (vld4_lane_p16): Likewise. (vld4_lane_p64): Likewise. (vld4q_lane_u8): Likewise. (vld4q_lane_u16): Likewise. (vld4q_lane_u32): Likewise. (vld4q_lane_u64): Likewise. (vld4q_lane_s8): Likewise. (vld4q_lane_s16): Likewise. (vld4q_lane_s32): Likewise. (vld4q_lane_s64): Likewise. (vld4q_lane_f16): Likewise. (vld4q_lane_f32): Likewise. (vld4q_lane_f64): Likewise. (vld4q_lane_p8): Likewise. (vld4q_lane_p16): Likewise. (vld4q_lane_p64): Likewise. (vld4_lane_bf16): Likewise. (vld4q_lane_bf16): Likewise.
-
Jonathan Wright authored
Remove macros for vld3[q]_lane Neon intrinsics. This is a preparatory step before adding new modes for structures of Advanced SIMD vectors. gcc/ChangeLog: 2021-08-16 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (__LD3_LANE_FUNC): Delete. (__LD3Q_LANE_FUNC): Delete. (vld3_lane_u8): Define without macro. (vld3_lane_u16): Likewise. (vld3_lane_u32): Likewise. (vld3_lane_u64): Likewise. (vld3_lane_s8): Likewise. (vld3_lane_s16): Likewise. (vld3_lane_s32): Likewise. (vld3_lane_s64): Likewise. (vld3_lane_f16): Likewise. (vld3_lane_f32): Likewise. (vld3_lane_f64): Likewise. (vld3_lane_p8): Likewise. (vld3_lane_p16): Likewise. (vld3_lane_p64): Likewise. (vld3q_lane_u8): Likewise. (vld3q_lane_u16): Likewise. (vld3q_lane_u32): Likewise. (vld3q_lane_u64): Likewise. (vld3q_lane_s8): Likewise. (vld3q_lane_s16): Likewise. (vld3q_lane_s32): Likewise. (vld3q_lane_s64): Likewise. (vld3q_lane_f16): Likewise. (vld3q_lane_f32): Likewise. (vld3q_lane_f64): Likewise. (vld3q_lane_p8): Likewise. (vld3q_lane_p16): Likewise. (vld3q_lane_p64): Likewise. (vld3_lane_bf16): Likewise. (vld3q_lane_bf16): Likewise.
-
Jonathan Wright authored
Remove macros for vld2[q]_lane Neon intrinsics. This is a preparatory step before adding new modes for structures of Advanced SIMD vectors. gcc/ChangeLog: 2021-08-12 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Delete. (__LD2Q_LANE_FUNC): Likewise. (vld2_lane_u8): Define without macro. (vld2_lane_u16): Likewise. (vld2_lane_u32): Likewise. (vld2_lane_u64): Likewise. (vld2_lane_s8): Likewise. (vld2_lane_s16): Likewise. (vld2_lane_s32): Likewise. (vld2_lane_s64): Likewise. (vld2_lane_f16): Likewise. (vld2_lane_f32): Likewise. (vld2_lane_f64): Likewise. (vld2_lane_p8): Likewise. (vld2_lane_p16): Likewise. (vld2_lane_p64): Likewise. (vld2q_lane_u8): Likewise. (vld2q_lane_u16): Likewise. (vld2q_lane_u32): Likewise. (vld2q_lane_u64): Likewise. (vld2q_lane_s8): Likewise. (vld2q_lane_s16): Likewise. (vld2q_lane_s32): Likewise. (vld2q_lane_s64): Likewise. (vld2q_lane_f16): Likewise. (vld2q_lane_f32): Likewise. (vld2q_lane_f64): Likewise. (vld2q_lane_p8): Likewise. (vld2q_lane_p16): Likewise. (vld2q_lane_p64): Likewise. (vld2_lane_bf16): Likewise. (vld2q_lane_bf16): Likewise.
-
Maxim Kuvyrkov authored
* haifa-sched.c (advance_one_cycle): Output more context-synchronization lines for diff.
-
Maxim Kuvyrkov authored
* haifa-sched.c (enum rfs_decision, rfs_str): Add RFS_AUTOPREF. (rank_for_schedule): Use it.
-
Maxim Kuvyrkov authored
PR rtl-optimization/91598 * haifa-sched.c (autopref_rank_for_schedule): Prioritize "irrelevant" insns after memory reads and before memory writes.
-
Alistair Lee authored
gcc/ 2021-08-17 Alistair_Lee <alistair.lee@arm.com> * rtl.h (CONST_VECTOR_P): New macro. * config/aarch64/aarch64.c (aarch64_get_sve_pred_bits): Use RTL code testing macros. (aarch64_ptrue_all_mode): Likewise. (aarch64_expand_mov_immediate): Likewise. (aarch64_const_vec_all_in_range_p): Likewise. (aarch64_rtx_costs): Likewise. (aarch64_legitimate_constant_p): Likewise. (aarch64_simd_valid_immediate): Likewise. (aarch64_simd_make_constant): Likewise. (aarch64_convert_mult_to_shift): Likewise. (aarch64_expand_sve_vec_perm): Likewise. (aarch64_vec_fpconst_pow_of_2): Likewise.
-
Andrew MacLeod authored
With flag_wrapv, -TYPE_MIN_VALUE = TYPE_MIN_VALUE which is unrepresentable. We currently special case this in the ABS folding routine, but are missing similar treatment in operator_abs::op1_range. Tested on x86-64 Linux. PR tree-optimization/101938 gcc/ChangeLog: * range-op.cc (operator_abs::op1_range): Special case -TYPE_MIN_VALUE for flag_wrapv. gcc/testsuite/ChangeLog: * gcc.dg/pr101938.c: New test.
-
Richard Biener authored
This adds the testcase from the fix for the PR. 2021-08-17 Richard Biener <rguenther@suse.de> PR tree-optimization/101868 * gcc.dg/lto/pr101868_0.c: New testcase. * gcc.dg/lto/pr101868_1.c: Likewise. * gcc.dg/lto/pr101868_2.c: Likewise. * gcc.dg/lto/pr101868_3.c: Likewise.
-
Kewen Lin authored
As Richi pointed out, currently for BB reductions we don't actually build a SLP node with IFN_REDUC_* information, ideally we may end up with that eventually. For now, it's costed as shuffles and reduc operations, it misses the cost of lane extraction. This patch is to add one time of vec_to_scalar cost for lane extraction, it's to make the costings consistent and conservative for now. gcc/ChangeLog: * tree-vect-slp.c (vectorizable_bb_reduc_epilogue): Add the cost for value extraction.
-
Jakub Jelinek authored
This patch implements the OpenMP 5.1 scope construct, which is similar to worksharing constructs in many regards, but isn't one of them. The body of the construct is encountered by all threads though, it can be nested in itself or intermixed with taskgroup and worksharing etc. constructs can appear inside of it (but it can't be nested in worksharing etc. constructs). The main purpose of the construct is to allow reductions (normal and task ones) without the need to close the parallel and reopen another one. If it doesn't have task reductions, it can be implemented without any new library support, with nowait it just does the privatizations at the start if any and reductions before the end of the body, with without nowait emits a normal GOMP_barrier{,_cancel} at the end too. For task reductions, we need to ensure only one thread initializes the task reduction library data structures and other threads copy from that, so a new GOMP_scope_start routine is added to the library for that. It acts as if the start of the scope construct is a nowait worksharing construct (that is ok, it can't be nested in other worksharing constructs and all threads need to encounter the start in the same order) which does the task reduction initialization, but as the body can have other scope constructs and/or worksharing constructs, that is all where we use this dummy worksharing construct. With task reductions, the construct must not have nowait and ends with a GOMP_barrier{,_cancel}, followed by task reductions followed by GOMP_workshare_task_reduction_unregister. Only C/C++ FE support is done. 2021-08-17 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.def (OMP_SCOPE): New tree code. * tree.h (OMP_SCOPE_BODY, OMP_SCOPE_CLAUSES): Define. * tree-nested.c (convert_nonlocal_reference_stmt, convert_local_reference_stmt, convert_gimple_call): Handle GIMPLE_OMP_SCOPE. * tree-pretty-print.c (dump_generic_node): Handle OMP_SCOPE. * gimple.def (GIMPLE_OMP_SCOPE): New gimple code. * gimple.c (gimple_build_omp_scope): New function. (gimple_copy): Handle GIMPLE_OMP_SCOPE. * gimple.h (gimple_build_omp_scope): Declare. (gimple_has_substatements): Handle GIMPLE_OMP_SCOPE. (gimple_omp_scope_clauses, gimple_omp_scope_clauses_ptr, gimple_omp_scope_set_clauses): New inline functions. (CASE_GIMPLE_OMP): Add GIMPLE_OMP_SCOPE. * gimple-pretty-print.c (dump_gimple_omp_scope): New function. (pp_gimple_stmt_1): Handle GIMPLE_OMP_SCOPE. * gimple-walk.c (walk_gimple_stmt): Likewise. * gimple-low.c (lower_stmt): Likewise. * gimplify.c (is_gimple_stmt): Handle OMP_MASTER. (gimplify_scan_omp_clauses): For task reductions, handle OMP_SCOPE like ORT_WORKSHARE constructs. Adjust diagnostics for %<scope%> allowing task reductions. Reject inscan reductions on scope. (omp_find_stores_stmt): Handle GIMPLE_OMP_SCOPE. (gimplify_omp_workshare, gimplify_expr): Handle OMP_SCOPE. * tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_SCOPE. (estimate_num_insns): Likewise. * omp-low.c (build_outer_var_ref): Look through GIMPLE_OMP_SCOPE contexts if var isn't privatized there. (check_omp_nesting_restrictions): Handle GIMPLE_OMP_SCOPE. (scan_omp_1_stmt): Likewise. (maybe_add_implicit_barrier_cancel): Look through outer scope constructs. (lower_omp_scope): New function. (lower_omp_task_reductions): Handle OMP_SCOPE. (lower_omp_1): Handle GIMPLE_OMP_SCOPE. (diagnose_sb_1, diagnose_sb_2): Likewise. * omp-expand.c (expand_omp_single): Support also GIMPLE_OMP_SCOPE. (expand_omp): Handle GIMPLE_OMP_SCOPE. (omp_make_gimple_edges): Likewise. * omp-builtins.def (BUILT_IN_GOMP_SCOPE_START): New built-in. gcc/c-family/ * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_SCOPE. * c-pragma.c (omp_pragmas): Add scope construct. * c-omp.c (omp_directives): Uncomment scope directive entry. gcc/c/ * c-parser.c (OMP_SCOPE_CLAUSE_MASK): Define. (c_parser_omp_scope): New function. (c_parser_omp_construct): Handle PRAGMA_OMP_SCOPE. gcc/cp/ * parser.c (OMP_SCOPE_CLAUSE_MASK): Define. (cp_parser_omp_scope): New function. (cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_SCOPE. * pt.c (tsubst_expr): Handle OMP_SCOPE. gcc/testsuite/ * c-c++-common/gomp/nesting-2.c (foo): Add scope and masked construct tests. * c-c++-common/gomp/scan-1.c (f3): Add scope construct test.. * c-c++-common/gomp/cancel-1.c (f2): Add scope and masked construct tests. * c-c++-common/gomp/reduction-task-2.c (bar): Add scope construct test. Adjust diagnostics for the addition of scope. * c-c++-common/gomp/loop-1.c (f5): Add master, masked and scope construct tests. * c-c++-common/gomp/clause-dups-1.c (f1): Add scope construct test. * gcc.dg/gomp/nesting-1.c (f1, f2, f3): Add scope construct tests. * c-c++-common/gomp/scope-1.c: New test. * c-c++-common/gomp/scope-2.c: New test. * g++.dg/gomp/attrs-1.C (bar): Add scope construct tests. * g++.dg/gomp/attrs-2.C (bar): Likewise. * gfortran.dg/gomp/reduction4.f90: Adjust expected diagnostics. * gfortran.dg/gomp/reduction7.f90: Likewise. libgomp/ * Makefile.am (libgomp_la_SOURCES): Add scope.c * Makefile.in: Regenerated. * libgomp_g.h (GOMP_scope_start): Declare. * libgomp.map: Add GOMP_scope_start@@GOMP_5.1. * scope.c: New file. * testsuite/libgomp.c-c++-common/scope-1.c: New test. * testsuite/libgomp.c-c++-common/task-reduction-16.c: New test.
-