- Oct 20, 2023
-
-
Patrick Palka authored
After the previous patch, we now only have two tsubst entry points for expression trees: tsubst_copy_and_build and tsubst_expr. The former, despite its unwieldy name, is the main entry point, and the latter is just a superset of the former that also handles statement trees. We could merge them so that we just have tsubst_expr, but it seems natural to distinguish statement trees from expression trees and to maintain a separate entry point for them. To that end, this patch renames tsubst_copy_and_build to tsubst_expr, and renames the current tsubst_expr to tsubst_stmt, which continues to be a superset of the former (which is convenient since sometimes expression trees appear in statement contexts, e.g. a branch of an IF_STMT could be NOP_EXPR). (Making tsubst_stmt disjoint from tsubst_expr is left as future work if deemed desirable.) This patch in turn renames suitable existing uses of tsubst_expr (that expect to take statement trees) to use tsubst_stmt. Thus untouched tsubst_expr calls are implicitly strengthened to expect only expression trees after this patch. For the tsubst_omp_* routines I opted to rename all existing uses to ensure no unintended functional change. This patch also moves the handling of CO_YIELD_EXPR and CO_AWAIT_EXPR from tsubst_stmt to tsubst_expr since they're indeed expression trees. gcc/cp/ChangeLog: * cp-lang.cc (objcp_tsubst_copy_and_build): Rename to ... (objcp_tsubst_expr): ... this. * cp-objcp-common.h (objcp_tsubst_copy_and_build): Rename to ... (objcp_tsubst_expr): ... this. * cp-tree.h (tsubst_copy_and_build): Remove declaration. * init.cc (maybe_instantiate_nsdmi_init): Use tsubst_expr instead of tsubst_copy_and_build. * pt.cc (expand_integer_pack): Likewise. (instantiate_non_dependent_expr_internal): Likewise. (instantiate_class_template): Use tsubst_stmt instead of tsubst_expr for STATIC_ASSERT. (tsubst_function_decl): Adjust tsubst_copy_and_build uses. (tsubst_arg_types): Likewise. (tsubst_exception_specification): Likewise. 
(tsubst_tree_list): Likewise. (tsubst): Likewise. (tsubst_name): Likewise. (tsubst_omp_clause_decl): Use tsubst_stmt instead of tsubst_expr. (tsubst_omp_clauses): Likewise. (tsubst_copy_asm_operands): Adjust tsubst_copy_and_build use. (tsubst_omp_for_iterator): Use tsubst_stmt instead of tsubst_expr. (tsubst_expr): Rename to ... (tsubst_stmt): ... this. <case CO_YIELD_EXPR, CO_AWAIT_EXPR>: Move to tsubst_expr. (tsubst_omp_udr): Use tsubst_stmt instead of tsubst_expr. (tsubst_non_call_postfix_expression): Adjust tsubst_copy_and_build use. (tsubst_lambda_expr): Likewise. Use tsubst_stmt instead of tsubst_expr for the body of a lambda. (tsubst_copy_and_build_call_args): Rename to ... (tsubst_call_args): ... this. Adjust tsubst_copy_and_build use. (tsubst_copy_and_build): Rename to tsubst_expr. Adjust tsubst_copy_and_build and tsubst_copy_and_build_call_args use. <case TRANSACTION_EXPR>: Use tsubst_stmt instead of tsubst_expr. (maybe_instantiate_noexcept): Adjust tsubst_copy_and_build use. (instantiate_body): Use tsubst_stmt instead of tsubst_expr for substituting the function body. (tsubst_initializer_list): Adjust tsubst_copy_and_build use. gcc/objcp/ChangeLog: * objcp-lang.cc (objcp_tsubst_copy_and_build): Rename to ... (objcp_tsubst_expr): ... this. Adjust tsubst_copy_and_build uses. Reviewed-by:
Jason Merrill <jason@redhat.com>
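The "superset" relationship between the two entry points described above can be pictured with a toy dispatcher. This is only a sketch under stated assumptions: `Code`, `Node`, `subst_expr` and `subst_stmt` are hypothetical stand-ins invented for illustration and do not reflect the real signatures or tree codes handled in pt.cc.

```cpp
#include <string>

// Hypothetical stand-ins for a few tree codes (not GCC's definitions).
enum class Code { IF_STMT, RETURN_STMT, NOP_EXPR, PLUS_EXPR };

struct Node {
  Code code;
};

// Expression-only entry point: rejects anything that is a statement.
std::string subst_expr(const Node& n) {
  switch (n.code) {
    case Code::NOP_EXPR:
      return "expr:NOP_EXPR";
    case Code::PLUS_EXPR:
      return "expr:PLUS_EXPR";
    default:
      return "error: not an expression";
  }
}

// Statement entry point: handles statement codes itself and defers
// everything else to subst_expr, which makes it a superset of it.
std::string subst_stmt(const Node& n) {
  switch (n.code) {
    case Code::IF_STMT:
      return "stmt:IF_STMT";
    case Code::RETURN_STMT:
      return "stmt:RETURN_STMT";
    default:
      return subst_expr(n);
  }
}
```

With this shape, an expression that shows up in a statement context (such as a NOP_EXPR branch of an IF_STMT) is still accepted by the statement entry point, while the expression entry point stays strict.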
-
Patrick Palka authored
The relationship between tsubst_copy_and_build and tsubst_copy (two of the main template argument substitution routines for expression trees) is rather hazy. The former is mostly a superset of the latter, with some differences. The main apparent difference is their handling of various tree codes, but much of the tree code handling in tsubst_copy appears to be dead code. This is because tsubst_copy mostly gets (directly) called on id-expressions rather than on arbitrary expressions. The interesting tree codes are PARM_DECL, VAR_DECL, BIT_NOT_EXPR, SCOPE_REF, TEMPLATE_ID_EXPR and IDENTIFIER_NODE: * for PARM_DECL and VAR_DECL, tsubst_copy_and_build calls tsubst_copy followed by doing some extra handling of its own * for BIT_NOT_EXPR tsubst_copy implicitly handles unresolved destructor calls (i.e. the first operand is an identifier or a type) * for SCOPE_REF, TEMPLATE_ID_EXPR and IDENTIFIER_NODE tsubst_copy refrains from doing name lookup of the terminal name Other more minor differences are that tsubst_copy exits early when 'args' is null, and it calls maybe_dependent_member_ref, and finally it dispatches to tsubst for type trees.[1] Thus tsubst_copy is similar enough to tsubst_copy_and_build that it makes sense to merge the two functions, with the main difference we want to preserve being tsubst_copy's lack of name lookup for id-expressions. This patch achieves this via a new tsubst flag tf_no_name_lookup which controls name lookup and resolution of a (top-level) id-expression. [1]: Exiting early for null 'args' doesn't seem right since it means we return templated trees even when !processing_template_decl. And dispatching to tsubst for type trees muddles the distinction between types and expressions, which makes things less clear at the call site. So these properties of tsubst_copy don't seem worth preserving. N.B. the diff for this patch looks much cleaner when generated using the "patience diff" algorithm via Git's --patience flag. 
gcc/cp/ChangeLog: * cp-tree.h (enum tsubst_flags): Add tf_no_name_lookup. * pt.cc (tsubst_pack_expansion): Use tsubst for substituting BASES_TYPE. (tsubst_decl) <case USING_DECL>: Use tsubst_name instead of tsubst_copy. (tsubst) <case TEMPLATE_TYPE_PARM>: Use tsubst_copy_and_build instead of tsubst_copy for substituting CLASS_PLACEHOLDER_TEMPLATE. <case TYPENAME_TYPE>: Use tsubst_name instead of tsubst_copy for substituting TYPENAME_TYPE_FULLNAME. (tsubst_name): Define. (tsubst_qualified_id): Use tsubst_name instead of tsubst_copy for substituting the component name of a SCOPE_REF. (tsubst_copy): Remove. (tsubst_copy_and_build): Clear tf_no_name_lookup at the start, and remember if it was set. Call maybe_dependent_member_ref if tf_no_name_lookup was not set. <case IDENTIFIER_NODE>: Don't do name lookup if tf_no_name_lookup was set. <case TEMPLATE_ID_EXPR>: If tf_no_name_lookup was set, use tsubst_name instead of tsubst_copy_and_build to substitute the template and don't finish the template-id. <case BIT_NOT_EXPR>: Handle identifier and type operand (if tf_no_name_lookup was set). <case SCOPE_REF>: Avoid trying to resolve a SCOPE_REF if tf_no_name_lookup was set by calling build_qualified_name directly instead of tsubst_qualified_id. <case SIZEOF_EXPR>: Handling of sizeof... copied from tsubst_copy. <case CALL_EXPR>: Use tsubst_name instead of tsubst_copy to substitute a TEMPLATE_ID_EXPR callee naming an unresolved template. <case COMPONENT_REF>: Likewise to substitute the member. <case FUNCTION_DECL>: Copied from tsubst_copy and merged with ... <case VAR_DECL, PARM_DECL>: ... these. Initial handling copied from tsubst_copy. Optimize local variable substitution by trying retrieve_local_specialization before checking uses_template_parms. <case CONST_DECL>: Copied from tsubst_copy. <case FIELD_DECL>: Likewise. <case NAMESPACE_DECL>: Likewise. <case OVERLOAD>: Likewise. <case TEMPLATE_DECL>: Likewise. <case TEMPLATE_PARM_INDEX>: Likewise. <case TYPE_DECL>: Likewise. 
<case CLEANUP_POINT_EXPR>: Likewise. <case OFFSET_REF>: Likewise. <case EXPR_PACK_EXPANSION>: Likewise. <case NONTYPE_ARGUMENT_PACK>: Likewise. <case *_CST>: Likewise. <case *_*_FOLD_EXPR>: Likewise. <case DEBUG_BEGIN_STMT>: Likewise. <case CO_AWAIT_EXPR>: Likewise. <case TRAIT_EXPR>: Use tsubst and tsubst_copy_and_build instead of tsubst_copy. <default>: Copied from tsubst_copy. (tsubst_initializer_list): Use tsubst and tsubst_copy_and_build instead of tsubst_copy. Reviewed-by:
Jason Merrill <jason@redhat.com>
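The "clear the flag at the start, and remember if it was set" pattern from the ChangeLog entry for tsubst_copy_and_build can be sketched in isolation. Everything here is an assumption made for illustration: the flag value, the `Expr` type and the `lookup` and `subst` helpers are hypothetical and are not the real pt.cc code.

```cpp
#include <string>

// Hypothetical flag bits; the values are illustrative, not GCC's.
enum Flags : unsigned { tf_none = 0, tf_no_name_lookup = 1u << 0 };

// Toy id-expression: a name plus an optional nested operand.
struct Expr {
  std::string name;
  const Expr* operand;
};

// Stand-in for name lookup: pretend every name resolves to a declaration.
std::string lookup(const std::string& name) { return "decl(" + name + ")"; }

std::string subst(const Expr& e, unsigned complain) {
  // Remember whether the caller asked us to skip lookup of the top-level
  // name, then clear the bit so that nested operands are not affected.
  const bool no_name_lookup = (complain & tf_no_name_lookup) != 0;
  complain &= ~tf_no_name_lookup;

  std::string result = no_name_lookup ? e.name : lookup(e.name);
  if (e.operand != nullptr)
    result += "<" + subst(*e.operand, complain) + ">";
  return result;
}
```

Because the bit is cleared before recursing, only the top-level id-expression is left unresolved, which matches the "(top-level)" qualifier in the commit message.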
-
Patrick Palka authored
In cp_parser_postfix_expression, and in the CALL_EXPR case of tsubst_copy_and_build, we essentially repeat the type-dependent and COMPONENT_REF callee cases of finish_call_expr. This patch deduplicates this logic by making both spots consistently go through finish_call_expr. This allows us to easily fix PR106086 -- which is about us neglecting to capture 'this' when we resolve a use of a non-static member function of the current instantiation only at lambda regeneration time -- by moving the call to maybe_generic_this_capture from the parser to finish_call_expr so that we consider capturing 'this' at regeneration time as well. PR c++/106086 gcc/cp/ChangeLog: * parser.cc (cp_parser_postfix_expression): Consolidate three calls to finish_call_expr, one to build_new_method_call and one to build_min_nt_call_vec into one call to finish_call_expr. Don't call maybe_generic_this_capture here. * pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Remove COMPONENT_REF callee handling. (type_dependent_expression_p): Use t_d_object_e_p instead of t_d_e_p for COMPONENT_REF and OFFSET_REF. * semantics.cc (finish_call_expr): In the type-dependent case, call maybe_generic_this_capture here instead. gcc/testsuite/ChangeLog: * g++.dg/template/crash127.C: Expect additional error due to being able to check the member access expression ahead of time. Strengthen the test by not instantiating the class template. * g++.dg/cpp1y/lambda-generic-this5.C: New test.
-
Patrick Palka authored
This follow-up patch removes build_non_dependent_expr (and make_args_non_dependent) and calls thereof, no functional change. gcc/cp/ChangeLog: * call.cc (build_new_method_call): Remove calls to build_non_dependent_expr and/or make_args_non_dependent. * coroutines.cc (finish_co_return_stmt): Likewise. * cp-tree.h (build_non_dependent_expr): Remove. (make_args_non_dependent): Remove. * decl2.cc (grok_array_decl): Remove calls to build_non_dependent_expr and/or make_args_non_dependent. (build_offset_ref_call_from_tree): Likewise. * init.cc (build_new): Likewise. * pt.cc (make_args_non_dependent): Remove. (test_build_non_dependent_expr): Remove. (cp_pt_cc_tests): Adjust. * semantics.cc (finish_expr_stmt): Remove calls to build_non_dependent_expr and/or make_args_non_dependent. (finish_for_expr): Likewise. (finish_call_expr): Likewise. (finish_omp_atomic): Likewise. * typeck.cc (finish_class_member_access_expr): Likewise. (build_x_indirect_ref): Likewise. (build_x_binary_op): Likewise. (build_x_array_ref): Likewise. (build_x_vec_perm_expr): Likewise. (build_x_shufflevector): Likewise. (build_x_unary_op): Likewise. (cp_build_addressof): Likewise. (build_x_conditional_expr): Likewise. (build_x_compound_expr): Likewise. (build_static_cast): Likewise. (build_x_modify_expr): Likewise. (check_return_expr): Likewise. * typeck2.cc (build_x_arrow): Likewise. Reviewed-by:
Jason Merrill <jason@redhat.com>
-
Patrick Palka authored
This tree code dates all the way back to r69130[1] which implemented typing of non-dependent expressions. Its motivation was never clear (to me at least) since its documentation in e.g. cp-tree.def doesn't seem accurate anymore. build_non_dependent_expr has since gained a bunch of edge cases about whether or how to wrap certain templated trees, making it hard to reason about in general. So this patch removes this tree code, and temporarily turns build_non_dependent_expr into the identity function. The subsequent patch will remove build_non_dependent_expr and adjust its callers appropriately. We now need to more thoroughly handle templated (sub)trees in a couple of places which previously didn't need to since they didn't look through NON_DEPENDENT_EXPR. [1]: https://gcc.gnu.org/pipermail/gcc-patches/2003-July/109355.html gcc/c-family/ChangeLog: * c-warn.cc (check_address_or_pointer_of_packed_member): Handle type-dependent callee of CALL_EXPR. gcc/cp/ChangeLog: * class.cc (instantiate_type): Remove NON_DEPENDENT_EXPR handling. * constexpr.cc (cxx_eval_constant_expression): Likewise. (potential_constant_expression_1): Likewise. * coroutines.cc (coro_validate_builtin_call): Don't expect ALIGNOF_EXPR to be wrapped in NON_DEPENDENT_EXPR. * cp-objcp-common.cc (cp_common_init_ts): Remove NON_DEPENDENT_EXPR handling. * cp-tree.def (NON_DEPENDENT_EXPR): Remove. * cp-tree.h (build_non_dependent_expr): Temporarily redefine as the identity function. * cvt.cc (maybe_warn_nodiscard): Handle type-dependent and variable callee of CALL_EXPR. * cxx-pretty-print.cc (cxx_pretty_printer::expression): Remove NON_DEPENDENT_EXPR handling. * error.cc (dump_decl): Likewise. (dump_expr): Likewise. * expr.cc (mark_use): Likewise. (mark_exp_read): Likewise. * pt.cc (build_non_dependent_expr): Remove. * tree.cc (lvalue_kind): Remove NON_DEPENDENT_EXPR handling. (cp_stabilize_reference): Likewise. * typeck.cc (warn_for_null_address): Likewise. 
(cp_build_binary_op): Handle type-dependent SIZEOF_EXPR operands. (cp_build_unary_op) <case TRUTH_NOT_EXPR>: Don't fold inside a template. gcc/testsuite/ChangeLog: * g++.dg/concepts/var-concept3.C: Adjust expected diagnostic for attempting to call a variable concept. Reviewed-by:
Jason Merrill <jason@redhat.com>
-
Tamar Christina authored
During the refactoring I had passed loop_vinfo on to vect_set_loop_condition during prolog peeling. This parameter is unused in most cases except in vect_set_loop_condition_partial_vectors, where its behaviour depends on whether loop_vinfo is NULL or not. Apparently this code expects it to be NULL and reads the structures from a different location. This fixes the failing testcase, which was not using the lens values determined earlier in vectorizable_store because it was looking them up in the given loop_vinfo instead. gcc/ChangeLog: PR tree-optimization/111866 * tree-vect-loop-manip.cc (vect_do_peeling): Pass null as vinfo to vect_set_loop_condition during prolog peeling.
-
Richard Biener authored
PR tree-optimization/111383 PR tree-optimization/110243 gcc/testsuite/ * gcc.dg/torture/pr111383.c: New testcase.
-
Richard Biener authored
The following fixes a missed check in the simple_iv attempt to simplify (signed T)((unsigned T) base + step) where it allows a truncating inner conversion leading to wrong code. PR tree-optimization/111445 * tree-scalar-evolution.cc (simple_iv_with_niters): Add missing check for a sign-conversion. * gcc.dg/torture/pr111445.c: New testcase.
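A small numeric illustration of why a truncating inner conversion matters for the (signed T)((unsigned T) base + step) pattern. This is a sketch, not the simple_iv code: the function names are hypothetical, and T is assumed to be a 32-bit type while base is 64 bits wide.

```cpp
#include <cstdint>

// The expression evaluated as written: the inner conversion truncates the
// 64-bit base to 32 bits before the addition takes place.
int32_t as_written(int64_t base, uint32_t step) {
  return static_cast<int32_t>(static_cast<uint32_t>(base) + step);
}

// What one would get by dropping both conversions and adding in the wide
// type. This is only equivalent when the inner conversion is a pure
// sign-conversion, i.e. when base already has the precision of T.
int64_t conversions_dropped(int64_t base, uint32_t step) {
  return base + static_cast<int64_t>(step);
}
```

For base = 0x100000005 and step = 1 the two functions disagree (6 versus 0x100000006), which is the kind of mismatch the added sign-conversion check is meant to rule out.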
-
Richard Biener authored
The following addresses IVOPTs rewriting expressions in its strip_offset without caring for definedness of overflow. Rather than the earlier attempt of just using the proper split_constant_offset from data-ref analysis the following adjusts IVOPTs helper trying to minimize changes from this fix, possibly easing backports. PR tree-optimization/110243 PR tree-optimization/111336 * tree-ssa-loop-ivopts.cc (strip_offset_1): Rewrite operations with undefined behavior on overflow to unsigned arithmetic. * gcc.dg/torture/pr110243.c: New testcase. * gcc.dg/torture/pr111336.c: Likewise.
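The idea of rewriting an operation with undefined behavior on overflow to unsigned arithmetic can be shown at the source level. This is a minimal sketch, not the strip_offset_1 change itself; the helper name is hypothetical, and the final conversion back to int is modular on GCC (implementation-defined before C++20, guaranteed from C++20 on).

```cpp
#include <climits>

// Signed addition performed in unsigned arithmetic. Unsigned overflow wraps
// by definition, so reassociating or splitting such an addition cannot
// introduce undefined behavior the way a plain 'a + b' on ints could.
int add_via_unsigned(int a, int b) {
  return static_cast<int>(static_cast<unsigned>(a) + static_cast<unsigned>(b));
}
```

When no overflow occurs the result equals the ordinary signed sum, so the rewrite only changes behavior in cases that were undefined to begin with.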
-
Richard Biener authored
The following fixes the assert in vectorizable_simd_clone_call to assert we have a vector type during transform. Whether we have one during analysis depends on whether another SLP user decided on the type of a constant/external already. When we end up with a mismatch in desire the updating will fail and make vectorization fail. PR tree-optimization/111891 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Fix assert. * gfortran.dg/pr111891.f90: New testcase.
-
Andrew Stubbs authored
Accept the architecture configure option and resolve build failures. This is enough to build binaries, but I've not got a device to test it on, so there are probably runtime issues to fix. The cache control instructions might be unsafe (or too conservative), and the kernel metadata might be off. Vector reductions will need to be reworked for RDNA2. In principle, it would be better to use wavefrontsize32 for this architecture, but that would mean switching everything to allow SImode masks, so wavefrontsize64 it is. The multilib is not included in the default configuration so either configure --with-arch=gfx1030 or include it in --with-multilib-list=gfx1030,.... The majority of this patch has no effect on other devices, but changing from using scalar writes for the exit value to vector writes means we don't need the scalar cache write-back instruction anywhere (which doesn't exist in RDNA2). gcc/ChangeLog: * config.gcc: Allow --with-arch=gfx1030. * config/gcn/gcn-hsa.h (NO_XNACK): gfx1030 does not support xnack. (ASM_SPEC): gfx1030 needs -mattr=+wavefrontsize64 set. * config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1030. (TARGET_GFX1030): New. (TARGET_RDNA2): New. * config/gcn/gcn-valu.md (@dpp_move<mode>): Disable for RDNA2. (addc<mode>3<exec_vcc>): Add RDNA2 syntax variant. (subc<mode>3<exec_vcc>): Likewise. (<convop><mode><vndi>2_exec): Add RDNA2 alternatives. (vec_cmp<mode>di): Likewise. (vec_cmp<u><mode>di): Likewise. (vec_cmp<mode>di_exec): Likewise. (vec_cmp<u><mode>di_exec): Likewise. (vec_cmp<mode>di_dup): Likewise. (vec_cmp<mode>di_dup_exec): Likewise. (reduc_<reduc_op>_scal_<mode>): Disable for RDNA2. (*<reduc_op>_dpp_shr_<mode>): Likewise. (*plus_carry_dpp_shr_<mode>): Likewise. (*plus_carry_in_dpp_shr_<mode>): Likewise. * config/gcn/gcn.cc (gcn_option_override): Recognise gfx1030. (gcn_global_address_p): RDNA2 only allows smaller offsets. (gcn_addr_space_legitimate_address_p): Likewise. (gcn_omp_device_kind_arch_isa): Recognise gfx1030. 
(gcn_expand_epilogue): Use VGPRs instead of SGPRs. (output_file_start): Configure gfx1030. * config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __RDNA2__; (ASSEMBLER_DIALECT): New. * config/gcn/gcn.md (rdna): New define_attr. (enabled): Use "rdna" attribute. (gcn_return): Remove s_dcache_wb. (addcsi3_scalar): Add RDNA2 syntax variant. (addcsi3_scalar_zero): Likewise. (addptrdi3): Likewise. (mulsi3): v_mul_lo_i32 should be v_mul_lo_u32 on all ISA. (*memory_barrier): Add RDNA2 syntax variant. (atomic_load<mode>): Add RDNA2 cache control variants, and disable scalar atomics for RDNA2. (atomic_store<mode>): Likewise. (atomic_exchange<mode>): Likewise. * config/gcn/gcn.opt (gpu_type): Add gfx1030. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (main): Recognise -march=gfx1030. * config/gcn/t-omp-device: Add gfx1030 isa. libgcc/ChangeLog: * config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Set false for __RDNA2__. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (isa_hsa_name): Recognise gfx1030. (isa_code): Likewise. * team.c (defined): Remove s_endpgm.
-
Richard Biener authored
The following restricts moving variable shifts to when they are always executed in the loop as we currently do not have an efficient way to rewrite them to something that is unconditionally well-defined and value range analysis will otherwise compute invalid ranges for the shift operand. PR tree-optimization/111000 * stor-layout.h (element_precision): Move .. * tree.h (element_precision): .. here. * tree-ssa-loop-im.cc (movement_possibility_1): Restrict motion of shifts and rotates. * gcc.dg/torture/pr111000.c: New testcase.
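A source-level picture of the situation the restriction is about: a loop-invariant shift that is only executed under a condition. The function below is a hypothetical example written for illustration, not the pr111000.c testcase.

```cpp
#include <cstdint>
#include <vector>

// 'value << amount' is loop-invariant, but it is only evaluated when the
// guard holds. Hoisting it in front of the loop would evaluate the shift
// even when amount >= 32, where the shift is undefined, and value range
// analysis could then derive a bogus range for 'amount'.
uint32_t sum_guarded_shifts(const std::vector<bool>& enabled, uint32_t value,
                            unsigned amount) {
  uint32_t sum = 0;
  for (bool on : enabled) {
    if (on && amount < 32)
      sum += value << amount;
  }
  return sum;
}
```

A shift that is executed on every iteration does not have this problem, which is why the patch keeps allowing motion in the always-executed case.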
-
Alexandre Oliva authored
This patch introduces an optional hardening pass to catch unexpected execution flows. Functions are transformed so that basic blocks set a bit in an automatic array, and (non-exceptional) function exit edges check that the bits in the array represent an expected execution path in the CFG. Functions with multiple exit edges, or with too many blocks, call an out-of-line checker builtin implemented in libgcc. For simpler functions, the verification is performed in-line. -fharden-control-flow-redundancy enables the pass for eligible functions, --param hardcfr-max-blocks sets a block count limit for functions to be eligible, and --param hardcfr-max-inline-blocks tunes the "too many blocks" limit for in-line verification. -fhardcfr-skip-leaf makes leaf functions non-eligible. Additional -fhardcfr-check-* options are added to enable checking at exception escape points, before potential sibcalls, hereby dubbed returning calls, and before noreturn calls and exception raises. A notable case is the distinction between noreturn calls expected to throw and those expected to terminate or loop forever: the default setting for -fhardcfr-check-noreturn-calls, no-xthrow, performs checking before the latter, but the former only gets checking in the exception handler. GCC can only tell them apart when noreturn functions expected to raise are explicitly marked with the newly-introduced expected_throw attribute, and the corresponding ECF_XTHROW flag. for gcc/ChangeLog * tree-core.h (ECF_XTHROW): New macro. * tree.cc (set_call_expr): Add expected_throw attribute when ECF_XTHROW is set. (build_common_builtin_node): Add ECF_XTHROW to __cxa_end_cleanup and _Unwind_Resume or _Unwind_SjLj_Resume. * calls.cc (flags_from_decl_or_type): Check for expected_throw attribute to set ECF_XTHROW. * gimple.cc (gimple_build_call_from_tree): Propagate ECF_XTHROW from decl flags to gimple call... (gimple_call_flags): ... and back. * gimple.h (GF_CALL_XTHROW): New gf_mask flag. (gimple_call_set_expected_throw): New. 
(gimple_call_expected_throw_p): New. * Makefile.in (OBJS): Add gimple-harden-control-flow.o. * builtins.def (BUILT_IN___HARDCFR_CHECK): New. * common.opt (fharden-control-flow-redundancy): New. (-fhardcfr-check-returning-calls): New. (-fhardcfr-check-exceptions): New. (-fhardcfr-check-noreturn-calls=*): New. (Enum hardcfr_check_noreturn_calls): New. (fhardcfr-skip-leaf): New. * doc/invoke.texi: Document them. (hardcfr-max-blocks, hardcfr-max-inline-blocks): New params. * flag-types.h (enum hardcfr_noret): New. * gimple-harden-control-flow.cc: New. * params.opt (-param=hardcfr-max-blocks=): New. (-param=hradcfr-max-inline-blocks=): New. * passes.def (pass_harden_control_flow_redundancy): Add. * tree-pass.h (make_pass_harden_control_flow_redundancy): Declare. * doc/extend.texi: Document expected_throw attribute. for gcc/ada/ChangeLog * gcc-interface/trans.cc (gigi): Mark __gnat_reraise_zcx with ECF_XTHROW. (build_raise_check): Likewise for all rcheck subprograms. for gcc/c-family/ChangeLog * c-attribs.cc (handle_expected_throw_attribute): New. (c_common_attribute_table): Add expected_throw. for gcc/cp/ChangeLog * decl.cc (push_throw_library_fn): Mark with ECF_XTHROW. * except.cc (build_throw): Likewise __cxa_throw, _ITM_cxa_throw, __cxa_rethrow. for gcc/testsuite/ChangeLog * c-c++-common/torture/harden-cfr.c: New. * c-c++-common/harden-cfr-noret-never-O0.c: New. * c-c++-common/torture/harden-cfr-noret-never.c: New. * c-c++-common/torture/harden-cfr-noret-noexcept.c: New. * c-c++-common/torture/harden-cfr-noret-nothrow.c: New. * c-c++-common/torture/harden-cfr-noret.c: New. * c-c++-common/torture/harden-cfr-notail.c: New. * c-c++-common/torture/harden-cfr-returning.c: New. * c-c++-common/torture/harden-cfr-tail.c: New. * c-c++-common/torture/harden-cfr-abrt-always.c: New. * c-c++-common/torture/harden-cfr-abrt-never.c: New. * c-c++-common/torture/harden-cfr-abrt-no-xthrow.c: New. * c-c++-common/torture/harden-cfr-abrt-nothrow.c: New. 
* c-c++-common/torture/harden-cfr-abrt.c: New. * c-c++-common/torture/harden-cfr-always.c: New. * c-c++-common/torture/harden-cfr-never.c: New. * c-c++-common/torture/harden-cfr-no-xthrow.c: New. * c-c++-common/torture/harden-cfr-nothrow.c: New. * c-c++-common/torture/harden-cfr-bret-always.c: New. * c-c++-common/torture/harden-cfr-bret-never.c: New. * c-c++-common/torture/harden-cfr-bret-noopt.c: New. * c-c++-common/torture/harden-cfr-bret-noret.c: New. * c-c++-common/torture/harden-cfr-bret-no-xthrow.c: New. * c-c++-common/torture/harden-cfr-bret-nothrow.c: New. * c-c++-common/torture/harden-cfr-bret-retcl.c: New. * c-c++-common/torture/harden-cfr-bret.c: New. * g++.dg/harden-cfr-throw-always-O0.C: New. * g++.dg/harden-cfr-throw-returning-O0.C: New. * g++.dg/torture/harden-cfr-noret-always-no-nothrow.C: New. * g++.dg/torture/harden-cfr-noret-never-no-nothrow.C: New. * g++.dg/torture/harden-cfr-noret-no-nothrow.C: New. * g++.dg/torture/harden-cfr-throw-always.C: New. * g++.dg/torture/harden-cfr-throw-never.C: New. * g++.dg/torture/harden-cfr-throw-no-xthrow.C: New. * g++.dg/torture/harden-cfr-throw-no-xthrow-expected.C: New. * g++.dg/torture/harden-cfr-throw-nothrow.C: New. * g++.dg/torture/harden-cfr-throw-nocleanup.C: New. * g++.dg/torture/harden-cfr-throw-returning.C: New. * g++.dg/torture/harden-cfr-throw.C: New. * gcc.dg/torture/harden-cfr-noret-no-nothrow.c: New. * gcc.dg/torture/harden-cfr-tail-ub.c: New. * gnat.dg/hardcfr.adb: New. for libgcc/ChangeLog * Makefile.in (LIB2ADD): Add hardcfr.c. * hardcfr.c: New.
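The "blocks set a bit, exits check the bits" scheme can be modelled with a toy checker. This is a sketch under loud assumptions: the `Block` type, the helper names and the exact plausibility rule (every visited block needs a visited predecessor and a visited successor) are choices made here for illustration and are not taken from hardcfr.c or the pass itself.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG node: indices of predecessor and successor blocks.
struct Block {
  std::vector<std::size_t> preds;
  std::vector<std::size_t> succs;
};

// True if at least one of the listed blocks had its "visited" bit set.
static bool any_visited(const std::vector<std::size_t>& ids,
                        const std::vector<bool>& visited) {
  for (std::size_t id : ids)
    if (visited[id]) return true;
  return false;
}

// Exit-time check. Block 0 is the entry, the last block is the exit.
bool path_is_plausible(const std::vector<Block>& cfg,
                       const std::vector<bool>& visited) {
  if (cfg.empty() || visited.size() != cfg.size()) return false;
  const std::size_t exit_id = cfg.size() - 1;
  if (!visited[0] || !visited[exit_id]) return false;
  for (std::size_t i = 0; i < cfg.size(); ++i) {
    if (!visited[i]) continue;
    if (i != 0 && !any_visited(cfg[i].preds, visited)) return false;
    if (i != exit_id && !any_visited(cfg[i].succs, visited)) return false;
  }
  return true;
}

// Diamond-shaped example: 0 -> {1, 2} -> 3.
std::vector<Block> diamond_cfg() {
  return {Block{{}, {1, 2}}, Block{{0}, {3}}, Block{{0}, {3}},
          Block{{1, 2}, {}}};
}
```

In the diamond, visiting blocks 0, 1 and 3 passes the check, while reaching block 3 with neither 1 nor 2 marked is flagged as an unexpected flow.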
-
Alex Coplan authored
This patch tweaks change_insns to also call ::remove_insn to ensure the underlying RTL insn gets removed from the insn chain in the case of a deletion. This avoids leaving NOTE_INSN_DELETED around after deleting insns. For movement, the RTL insn chain is updated earlier in change_insns with the call to move_insn. For deletion, it seems reasonable to do it here. gcc/ChangeLog: * rtl-ssa/changes.cc (function_info::change_insns): Ensure we call ::remove_insn on deleted insns.
-
Richard Biener authored
The following amends the {L,R}SHIFT_EXPR documentation with documentation about the {L,R}ROTATE_EXPR case. * doc/generic.texi ({L,R}ROTATE_EXPR): Document.
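For readers unfamiliar with rotates, their source-level meaning can be sketched as below. The helper names are hypothetical; whether a given compiler version represents such an idiom as an LROTATE_EXPR or RROTATE_EXPR internally is an assumption and not something this commit states.

```cpp
#include <cstdint>

// Rotate left: bits shifted out at the top re-enter at the bottom. The
// masking keeps both shift counts in [0, 31] so no shift is undefined.
uint32_t rotl32(uint32_t x, unsigned n) {
  n &= 31u;
  return (x << n) | (x >> ((32u - n) & 31u));
}

// Rotate right: the mirror image of rotl32.
uint32_t rotr32(uint32_t x, unsigned n) {
  n &= 31u;
  return (x >> n) | (x << ((32u - n) & 31u));
}
```

Unlike a plain shift, a rotate loses no bits, so rotl32 followed by rotr32 with the same count returns the original value.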
-
Oleg Endo authored
Fix accidentally inverted comparison. gcc/ChangeLog: PR target/101177 * config/sh/sh.md (unnamed split pattern): Fix comparison of find_regno_note result.
-
Richard Biener authored
The following makes sure to rewrite all gather/scatter detected by dataref analysis plus stmts classified as VMAT_GATHER_SCATTER. Maybe we need to rewrite all refs, the following covers the cases I've run into now. * tree-vect-loop.cc (update_epilogue_loop_vinfo): Rewrite both STMT_VINFO_GATHER_SCATTER_P and VMAT_GATHER_SCATTER stmt refs.
-
Richard Biener authored
I went a little bit too simple with implementing SLP gather support for emulated and builtin based gathers. The following fixes the conflict that appears when running into .MASK_LOAD where we rely on vect_get_operand_map and the bolted-on STMT_VINFO_GATHER_SCATTER_P checking wrecks that. The following properly integrates this with vect_get_operand_map, adding another special index referring to the vect_check_gather_scatter analyzed offset. This unbreaks aarch64 (and hopefully riscv); I'll follow up with more fixes and testsuite coverage for x86, where I think I got masked gather SLP support wrong. * tree-vect-slp.cc (off_map, off_op0_map, off_arg2_map, off_arg3_arg2_map): New. (vect_get_operand_map): Get flag whether the stmt was recognized as gather or scatter and use the above accordingly. (vect_get_and_check_slp_defs): Adjust. (vect_build_slp_tree_2): Likewise.
-
Tobias Burnus authored
The omp_lock_hint_* parameters were deprecated in favor of omp_sync_hint_*. While omp.h contained deprecation markers for those, the omp_lib module only contained them for omp_{g,s}_nested. Note: The -Wdeprecated-declarations warning will only become active once openmp_version / _OPENMP is bumped from 201511 (4.5) to 201811 (5.0). libgomp/ChangeLog: * omp_lib.f90.in: Tag omp_lock_hint_* as being deprecated when _OPENMP >= 201811.
-
Juzhe-Zhong authored
1. Remove "m_" prefix as they are not private members. 2. Rename infos -> local_infos, info -> global_info to clarify their meaning. Pushed as it is obvious. gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (pre_vsetvl::fuse_local_vsetvl_info): Rename variables. (pre_vsetvl::pre_global_vsetvl_info): Ditto. (pre_vsetvl::emit_vsetvl): Ditto.
-
Tamar Christina authored
With the patch enabling the vectorization of early-breaks, we'd like to allow bitfield lowering in such loops, which requires relaxing the single-exit restriction to allow multiple exits when doing so. In order to avoid a similar issue to PR107275, the code that rejects loops with certain types of gimple_stmts was hoisted from 'if_convertible_loop_p_1' to 'get_loop_body_in_if_conv_order', to avoid trying to lower bitfields in loops we are not going to vectorize anyway. This also ensures 'ifcvt_local_dce' doesn't accidentally remove statements it shouldn't as it will never come across them. I made sure to add a comment to make clear that there is a direct connection between the two and if we were to enable vectorization of any other gimple statement we should make sure both handle it. gcc/ChangeLog: * tree-if-conv.cc (if_convertible_loop_p_1): Move check from here ... (get_loop_body_in_if_conv_order): ... to here. (if_convertible_loop_p): Remove single_exit check. (tree_if_conversion): Move single_exit check to if-conversion part and support multiple exits. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-bitfield-read-1-not.c: New test. * gcc.dg/vect/vect-bitfield-read-2-not.c: New test. * gcc.dg/vect/vect-bitfield-read-8.c: New test. * gcc.dg/vect/vect-bitfield-read-9.c: New test. Co-Authored-By:
Andre Vieira <andre.simoesdiasvieira@arm.com>
-
Tamar Christina authored
The bitfield vectorization support does not currently recognize bitfields inside gconds. This means they can't be used as conditions for early break vectorization which is a functionality we require. This adds support for them by explicitly matching and handling gcond as a source. Testcases are added in the testsuite update patch as the only way to get there is with the early break vectorization. See tests: - vect-early-break_20.c - vect-early-break_21.c gcc/ChangeLog: * tree-vect-patterns.cc (vect_init_pattern_stmt): Copy STMT_VINFO_TYPE from original statement. (vect_recog_bitfield_ref_pattern): Support bitfields in gcond. Co-Authored-By:
Andre Vieira <andre.simoesdiasvieira@arm.com>
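The kind of loop this is about, a bitfield read feeding the condition of an early exit, looks roughly like the following. This is a hypothetical example written for illustration and is not one of the vect-early-break tests; whether it is actually vectorized depends on the target and compiler version.

```cpp
// Element type with bitfield members.
struct Item {
  unsigned flag : 1;
  unsigned value : 7;
};

// The bitfield read 'items[i].flag' is used directly as the branch
// condition of the early exit, i.e. it appears inside a gcond.
int find_first_flagged(const Item* items, int n) {
  for (int i = 0; i < n; ++i) {
    if (items[i].flag)
      return i;
  }
  return -1;
}
```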
-
Hu, Lin1 authored
This patch fixes some scan-asm failures in pr89229-{5,6,7}b.c, which occur because we emit scalar vmov{s,d} when trying to use x/ymm 16+ without avx512vl but with avx512f+evex512. If there are no objections to this change in behavior, we intend to resolve these failures by modifying the testcases. gcc/testsuite/ChangeLog: * gcc.target/i386/pr89229-5b.c: Modify test. * gcc.target/i386/pr89229-6b.c: Ditto. * gcc.target/i386/pr89229-7b.c: Ditto.
-
Juzhe-Zhong authored
Confirm dynamic LMUL algorithm works well for choosing LMUL = 4 for the PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848 But it generate horrible register spillings. The root cause is that we didn't hoist the vmv.v.x outside the loop which increase the SLP loop register pressure. So, change the COSNT_VECTOR move into vec_duplicate splitter that we can gain better optimizations: 1. better LICM. 2. More opportunities of transforming 'vv' into 'vx' in the future. Before this patch: f3: ble a4,zero,.L8 csrr t0,vlenb slli t1,t0,4 csrr a6,vlenb sub sp,sp,t1 csrr a5,vlenb slli a6,a6,3 slli a5,a5,2 add a6,a6,sp vsetvli a7,zero,e16,m8,ta,ma slli a4,a4,3 vid.v v8 addi t6,a5,-1 vand.vi v8,v8,-2 neg t5,a5 vs8r.v v8,0(sp) vadd.vi v8,v8,1 vs8r.v v8,0(a6) j .L4 .L12: vsetvli a7,zero,e16,m8,ta,ma .L4: csrr t0,vlenb slli t0,t0,3 vl8re16.v v16,0(sp) add t0,t0,sp vmv.v.x v8,t6 mv t1,a4 vand.vv v24,v16,v8 mv a6,a4 vl8re16.v v16,0(t0) vand.vv v8,v16,v8 bleu a4,a5,.L3 mv a6,a5 .L3: vsetvli zero,a6,e8,m4,ta,ma vle8.v v20,0(a2) vle8.v v16,0(a3) vsetvli a7,zero,e8,m4,ta,ma vrgatherei16.vv v4,v20,v24 vadd.vv v4,v16,v4 vsetvli zero,a6,e8,m4,ta,ma vse8.v v4,0(a0) vle8.v v20,0(a2) vsetvli a7,zero,e8,m4,ta,ma vrgatherei16.vv v4,v20,v8 vadd.vv v4,v4,v16 vsetvli zero,a6,e8,m4,ta,ma vse8.v v4,0(a1) add a4,a4,t5 add a0,a0,a5 add a3,a3,a5 add a1,a1,a5 add a2,a2,a5 bgtu t1,a5,.L12 csrr t0,vlenb slli t1,t0,4 add sp,sp,t1 jr ra .L8: ret After this patch: f3: ble a4,zero,.L6 csrr a6,vlenb csrr a5,vlenb slli a6,a6,2 slli a5,a5,2 addi a6,a6,-1 slli a4,a4,3 neg t5,a5 vsetvli t1,zero,e16,m8,ta,ma vmv.v.x v24,a6 vid.v v8 vand.vi v8,v8,-2 vadd.vi v16,v8,1 vand.vv v8,v8,v24 vand.vv v16,v16,v24 .L4: mv t1,a4 mv a6,a4 bleu a4,a5,.L3 mv a6,a5 .L3: vsetvli zero,a6,e8,m4,ta,ma vle8.v v28,0(a2) vle8.v v24,0(a3) vsetvli a7,zero,e8,m4,ta,ma vrgatherei16.vv v4,v28,v8 vadd.vv v4,v24,v4 vsetvli zero,a6,e8,m4,ta,ma vse8.v v4,0(a0) vle8.v v28,0(a2) vsetvli a7,zero,e8,m4,ta,ma vrgatherei16.vv v4,v28,v16 vadd.vv 
v4,v4,v24 vsetvli zero,a6,e8,m4,ta,ma vse8.v v4,0(a1) add a4,a4,t5 add a0,a0,a5 add a3,a3,a5 add a1,a1,a5 add a2,a2,a5 bgtu t1,a5,.L4 .L6: ret Note that this patch triggers multiple FAILs: FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test They all fail because of bugs in the VSETVL pass: 10dd4: 0c707057 vsetvli zero,zero,e8,mf2,ta,ma 10dd8: 5e06b8d7 vmv.v.i v17,13 10ddc: 9ed030d7 vmv1r.v v1,v13 10de0: b21040d7 vncvt.x.x.w v1,v1 ----> raises an illegal instruction exception since there is no SEW = 8 -> SEW = 4 narrowing. 10de4: 5e0785d7 vmv.v.v v11,v15 Confirmed that the recent VSETVL refactor patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633231.html fixes all of them. So this patch should be committed after the VSETVL refactor patch. PR target/111848 gcc/ChangeLog: * config/riscv/riscv-selftests.cc (run_const_vector_selftests): Adapt selftest.
* config/riscv/riscv-v.cc (expand_const_vector): Change it into vec_duplicate splitter. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Adapt test. * gcc.dg/vect/costmodel/riscv/rvv/pr111848.c: New test.
-
Lehua Ding authored
This patch refactors and cleans up the vsetvl pass in order to make the code easier to modify and understand. This patch does several things: 1. Introduce a virtual CFG for vsetvl infos; Phases 1, 2 and 3 only maintain and modify this virtual CFG. Phase 4 performs insertion, modification and deletion of vsetvl insns based on the virtual CFG. The basic block in the virtual CFG is called vsetvl_block_info and the vsetvl information inside is called vsetvl_info. 2. Combine Phase 1 and 2 into a single Phase 1 and unify the demand system; this phase only fuses local vsetvl infos in the forward direction. 3. Refactor Phase 3, changing the logic for determining whether to uplift a vsetvl info to a pred basic block to a more unified method: there is a vsetvl info among the vsetvl definitions reaching in that is compatible with it. 4. Place all modification operations on the RTL in Phase 4 and Phase 5. Phase 4 is responsible for inserting, modifying and deleting vsetvl instructions based on fully optimized vsetvl infos. Phase 5 removes the avl operand from the RVV instruction and removes the unused dest operand register from the vsetvl insns. These modifications resulted in some testcases needing to be updated. The reasons for updating are summarized below: 1. more optimized vlmax_back_prop-{25,26}.c vlmax_conflict-{3,12}.c/vsetvl-{13,23}.c/vsetvl-23.c/ avl_single-{23,84,95}.c/pr109773-1.c 2. less unnecessary fusion avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c 3. local fuse direction (backward -> forward) scalar_move-1.c 4. add some bugfix testcases. pr111037-{3,4}.c/pr111037-4.c avl_single-{89,104,105,106,107,108,109}.c PR target/111037 PR target/111234 PR target/111725 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry): New. (debug): Removed. (compute_reaching_defintion): New. (enum vsetvl_type): Moved. (vlmax_avl_p): Moved. (enum emit_type): Moved. (vlmul_to_str): Moved. (vlmax_avl_insn_p): Removed. (policy_to_str): Moved.
(loop_basic_block_p): Removed. (valid_sew_p): Removed. (vsetvl_insn_p): Moved. (vsetvl_vtype_change_only_p): Removed. (after_or_same_p): Removed. (before_p): Removed. (anticipatable_occurrence_p): Removed. (available_occurrence_p): Removed. (insn_should_be_added_p): Removed. (get_all_sets): Moved. (get_same_bb_set): Moved. (gen_vsetvl_pat): Removed. (calculate_vlmul): Moved. (get_max_int_sew): New. (emit_vsetvl_insn): Removed. (get_max_float_sew): New. (eliminate_insn): Removed. (insert_vsetvl): Removed. (count_regno_occurrences): Moved. (get_vl_vtype_info): Removed. (enum def_type): Moved. (validate_change_or_fail): Moved. (change_insn): Removed. (get_all_real_uses): Moved. (get_forward_read_vl_insn): Removed. (get_backward_fault_first_load_insn): Removed. (change_vsetvl_insn): Removed. (avl_source_has_vsetvl_p): Removed. (source_equal_p): Moved. (calculate_sew): Removed. (same_equiv_note_p): Moved. (get_expr_id): New. (incompatible_avl_p): Removed. (get_regno): New. (different_sew_p): Removed. (get_bb_index): New. (different_lmul_p): Removed. (has_no_uses): Moved. (different_ratio_p): Removed. (different_tail_policy_p): Removed. (different_mask_policy_p): Removed. (possible_zero_avl_p): Removed. (enum demand_flags): New. (second_ratio_invalid_for_first_sew_p): Removed. (second_ratio_invalid_for_first_lmul_p): Removed. (enum class): New. (float_insn_valid_sew_p): Removed. (second_sew_less_than_first_sew_p): Removed. (first_sew_less_than_second_sew_p): Removed. (class vsetvl_info): New. (compare_lmul): Removed. (second_lmul_less_than_first_lmul_p): Removed. (second_ratio_less_than_first_ratio_p): Removed. (DEF_INCOMPATIBLE_COND): Removed. (greatest_sew): Removed. (first_sew): Removed. (second_sew): Removed. (first_vlmul): Removed. (second_vlmul): Removed. (first_ratio): Removed. (second_ratio): Removed. (vlmul_for_first_sew_second_ratio): Removed. (vlmul_for_greatest_sew_second_ratio): Removed. (ratio_for_second_sew_first_vlmul): Removed. 
(class vsetvl_block_info): New. (DEF_SEW_LMUL_FUSE_RULE): New. (always_unavailable): Removed. (avl_unavailable_p): Removed. (class demand_system): New. (sew_unavailable_p): Removed. (lmul_unavailable_p): Removed. (ge_sew_unavailable_p): Removed. (ge_sew_lmul_unavailable_p): Removed. (ge_sew_ratio_unavailable_p): Removed. (DEF_UNAVAILABLE_COND): Removed. (same_sew_lmul_demand_p): Removed. (propagate_avl_across_demands_p): Removed. (reg_available_p): Removed. (support_relaxed_compatible_p): Removed. (demands_can_be_fused_p): Removed. (earliest_pred_can_be_fused_p): Removed. (vsetvl_dominated_by_p): Removed. (avl_info::avl_info): Removed. (avl_info::single_source_equal_p): Removed. (avl_info::multiple_source_equal_p): Removed. (DEF_SEW_LMUL_RULE): New. (avl_info::operator=): Removed. (avl_info::operator==): Removed. (DEF_POLICY_RULE): New. (avl_info::operator!=): Removed. (avl_info::has_non_zero_avl): Removed. (vl_vtype_info::vl_vtype_info): Removed. (vl_vtype_info::operator==): Removed. (DEF_AVL_RULE): New. (vl_vtype_info::operator!=): Removed. (vl_vtype_info::same_avl_p): Removed. (vl_vtype_info::same_vtype_p): Removed. (vl_vtype_info::same_vlmax_p): Removed. (vector_insn_info::operator>=): Removed. (vector_insn_info::operator==): Removed. (class pre_vsetvl): New. (vector_insn_info::parse_insn): Removed. (vector_insn_info::compatible_p): Removed. (vector_insn_info::skip_avl_compatible_p): Removed. (vector_insn_info::compatible_avl_p): Removed. (vector_insn_info::compatible_vtype_p): Removed. (vector_insn_info::available_p): Removed. (vector_insn_info::fuse_avl): Removed. (vector_insn_info::fuse_sew_lmul): Removed. (vector_insn_info::fuse_tail_policy): Removed. (vector_insn_info::fuse_mask_policy): Removed. (vector_insn_info::local_merge): Removed. (vector_insn_info::global_merge): Removed. (vector_insn_info::get_avl_or_vl_reg): Removed. (vector_insn_info::update_fault_first_load_avl): Removed. (vector_insn_info::dump): Removed. 
(vector_infos_manager::vector_infos_manager): Removed. (vector_infos_manager::create_expr): Removed. (vector_infos_manager::get_expr_id): Removed. (vector_infos_manager::all_same_ratio_p): Removed. (vector_infos_manager::all_avail_in_compatible_p): Removed. (vector_infos_manager::all_same_avl_p): Removed. (vector_infos_manager::expr_set_num): Removed. (vector_infos_manager::release): Removed. (vector_infos_manager::create_bitmap_vectors): Removed. (vector_infos_manager::free_bitmap_vectors): Removed. (vector_infos_manager::dump): Removed. (class pass_vsetvl): Adjust. (pass_vsetvl::get_vector_info): Removed. (pass_vsetvl::get_block_info): Removed. (pass_vsetvl::update_vector_info): Removed. (pass_vsetvl::update_block_info): Removed. (pre_vsetvl::compute_avl_def_data): New. (pass_vsetvl::simple_vsetvl): Removed. (pass_vsetvl::compute_local_backward_infos): Removed. (pass_vsetvl::need_vsetvl): Removed. (pass_vsetvl::transfer_before): Removed. (pass_vsetvl::transfer_after): Removed. (pre_vsetvl::compute_vsetvl_def_data): New. (pass_vsetvl::emit_local_forward_vsetvls): Removed. (pass_vsetvl::prune_expressions): Removed. (pass_vsetvl::compute_local_properties): Removed. (pre_vsetvl::compute_lcm_local_properties): New. (pass_vsetvl::earliest_fusion): Removed. (pre_vsetvl::fuse_local_vsetvl_info): New. (pass_vsetvl::vsetvl_fusion): Removed. (pass_vsetvl::can_refine_vsetvl_p): Removed. (pre_vsetvl::earliest_fuse_vsetvl_info): New. (pass_vsetvl::refine_vsetvls): Removed. (pass_vsetvl::cleanup_vsetvls): Removed. (pass_vsetvl::commit_vsetvls): Removed. (pass_vsetvl::pre_vsetvl): Removed. (pass_vsetvl::get_vsetvl_at_end): Removed. (local_avl_compatible_p): Removed. (pass_vsetvl::local_eliminate_vsetvl_insn): Removed. (pre_vsetvl::pre_global_vsetvl_info): New. (get_first_vsetvl_before_rvv_insns): Removed. (pass_vsetvl::global_eliminate_vsetvl_insn): Removed. (pre_vsetvl::emit_vsetvl): New. (pass_vsetvl::ssa_post_optimization): Removed. (pre_vsetvl::cleaup): New. 
(pre_vsetvl::remove_avl_operand): New. (pass_vsetvl::df_post_optimization): Removed. (pre_vsetvl::remove_unused_dest_operand): New. (pass_vsetvl::init): Removed. (pass_vsetvl::done): Removed. (pass_vsetvl::compute_probabilities): Removed. (pass_vsetvl::lazy_vsetvl): Adjust. (pass_vsetvl::execute): Adjust. * config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed. (DEF_SEW_LMUL_RULE): New. (DEF_SEW_LMUL_FUSE_RULE): Removed. (DEF_POLICY_RULE): New. (DEF_UNAVAILABLE_COND): Removed (DEF_AVL_RULE): New demand type. (sew_lmul): New demand type. (ratio_only): New demand type. (sew_only): New demand type. (ge_sew): New demand type. (ratio_and_ge_sew): New demand type. (tail_mask_policy): New demand type. (tail_policy_only): New demand type. (mask_policy_only): New demand type. (ignore_policy): New demand type. (avl): New demand type. (non_zero_avl): New demand type. (ignore_avl): New demand type. * config/riscv/t-riscv: Removed riscv-vsetvl.h * config/riscv/riscv-vsetvl.h: Removed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust. * gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust. * gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust. * gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust. * gcc.target/riscv/rvv/base/pr111037-1.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here. * gcc.target/riscv/rvv/base/pr111037-2.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust. 
* gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-106.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-107.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-108.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-109.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test.
-
Alexandre Oliva authored
The need to initialize edge probabilities has made make_eh_edges undesirably hard to use. I suppose we don't want make_eh_edges to initialize the probability of the newly-added edge itself, so that the caller takes care of it, but identifying the added edge in need of adjustments is inefficient and cumbersome. Change make_eh_edges so that it returns the added edge. for gcc/ChangeLog * tree-eh.cc (make_eh_edges): Return the new edge. * tree-eh.h (make_eh_edges): Likewise.
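To illustrate the calling pattern this change enables, here is a minimal sketch using toy types. `Edge`, `Cfg` and `add_eh_edge` are hypothetical stand-ins invented for this example, not GCC's real CFG types or the actual make_eh_edges signature; the point is only that returning the new edge lets the caller set its probability without searching for it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-ins for a CFG; these are NOT GCC's real data structures.
struct Edge {
    int src;
    int dest;
    double probability = -1.0;  // negative means "not yet initialized"
};

struct Cfg {
    std::vector<std::unique_ptr<Edge>> edges;
};

// Hypothetical analogue of an edge-creating helper that returns the edge it
// added, leaving the probability for the caller to initialize.
Edge* add_eh_edge(Cfg& cfg, int src, int landing_pad) {
    cfg.edges.push_back(std::make_unique<Edge>(Edge{src, landing_pad}));
    return cfg.edges.back().get();
}
```

With this shape a caller can write `Edge* e = add_eh_edge(cfg, 2, 7); e->probability = 0.0;` instead of scanning the edge list to find the edge that was just added.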
-
Nathaniel Shead authored
This patch adds checks for attempting to change the active member of a union by methods other than a member access expression. To be able to properly distinguish `*(&u.a) = ` from `u.a = `, this patch redoes the solution for c++/59950 to avoid extraneous *&; it seems that the only case that needed the workaround was when copying empty classes. This patch also ensures that constructors for a union field mark that field as the active member before entering the call itself; this ensures that modifications of the field within the constructor's body don't cause false positives (as these will not appear to be member access expressions). This means that we no longer need to start the lifetime of empty union members after the constructor body completes. As a drive-by fix, this patch also ensures that value-initialised unions are considered to have activated their initial member for the purpose of checking stores and accesses, which catches some additional mistakes pre-C++20. PR c++/101631 PR c++/102286 gcc/cp/ChangeLog: * call.cc (build_over_call): Fold more indirect refs for trivial assignment op. * class.cc (type_has_non_deleted_trivial_default_ctor): Create. * constexpr.cc (cxx_eval_call_expression): Start lifetime of union member before entering constructor. (cxx_eval_component_reference): Check against first member of value-initialised union. (cxx_eval_store_expression): Activate member for value-initialised union. Check for accessing inactive union member indirectly. * cp-tree.h (type_has_non_deleted_trivial_default_ctor): Forward declare. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/constexpr-89336-3.C: Fix union initialisation. * g++.dg/cpp1y/constexpr-union6.C: New test. * g++.dg/cpp1y/constexpr-union7.C: New test. * g++.dg/cpp2a/constexpr-union2.C: New test. * g++.dg/cpp2a/constexpr-union3.C: New test. * g++.dg/cpp2a/constexpr-union4.C: New test. * g++.dg/cpp2a/constexpr-union5.C: New test. * g++.dg/cpp2a/constexpr-union6.C: New test. Signed-off-by:
Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by:
Jason Merrill <jason@redhat.com>
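As a small illustration of the kind of active-member checking discussed in the commit above, the sketch below uses a union initialised with empty braces, which makes its first member active. The union `U` and the function name are invented for this example and are not taken from the new tests; the commented-out lines describe accesses that constant evaluation is expected to reject, and are left as comments so the block compiles.

```cpp
#include <cassert>

union U {
    int a;
    float b;
};

// Empty-brace initialization makes the first member (a) active and zero,
// so reading it during constant evaluation is fine.
constexpr int read_first_member() {
    U u{};
    return u.a;
    // return u.b;     // would read an inactive member: not a constant expression
    // *(&u.b) = 1.0f; // indirect store to an inactive member; the commit adds
    //                 // checks for this kind of active-member change
}

static_assert(read_first_member() == 0, "first member is active and zero");
```

The commit extends comparable first-member treatment to value-initialised unions, so that stores and accesses to the other members are checked there as well.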
-
Nathaniel Shead authored
This patch improves the errors given when casting from void* in C++26 to include the expected type if the types of the pointed-to objects were not similar. It also ensures (for all standard modes) that void* casts are checked even for DECL_ARTIFICIAL declarations, such as lifetime-extended temporaries, and are only ignored in cases where we know it's OK (e.g. source_location::current) or have no other choice (heap-allocated data). gcc/cp/ChangeLog: * constexpr.cc (is_std_source_location_current): New. (cxx_eval_constant_expression): Only ignore cast from void* for specific cases and improve other diagnostics. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-cast4.C: New test. Signed-off-by:
Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by:
Marek Polacek <polacek@redhat.com> Reviewed-by:
Jason Merrill <jason@redhat.com>
-
GCC Administrator authored
-
- Oct 19, 2023
-
-
Marek Polacek authored
This patch is an optimization tweak for cp_fold_r. If we cp_fold_r the COND_EXPR's op0 first, we may be able to evaluate it to a constant if -O. cp_fold has: if (callee && DECL_DECLARED_CONSTEXPR_P (callee) && !flag_no_inline) ... r = maybe_constant_value (x, /*decl=*/NULL_TREE, flag_no_inline is 1 for -O0: if (opts->x_optimize == 0) { /* Inlining does not work if not optimizing, so force it not to be done. */ opts->x_warn_inline = 0; opts->x_flag_no_inline = 1; } but otherwise it's 0 and cp_fold will call maybe_constant_value on calls to constexpr functions. And if it doesn't, then folding the COND_EXPR will keep both arms, and we can avoid calling maybe_constant_value. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_fold_r): Don't call maybe_constant_value.
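The idea of folding a conditional's test first can be sketched with a toy expression tree. Everything here (`Expr`, `make_leaf`, `make_cond`, `fold`) is a hypothetical model written for illustration and does not mirror cp_fold_r or GCC's tree representation: if the test folds to a constant only the selected arm is visited, otherwise both arms are kept and visited.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <utility>

// Toy expression tree; not GCC's tree representation.
struct Expr {
    enum Kind { Const, Opaque, Cond };
    Kind kind = Const;
    int value = 0;                        // used by Const
    std::unique_ptr<Expr> op0, op1, op2;  // used by Cond: test, then, else
};

std::unique_ptr<Expr> make_leaf(Expr::Kind kind, int value = 0) {
    auto e = std::make_unique<Expr>();
    e->kind = kind;
    e->value = value;
    return e;
}

std::unique_ptr<Expr> make_cond(std::unique_ptr<Expr> test,
                                std::unique_ptr<Expr> then_arm,
                                std::unique_ptr<Expr> else_arm) {
    auto e = make_leaf(Expr::Cond);
    e->op0 = std::move(test);
    e->op1 = std::move(then_arm);
    e->op2 = std::move(else_arm);
    return e;
}

// Folds e, counting visited nodes. For a Cond the test is folded first;
// a constant test means only the chosen arm needs to be walked.
std::optional<int> fold(const Expr& e, int& visited) {
    ++visited;
    switch (e.kind) {
    case Expr::Const:
        return e.value;
    case Expr::Opaque:
        return std::nullopt;
    case Expr::Cond: {
        std::optional<int> test = fold(*e.op0, visited);
        if (test)
            return fold(*test ? *e.op1 : *e.op2, visited);
        fold(*e.op1, visited);  // test unknown: keep and walk both arms
        fold(*e.op2, visited);
        return std::nullopt;
    }
    }
    return std::nullopt;
}
```

In this model, a conditional whose test is the constant 1 visits three nodes (the conditional, its test, and the taken arm), while one with an unknown test visits all four.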
-
Marek Polacek authored
I noticed that Patrick is missing here. gcc/ChangeLog: * doc/contrib.texi: Add entry for Patrick Palka.
-
Andre Vieira authored
This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function compatible with mask parameters in clone. * tree-vect-stmts.cc (vect_build_all_ones_mask): Allow vector boolean typed masks. (vectorizable_simd_clone_call): Enable the use of masked clones in fully masked loops.
-
Andre Vieira authored
When analyzing a loop and choosing a simdclone to use it is possible to choose a simdclone that cannot be used 'inbranch' for a loop that can use partial vectors. This may lead to the vectorizer deciding to use partial vectors which are not supported for notinbranch simd clones. This patch fixes that by disabling the use of partial vectors once a notinbranch simd clone has been selected. gcc/ChangeLog: PR tree-optimization/110485 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Disable partial vectors usage if a notinbranch simdclone has been selected. gcc/testsuite/ChangeLog: * gcc.dg/gomp/pr110485.c: New test.
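For readers unfamiliar with the terminology, the source-level situation looks roughly like the sketch below. The function names are made up for this example and it is not the pr110485.c test; without -fopenmp or -fopenmp-simd the pragma is simply ignored and the code runs as ordinary scalar C++.

```cpp
#include <cassert>
#include <cstddef>

// notinbranch promises that the function is never called from inside a
// conditional in a SIMD loop, so only unmasked vector clones are created.
#pragma omp declare simd notinbranch
float scale(float x) { return 2.0f * x; }

// A loop the vectorizer may turn into calls to a simd clone of scale().
// On targets with partial vectors, the fix described above stops the
// vectorizer from masking the loop once such a clone has been chosen.
void scale_all(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = scale(in[i]);
}
```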
-
Andre Vieira authored
The vect_get_smallest_scalar_type helper function was using any argument to a simd clone call when trying to determine the smallest scalar type that would be vectorized. This included the function pointer type in a MASK_CALL for instance, and would result in the wrong type being selected. Instead this patch special cases simd clone calls and uses only scalar types of the original function that get transformed into vector types. gcc/ChangeLog: * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Special case simd clone calls and only use types that are mapped to vectors. (simd_clone_call_p): New helper function. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-16f.c: Remove unnecessary differentiation between targets with different pointer sizes. * gcc.dg/vect/vect-simd-clone-17f.c: Likewise. * gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
-
Andre Vieira authored
Teach parloops how to handle a poly NIT and bound ahead of the changes to enable non-constant simdlen. gcc/ChangeLog: * tree-parloops.cc (try_transform_to_exit_first_loop_alt): Accept poly NIT and ALT_BOUND.
-
Andre Vieira authored
SVE simd clones need to be compiled with an SVE target enabled or the argument types will not be created properly. To achieve this we need to copy DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the clones. I decided it was probably also a good idea to copy DECL_FUNCTION_SPECIFIC_OPTIMIZATION in case the original function is meant to be compiled with specific optimization options. gcc/ChangeLog: * tree-parloops.cc (create_loop_fn): Copy specific target and optimization options to clone.
-
Andre Vieira authored
Refactor simd clone handling code ahead of support for poly simdlen. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_subparts): Remove. (simd_clone_init_simd_arrays): Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS. (ipa_simd_modify_function_body): Likewise. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Likewise. (simd_clone_subparts): Remove.
-
François Dumont authored
On merge, reuse a merged node's possibly cached hash code only if we are on the same type of hash and this hash is stateless. Usage of function pointers or std::function as hash functor will prevent reusing cached hash code. libstdc++-v3/ChangeLog * include/bits/hashtable_policy.h (_Hash_code_base::_M_hash_code(const _Hash&, const _Hash_node_value<>&)): Remove. (_Hash_code_base::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): Remove. * include/bits/hashtable.h (_M_src_hash_code<_H2>(const _H2&, const key_type&, const __node_value_type&)): New. (_M_merge_unique<>, _M_merge_multi<>): Use latter. * testsuite/23_containers/unordered_map/modifiers/merge.cc (test04, test05, test06): New test cases.
-
Andrew Pinski authored
In the case of convert_argument, we would return the same expression back rather than error_mark_node after the error message about trying to convert to an incomplete type. This causes issues in the gimplifier trying to see if another conversion is needed. The code here dates back to before the revision history too, so it might be the case that nobody ever noticed we should return an error_mark_node. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR c/100532 gcc/c/ChangeLog: * c-typeck.cc (convert_argument): After erroring out about an incomplete type return error_mark_node. gcc/testsuite/ChangeLog: * gcc.dg/pr100532-1.c: New test.
-
Andrew Pinski authored
In the same way that we don't warn about converting a NULL pointer constant to a different named address space, we should not warn about converting it to a different scalar storage order (sso) either. This adds the simple check. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR c/104822 gcc/c/ChangeLog: * c-typeck.cc (convert_for_assignment): Check for null pointer before warning about an incompatible scalar storage order. gcc/testsuite/ChangeLog: * gcc.dg/sso-18.c: New test. * gcc.dg/sso-19.c: New test.
-