- Jan 14, 2025
-
-
anetczuk authored
Raw dump of lang tree was missing information about virtual method call. The information is provided in "tok" field of obj_type_ref. gcc/ChangeLog: * tree-dump.cc (dequeue_and_dump): Handle OBJ_TYPE_REF. gcc/testsuite/ChangeLog: * g++.dg/diagnostic/lang-dump-1.C: New test.
-
Iain Buclaw authored
D front-end changes: - Import latest fixes from dmd v2.110.0-rc.1. D runtime changes: - Import latest fixes from druntime v2.110.0-rc.1. Phobos changes: - Import latest fixes from phobos v2.110.0-rc.1. Included in the merge are fixes for the following PRs: PR d/118438 PR d/118448 PR d/118449 gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd d6f693b46a. * d-incpath.cc (add_import_paths): Update for new front-end interface. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime d6f693b46a. * src/MERGE: Merge upstream phobos 336bed6d8. * testsuite/libphobos.init_fini/custom_gc.d: Adjust test.
-
Alexandre Oliva authored
Arrange for decode_field_reference to use local variables throughout, to modify the out parms only when we're about to return non-NULL, and to drop the unused case of NULL pand_mask, which had a latent failure to detect signbit masking. for gcc/ChangeLog * gimple-fold.cc (decode_field_reference): Robustify to set out parms only when returning non-NULL. (fold_truth_andor_for_ifcombine): Bail if decode_field_reference returns NULL. Add complementary assert on r_const's not being set when l_const isn't.
-
Marek Polacek authored
In c++/102990 we had a problem where massage_init_elt got {}, digest_nsdmi_init turned that {} into { .value = (int) 1.0e+0 }, and we crashed in the call to fold_non_dependent_init because a FIX_TRUNC_EXPR/FLOAT_EXPR got into tsubst*. So we avoided calling fold_non_dependent_init for a CONSTRUCTOR. But that broke the following test, where we no longer fold the CONST_DECL in { .type = ZERO } to { .type = 0 } and then process_init_constructor_array does: if (next != error_mark_node && (initializer_constant_valid_p (next, TREE_TYPE (next)) != null_pointer_node)) { /* Use VEC_INIT_EXPR for non-constant initialization of trailing elements with no explicit initializers. */ picflags |= PICFLAG_VEC_INIT; because { .type = ZERO } isn't initializer_constant_valid_p. Then we create a VEC_INIT_EXPR and say we can't convert the argument. So we have to fold the elements of the CONSTRUCTOR. We just can't instantiate the elements in a template. This also fixes c++/118047. PR c++/118047 PR c++/118355 gcc/cp/ChangeLog: * typeck2.cc (massage_init_elt): Call fold_non_dependent_init unless for a CONSTRUCTOR in a template. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/nsdmi-list10.C: New test. * g++.dg/cpp0x/nsdmi-list9.C: New test. Reviewed-by:
Jason Merrill <jason@redhat.com>
-
Sandra Loosemore authored
After reimplementing late resolution of "declare variant", the declare_variant_alt and calls_declare_variant_alt flags on struct cgraph_node are no longer used by anything. For the purposes of marking functions that need late resolution, the has_omp_variant_constructs flag has replaced calls_declare_variant_alt. Likewise struct omp_declare_variant_entry, struct omp_declare_variant_base_entry, and the hash tables used to store these structures are no longer needed, since the information needed for late resolution is now stored in the gomp_variant_construct nodes. In addition, some obsolete code that was temporarily ifdef'ed out instead of deleted in order to produce a more readable patch for the previous installment of this series is now removed entirely. There are no functional changes in this patch, just removing dead code. gcc/ChangeLog * cgraph.cc (symbol_table::create_edge): Don't set calls_declare_variant_alt in the caller. * cgraph.h (struct cgraph_node): Remove declare_variant_alt and calls_declare_variant_alt flags. * cgraphclones.cc (cgraph_node::create_clone): Don't copy calls_declare_variant_alt bit. * gimplify.cc: Remove previously #ifdef-ed out code. * ipa-free-lang-data.cc (free_lang_data_in_decl): Adjust code referencing declare_variant_alt bit. * ipa.cc (symbol_table::remove_unreachable_nodes): Likewise. * lto-cgraph.cc (lto_output_node): Remove references to deleted bits. (output_refs): Adjust code referencing declare_variant_alt bit. (input_overwrite_node): Remove references to deleted bits. (input_refs): Adjust code referencing declare_variant_alt bit. * lto-streamer-out.cc (lto_output): Likewise. * lto-streamer.h (omp_lto_output_declare_variant_alt): Delete. (omp_lto_input_declare_variant_alt): Delete. * omp-expand.cc (expand_omp_target): Use has_omp_variant_constructs bit to trigger pass_omp_device_lower instead of calls_declare_variant_alt. * omp-general.cc (struct omp_declare_variant_entry): Delete. 
(struct omp_declare_variant_base_entry): Delete. (struct omp_declare_variant_hasher): Delete. (omp_declare_variant_hasher::hash): Delete. (omp_declare_variant_hasher::equal): Delete. (omp_declare_variants): Delete. (omp_declare_variant_alt_hasher): Delete. (omp_declare_variant_alt_hasher::hash): Delete. (omp_declare_variant_alt_hasher::equal): Delete. (omp_declare_variant_alt): Delete. (omp_lto_output_declare_variant_alt): Delete. (omp_lto_input_declare_variant_alt): Delete. (includes): Delete unnecessary include of gt-omp-general.h. * omp-offload.cc (execute_omp_device_lower): Remove references to deleted bit. (pass_omp_device_lower::gate): Likewise. * omp-simd-clone.cc (simd_clone_create): Likewise. * passes.cc (ipa_write_summaries): Likewise. * symtab.cc (symtab_node::get_partitioning_class): Likewise. * tree-inline.cc (expand_call_inline): Likewise. (tree_function_versioning): Likewise. gcc/lto/ChangeLog * lto-partition.cc (lto_balanced_map): Adjust code referencing deleted declare_variant_alt bit.
-
Sandra Loosemore authored
This patch reimplements the middle-end support for "declare variant" and extends the resolution mechanism to also handle metadirectives (PR112779). It also adds partial support for dynamic selectors (PR113904) and fixes a selector scoring bug reported as PR114596. I hope this rewrite also improves the engineering aspect of the code, e.g. more comments to explain what it is doing. In most cases, variant constructs can be resolved either in the front end or during gimplification; if the variant with the highest score has a static selector, then only that one is emitted. In the case where it has a dynamic selector, it is resolved into a (possibly nested) if/then/else construct, testing the run-time predicate for each selector sorted by decreasing order of score until a static selector is found. In some cases, notably a variant construct in a "declare simd" function which may or may not expand into a simd clone, it may not be possible to score or sort the variants until later in compilation (the ompdevlow pass). In this case the gimplifier emits a loop containing a switch statement with the variants in arbitrary order and uses the OMP_NEXT_VARIANT tree node as a placeholder to control which variant is tested on each iteration of the loop. It looks something like: switch_var = OMP_NEXT_VARIANT (0, state); loop_label: switch (switch_var) { case 1: if (dynamic_selector_predicate_1) { alternative_1; goto end_label; } else { switch_var = OMP_NEXT_VARIANT (1, state); goto loop_label; } case 2: ... } end_label: Note that when there are no dynamic selectors, the loop is unnecessary and only the switch is emitted. Finally, in the ompdevlow pass, the OMP_NEXT_VARIANT magic cookies are resolved and replaced with constants. When compiling with -O we can expect that the loop and switch will be discarded by subsequent optimizations and replaced with direct jumps between the cases, eventually arriving at code with similar control flow to the early-resolution cases. 
This approach is somewhat simpler than the one currently used for handling declare variant in that all possible code paths are already included in the output of the gimplifier, so it is not necessary to maintain hidden references or data structures pointing to expansions of not-yet-resolved variant constructs and special logic for passing them through LTO (see PR lto/96680). A possible disadvantage of this expansion strategy is that dead code for unused variants in the switch can remain when compiling without -O. If this turns out to be a critical problem (e.g., an unused case includes calls to functions not available to the linker) perhaps some further processing could be performed by default after ompdevlow to simplify such constructs. In order to make this patch more readable for review purposes, it leaves the existing code for "declare variant" resolution (including the above-mentioned LTO hack) in place, in some cases just ifdef-ing out functions that won't compile due to changed interfaces for dependencies. The next patch in the series will delete all the now-unused code. gcc/ChangeLog PR middle-end/114596 PR middle-end/112779 PR middle-end/113904 * Makefile.in (GTFILES): Move omp-general.h earlier; required because of moving score_wide_int declaration to that file. * cgraph.h (struct cgraph_node): Add has_omp_variant_constructs flag. * cgraphclones.cc (cgraph_node::create_clone): Propagate has_omp_variant_constructs flag. * gimplify.cc (omp_resolved_variant_calls): New. (expand_late_variant_directive): New. (find_supercontext): New. (gimplify_variant_call_expr): New. (gimplify_call_expr): Adjust parameters to make fallback available. Update processing for "declare variant" substitution. (is_gimple_stmt): Add OMP_METADIRECTIVE. (omp_construct_selector_matches): Ifdef out unused function. (omp_get_construct_context): New. (gimplify_omp_dispatch): Replace call to deleted function omp_resolve_declare_variant with equivalent logic. 
(expand_omp_metadirective): New. (expand_late_variant_directive): New. (gimplify_omp_metadirective): New. (gimplify_expr): Adjust arguments to gimplify_call_expr. Add cases for OMP_METADIRECTIVE, OMP_NEXT_VARIANT, and OMP_TARGET_DEVICE_MATCHES. (gimplify_function_tree): Initialize/clean up omp_resolved_variant_calls. * gimplify.h (omp_construct_selector_matches): Delete declaration. (omp_get_construct_context): Declare. * lto-cgraph.cc (lto_output_node): Write has_omp_variant_constructs. (input_overwrite_node): Read has_omp_variant_constructs. * omp-builtins.def (BUILT_IN_OMP_GET_NUM_DEVICES): New. * omp-expand.cc (expand_omp_taskreg): Propagate has_omp_variant_constructs. (expand_omp_target): Likewise. * omp-general.cc (omp_maybe_offloaded): Add construct_context parameter; use it instead of querying gimplifier state. Add comments. (omp_context_name_list_prop): Do not test lang_GNU_Fortran in offload compiler, just use the string as-is. (expr_uses_parm_decl): New. (omp_check_context_selector): Add metadirective_p parameter. Remove sorry for target_device selector. Add additional checks specific to metadirective or declare variant. (make_omp_metadirective_variant): New. (omp_construct_traits_match): New. (omp_context_selector_matches): Temporarily ifdef out the previous code, and add a new implementation based on the old one with different parameters, some unnecessary loops removed, and code re-indented. (omp_target_device_matches_on_host): New. (resolve_omp_target_device_matches): New. (omp_construct_simd_compare): Support matching of "simdlen" and "aligned" clauses. (omp_context_selector_set_compare): Make static. Adjust call to omp_construct_simd_compare. (score_wide_int): Move declaration to omp-general.h. (omp_selector_is_dynamic): New. (omp_device_num_check): New. (omp_dynamic_cond): New. (omp_context_compute_score): Ifdef out the old version and re-implement with different parameters. (omp_complete_construct_context): New. 
(omp_resolve_late_declare_variant): Ifdef out. (omp_declare_variant_remove_hook): Likewise. (omp_resolve_declare_variant): Likewise. (sort_variant): New. (omp_get_dynamic_candidates): New. (omp_declare_variant_candidates): New. (omp_metadirective_candidates): New. (omp_early_resolve_metadirective): New. (omp_resolve_variant_construct): New. * omp-general.h (score_wide_int): Moved here from omp-general.cc. (struct omp_variant): New. (make_omp_metadirective_variant): Declare. (omp_construct_traits_to_codes): Delete declaration. (omp_check_context_selector): Adjust parameters. (omp_context_selector_matches): Likewise. (omp_context_selector_set_compare): Delete declaration. (omp_resolve_declare_variant): Likewise. (omp_declare_variant_candidates): Declare. (omp_metadirective_candidates): Declare. (omp_get_dynamic_candidates): Declare. (omp_early_resolve_metadirective): Declare. (omp_resolve_variant_construct): Declare. (omp_dynamic_cond): Declare. * omp-offload.cc (resolve_omp_variant_cookies): New. (execute_omp_device_lower): Call the above function to resolve variant directives. Remove call to omp_resolve_declare_variant. (pass_omp_device_lower::gate): Check has_omp_variant_construct bit. * omp-simd-clone.cc (simd_clone_create): Propagate has_omp_variant_constructs bit. * tree-inline.cc (expand_call_inline): Likewise. (tree_function_versioning): Likewise. gcc/c/ChangeLog PR middle-end/114596 PR middle-end/112779 PR middle-end/113904 * c-parser.cc (c_finish_omp_declare_variant): Update for changes to omp-general.h interfaces. gcc/cp/ChangeLog PR middle-end/114596 PR middle-end/112779 PR middle-end/113904 * decl.cc (omp_declare_variant_finalize_one): Update for changes to omp-general.h interfaces. * parser.cc (cp_finish_omp_declare_variant): Likewise. gcc/fortran/ChangeLog PR middle-end/114596 PR middle-end/112779 PR middle-end/113904 * trans-openmp.cc (gfc_trans_omp_declare_variant): Update for changes to omp-general.h interfaces. 
gcc/testsuite/ PR middle-end/114596 PR middle-end/112779 PR middle-end/113904 * c-c++-common/gomp/declare-variant-12.c: Adjust expected behavior per PR114596. * c-c++-common/gomp/declare-variant-13.c: Test that this is resolvable after gimplification, not just final resolution. * c-c++-common/gomp/declare-variant-14.c: Tweak testcase to ensure that -O causes dead code to be optimized away. * gfortran.dg/gomp/declare-variant-12.f90: Adjust expected behavior per PR114596. * gfortran.dg/gomp/declare-variant-13.f90: Test that this is resolvable after gimplification, not just final resolution. * gfortran.dg/gomp/declare-variant-14.f90: Tweak testcase to ensure that -O causes dead code to be optimized away. Co-Authored-By:
Kwok Cheung Yeung <kcy@codesourcery.com> Co-Authored-By:
Sandra Loosemore <sandra@codesourcery.com> Co-Authored-By:
Marcel Vollweiler <marcel@codesourcery.com>
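The loop-around-a-switch expansion described in this commit message can be mimicked in plain C. The following is only a sketch of the control flow, not the code GCC emits: `next_variant`, `selector_matches`, and `pick_variant` are hypothetical names, the table stands in for the constants that replace the OMP_NEXT_VARIANT cookies, and the predicates are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the order known once the placeholders are
   resolved: next_variant[i] is the case to try after case i (index 0 is
   the entry point), and 0 means "no further candidate".  Here the
   resolved order is 2, then 1, then 3.  */
static const int next_variant[] = { 2, 3, 1, 0 };

/* Hypothetical run-time predicates for the dynamic selectors.  Case 3
   plays the role of the static selector that always matches.  */
static bool selector_matches (int variant, int device_num)
{
  switch (variant)
    {
    case 1: return device_num == 7;
    case 2: return device_num == 0;
    default: return true;
    }
}

/* Returns the number of the alternative that ends up being selected.  */
int pick_variant (int device_num)
{
  int switch_var = next_variant[0];
  for (;;)                              /* loop_label */
    switch (switch_var)
      {
      case 1:
      case 2:
      case 3:
        if (selector_matches (switch_var, device_num))
          return switch_var;            /* alternative_N; goto end_label */
        switch_var = next_variant[switch_var];
        break;                          /* goto loop_label */
      default:
        return 0;   /* not reached while a static fallback exists */
      }
}
```

Because `next_variant` holds compile-time constants, an optimizer can in principle thread the jumps and drop both the loop and the switch, which is the effect the message expects from -O.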
-
Sandra Loosemore authored
This patch adds basic support for three new tree node types that will be used in subsequent patches to support OpenMP metadirectives and dynamic selectors. OMP_METADIRECTIVE is the internal representation of parsed OpenMP metadirective constructs. It's produced by the front ends and is expanded during gimplification. OMP_NEXT_VARIANT is used as a "magic cookie" for late resolution of variant constructs that cannot be fully resolved during gimplification, used to set the controlling variable of a switch statement that branches to the next alternative once the candidate list can be filtered and sorted. These nodes are expanded into constants in the ompdevlow pass. In some gimple passes, they need to be treated as constants. OMP_TARGET_DEVICE_MATCHES is a similar "magic cookie" used to resolve the target_device dynamic selector. It is wrapped in an OpenMP target construct, and can be resolved to a constant in the ompdevlow pass. gcc/ChangeLog: * doc/generic.texi (OpenMP): Document OMP_METADIRECTIVE, OMP_NEXT_VARIANT, and OMP_TARGET_DEVICE_MATCHES. * fold-const.cc (operand_compare::hash_operand): Ignore the new nodes. * gimple-expr.cc (is_gimple_val): Allow OMP_NEXT_VARIANT and OMP_TARGET_DEVICE_MATCHES. * gimple.cc (get_gimple_rhs_num_ops): OMP_NEXT_VARIANT and OMP_TARGET_DEVICE_MATCHES are both GIMPLE_SINGLE_RHS. * tree-cfg.cc (tree_node_can_be_shared): Allow sharing of OMP_NEXT_VARIANT. * tree-inline.cc (remap_gimple_op_r): Ignore subtrees of OMP_NEXT_VARIANT. * tree-pretty-print.cc (dump_generic_node): Handle OMP_METADIRECTIVE, OMP_NEXT_VARIANT, and OMP_TARGET_DEVICE_MATCHES. * tree-ssa-operands.cc (operands_scanner::get_expr_operands): Ignore operands of OMP_NEXT_VARIANT and OMP_TARGET_DEVICE_MATCHES. * tree.def (OMP_METADIRECTIVE): New. (OMP_NEXT_VARIANT): New. (OMP_TARGET_DEVICE_MATCHES): New. * tree.h (OMP_METADIRECTIVE_VARIANTS): New. (OMP_METADIRECTIVE_VARIANT_SELECTOR): New. (OMP_METADIRECTIVE_VARIANT_DIRECTIVE): New. (OMP_METADIRECTIVE_VARIANT_BODY): New. 
(OMP_NEXT_VARIANT_INDEX): New. (OMP_NEXT_VARIANT_STATE): New. (OMP_TARGET_DEVICE_MATCHES_SELECTOR): New. (OMP_TARGET_DEVICE_MATCHES_PROPERTIES): New. Co-Authored-By:
Kwok Cheung Yeung <kcy@codesourcery.com> Co-Authored-By:
Sandra Loosemore <sandra@codesourcery.com>
-
Alexandre Oliva authored
Add logic to check and extend constants compared with bitfields, so that fields are only compared with constants they could actually equal. This involves making sure the signedness doesn't change between loads and conversions before shifts: we'd need to carry a lot more data to deal with all the possibilities. for gcc/ChangeLog PR tree-optimization/118456 * gimple-fold.cc (decode_field_reference): Punt if shifting after changing signedness. (fold_truth_andor_for_ifcombine): Check extension bits in constants before clipping. for gcc/testsuite/ChangeLog PR tree-optimization/118456 * gcc.dg/field-merge-21.c: New. * gcc.dg/field-merge-22.c: New.
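As a rough illustration of why the constant has to be checked against the width and signedness of the field, consider the C sketch below. The struct and function names are hypothetical and the code is not taken from the new tests; it only shows comparisons whose result depends on whether the constant can be represented in the field at all.

```c
struct flags
{
  unsigned int lo : 4;   /* holds 0..15 */
  signed int hi : 4;     /* holds -8..7 */
};

/* 0x1f needs five bits, so a 4-bit unsigned field can never equal it.
   Clipping the constant to four bits first would turn this into a
   comparison against 0xf, which can be true.  */
int lo_is_0x1f (struct flags f) { return f.lo == 0x1f; }
int lo_is_0xf (struct flags f) { return f.lo == 0xf; }

/* A signed 4-bit field with all bits set reads back as -1, never 15,
   so the two constants below are not interchangeable.  */
int hi_is_minus_one (struct flags f) { return f.hi == -1; }
int hi_is_15 (struct flags f) { return f.hi == 15; }
```

With `lo` set to 0xf and `hi` set to -1, only `lo_is_0xf` and `hi_is_minus_one` return nonzero.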
-
Robin Dapp authored
In PR118154 we emit strided stores but the first of those does not always have the proper VTYPE. That's because we erroneously delete a necessary vsetvl. In order to determine whether to elide (1) Expr[7]: VALID (insn 116, bb 17) Demand fields: demand_ratio_and_ge_sew demand_avl SEW=8, VLMUL=mf2, RATIO=16, MAX_SEW=64 TAIL_POLICY=agnostic, MASK_POLICY=agnostic AVL=(reg:DI 0 zero) when e.g. (2) Expr[3]: VALID (insn 360, bb 15) Demand fields: demand_sew_lmul demand_avl SEW=64, VLMUL=m1, RATIO=64, MAX_SEW=64 TAIL_POLICY=agnostic, MASK_POLICY=agnostic AVL=(reg:DI 0 zero) VL=(reg:DI 13 a3 [345]) is already available, we use sew_ge_and_prev_sew_le_next_max_sew_and_next_ratio_valid_for_prev_sew_p. (1) requires RATIO = SEW/LMUL = 16 and an SEW >= 8. (2) has ratio = 64, though, so we cannot directly elide (1). This patch uses ratio_eq_p instead of next_ratio_valid_for_prev_sew_p. PR target/118154 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (MAX_LMUL): New define. (pre_vsetvl::earliest_fuse_vsetvl_info): Use. (pre_vsetvl::pre_global_vsetvl_info): New predicate with equal ratio. * config/riscv/riscv-vsetvl.def: Use. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr118154-1.c: New test. * gcc.target/riscv/rvv/autovec/pr118154-2.c: New test.
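The ratio arithmetic quoted in the message (RATIO = SEW/LMUL) can be checked with a few lines of C. This is a sketch under assumptions: `struct vtype`, `vtype_ratio`, and `same_ratio` are hypothetical helpers for illustration and are not the predicates defined in riscv-vsetvl.cc; a fractional LMUL such as mf2 is modelled as numerator 1 over denominator 2.

```c
/* Hypothetical model of the two vtype fields that enter the ratio.  */
struct vtype
{
  int sew;        /* element width in bits */
  int lmul_num;   /* LMUL numerator, e.g. 1 for m1 and mf2 */
  int lmul_den;   /* LMUL denominator, e.g. 2 for mf2 */
};

/* RATIO = SEW / LMUL, computed without fractions.  */
int vtype_ratio (struct vtype v)
{
  return v.sew * v.lmul_den / v.lmul_num;
}

/* Nonzero when both settings have exactly the same SEW/LMUL ratio.  */
int same_ratio (struct vtype a, struct vtype b)
{
  return vtype_ratio (a) == vtype_ratio (b);
}
```

For expression (1), SEW=8 with mf2 gives 8 * 2 / 1 = 16; for expression (2), SEW=64 with m1 gives 64. The ratios differ, so under an equal-ratio requirement (1) cannot be elided in favour of (2).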
-
Robin Dapp authored
In PR118140 we simplify _ifc__33 = .COND_IOR (_41, d_lsm.7_11, _46, d_lsm.7_11); to 1: Match-and-simplified .COND_IOR (_41, d_lsm.7_11, _46, d_lsm.7_11) to 1 when _46 == 1. This happens by removing the conditional and applying a | 1 = 1. Normally we re-introduce the conditional and its else value if needed but that does not happen here as we're not dealing with a vector type. For correctness's sake, we must not remove the conditional even for non-vector types. This patch re-introduces a COND_EXPR in such cases. For PR118140 this results in a non-vectorized loop. PR middle-end/118140 gcc/ChangeLog: * gimple-match-exports.cc (maybe_resimplify_conditional_op): Add COND_EXPR when we simplified to a scalar gimple value but still have an else value. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr118140.c: New test. * gcc.target/riscv/rvv/autovec/pr118140.c: New test.
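The difference between the faulty and the corrected simplification can be written out for boolean scalars. The C sketch below is illustrative only: `cond_ior`, `folded_without_cond`, and `folded_with_cond` are hypothetical names, and the reading of .COND_IOR as "cond ? a | b : else_val" follows the description in the message.

```c
#include <stdbool.h>

/* Scalar reading of .COND_IOR (cond, a, b, else_val).  */
bool cond_ior (bool cond, bool a, bool b, bool else_val)
{
  return cond ? (a | b) : else_val;
}

/* Faulty fold for b == 1: "a | 1 = 1" is applied and the condition is
   dropped, so the else value is lost.  */
bool folded_without_cond (bool cond, bool a, bool else_val)
{
  (void) cond; (void) a; (void) else_val;
  return true;
}

/* Fold that keeps the conditional, as a COND_EXPR would.  */
bool folded_with_cond (bool cond, bool a, bool else_val)
{
  (void) a;
  return cond ? true : else_val;
}
```

When `cond` is false and the else value is false, `cond_ior` and `folded_with_cond` both yield false while `folded_without_cond` yields true, which is the miscompilation the message describes.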
-
Nathaniel Shead authored
The ICE in the linked PR is caused because name lookup finds duplicate copies of the deduction guides, causing a checking assert to fail. This is ultimately because we're exporting an imported guide; when name lookup processes 'dguide-5_b.H' it goes via the 'tt_entity' path and just returns the entity from 'dguide-5_a.H'. Because this doesn't ever go through 'key_mergeable' we never set 'BINDING_VECTOR_GLOBAL_DUPS_P' and so deduping is not engaged, allowing duplicate results. Currently I believe this to be a peculiarity of the ANY_REACHABLE handling for deduction guides; in no other case that I can find do we emit bindings purely to imported entities. As such, this patch fixes this problem from that end, by ensuring that we simply do not emit any imported deduction guides. This avoids the ICE because no duplicates need deduping to start with, and should otherwise have no functional change because lookup of deduction guides will look at all reachable modules (exported or not) regardless. Since we're now deliberately not emitting imported deduction guides we can use LOOK_want::NORMAL instead of LOOK_want::ANY_REACHABLE, since the extra work to find as-yet undiscovered deduction guides in transitive importers is not necessary here. PR c++/117397 gcc/cp/ChangeLog: * module.cc (depset::hash::add_deduction_guides): Don't emit imported deduction guides. (depset::hash::finalize_dependencies): Add check for any bindings referring to imported entities. gcc/testsuite/ChangeLog: * g++.dg/modules/dguide-5_a.H: New test. * g++.dg/modules/dguide-5_b.H: New test. * g++.dg/modules/dguide-5_c.H: New test. * g++.dg/modules/dguide-6.h: New test. * g++.dg/modules/dguide-6_a.C: New test. * g++.dg/modules/dguide-6_b.C: New test. * g++.dg/modules/dguide-6_c.C: New test. Signed-off-by:
Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by:
Jason Merrill <jason@redhat.com>
-
Eric Botcazou authored
...to the object file reader present in the run-time library. gcc/ada/ PR ada/118459 * libgnat/s-objrea.ads (Object_Arch): Add S390 and RISCV. * libgnat/s-objrea.adb (EM_S390): New named number. (EM_RISCV): Likewise. (ELF_Ops.Initialize): Deal with EM_S390 and EM_RISCV. (Read_Address): Deal with S390 and RISCV.
-
Richard Biener authored
When vectorizing a load we are now checking alignment before emitting a vector(1) T load instead of blindly assuming it's OK when we had a scalar T load. For reasons we're not handling alignment computation optimally here but we shouldn't ICE when we fall back to loads of T. The following ensures the IL remains correct by emitting VIEW_CONVERT from T to vector(1) T when needed. It also removes an earlier fix done in r9-382-gbb4e47476537f6 for the same issue with VMAT_ELEMENTWISE. PR tree-optimization/118405 * tree-vect-stmts.cc (vectorizable_load): When we fall back to scalar loads make sure we properly convert to vector(1) T when there was only a single vector element.
-
Anuj Mohite authored
This patch was provided by Anuj Mohite as part of the GSoC project. It was modified slightly by Jerry DeLisle for minor formatting. The patch provides front-end parsing of the LOCALITY specs in DO_CONCURRENT and adds numerous test cases. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_code_node): Updated to use c->ext.concur.forall_iterator instead of c->ext.forall_iterator. * frontend-passes.cc (index_interchange): Updated to use c->ext.concur.forall_iterator instead of c->ext.forall_iterator. (gfc_code_walker): Likewise. * gfortran.h (enum locality_type): Added new enum for locality types in DO CONCURRENT constructs. * match.cc (match_simple_forall): Updated to use new_st.ext.concur.forall_iterator instead of new_st.ext.forall_iterator. (gfc_match_forall): Likewise. (gfc_match_do): Implemented support for matching DO CONCURRENT locality specifiers (LOCAL, LOCAL_INIT, SHARED, DEFAULT(NONE), and REDUCE). * parse.cc (parse_do_block): Updated to use new_st.ext.concur.forall_iterator instead of new_st.ext.forall_iterator. * resolve.cc (struct check_default_none_data): Added struct check_default_none_data. (do_concur_locality_specs_f2023): New function to check compliance with F2023's C1133 constraint for DO CONCURRENT. (check_default_none_expr): New function to check DEFAULT(NONE) compliance. (resolve_locality_spec): New function to resolve locality specs. (gfc_count_forall_iterators): Updated to use code->ext.concur.forall_iterator. (gfc_resolve_forall): Updated to use code->ext.concur.forall_iterator. * st.cc (gfc_free_statement): Updated to free locality specifications and use p->ext.concur.forall_iterator. * trans-stmt.cc (gfc_trans_forall_1): Updated to use code->ext.concur.forall_iterator. gcc/testsuite/ChangeLog: * gfortran.dg/do_concurrent_10.f90: New test. * gfortran.dg/do_concurrent_8_f2018.f90: New test. * gfortran.dg/do_concurrent_8_f2023.f90: New test. * gfortran.dg/do_concurrent_9.f90: New test. * gfortran.dg/do_concurrent_all_clauses.f90: New test. 
* gfortran.dg/do_concurrent_basic.f90: New test. * gfortran.dg/do_concurrent_constraints.f90: New test. * gfortran.dg/do_concurrent_local_init.f90: New test. * gfortran.dg/do_concurrent_locality_specs.f90: New test. * gfortran.dg/do_concurrent_multiple_reduce.f90: New test. * gfortran.dg/do_concurrent_nested.f90: New test. * gfortran.dg/do_concurrent_parser.f90: New test. * gfortran.dg/do_concurrent_reduce_max.f90: New test. * gfortran.dg/do_concurrent_reduce_sum.f90: New test. * gfortran.dg/do_concurrent_shared.f90: New test. Signed-off-by:
Anuj <anujmohite001@gmail.com>
-
David Malcolm authored
PR c/116871 notes that our diagnostics about incompatible function types could be improved. In particular, for the case of migrating to C23 I'm seeing a lot of build failures with signal handlers similar to this (simplified from alsa-tools-1.2.11, envy24control/profiles.c; see rhbz#2336278): typedef void (*__sighandler_t) (int); extern __sighandler_t signal (int __sig, __sighandler_t __handler) __attribute__ ((__nothrow__ , __leaf__)); void new_process(void) { void (*int_stat)(); int_stat = signal(2, ((__sighandler_t) 1)); signal(2, int_stat); } Before this patch, cc1 fails with this message: t.c: In function 'new_process': t.c:18:12: error: assignment to 'void (*)(void)' from incompatible pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types] 18 | int_stat = signal(2, ((__sighandler_t) 1)); | ^ t.c:20:13: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 20 | signal(2, int_stat); | ^~~~~~~~ | | | void (*)(void) t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'void (*)(void)' 11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler) | ~~~~~~~~~~~~~~~^~~~~~~~~ With this patch, cc1 emits: t.c: In function 'new_process': t.c:18:12: error: assignment to 'void (*)(void)' from incompatible pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types] 18 | int_stat = signal(2, ((__sighandler_t) 1)); | ^ t.c:9:16: note: '__sighandler_t' declared here 9 | typedef void (*__sighandler_t) (int); | ^~~~~~~~~~~~~~ t.c:20:13: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 20 | signal(2, int_stat); | ^~~~~~~~ | | | void (*)(void) t.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'void (*)(void)' 11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler) | ~~~~~~~~~~~~~~~^~~~~~~~~ t.c:9:16: note: '__sighandler_t' declared here 
9 | typedef void (*__sighandler_t) (int); | ^~~~~~~~~~~~~~ showing the location of the pertinent typedef ("__sighandler_t"). Another example, simplified from a52dec-0.7.4: src/a52dec.c (rhbz#2336013): typedef void (*__sighandler_t) (int); extern __sighandler_t signal (int __sig, __sighandler_t __handler) __attribute__ ((__nothrow__ , __leaf__)); /* Mismatching return type. */ static RETSIGTYPE signal_handler (int sig) { } static void print_fps (int final) { signal (42, signal_handler); } Before this patch, cc1 emits: t2.c: In function 'print_fps': t2.c:22:15: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 22 | signal (42, signal_handler); | ^~~~~~~~~~~~~~ | | | int (*)(int) t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'int (*)(int)' 11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler) | ~~~~~~~~~~~~~~~^~~~~~~~~ With this patch cc1 emits: t2.c: In function 'print_fps': t2.c:22:15: error: passing argument 2 of 'signal' from incompatible pointer type [-Wincompatible-pointer-types] 22 | signal (42, signal_handler); | ^~~~~~~~~~~~~~ | | | int (*)(int) t2.c:11:57: note: expected '__sighandler_t' {aka 'void (*)(int)'} but argument is of type 'int (*)(int)' 11 | extern __sighandler_t signal (int __sig, __sighandler_t __handler) | ~~~~~~~~~~~~~~~^~~~~~~~~ t2.c:16:19: note: 'signal_handler' declared here 16 | static RETSIGTYPE signal_handler (int sig) | ^~~~~~~~~~~~~~ t2.c:9:16: note: '__sighandler_t' declared here 9 | typedef void (*__sighandler_t) (int); | ^~~~~~~~~~~~~~ showing the location of the pertinent fndecl ("signal_handler"), and, as before, the pertinent typedef. The patch also updates the colorization in the messages to visually link and contrast the different types and typedefs. My hope is that this makes it easier for users to decipher build failures seen with the new C23 default. 
Further improvements could be made to colorization in convert_for_assignment, and similar improvements to C++, but I'm punting those to GCC 16. gcc/c/ChangeLog: PR c/116871 * c-typeck.cc (pedwarn_permerror_init): Return bool for whether a warning was emitted. Only call print_spelling if we warned. (pedwarn_init): Return bool for whether a warning was emitted. (permerror_init): Likewise. (warning_init): Return bool for whether a warning was emitted. Only call print_spelling if we warned. (class pp_element_quoted_decl): New. (maybe_inform_typedef_location): New. (convert_for_assignment): For OPT_Wincompatible_pointer_types, move auto_diagnostic_group to cover all cases. Use %e and pp_element rather than %qT and tree to colorize the types. Capture whether a warning was emitted, and, if it was, show various notes: for a pointer to a function, show the function decl, for typedef types, and show the decls. gcc/testsuite/ChangeLog: PR c/116871 * gcc.dg/c23-mismatching-fn-ptr-a52dec.c: New test. * gcc.dg/c23-mismatching-fn-ptr-alsatools.c: New test. * gcc.dg/c23-mismatching-fn-ptr.c: New test. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
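For readers hitting the first diagnostic above, one way to make such code compile under C23 is to spell out the handler prototype instead of using an empty parameter list. The sketch below is an assumption about how the affected code could be adapted, not a change taken from alsa-tools or a52dec; `handler_fn`, `on_signal`, and `install_and_restore` are hypothetical names.

```c
#include <signal.h>

/* Handler pointer type with a full prototype.  In C23 "()" means
   "(void)", so "void (*)()" no longer matches "void (*)(int)".  */
typedef void (*handler_fn) (int);

static volatile sig_atomic_t seen;

/* Returns void and takes int, matching what signal() expects.  */
static void on_signal (int sig)
{
  seen = sig;
}

/* Installs the handler, raises SIGINT once, restores the previous
   handler, and returns the signal number the handler observed
   (or -1 if installation failed).  */
int install_and_restore (void)
{
  handler_fn previous = signal (SIGINT, on_signal);
  if (previous == SIG_ERR)
    return -1;
  raise (SIGINT);
  signal (SIGINT, previous);
  return (int) seen;
}
```

Storing the result of `signal` in a `handler_fn` variable avoids both the assignment error and the argument error shown in the first example.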
-
Andrew Pinski authored
With the addition of supporting operations on the SVE scalable vector types, the vec_duplicate tree will show up in expressions and the constexpr handling was not done for this tree code. This is a simple fix to treat VEC_DUPLICATE like any other unary operator and allows the constexpr-add-1.C testcase to work. Built and tested for aarch64-linux-gnu. PR c++/118445 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_constant_expression): Handle VEC_DUPLICATE like a "normal" unary operator. (potential_constant_expression_1): Likewise. gcc/testsuite/ChangeLog: * g++.target/aarch64/sve/constexpr-add-1.C: New test. Signed-off-by:
Andrew Pinski <quic_apinski@quicinc.com>
-
Robin Dapp authored
Hi, currently ssa-dse-1.C ICEs because expand_simple_binop returns NULL when building the scalar that is used to IOR two interleaving sequences. That's because we try to emit a shift in HImode. This patch shifts in Xmode and then lowpart-subregs the result to HImode. Regtested on rv64gcv_zvl512b. Regards Robin gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Shift in Xmode.
-
Jiufu Guo authored
Previously, vsx_stxvd2x4_le_const_<mode> was introduced for 'split1' pass, so it is guarded by "can_create_pseudo_p ()". While it would be possible to match the pattern of this insn during/after RA, this insn could be updated to make it work for split pass after RA. And this insn would not be the best choice if the address has alignment like "&(-16)", so "!altivec_indexed_or_indirect_operand" is added to guard this insn. 2025-01-13 Jiufu Guo <guojiufu@linux.ibm.com> gcc/ PR target/116030 * config/rs6000/vsx.md (vsx_stxvd2x4_le_const_<mode>): Add clobber and guard with !altivec_indexed_or_indirect_operand. gcc/testsuite/ PR target/116030 * gcc.target/powerpc/pr116030.c: New test.
-
GCC Administrator authored
-
Robin Dapp authored
Hi, in PR117682 we build an interleaving pattern { 1, 201, 209, 25, 161, 105, 113, 185, 65, 9, 17, 89, 225, 169, 177, 249, 129, 73, 81, 153, 33, 233, 241, 57, 193, 137, 145, 217, 97, 41, 49, 121 }; with negative step expecting wraparound semantics due to -fwrapv. For building interleaved patterns we have an optimization that does e.g. {1, 209, ...} = { 1, 0, 209, 0, ...} and {201, 25, ...} >> 8 = { 0, 201, 0, 25, ...} and IORs those. The optimization only works if the lowpart bits are zero. When overflowing e.g. with a negative step we cannot guarantee this. This patch makes us fall back to the generic merge handling for negative steps. I'm not 100% certain we're good even for positive steps. If the step or the vector length is large enough we'd still overflow and have non-zero lower bits. I haven't seen this happen during my testing, though and the patch doesn't make things worse, so... Regtested on rv64gcv_zvl512b. Let's see what the CI says. Regards Robin PR target/117682 gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Fall back to merging if either step is negative. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr117682.c: New test.
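The failure mode described above can be modelled outside the compiler. The following is a minimal sketch, assuming 8-bit elements merged pairwise into 16-bit containers; the function names are hypothetical and this is not the code in riscv-v.cc. It only illustrates why the IOR trick needs the unused half of each wide element to be zero.

```python
def interleave_reference(base1, step1, base2, step2, n, bits=8):
    """Element-by-element interleave of two wrapping linear series."""
    mask = (1 << bits) - 1
    out = []
    for i in range(n):
        out.append((base1 + i * step1) & mask)
        out.append((base2 + i * step2) & mask)
    return out


def interleave_by_ior(base1, step1, base2, step2, n, bits=8):
    """Compute both series in elements twice as wide, shift the second
    series by one narrow element, then IOR the two (hypothetical model)."""
    wide = (1 << (2 * bits)) - 1
    mask = (1 << bits) - 1
    out = []
    for i in range(n):
        first = (base1 + i * step1) & wide              # may spill into the high half
        second = ((base2 + i * step2) << bits) & wide   # occupies the high half
        merged = first | second
        out.append(merged & mask)   # low half becomes the even element
        out.append(merged >> bits)  # high half becomes the odd element
    return out


if __name__ == "__main__":
    # Small positive steps: both approaches agree.
    print(interleave_by_ior(1, 2, 3, 4, 4) == interleave_reference(1, 2, 3, 4, 4))
    # Steps taken from the PR's pattern (-48 and +80 modulo 256): they disagree.
    print(interleave_reference(1, -48, 201, 80, 2))
    print(interleave_by_ior(1, -48, 201, 80, 2))
```

In this model a large positive step spills into the neighbouring element in the same way, which matches the remaining doubt about positive steps expressed in the message.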
-
- Jan 13, 2025
-
-
Robin Dapp authored
Hi, the zbb-rol-ror and stack_save_restore tests use the -fno-lto option and scan the final assembly. For an invocation like -flto ... -fno-lto the output file we scan is still something like zbb-rol-ror-09.ltrans0.ltrans.s. Therefore skip the tests when "-flto" is present. This gets rid of a few UNRESOLVED tests. Regtested on rv64gcv_zvl512b. Going to push if the CI agrees. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/stack_save_restore_1.c: Skip for -flto. * gcc.target/riscv/stack_save_restore_2.c: Ditto. * gcc.target/riscv/zbb-rol-ror-04.c: Ditto. * gcc.target/riscv/zbb-rol-ror-05.c: Ditto. * gcc.target/riscv/zbb-rol-ror-06.c: Ditto. * gcc.target/riscv/zbb-rol-ror-07.c: Ditto. * gcc.target/riscv/zbb-rol-ror-08.c: Ditto. * gcc.target/riscv/zbb-rol-ror-09.c: Ditto.
-
Gaius Mulley authored
This patch removes the physical address from the COPYING.FDL and replaces it with a URL. gcc/m2/ChangeLog: PR modula2/116557 * COPYING.FDL: Remove physical address and replace with a URL. Signed-off-by:
Gaius Mulley <gaiusmod2@gmail.com>
-
Thomas Koenig authored
gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_attr): Fix typos for in_equivalence.
-
Xi Ruoyao authored
The test case

    long test (long x, long y) { return ((x | 0x1ff) << 3) + y; }

is now compiled (-O2 -march=rv64g_zba) to

    li a4,4096
    slliw a5,a0,3
    addi a4,a4,-8
    or a5,a5,a4
    addw a0,a5,a1
    ret

Although this check was originally intended to make better use of zba, removing it now actually enables the use of zba for this test case (thanks to late combine):

    ori a5,a0,511
    sh3add a0,a5,a1
    ret

Obviously, bitmanip.md does not cover (any_or (ashift (reg) (imm123)) imm) at all, and even for and it just seems more natural to split to (ashift (and (reg) (imm')) (imm123)) first, then let late combine merge the outer ashift and the plus. I've not found any test case regressed by the removal, and "make check-gcc RUNTESTFLAGS=riscv.exp='zba-*.c'" also reports no failures. gcc/ChangeLog: PR target/115921 * config/riscv/riscv.md (<optab>_shift_reverse): Remove check for TARGET_ZBA. gcc/testsuite/ChangeLog: PR target/115921 * gcc.target/riscv/zba-shNadd-08.c: New test.
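The two instruction sequences rely on the identity (x | 0x1ff) << 3 == (x << 3) | 0xff8. The sketch below checks that identity on 64-bit values. It assumes unsigned 64-bit wraparound and models only the ori/sh3add form and the shift-first algebra, not the slliw/addw instructions; the helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def sh3add(rs1, rs2):
    """Zba sh3add: rd = rs2 + (rs1 << 3), truncated to 64 bits."""
    return ((rs1 << 3) + rs2) & MASK64


def or_then_shift(x, y):
    """ori a5,a0,511 followed by sh3add a0,a5,a1."""
    a5 = (x | 511) & MASK64
    return sh3add(a5, y)


def shift_then_or(x, y):
    """Shift first, then OR with the pre-shifted constant 4096 - 8 == 0xff8."""
    a4 = 4096 - 8
    a5 = ((x << 3) & MASK64) | a4
    return (a5 + y) & MASK64


if __name__ == "__main__":
    for x in (0, 1, 0x200, 0x123456789ABCDEF0, MASK64):
        for y in (0, 2, MASK64):
            assert or_then_shift(x, y) == shift_then_or(x, y)
    print(or_then_shift(1, 2))
```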
-
Iain Buclaw authored
Each major release is not binary compatible with the previous. PR d/117701 libphobos/ChangeLog: * configure: Regenerate. * configure.ac (libtool_VERSION): Update to 6:0:0.
-
Richard Sandiford authored
In g:06c4cf39 I mishandled signed comparisons of comparison results on STORE_FLAG_VALUE < 0 targets (despite specifically referencing STORE_FLAG_VALUE in the commit message). There, (lt TRUE FALSE) is true, although (ltu FALSE TRUE) still holds. Things get messy with vector modes, and since those weren't the focus of the original commit, it seemed better to punt on them for now. However, punting means that this optimisation no longer feels like a natural tail-call operation. The patch therefore converts "return simplify..." to the usual call-and-conditional-return pattern. gcc/ PR target/118418 * simplify-rtx.cc (simplify_context::simplify_relational_operation_1): Take STORE_FLAG_VALUE into account when handling signed comparisons of comparison results.
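The ordering of TRUE and FALSE under signed and unsigned comparison can be checked with a small model. This is a sketch assuming a 32-bit mode; the helper names are hypothetical and nothing here is taken from simplify-rtx.cc.

```python
BITS = 32
MASK = (1 << BITS) - 1


def as_signed(value):
    """Interpret the low BITS bits of value as a two's-complement integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def lt(a, b):
    """Signed less-than, like the RTL code lt."""
    return as_signed(a) < as_signed(b)


def ltu(a, b):
    """Unsigned less-than, like the RTL code ltu."""
    return (a & MASK) < (b & MASK)


def truth_values(store_flag_value):
    """Return (TRUE, FALSE) for a target with the given STORE_FLAG_VALUE."""
    return store_flag_value, 0


if __name__ == "__main__":
    for sfv in (1, -1):
        true, false = truth_values(sfv)
        print(sfv, lt(false, true), lt(true, false), ltu(false, true))
```

With STORE_FLAG_VALUE == 1, FALSE is below TRUE in both orderings. With STORE_FLAG_VALUE == -1 the signed ordering flips while the unsigned one does not, which is the case the original commit mishandled.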
-
Xi Ruoyao authored
When zbs is not available, there's nothing special with single-bit immediates and we should perform reassociation as normal immediates. gcc/ChangeLog: PR target/115921 * config/riscv/riscv.md (<optab>_shift_reverse): Only check popcount_hwi if !TARGET_ZBS.
-
Jin Ma authored
When the vsetvl instructions of two RVV instructions are merged using "use_max_sew", the sew of prev may be updated if prev.sew < next.sew while the original ratio is kept, which is obviously wrong. When subsequent instructions then match the wrong ratio, a wrong "vsetvli zero,zero" instruction may be generated, which leads to an unknown avl. gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (demand_system::use_max_sew): Also set the ratio for PREV. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/bug-10.c: New test.
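The effect of a stale ratio can be illustrated with a toy model. This sketch assumes that the ratio is a cached SEW/LMUL value and that the "vsetvli zero,zero" form is only usable when that ratio is unchanged, as the RVV specification requires. The record type and helper functions are hypothetical, and the actual fix in riscv-vsetvl.cc may recompute the ratio differently.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class VsetvlInfo:
    """Hypothetical stand-in for the demand info attached to an instruction."""
    sew: int
    lmul: Fraction
    ratio: Fraction  # cached SEW/LMUL


def make_info(sew, lmul):
    lmul = Fraction(lmul)
    return VsetvlInfo(sew, lmul, Fraction(sew) / lmul)


def raise_sew_stale(info, new_sew):
    """Model of the bug: SEW is raised but the cached ratio is kept."""
    return VsetvlInfo(new_sew, info.lmul, info.ratio)


def raise_sew_fixed(info, new_sew):
    """Model of the fix: the ratio is refreshed together with SEW."""
    return make_info(new_sew, info.lmul)


def may_keep_vl(prev, nxt):
    """The vl-preserving form is only valid when SEW/LMUL is unchanged."""
    return prev.ratio == nxt.ratio


if __name__ == "__main__":
    prev = make_info(8, 1)
    later = make_info(8, 1)
    print(may_keep_vl(raise_sew_stale(prev, 16), later))  # wrongly compatible
    print(may_keep_vl(raise_sew_fixed(prev, 16), later))  # correctly rejected
```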
-
Vineet Gupta authored
This seemingly benign mistake caused a massive SPEC2017 Cactu regression (2.1 trillion insn to 2.5 trillion), wiping out all the gains from my recent sched1 improvement. Thankfully the issue was trivial to fix even if hard to isolate.

On BPI3:

Before bug
----------
| Performance counter stats for './cactusBSSN_r_base-1':
|
|       4,557,471.02 msec task-clock:u          #    1.000 CPUs utilized
|              1,245      context-switches:u    #    0.273 /sec
|                  1      cpu-migrations:u      #    0.000 /sec
|            205,376      page-faults:u         #   45.064 /sec
|  7,291,944,801,307      cycles:u              #    1.600 GHz
|  2,134,835,735,951      instructions:u        #    0.29  insn per cycle
|     10,799,296,738      branches:u            #    2.370 M/sec
|         15,308,966      branch-misses:u       #    0.14% of all branches
|
|     4557.710508078 seconds time elapsed

Bug
---
| Performance counter stats for './cactusBSSN_r_base-2':
|
|       4,801,813.79 msec task-clock:u          #    1.000 CPUs utilized
|              8,066      context-switches:u    #    1.680 /sec
|                  1      cpu-migrations:u      #    0.000 /sec
|            203,836      page-faults:u         #   42.450 /sec
|  7,682,826,638,790      cycles:u              #    1.600 GHz
|  2,503,133,291,344      instructions:u        #    0.33  insn per cycle
   ^^^^^^^^^^^^^^^^^
|     10,799,287,796      branches:u            #    2.249 M/sec
|         16,641,200      branch-misses:u       #    0.15% of all branches
|
|     4802.616638386 seconds time elapsed
|

Fix
---
| Performance counter stats for './cactusBSSN_r_base-3':
|
|       4,556,170.75 msec task-clock:u          #    1.000 CPUs utilized
|              1,739      context-switches:u    #    0.382 /sec
|                  0      cpu-migrations:u      #    0.000 /sec
|            203,458      page-faults:u         #   44.655 /sec
|  7,289,854,613,923      cycles:u              #    1.600 GHz
|  2,134,854,070,916      instructions:u        #    0.29  insn per cycle
|     10,799,296,807      branches:u            #    2.370 M/sec
|         15,403,357      branch-misses:u       #    0.14% of all branches
|
|     4556.445490123 seconds time elapsed

Fixes: 46888571 ("RISC-V: Add cr and cf constraint")
Signed-off-by:
Vineet Gupta <vineetg@rivosinc.com> gcc/ChangeLog: * config/riscv/riscv.cc (riscv_register_move_cost): Remove buggy check.
-
Paul-Antoine Arras authored
Add support to the Fortran parser for the OpenMP syntax that allows a comma after the directive name and between clauses of declare variant. The C and C++ parsers already support this syntax so only a new test is added. gcc/fortran/ChangeLog: * openmp.cc (gfc_match_omp_declare_variant): Match comma after directive name and between clauses. Emit more useful diagnostics. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/declare-variant-2.f90: Remove error test for a comma after the directive name. Add tests for other invalid syntaxes (extra comma and invalid clause). * c-c++-common/gomp/adjust-args-5.c: New test. * gfortran.dg/gomp/adjust-args-11.f90: New test.
-
Jin Ma authored
Correct logic on 64-bit host:

    ...
    bseti a5,zero,38
    bseti a5,a5,63
    addi a5,a5,-1
    and a4,a4,a5
    ...

Wrong logic on 32-bit host:

    ...
    li a5,64
    bseti a5,a5,31
    addi a5,a5,-1
    and a4,a4,a5
    ...

gcc/ChangeLog: * config/riscv/riscv.cc (riscv_build_integer_1): Change 1UL/1ULL to HOST_WIDE_INT_1U. gcc/testsuite/ChangeLog: * gcc.target/riscv/zbs-bug.c: New test.
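The constants in the wrong sequence (64 and bit 31) are what results if the shift counts 38 and 63 are reduced modulo 32. The sketch below reproduces that arithmetic. It assumes a host whose shift instruction masks the count to the operand width, which is a common hardware behaviour but only an assumption here, since in C a shift by at least the type width is undefined. The function name is hypothetical.

```python
def host_shift_left(value, count, width):
    """Model a left shift on a host type of the given width, assuming the
    hardware masks the shift count (an assumption, not a C guarantee)."""
    mask = (1 << width) - 1
    return (value << (count % width)) & mask


if __name__ == "__main__":
    # 64-bit wide type: the intended single-bit constants.
    print(hex(host_shift_left(1, 38, 64)), hex(host_shift_left(1, 63, 64)))
    # 32-bit unsigned long: bits 38 and 63 collapse to bits 6 and 31.
    print(host_shift_left(1, 38, 32), hex(host_shift_left(1, 63, 32)))
```

A type that is 64 bits wide on every host, as HOST_WIDE_INT_1U provides, avoids the dependency on the width of unsigned long.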
-
Paul-Antoine Arras authored
Without the target directive, the test would run on the host but still try to use device pointers, which causes a segfault. libgomp/ChangeLog: * testsuite/libgomp.fortran/dispatch-1.f90: Add missing target directive.
-
Gaius Mulley authored
P2SymBuild.mod.BuildSubrange does not use a virtual token and therefore any error message containing a subrange type produces poor location carets. This patch rewrites BuildSubrange and the buildError4 procedure in M2Check.mod (which is only called when there is a formal/actual parameter mismatch). buildError4 now issues a sub error for the formal and actual type declaration highlighting the type mismatch. gcc/m2/ChangeLog: PR modula2/118453 * gm2-compiler/M2Check.mod (buildError4): Call MetaError1 for the actual and formal parameter type. * gm2-compiler/P2Build.bnf (SubrangeType): Construct a virtual token containing the subrange type declaration. (PrefixedSubrangeType): Ditto. * gm2-compiler/P2SymBuild.def (BuildSubrange): Add tok parameter. * gm2-compiler/P2SymBuild.mod (BuildSubrange): Use tok parameter, rather than the token at the start of the subrange. gcc/testsuite/ChangeLog: PR modula2/118453 * gm2/pim/fail/badbecomes2.mod: New test. * gm2/pim/fail/badparamset1.mod: New test. * gm2/pim/fail/badparamset2.mod: New test. * gm2/pim/fail/badsyntaxset1.mod: New test. Signed-off-by:
Gaius Mulley <gaiusmod2@gmail.com>
-
Jeff Law authored
This resurrects a patch from a bit over 2 years ago that I never wrapped up. IIRC, I ended up catching covid, then in the hospital for an unrelated issue and it just got dropped on the floor in the insanity.

The basic idea here is to help postreload-cse eliminate more const/copies by recording a small set of conditional equivalences (as Richi said in 2022, "Ick"). It was originally to help eliminate an unnecessary constant load I saw in coremark, but as seen in BZ107455 the same issues show up in real code as well.

Bootstrapped and regression tested on x86-64, also been through multiple spins in my tester.

Changes since v2:
- Simplified logic for blocks to examine
- Remove redundant tests when filtering blocks to examine
- Remove bogus check which only allowed reg->reg copies

Changes since v1: Richard B and Richard S both had good comments last time around and their requests are reflected in this update:
- Use rtx_equal_p rather than pointer equality
- Restrict to register "destinations"
- Restrict to integer modes
- Adjust entry block handling

My own wider scale testing resulted in a few more changes.
- Robustify extracting the (set (pc) ... ), which then required ...
- Handle if src/dst are clobbered by the conditional branch
- Fix logic error causing too many equivalences to be recorded

PR rtl-optimization/107455 gcc/ * postreload.cc (reload_cse_regs_1): Take advantage of conditional equivalences. gcc/testsuite * gcc.target/riscv/pr107455-1.c: New test. * gcc.target/riscv/pr107455-2.c: New test.
-
Alexandre Oliva authored
If a single-bit bitfield takes up the sign bit of a storage unit, comparing the corresponding bitfield between two objects loads the storage units, XORs them, converts the result to signed char, and compares it with zero: ((signed char)(a.<byte> ^ c.<byte>) >= 0). fold_truth_andor_for_ifcombine recognizes the compare with zero as a sign bit test, then it decomposes the XOR into an equality test. The problem is that, after this decomposition, that figures out the width of the accessed fields, we apply the sign bit mask to the left-hand operand of the compare, but we failed to also apply it to the right-hand operand when both were taken from the same XOR. This patch fixes that. for gcc/ChangeLog PR tree-optimization/118409 * gimple-fold.cc (fold_truth_andor_for_ifcombine): Apply the signbit mask to the right-hand XOR operand too. for gcc/testsuite/ChangeLog PR tree-optimization/118409 * gcc.dg/field-merge-20.c: New.
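The equivalence between the sign test on the XOR and a masked equality test, and the effect of masking only one side, can be checked exhaustively for bytes. This is a minimal sketch with hypothetical helper names that models single storage bytes only and is unrelated to the actual gimple-fold.cc code.

```python
SIGN_BIT = 0x80


def as_signed_char(value):
    """Interpret the low 8 bits of value as a signed char."""
    value &= 0xFF
    return value - 0x100 if value & SIGN_BIT else value


def xor_sign_test(a, c):
    """((signed char)(a ^ c) >= 0): true when the top bits of a and c agree."""
    return as_signed_char(a ^ c) >= 0


def masked_equality(a, c):
    """Decomposed form with the sign-bit mask applied to both operands."""
    return (a & SIGN_BIT) == (c & SIGN_BIT)


def half_masked_equality(a, c):
    """Model of the bug: the mask is applied to the left-hand operand only."""
    return (a & SIGN_BIT) == c


if __name__ == "__main__":
    pairs = [(a, c) for a in range(256) for c in range(256)]
    print(all(xor_sign_test(a, c) == masked_equality(a, c) for a, c in pairs))
    print(xor_sign_test(0x80, 0x81), half_masked_equality(0x80, 0x81))
```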
-
Jakub Jelinek authored
Something I noticed while working on the crc wrong-code fix. My first version of the patch failed because it no longer matched some expected strings in the assembly, so I had to add TDF_DETAILS debugging into the -fdump-rtl-expand-details dump which the crc tests can use. For PR115910 Andrew has added a similar note for the division/modulo case if it is positive and we can choose either unsigned or signed division. The problem is that unlike most other TDF_DETAILS diagnostics, this is not done before emitting the IL for the function, but during it. Other messages there are prefixed with ;;, both details on what it is doing and the GIMPLE IL for which it expands RTL, so the ;; Generating RTL for gimple basic block 4 ;; (code_label 13 12 14 2 (nil) [0 uses]) (note 14 13 0 NOTE_INSN_BASIC_BLOCK) positive division: unsigned cost: 30; signed cost: 28 ;; return _4; message in between just looks weird and IMHO should be ;; prefixed. 2025-01-13 Jakub Jelinek <jakub@redhat.com> PR target/115910 * expr.cc (expand_expr_divmod): Prefix the TDF_DETAILS note with ";; " and add a space before (needed tie breaker). Formatting fixes.
-
Martin Jambor authored
This commit makes the contrib/check-MAINTAINERS.py script happy about our MAINTAINERS file. I hope that it knows best how things ought to be and so am committing this as obvious. ChangeLog: 2025-01-13 Martin Jambor <mjambor@suse.cz> * MAINTAINERS: Fix the name order of the Write After Approval section.
-
Pascal Obry authored
gcc/ada/ChangeLog: * doc/gnat_ugn/platform_specific_information.rst: Update. * gnat_ugn.texi: Regenerate.
-
Javier Miranda authored
Partially revert the fix for sem_ch13.adb as it does not comply with RM 13.14(7.2/5). gcc/ada/ChangeLog: * sem_ch13.adb (Check_Aspect_At_End_Of_Declarations): Restore calls to Preanalyze_Spec_Expression that were replaced by calls to Preanalyze_And_Resolve. Add documentation. (Check_Aspect_At_Freeze_Point): Ditto.
-
Pascal Obry authored
gcc/ada/ChangeLog: * mdll.adb: For the created DLL to be relocatable we do not want to use the base file name when calling gnatdll. * gnatdll.adb: Removes option -d which is not working anymore. And when using a truly relocatable DLL the base-address has no real meaning. Also reword the usage string for -d as we do not want to specify relocatable as gnatdll can be used to create both relocatable and non relocatable DLL.
-