- Feb 20, 2025
-
-
Filip Kastl authored
file-cache-lines param was documented as file-cache-files. This fixes the typo. gcc/ChangeLog: * doc/invoke.texi: Fix typo file-cache-files -> file-cache-lines. Signed-off-by:
Filip Kastl <fkastl@suse.cz>
-
Jonathan Wakely authored
We started using the __array_rank built-in with r15-1252-g6f0dfa6f1acdf7 but that built-in is buggy in versions of Clang up to and including 19. libstdc++-v3/ChangeLog: PR libstdc++/118559 * include/std/type_traits (rank, rank_v): Do not use __array_rank for Clang 19 and older.
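For readers unfamiliar with what a non-built-in definition of this trait looks like, here is a minimal sketch based on partial specialization. The name `rank_fallback` is hypothetical and the code is not taken from libstdc++; it only illustrates the kind of definition that can be used when a compiler's `__array_rank` cannot be trusted.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical fallback: a non-array type has rank 0.
template<typename T>
struct rank_fallback : std::integral_constant<std::size_t, 0> {};

// Each bounded array dimension adds one to the rank of the element type.
template<typename T, std::size_t N>
struct rank_fallback<T[N]>
  : std::integral_constant<std::size_t, 1 + rank_fallback<T>::value> {};

// An unbounded array dimension counts as well.
template<typename T>
struct rank_fallback<T[]>
  : std::integral_constant<std::size_t, 1 + rank_fallback<T>::value> {};

static_assert(rank_fallback<int>::value == 0, "scalar has rank 0");
static_assert(rank_fallback<int[4]>::value == 1, "one dimension");
static_assert(rank_fallback<int[][3][2]>::value == 3, "three dimensions");
static_assert(rank_fallback<int[2][3]>::value == std::rank<int[2][3]>::value,
              "agrees with the standard trait");
```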
-
Jonathan Wakely authored
This allows _GLIBCXX_HAS_BUILTIN (or _GLIBCXX_USE_BUILTIN_TRAIT) to be used as part of a larger logical expression. libstdc++-v3/ChangeLog: * include/bits/c++config (_GLIBCXX_HAS_BUILTIN): Add parentheses.
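The effect of missing parentheses can be sketched with two hypothetical macros (`FEATURE_BARE` and `FEATURE_PAREN` are illustrative names, not the libstdc++ definitions). When the bare form is combined with `&&` in a preprocessor condition, operator precedence regroups the expression and the result changes.

```cpp
#include <cassert>

// Hypothetical feature macros: same expansion, with and without parentheses.
#define FEATURE_BARE  1 || 0
#define FEATURE_PAREN (1 || 0)

// Expands to: 1 || 0 && 0, which groups as 1 || (0 && 0), i.e. true.
#if FEATURE_BARE && 0
constexpr bool bare_result = true;
#else
constexpr bool bare_result = false;
#endif

// Expands to: (1 || 0) && 0, i.e. false, which is what the author meant.
#if FEATURE_PAREN && 0
constexpr bool paren_result = true;
#else
constexpr bool paren_result = false;
#endif

static_assert(bare_result, "bare macro is regrouped by precedence");
static_assert(!paren_result, "parenthesized macro composes as intended");
```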
-
Jonathan Wakely authored
This makes several functions in <bit> faster to compile, with fewer expressions to parse and fewer instantiations of __numeric_traits required. libstdc++-v3/ChangeLog: PR libstdc++/118855 * include/std/bit (__count_lzero, __count_rzero, __popcount): Use type-generic built-ins when available.
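As a rough illustration of the "use type-generic built-ins when available" pattern, the sketch below checks for `__builtin_popcountg` and `__builtin_clzg` with `__has_builtin` and otherwise falls back to simple loops. The function names, the `SKETCH_HAVE_GENERIC_BITOPS` macro and the loop fallbacks are assumptions made for this example; they are not the libstdc++ implementation.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Detect the type-generic built-ins without breaking older preprocessors.
#ifdef __has_builtin
# if __has_builtin(__builtin_popcountg) && __has_builtin(__builtin_clzg)
#  define SKETCH_HAVE_GENERIC_BITOPS 1
# endif
#endif

// Number of set bits in an unsigned value (hypothetical helper).
template<typename T>
inline int popcount_sketch(T x) noexcept
{
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
#ifdef SKETCH_HAVE_GENERIC_BITOPS
  return __builtin_popcountg(x);          // one built-in for every width
#else
  int n = 0;
  for (; x != 0; x &= x - 1)              // clear the lowest set bit
    ++n;
  return n;
#endif
}

// Number of leading zero bits; returns the bit width for x == 0.
template<typename T>
inline int count_lzero_sketch(T x) noexcept
{
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
  constexpr int digits = std::numeric_limits<T>::digits;
#ifdef SKETCH_HAVE_GENERIC_BITOPS
  return __builtin_clzg(x, digits);       // second argument: result for zero
#else
  int n = 0;
  for (int bit = digits - 1; bit >= 0 && !((x >> bit) & 1); --bit)
    ++n;
  return n;
#endif
}
```

With a single type-generic built-in there is no need to select among width-specific variants per type, which is the kind of saving the commit message describes.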
-
Jonathan Wakely authored
These should have been unsigned, but the static assertions are only in the public std::bit_ceil and std::bit_width functions, not the internal __bit_ceil and __bit_width ones. libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h (__find_next_valid_abi): Cast __bit_ceil argument to unsigned. * src/c++17/floating_from_chars.cc (__floating_from_chars_hex): Cast __bit_ceil argument to unsigned. * src/c++17/memory_resource.cc (big_block): Cast __bit_width argument to unsigned.
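The situation can be sketched as follows, with hypothetical names: a public wrapper carries the static assertion, while the internal helper accepts any integer type silently, so callers of the helper have to cast explicitly. This is only an illustration of the pattern, not the libstdc++ code.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical internal helper: no static assertion, so a signed argument
// compiles without complaint. Precondition: the result is representable in T.
template<typename T>
constexpr T bit_ceil_unchecked(T x) noexcept
{
  T result = 1;
  while (result < x)
    result <<= 1;
  return result;
}

// Hypothetical public wrapper: rejects signed types at compile time.
template<typename T>
constexpr T bit_ceil_checked(T x) noexcept
{
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
  return bit_ceil_unchecked(x);
}

// A caller holding a signed, known non-negative value casts before the call.
inline unsigned round_up_to_power_of_two(int n) noexcept
{
  return bit_ceil_unchecked(static_cast<unsigned>(n));
}

static_assert(bit_ceil_checked(5u) == 8u, "5 rounds up to 8");
static_assert(bit_ceil_checked(8u) == 8u, "powers of two are unchanged");
```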
-
Jonathan Wakely authored
Since r15-7511-g4e7f74225116e7 we can disable the warnings for using a reserved priority using a diagnostic pragma. That means we no longer need to put globals using that attribute into separate files that get included. This replaces the two uses of such separate files by moving the variable definition into the source file and adding the diagnostic pragma. libstdc++-v3/ChangeLog: * src/c++17/memory_resource.cc (default_res): Define here instead of including default_resource.h. * src/c++98/globals_io.cc (__ioinit): Define here instead of including ios_base_init.h. * src/c++17/default_resource.h: Removed. * src/c++98/ios_base_init.h: Removed.
-
Jonathan Wakely authored
When linking statically to libstdc++.a (or to libstdc++_nonshared.a in the RHEL devtoolset compiler) there's a static initialization order problem where user code might be constructed before the std::chrono::tzdb_list globals, and so might try to use them after they've already been destroyed. Use the init_priority attribute on those globals so that they are initialized early. Since r15-7511-g4e7f74225116e7 we can disable the warnings for using a reserved priority using a diagnostic pragma. libstdc++-v3/ChangeLog: PR libstdc++/118811 * src/c++20/tzdb.cc (tzdb_list::_Node): Use init_priority attribute on static data members. * testsuite/std/time/tzdb_list/pr118811.cc: New test.
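To illustrate how the `init_priority` attribute affects initialization order, here is a small sketch. It deliberately uses priorities outside the reserved range (so no diagnostic pragma is needed), the names `Tracker`, `late` and `early` are hypothetical, and it assumes a GCC-compatible compiler on a target that supports the attribute.

```cpp
#include <cassert>

namespace sketch {

// Plain ints are zero-initialized statically, before any constructor runs.
int init_order[2];
int init_count;

struct Tracker {
  explicit Tracker(int id) { init_order[init_count++] = id; }
};

// Declared first, but a larger priority number means later initialization.
[[gnu::init_priority(2000)]] Tracker late(1);

// Declared second, but a smaller priority number means earlier initialization.
[[gnu::init_priority(1000)]] Tracker early(2);

} // namespace sketch
```

Without the attributes, `late` would be constructed before `early` because it is declared first in the translation unit. With them, `early` is expected to be constructed first, which is the mechanism the commit relies on to get the library globals initialized ahead of user code.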
-
Andre Vehreschild authored
gcc/fortran/ChangeLog: PR fortran/107635 * gfortran.texi: Remove deprecated functions from documentation. * trans-decl.cc (gfc_build_builtin_function_decls): Remove deprecated function decls. * trans-intrinsic.cc (gfc_conv_intrinsic_exponent): Remove deprecated/no longer needed routines. * trans.h: Remove unused decls. libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_get): Removed because deprecated. (_gfortran_caf_send): Same. (_gfortran_caf_sendget): Same. (_gfortran_caf_send_by_ref): Same. * caf/single.c (assign_char4_from_char1): Same. (assign_char1_from_char4): Same. (convert_type): Same. (defined): Same. (_gfortran_caf_get): Same. (_gfortran_caf_send): Same. (_gfortran_caf_sendget): Same. (copy_data): Same. (get_for_ref): Same. (_gfortran_caf_get_by_ref): Same. (send_by_ref): Same. (_gfortran_caf_send_by_ref): Same. (_gfortran_caf_sendget_by_ref): Same.
-
Andre Vehreschild authored
Add the last missing coarray data manipulation routine using remote accessors. gcc/fortran/ChangeLog: PR fortran/107635 * coarray.cc (rewrite_caf_send): Rewrite to transfer_between_remotes when both sides of the assignment have a coarray. (coindexed_code_callback): Prevent duplicate rewrite. * gfortran.texi: Add documentation for transfer_between_remotes. * intrinsic.cc (add_subroutines): Add intrinsic symbol for caf_sendget to allow easy rewrite to transfer_between_remotes. * trans-decl.cc (gfc_build_builtin_function_decls): Add prototype for transfer_between_remotes. * trans-intrinsic.cc (conv_caf_vector_subscript_elem): Mark as deprecated. (conv_caf_vector_subscript): Same. (compute_component_offset): Same. (conv_expr_ref_to_caf_ref): Same. (conv_stat_and_team): Extract stat and team from expr. (gfc_conv_intrinsic_caf_get): Use conv_stat_and_team. (conv_caf_send_to_remote): Same. (has_ref_after_cafref): Mark as deprecated. (conv_caf_sendget): Translate to transfer_between_remotes. * trans.h: Add prototype for transfer_between_remotes. libgfortran/ChangeLog: * caf/libcaf.h: Add prototype for transfer_between_remotes. * caf/single.c: Implement transfer_between_remotes. gcc/testsuite/ChangeLog: * gfortran.dg/coarray_lib_comm_1.f90: Fix up scan_trees.
-
Andre Vehreschild authored
Refactor to use send_to_remote instead of the slow send_by_ref. gcc/fortran/ChangeLog: PR fortran/107635 * coarray.cc (move_coarray_ref): Move the coarray reference out of the given one. Especially when there is a regular array ref. (fixup_comp_refs): Move components refs to a derived type where the codim has been removed, aka a new type. (split_expr_at_caf_ref): Correctly split the reference chain. (remove_caf_ref): Simplify. (create_get_callback): Fix some deficiencies. (create_allocated_callback): Adapt to new signature of split. (create_send_callback): New function. (rewrite_caf_send): Rewrite a call to caf_send to caf_send_to_remote. (coindexed_code_callback): Treat caf_send and caf_sendget correctly. * gfortran.h (enum gfc_isym_id): Add SENDGET-isym. * gfortran.texi: Add documentation for send_to_remote. * resolve.cc (gfc_resolve_code): No longer generate send_by_ref when allocatable coarray (component) is on the lhs. * trans-decl.cc (gfc_build_builtin_function_decls): Add caf_send_to_remote decl. * trans-intrinsic.cc (conv_caf_func_index): Ensure the static variables created are not in a block-scope. (conv_caf_send_to_remote): Translate caf_send_to_remote calls. (conv_caf_send): Renamed to conv_caf_sendget. (conv_caf_sendget): Renamed from conv_caf_send. (gfc_conv_intrinsic_subroutine): Branch correctly for conv_caf_send and sendget. * trans.h: Correct decl. libgfortran/ChangeLog: * caf/libcaf.h: Add/Correct prototypes for caf_get_from_remote, caf_send_to_remote. * caf/single.c (struct accessor_hash_t): Rename accessor_t to getter_t. (_gfortran_caf_register_accessor): Use new name of getter_t. (_gfortran_caf_send_to_remote): New function for sending data to coarray on a remote image. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/send_char_array_1.f90: Extend test to catch more cases. * gfortran.dg/coarray_42.f90: Invert tests use, because no longer a send is needed when local memory in a coarray is allocated.
-
Andre Vehreschild authored
Replace caf_is_present by caf_is_present_on_remote which is using a dedicated callback for each object to test on the remote image. gcc/fortran/ChangeLog: PR fortran/107635 * coarray.cc (create_allocated_callback): Add creating remote side procedure for checking allocation status of coarray. (rewrite_caf_allocated): Rewrite ALLOCATED on coarray to use caf routine. (coindexed_expr_callback): Exempt caf_is_present_on_remote from being rewritten again. * gfortran.h (enum gfc_isym_id): Add caf_is_present_on_remote id. * gfortran.texi: Add documentation for caf_is_present_on_remote. * intrinsic.cc (add_functions): Add caf_is_present_on_remote symbol. * trans-decl.cc (gfc_build_builtin_function_decls): Define interface of caf_is_present_on_remote. * trans-intrinsic.cc (gfc_conv_intrinsic_caf_is_present_remote): Translate caf_is_present_on_remote. (trans_caf_is_present): Remove. (caf_this_image_ref): Remove. (gfc_conv_allocated): Take out coarray treatment, because that is rewritten to caf_is_present_on_remote now. (gfc_conv_intrinsic_function): Handle caf_is_present_on_remote calls. * trans.h: Add symbol for caf_is_present_on_remote and remove old one. libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_is_present_on_remote): Add new function. (_gfortran_caf_is_present): Remove deprecated one. * caf/single.c (struct accessor_hash_t): Add function ptr access for remote side call. (_gfortran_caf_is_present_on_remote): Added. (_gfortran_caf_is_present): Removed. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/coarray_allocated.f90: Adapt to new method of checking on remote image. * gfortran.dg/coarray_lib_alloc_4.f90: Same.
-
Andre Vehreschild authored
Extract calls to non-pure or non-elemental functions from index expressions on a coarray. gcc/fortran/ChangeLog: PR fortran/107635 * coarray.cc (get_arrayspec_from_expr): Treat array result of function calls correctly. (remove_coarray_from_derived_type): Prevent memory loss. (add_caf_get_from_remote): Correct locus. (find_comp): New function to find or create a new component in a derived type. (check_add_new_comp_handle_array): Handle allocatable arrays or non-pure/non-elemental functions in indexes of coarrays. (check_add_new_component): Use above function. (create_get_parameter_type): Rename to create_caf_add_data_parameter_type. (create_caf_add_data_parameter_type): Renaming of variable and make the additional data a coarray. (remove_caf_ref): Factor out to reuse in other caf-functions. (create_get_callback): Use function factored out, set locus correctly and ensure a kind is set for parameters. (add_caf_get_intrinsic): Rename to add_caf_get_from_remote and rename some variables. (coindexed_expr_callback): Skip over function created by the rewriter. (coindexed_code_callback): Filter some intrinsics not to process. (gfc_coarray_rewrite): Rewrite also contained functions. * trans-intrinsic.cc (gfc_conv_intrinsic_caf_get): Reflect changed order on caf_get_from_remote (). libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_register_accessor): Reflect changed parameter order. * caf/single.c (struct accessor_hash_t): Same. (_gfortran_caf_register_accessor): Call accessor using a token for accessing arrays with a descriptor on the source side. gcc/testsuite/ChangeLog: * gfortran.dg/coarray_lib_comm_1.f90: Adapt scan expression. * gfortran.dg/coarray/get_with_fn_parameter.f90: New test. * gfortran.dg/coarray/get_with_scalar_fn.f90: New test.
-
Andre Vehreschild authored
Factor out generation of code to get remote function index and to create the additional data structure. Rename caf_get_by_ct to caf_get_from_remote. gcc/fortran/ChangeLog: PR fortran/107635 * gfortran.texi: Rename caf_get_by_ct to caf_get_from_remote. * trans-decl.cc (gfc_build_builtin_function_decls): Rename intrinsic. * trans-intrinsic.cc (conv_caf_func_index): Factor out functionality to be reused by other caf-functions. (conv_caf_add_call_data): Same. (gfc_conv_intrinsic_caf_get): Use functions factored out. * trans.h: Rename intrinsic symbol. libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_get_by_ref): Remove from ABI. This function is replaced by caf_get_from_remote (). (_gfortran_caf_get_remote_function_index): Use better name. * caf/single.c (_gfortran_caf_finalize): Free internal data. (_gfortran_caf_get_by_ref): Remove from public interface, but keep it, because it is still used by sendget (). gcc/testsuite/ChangeLog: * gfortran.dg/coarray_lib_comm_1.f90: Adapt to renamed ABI function. * gfortran.dg/coarray_stat_function.f90: Same. * gfortran.dg/coindexed_1.f90: Same.
-
Andre Vehreschild authored
Add a rewriter to keep all expression tree rewriting that is not optimization in one place. At the moment this is just a move from resolve.cc, but it will be extended to handle more cases where rewriting the expression tree may be easier. The first use case is to extract accessors for coarray remote image data access. gcc/fortran/ChangeLog: PR fortran/107635 * Make-lang.in: Add coarray.cc. * coarray.cc: New file. * gfortran.h (gfc_coarray_rewrite): New procedure. * parse.cc (rewrite_expr_tree): Add entrypoint for rewriting expression trees. * resolve.cc (gfc_resolve_ref): Remove caf_lhs handling. (get_arrayspec_from_expr): Moved to rewrite.cc. (remove_coarray_from_derived_type): Same. (convert_coarray_class_to_derived_type): Same. (split_expr_at_caf_ref): Same. (check_add_new_component): Same. (create_get_parameter_type): Same. (create_get_callback): Same. (add_caf_get_intrinsic): Same. (resolve_variable): Remove caf_lhs handling. libgfortran/ChangeLog: * caf/single.c (_gfortran_caf_finalize): Free memory preventing leaks. (_gfortran_caf_get_by_ct): Fix constness. * caf/libcaf.h (_gfortran_caf_register_accessor): Fix constness.
-
Richard Biener authored
The PR indicates a very specific issue with regard to SSA coalescing failures because there's a pre IV increment loop exit test. While IVOPTs created the desired IL we later simplify the exit test into the undesirable form again. The following fixes this up during RTL expansion where we try to improve coalescing of IVs. That seems easier than trying to avoid the simplification with some weird heuristics (it could also have been written this way). PR tree-optimization/86270 * tree-outof-ssa.cc (insert_backedge_copies): Pattern match a single conflict in a loop condition and adjust that avoiding the conflict if possible. * gcc.target/i386/pr86270.c: Adjust to check for no reg-reg copies as well.
-
H.J. Lu authored
Add a test for PR target/118936 which was fixed by reverting: 565d4e75 i386: Simplify PARALLEL RTX scan in ix86_find_all_reg_use 11902be7 x86: Properly find the maximum stack slot alignment PR target/118936 * gcc.target/i386/pr118936.c: New test. Signed-off-by:
H.J. Lu <hjl.tools@gmail.com>
-
Patrick Palka authored
Even though 'iterator' is a reserved macro name, we can't use it as the name of this implementation detail type since it could introduce name lookup ambiguity in valid code, e.g. struct A { using iterator = void; }; struct B : concat_view<...>, A { using type = iterator; }; libstdc++-v3/ChangeLog: * include/std/ranges (concat_view::iterator): Rename to ... (concat_view::_Iterator): ... this throughout. Reviewed-by:
Jonathan Wakely <jwakely@redhat.com>
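The ambiguity described above can be sketched with a hypothetical stand-in for the library view (`view_sketch` and `iterator_impl` are illustrative names). Member name lookup runs before access checking, so even a private nested type named `iterator` in one base would clash with `A::iterator` in the other; giving the implementation detail a different name avoids that.

```cpp
#include <cassert>
#include <type_traits>

struct A { using iterator = void; };

// Hypothetical stand-in for a library view. Its implementation-detail
// iterator type is private and uses a name that user code will not look up.
// Had it been declared as `struct iterator {};`, the lookup of `iterator`
// inside B below would find two different entities in two base classes and
// be ambiguous, even though the nested type is private.
class view_sketch {
  struct iterator_impl {};
};

struct B : view_sketch, A {
  using type = iterator;   // unambiguous: only A declares `iterator`
};

static_assert(std::is_void<B::type>::value, "B::type is A::iterator");
```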
-
Patrick Palka authored
Our concat_view implementation is accidentally based off of an older revision of the paper, P2542R7 instead of R8. As far as I can tell the only semantic change in the final revision is the relaxed constraints on the iterator's iter/sent operator- overloads, which this patch updates. This patch also simplifies the concat_view::end wording via C++26 pack indexing as per the final revision. In turn we make the availability of this library feature conditional on __cpp_pack_indexing. (Note pack indexing is implemented in GCC 15 and Clang 19). PR libstdc++/115209 libstdc++-v3/ChangeLog: * include/bits/version.def (ranges_concat): Depend on __cpp_pack_indexing. * include/bits/version.h: Regenerate. * include/std/ranges (__detail::__last_is_common): Remove. (__detail::__all_but_first_sized): New. (concat_view::end): Use C++26 pack indexing instead of __last_is_common as per R8 of P2542. (concat_view::iterator::operator-): Update constraints on iter/sent overloads as per R8 of P2542. Reviewed-by:
Jonathan Wakely <jwakely@redhat.com>
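For readers who have not seen C++26 pack indexing, the following sketch selects the last type of a parameter pack. The alias name `last_t` and the `std::tuple_element_t` fallback for compilers without `__cpp_pack_indexing` are assumptions made for this example; the library itself simply makes the feature conditional on the macro.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

#if defined(__cpp_pack_indexing)
// C++26 pack indexing: pick the element at a constant index of the pack.
template<typename... Ts>
using last_t = Ts...[sizeof...(Ts) - 1];
#else
// Fallback for older compilers (an assumption of this sketch).
template<typename... Ts>
using last_t = std::tuple_element_t<sizeof...(Ts) - 1, std::tuple<Ts...>>;
#endif

static_assert(std::is_same<last_t<int, char, double>, double>::value,
              "last element of a three-type pack");
static_assert(std::is_same<last_t<long>, long>::value,
              "single-element pack");
```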
-
GCC Administrator authored
-
- Feb 19, 2025
-
-
Thomas Schwinge authored
..., where "support" means that the build doesn't fail, but it doesn't mean that all target libraries get built and we get pretty test results for the additional languages. * configure.ac (unsupported_languages) [GCN, nvptx]: Add 'ada'. (noconfigdirs) [GCN, nvptx]: Add 'target-libobjc', 'target-libffi', 'target-libgo'. * configure: Regenerate.
-
Georg-Johann Lay authored
gcc/testsuite/ * gcc.target/avr/torture/isr-04-regs.c: New test. * gcc.target/avr/isr-test.h: Don't set GPRs to values that are 0 mod 0x11.
-
Andrew Pinski authored
This testcase started to fail with r15-268-g9dbff9c05520a7. When late_combine was added, it was turned on for -O2+ only, so this testcase still failed. This changes the option to -O2 instead of -O, and the testcase passes again. Tested on aarch64-linux-gnu. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr112105.c: Change to be -O2 rather than -O1. Signed-off-by:
Andrew Pinski <quic_apinski@quicinc.com>
-
David Malcolm authored
input.cc's file_cache was borrowing copies of the file name. This could lead to use-after-free when writing out sarif output from Fortran, which frees its filenames before the sarif output is fully written out. Fix by taking a copy in file_cache_slot. gcc/ChangeLog: PR other/118919 * input.cc (file_cache_slot::m_file_path): Make non-const. (file_cache_slot::evict): Free m_file_path. (file_cache_slot::create): Store a copy of file_path if non-null. (file_cache_slot::~file_cache_slot): Free m_file_path. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
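The borrow-versus-copy distinction behind this fix can be sketched with a hypothetical `path_slot` class. It is not the `file_cache_slot` code from input.cc; it only shows a slot that owns a private copy of the path, so the caller may free or overwrite its own string afterwards.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical cache slot that owns its file path instead of borrowing it.
class path_slot {
public:
  path_slot() = default;
  path_slot(const path_slot&) = delete;
  path_slot& operator=(const path_slot&) = delete;
  ~path_slot() { std::free(m_path); }

  // Store a private copy of `path` (or nothing if `path` is null).
  void create(const char* path)
  {
    char* copy = nullptr;
    if (path) {
      copy = static_cast<char*>(std::malloc(std::strlen(path) + 1));
      if (copy)
        std::strcpy(copy, path);
    }
    std::free(m_path);
    m_path = copy;
  }

  // Drop the stored copy.
  void evict()
  {
    std::free(m_path);
    m_path = nullptr;
  }

  const char* path() const { return m_path; }

private:
  char* m_path = nullptr;   // owned, non-const
};

// The caller's buffer is overwritten after create(); the slot is unaffected.
inline bool survives_caller_release()
{
  char name[] = "input.f90";
  path_slot slot;
  slot.create(name);
  std::memset(name, 'x', sizeof name - 1);   // simulate the caller reusing it
  return slot.path() && std::strcmp(slot.path(), "input.f90") == 0;
}
```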
-
David Malcolm authored
Previously the analyzer treated IFN_UBSAN_BOUNDS as a no-op, but the other IFN_UBSAN_* were unrecognized and conservatively treated as having arbitrary behavior. Treat IFN_UBSAN_NULL and IFN_UBSAN_PTR also as no-ops, which should make -fanalyzer behave better with -fsanitize=undefined. gcc/analyzer/ChangeLog: PR analyzer/118300 * kf.cc (class kf_ubsan_bounds): Replace this with... (class kf_ubsan_noop): ...this. (register_sanitizer_builtins): Use it to handle IFN_UBSAN_NULL, IFN_UBSAN_BOUNDS, and IFN_UBSAN_PTR as no-ops. (register_known_functions): Drop handling of IFN_UBSAN_BOUNDS here, as it's now handled by register_sanitizer_builtins above. gcc/testsuite/ChangeLog: PR analyzer/118300 * gcc.dg/analyzer/ubsan-pr118300.c: New test. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
Pan Li authored
This patch fixes an ICE similar to the one below. Assume we have the following sample code:

    int a, b, c;
    short d, e, f;
    long g (long h) { return h; }

    void i () {
      for (; b; ++b) {
        f = 5 >> a ? d : d << a;
        e &= c | g(f);
      }
    }

It will ICE when compiled with -O3 -march=rv64gc_zve64f -mrvv-vector-bits=zvl:

    during GIMPLE pass: vect
    pr116351-1.c: In function ‘i’:
    pr116351-1.c:8:6: internal compiler error: in get_len_load_store_mode, at optabs-tree.cc:655
        8 | void i () {
          |      ^
    0x44d6b9d internal_error(char const*, ...)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517
    0x44a26a6 fancy_abort(char const*, int, char const*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722
    0x19e4309 get_len_load_store_mode(machine_mode, bool, internal_fn*, vec<int, va_heap, vl_ptr>*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs-tree.cc:655
    0x1fada40 vect_verify_loop_lens
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:1566
    0x1fb2b07 vect_analyze_loop_2
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3037
    0x1fb4302 vect_analyze_loop_1
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3478
    0x1fb4e9a vect_analyze_loop(loop*, gimple*, vec_info_shared*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3638
    0x203c2dc try_vectorize_loop_1
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1095
    0x203c839 try_vectorize_loop
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1212
    0x203cb2c execute

During vectorization the override_widen pattern is matched, and loop_info then gets DImode as its vector_mode. After that the loop_vinfo steps into vect_analyze_xx with the flow below:

    vect_analyze_loop_2
    |- vect_pattern_recog // over-widening and set loop_vinfo->vector_mode to DImode
    |- ...
    |- vect_analyze_loop_operations
    |- stmt_info->def_type == vect_reduction_def
    |- stmt_info->slp_type == pure_slp
    |- vectorizable_lc_phi // Not Hit
    |- vectorizable_induction // Not Hit
    |- vectorizable_reduction // Not Hit
    |- vectorizable_recurr // Not Hit
    |- vectorizable_live_operation // Not Hit
    |- vect_analyze_stmt
    |- stmt_info->relevant == vect_unused_in_scope
    |- stmt_info->live == false
    |- p pattern_stmt_info == (stmt_vec_info) 0x0
    |- return opt_result::success ();
    OR
    |- PURE_SLP_STMT (stmt_info) && !node then dump "handled only by SLP analysis\n"
    |- Early return opt_result::success ();
    |- vectorizable_load/store/call_convert/... // Not Hit
    |- LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P && !LOOP_VINFO_MASKS(loop_vinfo).is_empty ()
    |- vect_verify_loop_lens (loop_vinfo)
    |- assert (VECTOR_MODE_P (loop_vinfo->vector_mode)); // Hit assert, resulting in ICE

Finally, the DImode in loop_vinfo hits the assert (VECTOR_MODE_P (mode)) in vect_verify_loop_lens. To fix the ICE, this patch returns false directly if the loop_vinfo has a relevant mode such as DImode; similar cases may still be mis-optimized, and we will try to cover those in separate patches. The following test suites pass with this patch. * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. PR middle-end/116351 gcc/ChangeLog: * tree-vect-loop.cc (vect_verify_loop_lens): Return false if the loop_vinfo has relevant mode such as DImode. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr116351-1.c: New test. * gcc.target/riscv/rvv/base/pr116351-2.c: New test. * gcc.target/riscv/rvv/base/pr116351.h: New test. Signed-off-by:
Pan Li <pan2.li@intel.com>
-
Xi Ruoyao authored
Allow (t + (1ul << imm >> 1)) >> imm to be recognized as a rounding shift operation. gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVSRARI): Remove. (UNSPEC_LASX_XVSRLRI): Remove. (lasx_xvsrari_<lsxfmt>): Remove. (lasx_xvsrlri_<lsxfmt>): Remove. * config/loongarch/lsx.md (UNSPEC_LSX_VSRARI): Remove. (UNSPEC_LSX_VSRLRI): Remove. (lsx_vsrari_<lsxfmt>): Remove. (lsx_vsrlri_<lsxfmt>): Remove. * config/loongarch/simd.md (simd_<optab>_imm_round_<mode>): New define_insn. (<simd_isa>_<x>v<insn>ri_<simdfmt>): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vect-shift-imm-round.c: New test.
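The expression in this commit message can be written as a small scalar function to show what "rounding shift" means: half of the divisor is added before shifting, so the result is rounded to nearest (halves round up) instead of truncated. The function name is hypothetical, and the sketch covers only the unsigned 64-bit case with `imm` below 64; the addition wraps around on overflow.

```cpp
#include <cassert>
#include <cstdint>

// Scalar sketch of (t + (1ul << imm >> 1)) >> imm for unsigned values.
// Precondition: imm < 64.
constexpr std::uint64_t rounding_shift_right(std::uint64_t t, unsigned imm)
{
  return (t + ((std::uint64_t{1} << imm) >> 1)) >> imm;
}

static_assert(rounding_shift_right(5, 1) == 3, "2.5 rounds up to 3");
static_assert(rounding_shift_right(4, 1) == 2, "2.0 stays 2");
static_assert(rounding_shift_right(7, 2) == 2, "1.75 rounds to 2");
static_assert(rounding_shift_right(5, 2) == 1, "1.25 rounds to 1");
static_assert(rounding_shift_right(9, 0) == 9, "shift by zero adds no bias");
```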
-
Xi Ruoyao authored
Although it is just a special case of "a widening product whose result is used for reduction", having these standard names allows the dot product pattern to be recognized earlier, which may benefit optimization. Also fix some test failures with the test cases: - gcc.dg/vect/vect-reduc-chain-2.c - gcc.dg/vect/vect-reduc-chain-3.c - gcc.dg/vect/vect-reduc-chain-dot-slp-3.c - gcc.dg/vect/vect-reduc-chain-dot-slp-4.c gcc/ChangeLog: * config/loongarch/simd.md (wvec_half): New define_mode_attr. (<su>dot_prod<wvec_half><mode>): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/wide-mul-reduc-2.c (dg-final): Scan DOT_PROD_EXPR in optimized tree.
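A scalar loop of the following shape is the usual source-level form of a dot product, that is, a widening multiplication whose results are summed. The function name and element types are chosen for illustration only; whether a given compiler version vectorizes this particular loop as a dot product is not something this sketch claims.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Multiply 16-bit elements in 32-bit precision and accumulate the products.
inline std::int32_t dot_product(const std::int16_t* a,
                                const std::int16_t* b,
                                std::size_t n)
{
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += std::int32_t{a[i]} * std::int32_t{b[i]};   // widen, then multiply
  return sum;
}
```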
-
Xi Ruoyao authored
Since PR116142 has been fixed, we can now add the standard names so the compiler will generate better code if the result of a widening product is reduced. gcc/ChangeLog: * config/loongarch/simd.md (even_odd): New define_int_attr. (vec_widen_<su>mult_<even_odd>_<mode>): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/wide-mul-reduc-1.c: New test. * gcc.target/loongarch/wide-mul-reduc-2.c: New test.
-
Xi Ruoyao authored
Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use special predicates instead of hard-coded const vectors. This is not suitable for LASX where lasx_xvpick has a different semantic. gcc/ChangeLog: * config/loongarch/simd.md (LVEC): New define_mode_attr. (simdfmt_as_i): Make it same as simdfmt for integer vector modes. (_f): New define_mode_attr. * config/loongarch/lsx.md (lsx_vpickev_b): Remove. (lsx_vpickev_h): Remove. (lsx_vpickev_w): Remove. (lsx_vpickev_w_f): Remove. (lsx_vpickod_b): Remove. (lsx_vpickod_h): Remove. (lsx_vpickod_w): Remove. (lsx_vpickev_w_f): Remove. (lsx_pick_evod_<mode>): New define_insn. (lsx_<x>vpick<ev_od>_<simdfmt_as_i><_f>): New define_expand.
-
Xi Ruoyao authored
Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use special predicates and TImode RTL instead of hard-coded const vectors and UNSPECs. Also reorder two operands of the outer plus in the template, so combine will recognize {x,}vadd + {x,}vmulw{ev,od} => {x,}vmaddw{ev,od}. gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVMADDWEV): Remove. (UNSPEC_LASX_XVMADDWEV2): Remove. (UNSPEC_LASX_XVMADDWEV3): Remove. (UNSPEC_LASX_XVMADDWOD): Remove. (UNSPEC_LASX_XVMADDWOD2): Remove. (UNSPEC_LASX_XVMADDWOD3): Remove. (lasx_xvmaddwev_h_b<u>): Remove. (lasx_xvmaddwev_w_h<u>): Remove. (lasx_xvmaddwev_d_w<u>): Remove. (lasx_xvmaddwev_q_d): Remove. (lasx_xvmaddwod_h_b<u>): Remove. (lasx_xvmaddwod_w_h<u>): Remove. (lasx_xvmaddwod_d_w<u>): Remove. (lasx_xvmaddwod_q_d): Remove. (lasx_xvmaddwev_q_du): Remove. (lasx_xvmaddwod_q_du): Remove. (lasx_xvmaddwev_h_bu_b): Remove. (lasx_xvmaddwev_w_hu_h): Remove. (lasx_xvmaddwev_d_wu_w): Remove. (lasx_xvmaddwev_q_du_d): Remove. (lasx_xvmaddwod_h_bu_b): Remove. (lasx_xvmaddwod_w_hu_h): Remove. (lasx_xvmaddwod_d_wu_w): Remove. (lasx_xvmaddwod_q_du_d): Remove. * config/loongarch/lsx.md (UNSPEC_LSX_VMADDWEV): Remove. (UNSPEC_LSX_VMADDWEV2): Remove. (UNSPEC_LSX_VMADDWEV3): Remove. (UNSPEC_LSX_VMADDWOD): Remove. (UNSPEC_LSX_VMADDWOD2): Remove. (UNSPEC_LSX_VMADDWOD3): Remove. (lsx_vmaddwev_h_b<u>): Remove. (lsx_vmaddwev_w_h<u>): Remove. (lsx_vmaddwev_d_w<u>): Remove. (lsx_vmaddwev_q_d): Remove. (lsx_vmaddwod_h_b<u>): Remove. (lsx_vmaddwod_w_h<u>): Remove. (lsx_vmaddwod_d_w<u>): Remove. (lsx_vmaddwod_q_d): Remove. (lsx_vmaddwev_q_du): Remove. (lsx_vmaddwod_q_du): Remove. (lsx_vmaddwev_h_bu_b): Remove. (lsx_vmaddwev_w_hu_h): Remove. (lsx_vmaddwev_d_wu_w): Remove. (lsx_vmaddwev_q_du_d): Remove. (lsx_vmaddwod_h_bu_b): Remove. (lsx_vmaddwod_w_hu_h): Remove. (lsx_vmaddwod_d_wu_w): Remove. (lsx_vmaddwod_q_du_d): Remove. * config/loongarch/simd.md (simd_maddw_evod_<mode>_<su>): New define_insn. 
(<simd_isa>_<x>vmaddw<ev_od>_<simdfmt_w>_<simdfmt><u>): New define_expand. (simd_maddw_evod_<mode>_hetero): New define_insn. (<simd_isa>_<x>vmaddw<ev_od>_<simdfmt_w>_<simdfmt>u_<simdfmt>): New define_expand. (<simd_isa>_maddw<ev_od>_q_d<u>_punned): New define_expand. (<simd_isa>_maddw<ev_od>_q_du_d_punned): New define_expand. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vmaddwev_q_d): Define as a macro to override it with the punned expand. (CODE_FOR_lsx_vmaddwev_q_du): Likewise. (CODE_FOR_lsx_vmaddwev_q_du_d): Likewise. (CODE_FOR_lsx_vmaddwod_q_d): Likewise. (CODE_FOR_lsx_vmaddwod_q_du): Likewise. (CODE_FOR_lsx_vmaddwod_q_du_d): Likewise. (CODE_FOR_lasx_xvmaddwev_q_d): Likewise. (CODE_FOR_lasx_xvmaddwev_q_du): Likewise. (CODE_FOR_lasx_xvmaddwev_q_du_d): Likewise. (CODE_FOR_lasx_xvmaddwod_q_d): Likewise. (CODE_FOR_lasx_xvmaddwod_q_du): Likewise. (CODE_FOR_lasx_xvmaddwod_q_du_d): Likewise.
-
Xi Ruoyao authored
Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use special predicates and TImode RTL instead of hard-coded const vectors and UNSPECs. gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVHADDW_Q_D): Remove. (UNSPEC_LASX_XVHSUBW_Q_D): Remove. (UNSPEC_LASX_XVHADDW_QU_DU): Remove. (UNSPEC_LASX_XVHSUBW_QU_DU): Remove. (lasx_xvh<addsub:optab>w_h<u>_b<u>): Remove. (lasx_xvh<addsub:optab>w_w<u>_h<u>): Remove. (lasx_xvh<addsub:optab>w_d<u>_w<u>): Remove. (lasx_xvhaddw_q_d): Remove. (lasx_xvhsubw_q_d): Remove. (lasx_xvhaddw_qu_du): Remove. (lasx_xvhsubw_qu_du): Remove. (reduc_plus_scal_v4di): Call gen_lasx_haddw_q_d_punned instead of gen_lasx_xvhaddw_q_d. (reduc_plus_scal_v8si): Likewise. * config/loongarch/lsx.md (UNSPEC_LSX_VHADDW_Q_D): Remove. (UNSPEC_ASX_VHSUBW_Q_D): Remove. (UNSPEC_ASX_VHADDW_QU_DU): Remove. (UNSPEC_ASX_VHSUBW_QU_DU): Remove. (lsx_vh<addsub:optab>w_h<u>_b<u>): Remove. (lsx_vh<addsub:optab>w_w<u>_h<u>): Remove. (lsx_vh<addsub:optab>w_d<u>_w<u>): Remove. (lsx_vhaddw_q_d): Remove. (lsx_vhsubw_q_d): Remove. (lsx_vhaddw_qu_du): Remove. (lsx_vhsubw_qu_du): Remove. (reduc_plus_scal_v2di): Change the temporary register mode to V1TI, and pun the mode calling gen_vec_extractv2didi. (reduc_plus_scal_v4si): Change the temporary register mode to V1TI. * config/loongarch/simd.md (simd_h<optab>w_<mode>_<su>): New define_insn. (<simd_isa>_<x>vh<optab>w_<simdfmt_w><u>_<simdfmt><u>): New define_expand. (<simd_isa>_h<optab>w_q<u>_d<u>_punned): New define_expand. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vhaddw_q_d): Define as a macro to override with punned expand. (CODE_FOR_lsx_vhaddw_qu_du): Likewise. (CODE_FOR_lsx_vhsubw_q_d): Likewise. (CODE_FOR_lsx_vhsubw_qu_du): Likewise. (CODE_FOR_lasx_xvhaddw_q_d): Likewise. (CODE_FOR_lasx_xvhaddw_qu_du): Likewise. (CODE_FOR_lasx_xvhsubw_q_d): Likewise. (CODE_FOR_lasx_xvhsubw_qu_du): Likewise.
-
Xi Ruoyao authored
These pattern definitions are tediously long, invoking 32 UNSPECs and many hard-coded long const vectors. To simplify them, we first use the TImode vector operations instead of the UNSPECs, then we adopt an approach in AArch64: using a special predicate to match the const vectors for odd/even indices for define_insn's, and generate those vectors in define_expand's. For "backward compatibility" we need to provide a "punned" version for the operations invoking TImode vectors as the intrinsics still expect DImode vectors. The stat is "201 insertions, 905 deletions." gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVADDWEV): Remove. (UNSPEC_LASX_XVADDWEV2): Remove. (UNSPEC_LASX_XVADDWEV3): Remove. (UNSPEC_LASX_XVSUBWEV): Remove. (UNSPEC_LASX_XVSUBWEV2): Remove. (UNSPEC_LASX_XVMULWEV): Remove. (UNSPEC_LASX_XVMULWEV2): Remove. (UNSPEC_LASX_XVMULWEV3): Remove. (UNSPEC_LASX_XVADDWOD): Remove. (UNSPEC_LASX_XVADDWOD2): Remove. (UNSPEC_LASX_XVADDWOD3): Remove. (UNSPEC_LASX_XVSUBWOD): Remove. (UNSPEC_LASX_XVSUBWOD2): Remove. (UNSPEC_LASX_XVMULWOD): Remove. (UNSPEC_LASX_XVMULWOD2): Remove. (UNSPEC_LASX_XVMULWOD3): Remove. (lasx_xv<addsubmul:optab>wev_h_b<u>): Remove. (lasx_xv<addsubmul:optab>wev_w_h<u>): Remove. (lasx_xv<addsubmul:optab>wev_d_w<u>): Remove. (lasx_xvaddwev_q_d): Remove. (lasx_xvsubwev_q_d): Remove. (lasx_xvmulwev_q_d): Remove. (lasx_xv<addsubmul:optab>wod_h_b<u>): Remove. (lasx_xv<addsubmul:optab>wod_w_h<u>): Remove. (lasx_xv<addsubmul:optab>wod_d_w<u>): Remove. (lasx_xvaddwod_q_d): Remove. (lasx_xvsubwod_q_d): Remove. (lasx_xvmulwod_q_d): Remove. (lasx_xvaddwev_q_du): Remove. (lasx_xvsubwev_q_du): Remove. (lasx_xvmulwev_q_du): Remove. (lasx_xvaddwod_q_du): Remove. (lasx_xvsubwod_q_du): Remove. (lasx_xvmulwod_q_du): Remove. (lasx_xv<addmul:optab>wev_h_bu_b): Remove. (lasx_xv<addmul:optab>wev_w_hu_h): Remove. (lasx_xv<addmul:optab>wev_d_wu_w): Remove. (lasx_xv<addmul:optab>wod_h_bu_b): Remove. (lasx_xv<addmul:optab>wod_w_hu_h): Remove. 
(lasx_xv<addmul:optab>wod_d_wu_w): Remove. (lasx_xvaddwev_q_du_d): Remove. (lasx_xvsubwev_q_du_d): Remove. (lasx_xvmulwev_q_du_d): Remove. (lasx_xvaddwod_q_du_d): Remove. (lasx_xvsubwod_q_du_d): Remove. * config/loongarch/lsx.md (UNSPEC_LSX_XVADDWEV): Remove. (UNSPEC_LSX_VADDWEV2): Remove. (UNSPEC_LSX_VADDWEV3): Remove. (UNSPEC_LSX_VSUBWEV): Remove. (UNSPEC_LSX_VSUBWEV2): Remove. (UNSPEC_LSX_VMULWEV): Remove. (UNSPEC_LSX_VMULWEV2): Remove. (UNSPEC_LSX_VMULWEV3): Remove. (UNSPEC_LSX_VADDWOD): Remove. (UNSPEC_LSX_VADDWOD2): Remove. (UNSPEC_LSX_VADDWOD3): Remove. (UNSPEC_LSX_VSUBWOD): Remove. (UNSPEC_LSX_VSUBWOD2): Remove. (UNSPEC_LSX_VMULWOD): Remove. (UNSPEC_LSX_VMULWOD2): Remove. (UNSPEC_LSX_VMULWOD3): Remove. (lsx_v<addsubmul:optab>wev_h_b<u>): Remove. (lsx_v<addsubmul:optab>wev_w_h<u>): Remove. (lsx_v<addsubmul:optab>wev_d_w<u>): Remove. (lsx_vaddwev_q_d): Remove. (lsx_vsubwev_q_d): Remove. (lsx_vmulwev_q_d): Remove. (lsx_v<addsubmul:optab>wod_h_b<u>): Remove. (lsx_v<addsubmul:optab>wod_w_h<u>): Remove. (lsx_v<addsubmul:optab>wod_d_w<u>): Remove. (lsx_vaddwod_q_d): Remove. (lsx_vsubwod_q_d): Remove. (lsx_vmulwod_q_d): Remove. (lsx_vaddwev_q_du): Remove. (lsx_vsubwev_q_du): Remove. (lsx_vmulwev_q_du): Remove. (lsx_vaddwod_q_du): Remove. (lsx_vsubwod_q_du): Remove. (lsx_vmulwod_q_du): Remove. (lsx_v<addmul:optab>wev_h_bu_b): Remove. (lsx_v<addmul:optab>wev_w_hu_h): Remove. (lsx_v<addmul:optab>wev_d_wu_w): Remove. (lsx_v<addmul:optab>wod_h_bu_b): Remove. (lsx_v<addmul:optab>wod_w_hu_h): Remove. (lsx_v<addmul:optab>wod_d_wu_w): Remove. (lsx_vaddwev_q_du_d): Remove. (lsx_vsubwev_q_du_d): Remove. (lsx_vmulwev_q_du_d): Remove. (lsx_vaddwod_q_du_d): Remove. (lsx_vsubwod_q_du_d): Remove. (lsx_vmulwod_q_du_d): Remove. * config/loongarch/loongarch-modes.def: Add V4TI and V1DI. * config/loongarch/loongarch-protos.h (loongarch_gen_stepped_int_parallel): New function prototype. * config/loongarch/loongarch.cc (loongarch_print_operand): Accept 'O' for printing "ev" or "od." 
(loongarch_gen_stepped_int_parallel): Implement. * config/loongarch/predicates.md (vect_par_cnst_even_or_odd_half): New define_predicate. * config/loongarch/simd.md (WVEC_HALF): New define_mode_attr. (simdfmt_w): Likewise. (zero_one): New define_int_iterator. (ev_od): New define_int_attr. (simd_<optab>w_evod_<mode:IVEC>_<su>): New define_insn. (<simd_isa>_<x>v<optab>w<ev_od>_<simdfmt_w>_<simdfmt><u>): New define_expand. (simd_<optab>w_evod_<mode>_hetero): New define_insn. (<simd_isa>_<x>v<optab>w<ev_od>_<simdfmt_w>_<simdfmt>u_<simdfmt>): New define_expand. (DIVEC): New define_mode_iterator. (<simd_isa>_<optab>w<ev_od>_q_d<u>_punned): New define_expand. (<simd_isa>_<optab>w<ev_od>_q_du_d_punned): Likewise. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vaddwev_q_d): Define as a macro to override it with the punned expand. (CODE_FOR_lsx_vaddwev_q_du): Likewise. (CODE_FOR_lsx_vsubwev_q_d): Likewise. (CODE_FOR_lsx_vsubwev_q_du): Likewise. (CODE_FOR_lsx_vmulwev_q_d): Likewise. (CODE_FOR_lsx_vmulwev_q_du): Likewise. (CODE_FOR_lsx_vaddwod_q_d): Likewise. (CODE_FOR_lsx_vaddwod_q_du): Likewise. (CODE_FOR_lsx_vsubwod_q_d): Likewise. (CODE_FOR_lsx_vsubwod_q_du): Likewise. (CODE_FOR_lsx_vmulwod_q_d): Likewise. (CODE_FOR_lsx_vmulwod_q_du): Likewise. (CODE_FOR_lsx_vaddwev_q_du_d): Likewise. (CODE_FOR_lsx_vmulwev_q_du_d): Likewise. (CODE_FOR_lsx_vaddwod_q_du_d): Likewise. (CODE_FOR_lsx_vmulwod_q_du_d): Likewise. (CODE_FOR_lasx_xvaddwev_q_d): Likewise. (CODE_FOR_lasx_xvaddwev_q_du): Likewise. (CODE_FOR_lasx_xvsubwev_q_d): Likewise. (CODE_FOR_lasx_xvsubwev_q_du): Likewise. (CODE_FOR_lasx_xvmulwev_q_d): Likewise. (CODE_FOR_lasx_xvmulwev_q_du): Likewise. (CODE_FOR_lasx_xvaddwod_q_d): Likewise. (CODE_FOR_lasx_xvaddwod_q_du): Likewise. (CODE_FOR_lasx_xvsubwod_q_d): Likewise. (CODE_FOR_lasx_xvsubwod_q_du): Likewise. (CODE_FOR_lasx_xvmulwod_q_d): Likewise. (CODE_FOR_lasx_xvmulwod_q_du): Likewise. (CODE_FOR_lasx_xvaddwev_q_du_d): Likewise. 
(CODE_FOR_lasx_xvmulwev_q_du_d): Likewise. (CODE_FOR_lasx_xvaddwod_q_du_d): Likewise. (CODE_FOR_lasx_xvmulwod_q_du_d): Likewise.
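As a rough scalar model of what the even/odd widening patterns describe, the sketch below adds the even or odd byte lanes of two 16-byte vectors into halfword lanes, selecting the lanes through a stepped index vector ({0, 2, 4, ...} or {1, 3, 5, ...}). The names stepped_indices and addw_evod_h_b are hypothetical and chosen for illustration only; this is not the code of loongarch_gen_stepped_int_parallel, whose actual signature is not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not GCC code): fill idx[0..n-1] with
   base, base + step, base + 2 * step, ...
   With step 2, base 0 selects the even lanes and base 1 the odd lanes.  */
static void
stepped_indices (int base, int step, size_t n, int *idx)
{
  for (size_t i = 0; i < n; i++)
    idx[i] = base + (int) i * step;
}

/* Scalar model of a signed widening add on the even (odd == 0) or odd
   (odd == 1) byte lanes of two 16-byte vectors, giving 8 halfword lanes.
   Each input byte is sign-extended before the addition, so the sum
   cannot overflow the 16-bit result.  */
void
addw_evod_h_b (int odd, const int8_t a[16], const int8_t b[16],
               int16_t dst[8])
{
  int idx[8];

  stepped_indices (odd, 2, 8, idx);
  for (size_t i = 0; i < 8; i++)
    dst[i] = (int16_t) ((int16_t) a[idx[i]] + (int16_t) b[idx[i]]);
}
```

The only difference between the "ev" and "od" variants in this model is the base of the index vector, which is presumably why a single predicate on the index vector plus one generator function can replace the separate hard-coded constants.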
-
Xi Ruoyao authored
We have some vector instructions that operate on vectors of 128-bit integers, i.e. TImode vectors. Previously they were modeled with unspecs, but it's more natural to model them with TImode vector RTL expressions. In preparation, allow moving V1TImode and V2TImode vectors in LSX and LASX registers so we won't get a reload failure when we start to keep TImode vectors in these registers. This implicitly depends on the vrepli optimization: without it we'd try "vrepli.q", which does not exist, and trigger an ICE. gcc/ChangeLog: * config/loongarch/lsx.md (mov<LSX:mode>): Remove. (movmisalign<LSX:mode>): Remove. (mov<LSX:mode>_lsx): Remove. * config/loongarch/lasx.md (mov<LASX:mode>): Remove. (movmisalign<LASX:mode>): Remove. (mov<LASX:mode>_lasx): Remove. * config/loongarch/loongarch-modes.def (V1TI): Add. (V2TI): Mention in the comment. * config/loongarch/loongarch.md (mode): Add V1TI and V2TI. * config/loongarch/simd.md (ALLVEC_TI): New mode iterator. (mov<ALLVEC_TI:mode>): New define_expand. (movmisalign<ALLVEC_TI:mode>): Likewise. (mov<ALLVEC_TI:mode>_simd): New define_insn_and_split.
-
Xi Ruoyao authored
For a = (v4si){0xdddddddd, 0xdddddddd, 0xdddddddd, 0xdddddddd} we just want "vrepli.b $vr0, 0xdd", but the compiler actually produces a load: "la.local $r14,.LC0" followed by "vld $vr0,$r14,0". This is because we only tried vrepli.d, which cannot materialize this constant. Try all vrepli instructions when materializing a const int vector to fix it. gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_const_vector_vrepli): New function prototype. * config/loongarch/loongarch.cc (loongarch_const_vector_vrepli): Implement. (loongarch_const_insns): Call loongarch_const_vector_vrepli instead of loongarch_const_vector_same_int_p. (loongarch_split_vector_move_p): Likewise. (loongarch_output_move): Use loongarch_const_vector_vrepli to pun operands[1] into a better mode if it's a const int vector, and decide the suffix of [x]vrepli with the new mode. * config/loongarch/constraints.md (YI): Call loongarch_const_vector_vrepli instead of loongarch_const_vector_same_int_p. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vrepli.c: New test.
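The idea of trying every vrepli element width can be sketched in plain C as below. This is a minimal illustration, not the implementation of loongarch_const_vector_vrepli: the function names are hypothetical, the vector is modeled as 16 little-endian bytes, and the immediate is assumed to be a signed 10-bit value.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: the vrepli immediate is a signed 10-bit value.  */
static int
fits_simm10 (int64_t v)
{
  return v >= -512 && v <= 511;
}

/* Hypothetical helper (not GCC code): return the element width in
   bytes (1, 2, 4 or 8) at which the 16-byte constant is one value
   replicated across all lanes and that value fits the immediate,
   or 0 if no width works.  */
int
pick_vrepli_width (const uint8_t bytes[16])
{
  for (int w = 1; w <= 8; w *= 2)
    {
      int same = 1;

      /* Every w-byte lane must equal the first lane.  */
      for (int i = w; i < 16 && same; i += w)
        if (memcmp (bytes, bytes + i, (size_t) w) != 0)
          same = 0;
      if (!same)
        continue;

      /* Assemble the first lane (little-endian) and sign-extend it.  */
      uint64_t u = 0;
      for (int i = 0; i < w; i++)
        u |= (uint64_t) bytes[i] << (8 * i);

      int64_t s;
      if (w < 8)
        {
          uint64_t sign = (uint64_t) 1 << (8 * w - 1);
          s = (int64_t) ((u ^ sign) - sign);
        }
      else
        s = (int64_t) u;

      if (fits_simm10 (s))
        return w;
    }
  return 0;
}
```

For the constant in the message above, every byte is 0xdd, which is -35 as a signed byte, so the byte width is accepted. The same bits viewed as 32-bit or 64-bit lanes do not fit the immediate, which matches the observation that trying only vrepli.d fails.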
-
Xi Ruoyao authored
Since r15-1120, multi-word shifts/rotates produce PLUS instead of IOR. This is generally a good thing (it allows us to use our alsl instruction, or similar instructions on other architectures), but it prevents us from using bytepick. For example, if we shift a __int128 left by 16 bits, the higher word can be produced via a single bytepick.d instruction with immediate 2, but we got: "srli.d $r12,$r4,48", "slli.d $r5,$r5,16", "slli.d $r4,$r4,16", "add.d $r5,$r12,$r5", "jr $r1". This did not work with GCC 14 either, but after r15-6490 it would work if IOR were used instead of PLUS. To fix this, add a code iterator to match IOR, XOR, and PLUS and use it instead of just IOR if we know the operands have no overlapping bits. gcc/ChangeLog: PR target/115478 * config/loongarch/loongarch.md (any_or_plus): New define_code_iterator. (bstrins_<mode>_for_ior_mask): Use any_or_plus instead of ior. (bytepick_w_<bytepick_imm>): Likewise. (bytepick_d_<bytepick_imm>): Likewise. (bytepick_d_<bytepick_imm>_rev): Likewise. gcc/testsuite/ChangeLog: PR target/115478 * gcc.target/loongarch/bytepick_shift_128.c: New test.
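The reason IOR, XOR, and PLUS are interchangeable here can be checked with a small C model of the 128-bit shift by 16. The u128 struct and the function names are made up for illustration and do not correspond to anything in GCC.

```c
#include <stdint.h>

/* Illustrative 128-bit value split into two 64-bit words.  */
typedef struct
{
  uint64_t lo, hi;
} u128;

/* Shift left by 16.  The high word combines hi << 16, whose low 16
   bits are zero, with lo >> 48, which only has its low 16 bits set.
   The two operands never have a set bit in the same position.  */
u128
shl16_ior (u128 x)
{
  return (u128) { x.lo << 16, (x.hi << 16) | (x.lo >> 48) };
}

/* Same shift using PLUS: no bits overlap, so no carry can occur.  */
u128
shl16_plus (u128 x)
{
  return (u128) { x.lo << 16, (x.hi << 16) + (x.lo >> 48) };
}

/* Same shift using XOR: without overlap, XOR equals IOR.  */
u128
shl16_xor (u128 x)
{
  return (u128) { x.lo << 16, (x.hi << 16) ^ (x.lo >> 48) };
}
```

All three variants compute the same high word, so a pattern that matches only IOR misses the PLUS form produced by the middle end even though the operation is identical.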
-
Jeff Law authored
The sibling and unshare passes were dropped as distinct passes 10+ years ago. Docs weren't ever updated. This just removes them; given their age I don't think we need to keep them around any longer. PR middle-end/113525 gcc/ * doc/invoke.texi (dump-rtl-sibling): Drop documentation for pass removed long ago. (dump-rtl-unshare): Likewise.
-
GCC Administrator authored
-
- Feb 18, 2025
-
-
Andi Kleen authored
The file-cache-lines / file-cache-files tunables were documented in the wrong section. Fix that. Reported-by: Filip Kastl Committed as obvious. gcc/ChangeLog: * doc/invoke.texi:
-