Commits · a42374b60884d9ac4ff47e7787b32142526ac666 · COBOLworx / gcc-cobol

Feb 20, 2025

invoke.texi: Fix typo in the file-cache-lines param · a42374b6

Filip Kastl authored 4 weeks ago


The file-cache-lines param was documented as file-cache-files.  This fixes
the typo.

gcc/ChangeLog:

	* doc/invoke.texi: Fix typo file-cache-files ->
	file-cache-lines.

Signed-off-by: Filip Kastl <fkastl@suse.cz>

a42374b6

libstdc++: Workaround Clang bug with __array_rank built-in [PR118559] · c0e865f7

Jonathan Wakely authored 4 weeks ago

We started using the __array_rank built-in with r15-1252-g6f0dfa6f1acdf7
but that built-in is buggy in versions of Clang up to and including 19.
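
As a rough illustration of what doing without the built-in looks like, a rank trait can be written with plain partial specializations. The sketch below uses a hypothetical `demo` namespace and is not claimed to match the libstdc++ source. It only shows the kind of portable definition that does not depend on `__array_rank`.

```cpp
#include <cstddef>
#include <type_traits>

namespace demo {  // hypothetical namespace, for illustration only

// Non-array types have rank 0.
template <typename T>
struct rank : std::integral_constant<std::size_t, 0> {};

// Each array bound, known or unknown, adds one dimension.
template <typename T, std::size_t N>
struct rank<T[N]> : std::integral_constant<std::size_t, 1 + rank<T>::value> {};

template <typename T>
struct rank<T[]> : std::integral_constant<std::size_t, 1 + rank<T>::value> {};

}  // namespace demo

static_assert(demo::rank<int>::value == 0, "a scalar has rank 0");
static_assert(demo::rank<int[2][3]>::value == 2, "two bounds give rank 2");
static_assert(demo::rank<int[][4]>::value == 2, "an unknown bound still counts");
```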

libstdc++-v3/ChangeLog:

	PR libstdc++/118559
	* include/std/type_traits (rank, rank_v): Do not use
	__array_rank for Clang 19 and older.

c0e865f7

libstdc++: Add parentheses around _GLIBCXX_HAS_BUILTIN definition · 57f65c5c

Jonathan Wakely authored 4 weeks ago

This allows _GLIBCXX_HAS_BUILTIN (or _GLIBCXX_USE_BUILTIN_TRAIT) to be
used as part of a larger logical expression.
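
A small sketch of why the parentheses matter. The two `DEMO_` macros are hypothetical and are not the real libstdc++ definitions; they only assume a macro whose body is an `||` expression.

```cpp
// Hypothetical macros, not the real libstdc++ definitions.
#define DEMO_HAS_UNPAREN(A, B) A || B
#define DEMO_HAS_PAREN(A, B) (A || B)

// Without parentheses the negation binds to the first operand only:
// !1 || 1 evaluates to true.
constexpr bool negated_unparenthesized = !DEMO_HAS_UNPAREN(1, 1);

// With parentheses the negation applies to the whole test:
// !(1 || 1) evaluates to false.
constexpr bool negated_parenthesized = !DEMO_HAS_PAREN(1, 1);

static_assert(negated_unparenthesized, "negation applied to the first operand only");
static_assert(!negated_parenthesized, "negation applied to the whole expression");
```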

libstdc++-v3/ChangeLog:

	* include/bits/c++config (_GLIBCXX_HAS_BUILTIN): Add parentheses.

57f65c5c

libstdc++: Use new type-generic built-ins in <bit> [PR118855] · e8ad697a

Jonathan Wakely authored 4 weeks ago

This makes several functions in <bit> faster to compile, with fewer
expressions to parse and fewer instantiations of __numeric_traits
required.
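
To give a feel for the approach, here is a minimal sketch of a population count that prefers a type-generic built-in when the compiler provides one. `demo_popcount` and `DEMO_HAVE_POPCOUNTG` are hypothetical names, and the fallback loop is only an assumption made for illustration, not the code in `<bit>`.

```cpp
#include <type_traits>

#if defined(__has_builtin)
#  if __has_builtin(__builtin_popcountg)
#    define DEMO_HAVE_POPCOUNTG 1  // hypothetical macro name
#  endif
#endif

template <typename T>
constexpr int demo_popcount(T x) {
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
#ifdef DEMO_HAVE_POPCOUNTG
  // One built-in call handles every unsigned width, so no per-type
  // dispatch has to be parsed or instantiated.
  return __builtin_popcountg(x);
#else
  // Portable fallback: clear the lowest set bit until none remain.
  int n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
#endif
}
```

Both branches return the same values; the difference is in how much work the compiler front end has to do.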

libstdc++-v3/ChangeLog:

	PR libstdc++/118855
	* include/std/bit (__count_lzero, __count_rzero, __popcount):
	Use type-generic built-ins when available.

e8ad697a

libstdc++: Fix invalid signed arguments to <bit> functions · 32457bc2

Jonathan Wakely authored 1 month ago

These should have been unsigned, but the static assertions are only in
the public std::bit_ceil and std::bit_width functions, not the internal
__bit_ceil and __bit_width ones.
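
The situation can be sketched with hypothetical stand-ins in a `demo` namespace (these are not the libstdc++ internals): the public function rejects signed types at compile time, the internal helper does not, so callers of the helper have to cast explicitly.

```cpp
#include <type_traits>

namespace demo {  // hypothetical names, for illustration only

// Internal helper: no static_assert, so a signed argument compiles silently.
template <typename T>
constexpr int bit_width_impl(T x) {
  int n = 0;
  for (; x != 0; x >>= 1)
    ++n;
  return n;
}

// Public entry point: rejects signed types at compile time.
template <typename T>
constexpr int bit_width(T x) {
  static_assert(std::is_unsigned<T>::value, "argument must be unsigned");
  return bit_width_impl(x);
}

// A caller holding a signed value casts before using the internal helper.
constexpr int width_of_size(int size) {
  return bit_width_impl(static_cast<unsigned>(size));
}

}  // namespace demo
```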

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd.h (__find_next_valid_abi): Cast
	__bit_ceil argument to unsigned.
	* src/c++17/floating_from_chars.cc (__floating_from_chars_hex):
	Cast __bit_ceil argument to unsigned.
	* src/c++17/memory_resource.cc (big_block): Cast __bit_width
	argument to unsigned.

32457bc2

libstdc++: Remove workaround for reserved init_priority warnings · 29eb6f8f

Jonathan Wakely authored 1 month ago

Since r15-7511-g4e7f74225116e7 we can disable the warnings for using a
reserved priority using a diagnostic pragma. That means we no longer
need to put globals using that attribute into separate files that get
included.

This replaces the two uses of such separate files by moving the variable
definition into the source file and adding the diagnostic pragma.

libstdc++-v3/ChangeLog:

	* src/c++17/memory_resource.cc (default_res): Define here
	instead of including default_resource.h.
	* src/c++98/globals_io.cc (__ioinit): Define here instead of
	including ios_base_init.h.
	* src/c++17/default_resource.h: Removed.
	* src/c++98/ios_base_init.h: Removed.

29eb6f8f

libstdc++: Use init_priority attribute for tzdb globals [PR118811] · 99f57446

Jonathan Wakely authored 1 month ago

When linking statically to libstdc++.a (or to libstdc++_nonshared.a in
the RHEL devtoolset compiler) there's a static initialization order
problem where user code might be constructed before the
std::chrono::tzdb_list globals, and so might try to use them after
they've already been destroyed.

Use the init_priority attribute on those globals so that they are
initialized early. Since r15-7511-g4e7f74225116e7 we can disable the
warnings for using a reserved priority using a diagnostic pragma.
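
For readers unfamiliar with the attribute, a minimal sketch of `init_priority` follows. It assumes GCC or Clang on a target that supports the attribute, uses hypothetical names, and picks priority 200 from the user range rather than a reserved value, so no diagnostic pragma is needed.

```cpp
// Constant-initialized, so it is ready before any constructor below runs.
int g_counter = 0;

struct Tracked {
  int order;  // records the position in which this object was constructed
  Tracked() : order(++g_counter) {}
};

// Default priority.
Tracked g_default;

// GCC documents that objects with a lower priority number are initialized
// earlier, so this object is expected to be constructed before g_default
// even though it is defined later in the file.
Tracked g_early __attribute__((init_priority(200)));
```

The library change uses a priority from the reserved range instead, which is why it needs the diagnostic pragma mentioned above.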

libstdc++-v3/ChangeLog:

	PR libstdc++/118811
	* src/c++20/tzdb.cc (tzdb_list::_Node): Use init_priority
	attribute on static data members.
	* testsuite/std/time/tzdb_list/pr118811.cc: New test.

99f57446

Fortran: Remove deprecated coarray routines [PR107635] · d3244675

Andre Vehreschild authored 1 month ago

gcc/fortran/ChangeLog:

	PR fortran/107635

	* gfortran.texi: Remove deprecated functions from documentation.
	* trans-decl.cc (gfc_build_builtin_function_decls): Remove
	deprecated function decls.
	* trans-intrinsic.cc (gfc_conv_intrinsic_exponent): Remove
	deprecated/no longer needed routines.
	* trans.h: Remove unused decls.

libgfortran/ChangeLog:

	* caf/libcaf.h (_gfortran_caf_get): Removed because deprecated.
	(_gfortran_caf_send): Same.
	(_gfortran_caf_sendget): Same.
	(_gfortran_caf_send_by_ref): Same.
	* caf/single.c (assign_char4_from_char1): Same.
	(assign_char1_from_char4): Same.
	(convert_type): Same.
	(defined): Same.
	(_gfortran_caf_get): Same.
	(_gfortran_caf_send): Same.
	(_gfortran_caf_sendget): Same.
	(copy_data): Same.
	(get_for_ref): Same.
	(_gfortran_caf_get_by_ref): Same.
	(send_by_ref): Same.
	(_gfortran_caf_send_by_ref): Same.
	(_gfortran_caf_sendget_by_ref): Same.

d3244675

Fortran: Add transfer_between_remotes [PR107635] · 8bf0ee8d

Andre Vehreschild authored 1 month ago

Add the last missing coarray data manipulation routine using remote
accessors.

gcc/fortran/ChangeLog:

	PR fortran/107635

	* coarray.cc (rewrite_caf_send): Rewrite to
	transfer_between_remotes when both sides of the assignment have
	a coarray.
	(coindexed_code_callback): Prevent duplicate rewrite.
	* gfortran.texi: Add documentation for transfer_between_remotes.
	* intrinsic.cc (add_subroutines): Add intrinsic symbol for
	caf_sendget to allow easy rewrite to transfer_between_remotes.
	* trans-decl.cc (gfc_build_builtin_function_decls): Add
	prototype for transfer_between_remotes.
	* trans-intrinsic.cc (conv_caf_vector_subscript_elem): Mark as
	deprecated.
	(conv_caf_vector_subscript): Same.
	(compute_component_offset): Same.
	(conv_expr_ref_to_caf_ref): Same.
	(conv_stat_and_team): Extract stat and team from expr.
	(gfc_conv_intrinsic_caf_get): Use conv_stat_and_team.
	(conv_caf_send_to_remote): Same.
	(has_ref_after_cafref): Mark as deprecated.
	(conv_caf_sendget): Translate to transfer_between_remotes.
	* trans.h: Add prototype for transfer_between_remotes.

libgfortran/ChangeLog:

	* caf/libcaf.h: Add prototype for transfer_between_remotes.
	* caf/single.c: Implement transfer_between_remotes.

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray_lib_comm_1.f90: Fix up scan_trees.

8bf0ee8d

Fortran: Add send_to_remote [PR107635] · 69eb0268

Andre Vehreschild authored 1 month ago

Refactor to use send_to_remote instead of the slow send_by_ref.

gcc/fortran/ChangeLog:

	PR fortran/107635

	* coarray.cc (move_coarray_ref): Move the coarray reference out
	of the given one.  Especially when there is a regular array ref.
	(fixup_comp_refs): Move components refs to a derived type where
	the codim has been removed, aka a new type.
	(split_expr_at_caf_ref): Correctly split the reference chain.
	(remove_caf_ref): Simplify.
	(create_get_callback): Fix some deficiencies.
	(create_allocated_callback): Adapt to new signature of split.
	(create_send_callback): New function.
	(rewrite_caf_send): Rewrite a call to caf_send to
	caf_send_to_remote.
	(coindexed_code_callback): Treat caf_send and caf_sendget
	correctly.
	* gfortran.h (enum gfc_isym_id): Add SENDGET-isym.
	* gfortran.texi: Add documentation for send_to_remote.
	* resolve.cc (gfc_resolve_code): No longer generate send_by_ref
	when allocatable coarray (component) is on the lhs.
	* trans-decl.cc (gfc_build_builtin_function_decls): Add
	caf_send_to_remote decl.
	* trans-intrinsic.cc (conv_caf_func_index): Ensure the static
	variables created are not in a block-scope.
	(conv_caf_send_to_remote): Translate caf_send_to_remote calls.
	(conv_caf_send): Renamed to conv_caf_sendget.
	(conv_caf_sendget): Renamed from conv_caf_send.
	(gfc_conv_intrinsic_subroutine): Branch correctly for
	conv_caf_send and sendget.
	* trans.h: Correct decl.

libgfortran/ChangeLog:

	* caf/libcaf.h: Add/Correct prototypes for caf_get_from_remote,
	caf_send_to_remote.
	* caf/single.c (struct accessor_hash_t): Rename accessor_t to
	getter_t.
	(_gfortran_caf_register_accessor): Use new name of getter_t.
	(_gfortran_caf_send_to_remote): New function for sending data to
	coarray on a remote image.

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray/send_char_array_1.f90: Extend test to
	catch more cases.
	* gfortran.dg/coarray_42.f90: Invert tests use, because a send
	is no longer needed when local memory in a coarray is
	allocated.

69eb0268

Fortran: Add caf_is_present_on_remote. [PR107635] · 15847252

Andre Vehreschild authored 1 month ago

Replace caf_is_present with caf_is_present_on_remote, which uses a
dedicated callback for each object to test on the remote image.

gcc/fortran/ChangeLog:

	PR fortran/107635

	* coarray.cc (create_allocated_callback): Add creating remote
	side procedure for checking allocation status of coarray.
	(rewrite_caf_allocated): Rewrite ALLOCATED on coarray to use caf
	routine.
	(coindexed_expr_callback): Exempt caf_is_present_on_remote from
	being rewritten again.
	* gfortran.h (enum gfc_isym_id): Add caf_is_present_on_remote
	id.
	* gfortran.texi: Add documentation for caf_is_present_on_remote.
	* intrinsic.cc (add_functions): Add caf_is_present_on_remote
	symbol.
	* trans-decl.cc (gfc_build_builtin_function_decls): Define
	interface of caf_is_present_on_remote.
	* trans-intrinsic.cc (gfc_conv_intrinsic_caf_is_present_remote):
	Translate caf_is_present_on_remote.
	(trans_caf_is_present): Remove.
	(caf_this_image_ref): Remove.
	(gfc_conv_allocated): Take out coarray treatment, because that
	is rewritten to caf_is_present_on_remote now.
	(gfc_conv_intrinsic_function): Handle caf_is_present_on_remote
	calls.
	* trans.h: Add symbol for caf_is_present_on_remote and remove
	old one.

libgfortran/ChangeLog:

	* caf/libcaf.h (_gfortran_caf_is_present_on_remote): Add new
	function.
	(_gfortran_caf_is_present): Remove deprecated one.
	* caf/single.c (struct accessor_hash_t): Add function ptr access
	for remote side call.
	(_gfortran_caf_is_present_on_remote): Added.
	(_gfortran_caf_is_present): Removed.

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray/coarray_allocated.f90: Adapt to new method
	of checking on remote image.
	* gfortran.dg/coarray_lib_alloc_4.f90: Same.

15847252

Fortran: Allow to use non-pure/non-elemental functions in coarray indexes [PR107635] · abbfeb2e

Andre Vehreschild authored 1 month ago

Extract calls to non-pure or non-elemental functions from index
expressions on a coarray.

gcc/fortran/ChangeLog:

	PR fortran/107635

	* coarray.cc (get_arrayspec_from_expr): Treat array result of
	function calls correctly.
	(remove_coarray_from_derived_type): Prevent memory loss.
	(add_caf_get_from_remote): Correct locus.
	(find_comp): New function to find or create a new component in a
	derived type.
	(check_add_new_comp_handle_array): Handle allocatable arrays or
	non-pure/non-elemental functions in indexes of coarrays.
	(check_add_new_component): Use above function.
	(create_get_parameter_type): Rename to
	create_caf_add_data_parameter_type.
	(create_caf_add_data_parameter_type): Renaming of variable and
	make the additional data a coarray.
	(remove_caf_ref): Factor out to reuse in other caf-functions.
	(create_get_callback): Use function factored out, set locus
	correctly and ensure a kind is set for parameters.
	(add_caf_get_intrinsic): Rename to add_caf_get_from_remote and
	rename some variables.
	(coindexed_expr_callback): Skip over function created by the
	rewriter.
	(coindexed_code_callback): Filter some intrinsics not to
	process.
	(gfc_coarray_rewrite): Rewrite also contained functions.
	* trans-intrinsic.cc (gfc_conv_intrinsic_caf_get): Reflect
	changed order on caf_get_from_remote ().

libgfortran/ChangeLog:

	* caf/libcaf.h (_gfortran_caf_register_accessor): Reflect
	changed parameter order.
	* caf/single.c (struct accessor_hash_t): Same.
	(_gfortran_caf_register_accessor): Call accessor using a token
	for accessing arrays with a descriptor on the source side.

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray_lib_comm_1.f90: Adapt scan expression.
	* gfortran.dg/coarray/get_with_fn_parameter.f90: New test.
	* gfortran.dg/coarray/get_with_scalar_fn.f90: New test.

abbfeb2e

Fortran: Prepare for more caf-rework. [PR107635] · b114312b

Andre Vehreschild authored 2 months ago

Factor out generation of code to get remote function index and to
create the additional data structure.  Rename caf_get_by_ct to
caf_get_from_remote.

gcc/fortran/ChangeLog:

	PR fortran/107635

	* gfortran.texi: Rename caf_get_by_ct to caf_get_from_remote.
	* trans-decl.cc (gfc_build_builtin_function_decls): Rename
	intrinsic.
	* trans-intrinsic.cc (conv_caf_func_index): Factor out
	functionality to be reused by other caf-functions.
	(conv_caf_add_call_data): Same.
	(gfc_conv_intrinsic_caf_get): Use functions factored out.
	* trans.h: Rename intrinsic symbol.

libgfortran/ChangeLog:

	* caf/libcaf.h (_gfortran_caf_get_by_ref): Remove from ABI.
	This function is replaced by caf_get_from_remote ().
	(_gfortran_caf_get_remote_function_index): Use better name.
	* caf/single.c (_gfortran_caf_finalize): Free internal data.
	(_gfortran_caf_get_by_ref): Remove from public interface, but
	keep it, because it is still used by sendget ().

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray_lib_comm_1.f90: Adapt to renamed ABI
	function.
	* gfortran.dg/coarray_stat_function.f90: Same.
	* gfortran.dg/coindexed_1.f90: Same.

b114312b

Fortran: Move caf_get-rewrite to coarray.cc [PR107635] · 90ba8291

Andre Vehreschild authored 2 months ago

Add a rewriter to keep all expression tree rewriting that is not
optimization together.  At the moment this is just a move from
resolve.cc, but it will be extended to handle more cases where rewriting
the expression tree may be easier.  The first use case is to extract
accessors for coarray remote image data access.

gcc/fortran/ChangeLog:

	PR fortran/107635
	* Make-lang.in: Add coarray.cc.
	* coarray.cc: New file.
	* gfortran.h (gfc_coarray_rewrite): New procedure.
	* parse.cc (rewrite_expr_tree): Add entrypoint for rewriting
	expression trees.
	* resolve.cc (gfc_resolve_ref): Remove caf_lhs handling.
	(get_arrayspec_from_expr): Moved to coarray.cc.
	(remove_coarray_from_derived_type): Same.
	(convert_coarray_class_to_derived_type): Same.
	(split_expr_at_caf_ref): Same.
	(check_add_new_component): Same.
	(create_get_parameter_type): Same.
	(create_get_callback): Same.
	(add_caf_get_intrinsic): Same.
	(resolve_variable): Remove caf_lhs handling.

libgfortran/ChangeLog:

	* caf/single.c (_gfortran_caf_finalize): Free memory preventing
	leaks.
	(_gfortran_caf_get_by_ct): Fix constness.
	* caf/libcaf.h (_gfortran_caf_register_accessor): Fix constness.

90ba8291

tree-optimization/86270 - improve SSA coalescing for loop exit test · 94d01a88

Richard Biener authored 1 month ago

The PR indicates a very specific issue with regard to SSA coalescing
failures because there's a pre IV increment loop exit test.  While
IVOPTs created the desired IL we later simplify the exit test into
the undesirable form again.  The following fixes this up during RTL
expansion where we try to improve coalescing of IVs.  That seems
easier than trying to avoid the simplification with some weird
heuristics (it could also have been written this way).
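
The two loop shapes can be sketched in source form as follows. This is only an illustration with hypothetical function names; the compiler is free to transform either loop, and the patch itself works on the IL during RTL expansion, not on source code.

```cpp
#include <cstddef>

// Exit test on the value before the increment: `i` must stay live until
// the comparison while `next` already holds the incremented value, so the
// two values conflict and cannot share a register.
inline long sum_pre_increment_test(const int* a, std::size_t n) {
  long sum = 0;
  if (n == 0)
    return sum;
  std::size_t i = 0;
  for (;;) {
    sum += a[i];
    std::size_t next = i + 1;
    if (i == n - 1)
      break;
    i = next;
  }
  return sum;
}

// Exit test on the incremented value: only one IV value is live at the
// test, which is the form that coalesces well.
inline long sum_post_increment_test(const int* a, std::size_t n) {
  long sum = 0;
  if (n == 0)
    return sum;
  std::size_t i = 0;
  do {
    sum += a[i];
    ++i;
  } while (i != n);
  return sum;
}
```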

	PR tree-optimization/86270
	* tree-outof-ssa.cc (insert_backedge_copies): Pattern
	match a single conflict in a loop condition and adjust
	that avoiding the conflict if possible.

	* gcc.target/i386/pr86270.c: Adjust to check for no reg-reg
	copies as well.

94d01a88

x86: Add a test for PR target/118936 · 83bc61c9

H.J. Lu authored 4 weeks ago


Add a test for PR target/118936 which was fixed by reverting:

565d4e75 i386: Simplify PARALLEL RTX scan in ix86_find_all_reg_use
11902be7 x86: Properly find the maximum stack slot alignment

	PR target/118936
	* gcc.target/i386/pr118936.c: New test.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

83bc61c9

Revert "x86: Properly find the maximum stack slot alignment" · 6921c93d

H.J. Lu authored 4 weeks ago

This reverts commit 11902be7.

6921c93d

Revert "i386: Simplify PARALLEL RTX scan in ix86_find_all_reg_use" · 0312d11b

H.J. Lu authored 4 weeks ago

This reverts commit 565d4e75.

0312d11b

libstdc++: Rename concat_view::iterator to ::_Iterator · 49bc1cf6

Patrick Palka authored 4 weeks ago


Even though 'iterator' is a reserved macro name, we can't use it as the
name of this implementation detail type since it could introduce name
lookup ambiguity in valid code, e.g.

  struct A { using iterator = void; };
  struct B : concat_view<...>, A { using type = iterator; };
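
The ambiguity can be reproduced with hypothetical stand-ins for the library type. In the sketch below `lib_view_plain` and `lib_view_renamed` are made-up names, and `iterator_impl` plays the role of the renamed implementation detail.

```cpp
#include <type_traits>

struct A { using iterator = void; };

// Hypothetical library type exposing an implementation detail named 'iterator'.
struct lib_view_plain { struct iterator {}; };

// The same type with the detail renamed so it cannot clash with user names.
struct lib_view_renamed { struct iterator_impl {}; };

// This would be rejected: unqualified 'iterator' is found in both bases.
// struct Bad : lib_view_plain, A { using type = iterator; };

// With the rename, 'iterator' unambiguously refers to A::iterator.
struct Good : lib_view_renamed, A { using type = iterator; };

static_assert(std::is_void<Good::type>::value, "lookup finds A::iterator");
```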

libstdc++-v3/ChangeLog:

	* include/std/ranges (concat_view::iterator): Rename to ...
	(concat_view::_Iterator): ... this throughout.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

49bc1cf6

libstdc++: Sync concat_view with final P2542 revision [PR115209] · 8543dc52

Patrick Palka authored 4 weeks ago


Our concat_view implementation is accidentally based off of an older
revision of the paper, P2542R7 instead of R8.  As far as I can tell the
only semantic change in the final revision is the relaxed constraints on
the iterator's iter/sent operator- overloads, which this patch updates.

This patch also simplifies the concat_view::end wording via C++26 pack
indexing as per the final revision.  In turn we make the availability of
this library feature conditional on __cpp_pack_indexing.  (Note pack
indexing is implemented in GCC 15 and Clang 19).

	PR libstdc++/115209

libstdc++-v3/ChangeLog:

	* include/bits/version.def (ranges_concat): Depend on
	__cpp_pack_indexing.
	* include/bits/version.h: Regenerate.
	* include/std/ranges (__detail::__last_is_common): Remove.
	(__detail::__all_but_first_sized): New.
	(concat_view::end): Use C++26 pack indexing instead of
	__last_is_common as per R8 of P2542.
	(concat_view::iterator::operator-): Update constraints on
	iter/sent overloads as per R8 of P2542.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

8543dc52

Daily bump. · 6d8e9cdb
GCC Administrator authored 4 weeks ago

6d8e9cdb

Feb 19, 2025

GCN, nvptx: Support '--enable-languages=all' · ab35fc0d

Thomas Schwinge authored 4 weeks ago

..., where "support" means that the build doesn't fail, but it doesn't mean
that all target libraries get built and we get pretty test results for the
additional languages.

	* configure.ac (unsupported_languages) [GCN, nvptx]: Add 'ada'.
	(noconfigdirs) [GCN, nvptx]: Add 'target-libobjc',
	'target-libffi', 'target-libgo'.
	* configure: Regenerate.

ab35fc0d

AVR: Add new ISR test gcc.target/avr/torture/isr-04-regs.c. · 30c82049

Georg-Johann Lay authored 4 weeks ago

gcc/testsuite/
	* gcc.target/avr/torture/isr-04-regs.c: New test.
	* gcc.target/avr/isr-test.h: Don't set GPRs to values
	that are 0 mod 0x11.

30c82049

aarch64: Fix testcase pr112105.c · c3fecec6

Andrew Pinski authored 4 weeks ago


This testcase started to fail with r15-268-g9dbff9c05520a7.
When late_combine was added, it was turned on for -O2+ only,
so this testcase still failed.
This changes the option to -O2 instead of -O, so the testcase
passes again.

Tested for aarch64-linux-gnu.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pr112105.c: Change to be -O2 rather
	than -O1.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

c3fecec6

input: give file_cache_slot its own copy of the file path [PR118919] · ee6619b1

David Malcolm authored 4 weeks ago


input.cc's file_cache was borrowing copies of the file name.
This could lead to use-after-free when writing out sarif output
from Fortran, which frees its filenames before the sarif output
is fully written out.

Fix by taking a copy in file_cache_slot.
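
The ownership change can be sketched as follows. `path_slot` is a hypothetical class loosely modeled on the description above; it is not the code in input.cc.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical slot type that owns a private copy of the path it is given.
class path_slot {
public:
  path_slot() : m_path(nullptr) {}
  ~path_slot() { std::free(m_path); }

  path_slot(const path_slot&) = delete;
  path_slot& operator=(const path_slot&) = delete;

  // Store a copy of PATH if it is non-null, releasing any previous copy.
  void create(const char* path) {
    evict();
    if (path) {
      std::size_t len = std::strlen(path) + 1;
      m_path = static_cast<char*>(std::malloc(len));
      if (m_path)
        std::memcpy(m_path, path, len);
    }
  }

  // Drop the stored copy.
  void evict() {
    std::free(m_path);
    m_path = nullptr;
  }

  const char* path() const { return m_path; }

private:
  char* m_path;  // owned by the slot, so it outlives the caller's buffer
};
```

Because the slot frees only its own copy, the caller may release its filename at any time without invalidating the slot.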

gcc/ChangeLog:
	PR other/118919
	* input.cc (file_cache_slot::m_file_path): Make non-const.
	(file_cache_slot::evict): Free m_file_path.
	(file_cache_slot::create): Store a copy of file_path if non-null.
	(file_cache_slot::~file_cache_slot): Free m_file_path.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

ee6619b1

analyzer: handle more IFN_UBSAN_* as no-ops [PR118300] · 58b90139

David Malcolm authored 4 weeks ago


Previously the analyzer treated IFN_UBSAN_BOUNDS as a no-op, but
the other IFN_UBSAN_* were unrecognized and conservatively treated
as having arbitrary behavior.

Treat IFN_UBSAN_NULL and IFN_UBSAN_PTR also as no-ops, which should
make -fanalyzer behave better with -fsanitize=undefined.

gcc/analyzer/ChangeLog:
	PR analyzer/118300
	* kf.cc (class kf_ubsan_bounds): Replace this with...
	(class kf_ubsan_noop): ...this.
	(register_sanitizer_builtins): Use it to handle IFN_UBSAN_NULL,
	IFN_UBSAN_BOUNDS, and IFN_UBSAN_PTR as no-ops.
	(register_known_functions): Drop handling of IFN_UBSAN_BOUNDS
	here, as it's now handled by register_sanitizer_builtins above.

gcc/testsuite/ChangeLog:
	PR analyzer/118300
	* gcc.dg/analyzer/ubsan-pr118300.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

58b90139

Vect: Fix ICE when vect_verify_loop_lens acts on relevant mode [PR116351] · 25256ec1

Pan Li authored 4 weeks ago


This patch fixes an ICE similar to the one below.  Assume we have the
sample code:

  int a, b, c;
  short d, e, f;
  long g (long h) { return h; }

  void i () {
    for (; b; ++b) {
      f = 5 >> a ? d : d << a;
      e &= c | g(f);
    }
  }

It will ICE when compiled with -O3 -march=rv64gc_zve64f -mrvv-vector-bits=zvl

during GIMPLE pass: vect
pr116351-1.c: In function ‘i’:
pr116351-1.c:8:6: internal compiler error: in get_len_load_store_mode,
at optabs-tree.cc:655
    8 | void i () {
      |      ^
0x44d6b9d internal_error(char const*, ...)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517
0x44a26a6 fancy_abort(char const*, int, char const*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722
0x19e4309 get_len_load_store_mode(machine_mode, bool, internal_fn*, vec<int, va_heap, vl_ptr>*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs-tree.cc:655
0x1fada40 vect_verify_loop_lens
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:1566
0x1fb2b07 vect_analyze_loop_2
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3037
0x1fb4302 vect_analyze_loop_1
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3478
0x1fb4e9a vect_analyze_loop(loop*, gimple*, vec_info_shared*)
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3638
0x203c2dc try_vectorize_loop_1
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1095
0x203c839 try_vectorize_loop
        /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1212
0x203cb2c execute

During vectorization the over-widening pattern is matched, which leaves DImode
as the vector_mode in loop_vinfo.  After that, loop_vinfo goes through
vect_analyze_xx with the flow below:

vect_analyze_loop_2
 |- vect_pattern_recog // over-widening and set loop_vinfo->vector_mode to DImode
 |- ...
 |- vect_analyze_loop_operations
   |- stmt_info->def_type == vect_reduction_def
   |- stmt_info->slp_type == pure_slp
   |- vectorizable_lc_phi     // Not Hit
   |- vectorizable_induction  // Not Hit
   |- vectorizable_reduction  // Not Hit
   |- vectorizable_recurr     // Not Hit
   |- vectorizable_live_operation  // Not Hit
   |- vect_analyze_stmt
     |- stmt_info->relevant == vect_unused_in_scope
     |- stmt_info->live == false
     |- p pattern_stmt_info == (stmt_vec_info) 0x0
     |- return opt_result::success ();
     OR
     |- PURE_SLP_STMT (stmt_info) && !node then dump "handled only by SLP analysis\n"
       |- Early return opt_result::success ();
     |- vectorizable_load/store/call_convert/... // Not Hit
   |- LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P && !LOOP_VINFO_MASKS(loop_vinfo).is_empty ()
     |- vect_verify_loop_lens (loop_vinfo)
       |- assert (VECTOR_MODE_P (loop_vinfo->vector_mode)); // Assert hit, resulting in ICE

Finally, the DImode in loop_vinfo hits the assert (VECTOR_MODE_P (mode))
in vect_verify_loop_lens.  This patch fixes the ICE by returning false
directly when the loop_vinfo has a relevant mode such as DImode, but
similar cases may still be mis-optimized.  We will try to cover those
in separate patches.

The following test suites pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

	PR middle-end/116351

gcc/ChangeLog:

	* tree-vect-loop.cc (vect_verify_loop_lens): Return false if the
	loop_vinfo has relevant mode such as DImode.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr116351-1.c: New test.
	* gcc.target/riscv/rvv/base/pr116351-2.c: New test.
	* gcc.target/riscv/rvv/base/pr116351.h: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

25256ec1

LoongArch: Use normal RTL pattern instead of UNSPEC for {x,}vsr{a,l}ri instructions · 42738604

Xi Ruoyao authored 1 month ago

Allow (t + (1ul << imm >> 1)) >> imm to be recognized as a rounding
shift operation.
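
For reference, the scalar form of that expression can be written as the small helper below. The function name is hypothetical, and the sketch assumes an unsigned 64-bit operand with `imm` in the range 0 to 63.

```cpp
#include <cstdint>

// Rounding right shift: add half of the divisor (2^imm) before shifting,
// so the result is rounded to nearest instead of truncated.
constexpr std::uint64_t round_shift_right(std::uint64_t t, unsigned imm) {
  return (t + (std::uint64_t{1} << imm >> 1)) >> imm;
}

static_assert(round_shift_right(5, 1) == 3, "2.5 rounds up to 3");
static_assert(round_shift_right(5, 2) == 1, "1.25 rounds down to 1");
static_assert(round_shift_right(6, 2) == 2, "1.5 rounds up to 2");
static_assert(round_shift_right(7, 0) == 7, "a shift by 0 leaves the value unchanged");
```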

gcc/ChangeLog:

	* config/loongarch/lasx.md (UNSPEC_LASX_XVSRARI): Remove.
	(UNSPEC_LASX_XVSRLRI): Remove.
	(lasx_xvsrari_<lsxfmt>): Remove.
	(lasx_xvsrlri_<lsxfmt>): Remove.
	* config/loongarch/lsx.md (UNSPEC_LSX_VSRARI): Remove.
	(UNSPEC_LSX_VSRLRI): Remove.
	(lsx_vsrari_<lsxfmt>): Remove.
	(lsx_vsrlri_<lsxfmt>): Remove.
	* config/loongarch/simd.md (simd_<optab>_imm_round_<mode>): New
	define_insn.
	(<simd_isa>_<x>v<insn>ri_<simdfmt>): New define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vect-shift-imm-round.c: New test.

42738604

LoongArch: Implement [su]dot_prod* for LSX and LASX modes · cef5f23a

Xi Ruoyao authored 1 month ago

Although this is just a special case of "a widening product whose
result is used for reduction," having these standard names allows the
dot product pattern to be recognized earlier, which may be beneficial to
optimization.  This also fixes some failures in these test cases:

- gcc.dg/vect/vect-reduc-chain-2.c
- gcc.dg/vect/vect-reduc-chain-3.c
- gcc.dg/vect/vect-reduc-chain-dot-slp-3.c
- gcc.dg/vect/vect-reduc-chain-dot-slp-4.c
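
The kind of source loop this refers to can be sketched as a plain scalar dot product over narrow elements accumulated into a wider type. The function name is hypothetical, and whether a given compiler vectorizes it this way depends on target and options.

```cpp
#include <cstddef>
#include <cstdint>

// Signed dot product: each 8-bit pair is widened, multiplied, and the
// products are summed into a 32-bit accumulator (a reduction).
inline std::int32_t dot_i8(const std::int8_t* a, const std::int8_t* b,
                           std::size_t n) {
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
  return sum;
}
```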

gcc/ChangeLog:

	* config/loongarch/simd.md (wvec_half): New define_mode_attr.
	(<su>dot_prod<wvec_half><mode>): New define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/wide-mul-reduc-2.c (dg-final): Scan
	DOT_PROD_EXPR in optimized tree.

cef5f23a

LoongArch: Implement vec_widen_mult_{even,odd}_* for LSX and LASX modes · 7c54e46b

Xi Ruoyao authored 1 month ago

Since PR116142 has been fixed, we can now add the standard names so the
compiler will generate better code if the result of a widening
product is reduced.

gcc/ChangeLog:

	* config/loongarch/simd.md (even_odd): New define_int_attr.
	(vec_widen_<su>mult_<even_odd>_<mode>): New define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/wide-mul-reduc-1.c: New test.
	* gcc.target/loongarch/wide-mul-reduc-2.c: New test.

7c54e46b

LoongArch: Simplify lsx_vpick description · 7dda6715

Xi Ruoyao authored 1 month ago

Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use
special predicates instead of hard-coded const vectors.

This is not suitable for LASX, where lasx_xvpick has different
semantics.

gcc/ChangeLog:

	* config/loongarch/simd.md (LVEC): New define_mode_attr.
	(simdfmt_as_i): Make it same as simdfmt for integer vector
	modes.
	(_f): New define_mode_attr.
	* config/loongarch/lsx.md (lsx_vpickev_b): Remove.
	(lsx_vpickev_h): Remove.
	(lsx_vpickev_w): Remove.
	(lsx_vpickev_w_f): Remove.
	(lsx_vpickod_b): Remove.
	(lsx_vpickod_h): Remove.
	(lsx_vpickod_w): Remove.
	(lsx_vpickod_w_f): Remove.
	(lsx_pick_evod_<mode>): New define_insn.
	(lsx_<x>vpick<ev_od>_<simdfmt_as_i><_f>): New
	define_expand.

7dda6715

LoongArch: Simplify {lsx_,lasx_x}vmaddw description · f727a4c5

Xi Ruoyao authored 7 months ago

Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use
special predicates and TImode RTL instead of hard-coded const vectors
and UNSPECs.

Also reorder two operands of the outer plus in the template, so combine
will recognize {x,}vadd + {x,}vmulw{ev,od} => {x,}vmaddw{ev,od}.

gcc/ChangeLog:

	* config/loongarch/lasx.md (UNSPEC_LASX_XVMADDWEV): Remove.
	(UNSPEC_LASX_XVMADDWEV2): Remove.
	(UNSPEC_LASX_XVMADDWEV3): Remove.
	(UNSPEC_LASX_XVMADDWOD): Remove.
	(UNSPEC_LASX_XVMADDWOD2): Remove.
	(UNSPEC_LASX_XVMADDWOD3): Remove.
	(lasx_xvmaddwev_h_b<u>): Remove.
	(lasx_xvmaddwev_w_h<u>): Remove.
	(lasx_xvmaddwev_d_w<u>): Remove.
	(lasx_xvmaddwev_q_d): Remove.
	(lasx_xvmaddwod_h_b<u>): Remove.
	(lasx_xvmaddwod_w_h<u>): Remove.
	(lasx_xvmaddwod_d_w<u>): Remove.
	(lasx_xvmaddwod_q_d): Remove.
	(lasx_xvmaddwev_q_du): Remove.
	(lasx_xvmaddwod_q_du): Remove.
	(lasx_xvmaddwev_h_bu_b): Remove.
	(lasx_xvmaddwev_w_hu_h): Remove.
	(lasx_xvmaddwev_d_wu_w): Remove.
	(lasx_xvmaddwev_q_du_d): Remove.
	(lasx_xvmaddwod_h_bu_b): Remove.
	(lasx_xvmaddwod_w_hu_h): Remove.
	(lasx_xvmaddwod_d_wu_w): Remove.
	(lasx_xvmaddwod_q_du_d): Remove.
	* config/loongarch/lsx.md (UNSPEC_LSX_VMADDWEV): Remove.
	(UNSPEC_LSX_VMADDWEV2): Remove.
	(UNSPEC_LSX_VMADDWEV3): Remove.
	(UNSPEC_LSX_VMADDWOD): Remove.
	(UNSPEC_LSX_VMADDWOD2): Remove.
	(UNSPEC_LSX_VMADDWOD3): Remove.
	(lsx_vmaddwev_h_b<u>): Remove.
	(lsx_vmaddwev_w_h<u>): Remove.
	(lsx_vmaddwev_d_w<u>): Remove.
	(lsx_vmaddwev_q_d): Remove.
	(lsx_vmaddwod_h_b<u>): Remove.
	(lsx_vmaddwod_w_h<u>): Remove.
	(lsx_vmaddwod_d_w<u>): Remove.
	(lsx_vmaddwod_q_d): Remove.
	(lsx_vmaddwev_q_du): Remove.
	(lsx_vmaddwod_q_du): Remove.
	(lsx_vmaddwev_h_bu_b): Remove.
	(lsx_vmaddwev_w_hu_h): Remove.
	(lsx_vmaddwev_d_wu_w): Remove.
	(lsx_vmaddwev_q_du_d): Remove.
	(lsx_vmaddwod_h_bu_b): Remove.
	(lsx_vmaddwod_w_hu_h): Remove.
	(lsx_vmaddwod_d_wu_w): Remove.
	(lsx_vmaddwod_q_du_d): Remove.
	* config/loongarch/simd.md (simd_maddw_evod_<mode>_<su>):
	New define_insn.
	(<simd_isa>_<x>vmaddw<ev_od>_<simdfmt_w>_<simdfmt><u>): New
	define_expand.
	(simd_maddw_evod_<mode>_hetero): New define_insn.
	(<simd_isa>_<x>vmaddw<ev_od>_<simdfmt_w>_<simdfmt>u_<simdfmt>):
	New define_expand.
	(<simd_isa>_maddw<ev_od>_q_d<u>_punned): New define_expand.
	(<simd_isa>_maddw<ev_od>_q_du_d_punned): New define_expand.
	* config/loongarch/loongarch-builtins.cc
	(CODE_FOR_lsx_vmaddwev_q_d): Define as a macro to override it
	with the punned expand.
	(CODE_FOR_lsx_vmaddwev_q_du): Likewise.
	(CODE_FOR_lsx_vmaddwev_q_du_d): Likewise.
	(CODE_FOR_lsx_vmaddwod_q_d): Likewise.
	(CODE_FOR_lsx_vmaddwod_q_du): Likewise.
	(CODE_FOR_lsx_vmaddwod_q_du_d): Likewise.
	(CODE_FOR_lasx_xvmaddwev_q_d): Likewise.
	(CODE_FOR_lasx_xvmaddwev_q_du): Likewise.
	(CODE_FOR_lasx_xvmaddwev_q_du_d): Likewise.
	(CODE_FOR_lasx_xvmaddwod_q_d): Likewise.
	(CODE_FOR_lasx_xvmaddwod_q_du): Likewise.
	(CODE_FOR_lasx_xvmaddwod_q_du_d): Likewise.

f727a4c5

LoongArch: Simplify {lsx_,lasx_x}vh{add,sub}w description · 2ca759fc

Xi Ruoyao authored 7 months ago

Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use
special predicates and TImode RTL instead of hard-coded const vectors
and UNSPECs.

gcc/ChangeLog:

	* config/loongarch/lasx.md (UNSPEC_LASX_XVHADDW_Q_D): Remove.
	(UNSPEC_LASX_XVHSUBW_Q_D): Remove.
	(UNSPEC_LASX_XVHADDW_QU_DU): Remove.
	(UNSPEC_LASX_XVHSUBW_QU_DU): Remove.
	(lasx_xvh<addsub:optab>w_h<u>_b<u>): Remove.
	(lasx_xvh<addsub:optab>w_w<u>_h<u>): Remove.
	(lasx_xvh<addsub:optab>w_d<u>_w<u>): Remove.
	(lasx_xvhaddw_q_d): Remove.
	(lasx_xvhsubw_q_d): Remove.
	(lasx_xvhaddw_qu_du): Remove.
	(lasx_xvhsubw_qu_du): Remove.
	(reduc_plus_scal_v4di): Call gen_lasx_haddw_q_d_punned instead
	of gen_lasx_xvhaddw_q_d.
	(reduc_plus_scal_v8si): Likewise.
	* config/loongarch/lsx.md (UNSPEC_LSX_VHADDW_Q_D): Remove.
	(UNSPEC_LSX_VHSUBW_Q_D): Remove.
	(UNSPEC_LSX_VHADDW_QU_DU): Remove.
	(UNSPEC_LSX_VHSUBW_QU_DU): Remove.
	(lsx_vh<addsub:optab>w_h<u>_b<u>): Remove.
	(lsx_vh<addsub:optab>w_w<u>_h<u>): Remove.
	(lsx_vh<addsub:optab>w_d<u>_w<u>): Remove.
	(lsx_vhaddw_q_d): Remove.
	(lsx_vhsubw_q_d): Remove.
	(lsx_vhaddw_qu_du): Remove.
	(lsx_vhsubw_qu_du): Remove.
	(reduc_plus_scal_v2di): Change the temporary register mode to
	V1TI, and pun the mode calling gen_vec_extractv2didi.
	(reduc_plus_scal_v4si): Change the temporary register mode to
	V1TI.
	* config/loongarch/simd.md (simd_h<optab>w_<mode>_<su>): New
	define_insn.
	(<simd_isa>_<x>vh<optab>w_<simdfmt_w><u>_<simdfmt><u>): New
	define_expand.
	(<simd_isa>_h<optab>w_q<u>_d<u>_punned): New define_expand.
	* config/loongarch/loongarch-builtins.cc
	(CODE_FOR_lsx_vhaddw_q_d): Define as a macro to override with
	punned expand.
	(CODE_FOR_lsx_vhaddw_qu_du): Likewise.
	(CODE_FOR_lsx_vhsubw_q_d): Likewise.
	(CODE_FOR_lsx_vhsubw_qu_du): Likewise.
	(CODE_FOR_lasx_xvhaddw_q_d): Likewise.
	(CODE_FOR_lasx_xvhaddw_qu_du): Likewise.
	(CODE_FOR_lasx_xvhsubw_q_d): Likewise.
	(CODE_FOR_lasx_xvhsubw_qu_du): Likewise.

2ca759fc

LoongArch: Simplify {lsx_,lasx_x}v{add,sub,mul}l{ev,od} description · a36c15aa

Xi Ruoyao authored 1 month ago

These pattern definitions are tediously long, invoking 32 UNSPECs and
many hard-coded long const vectors.  To simplify them, we first use
the TImode vector operations instead of the UNSPECs, then adopt an
approach from AArch64: use a special predicate to match the const
vectors for odd/even indices in the define_insn's, and generate those
vectors in the define_expand's.

For "backward compatibility" we need to provide a "punned" version for
the operations invoking TImode vectors as the intrinsics still expect
DImode vectors.

The stat is "201 insertions, 905 deletions."
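
As a rough scalar model of the even/odd selection described above: the sketch below builds a stepped index list (0, 2, 4, ... or 1, 3, 5, ...) and uses it to pick the lanes that take part in a widening add. The helpers stepped_indices and addw_evod_w_h are hypothetical names chosen for illustration. They are not the GCC functions, and the real patterns express this selection in RTL rather than in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill idx[0..n-1] with start, start + step,
   start + 2 * step, ...  */
static void
stepped_indices (int start, int step, size_t n, int *idx)
{
  for (size_t i = 0; i < n; i++)
    idx[i] = start + (int) i * step;
}

/* Hypothetical scalar model of a signed widening add on 8 x int16
   inputs producing 4 x int32 results.  ODD is 0 to select the even
   lanes and 1 to select the odd lanes.  */
static void
addw_evod_w_h (int32_t out[4], const int16_t a[8], const int16_t b[8],
	       int odd)
{
  int idx[4];
  stepped_indices (odd, 2, 4, idx);
  for (int i = 0; i < 4; i++)
    out[i] = (int32_t) a[idx[i]] + (int32_t) b[idx[i]];
}
```

Because the inputs are widened before the addition, a sum such as 32767 + 32767 is kept in full rather than wrapping at 16 bits.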

gcc/ChangeLog:

	* config/loongarch/lasx.md (UNSPEC_LASX_XVADDWEV): Remove.
	(UNSPEC_LASX_XVADDWEV2): Remove.
	(UNSPEC_LASX_XVADDWEV3): Remove.
	(UNSPEC_LASX_XVSUBWEV): Remove.
	(UNSPEC_LASX_XVSUBWEV2): Remove.
	(UNSPEC_LASX_XVMULWEV): Remove.
	(UNSPEC_LASX_XVMULWEV2): Remove.
	(UNSPEC_LASX_XVMULWEV3): Remove.
	(UNSPEC_LASX_XVADDWOD): Remove.
	(UNSPEC_LASX_XVADDWOD2): Remove.
	(UNSPEC_LASX_XVADDWOD3): Remove.
	(UNSPEC_LASX_XVSUBWOD): Remove.
	(UNSPEC_LASX_XVSUBWOD2): Remove.
	(UNSPEC_LASX_XVMULWOD): Remove.
	(UNSPEC_LASX_XVMULWOD2): Remove.
	(UNSPEC_LASX_XVMULWOD3): Remove.
	(lasx_xv<addsubmul:optab>wev_h_b<u>): Remove.
	(lasx_xv<addsubmul:optab>wev_w_h<u>): Remove.
	(lasx_xv<addsubmul:optab>wev_d_w<u>): Remove.
	(lasx_xvaddwev_q_d): Remove.
	(lasx_xvsubwev_q_d): Remove.
	(lasx_xvmulwev_q_d): Remove.
	(lasx_xv<addsubmul:optab>wod_h_b<u>): Remove.
	(lasx_xv<addsubmul:optab>wod_w_h<u>): Remove.
	(lasx_xv<addsubmul:optab>wod_d_w<u>): Remove.
	(lasx_xvaddwod_q_d): Remove.
	(lasx_xvsubwod_q_d): Remove.
	(lasx_xvmulwod_q_d): Remove.
	(lasx_xvaddwev_q_du): Remove.
	(lasx_xvsubwev_q_du): Remove.
	(lasx_xvmulwev_q_du): Remove.
	(lasx_xvaddwod_q_du): Remove.
	(lasx_xvsubwod_q_du): Remove.
	(lasx_xvmulwod_q_du): Remove.
	(lasx_xv<addmul:optab>wev_h_bu_b): Remove.
	(lasx_xv<addmul:optab>wev_w_hu_h): Remove.
	(lasx_xv<addmul:optab>wev_d_wu_w): Remove.
	(lasx_xv<addmul:optab>wod_h_bu_b): Remove.
	(lasx_xv<addmul:optab>wod_w_hu_h): Remove.
	(lasx_xv<addmul:optab>wod_d_wu_w): Remove.
	(lasx_xvaddwev_q_du_d): Remove.
	(lasx_xvsubwev_q_du_d): Remove.
	(lasx_xvmulwev_q_du_d): Remove.
	(lasx_xvaddwod_q_du_d): Remove.
	(lasx_xvsubwod_q_du_d): Remove.
	* config/loongarch/lsx.md (UNSPEC_LSX_VADDWEV): Remove.
	(UNSPEC_LSX_VADDWEV2): Remove.
	(UNSPEC_LSX_VADDWEV3): Remove.
	(UNSPEC_LSX_VSUBWEV): Remove.
	(UNSPEC_LSX_VSUBWEV2): Remove.
	(UNSPEC_LSX_VMULWEV): Remove.
	(UNSPEC_LSX_VMULWEV2): Remove.
	(UNSPEC_LSX_VMULWEV3): Remove.
	(UNSPEC_LSX_VADDWOD): Remove.
	(UNSPEC_LSX_VADDWOD2): Remove.
	(UNSPEC_LSX_VADDWOD3): Remove.
	(UNSPEC_LSX_VSUBWOD): Remove.
	(UNSPEC_LSX_VSUBWOD2): Remove.
	(UNSPEC_LSX_VMULWOD): Remove.
	(UNSPEC_LSX_VMULWOD2): Remove.
	(UNSPEC_LSX_VMULWOD3): Remove.
	(lsx_v<addsubmul:optab>wev_h_b<u>): Remove.
	(lsx_v<addsubmul:optab>wev_w_h<u>): Remove.
	(lsx_v<addsubmul:optab>wev_d_w<u>): Remove.
	(lsx_vaddwev_q_d): Remove.
	(lsx_vsubwev_q_d): Remove.
	(lsx_vmulwev_q_d): Remove.
	(lsx_v<addsubmul:optab>wod_h_b<u>): Remove.
	(lsx_v<addsubmul:optab>wod_w_h<u>): Remove.
	(lsx_v<addsubmul:optab>wod_d_w<u>): Remove.
	(lsx_vaddwod_q_d): Remove.
	(lsx_vsubwod_q_d): Remove.
	(lsx_vmulwod_q_d): Remove.
	(lsx_vaddwev_q_du): Remove.
	(lsx_vsubwev_q_du): Remove.
	(lsx_vmulwev_q_du): Remove.
	(lsx_vaddwod_q_du): Remove.
	(lsx_vsubwod_q_du): Remove.
	(lsx_vmulwod_q_du): Remove.
	(lsx_v<addmul:optab>wev_h_bu_b): Remove.
	(lsx_v<addmul:optab>wev_w_hu_h): Remove.
	(lsx_v<addmul:optab>wev_d_wu_w): Remove.
	(lsx_v<addmul:optab>wod_h_bu_b): Remove.
	(lsx_v<addmul:optab>wod_w_hu_h): Remove.
	(lsx_v<addmul:optab>wod_d_wu_w): Remove.
	(lsx_vaddwev_q_du_d): Remove.
	(lsx_vsubwev_q_du_d): Remove.
	(lsx_vmulwev_q_du_d): Remove.
	(lsx_vaddwod_q_du_d): Remove.
	(lsx_vsubwod_q_du_d): Remove.
	(lsx_vmulwod_q_du_d): Remove.
	* config/loongarch/loongarch-modes.def: Add V4TI and V1DI.
	* config/loongarch/loongarch-protos.h
	(loongarch_gen_stepped_int_parallel): New function prototype.
	* config/loongarch/loongarch.cc (loongarch_print_operand):
	Accept 'O' for printing "ev" or "od."
	(loongarch_gen_stepped_int_parallel): Implement.
	* config/loongarch/predicates.md
	(vect_par_cnst_even_or_odd_half): New define_predicate.
	* config/loongarch/simd.md (WVEC_HALF): New define_mode_attr.
	(simdfmt_w): Likewise.
	(zero_one): New define_int_iterator.
	(ev_od): New define_int_attr.
	(simd_<optab>w_evod_<mode:IVEC>_<su>): New define_insn.
	(<simd_isa>_<x>v<optab>w<ev_od>_<simdfmt_w>_<simdfmt><u>): New
	define_expand.
	(simd_<optab>w_evod_<mode>_hetero): New define_insn.
	(<simd_isa>_<x>v<optab>w<ev_od>_<simdfmt_w>_<simdfmt>u_<simdfmt>):
	New define_expand.
	(DIVEC): New define_mode_iterator.
	(<simd_isa>_<optab>w<ev_od>_q_d<u>_punned): New define_expand.
	(<simd_isa>_<optab>w<ev_od>_q_du_d_punned): Likewise.
	* config/loongarch/loongarch-builtins.cc
	(CODE_FOR_lsx_vaddwev_q_d): Define as a macro to override it
	with the punned expand.
	(CODE_FOR_lsx_vaddwev_q_du): Likewise.
	(CODE_FOR_lsx_vsubwev_q_d): Likewise.
	(CODE_FOR_lsx_vsubwev_q_du): Likewise.
	(CODE_FOR_lsx_vmulwev_q_d): Likewise.
	(CODE_FOR_lsx_vmulwev_q_du): Likewise.
	(CODE_FOR_lsx_vaddwod_q_d): Likewise.
	(CODE_FOR_lsx_vaddwod_q_du): Likewise.
	(CODE_FOR_lsx_vsubwod_q_d): Likewise.
	(CODE_FOR_lsx_vsubwod_q_du): Likewise.
	(CODE_FOR_lsx_vmulwod_q_d): Likewise.
	(CODE_FOR_lsx_vmulwod_q_du): Likewise.
	(CODE_FOR_lsx_vaddwev_q_du_d): Likewise.
	(CODE_FOR_lsx_vmulwev_q_du_d): Likewise.
	(CODE_FOR_lsx_vaddwod_q_du_d): Likewise.
	(CODE_FOR_lsx_vmulwod_q_du_d): Likewise.
	(CODE_FOR_lasx_xvaddwev_q_d): Likewise.
	(CODE_FOR_lasx_xvaddwev_q_du): Likewise.
	(CODE_FOR_lasx_xvsubwev_q_d): Likewise.
	(CODE_FOR_lasx_xvsubwev_q_du): Likewise.
	(CODE_FOR_lasx_xvmulwev_q_d): Likewise.
	(CODE_FOR_lasx_xvmulwev_q_du): Likewise.
	(CODE_FOR_lasx_xvaddwod_q_d): Likewise.
	(CODE_FOR_lasx_xvaddwod_q_du): Likewise.
	(CODE_FOR_lasx_xvsubwod_q_d): Likewise.
	(CODE_FOR_lasx_xvsubwod_q_du): Likewise.
	(CODE_FOR_lasx_xvmulwod_q_d): Likewise.
	(CODE_FOR_lasx_xvmulwod_q_du): Likewise.
	(CODE_FOR_lasx_xvaddwev_q_du_d): Likewise.
	(CODE_FOR_lasx_xvmulwev_q_du_d): Likewise.
	(CODE_FOR_lasx_xvaddwod_q_du_d): Likewise.
	(CODE_FOR_lasx_xvmulwod_q_du_d): Likewise.

a36c15aa

LoongArch: Allow moving TImode vectors · ac1b0586

Xi Ruoyao authored 1 month ago

We have some vector instructions for operations on 128-bit integer, i.e.
TImode, vectors.  Previously they had been modeled with unspecs, but
it's more natural to just model them with TImode vector RTL expressions.

In preparation, allow moving V1TImode and V2TImode vectors in LSX
and LASX registers so we won't get a reload failure when we start to
save TImode vectors in these registers.

This implicitly depends on the vrepli optimization: without it we'd try
"vrepli.q" which does not really exist and trigger an ICE.

gcc/ChangeLog:

	* config/loongarch/lsx.md (mov<LSX:mode>): Remove.
	(movmisalign<LSX:mode>): Remove.
	(mov<LSX:mode>_lsx): Remove.
	* config/loongarch/lasx.md (mov<LASX:mode>): Remove.
	(movmisalign<LASX:mode>): Remove.
	(mov<LASX:mode>_lasx): Remove.
	* config/loongarch/loongarch-modes.def (V1TI): Add.
	(V2TI): Mention in the comment.
	* config/loongarch/loongarch.md (mode): Add V1TI and V2TI.
	* config/loongarch/simd.md (ALLVEC_TI): New mode iterator.
	(mov<ALLVEC_TI:mode>): New define_expand.
	(movmisalign<ALLVEC_TI:mode>): Likewise.
	(mov<ALLVEC_TI:mode>_simd): New define_insn_and_split.

ac1b0586

LoongArch: Try harder using vrepli instructions to materialize const vectors · ed979454

Xi Ruoyao authored 1 month ago

For

  a = (v4si){0xdddddddd, 0xdddddddd, 0xdddddddd, 0xdddddddd}

we just want

  vrepli.b $vr0, 0xdd

but the compiler actually produces a load:

  la.local $r14,.LC0
  vld      $vr0,$r14,0

It's because we only tried vrepli.d which wouldn't work.  Try all vrepli
instructions when materializing const int vectors to fix it.
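
The idea of trying every element width can be sketched in portable C as follows. This is an illustration only, not the code in loongarch.cc: find_repl_width is a hypothetical name, and the signed 10-bit immediate range (-512 to 511) is an assumption about what a vrepli instruction accepts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: given the 16 bytes of a constant vector in
   little-endian order, return the narrowest element width in bytes
   (1, 2, 4 or 8) at which the vector is a replicated element whose
   sign-extended value fits the assumed immediate range, or 0 if no
   width works.  The value is stored in *IMM on success.  */
static int
find_repl_width (const uint8_t bytes[16], int64_t *imm)
{
  for (int w = 1; w <= 8; w *= 2)
    {
      bool repeats = true;
      for (int i = w; i < 16 && repeats; i++)
	if (bytes[i] != bytes[i % w])
	  repeats = false;
      if (!repeats)
	continue;

      /* Assemble one element and sign-extend it from 8 * w bits.  */
      uint64_t u = 0;
      for (int i = 0; i < w; i++)
	u |= (uint64_t) bytes[i] << (8 * i);
      int64_t v;
      if (w < 8)
	{
	  uint64_t sign = 1ULL << (8 * w - 1);
	  v = (int64_t) ((u ^ sign) - sign);
	}
      else
	v = (int64_t) u;

      /* Assumption: the immediate is a signed 10-bit value.  */
      if (v >= -512 && v <= 511)
	{
	  *imm = v;
	  return w;
	}
    }
  return 0;
}
```

For the example above every byte is 0xdd, so the search stops at the byte width. Checking only the widest element would reject the constant because its value is far outside the assumed immediate range.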

gcc/ChangeLog:

	* config/loongarch/loongarch-protos.h
	(loongarch_const_vector_vrepli): New function prototype.
	* config/loongarch/loongarch.cc (loongarch_const_vector_vrepli):
	Implement.
	(loongarch_const_insns): Call loongarch_const_vector_vrepli
	instead of loongarch_const_vector_same_int_p.
	(loongarch_split_vector_move_p): Likewise.
	(loongarch_output_move): Use loongarch_const_vector_vrepli to
	pun operands[1] into a better mode if it's a const int vector,
	and decide the suffix of [x]vrepli with the new mode.
	* config/loongarch/constraints.md (YI): Call
	loongarch_const_vector_vrepli instead of
	loongarch_const_vector_same_int_p.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vrepli.c: New test.

ed979454

LoongArch: Accept ADD, IOR or XOR when combining objects with no bits in common [PR115478] · ea3ebe48

Xi Ruoyao authored 1 month ago

Since r15-1120, multi-word shifts/rotates produce PLUS instead of IOR.
It's generally a good thing (allowing us to use our alsl instruction or
similar instructions on other architectures), but it's preventing us
from using bytepick.  For example, if we shift a __int128 by 16 bits,
the higher word can be produced via a single bytepick.d instruction with
immediate 2, but we got:

	srli.d	$r12,$r4,48
	slli.d	$r5,$r5,16
	slli.d	$r4,$r4,16
	add.d	$r5,$r12,$r5
	jr	$r1

This didn't work with GCC 14, but after r15-6490 it's supposed to work
if IOR was used instead of PLUS.

To fix this, add a code iterator to match IOR, XOR, and PLUS and use it
instead of just IOR if we know the operands have no overlapping bits.
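
The equivalence this relies on can be checked with a small portable C sketch. The function names are hypothetical. They model the high word of a 128-bit value, held as two 64-bit halves, after a left shift by 16 bits.

```c
#include <stdint.h>

/* High 64 bits of (hi:lo) << 16.  The two terms occupy disjoint bit
   ranges: hi << 16 has its low 16 bits clear, and lo >> 48 has only
   its low 16 bits possibly set.  With no bits in common there are no
   carries, so IOR, PLUS and XOR all give the same result.  */
static uint64_t
hi_shl16_ior (uint64_t hi, uint64_t lo)
{
  return (hi << 16) | (lo >> 48);
}

static uint64_t
hi_shl16_plus (uint64_t hi, uint64_t lo)
{
  return (hi << 16) + (lo >> 48);
}

static uint64_t
hi_shl16_xor (uint64_t hi, uint64_t lo)
{
  return (hi << 16) ^ (lo >> 48);
}
```

Since all three forms compute the same value, a pattern that matches only one of the operators can miss the other two.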

gcc/ChangeLog:

	PR target/115478
	* config/loongarch/loongarch.md (any_or_plus): New
	define_code_iterator.
	(bstrins_<mode>_for_ior_mask): Use any_or_plus instead of ior.
	(bytepick_w_<bytepick_imm>): Likewise.
	(bytepick_d_<bytepick_imm>): Likewise.
	(bytepick_d_<bytepick_imm>_rev): Likewise.

gcc/testsuite/ChangeLog:

	PR target/115478
	* gcc.target/loongarch/bytepick_shift_128.c: New test.

ea3ebe48

[PR middle-end/113525] Drop obsolete options from documentation · 3e93035f

Jeff Law authored 4 weeks ago

The sibling and unshare passes were dropped as distinct passes 10+ years ago.
Docs weren't ever updated.  This just removes them; given their age I don't
think we need to keep them around any longer.

	PR middle-end/113525

gcc/
	* doc/invoke.texi (dump-rtl-sibling): Drop documentation for pass
	removed long ago.
	(dump-rtl-unshare): Likewise.

3e93035f

Daily bump. · db7b21ac

GCC Administrator authored 4 weeks ago

db7b21ac

Feb 18, 2025

Fix description of file-cache-lines/file-cache-files params · 29482d4e

Andi Kleen authored 4 weeks ago

The file-cache-lines / file-cache-files tunables were documented in the
wrong section. Fix that.

Reported-by: Filip Kastl

Committed as obvious.

gcc/ChangeLog:

	* doc/invoke.texi:

29482d4e