Commits · 2ac01a4efceacb9f2f9433db636545885296da0a · COBOLworx / gcc-cobol

Oct 23, 2024

AArch64: Remove redundant check in aarch64_simd_mov · 2ac01a4e

Wilco Dijkstra authored 5 months ago

The split condition in aarch64_simd_mov uses aarch64_simd_special_constant_p.
While doing the split, it checks the mode before calling
aarch64_maybe_generate_simd_constant.  This is risky since it may result in
unexpectedly calling aarch64_split_simd_move instead of
aarch64_maybe_generate_simd_constant.  Since the mode is already checked,
remove the spurious explicit mode check.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (aarch64_simd_mov<VQMOV:mode>):
	Remove redundant mode check.

AArch64: Fix copysign patterns · 7c7c895c

Wilco Dijkstra authored 5 months ago

The current copysign pattern has a mismatch in the predicates and constraints -
operand[2] is a register_operand but also has an alternative X which allows any
operand.  Since it is a floating point operation, having an integer alternative
makes no sense.  Change the expander to always use vector immediates which
results in better code and sharing of immediates between copysign and xorsign.
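As a rough illustration of why copysign and xorsign can share an immediate, both operations reduce to bit manipulation with the same sign-bit mask.  The sketch below is plain C on scalar doubles, not the AArch64 patterns themselves; the helper names copysign_bits and xorsign_bits are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not GCC code: copysign as a bitwise select.
   Takes the magnitude bits from MAG and the sign bit from SGN.  */
double copysign_bits(double mag, double sgn)
{
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t m, s, r;
    double out;

    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);
    r = (m & ~sign_mask) | (s & sign_mask);
    memcpy(&out, &r, sizeof out);
    return out;
}

/* Hypothetical helper: xorsign flips the sign of X when Y is negative.
   It uses the same sign-bit mask as copysign_bits above.  */
double xorsign_bits(double x, double y)
{
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t a, b, r;
    double out;

    memcpy(&a, &x, sizeof a);
    memcpy(&b, &y, sizeof b);
    r = a ^ (b & sign_mask);
    memcpy(&out, &r, sizeof out);
    return out;
}
```

Both helpers need only the one constant, which is the kind of sharing the commit message refers to.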

gcc/ChangeLog:

	* config/aarch64/aarch64.md (copysign<GPF:mode>3): Widen immediate to
	vector.
	(copysign<GPF:mode>3_insn): Use VQ_INT_EQUIV in operand 3.
	* config/aarch64/iterators.md (VQ_INT_EQUIV): New iterator.
	(vq_int_equiv): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/copysign_3.c: New test.
	* gcc.target/aarch64/copysign_4.c: New test.
	* gcc.target/aarch64/fneg-abs_2.c: Fixup test.
	* gcc.target/aarch64/sve/fneg-abs_2.c: Likewise.

doc: remove obsolete deprecated info · 2b666dc4

Jason Merrill authored 5 months ago

These formerly deprecated features eventually made it into the C++ standard.

gcc/ChangeLog:

	* doc/extend.texi (Deprecated Features): Remove text about some
	no-longer-deprecated features.

AArch64: Add support for SIMD xor immediate (3/3) · 22a37534

Wilco Dijkstra authored 5 months ago

Add support for SVE xor immediate when generating AdvSIMD code and SVE is
available.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (enum simd_immediate_check): Add
	AARCH64_CHECK_XOR.
	(aarch64_simd_valid_xor_imm): New function.
	(aarch64_output_simd_imm): Add AARCH64_CHECK_XOR support.
	(aarch64_output_simd_xor_imm): New function.
	* config/aarch64/aarch64-protos.h (aarch64_output_simd_xor_imm): New
	prototype.
	(aarch64_simd_valid_xor_imm): New prototype.
	* config/aarch64/aarch64-simd.md (xor<mode>3<vczle><vczbe>):
	Use aarch64_reg_or_xor_imm predicate and add an immediate alternative.
	* config/aarch64/predicates.md (aarch64_reg_or_xor_imm): Add new
	predicate.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/simd_imm.c: New test.

AArch64: Improve SIMD immediate generation (2/3) · 756890d6

Wilco Dijkstra authored 5 months ago

Allow use of SVE immediates when generating AdvSIMD code and SVE is available.
First check for a valid AdvSIMD immediate, and if SVE is available, try using
an SVE move or bitmask immediate.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (ior<mode>3<vczle><vczbe>):
	Use aarch64_reg_or_orr_imm predicate.  Combine SVE/AdvSIMD immediates
	and use aarch64_output_simd_orr_imm.
	* config/aarch64/aarch64.cc (struct simd_immediate_info): Add SVE_MOV.
	(aarch64_sve_valid_immediate): Use SVE_MOV for SVE move immediates.
	(aarch64_simd_valid_imm): Enable SVE SIMD immediates when possible.
	(aarch64_output_simd_imm): Support emitting SVE SIMD immediates.
	* config/aarch64/predicates.md (aarch64_orr_imm_sve_advsimd): Remove.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/acle/asm/insr_s64.c: Allow SVE MOV imm.
	* gcc.target/aarch64/sve/acle/asm/insr_u64.c: Likewise.
	* gcc.target/aarch64/sve/fneg-abs_1.c: Update to check for ORRI.
	* gcc.target/aarch64/sve/fneg-abs_2.c: Likewise.
	* gcc.target/aarch64/sve/simd_imm_mov.c: New test.

AArch64: Improve SIMD immediate generation (1/3) · bcbf4fa4

Wilco Dijkstra authored 5 months ago

Clean up the various interfaces related to SIMD immediate generation.  Introduce
new functions that make it clear which operation (AND, OR, MOV) we are testing
for rather than guessing the final instruction.  Reduce the use of overly long
names, unused and default parameters for clarity.  No changes to internals or
generated code.

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (enum simd_immediate_check): Move to aarch64.cc.
	(aarch64_output_simd_mov_immediate): Remove.
	(aarch64_output_simd_mov_imm): New prototype.
	(aarch64_output_simd_orr_imm): Likewise.
	(aarch64_output_simd_and_imm): Likewise.
	(aarch64_simd_valid_immediate): Remove.
	(aarch64_simd_valid_and_imm): New prototype.
	(aarch64_simd_valid_mov_imm): Likewise.
	(aarch64_simd_valid_orr_imm): Likewise.
	* config/aarch64/aarch64-simd.md: Use aarch64_output_simd_mov_imm.
	* config/aarch64/aarch64.cc (enum simd_immediate_check): Moved from aarch64-protos.h.
	Use AARCH64_CHECK_AND rather than AARCH64_CHECK_BIC.
	(aarch64_expand_sve_const_vector): Use aarch64_simd_valid_mov_imm.
	(aarch64_expand_mov_immediate): Likewise.
	(aarch64_can_const_movi_rtx_p): Likewise.
	(aarch64_secondary_reload): Likewise.
	(aarch64_legitimate_constant_p): Likewise.
	(aarch64_advsimd_valid_immediate): Simplify checks on 'which' param.
	(aarch64_sve_valid_immediate): Add extra param for move vs logical.
	(aarch64_simd_valid_immediate): Rename to aarch64_simd_valid_imm.
	(aarch64_simd_valid_mov_imm): New function.
	(aarch64_simd_valid_orr_imm): Likewise.
	(aarch64_simd_valid_and_imm): Likewise.
	(aarch64_mov_operand_p): Use aarch64_simd_valid_mov_imm.
	(aarch64_simd_scalar_immediate_valid_for_move): Likewise.
	(aarch64_simd_make_constant): Likewise.
	(aarch64_expand_vector_init_fallback): Likewise.
	(aarch64_output_simd_mov_immediate): Rename to aarch64_output_simd_imm.
	(aarch64_output_simd_orr_imm): New function.
	(aarch64_output_simd_and_imm): Likewise.
	(aarch64_output_simd_mov_imm): Likewise.
	(aarch64_output_scalar_simd_mov_immediate): Use aarch64_output_simd_mov_imm.
	(aarch64_output_sve_mov_immediate): Use aarch64_simd_valid_imm.
	(aarch64_output_sve_ptrues): Likewise.
	* config/aarch64/constraints.md (Do): Use aarch64_simd_valid_orr_imm.
	(Db): Use aarch64_simd_valid_and_imm.
	* config/aarch64/predicates.md (aarch64_reg_or_bic_imm): Use aarch64_simd_valid_orr_imm.
	(aarch64_reg_or_and_imm): Use aarch64_simd_valid_and_imm.

Fix ICE due to isa mismatch for the builtins. · 403e361d

liuhongt authored 5 months ago

gcc/ChangeLog:

	PR target/117240
	* config/i386/i386-builtin.def: Add avx/avx512f to vaes
	ymm/zmm builtins.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr117240_avx.c: New test.
	* gcc.target/i386/pr117240_avx512f.c: New test.

Fortran: Minor follow-up cleanup to error.cc · 0ecc45a8

Tobias Burnus authored 5 months ago

Follow up to r15-4268-g459c6018d2308d, which removed dead code
but missed that terminal_width was only set but not used.

gcc/fortran/ChangeLog:

	* error.cc (terminal_width, gfc_get_terminal_width): Remove.
	(gfc_error_init_1): Do not call one to set the other.

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142) · 29d8f1f0

Martin Jambor authored 5 months ago

PR 117142 shows that the current SRA probably never worked reliably
with arguments passed to a function returning twice, because it then
creates statements before the call, even though the call needs to be
at the beginning of a basic block.

While it should be possible to make at least the case of passing
arguments by value work with SRA (the statements would need to be put
just on the non-abnormal edges leading to the BB), this would mean
large surgery of function sra_modify_expr and I guess the time would
better be spent re-organizing the whole pass.

gcc/ChangeLog:

2024-10-21  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/117142
	* tree-sra.cc (build_access_from_call_arg): Disqualify any
	candidate passed to a function returning twice.

gcc/testsuite/ChangeLog:

2024-10-21  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/117142
	* gcc.dg/tree-ssa/pr117142.c: New test.

c-family: Regenerate c.opt.urls · 89c92804

Jakub Jelinek authored 5 months ago

Forgot to regenerate urls after -Wleading-whitespace addition.

2024-10-23  Jakub Jelinek  <jakub@redhat.com>

	* c.opt.urls: Regenerate.

libcpp: Add -Wleading-whitespace= warning · d4499a23

Jakub Jelinek authored 5 months ago

The following patch on top of the r15-4346 patch adds
-Wleading-whitespace= warning option.
This warning doesn't care how much one actually indents which line
in the source (that is something that can't be easily done in the
preprocessor without doing syntactic analysis), but just performs simple
checks on what kind of whitespace is used in the indentation.
I think it is still useful to get warnings about such issues early.
While git diagnoses some of it in patches (e.g. the tab after space
case), getting the warnings earlier might help avoid such issues
sooner.

There are projects which ban use of tabs and require just spaces,
others which require indentation just with horizontal tabs, and finally
projects which want indentation with tabs for multiples of tabstop size
followed by spaces (fewer than tabstop size), like GCC.
For all 3 kinds the warning diagnoses indentation with '\v' or '\f'
characters (unless line contains just whitespace), and for the last one
also cases where a space in the indentation is followed by horizontal
tab or where there are N or more consecutive spaces in the indentation
(for -ftabstop=N).
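The rules above can be sketched as a small standalone checker.  This is only an illustration of the described behaviour, not the libcpp implementation: the enum names and the function leading_ws_issue are hypothetical, and the sketch assumes that the spaces-only style rejects tabs and the tabs-only style rejects spaces, which the description implies but does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical style names for the three kinds of projects.  */
enum ws_style { WS_SPACES, WS_TABS, WS_BLANKS };

/* Return true if the indentation of LINE violates STYLE.
   TABSTOP plays the role of N in -ftabstop=N.  */
bool leading_ws_issue(const char *line, enum ws_style style, unsigned tabstop)
{
    size_t i = 0;
    unsigned run = 0;          /* current run of consecutive spaces */
    bool seen_space = false;
    bool bad = false;

    for (; line[i] == ' ' || line[i] == '\t'
           || line[i] == '\v' || line[i] == '\f'; i++) {
        char c = line[i];
        if (c == '\v' || c == '\f') {
            bad = true;        /* diagnosed for all three styles */
        } else if (c == ' ') {
            seen_space = true;
            run++;
            if (style == WS_TABS)
                bad = true;    /* assumption: tabs-only rejects spaces */
            if (style == WS_BLANKS && run >= tabstop)
                bad = true;    /* N or more consecutive spaces */
        } else {               /* horizontal tab */
            if (style == WS_SPACES)
                bad = true;    /* assumption: spaces-only rejects tabs */
            if (style == WS_BLANKS && seen_space)
                bad = true;    /* space followed by tab */
            run = 0;
        }
    }

    /* Lines containing just whitespace are not reported.  */
    if (line[i] == '\0' || line[i] == '\n')
        return false;
    return bad;
}
```

For example, with a tabstop of 8 the GCC-like style accepts a tab followed by four spaces but flags eight spaces in a row.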

BTW, for additional testing I've enabled the warnings (without -Werror
for them) in stage3.  There are many warnings (both trailing and leading
whitespace), some of them something that can be easily fixed in the headers
or source files, but others with whitespace issues in generated sources,
so if we enable the warnings, either we'd need to adjust the generators
or disable the warnings in (some of the) generated files.

2024-10-23  Jakub Jelinek  <jakub@redhat.com>

libcpp/
	* include/cpplib.h (struct cpp_options): Add
	cpp_warn_leading_whitespace and cpp_tabstop members.
	(enum cpp_warning_reason): Add CPP_W_LEADING_WHITESPACE.
	* internal.h (struct _cpp_line_note): Document new
	line note kinds.
	* init.cc (cpp_create_reader): Set cpp_tabstop to 8.
	* lex.cc (find_leading_whitespace_issues): New function.
	(_cpp_clean_line): Use it.
	(_cpp_process_line_notes): Handle 'L', 'S' and 'T' line notes.
	(lex_raw_string): Clear type on 'L', 'S' and 'T' line notes
	inside of raw string literals.
gcc/
	* doc/invoke.texi (Wleading-whitespace=): Document.
gcc/c-family/
	* c.opt (Wleading-whitespace=): New option.
	* c-opts.cc (c_common_post_options): Set cpp_opts->cpp_tabstop
	to global_dc->m_tabstop.
gcc/testsuite/
	* c-c++-common/cpp/Wleading-whitespace-1.c: New test.
	* c-c++-common/cpp/Wleading-whitespace-2.c: New test.
	* c-c++-common/cpp/Wleading-whitespace-3.c: New test.
	* c-c++-common/cpp/Wleading-whitespace-4.c: New test.

libstdc++: Always instantiate key_type to compute hash code [PR115285] · ee030b28

François Dumont authored 5 months ago

Even if it is possible to compute a hash code from the inserted arguments
we need to instantiate the key_type to guarantee hash code consistency.

Preserve the lazy instantiation of the mapped_type in the context of
associative containers.

libstdc++-v3/ChangeLog:

	PR libstdc++/115285
	* include/bits/hashtable.h (_S_forward_key<_Kt>): Always return a temporary
	key_type instance.
	* testsuite/23_containers/unordered_map/96088.cc: Adapt to additional instantiation.
	Also check that mapped_type is not instantiated when there is no insertion.
	* testsuite/23_containers/unordered_multimap/96088.cc: Adapt to additional
	instantiation.
	* testsuite/23_containers/unordered_multiset/96088.cc: Likewise.
	* testsuite/23_containers/unordered_set/96088.cc: Likewise.
	* testsuite/23_containers/unordered_set/pr115285.cc: New test case.

i386: Optimize EQ/NE comparison between avx512 kmask and -1. · ee7e77e9

liuhongt authored 5 months ago

r15-974-gbf7745f887c765e06f2e75508f263debb60aeb2e has optimized for
jcc/setcc, but missed movcc.
The patch supports movcc.

gcc/ChangeLog:

	PR target/117232
	* config/i386/sse.md (*kortest_cmp<SWI1248_AVX512BWDQ_64:mode>_movqicc):
	New define_insn_and_split.
	(*kortest_cmp<SWI1248_AVX512BWDQ_64:mode>_mov<SWI248:mode>cc):
	Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr117232-1.c: New test.
	* gcc.target/i386/pr117232-apx-1.c: New test.

Daily bump. · 01ed5c62

GCC Administrator authored 5 months ago

c: Restore "originally defined" struct redefinition messages for C23 · ecb55d94

Joseph Myers authored 5 months ago

One failure with a -std=gnu23 default that indicates a
quality-of-implementation regression in C23 mode is gcc.dg/pr39084.c,
which loses the expected "originally defined here" message on struct
redefinition errors (which occur in a different place in the front end
for C23 because it is necessary to see the members of the struct to
determine whether the redefinition is valid).  That message seems a
good thing to have both in and out of C23 mode, so add logic to
restore it in the C23 case.

Bootstrapped with no regressions for x86-64-pc-linux-gnu.

gcc/c/
	* c-decl.cc (c_struct_parse_info): Add member refloc.
	(start_struct): Store refloc in struct_parse_info.
	(finish_struct): Give "originally defined" message for C23 struct
	redefinition errors.

gcc/testsuite/
	* gcc.dg/gnu17-tag-1.c, gcc.dg/gnu23-tag-5.c: New tests.

Oct 22, 2024

c++: non-dep structured binding decltype again [PR117107] · 71e13ea1

Jason Merrill authored 5 months ago

The patch for PR92687 handled the usual case of a decomp variable not being
in the table, but missed the case of there being nothing in the table yet.

	PR c++/117107
	PR c++/92687

gcc/cp/ChangeLog:

	* decl.cc (lookup_decomp_type): Handle null table.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/decomp10.C: New test.

c++: add testcase [PR116929] · 5c6c1aba

Jason Merrill authored 5 months ago

This testcase was fixed by r15-822-g0173dcce92baa6 .

	PR c++/116929

gcc/testsuite/ChangeLog:

	* g++.dg/modules/enum-14.C: New test.

libstdc++: Implement LWG 4166 changes to concat_view::end() · f191c830

Patrick Palka authored 5 months ago


This patch proactively implements the proposed resolution for this LWG
issue, which seems straightforward and slated to get approved as-is.

(No _GLIBCXX_RESOLVE_LIB_DEFECTS code comment is added since concat_view
is C++26, so this isn't a defect against a published standard.)

libstdc++-v3/ChangeLog:

	* include/std/ranges (concat_view::begin): Add space after
	'requires' starting a requires-clause.
	(concat_view::end): Likewise.  Refine condition for returning an
	iterator rather than default_sentinel as per LWG 4166.
	* testsuite/std/ranges/concat/1.cc (test03): Verify LWG 4166
	example.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

c: Better fix for speed up compilation of large char array initializers when... · a6db5908

Jakub Jelinek authored 5 months ago

c: Better fix for speed up compilation of large char array initializers when not using #embed [PR117190]

On Wed, Oct 16, 2024 at 11:09:32PM +0200, Jakub Jelinek wrote:
> Apparently my
> c: Speed up compilation of large char array initializers when not using #embed
> patch broke building glibc.
>
> The issue is that when using CPP_EMBED, we are guaranteed by the
> preprocessor that there is CPP_NUMBER CPP_COMMA before it and
> CPP_COMMA CPP_NUMBER after it (or CPP_COMMA CPP_EMBED), so RAW_DATA_CST
> never ends up at the end of arrays of unknown length.
> Now, the c_parser_initval optimization attempted to preserve that property
> rather than changing everything that e.g. infers array number of elements
> from the initializer etc. to deal with RAW_DATA_CST at the end, but
> it didn't take into account the possibility that there could be
> CPP_COMMA followed by CPP_CLOSE_BRACE (where the CPP_COMMA is redundant).
>
> As we are peeking already at 4 tokens in that code, peeking more would
> require using raw tokens and that seems to be expensive doing it for
> every pair of tokens due to vec_free done when we are out of raw tokens.

Sorry for rushing the previous patch too much, turns out I was wrong,
given that the c_parser_peek_nth_token numbering is 1 based, we can peek
also with c_parser_peek_nth_token (parser, 4) and the loop actually peeked
just at 3 tokens, not 4.

So, I think it is better to revert the previous patch (but keep the new
test) and instead peek the 4th non-raw token, which is what the following
patch does.

Additionally, PR117190 shows one further spot which missed the peek of
the token after CPP_COMMA, in case it is incomplete array with exactly 65
elements with redundant comma after it, which this patch handles too.
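For reference, the shape of initializer in question looks roughly like the following: an array of unknown length, 65 small integer elements, and a redundant comma before the closing brace.  This is an illustration written for this note, not the init-5.c testcase, and the names table and table_len are made up.

```c
#include <stddef.h>

/* Incomplete array type: the element count is inferred from the
   initializer.  Note the redundant comma after the last element.  */
const unsigned char table[] = {
     1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13,
    14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
    27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
    53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65,
};

/* The trailing comma must not change the inferred length.  */
size_t table_len(void)
{
    return sizeof table;
}
```

A conforming compiler infers 65 elements here regardless of the trailing comma.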

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

	PR c/117190
gcc/c/
	* c-parser.cc (c_parser_initval): Revert 2024-10-17 changes.
	Instead peek the 4th token and if it is not CPP_NUMBER,
	handle it like 3rd token CPP_CLOSE_BRACE for orig_len == INT_MAX.
	Also, check (2 + 2 * i)th raw token for the orig_len == INT_MAX
	case and punt if it is not CPP_NUMBER.
gcc/testsuite/
	* c-c++-common/init-5.c: New test.

c-family: Fix up -Wsizeof-pointer-memaccess ICEs [PR117230] · 5fd1c0c1

Jakub Jelinek authored 5 months ago

In the following testcases, we ICE on all 4 function calls.
The problem is using TYPE_PRECISION on vector types (but guess it
would be similarly problematic on structures/unions/arrays).
The test only differentiates between suggestions on what to do: whether
to supply an explicit size, because sizeof (*p) for
{,{,un}signed }char *p is not very likely what the user wants, or
to dereference the pointer.  So I think limiting that suggestion
to integral types is ok.
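As background, the two suggestions correspond to the two usual fixes for passing the size of a pointer to a memory function.  The sketch below shows the corrected forms only; struct packet and both function names are invented for the example and do not come from the testcase.

```c
#include <stddef.h>
#include <string.h>

struct packet { int id; char payload[60]; };

/* Writing memset (p, 0, sizeof (p)) here would clear only
   sizeof (struct packet *) bytes.  Dereferencing the pointer in the
   sizeof expression is the fix suggested for non-char pointees.  */
void clear_packet(struct packet *p)
{
    memset(p, 0, sizeof(*p));
}

/* For char pointers sizeof (*buf) is 1, which is rarely what is
   meant, so the length is supplied explicitly instead.  */
void clear_bytes(unsigned char *buf, size_t len)
{
    memset(buf, 0, len);
}
```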

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

	PR c/117230
	* c-warn.cc (sizeof_pointer_memaccess_warning): Only compare
	TYPE_PRECISION of TREE_TYPE (type) to precision of char if
	TREE_TYPE (type) is integral type.

	* c-c++-common/Wsizeof-pointer-memaccess5.c: New test.

varasm: Handle RAW_DATA_CST in compare_constant [PR117199] · f616bc41

Jakub Jelinek authored 5 months ago

On the following testcase without LTO we unnecessarily don't merge
two identical .LC* constants (constant hashing computes the same hash,
but as compare_constant returned false for the RAW_DATA_CST in it,
it never compares equal), and with LTO fails to link because LTO assumes such
constants have to be merged and so doesn't emit the other constant.

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/117199
	* varasm.cc (compare_constant): Handle RAW_DATA_CST.  Formatting fix
	in the STRING_CST case.

	* gcc.dg/lto/pr117199_0.c: New test.

varasm: Fix up RAW_DATA_CST handling in array_size_for_constructor [PR117190] · 8f173da4

Jakub Jelinek authored 5 months ago

CONSTRUCTOR indices for arrays have bitsize type, and the r15-4375
patch actually got it right in 6 other spots, but not in this function,
where it used size_int rather than bitsize_int and so size_binop can ICE
on type mismatch.

This is covered by the init-5.c testcase I've just posted, though the ICE
goes away when the C FE is fixed (and when it is not, there is another
ICE).

2024-10-22  Jakub Jelinek  <jakub@redhat.com>

	PR c/117190
	* varasm.cc (array_size_for_constructor): For RAW_DATA_CST,
	use bitsize_int rather than size_int.

GCN: Initial generic-target handling, add more GCN macro defines · 1bdeebe6

Tobias Burnus authored 5 months ago

Newer llvm-mc assemblers support the gfx*-generic targets, which make it
possible to generate code for all GPUs belonging to the same generation, even
if the code is not optimal. This requires LLVM 19.

This patch adds the compiler-side support for generic gfx and also
adds -march=gfx10-3-generic and -march=gfx11-generic. However, those -march=
values are not documented nor used anywhere, yet.

Disclaimer: Not tested (as my ROCm does not support it); additionally,
libgomp/plugin/plugin-gcn.c has to be updated before it becomes useful.

For better compatibility with LLVM's Clang, this commit additionally adds
the macro definitions __GFX<9|10|11>__ for the architecture family,
__AMDGPU__ besides the existing __AMDGCN__ and the two strings-containing
macros __amdgcn_processor__ and __amdgcn_target_id__, where the former has
'-' replaced by '_' but otherwise both contain the lower case name. For the
new generic targets, the same happens, yielding, e.g., __gfx10_3_generic__.
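The '-' to '_' mapping described above can be sketched in a few lines of C.  This is not the code in gcn.h or the awk generator; gcn_macro_name is a hypothetical helper that only shows how a device name such as gfx10-3-generic would turn into the macro spelling __gfx10_3_generic__.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write "__<device>__" into OUT, replacing each
   '-' in DEVICE with '_'.  Returns 0 on success, or -1 if OUT is too
   small to hold the result and its terminating NUL.  */
int gcn_macro_name(const char *device, char *out, size_t out_size)
{
    size_t len = strlen(device);
    size_t i;

    if (out_size < len + 5)    /* "__" + name + "__" + NUL */
        return -1;

    out[0] = '_';
    out[1] = '_';
    for (i = 0; i < len; i++)
        out[2 + i] = (device[i] == '-') ? '_' : device[i];
    out[2 + len] = '_';
    out[3 + len] = '_';
    out[4 + len] = '\0';
    return 0;
}
```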

gcc/ChangeLog:

	* config/gcn/gcn-devices.def: Add generic version/flag as additional
	value and architecture family entry; update; add gfx10-3-generic
	and gfx11-generic.
	* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove.
	(ASM_SPEC): Use generated ABI_VERSION_OPT instead.
	* config/gcn/gcn-tables.opt: Regenerate.
	* config/gcn/gcn.h (gcn_device_def): Add generic_version and
	arch_family members.
	(TARGET_CPU_CPP_BUILTINS): Fix allocation bug, handle '-' in the
	name and add additional macro defines.
	* config/gcn/gcn.cc (gcn_devices): Handle it.
	* config/gcn/gen-gcn-device-macros.awk: Likewise; use ELF name
	for the macro name; generate ABI_VERSION_OPT.
	* config/gcn/mkoffload.cc (ELFABIVERSION_AMDGPU_HSA_V6,
	EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET,
	GET_GENERIC_VERSION, SET_GENERIC_VERSION): Define.
	(get_arch): Call SET_GENERIC_VERSION flag on elf_flags.
	(copy_early_debug_info): If the arch sets the generic version,
	use ELFABIVERSION_AMDGPU_HSA_V6.

testsuite: arm: Use check-function-bodies in fp16-aapcs-* tests · 205515da

Torbjörn SVENSSON authored 5 months ago


Converted the tests to use check-function-bodies in order to ensure that
the sequence is correct.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/fp16-aapcs-1.c: Use check-function-bodies.
	* gcc.target/arm/fp16-aapcs-2.c: Likewise.
	* gcc.target/arm/fp16-aapcs-3.c: Likewise.
	* gcc.target/arm/fp16-aapcs-4.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Relax expected asm in bitfield* and union-2 tests · a79ca49b

Torbjörn SVENSSON authored 5 months ago


Below -O2, lsls/lsrs are preferred. For -O2 and above, lsl/lsr are
preferred.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Allow lsl and
	lsr instructions.
	* gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/union-2.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

testsuite: arm: Use check-function-bodies in cmse-5 tests · 835ad52f

Torbjörn SVENSSON authored 5 months ago


Converted the tests to use check-function-bodies in order to ensure that
the sequence is correct.
This also allows both APSR_nzcvq and APSR_nzcvqg as target selector does
not work when the -march and/or -mcpu overrides the target to test.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/cmse/mainline/8m/hard-sp/cmse-5.c: Use
	check-function-bodies.
	* gcc.target/arm/cmse/mainline/8m/hard/cmse-5.c: Likewise.
	* gcc.target/arm/cmse/mainline/8m/soft/cmse-5.c: Likewise.
	* gcc.target/arm/cmse/mainline/8m/softfp-sp/cmse-5.c: Likewise.
	* gcc.target/arm/cmse/mainline/8m/softfp/cmse-5.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-5.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/hard/cmse-5.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/soft/cmse-5.c: Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-5.c:
	Likewise.
	* gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-5.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

libstdc++: Avoid using std::__to_address with iterators · 85e5b80e

Jonathan Wakely authored 5 months ago

In r12-3935-g82626be2d633a9 I added the partial specialization
std::pointer_traits<__normal_iterator<It, Cont>> so that __to_address
would work with __normal_iterator objects. Soon after that, François
replaced it in r12-6004-g807ad4bc854cae with an overload of __to_address
that served the same purpose, but was less complicated and less wrong.

I now think that both commits were mistakes, and that instead of adding
hacks to make __normal_iterator work with __to_address, we should not be
using __to_address with iterators at all before C++20.

The pre-C++20 std::__to_address function should only be used with
pointer-like types, specifically allocator_traits<A>::pointer types.
Those pointer-like types are guaranteed to be contiguous iterators, so
that getting a raw memory address from them is OK.

For arbitrary iterators, even random access iterators, we don't know
that it's safe to lower the iterator to a pointer e.g. for std::deque
iterators it's not, because (it + n) == (std::to_address(it) + n) only
holds within the same block of the deque's storage.

For C++20, std::to_address does work correctly for contiguous iterators,
including __normal_iterator, and __to_address just calls std::to_address
so also works. But we have to be sure we have an iterator that satisfies
the std::contiguous_iterator concept for it to be safe, and we can't
check that before C++20.

So for pre-C++20 code the correct way to handle iterators that might be
pointers or might be __normal_iterator is to call __niter_base, and if
necessary use is_pointer to check whether __niter_base returned a real
pointer.

We currently have some uses of std::__to_address with iterators where
we've checked that they're either pointers, or __normal_iterator
wrappers around pointers, or satisfy std::contiguous_iterator. But this
seems a little fragile, and it would be better to just use
std::__niter_base for the pointers and __normal_iterator cases, and use
C++20 std::to_address when the C++20 std::contiguous_iterator concept is
satisfied. This patch does that.

libstdc++-v3/ChangeLog:

	* include/bits/basic_string.h (basic_string::assign): Replace
	use of __to_address with __niter_base or std::to_address as
	appropriate.
	* include/bits/ptr_traits.h (__to_address): Add comment.
	* include/bits/shared_ptr_base.h (__shared_ptr): Qualify calls
	to __to_address.
	* include/bits/stl_algo.h (find): Replace use of __to_address
	with __niter_base or std::to_address as appropriate. Only use
	either of them when the range is not empty.
	* include/bits/stl_iterator.h (__to_address): Remove overload
	for __normal_iterator.
	* include/debug/safe_iterator.h (__to_address): Remove overload
	for _Safe_iterator.
	* include/std/ranges (views::counted): Replace use of
	__to_address with std::to_address.
	* testsuite/24_iterators/normal_iterator/to_address.cc: Removed.

testsuite: Add test directive checking removal of link_error · bf11ecbb

Jennifer Schmitz authored 5 months ago


This test needs a directive checking the removal of the link_error.
Committed as obvious.

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/testsuite/
	* gcc.dg/tree-ssa/log_ident.c: Add scan for removal of
	link_error in optimized tree dump.

c++: redundant hashing in register_specialization · ae614b8a

Patrick Palka authored 5 months ago


After r15-4050-g5dad738c1dd164 register_specialization needs to set
elt.hash to the (maybe) precomputed hash so that the lookup uses it
rather than redundantly computing it from scratch.

gcc/cp/ChangeLog:

	* pt.cc (register_specialization): Set elt.hash.

Reviewed-by: Jason Merrill <jason@redhat.com>

testsuite: Skip pr112305.c for -O[01] on simulators · 4e80432c

Richard Sandiford authored 5 months ago

gcc.dg/torture/pr112305.c contains an inner loop that executes
0x8000_0014 times and an outer loop that executes 5 times, giving about
10 billion total executions of the inner loop body.  At -O2 and above we
are able to remove the inner loop, but at -O1 we keep a no-op loop:

        dls     lr, r3
.L3:
        subs    r3, r3, #1
        le      lr, .L3

and at -O0 we of course don't optimise.

This can lead to long execution times on simulators, possibly
triggering a timeout.
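The "about 10 billion" figure follows directly from the two loop counts.  A minimal check, with a function name chosen just for this note:

```c
/* Inner loop count times outer loop count, as quoted above.  */
unsigned long long total_inner_iterations(void)
{
    const unsigned long long inner = 0x80000014ULL;  /* 2147483668 */
    const unsigned long long outer = 5;

    return inner * outer;                            /* 10737418340 */
}
```

That is roughly 1.07e10 executions of the inner loop body, which explains the long run times when the loop is not optimised away.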

gcc/testsuite
	* gcc.dg/torture/pr112305.c: Skip at -O0 and -O1 for simulators.

c++/modules: Handle forward-declared class types · 9f9afc65

Nathaniel Shead authored 5 months ago


In some cases we can access members of a namespace-scope class without
ever having performed name-lookup on it; this can occur when a
forward-declaration of the class is used as a return type, for
instance, or with PIMPL.

One possible approach would be to do name lookup in complete_type to
force lazy loading to occur, but this seems overly expensive for a
relatively rare case.  Instead, this patch generalises the existing
pending-entity support to handle this case as well.

Unfortunately this does mean that almost every class definition will be
added to the pending-entity table, and almost always unnecessarily, but
I don't see a good way to avoid this.

gcc/cp/ChangeLog:

	* module.cc (depset::DB_IS_MEMBER_BIT): Rename to...
	(depset::DB_IS_PENDING_BIT): ...this.
	(depset::is_member): Remove.
	(depset::is_pending_entity): New function.
	(depset::hash::make_dependency): Mark definitions of
	namespace-scope types as maybe-pending entities.
	(depset::hash::add_class_entities): Rename DB_IS_MEMBER_BIT to
	DB_IS_PENDING_BIT.
	(depset::hash::find_dependencies): Use is_pending_entity
	instead of is_member.
	(module_state::write_pendings): Likewise; adjust comment.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/inst-4_b.C: Adjust pending-entity count.
	* g++.dg/modules/member-def-1_c.C: Likewise.
	* g++.dg/modules/member-def-2_c.C: Likewise.
	* g++.dg/modules/tpl-spec-3_b.C: Likewise.
	* g++.dg/modules/tpl-spec-4_b.C: Likewise.
	* g++.dg/modules/tpl-spec-5_b.C: Likewise.
	* g++.dg/modules/class-9_a.H: New test.
	* g++.dg/modules/class-9_b.H: New test.
	* g++.dg/modules/class-9_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

tree-optimization/117254 - ICE with access diagnostics · d464a52d

Richard Biener authored 5 months ago

The diagnostics code fails to handle non-constant domain max.

	PR tree-optimization/117254
	* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg):
	Check the array domain max is constant before using it.

	* gcc.dg/pr117254.c: New testcase.

amdgcn: Refactor device settings into a def file · a6b26e5e

Andrew Stubbs authored 6 months ago


Almost all device-specific settings are now centralised into gcn-devices.def
for the compiler, mkoffload, and libgomp.  No longer will we have to touch 10
files in multiple places just to add another device without any exotic
features.  (New ISAs and devices with incompatible metadata will continue to
need a bit more.)
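The general technique of driving several tables from one definition list is often written in C as an "X-macro".  The sketch below illustrates that idea only; it does not reproduce the layout of gcn-devices.def, the DEMO_ names are hypothetical, and the numeric codes are placeholders rather than real ELF machine values.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a .def file: one entry per device.  The field layout
   and the numeric codes are placeholders for illustration.  */
#define DEMO_DEVICES(X)        \
    X(GFX900,  "gfx900",  1)   \
    X(GFX908,  "gfx908",  2)   \
    X(GFX1100, "gfx1100", 3)

/* First expansion: an enum with one value per device.  */
enum demo_processor {
#define X(id, name, code) DEMO_##id,
    DEMO_DEVICES(X)
#undef X
    DEMO_DEVICE_COUNT
};

struct demo_device {
    const char *name;
    int elf_code;
};

/* Second expansion: a lookup table generated from the same list.  */
const struct demo_device demo_devices[] = {
#define X(id, name, code) { name, code },
    DEMO_DEVICES(X)
#undef X
};

/* Return the placeholder code for NAME, or -1 if it is unknown.  */
int demo_elf_code(const char *name)
{
    size_t i;

    for (i = 0; i < DEMO_DEVICE_COUNT; i++)
        if (strcmp(demo_devices[i].name, name) == 0)
            return demo_devices[i].elf_code;
    return -1;
}
```

Adding a device then means adding one line to the list, and the enum and the table stay in sync automatically.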

In order to remove the device-specific conditionals in the code a new value
HSACO_ATTR_UNSUPPORTED has been added, indicating that the assembler will
reject any setting of that option.

This incorporates some of Tobias's patch from March 2024.

Co-Authored-By: Tobias Burnus <tburnus@baylibre.com>

gcc/ChangeLog:

	* config.gcc (amdgcn): Add gcn-device-macros.h to tm_file.
	Add gcn-tables.opt to extra_options.
	* config/gcn/gcn-hsa.h (NO_XNACK): Delete.
	(NO_SRAM_ECC): Delete.
	(SRAMOPT): Move definition to generated file gcn-device-macros.h.
	(XNACKOPT): Likewise.
	(ASM_SPEC): Redefine using generated values from gcn-device-macros.h.
	* config/gcn/gcn-opts.h
	(enum processor_type): Generate from gcn-devices.def.
	(TARGET_VEGA10): Delete.
	(TARGET_VEGA20): Delete.
	(TARGET_GFX908): Delete.
	(TARGET_GFX90a): Delete.
	(TARGET_GFX90c): Delete.
	(TARGET_GFX1030): Delete.
	(TARGET_GFX1036): Delete.
	(TARGET_GFX1100): Delete.
	(TARGET_GFX1103): Delete.
	(TARGET_XNACK): Redefine to allow for HSACO_ATTR_UNSUPPORTED.
	(enum hsaco_attr_type): Add HSACO_ATTR_UNSUPPORTED.
	(TARGET_TGSPLIT): New define.
	* config/gcn/gcn.cc (gcn_devices): New constant table.
	(gcn_option_override): Rework to use gcn_devices table.
	(gcn_omp_device_kind_arch_isa): Likewise.
	(output_file_start): Likewise.
	(gcn_hsa_declare_function_name): Rework using TARGET_* macros.
	* config/gcn/gcn.h (gcn_devices): Declare struct and table.
	(TARGET_CPU_CPP_BUILTINS): Rework using gcn_devices.
	* config/gcn/gcn.opt: Move enum data to generated file gcn-tables.opt.
	Use new names for the default values.
	* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX900): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX906): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX908): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX90a): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX90c): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX1030): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX1036): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX1100): Delete.
	(EF_AMDGPU_MACH_AMDGCN_GFX1103): Delete.
	(enum elf_arch_code): Define using gcn-devices.def.
	(get_arch): Rework using gcn-devices.def.
	(main): Rework using gcn-devices.def.
	* config/gcn/t-gcn-hsa (gcn-tables.opt): Generate file.
	(gcn-device-macros.h): Generate file.
	* config/gcn/t-omp-device: Generate isa list from gcn-devices.def.
	* config/gcn/gcn-devices.def: New file.
	* config/gcn/gcn-tables.opt: New file.
	* config/gcn/gcn-tables.opt.urls: New file.
	* config/gcn/gen-gcn-device-macros.awk: New file.
	* config/gcn/gen-opt-tables.awk: New file.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Generate from gcn-devices.def.
	(gcn_gfx803_s): Delete.
	(gcn_gfx900_s): Delete.
	(gcn_gfx906_s): Delete.
	(gcn_gfx908_s): Delete.
	(gcn_gfx90a_s): Delete.
	(gcn_gfx90c_s): Delete.
	(gcn_gfx1030_s): Delete.
	(gcn_gfx1036_s): Delete.
	(gcn_gfx1100_s): Delete.
	(gcn_gfx1103_s): Delete.
	(gcn_isa_name_len): Delete.
	(isa_hsa_name): Rename ...
	(isa_name): ... to this, and rework using gcn-devices.def.
	(isa_gcc_name): Delete.
	(isa_code): Rework using gcn-devices.def.
	(max_isa_vgprs): Rework using gcn-devices.def.
	(isa_matches_agent): Update isa_name usage.
	(GOMP_OFFLOAD_init_device): Improve diagnostic using the name.

a6b26e5e

tree-optimization/117123 - missed PHI equivalence in VN · c33d8c55

Richard Biener authored 5 months ago

Value-numbering can use its set of equivalences to prove that
a PHI node with args <a_1, 5, 10> is equal to a_1 iff a_1 == 5
and a_1 == 10 hold on the edges carrying the respective constants.
This breaks down when the order of PHI args is <5, 10, a_1>, as we
then drop to VARYING early.  The following mitigates this by
shuffling a copy of the edge vector to always process an SSA name
argument first, which should also handle the special case of a
two-argument PHI <5, a_1> that we already had.
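
As a source-level illustration of the equivalence being exploited, consider the small C function below. It is a hypothetical example (the name `phi_equiv_demo` is made up and this is not the testcase added by the commit); whether GCC presents the PHI arguments in the problematic order for this exact input depends on how the CFG is laid out.

```c
/* At the join point the result is a PHI of the values 5, 10 and a.
   On the edge carrying 5 we know a == 5, and on the edge carrying 10
   we know a == 10, so the merged value equals a on every path.  */
int
phi_equiv_demo (int a)
{
  int r;
  if (a == 5)
    r = 5;
  else if (a == 10)
    r = 10;
  else
    r = a;
  return r;
}
```

The function always returns its argument, so an optimizer that proves the PHI equal to `a` can reduce the body to a plain `return a`.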

	PR tree-optimization/117123
	* tree-ssa-sccvn.cc (visit_phi): First process a non-constant
	argument edge to handle more equivalences.  Remove the
	two-arg special case.

	* g++.dg/tree-ssa/pr117123.C: New testcase.

c33d8c55

testsuite: Fix typo in ext-floating19.C · 9263523b

Stefan Schulze Frielinghaus authored 5 months ago

gcc/testsuite/ChangeLog:

	* g++.dg/cpp23/ext-floating19.C: Fix typo for bfloat16 guard.

9263523b

RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = 1. · adf4ece4

xuli authored 5 months ago


form 1:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
{                                       \
  return (T)IMM >= y ? (T)IMM - y : 0;  \
}

Passed the rv64gcv regression test.

Change-Id: I8805225b445cdbbc685f4f54a4d66c7ee8f748e1
Signed-off-by: Li Xu <xuli1@eswincomputing.com>

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat_u_sub_imm-1_4.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-2_4.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-3_4.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-4_2.c: New test.

adf4ece4

Match: Support IMM=1 for unsigned scalar .SAT_SUB IMM form 1 · 4e65e12a

xuli authored 5 months ago


This patch adds support for .SAT_SUB form 1 when one of the
operands is the immediate IMM = 1.

Form 1:
 #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
 T __attribute__((noinline))             \
 sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
 {                                       \
   return IMM >= y ? IMM - y : 0;        \
 }

Take the following instance of form 1 as an example:
DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 1)

Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;
  uint8_t _3;

  <bb 2> [local count: 1073741824]:
  if (y_2(D) <= 1)
    goto <bb 3>; [41.00%]
  else
    goto <bb 4>; [59.00%]

  <bb 3> [local count: 440234144]:
  _3 = y_2(D) ^ 1;

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <0(2), _3(3)>
  return _1;

}

After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _1 = .SAT_SUB (1, y_2(D)); [tail call]
  return _1;
;;    succ:       EXIT

}
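
The "before" dump computes `y ^ 1` under the guard `y <= 1` rather than a literal `1 - y`; the two agree on the only guarded values, 0 and 1. The sketch below checks exhaustively that this shape matches unsigned saturating subtraction from 1 for every uint8_t input. The function names are hypothetical and the code only models the semantics; it makes no claim about how match.pd recognizes the pattern.

```c
#include <stdint.h>

/* Reference semantics of unsigned saturating subtraction for uint8_t.  */
uint8_t
sat_u_sub_u8 (uint8_t x, uint8_t y)
{
  return x >= y ? (uint8_t) (x - y) : 0;
}

/* The shape seen in the "before" dump: under the guard y <= 1 the
   subtraction 1 - y shows up as y ^ 1.  */
uint8_t
before_shape_imm1 (uint8_t y)
{
  return y <= 1 ? (uint8_t) (y ^ 1) : 0;
}

/* Count the inputs on which the two functions disagree.  */
int
count_mismatches_imm1 (void)
{
  int mismatches = 0;
  for (int y = 0; y <= UINT8_MAX; y++)
    if (sat_u_sub_u8 (1, (uint8_t) y) != before_shape_imm1 ((uint8_t) y))
      mismatches++;
  return mismatches;
}
```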

The following test suites pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>

gcc/ChangeLog:

	* match.pd: Support IMM=1.

4e65e12a

RISC-V: Add testcases for unsigned .SAT_SUB form 1 with IMM = max - 1. · 93b6f287

xuli authored 5 months ago


form 1:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
{                                       \
  return (T)IMM >= y ? (T)IMM - y : 0;  \
}

Passed the rv64gcv regression test.

Change-Id: Idaa1ab41f2a5785112279ea8ee2c93236457b740
Signed-off-by: Li Xu <xuli1@eswincomputing.com>

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat_u_sub_imm-1_3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-2_3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-3_3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-4_1.c: New test.

93b6f287

Match: Support IMM=max-1 for unsigned scalar .SAT_SUB IMM form 1 · 1dccec47

xuli authored 5 months ago


This patch adds support for .SAT_SUB form 1 when one of the
operands is the immediate IMM = max - 1.

Form 1:
 #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
 T __attribute__((noinline))             \
 sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
 {                                       \
   return IMM >= y ? IMM - y : 0;        \
 }

Take the following instance of form 1 as an example:
DEF_SAT_U_SUB_IMM_FMT_1(uint8_t, 254)

Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;
  uint8_t _3;

  <bb 2> [local count: 1073741824]:
  if (y_2(D) != 255)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669600]:
  _3 = 254 - y_2(D);

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <0(2), _3(3)>
  return _1;

}

After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm254_uint8_t_fmt_1 (uint8_t y)
{
  uint8_t _1;

  <bb 2> [local count: 1073741824]:
  _1 = .SAT_SUB (254, y_2(D)); [tail call]
  return _1;

}
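
In the "before" dump the comparison `254 >= y` has been canonicalized to `y != 255`, which is the same condition for a uint8_t operand. The sketch below checks exhaustively that this shape agrees with unsigned saturating subtraction from 254. As with any such model, the function names are hypothetical and the code only illustrates the semantics, not the match.pd pattern itself.

```c
#include <stdint.h>

/* Reference semantics of unsigned saturating subtraction for uint8_t.  */
uint8_t
sat_u_sub_u8 (uint8_t x, uint8_t y)
{
  return x >= y ? (uint8_t) (x - y) : 0;
}

/* The shape seen in the "before" dump: the guard 254 >= y appears
   as y != 255.  */
uint8_t
before_shape_imm254 (uint8_t y)
{
  return y != 255 ? (uint8_t) (254 - y) : 0;
}

/* Count the inputs on which the two functions disagree.  */
int
count_mismatches_imm254 (void)
{
  int mismatches = 0;
  for (int y = 0; y <= UINT8_MAX; y++)
    if (sat_u_sub_u8 (254, (uint8_t) y) != before_shape_imm254 ((uint8_t) y))
      mismatches++;
  return mismatches;
}
```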

The following test suites pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>

gcc/ChangeLog:

	* match.pd: Support IMM=max-1.

1dccec47

Daily bump. · 52cc5f04
GCC Administrator authored 5 months ago

52cc5f04