Commits · 2c24e0568392e51a77ebdaab629d631969ce8966 · COBOLworx / gcc-cobol

Aug 08, 2024

AArch64: Fix signbit mask creation after late combine [PR116229] · 2c24e056

Tamar Christina authored 7 months ago

The optimization to generate a DI signbit constant by using fneg was relying
on nothing being able to push the constant into the negate.  It's run quite
late for this reason.

However, late combine now runs after it and triggers RTL simplification based on
the neg.  With -fno-signed-zeros this ends up dropping the - from the -0.0 and
thus producing incorrect code.

This change adds a new unspec FNEG on DI mode which prevents this simplification.
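
For illustration only, the relationship between the sign-bit mask and -0.0 can be checked in plain C++. The sketch below is not the AArch64 backend code; the helper names bits_of and signbit_mask_via_fneg are hypothetical. It shows that negating +0.0 yields the 64-bit pattern 0x8000000000000000, so a simplification that drops the minus sign would leave an all-zero mask instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Return the raw IEEE-754 bit pattern of a double (hypothetical helper).
std::uint64_t bits_of(double d)
{
  std::uint64_t u;
  static_assert(sizeof u == sizeof d, "double must be 64 bits wide");
  std::memcpy(&u, &d, sizeof u);
  return u;
}

// Build the 64-bit sign-bit mask by negating +0.0 at run time.
// The volatile read keeps the compiler from folding the negation away.
std::uint64_t signbit_mask_via_fneg()
{
  volatile double zero = 0.0;
  return bits_of(-zero);
}

// Document the two bit patterns involved.
void check_signbit_mask()
{
  assert(bits_of(0.0) == 0);                                // +0.0: all bits clear
  assert(signbit_mask_via_fneg() == 0x8000000000000000ULL); // -0.0: only bit 63 set
}
```

The point of the sketch is only that the whole value of the constant lives in the sign of zero, which is exactly what -fno-signed-zeros permits a simplifier to ignore.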

gcc/ChangeLog:

	PR target/116229
	* config/aarch64/aarch64-simd.md (aarch64_fnegv2di2<vczle><vczbe>): New.
	* config/aarch64/aarch64.cc (aarch64_maybe_generate_simd_constant):
	Update call to gen_aarch64_fnegv2di2.
	* config/aarch64/iterators.md: New UNSPEC_FNEG.

gcc/testsuite/ChangeLog:

	PR target/116229
	* gcc.target/aarch64/pr116229.c: New test.

2c24e056

AVR: target/116295 - Fix unrecognizable insn with __flash read. · c4d3dba2

Georg-Johann Lay authored 7 months ago

Some loads from non-generic address-spaces are performed by
libgcc calls, and they don't have a POST_INC form.  Don't consider
such insns when running -mfuse-add.

     PR target/116295
gcc/
	* config/avr/avr.cc (Mem_Insn::Mem_Insn): Don't consider MEMs
	that are avr_mem_memx_p or avr_load_libgcc_p.

gcc/testsuite/
	* gcc.target/avr/torture/pr116295.c: New test.

c4d3dba2

AVR: Fix a typo in __builtin_avr_mask1 documentation. · f6a41ebb
Georg-Johann Lay authored 7 months ago
```
gcc/
	* doc/extend.texi (AVR Built-in Functions) <mask1>: Fix a typo.
```
f6a41ebb

AVR: Improve POST_INC output in some rare cases. · ef697f83

Georg-Johann Lay authored 7 months ago

gcc/
	* config/avr/avr.cc (avr_insn_has_reg_unused_note_p): New function.
	(_reg_unused_after): Use it to recognize more cases.
	(avr_out_lpm_no_lpmx) [POST_INC]: Use reg_unused_after.

ef697f83

amdgcn: Fix VGPR max count · 71531733

Andrew Stubbs authored 7 months ago

The metadata for RDNA3 kernels allocates VGPRs in blocks of 12, which means the
maximum usable number of registers is 252.  This patch prevents the compiler
from exceeding this artificial limit.
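
The arithmetic behind the 252 figure can be sketched as follows. This is an illustration, not the code in gcn.cc: the function name is hypothetical, and the total of 256 architectural VGPRs is an assumption that the commit message does not state.

```cpp
#include <cassert>

// Largest register count that is a whole number of allocation blocks.
// `total` = 256 is an assumed architectural VGPR count (not from the commit);
// `granularity` = 12 is the RDNA3 block size named in the commit message.
unsigned max_usable_vgprs(unsigned total, unsigned granularity)
{
  assert(granularity != 0);
  return (total / granularity) * granularity;  // round down to a full block
}
```

Under those assumptions, max_usable_vgprs(256, 12) rounds 256 down to 21 full blocks of 12, giving 252.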

gcc/ChangeLog:

	* config/gcn/gcn.cc (gcn_conditional_register_usage): Fix registers
	remaining after maximum allocation using TARGET_VGPR_GRANULARITY.

71531733

libgomp.texi: Update implementation status table for OpenMP TR13 · 89d2f3fe

Tobias Burnus authored 7 months ago

libgomp/ChangeLog:

	* libgomp.texi (OpenMP Technical Report 13): Renamed from
	'OpenMP Technical Report 12'; updated for TR13 changes.

89d2f3fe

ada: Missing legality check when type completed · fc49ee59

Steve Baird authored 7 months ago

An access discriminant is allowed to have a default value only if the
discriminated type is immutably limited. In the case of a discriminated
limited private type declaration, this rule needs to be checked when
the completion of the type is seen.

gcc/ada/

	* sem_ch6.adb (Check_Discriminant_Conformance): Perform check for
	illegal access discriminant default values when the completion of
	a limited private type is analyzed.
	* sem_aux.adb (Is_Immutably_Limited): If passed the
	not-yet-analyzed entity for the full view of a record type, test
	the Limited_Present flag
	(which is set by the parser).

fc49ee59

ada: Etype missing for raise expression · 480819c9

Steve Baird authored 7 months ago

If the primitive equality operator of the component type of an array type is
abstract, then a call to that abstract function raises Program_Error (when
such a call is legal). The FE generates a raise expression to implement this.
That raise expression is an expression so it should have a valid Etype.

gcc/ada/

	* exp_ch4.adb (Build_Eq_Call): In the abstract callee case, copy
	the Etype of the callee onto the Make_Raise_Program_Error result.

480819c9

ada: Run-time error with GNAT-LLVM on container aggregate with finalization · 85f2ffd8

Gary Dismukes authored 7 months ago

When unnesting is enabled, the compiler was failing to copy the At_End_Proc
field from a block statement to the procedure created to replace it when
unnesting of top-level blocks is done.  At run time this could lead to
exceptions due to missing finalization calls.

gcc/ada/

	* exp_ch7.adb (Unnest_Block): Copy the At_End_Proc from the block
	statement to the newly created subprogram body.

85f2ffd8

ada: Further refinements to mutably tagged types · 352d1478

Justin Squirek authored 7 months ago

This patch further enhances the mutably tagged type implementation by fixing
several oversights relating to generic instantiations, attributes, and
type conversions.

gcc/ada/

	* exp_put_image.adb (Append_Component_Attr): Obtain the mutably
	tagged type for the component type.
	* mutably_tagged.adb (Make_Mutably_Tagged_Conversion): Add more
	cases to avoid conversion generation.
	* sem_attr.adb (Check_Put_Image_Attribute): Add mutably tagged
	type conversion.
	* sem_ch12.adb (Analyze_One_Association): Add rewrite for formal
	type declarations which are mutably tagged type to their
	equivalent type.
	(Instantiate_Type): Add condition to obtain class wide equivalent
	types.
	(Validate_Private_Type_Instance): Add check for class wide
	equivalent types which are considered "definite".
	* sem_util.adb (Is_Variable): Add condition to handle selected
	components of view conversions. Add missing check for selected
	components.
	(Is_View_Conversion): Add condition to handle class wide
	equivalent types.

352d1478

ada: Spurious maximum nesting level warnings · c5420753

Justin Squirek authored 7 months ago

This patch fixes an issue in the compiler whereby disabling style checks via
pragma Style_Checks ("-L") resulted in the minimum nesting level being zero
but the style still being enabled - leading to spurious maximum nesting level
exceeded warnings.

gcc/ada/

	* stylesw.adb (Set_Style_Check_Options): Disable max nesting level
	when unspecified

c5420753

ada: Finalization_Size raises Constraint_Error · 90b3826d

Javier Miranda authored 7 months ago

When the attribute Finalization_Size is applied to an interface type
object, the compiler-generated code fails at runtime, raising a
Constraint_Error exception.

gcc/ada/

	* exp_attr.adb (Expand_N_Attribute_Reference) <Finalization_Size>:
	If the prefix is an interface type, generate code to obtain its
	address and displace it to reference the base of the object.

90b3826d

RISC-V: rv32/DF: Prevent 2 SImode loads using XTheadMemIdx · 33aca37e

Christoph Müllner authored 7 months ago


When enabling XTheadFmv/Zfa and XThead(F)MemIdx, we might end up
with the following insn (registers are examples, but of correct class):

(set (reg:DF a4)
     (mem:DF (plus:SI (mult:SI (reg:SI a0)
			       (const_int 8))
		      (reg:SI a5))))

This is a result of an attempt to load the DF register via two SI
register loads followed by XTheadFmv/Zfa instructions to move the
contents of the two SI registers into the DF register.

The two loads are generated in riscv_split_doubleword_move(),
where the second load adds an offset of 4 to load address.
While this works fine for RVI loads, this can't be handled
for XTheadMemIdx addresses.  Coming back to the example above,
we would end up with the following insn, which can't be simplified
or matched:

(set (reg:SI a4)
     (mem:SI (plus:SI (plus:SI (mult:SI (reg:SI a0)
					(const_int 8))
			       (reg:SI a5))
		      (const_int 4))))

This triggered an ICE in the past, which was resolved in b79cd204,
which also added the test xtheadfmemidx-medany.c, where the examples
are from.  The patch postponed the optimization insn_and_split pattern
for XThead(F)MemIdx, so that the situation could effectively be avoided.

Since we don't want to rely on these optimization patterns in the future,
we need a different solution.  Therefore, this patch restricts the
movdf_hardfloat_rv32 insn to not match for split-double-word-moves
with XThead(F)MemIdx operands.  This ensures we don't need to split
them up later.

When looking at the code generation of the test file, we can see that
we have fewer GP<->FP conversions, but cannot use the indexed loads.
The new sequence is identical to rv32gc_xtheadfmv (similar to rv32gc_zfa).

Old:
[...]
	lla     a5,.LANCHOR0
	th.flrd fa5,a5,a0,3
	fmv.x.w a4,fa5
	th.fmv.x.hw     a5,fa5
.L1:
	fmv.w.x fa0,a4
	th.fmv.hw.x     fa0,a5
	ret
[...]

New:
[...]
	lla     a5,.LANCHOR0
	slli    a4,a0,3
	add     a4,a4,a5
	lw      a5,4(a4)
	lw      a4,0(a4)
.L1:
	fmv.w.x fa0,a4
	th.fmv.hw.x     fa0,a5
	ret
[...]
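
As a rough C++ model of the new sequence (slli, add, then two lw at offsets 0 and 4), the sketch below loads one double as two 32-bit words and recombines them. It is illustrative only: the function name is hypothetical, and it assumes a little-endian host so that the word at offset 0 is the low half, as on RISC-V.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Load table[index] as two 32-bit words and recombine them into a double.
// Assumes a little-endian layout (low word at the lower address).
double load_double_as_two_words(const double *table, std::size_t index)
{
  // slli + add: base + (index << 3)
  const unsigned char *addr =
      reinterpret_cast<const unsigned char *>(table) + (index << 3);

  std::uint32_t lo, hi;
  std::memcpy(&lo, addr, 4);      // lw at offset 0
  std::memcpy(&hi, addr + 4, 4);  // lw at offset 4

  std::uint64_t bits = (static_cast<std::uint64_t>(hi) << 32) | lo;
  double d;
  std::memcpy(&d, &bits, sizeof d);
  assert(sizeof d == 8);
  return d;
}
```

The second load needs base + index*8 + 4, which is the address form that an indexed XTheadMemIdx access cannot express; computing the scaled address once and using plain offsets avoids the problem.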

This was tested (together with the patch that eliminates the
XTheadMemIdx optimization patterns) with SPEC CPU 2017 intrate
on QEMU (RV64/lp64d).

gcc/ChangeLog:

	* config/riscv/constraints.md (th_m_noi): New constraint.
	* config/riscv/riscv.md: Adjust movdf_hardfloat_rv32 for
	XTheadMemIdx.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/xtheadfmemidx-xtheadfmv-medany.c: Adjust.
	* gcc.target/riscv/xtheadfmemidx-zfa-medany.c: Likewise.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

33aca37e

RISC-V: xthead(f)memidx: Eliminate optimization patterns · 31c3c5d1

Christoph Müllner authored 7 months ago


We have a huge number of optimization patterns (insn_and_split) for
XTheadMemIdx and XTheadFMemIdx that attempt to do something that generic
GCC passes can do more efficiently if we have proper support code.

A key function in eliminating the optimization patterns is
th_memidx_classify_address_index(), which needs to identify each possible
memory expression that can be lowered into a XTheadMemIdx/XTheadFMemIdx
instruction.  This patch adds all memory expressions that were
previously only recognized by the optimization patterns.

Now that the address classification is complete, we can finally remove
all optimization patterns, with the side effect of getting rid of the
non-canonical memory expression they produced: (plus (reg) (ashift (reg) (imm))).

A positive side effect of this change is that we address an RV32 ICE
that was caused by the th_memidx_I_c pattern, which did not properly
handle SUBREGs (more details are in PR116131).

A temporary negative side effect of this change is that we cause a
regression of the xtheadfmemidx + xtheadfmv/zfa tests (initially
introduced as part of b79cd204 to address an ICE).
As this issue cannot be addressed in the code parts that are
adjusted in this patch, we just accept the regression for now.

	PR target/116131

gcc/ChangeLog:

	* config/riscv/thead.cc (th_memidx_classify_address_index):
	Recognize all possible XTheadMemIdx memory operand structures.
	(th_fmemidx_output_index): Do strict classification.
	* config/riscv/thead.md (*th_memidx_operand): Remove.
	(TARGET_XTHEADMEMIDX): Likewise.
	(TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX): Likewise.
	(!TARGET_64BIT && TARGET_XTHEADMEMIDX): Likewise.
	(*th_memidx_I_a): Likewise.
	(*th_memidx_I_b): Likewise.
	(*th_memidx_I_c): Likewise.
	(*th_memidx_US_a): Likewise.
	(*th_memidx_US_b): Likewise.
	(*th_memidx_US_c): Likewise.
	(*th_memidx_UZ_a): Likewise.
	(*th_memidx_UZ_b): Likewise.
	(*th_memidx_UZ_c): Likewise.
	(*th_fmemidx_movsf_hardfloat): Likewise.
	(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
	(*th_fmemidx_I_a): Likewise.
	(*th_fmemidx_I_c): Likewise.
	(*th_fmemidx_US_a): Likewise.
	(*th_fmemidx_US_c): Likewise.
	(*th_fmemidx_UZ_a): Likewise.
	(*th_fmemidx_UZ_c): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr116131.c: New test.

Reported-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

31c3c5d1

RISC-V: testsuite: xtheadfmemidx: Rename test and add similar Zfa test · 8e6bc6dd

Christoph Müllner authored 7 months ago


Test file xtheadfmemidx-medany.c has been added in b79cd204 as a
test case that provoked an ICE when loading DFmode registers via two
SImode register loads followed by a SI->DF[63:32] move from XTheadFmv.
Since Zfa is affected in the same way as XTheadFmv, even if both
have slightly different instructions, let's add a test for Zfa as well
and give the tests proper names.

Let's also add a test into the test files that counts the SI->DF moves
from XTheadFmv/Zfa.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/xtheadfmemidx-medany.c: Move to...
	* gcc.target/riscv/xtheadfmemidx-xtheadfmv-medany.c: ...here.
	* gcc.target/riscv/xtheadfmemidx-zfa-medany.c: New test.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

8e6bc6dd

vect: Small C++11-ification of vect_vect_recog_func_ptrs · ad7d4843

Andrew Pinski authored 7 months ago


This is a small C++11-ification for the use of vect_vect_recog_func_ptrs.
It changes the loop into a range-based loop, which lets us remove the
definition of NUM_PATTERNS.  It also uses a const reference instead of a pointer.
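
The shape of this kind of change can be sketched on a toy table. The types and names below (recog_entry, entries, run_indexed, run_ranged) are hypothetical stand-ins, not the contents of tree-vect-patterns.cc; the sketch only contrasts an index loop over a counted array with a range-based loop over the same array.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a table of recognizer functions.
struct recog_entry
{
  int (*fn) (int);
  const char *name;
};

int add_one (int x) { return x + 1; }
int twice (int x) { return x * 2; }

const recog_entry entries[] = { { add_one, "add_one" }, { twice, "twice" } };

// Before: explicit element count, index loop, pointer to the entry.
int run_indexed (int x)
{
  const std::size_t num_entries = sizeof entries / sizeof entries[0];
  int sum = 0;
  for (std::size_t i = 0; i < num_entries; i++)
    {
      const recog_entry *e = &entries[i];
      sum += e->fn (x);
    }
  return sum;
}

// After: range-based loop, const reference, no separate count.
int run_ranged (int x)
{
  int sum = 0;
  for (const recog_entry &e : entries)
    sum += e.fn (x);
  return sum;
}

// Both forms visit the same entries in the same order.
void check_loops_agree ()
{
  assert (run_indexed (3) == run_ranged (3));
}
```

Because the range-based loop derives the bounds from the array type, the separate count can no longer drift out of sync with the table.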

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

	* tree-vect-patterns.cc (NUM_PATTERNS): Delete.
	(vect_pattern_recog_1): Constify and change
	recog_func to a reference.
	(vect_pattern_recog): Use range-based loop over
	vect_vect_recog_func_ptrs.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

ad7d4843

RISC-V: Delete duplicate '#define RISCV_DWARF_VLENB' · ecdf7a4e
Jin Ma authored 7 months ago
```
gcc/ChangeLog:

	* config/riscv/riscv.h (RISCV_DWARF_VLENB): Delete.
```
ecdf7a4e

amdgcn: Re-enable trampolines · 6f71e050

Andrew Stubbs authored 7 months ago

The stacks are executable since the reverse-offload features were added, so
trampolines actually do work.

gcc/ChangeLog:

	* config/gcn/gcn.cc (gcn_trampoline_init): Re-enable trampolines.

6f71e050

[RISC-V][PR target/116240] Ensure object is a comparison before extracting arguments · 190ad812

Jeff Law authored 7 months ago

This was supposed to go out the door yesterday, but I kept getting interrupted.

The target bits for rtx costing can't assume the rtl they're given actually
matches a target pattern.   It's just kind of inherent in how the costing
routines get called in various places.

In this particular case we're trying to cost a conditional move:

(set (dest) (if_then_else (cond) (true) (false)))

On the RISC-V port the backend only allows actual conditionals for COND.  So
something like (eq (reg) (const_int 0)).  In the costing code for if-then-else
we did something like

XEXP (XEXP (cond, 0), 0)

which fails miserably if COND is a terminal node like (reg) rather than (ne
(reg) (const_int 0)).

So this patch tightens up the RTL scanning to ensure that we have a comparison
before we start looking at the comparison's arguments.
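
The guard described above can be sketched on a toy expression tree. This is not riscv_rtx_costs and does not use GCC's rtx API; the node type, the code enum, and both helper functions are hypothetical and exist only to show the "check the code before touching operands" pattern.

```cpp
#include <cassert>

// Hypothetical miniature of an RTL-like node.
enum class code { reg, const_int, eq, ne };

struct node
{
  code c;
  const node *op0;  // null for terminal nodes
  const node *op1;  // null for terminal nodes
};

bool is_comparison (const node &n)
{
  return n.c == code::eq || n.c == code::ne;
}

// Return the first operand of a comparison, or nullptr when the
// condition is not a comparison (e.g. a bare register).
const node *first_compare_operand (const node &cond)
{
  if (!is_comparison (cond))
    return nullptr;  // do not look inside a terminal node
  assert (cond.op0 != nullptr);
  return cond.op0;
}
```

With a bare (reg) as the condition, the helper returns nullptr instead of dereferencing operands that do not exist.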

Run through my tester without incident, but I'll wait for the pre-commit tester
to run through a cycle before pushing to the trunk.

Jeff

ps.   We probably could support a naked REG for the condition and internally convert it to (ne (reg) (const_int 0)), but I don't think it likely happens with any regularity.

	PR target/116240
gcc/
	* config/riscv/riscv.cc (riscv_rtx_costs): Ensure object is a
	comparison before looking at its arguments.

gcc/testsuite
	* gcc.target/riscv/pr116240.c: New test.

190ad812

Rearrange SLP nodes with duplicate statements [PR98138] · ab187858

Manolis Tsamis authored 8 months ago

This change checks when a two_operators SLP node has multiple occurrences of
the same statement (e.g. {A, B, A, B, ...}) and tries to rearrange the operands
so that there are no duplicates. Two vec_perm expressions are then introduced
to recreate the original ordering. These duplicates can appear due to how
two_operators nodes are handled, and they prevent vectorization in some cases.
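
As a loose illustration of the permutation idea only, and not of the vectorizer's actual algorithm, the sketch below removes duplicate lanes from a sequence such as {A, B, A, B} and records the index vector that recreates the original order. Statements are modelled as plain integers, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct dedup_result
{
  std::vector<int> unique;         // each statement once, in first-seen order
  std::vector<std::size_t> perm;   // perm[i] = index into unique for lane i
};

// Collapse repeated lanes and remember how to rebuild the original order.
dedup_result dedup_lanes (const std::vector<int> &lanes)
{
  dedup_result r;
  for (int stmt : lanes)
    {
      std::size_t idx = 0;
      while (idx < r.unique.size () && r.unique[idx] != stmt)
        ++idx;
      if (idx == r.unique.size ())
        r.unique.push_back (stmt);
      r.perm.push_back (idx);
    }
  return r;
}

// Apply the recorded permutation, analogous in spirit to a vec_perm.
std::vector<int> apply_perm (const dedup_result &r)
{
  std::vector<int> out;
  for (std::size_t i : r.perm)
    {
      assert (i < r.unique.size ());
      out.push_back (r.unique[i]);
    }
  return out;
}
```

For lanes {1, 2, 1, 2} this yields unique = {1, 2} and perm = {0, 1, 0, 1}, and applying the permutation gives back the original sequence.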

This targets the vectorization of the SPEC2017 x264 pixel_satd functions.
On some processors, an improvement of more than 10% on x264 has been observed.

	PR tree-optimization/98138

gcc/ChangeLog:

	* tree-vect-slp.cc: Avoid duplicates in two_operators nodes.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-slp-two-operator.c: New test.

ab187858

c++: Propagate TREE_ADDRESSABLE in fixup_type_variants [PR115062] · 71aebb36

Nathaniel Shead authored 7 months ago


This has caused issues with modules when an import fills in the
definition of a type already created with a typedef.

	PR c++/115062

gcc/cp/ChangeLog:

	* class.cc (fixup_type_variants): Propagate TREE_ADDRESSABLE.
	(finish_struct_bits): Cleanup now that TREE_ADDRESSABLE is
	propagated by fixup_type_variants.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/pr115062_a.H: New test.
	* g++.dg/modules/pr115062_b.H: New test.
	* g++.dg/modules/pr115062_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

71aebb36

c++/modules: Assume header bindings are global module · 0de1481a

Nathaniel Shead authored 7 months ago


While stepping through some code I noticed that we do some extra work
(finding the originating module decl, stripping the template, and
inspecting the attached-ness) for every declaration taken from a header
unit.  This doesn't seem necessary though since no declaration in a
header unit can be attached to anything but the global module, so we can
just assume that global_p will be true.

This was the original behaviour before I removed this assumption while
refactoring for r15-2807-gc592310d5275e0.

gcc/cp/ChangeLog:

	* module.cc (module_state::read_cluster): Assume header module
	declarations will require GM merging.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

0de1481a

libgomp/libgomp.texi: Mention -fno-builtin-omp_is_initial_device · 8b5a8b1f

Tobias Burnus authored 7 months ago

libgomp/ChangeLog:

	* libgomp.texi (omp_is_initial_device): Mention
	-fno-builtin-omp_is_initial_device and folding by default.

8b5a8b1f

i386: Tweak ix86_mode_can_transfer_bits to restore bootstrap on RHEL. · 4d44f3fc

Roger Sayle authored 7 months ago

This minor patch, very similar to one posted and approved previously at
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657229.html is
required to restore builds on systems using gcc 4.8 as a host compiler.
Using the enumeration constants E_SFmode and E_DFmode avoids issues with
SFmode and DFmode being "non-literal types in constant expressions".

2024-08-08  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.cc (ix86_mode_can_transfer_bits): Use E_?Fmode
	enumeration constants in switch statement.

4d44f3fc

c++, libstdc++: Implement C++26 P2747R2 - constexpr placement new [PR115744] · afa3a4a5

Jakub Jelinek authored 7 months ago

With the PR115754 fix in, constexpr placement new mostly just works,
so this patch just adds the constexpr keyword to the placement new operators
in <new>, and adds FTMs and testsuite coverage.

There is one accepts-invalid though, the
new (p + 1) int[]{2, 3};      // error (in this paper)
case from the paper.  Can we handle that incrementally?
The problem with that is I think calling operator new now that it is
constexpr should be fine even in that case in constant expressions, so
int *p = std::allocator<int>{}.allocate(3);
int *q = operator new[] (sizeof (int) * 2, p + 1);
should be ok, so it can't be easily the placement new operator call
itself on whose constexpr evaluation we try something special, it should
be on the new expression, but constexpr.cc actually sees only
<<< Unknown tree: expr_stmt
  (void) (TARGET_EXPR <D.2640, (void *) TARGET_EXPR <D.2641, VIEW_CONVERT_EXPR<int *>(b) + 4>>, TARGET_EXPR <D.2642, operator new [] (8, NON_LVALUE_EXPR <D.2640>)>,   int * D.2643;
  <<< Unknown tree: expr_stmt
    (void) (D.2643 = (int *) D.2642) >>>;
and that is just fine by the preexisting constexpr evaluation rules.

Should build_new_1 emit some extra cast for the array cases with placement
new in maybe_constexpr_fn (current_function_decl) that the existing P2738
code would catch?

2024-08-08  Jakub Jelinek  <jakub@redhat.com>

	PR c++/115744
gcc/c-family/
	* c-cppbuiltin.cc (c_cpp_builtins): Change __cpp_constexpr
	from 202306L to 202406L for C++26.
gcc/testsuite/
	* g++.dg/cpp2a/construct_at.h (operator new, operator new[]):
	Use constexpr instead of inline if __cpp_constexpr >= 202406L.
	* g++.dg/cpp26/constexpr-new1.C: New test.
	* g++.dg/cpp26/constexpr-new2.C: New test.
	* g++.dg/cpp26/constexpr-new3.C: New test.
	* g++.dg/cpp26/feat-cxx26.C (__cpp_constexpr): Adjust expected
	value.
libstdc++-v3/
	* libsupc++/new (__glibcxx_want_constexpr_new): Define before
	including bits/version.h.
	(_GLIBCXX_PLACEMENT_CONSTEXPR): Define.
	(operator new, operator new[]): Use it for placement new instead
	of inline.
	* include/bits/version.def (constexpr_new): New FTM.
	* include/bits/version.h: Regenerate.

afa3a4a5

libgomp.c++/static-aggr-constructor-destructor-{1,2}.C: Fix scan-tree-dump · e3a6dec3

Tobias Burnus authored 7 months ago

In principle, the optimized dump should be the same on the host, but as
'nohost' is not handled, it is present. However, when ENABLE_OFFLOADING is
false, it is handled early enough to remove the function.

libgomp/ChangeLog:

	* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: Split
	scan-tree-dump into with and without target offload_target_any.
	* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C:
	Likewise.

e3a6dec3

Ada, libgnarl: Fix s-taprop__posix.adb compilation. · 6db71509

Iain Sandoe authored 7 months ago


Bootstrap on Darwin, and likely any other targets using the POSIX
implementation of s-taprop, was broken by commits between r15-2743
and r15-2747:
s-taprop.adb:297:15: error: "size_t" is not visible
s-taprop.adb:297:15: error: multiple use clauses cause hiding
s-taprop.adb:297:15: error: hidden declaration at s-osinte.ads:58
s-taprop.adb:297:15: error: hidden declaration at i-c.ads:9

This seems to be caused by an omitted change to use Interfaces.C.size_t
instead of just size_t.  Fixed thus.

gcc/ada/ChangeLog:

	* libgnarl/s-taprop__posix.adb (Stack_Guard): Use Interfaces.C.size_t
	for the type of Page_Size.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

6db71509

ada: Fix s-taprop__solaris.adb compilation · 82ed4d51

Rainer Orth authored 7 months ago

Solaris Ada bootstrap is broken as of 2024-08-06 with

s-taprop.adb:1971:23: error: "int" is not visible
s-taprop.adb:1971:23: error: multiple use clauses cause hiding
s-taprop.adb:1971:23: error: hidden declaration at s-osinte.ads:51
s-taprop.adb:1971:23: error: hidden declaration at i-c.ads:62

because one instance of int isn't qualified.  This patch fixes this.

Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.

2024-08-07  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/ada:
	* libgnarl/s-taprop__solaris.adb (Set_Task_Affinity): Fully
	qualify int.

82ed4d51

tree-optimization/116258 - fix i386 testcase · 5aa4cd91

Richard Biener authored 7 months ago

With -march=cascadelake we use vpermilps instead of shufps.

	PR tree-optimization/116258
	* gcc.target/i386/pr116258.c: Also allow vpermilps.

5aa4cd91

lra: emit caller-save register spills before call insn [PR116028] · 3c67a0fa

Surya Kumari Jangala authored 1 year ago

LRA emits insns to save caller-save registers in the
inheritance/splitting pass. In this pass, LRA builds EBBs (Extended
Basic Block) and traverses the insns in the EBBs in reverse order from
the last insn to the first insn. When LRA sees a write to a pseudo (that
has been assigned a caller-save register), and there is a read following
the write, with an intervening call insn between the write and read,
then LRA generates a spill immediately after the write and a restore
immediately before the read. The spill is needed because the call insn
will clobber the caller-save register.

If there is a write insn and a call insn in two separate BBs but
belonging to the same EBB, the spill insn gets generated in the BB
containing the write insn. If the write insn is in the entry BB, then
the spill insn that is generated in the entry BB prevents shrink wrap
from happening. This is because the spill insn references the stack
pointer and hence the prolog gets generated in the entry BB itself.

This patch ensures that the spill insn is generated before the call insn
instead of after the write. This also ensures that the spill occurs
only in the path containing the call.

2024-08-01  Surya Kumari Jangala  <jskumari@linux.ibm.com>

gcc:
	PR rtl-optimization/116028
	* lra-constraints.cc (split_reg): Spill register before call
	insn.
	(latest_call_insn): New variable.
	(inherit_in_ebb): Track the latest call insn.

gcc/testsuite:
	PR rtl-optimization/116028
	* gcc.dg/ira-shrinkwrap-prep-1.c: Remove xfail for powerpc.
	* gcc.dg/pr10474.c: Remove xfail for powerpc.

3c67a0fa

RISC-V: Minimal support for Zimop extension. · c8f3fdd5

Jiawei authored 7 months ago

This patch supports the Zimop and Zcmop extensions[1], enabling GCC to
recognize and process them correctly at compile time.

[1] https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: New extension.
	* config/riscv/riscv.opt: New mask.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/arch-42.c: New test.
	* gcc.target/riscv/arch-43.c: New test.

c8f3fdd5

c++/modules: Handle instantiating already tsubsted template friend classes [PR115801] · 79209273

Nathaniel Shead authored 7 months ago


With modules it may be the case that a template friend class provided
with a qualified name is not found by name lookup at instantiation time,
due to the class not being exported from its module.  This causes issues
in tsubst_friend_class which did not handle this case.

This is caused by the named friend class not actually requiring
tsubsting.  This was already worked around for the "found by name
lookup" case (g++.dg/template/friend5.C), but it looks like there's no
need to do name lookup at all for this particular case to work.

We do need to be careful to continue to do name lookup to handle
templates from an outer current instantiation though; this patch adds a
new testcase for this as well.  This should not impact modules (because
exportingness will only affect namespace lookup).

	PR c++/115801

gcc/cp/ChangeLog:

	* pt.cc (tsubst_friend_class): Return the type immediately when
	no tsubsting or name lookup is required.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/tpl-friend-16_a.C: New test.
	* g++.dg/modules/tpl-friend-16_b.C: New test.
	* g++.dg/template/friend82.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

79209273

c++/modules: Fix merging of GM entities in partitions [PR114950] · c592310d

Nathaniel Shead authored 7 months ago


Currently name lookup generally seems to assume that all entities
declared within a named module (partition) are attached to said module,
which is not true for GM entities (e.g. via extern "C++"), and causes
issues with deduplication.

This patch fixes the issue by ensuring that module attachment of a
declaration is consistently used to handle merging.  Handling this
exposes some issues with deduplicating temploid friends; to resolve this
we always create the BINDING_SLOT_PARTITION slot so that we have
somewhere to place attached names (from any module).

This doesn't yet completely handle issues with allowing otherwise
conflicting temploid friends from different modules to co-exist in the
same module if neither are reachable from the other via name lookup.

	PR c++/114950

gcc/cp/ChangeLog:

	* module.cc (trees_out::decl_value): Stream bit indicating
	imported temploid friends early.
	(trees_in::decl_value): Use this bit with key_mergeable.
	(trees_in::key_mergeable): Allow merging attached declarations
	if they're imported temploid friends (which must be namespace
	scope).
	(module_state::read_cluster): Check for GM entities that may
	require merging even when importing from partitions.
	* name-lookup.cc (enum binding_slots): Adjust comment.
	(get_fixed_binding_slot): Always create partition slot.
	(name_lookup::search_namespace_only): Support binding vectors
	with both partition and GM entities to dedup.
	(walk_module_binding): Likewise.
	(name_lookup::adl_namespace_fns): Likewise.
	(set_module_binding): Likewise.
	(check_module_override): Use attachment of the decl when
	checking overrides rather than named_module_p.
	(lookup_imported_hidden_friend): Use partition slot for finding
	mergeable template bindings.
	* name-lookup.h (set_module_binding): Split mod_glob_flag
	parameter into separate global_p and partition_p params.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/tpl-friend-13_e.C: Adjust error message.
	* g++.dg/modules/ambig-2_a.C: New test.
	* g++.dg/modules/ambig-2_b.C: New test.
	* g++.dg/modules/part-9_a.C: New test.
	* g++.dg/modules/part-9_b.C: New test.
	* g++.dg/modules/part-9_c.C: New test.
	* g++.dg/modules/tpl-friend-15.h: New test.
	* g++.dg/modules/tpl-friend-15_a.C: New test.
	* g++.dg/modules/tpl-friend-15_b.C: New test.
	* g++.dg/modules/tpl-friend-15_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

c592310d

c++/modules: Clarify error message in read_enum_def · c0ad382c

Nathaniel Shead authored 7 months ago


This error message reads to me the wrong way around, particularly in the
context of other errors.  Updated so that the ellipses connect.

gcc/cp/ChangeLog:

	* module.cc (trees_in::read_enum_def): Clarify error.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/enum-bad-1_b.C: Update error message.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

c0ad382c

Daily bump. · ea973bd4
GCC Administrator authored 7 months ago

ea973bd4

Aug 07, 2024

compiler: don't assume that ATTRIBUTE_UNUSED is defined · ac8a87c4
Ian Lance Taylor authored 7 months ago
```
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/604075
```
ac8a87c4

Darwin: Recognise -weak_framework in the driver [PR116237]. · 4cec7bc7

Iain Sandoe authored 7 months ago


Xcode compilers recognise the weak_framework linker option in the driver
and forward it.  This patch makes GCC adopt the same behaviour.

	PR target/116237

gcc/ChangeLog:

	* config/darwin.h (SUBTARGET_DRIVER_SELF_SPECS): Add a spec for
	weak_framework.
	* config/darwin.opt: Handle weak_framework driver option.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

4cec7bc7

c++: erroneous partial spec vs primary tmpl [PR116064] · d1fc9816

Patrick Palka authored 7 months ago


When a partial specialization is deemed erroneous at parse time, we
currently flag the primary template as erroneous instead.  Later
at instantiation time we check if the primary template is erroneous
rather than the selected partial specialization, so at least we're
consistent.

But it's better not to conflate a partial specialization with the
primary template since they're instantiated independently.  This avoids
rejecting the instantiation of A<int> in the below testcase.

	PR c++/116064

gcc/cp/ChangeLog:

	* error.cc (get_current_template): If the current scope is
	a partial specialization, return it instead of the primary
	template.
	* pt.cc (instantiate_class_template): Pass the partial
	specialization if any to maybe_diagnose_erroneous_template
	instead of the primary template.

gcc/testsuite/ChangeLog:

	* g++.dg/template/permissive-error2.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

d1fc9816

Partially support streaming of poly_int for offloading. · 38900247

Prathamesh Kulkarni authored 7 months ago


When offloading is enabled, the patch streams out host
NUM_POLY_INT_COEFFS, and changes streaming in as follows:

if (host_num_poly_int_coeffs <= NUM_POLY_INT_COEFFS)
{
  for (i = 0; i < host_num_poly_int_coeffs; i++)
    poly_int.coeffs[i] = stream_in coeff;
  for (; i < NUM_POLY_INT_COEFFS; i++)
    poly_int.coeffs[i] = 0;
}
else
{
  for (i = 0; i < NUM_POLY_INT_COEFFS; i++)
    poly_int.coeffs[i] = stream_in coeff;

  /* Ensure that degree of poly_int <= accel NUM_POLY_INT_COEFFS.  */
  for (; i < host_num_poly_int_coeffs; i++)
    {
      val = stream_in coeff;
      if (val != 0)
	error ();
    }
}
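
The pseudocode above can be sketched as self-contained C++.  This is an
illustration under stated assumptions, not the code added by the patch:
kAccelNumCoeffs, PolyInt and read_poly_int are hypothetical names, and a
std::vector stands in for the LTO input stream.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the accelerator's NUM_POLY_INT_COEFFS.
constexpr std::size_t kAccelNumCoeffs = 2;

using PolyInt = std::array<std::int64_t, kAccelNumCoeffs>;

// Read a poly_int that the host wrote with host_num_coeffs coefficients.
// `stream` is an assumed stand-in for the input stream: it holds the
// coefficients in the order the host streamed them out.
PolyInt
read_poly_int (const std::vector<std::int64_t> &stream,
               std::size_t host_num_coeffs)
{
  if (stream.size () < host_num_coeffs)
    throw std::invalid_argument ("truncated coefficient stream");

  PolyInt result{};  // value-initialized: all coefficients start at 0
  std::size_t i = 0;
  if (host_num_coeffs <= kAccelNumCoeffs)
    {
      // The host has no more coefficients than the accelerator: copy them;
      // the remaining accelerator coefficients stay zero.
      for (; i < host_num_coeffs; i++)
        result[i] = stream[i];
    }
  else
    {
      for (; i < kAccelNumCoeffs; i++)
        result[i] = stream[i];

      // The extra host coefficients must all be zero, otherwise the value
      // cannot be represented on the accelerator.
      for (; i < host_num_coeffs; i++)
        if (stream[i] != 0)
          throw std::runtime_error ("poly_int degree exceeds accelerator");
    }
  return result;
}
```

With kAccelNumCoeffs == 2, reading {5} from a one-coefficient host gives
{5, 0}; reading {5, 7, 0} from a three-coefficient host gives {5, 7}; and
reading {5, 7, 1} is rejected.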

gcc/ChangeLog:
	PR ipa/96265
	PR ipa/111937
	* data-streamer-in.cc (streamer_read_poly_uint64): Remove code for
	streaming, and call poly_int_read_common instead.
	(streamer_read_poly_int64): Likewise.
	* data-streamer.cc (host_num_poly_int_coeffs): Conditionally define
	new variable if ACCEL_COMPILER is defined.
	* data-streamer.h (host_num_poly_int_coeffs): Declare.
	(poly_int_read_common): New function template.
	(bp_unpack_poly_value): Remove code for streaming and call
	poly_int_read_common instead.
	* lto-streamer-in.cc (lto_input_mode_table): Stream-in host
	NUM_POLY_INT_COEFFS into host_num_poly_int_coeffs if ACCEL_COMPILER
	is defined.
	* lto-streamer-out.cc (lto_write_mode_table): Stream out
	NUM_POLY_INT_COEFFS if offloading is enabled.
	* poly-int.h (MAX_NUM_POLY_INT_COEFFS_BITS): New macro.
	* tree-streamer-in.cc (lto_input_ts_poly_tree_pointers): Adjust
	streaming-in of poly_int.

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>

38900247

Don't call clean_symbol_name in create_tmp_var_name [PR116219] · 165e3e7c

Jakub Jelinek authored 7 months ago

SRA adds fancy names like offset$D94316$_M_impl$D93629$_M_start
where the numbers in there are DECL_UIDs if there are unnamed
FIELD_DECLs etc.
Because -g0 vs. -g can change the exact DECL_UID values (adding bigger
gaps between them, although corresponding decls should still be ordered
the same by DECL_UID), we make sure such decls have DECL_NAMELESS set
and, depending on the exact options, either don't dump such names at all
or have dump_fancy_name sanitize the D123456$ parts in them to Dxxxx$.
Unfortunately, in tons of places we then use get_name to grab either user
names or these SRA-created names and pass that as the argument to
create_tmp_var{,_name,_raw} to base other artificial temporary names on
it.  Those are DECL_NAMELESS too, but unfortunately create_tmp_var_name,
starting with
https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=725494f6e4121eace43b7db1202f8ecbf52a8276
calls clean_symbol_name, which replaces the $s in there with _s, and thus
dump_fancy_name doesn't sanitize them anymore.
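
The interaction can be shown with a simplified model.  The two functions
below are hypothetical stand-ins written for this illustration, not GCC's
dump_fancy_name or clean_symbol_name: the first rewrites a D<digits> part
that sits between $ separators (or at either end of the name) to Dxxxx,
the second replaces every $ with _.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Simplified model of the sanitizing step (assumed behaviour): replace
// "D<digits>" with "Dxxxx" when it starts the name or follows '$', and is
// followed by '$' or the end of the name.
std::string
sanitize_fancy_name (const std::string &name)
{
  std::string out;
  std::size_t i = 0;
  while (i < name.size ())
    {
      bool at_boundary = (i == 0 || name[i - 1] == '$');
      if (name[i] == 'D' && at_boundary)
        {
          std::size_t j = i + 1;
          while (j < name.size ()
                 && std::isdigit (static_cast<unsigned char> (name[j])))
            j++;
          if (j > i + 1 && (j == name.size () || name[j] == '$'))
            {
              out += "Dxxxx";  // hide the DECL_UID digits
              i = j;
              continue;
            }
        }
      out += name[i++];
    }
  return out;
}

// Simplified model of the cleaning step (assumed behaviour): '$' -> '_'.
std::string
replace_dollars (std::string name)
{
  for (char &c : name)
    if (c == '$')
      c = '_';
  return name;
}
```

In this model, sanitizing offset$D94316$_M_impl$D93629$_M_start hides both
UIDs, but once the $s have been replaced the D94316 and D93629 parts are
no longer delimited by $ and pass through unchanged.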

I don't see any discussion of that commit (originally to TM branch, later
merged) on the mailing list, but from the
   DECL_NAME (new_decl)
     = create_tmp_var_name (IDENTIFIER_POINTER (DECL_NAME (old_decl)));
-  SET_DECL_ASSEMBLER_NAME (new_decl, NULL_TREE);
+  SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl));
snippet elsewhere in that commit it seems create_tmp_var_name was also used
at that point to determine function names of clones, so presumably the
clean_symbol_name call was there to ensure the symbol could be emitted
into assembly, maybe in case DECL_NAME is something like a C++ operator or
anything else that could contain undesirable characters.

Anyway, we haven't done that for years.  GCC 4.5 already uses
clone_function_name for such purposes; it starts from the
DECL_ASSEMBLER_NAME of the old function and appends a separator (chosen
from the supported symbol suffix separators) plus some suffix and/or
number, so that part doesn't go through create_tmp_var_name.

I don't see problems with having $, . and similar characters in names
intended just to make dumps more readable; after all, we are already using
those in the SRA-created names.  Those names shouldn't make it into the
assembly in any way, neither as debug info nor as assembly labels.

There is one theoretical case: the gimplifier promotes automatic vars into
TREE_STATIC ones, which can then appear in assembly, and that could happen
to a var with e.g. an SRA-created name that is regimplified later.
Because no cases of promotion of DECL_NAMELESS vars to static were observed
in {x86_64,i686,powerpc64le}-linux bootstraps/regtests, the code simply
uses C.NNN names for DECL_NAMELESS vars, as it does for !DECL_NAME vars.

Richi mentioned on IRC that the non-cleaned-up names might make it harder
to feed dumps back to the GIMPLE FE, but if so, I think it should be the
dumping for GIMPLE FE purposes that cleans those up (at that point it
should also verify that such cleaned-up names don't collide with others
and somehow deal with any that do).

2024-08-07  Jakub Jelinek  <jakub@redhat.com>

	PR c++/116219
	* gimple-expr.cc (remove_suffix): Formatting fixes.
	(create_tmp_var_name): Don't call clean_symbol_name.
	* gimplify.cc (gimplify_init_constructor): When promoting automatic
	DECL_NAMELESS vars to static, don't preserve their DECL_NAME.

165e3e7c