Commits · c9353e0fcd0ddc0d48ae8a2b0518f0f82670d708 · COBOLworx / gcc-cobol

Jan 10, 2025

libstdc++: Fix unused parameter warnings in <bits/atomic_futex.h> · c9353e0f

Jonathan Wakely authored 2 months ago

This fixes warnings like the following during bootstrap:

sparc-sun-solaris2.11/libstdc++-v3/include/bits/atomic_futex.h:324:53: warning: unused parameter ‘__mo’ [-Wunused-parameter]
  324 |     _M_load_when_equal(unsigned __val, memory_order __mo)
      |                                        ~~~~~~~~~~~~~^~~~

libstdc++-v3/ChangeLog:

	* include/bits/atomic_futex.h (__atomic_futex_unsigned): Remove
	names of unused parameters in non-futex implementation.
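
The fix follows the usual C++ idiom: the parameter stays in the signature but loses its name. A minimal sketch, in which `load_when_equal` and `MemoryOrder` are hypothetical stand-ins and not the libstdc++ declarations:

```cpp
#include <cassert>

// Hypothetical stand-in for std::memory_order; not the libstdc++ type.
enum class MemoryOrder { relaxed, acquire };

// Fallback (non-futex) path: the ordering argument is accepted so that the
// signature matches the futex path, but it is never read.  Leaving the
// parameter unnamed keeps -Wunused-parameter quiet.
inline unsigned load_when_equal(unsigned val, MemoryOrder /* unused */)
{
  return val;
}
```

Callers are unaffected, since the parameter type and position are unchanged.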

c9353e0f

c++: add fixed test [PR118391] · d2017159

Marek Polacek authored 2 months ago

Fixed by r15-6740.

	PR c++/118391

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/lambda-uneval20.C: New test.

d2017159

libatomic: Cleanup AArch64 ifunc selection · 81bcf412

Wilco Dijkstra authored 2 months ago

Simplify and cleanup ifunc selection logic.  Since LRCPC3 does
not imply LSE2, has_rcpc3() should also check LSE2 is enabled.

Passes regress and bootstrap, OK for commit?

libatomic:
	* config/linux/aarch64/host-config.h (has_lse2): Cleanup.
	(has_lse128): Likewise.
	(has_rcpc3): Add early check for LSE2.
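
The early check can be pictured as follows. This is only a sketch under assumptions: `CpuFeatures` is a hypothetical record, and the real feature detection in host-config.h is not reproduced here.

```cpp
#include <cassert>

// Hypothetical feature record used only for illustration.
struct CpuFeatures {
  bool lse2;
  bool rcpc3;
};

inline bool has_lse2(const CpuFeatures &f) { return f.lse2; }

// LRCPC3 does not imply LSE2, so return early unless LSE2 is present.
inline bool has_rcpc3(const CpuFeatures &f)
{
  if (!has_lse2(f))
    return false;
  return f.rcpc3;
}
```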

81bcf412

testsuite: arm: Add pattern for armv8-m.base to cmse-15.c test · cfd7c54b

Torbjörn SVENSSON authored 2 months ago


Since armv8-m.base uses thumb1 that does not support sibcall/tailcall,
a pattern is needed that uses PUSH/BL/POP sequence instead of a single
B instruction to reuse an already existing function in the compile unit.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/cmse/cmse-15.c: Added pattern for armv8-m.base.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

cfd7c54b

Do not call cp_parser_omp_dispatch directly in cp_parser_pragma · b5a67989

Paul-Antoine Arras authored 2 months ago

This is a followup to
ed49709a OpenMP: C++ front-end support for dispatch + adjust_args.

The call to cp_parser_omp_dispatch only belongs in cp_parser_omp_construct. In
cp_parser_pragma, handle PRAGMA_OMP_DISPATCH by calling cp_parser_omp_construct.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_pragma): Replace call to cp_parser_omp_dispatch
	with cp_parser_omp_construct and check context.

gcc/testsuite/ChangeLog:

	* g++.dg/gomp/dispatch-8.C: New test.

b5a67989

c++: Fix ICE with invalid defaulted operator <=> [PR118387] · 4c688399

Jakub Jelinek authored 2 months ago

In the following testcase there are 2 issues, one is that B doesn't
have operator<=> and the other is that A's operator<=> has int return
type, i.e. not the standard comparison category.
Because of the int return type, retcat is cc_last; when we first
try to synthetize it, it is therefore with tentative false and complain
tf_none, we find that B doesn't have operator<=> and because retcat isn't
tc_last, don't try to search for other operators in genericize_spaceship.
And then mark the operator deleted.
When trying to explain the use of the deleted operator, tentative is still
false, but complain is tf_error_or_warning.
do_one_comp will first do:
  tree comp = build_new_op (loc, code, flags, lhs, rhs,
                            NULL_TREE, NULL_TREE, &overload,
                            tentative ? tf_none : complain);
and because complain isn't tf_none, it will actually diagnose the bug
already, but then (tentative || complain) is true and we call
genericize_spaceship, which has
  if (tag == cc_last && is_auto (type))
    {
...
    }

  gcc_checking_assert (tag < cc_last);
and because tag is cc_last and type isn't auto, we just ICE on that
assertion.

The patch fixes it by returning error_mark_node from genericize_spaceship
instead of failing the assertion.

Note, the PR raises another problem.
If on the same testcase the B b; line is removed, we silently synthetize
operator<=> which will crash at runtime due to returning without a return
statement.  That is because the standard says that in that case
it should return static_cast<int>(std::strong_ordering::equal);
but I can't find anywhere wording which would say that if that isn't
valid, the function is deleted.
https://eel.is/c++draft/class.compare#class.spaceship-2.2
seems to talk just about cases where there are some members and their
comparison is invalid it is deleted, but here there are none and it
follows
https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
So, we synthetize with tf_none, see the static_cast is invalid, don't
add error_mark_node statement silently, but as the function isn't deleted,
we just silently emit it.
Should the standard be amended to say that the operator should be deleted
even if it has no elements and the static cast from
https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
is invalid?

2025-01-10  Jakub Jelinek  <jakub@redhat.com>

	PR c++/118387
	* method.cc (genericize_spaceship): For tag == cc_last if
	type is not auto just return error_mark_node instead of failing
	checking assertion.

	* g++.dg/cpp2a/spaceship-synth17.C: New test.

4c688399

c++: modules and DECL_REPLACEABLE_P · e86daddb

Jason Merrill authored 4 months ago

We need to remember that the ::operator new is replaceable to avoid a bogus
error about __builtin_operator_new finding a non-replaceable function.

This affected __get_temporary_buffer in stl_tempbuf.h.

gcc/cp/ChangeLog:

	* module.cc (trees_out::core_bools): Write replaceable_operator.
	(trees_in::core_bools): Read it.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/operator-2_a.C: New test.
	* g++.dg/modules/operator-2_b.C: New test.

e86daddb

Fix some memory leaks · 9193641d

Richard Biener authored 2 months ago

The following fixes memory leaks found compiling SPEC CPU 2017 with
valgrind.

	* df-core.cc (rest_of_handle_df_finish): Release dflow for
	problems without free function (like LR).
	* gimple-crc-optimization.cc (crc_optimization::loop_may_calculate_crc):
	Release loop_bbs on all exits.
	* tree-vectorizer.h (supportable_indirect_convert_operation): Change.
	* tree-vect-generic.cc (expand_vector_conversion): Adjust.
	* tree-vect-stmts.cc (vectorizable_conversion): Use auto_vec for
	converts.
	(supportable_indirect_convert_operation): Get a reference to
	the output vector of converts.

9193641d

[PR118017][LRA]: Fix test for i686 · 94d8de53

Vladimir N. Makarov authored 2 months ago

My previous patch for PR118017 contains a test which fails on i686.  The patch fixes this.

gcc/testsuite/ChangeLog:

	PR target/118017
	* gcc.target/i386/pr118017.c: Check target int128.

94d8de53

arm: [MVE intrinsics] Fix tuples field name (PR 118332) · 288ac095

Christophe Lyon authored 2 months ago

The previous fix only worked for C, for C++ we need to add more
information to the underlying type so that
finish_class_member_access_expr accepts it.

We use the same logic as in aarch64's register_tuple_type for AdvSIMD
tuples.

This patch makes gcc.target/arm/mve/intrinsics/pr118332.c pass in C++
mode.

gcc/ChangeLog:

	PR target/118332
	* config/arm/arm-mve-builtins.cc (wrap_type_in_struct): Delete.
	(register_type_decl): Delete.
	(register_builtin_tuple_types): Use
	lang_hooks.types.simulate_record_decl.

288ac095

Fix bootstrap on !HARDREG_PRE_REGNOS targets · 55341185
Richard Biener authored 2 months ago
```
Pushed as obvious.

	* gcse.cc (pass_hardreg_pre::gate): Wrap possibly unused
	fun argument.
```
55341185

rtl-optimization/117467 - limit ext-dce memory use · 03faac50

Richard Biener authored 2 months ago

The following puts in a hard limit on ext-dce because it might end
up requiring memory on the order of the number of basic blocks
times the number of pseudo registers.  The limiting follows what
GCSE based passes do and thus I re-use --param max-gcse-memory here.

This doesn't in any way address the implementation issues of the pass,
but it reduces the memory-use when compiling the
module_first_rk_step_part1.F90 TU from 521.wrf_r from 25GB to 1GB.

	PR rtl-optimization/117467
	PR rtl-optimization/117934
	* ext-dce.cc (ext_dce_execute): Do nothing if a memory
	allocation estimate exceeds what is allowed by
	--param max-gcse-memory.
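
The shape of such a guard might look like the sketch below. The function names and the `bytes_per_entry` scaling factor are assumptions for illustration; the actual estimate in ext-dce.cc may be computed differently.

```cpp
#include <cassert>
#include <cstdint>

// Rough estimate in KiB: memory grows with the number of basic blocks
// times the number of pseudo registers.  bytes_per_entry is hypothetical.
inline std::uint64_t estimate_kib(std::uint64_t n_blocks,
                                  std::uint64_t n_pseudos,
                                  std::uint64_t bytes_per_entry)
{
  return n_blocks * n_pseudos * bytes_per_entry / 1024;
}

// Run the pass only if the estimate fits within the configured limit.
inline bool should_run_pass(std::uint64_t n_blocks, std::uint64_t n_pseudos,
                            std::uint64_t bytes_per_entry,
                            std::uint64_t limit_kib)
{
  return estimate_kib(n_blocks, n_pseudos, bytes_per_entry) <= limit_kib;
}
```

With one byte per entry, 1000 blocks and 1000 pseudos stay under a 2000 KiB limit, while 100000 of each exceed a 131072 KiB limit and the pass would be skipped.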

03faac50

c++: ICE with pack indexing and partial inst [PR117937] · d6444794

Marek Polacek authored 3 months ago

Here we ICE in expand_expr_real_1:

      if (exp)
        {
          tree context = decl_function_context (exp);
          gcc_assert (SCOPE_FILE_SCOPE_P (context)
                      || context == current_function_decl

on something like this test:

  void
  f (auto... args)
  {
    [&]<size_t... i>(seq<i...>) {
	g(args...[i]...);
    }(seq<0>());
  }

because while current_function_decl is:

  f<int>(int)::<lambda(seq<i ...>)> [with long unsigned int ...i = {0}]

(correct), context is:

  f<int>(int)::<lambda(seq<i ...>)>

which is only the partial instantiation.

I think that when tsubst_pack_index gets a partial instantiation, e.g.
{*args#0} as the pack, we should still tsubst it.  The args#0's value-expr
can be __closure->__args#0 where the closure's context is the partially
instantiated operator().  So we should let retrieve_local_specialization
find the right args#0.

	PR c++/117937

gcc/cp/ChangeLog:

	* pt.cc (tsubst_pack_index): tsubst the pack even when it's not
	PACK_EXPANSION_P.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp26/pack-indexing13.C: New test.
	* g++.dg/cpp26/pack-indexing14.C: New test.

d6444794

s390: Add expander for uaddc/usubc optabs · 8a2d5bc2

Stefan Schulze Frielinghaus authored 2 months ago

gcc/ChangeLog:

	* config/s390/s390-protos.h (s390_emit_compare): Add mode
	parameter for the resulting RTX.
	* config/s390/s390.cc (s390_emit_compare): Ditto.
	(s390_emit_compare_and_swap): Change.
	(s390_expand_vec_strlen): Change.
	(s390_expand_cs_hqi): Change.
	(s390_expand_split_stack_prologue): Change.
	* config/s390/s390.md (*add<mode>3_carry1_cc): Renamed to ...
	(add<mode>3_carry1_cc): this and in order to use the
	corresponding gen function, encode CC mode into pattern.
	(*sub<mode>3_borrow_cc): Renamed to ...
	(sub<mode>3_borrow_cc): this and in order to use the
	corresponding gen function, encode CC mode into pattern.
	(*add<mode>3_alc_carry1_cc): Renamed to ...
	(add<mode>3_alc_carry1_cc): this and in order to use the
	corresponding gen function, encode CC mode into pattern.
	(sub<mode>3_slb_borrow1_cc): New.
	(uaddc<mode>5): New.
	(usubc<mode>5): New.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/uaddc-1.c: New test.
	* gcc.target/s390/uaddc-2.c: New test.
	* gcc.target/s390/uaddc-3.c: New test.
	* gcc.target/s390/usubc-1.c: New test.
	* gcc.target/s390/usubc-2.c: New test.
	* gcc.target/s390/usubc-3.c: New test.

8a2d5bc2

docs: Document new hardreg PRE pass · 016e2f00
Andrew Carlotti authored 3 months ago
```
gcc/ChangeLog:

	* doc/passes.texi: Document hardreg PRE pass.
```
016e2f00

Add new hardreg PRE pass · e7f98d96

Andrew Carlotti authored 5 months ago

This pass is used to optimise assignments to the FPMR register in
aarch64.  I chose to implement this as a middle-end pass because it
mostly reuses the existing RTL PRE code within gcse.cc.

Compared to RTL PRE, the key difference in this new pass is that we
insert new writes directly to the destination hardreg, instead of
writing to a new pseudo-register and copying the result later.  This
requires changes to the analysis portion of the pass, because sets
cannot be moved before existing instructions that set, use or clobber
the hardreg, and the value becomes unavailable after any uses or
clobbers of the hardreg.

Any uses of the hardreg in debug insns will be deleted.  We could do
better than this, but for the aarch64 fpmr I don't think we emit useful
debuginfo for deleted fp8 instructions anyway (and I don't even know if
it's possible to have a debug fpmr use when entering hardreg PRE).

gcc/ChangeLog:

	* config/aarch64/aarch64.h (HARDREG_PRE_REGNOS): New macro.
	* gcse.cc (doing_hardreg_pre_p): New global variable.
	(do_load_motion): New boolean check.
	(current_hardreg_regno): New global variable.
	(compute_local_properties): Unset transp for hardreg clobbers.
	(prune_hardreg_uses): New function.
	(want_to_gcse_p): Use different checks for hardreg PRE.
	(oprs_unchanged_p): Disable load motion for hardreg PRE pass.
	(hash_scan_set): For hardreg PRE, skip non-hardreg sets and
	check for hardreg clobbers.
	(record_last_mem_set_info): Skip for hardreg PRE.
	(compute_pre_data): Prune hardreg uses from transp bitmap.
	(pre_expr_reaches_here_p_work): Add sentence to comment.
	(insert_insn_start_basic_block): New functions.
	(pre_edge_insert): Don't add hardreg sets to predecessor block.
	(pre_delete): Use hardreg for the reaching reg.
	(reset_hardreg_debug_uses): New function.
	(pre_gcse): For hardreg PRE, reset debug uses and don't insert
	copies.
	(one_pre_gcse_pass): Disable load motion for hardreg PRE.
	(execute_hardreg_pre): New.
	(class pass_hardreg_pre): New.
	(pass_hardreg_pre::gate): New.
	(make_pass_hardreg_pre): New.
	* passes.def (pass_hardreg_pre): New pass.
	* tree-pass.h (make_pass_hardreg_pre): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/acle/fpmr-1.c: New test.
	* gcc.target/aarch64/acle/fpmr-2.c: New test.
	* gcc.target/aarch64/acle/fpmr-3.c: New test.
	* gcc.target/aarch64/acle/fpmr-4.c: New test.

e7f98d96

Disable a broken multiversioning optimisation · 21212f08

Andrew Carlotti authored 2 months ago

This patch skips redirect_to_specific clone for aarch64 and riscv,
because the optimisation has two flaws:

1. It checks the value of the "target" attribute, even on targets that
don't use this attribute for multiversioning.

2. The algorithm used is too aggressive, and will eliminate the
indirection in some cases where the runtime choice of callee version
can't be determined statically at compile time.  A correct check would need to
verify that:
 - if the current caller version were selected at runtime, then the
   chosen callee version would be eligible for selection.
 - if any higher priority callee version were selected at runtime, then
   a higher priority caller version would have been eligible for
   selection (and hence the current caller version wouldn't have been
   selected).

The current checks only verify a more restrictive version of the first
condition, and don't check the second condition at all.

Fixing the optimisation properly would require implementing target hooks
to check for implications between version attributes, which is too
complicated for this stage.  However, I would like to see this hook
implemented in the future, since it could also help deduplicate other
multiversioning code.

Since this behaviour has existed for x86 and powerpc for a while, I
think it's best to preserve the existing behaviour on those targets,
unless any maintainer for those targets disagrees.

gcc/ChangeLog:

	* multiple_target.cc
	(redirect_to_specific_clone): Assert that "target" attribute is
	used for FMV before checking it.
	(ipa_target_clone): Skip redirect_to_specific_clone on some
	targets.

gcc/testsuite/ChangeLog:

	* g++.target/aarch64/mv-pragma.C: New test.

21212f08

docs: Add new AArch64 flags · abbe2905
Andrew Carlotti authored 4 months ago
```
gcc/ChangeLog:

	* doc/invoke.texi: Add new AArch64 flags.
```
abbe2905

aarch64: Add new +xs flag · f06c6f8b

Andrew Carlotti authored 7 months ago

GCC does not emit tlbi instructions, so this only affects the flags
passed through to the assembler.

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_7A): Add XS.
	* config/aarch64/aarch64-option-extensions.def (XS): New flag.

f06c6f8b

aarch64: Add new +wfxt flag · 4984119b

Andrew Carlotti authored 7 months ago

GCC does not currently emit the wfet or wfit instructions, so this
primarily affects the flags passed through to the assembler.

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_7A): Add WFXT.
	* config/aarch64/aarch64-option-extensions.def (WFXT): New flag.

4984119b

aarch64: Add new +rcpc2 flag · 5747c121

Andrew Carlotti authored 7 months ago

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_4A): Add RCPC2.
	* config/aarch64/aarch64-option-extensions.def
	(RCPC2): New flag.
	(RCPC3): Add RCPC2 dependency.
	* config/aarch64/aarch64.h (TARGET_RCPC2): Use new flag.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add rcpc2 to
	expected feature string instead of rcpc.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.

5747c121

aarch64: Add new +flagm2 flag · f5915726

Andrew Carlotti authored 7 months ago

GCC does not currently emit the axflag or xaflag instructions, so this
primarily affects the flags passed through to the assembler.

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_5A): Add FLAGM2.
	* config/aarch64/aarch64-option-extensions.def (FLAGM2): New flag.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add flagm2 to
	expected feature string instead of flagm.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.

f5915726

aarch64: Add new +frintts flag · 32a45a21

Andrew Carlotti authored 7 months ago

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_5A): Add FRINTTS
	* config/aarch64/aarch64-option-extensions.def (FRINTTS): New flag.
	* config/aarch64/aarch64.h (TARGET_FRINT): Use new flag.
	* config/aarch64/arm_acle.h: Use new flag for frintts intrinsics.
	* config/aarch64/arm_neon.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add frintts to
	expected feature string.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.

32a45a21

aarch64: Add new +jscvt flag · 2c891357

Andrew Carlotti authored 7 months ago

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_3A): Add JSCVT.
	* config/aarch64/aarch64-option-extensions.def (JSCVT): New flag.
	* config/aarch64/aarch64.h (TARGET_JSCVT): Use new flag.
	* config/aarch64/arm_acle.h: Use new flag for jscvt intrinsics.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add jscvt to
	expected feature string.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.

2c891357

aarch64: Add new +fcma flag · 9bbb91e8

Andrew Carlotti authored 7 months ago

This includes +fcma as a dependency of +sve, and means that we can
finally support fcma intrinsics on a64fx.

Also add fcma to the Features list in several cpunative testcases that
incorrectly included sve without fcma.

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_3A): Add FCMA.
	* config/aarch64/aarch64-option-extensions.def (FCMA): New flag.
	(SVE): Add FCMA dependency.
	* config/aarch64/aarch64.h (TARGET_COMPLEX): Use new flag.
	* config/aarch64/arm_neon.h: Use new flag for fcma intrinsics.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/info_15: Add fcma to Features.
	* gcc.target/aarch64/cpunative/info_16: Ditto.
	* gcc.target/aarch64/cpunative/info_17: Ditto.
	* gcc.target/aarch64/cpunative/info_8: Ditto.
	* gcc.target/aarch64/cpunative/info_9: Ditto.

9bbb91e8

aarch64: Use PAUTH instead of V8_3A in some places · 20385cb9

Andrew Carlotti authored 7 months ago

gcc/ChangeLog:

	* config/aarch64/aarch64.cc
	(aarch64_expand_epilogue): Use TARGET_PAUTH.
	* config/aarch64/aarch64.md: Update comment.

20385cb9

c: Fix up expr location for __builtin_stdc_rotate_* [PR118376] · 76b7f60f

Jakub Jelinek authored 2 months ago

Seems I forgot to set_c_expr_source_range for the __builtin_stdc_rotate_*
case (the other __builtin_stdc_* cases already have it), which means
the locations in expr are uninitialized, sometimes causing ICEs in linemap
code, at other times just valgrind errors about uninitialized var uses.

2025-01-10  Jakub Jelinek  <jakub@redhat.com>

	PR c/118376
	* c-parser.cc (c_parser_postfix_expression): Call
	set_c_expr_source_range before break in the __builtin_stdc_rotate_*
	case.

	* gcc.dg/pr118376.c: New test.

76b7f60f

rtl: Remove invalid compare simplification [PR117186] · 06c4cf39

Richard Sandiford authored 2 months ago

g:d882fe51, posted at
https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
added code to treat:

  (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))

as a nop.  This PR shows that that isn't always correct.
The compare in the set above is between two 0/1 booleans (at least
on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that
produced the incoming (reg:CC cc) is unconstrained; it could be between
arbitrary integers, or even floats.  The fold is therefore replacing a
cc that is valid for both signed and unsigned comparisons with one that
is only known to be valid for signed comparisons.

  (gt (compare (gt cc 0) (lt cc 0)) 0)

does simplify to:

  (gt cc 0)

but:

  (gtu (compare (gt cc 0) (lt cc 0)) 0)

does not simplify to:

  (gtu cc 0)
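
The asymmetry can be modelled outside of RTL. The sketch below is an assumption-laden illustration, not GCC code: the incoming cc is modelled as the result of comparing two 32-bit operands `a` and `b`, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models (gt cc 0) and (lt cc 0), reading cc = compare(a, b) as a signed
// comparison.  Both yield a 0/1 boolean.
inline int gt_flag(std::int32_t a, std::int32_t b) { return a > b; }
inline int lt_flag(std::int32_t a, std::int32_t b) { return a < b; }

// Signed > between the two booleans: the nested gt form.
inline bool nested_gt(std::int32_t a, std::int32_t b)
{
  return gt_flag(a, b) > lt_flag(a, b);
}

// Unsigned > between the two booleans: the nested gtu form.
inline bool nested_gtu(std::int32_t a, std::int32_t b)
{
  return static_cast<unsigned>(gt_flag(a, b))
         > static_cast<unsigned>(lt_flag(a, b));
}

// (gtu cc 0) on the original operands: a genuine unsigned comparison.
inline bool direct_gtu(std::int32_t a, std::int32_t b)
{
  return static_cast<std::uint32_t>(a) > static_cast<std::uint32_t>(b);
}
```

For a = -1 and b = 1, `nested_gt` agrees with the signed result `a > b`, but `nested_gtu` is false while `direct_gtu` is true, because -1 reinterpreted as unsigned is the largest 32-bit value.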

The optimisation didn't come with a testcase, but it was added for
i386's cmpstrsi, now cmpstrnsi.  That probably doesn't matter as much
as it once did, since it's now conditional on -minline-all-stringops.
But the patch is almost 25 years old, so whatever the original
motivation was, it seems likely that other things now rely on it.

It therefore seems better to try to preserve the optimisation on rtl
rather than get rid of it.  To do that, we need to look at how the
result of the outer compare is used.  We'd therefore be looking at four
instructions (the gt, the lt, the compare, and the use of the compare),
but combine already allows that for 3-instruction combinations thanks
to:

  /* If the source is a COMPARE, look for the use of the comparison result
     and try to simplify it unless we already have used undobuf.other_insn.  */

When applied to boolean inputs, a comparison operator is
effectively a boolean logical operator (AND, ANDNOT, XOR, etc.).
simplify_logical_relational_operation already had code to simplify
logical operators between two comparison results, but:

* It only handled IOR, which doesn't cover all the cases needed here.
  The others are easily added.

* It treated comparisons of integers as having an ORDERED/UNORDERED result.
  Therefore:

  * it would not treat "true for LT + EQ + GT" as "always true" for
    comparisons between integers, because the mask excluded the UNORDERED
    condition.

  * it would try to convert "true for LT + GT" into LTGT even for comparisons
    between integers.  To prevent an ICE later, the code used:

       /* Many comparison codes are only valid for certain mode classes.  */
       if (!comparison_code_valid_for_mode (code, mode))
         return 0;

    However, this used the wrong mode, since "mode" is here the integer
    result of the comparisons (and the mode of the IOR), not the mode of
    the things being compared.  Thus the effect was to reject all
    floating-point-only codes, even when comparing floats.

  I think instead the code should detect whether the comparison is between
  integer values and remove UNORDERED from consideration if so.  It then
  always produces a valid comparison (or an always true/false result),
  and so comparison_code_valid_for_mode is not needed.  In particular,
  "true for LT + GT" becomes NE for comparisons between integers but
  remains LTGT for comparisons between floats.

* There was a missing check for whether the comparison inputs had
  side effects.
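
The mask handling described in the bullets above can be sketched as follows. The bit assignment and function names are assumptions for illustration (UNORDERED is placed in the low bit to match the ChangeLog wording below); this is not the simplify-rtx.cc implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical encoding: one bit per possible outcome of a comparison.
enum : unsigned { M_UNORDERED = 1, M_EQ = 2, M_GT = 4, M_LT = 8 };

// IOR of two comparison results is true for the union of their outcomes.
inline unsigned ior_masks(unsigned a, unsigned b) { return a | b; }

// Name the condition described by an outcome mask.  For integer operands
// the UNORDERED outcome cannot occur, so its bit is ignored.
inline std::string describe_mask(unsigned mask, bool integer_operands)
{
  const unsigned ordered = M_LT | M_EQ | M_GT;
  const unsigned all = integer_operands ? ordered : (ordered | M_UNORDERED);
  mask &= all;
  if (mask == 0)
    return "always false";
  if (mask == all)
    return "always true";
  if (mask == (M_LT | M_GT))
    return integer_operands ? "NE" : "LTGT";
  if (mask == ordered)
    return "ORDERED";  // only reachable for floating-point operands
  return "other";
}
```

Under this model, LT + EQ + GT is "always true" for integers but only ORDERED for floats, and LT + GT is NE for integers but LTGT for floats.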

While there, it also seemed worth extending
simplify_logical_relational_operation to unsigned comparisons, since
that makes the testing easier.

As far as that testing goes: the patch exhaustively tests all
combinations of integer comparisons in:

  (cmp1 (cmp2 X Y) (cmp3 X Y))

for the 10 integer comparisons, giving 1000 fold attempts in total.
It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1})
on the result of the fold, giving 9 checks per fold, or 9000 in total.
That's probably more than is typical for self-tests, but it seems to
complete in negligible time, even for -O0 builds.
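
The size of that enumeration can be reproduced with a short sketch. It does not call GCC's simplifier; it only walks the same combinations and computes the reference value that a fold would be checked against. The `Cmp` enum and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The 10 integer comparisons: six signed/equality codes and four unsigned.
enum class Cmp { EQ, NE, LT, LE, GT, GE, LTU, LEU, GTU, GEU };
constexpr int kNumCmps = 10;

inline bool eval(Cmp c, std::int32_t x, std::int32_t y)
{
  const auto ux = static_cast<std::uint32_t>(x);
  const auto uy = static_cast<std::uint32_t>(y);
  switch (c) {
    case Cmp::EQ:  return x == y;
    case Cmp::NE:  return x != y;
    case Cmp::LT:  return x < y;
    case Cmp::LE:  return x <= y;
    case Cmp::GT:  return x > y;
    case Cmp::GE:  return x >= y;
    case Cmp::LTU: return ux < uy;
    case Cmp::LEU: return ux <= uy;
    case Cmp::GTU: return ux > uy;
    case Cmp::GEU: return ux >= uy;
  }
  return false;
}

struct Counts { int folds; int checks; int true_results; };

// Walk every (cmp1 (cmp2 X Y) (cmp3 X Y)) with X, Y in {-1, 0, 1}.
inline Counts enumerate_all()
{
  Counts n{0, 0, 0};
  for (int c1 = 0; c1 < kNumCmps; ++c1)
    for (int c2 = 0; c2 < kNumCmps; ++c2)
      for (int c3 = 0; c3 < kNumCmps; ++c3) {
        ++n.folds;
        for (std::int32_t x = -1; x <= 1; ++x)
          for (std::int32_t y = -1; y <= 1; ++y) {
            // Reference value: apply cmp1 to the two 0/1 inner results.
            const bool ref = eval(static_cast<Cmp>(c1),
                                  eval(static_cast<Cmp>(c2), x, y),
                                  eval(static_cast<Cmp>(c3), x, y));
            n.true_results += ref;
            ++n.checks;
          }
      }
  return n;
}
```

`enumerate_all()` visits 1000 folds and 9000 input checks, matching the totals quoted above.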

gcc/
	PR rtl-optimization/117186
	* rtl.h (simplify_context::simplify_logical_relational_operation): Add
	an invert0_p parameter.
	* simplify-rtx.cc (unsigned_comparison_to_mask): New function.
	(mask_to_unsigned_comparison): Likewise.
	(comparison_code_valid_for_mode): Delete.
	(simplify_context::simplify_logical_relational_operation): Add
	an invert0_p parameter.  Handle AND and XOR.  Handle unsigned
	comparisons.  Handle always-false results.  Ignore the low bit
	of the mask if the operands are always ordered and remove the
	then-redundant check of comparison_code_valid_for_mode.  Check
	for side-effects in the operands before simplifying them away.
	(simplify_context::simplify_binary_operation_1): Remove
	simplification of (compare (gt ...) (lt ...)) and instead...
	(simplify_context::simplify_relational_operation_1): ...handle
	comparisons of comparisons here.
	(test_comparisons): New function.
	(test_scalar_ops): Call it.

gcc/testsuite/
	PR rtl-optimization/117186
	* gcc.dg/torture/pr117186.c: New test.
	* gcc.target/aarch64/pr117186.c: Likewise.

06c4cf39

[ifcombine] drop other misuses of uniform_integer_cst_p · 47ac6ca9

Alexandre Oliva authored 2 months ago

As Jakub pointed out in PR118206, the use of uniform_integer_cst_p in
ifcombine makes no sense, we're not dealing with vectors.  Indeed,
I've been misunderstanding and misusing it since I cut&pasted it from
some preexisting match predicate in an earlier version of the ifcombine
field-merge patch.


for  gcc/ChangeLog

	* gimple-fold.cc (decode_field_reference): Drop misuses of
	uniform_integer_cst_p.
	(fold_truth_andor_for_ifcombine): Likewise.

47ac6ca9

[ifcombine] fix mask variable test to match use [PR118344] · fd4e979d

Alexandre Oliva authored 2 months ago

There was a cut&pasto in the rr_and_mask's adjustment to match the
combined type: the test on whether there was a mask already was
testing the wrong variable, and then it might crash or otherwise fail
accessing an undefined mask.  This only hit with checking enabled,
and rarely at that.


for  gcc/ChangeLog

	PR tree-optimization/118344
	* gimple-fold.cc (fold_truth_andor_for_ifcombine): Fix typo in
	rr_and_mask's type adjustment test.

for  gcc/testsuite/ChangeLog

	PR tree-optimization/118344
	* gcc.dg/field-merge-19.c: New.

fd4e979d

[ifcombine] reuse left-hand mask to decode right-hand xor operand · 740c8497

Alexandre Oliva authored 2 months ago

If fold_truth_andor_for_ifcombine applies a mask to an xor, say
because the result of the xor is compared with a power of two [minus
one], we have to apply the same mask when processing both the left-
and right-hand xor paths for the transformation to be sound.  Arrange
for decode_field_reference to propagate the incoming mask along with
the expression to the right-hand operand.

Don't require the right-hand xor operand to be a constant, that was a
cut&pasto.
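
The soundness requirement rests on a simple bitwise identity, sketched below with hypothetical helper names. Masking the result of an xor equals xoring the two masked operands, but masking only one operand does not.

```cpp
#include <cassert>
#include <cstdint>

// Mask applied to the result of the xor.
inline std::uint32_t xor_then_mask(std::uint32_t a, std::uint32_t b,
                                   std::uint32_t m)
{
  return (a ^ b) & m;
}

// Same mask propagated to both xor operands: equivalent to the above.
inline std::uint32_t mask_both(std::uint32_t a, std::uint32_t b,
                               std::uint32_t m)
{
  return (a & m) ^ (b & m);
}

// Mask applied to the left-hand operand only: not equivalent in general.
inline std::uint32_t mask_left_only(std::uint32_t a, std::uint32_t b,
                                    std::uint32_t m)
{
  return (a & m) ^ b;
}
```

With a = 0x12345678, b = 0x0F0F0F0F and m = 0xFF, the first two forms both give 0x77, while the left-only form gives 0x0F0F0F77.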


for  gcc/ChangeLog

	* gimple-fold.cc (decode_field_reference): Add xor_pand_mask.
	Propagate pand_mask to the right-hand xor operand.  Don't
	require the right-hand xor operand to be a constant.
	(fold_truth_andor_for_ifcombine): Pass right-hand mask when
	appropriate.

740c8497

[ifcombine] adjust for narrowing converts before shifts [PR118206] · c96a6c2c

Alexandre Oliva authored 2 months ago

A narrowing conversion and a shift both drop bits from the loaded
value, but we need to take into account which one comes first to get
the right number of bits and mask.

Fold when applying masks to parts, comparing the parts, and combining
the results, in the odd chance either mask happens to be zero.
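
Why the order matters can be seen in a small sketch, assuming a 32-bit loaded value, a narrowing to 8 bits and a right shift by 4. The function names are hypothetical and this is not the gimple-fold.cc logic.

```cpp
#include <cassert>
#include <cstdint>

// Narrow to 8 bits first, then shift right by 4: only bits 4..7 of the
// loaded value survive, i.e. 4 significant bits.
inline std::uint32_t narrow_then_shift(std::uint32_t v)
{
  return static_cast<std::uint8_t>(v) >> 4;
}

// Shift right by 4 first, then narrow to 8 bits: bits 4..11 of the loaded
// value survive, i.e. 8 significant bits.
inline std::uint32_t shift_then_narrow(std::uint32_t v)
{
  return static_cast<std::uint8_t>(v >> 4);
}
```

For v = 0xABCD the first form yields 0xC (equivalent to `(v >> 4) & 0xF`) and the second yields 0xBC (equivalent to `(v >> 4) & 0xFF`), so the field width and mask differ.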


for  gcc/ChangeLog

	PR tree-optimization/118206
	* gimple-fold.cc (decode_field_reference): Account for upper
	bits dropped by narrowing conversions whether before or after
	a right shift.
	(fold_truth_andor_for_ifcombine): Fold masks, compares, and
	combined results.

for  gcc/testsuite/ChangeLog

	PR tree-optimization/118206
	* gcc.dg/field-merge-18.c: New.

c96a6c2c

testsuite: generalized field-merge tests for <32-bit int [PR118025] · d3c91b04

Alexandre Oliva authored 2 months ago

Explicitly convert constants to the desired types, so as to not elicit
warnings about implicit truncations, nor execution errors, on targets
whose ints are narrower than 32 bits.


for  gcc/testsuite/ChangeLog

	PR testsuite/118025
	* gcc.dg/field-merge-1.c: Convert constants to desired types.
	* gcc.dg/field-merge-3.c: Likewise.
	* gcc.dg/field-merge-4.c: Likewise.
	* gcc.dg/field-merge-5.c: Likewise.
	* gcc.dg/field-merge-11.c: Likewise.
	* gcc.dg/field-merge-17.c: Don't mess with padding bits.

d3c91b04

testsuite: generalize ifcombine field-merge tests [PR118025] · 261ffe68

Alexandre Oliva authored 2 months ago

A number of tests that check for specific ifcombine transformations
fail on AVR and PRU targets, whose type sizes and alignments aren't
conducive to the expected transformations.  Adjust the expectations.

Most execution tests should run successfully regardless of the
transformations, but a few that could conceivably fail if short and
char have the same bit width now check for that and bypass the tests
that would fail.

Conversely, one test that had such a runtime test, but that would work
regardless, no longer has that runtime test, and its types are
narrowed so that the transformations on 32-bit targets are more likely
to be the same as those that used to take place on 64-bit targets.
This latter change is somewhat obviated by a separate patch, but I've
left it in place anyway.


for  gcc/testsuite/ChangeLog

	PR testsuite/118025
	* gcc.dg/field-merge-1.c: Skip BIT_FIELD_REF counting on AVR and PRU.
	* gcc.dg/field-merge-3.c: Bypass the test if short doesn't have the
	expected size.
	* gcc.dg/field-merge-8.c: Likewise.
	* gcc.dg/field-merge-9.c: Likewise.  Skip optimization counting on
	AVR and PRU.
	* gcc.dg/field-merge-13.c: Skip optimization counting on AVR and PRU.
	* gcc.dg/field-merge-15.c: Likewise.
	* gcc.dg/field-merge-17.c: Likewise.
	* gcc.dg/field-merge-16.c: Likewise.  Drop runtime bypass.  Use
	smaller types.
	* gcc.dg/field-merge-14.c: Add comments.

261ffe68

ifcombine field-merge: improve handling of dwords · 38401c58

Alexandre Oliva authored 2 months ago

On 32-bit hosts, data types with 64-bit alignment aren't getting
treated as desired by ifcombine field-merging: we limit the choice of
modes at BITS_PER_WORD sizes, but when deciding the boundary for a
split, we'd limit the choice only by the alignment, so we wouldn't
even consider a split at an odd 32-bit boundary.  Fix that by limiting
the boundary choice by word choice as well.

Now, this would still leave misaligned 64-bit fields in 64-bit-aligned
data structures unhandled by ifcombine on 32-bit hosts.  We already
need to load them as double words, and if they're not byte-aligned,
the code gets really ugly, but ifcombine could improve it if it allows
double-word loads as a last resort.  I've added that.


for  gcc/ChangeLog

	* gimple-fold.cc (fold_truth_andor_for_ifcombine): Limit
	boundary choice by word size as well.  Try aligned double-word
	loads as a last resort.

for  gcc/testsuite/ChangeLog

	* gcc.dg/field-merge-17.c: New.

38401c58

ipa-cp: Fold-convert values when necessary (PR 118138) · d019ab4f

Martin Jambor authored 2 months ago

PR 118138 and quite a few duplicates that it has acquired in a short
time show that even though we are careful to make sure we do not lose
any bits when newly allowing type conversions in jump-functions, we
still need to perform the fold conversions during IPA constant
propagation and not just at the end in order to properly perform
sign-extensions or zero-extensions as appropriate.

This patch does just that, changing a safety predicate we already use
at the appropriate places to return the necessary type.
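
The reason a conversion is needed at each step, even when no bits are lost, can be illustrated with plain integer types. This is only an analogy with hypothetical function names, not the ipa-cp.cc code: the same 8-bit pattern widens to different 32-bit values depending on the signedness of its type.

```cpp
#include <cassert>
#include <cstdint>

// Widening from a signed 8-bit type sign-extends.
inline std::int32_t widen_signed(std::int8_t v) { return v; }

// Widening from an unsigned 8-bit type zero-extends.
inline std::int32_t widen_unsigned(std::uint8_t v) { return v; }
```

The 8-bit pattern 0xFF becomes -1 through `widen_signed` and 255 through `widen_unsigned`, so a propagated constant has to be converted to the type expected at each point.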

gcc/ChangeLog:

2025-01-03  Martin Jambor  <mjambor@suse.cz>

	PR ipa/118138
	* ipa-cp.cc (ipacp_value_safe_for_type): Return the appropriate
	type instead of a bool, accept NULL_TREE VALUEs.
	(propagate_vals_across_arith_jfunc): Use the new returned value of
	ipacp_value_safe_for_type.
	(propagate_vals_across_ancestor): Likewise.
	(propagate_scalar_across_jump_function): Likewise.

gcc/testsuite/ChangeLog:

2025-01-03  Martin Jambor  <mjambor@suse.cz>

	PR ipa/118138
	* gcc.dg/ipa/pr118138.c: New test.

d019ab4f

nvptx: Add '__builtin_frame_address(0)' test case · 86175a64
Thomas Schwinge authored 3 months ago
```
Documenting the status quo.

	gcc/testsuite/
	* gcc.target/nvptx/__builtin_frame_address_0-1.c: New.
```
86175a64

nvptx: Add '__builtin_stack_address()' test case · 91dec10f
Thomas Schwinge authored 3 months ago
```
Documenting the status quo.

	gcc/testsuite/
	* gcc.target/nvptx/__builtin_stack_address-1.c: New.
```
91dec10f

testsuite: arm: Use -std=c17 and effective-target arm_arch_v5te_thumb · f447c3c0

Torbjörn SVENSSON authored 3 months ago


With -std=c23, the following errors are now emitted as the function
prototype and implementation do not match:

.../pr59858.c: In function 're_search_internal':
.../pr59858.c:95:17: error: too many arguments to function 'check_matching'
.../pr59858.c:75:12: note: declared here
.../pr59858.c: At top level:
.../pr59858.c:100:1: error: conflicting types for 'check_matching'; have 'int(re_match_context_t *, int *)'
.../pr59858.c:75:12: note: previous declaration of 'check_matching' with type 'int(void)'
.../pr59858.c: In function 'check_matching':
.../pr59858.c:106:14: error: too many arguments to function 'transit_state'
.../pr59858.c:77:23: note: declared here
.../pr59858.c: At top level:
.../pr59858.c:111:1: error: conflicting types for 'transit_state'; have 're_dfastate_t *(re_match_context_t *, re_dfastate_t *)'
.../pr59858.c:77:23: note: previous declaration of 'transit_state' with type 're_dfastate_t *(void)'
.../pr59858.c: In function 'transit_state':
.../pr59858.c:116:7: error: too many arguments to function 'build_trtable'
.../pr59858.c:79:12: note: declared here
.../pr59858.c: At top level:
.../pr59858.c:121:1: error: conflicting types for 'build_trtable'; have 'int(const re_dfa_t *, re_dfastate_t *)'
.../pr59858.c:79:12: note: previous declaration of 'build_trtable' with type 'int(void)'

Adding -std=c17 removes these errors.

Also, updated test case to use -mcpu=unset/-march=unset feature
introduced in r15-3606-g7d6c6a0d15c.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/pr59858.c: Use -std=c17 and effective-target
	arm_arch_v5te_thumb.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

f447c3c0

ada: Incorrect accessibility level for library level subprograms · 3ff216b7

squirek authored 4 months ago

The patch fixes an issue in the compiler whereby accessibility level
calculations for objects declared within library-level subprograms
were done incorrectly - potentially allowing runtime accessibility
checks to spuriously pass.

gcc/ada/ChangeLog:

	* accessibility.adb:
	(Innermost_master_Scope_Depth): Add special case for expressions
	within library level subprograms.

3ff216b7