Commits · dcee0b6547211a428b75adb03a461285fed0f20d · COBOLworx / gcc-cobol

Oct 09, 2024

Jason Merrill authored 5 months ago

In r15-4119-gc877a27f04f648 I told preprocess_file to use the
directives-only scan with modules, but it seems that I also need to set the
cpp_option so that communication between _cpp_handle_directive and
scan_translation_unit_directives_only works properly in
c-c++-common/cpp/embed-6.c.

gcc/c-family/ChangeLog:

	* c-ppoutput.cc (preprocess_file): Set directives_only flag.

dcee0b65

libcpp: fix typo · d264b75e
Jason Merrill authored 5 months ago
```
libcpp/ChangeLog:

	* macro.cc (_cpp_pop_context): Fix typo.
```
d264b75e

testsuite: arm: use effective-target for mod* tests · 08e91d71

Torbjörn SVENSSON authored 5 months ago

This fixes a typo introduced in r15-4200-gcf08dd297ca that was reported
at https://linaro.atlassian.net/browse/GNU-1369.

gcc/testsuite/ChangeLog

	* gcc.target/arm/mod_2.c: Corrected effective-target to
	arm_cpu_cortex_a57_ok.
	* gcc.target/arm/mod_256.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

08e91d71

libstdc++: Test 17_intro/names.cc with -D_FORTIFY_SOURCE=2 [PR116210] · 4f97411c

Jonathan Wakely authored 5 months ago

Add a new testcase that repeats 17_intro/names.cc but with
_FORTIFY_SOURCE defined, to find problems in Glibc fortify wrappers like
https://sourceware.org/bugzilla/show_bug.cgi?id=32052 (which is fixed
now).

libstdc++-v3/ChangeLog:

	PR libstdc++/116210
	* testsuite/17_intro/names.cc (sz): Undef for versions of Glibc
	that use it in the fortify wrappers.
	* testsuite/17_intro/names_fortify.cc: New test.

4f97411c

libstdc++: Drop format attribute from snprintf wrapper [PR116969] · 5247ee08

Jonathan Wakely authored 5 months ago

When __LONG_DOUBLE_IEEE128__ is defined we need to declare a wrapper for
Glibc's 'snprintf' symbol, so we can call the original definition that
works with the IBM128 format of long double. Because we were declaring
the wrapper using __typeof__(__builtin_snprintf) it inherited the
__attribute__((format(printf, 3, 4))) decoration, and then we got a
warning for calling that wrapper with an __ibm128 argument for a %Lf
conversion specifier. The warning is bogus, because the function we're
calling really does want __ibm128 for %Lf, but there's no "printf but
with a different long double format" archetype for the attribute.

In r15-4039-g28911f626864e7 I added a diagnostic pragma to suppress the
warning, but it would be better to just declare the wrapper without the
attribute, and not have to suppress a warning for code that we know is
actually correct.

libstdc++-v3/ChangeLog:

	PR libstdc++/116969
	* include/bits/locale_facets_nonio.tcc (money_put::__do_put):
	Remove diagnostic pragmas.
	(__glibcxx_snprintfibm128): Declare type manually, instead of
	using __typeof__(__builtin_snprintf).

5247ee08

libstdc++: Workaround glibc headers on ia64-linux · c0bc9a15

Frank Scheiner authored 5 months ago

We see:

```
FAIL: 17_intro/names.cc  -std=gnu++17 (test for excess errors)
FAIL: 17_intro/names_pstl.cc  -std=gnu++17 (test for excess errors)
FAIL: experimental/names.cc  -std=gnu++17 (test for excess errors)
```

...on ia64-linux.

This is due to:

* /usr/include/bits/sigcontext.h:32-38:
```
32 struct __ia64_fpreg
33   {
34     union
35       {
36         unsigned long bits[2];
37       } u;
38   } __attribute__ ((__aligned__ (16)));
```

* /usr/include/sys/ucontext.h:39-45:
```
  39 struct __ia64_fpreg_mcontext
  40   {
  41     union
  42       {
  43         unsigned long __ctx(bits)[2];
  44       } __ctx(u);
  45   } __attribute__ ((__aligned__ (16)));
```

...from glibc 2.39 (w/ia64 support re-added). See the discussion
starting on [1].

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654487.html



Signed-off-by: Frank Scheiner <frank.scheiner@web.de>

libstdc++-v3/ChangeLog:

	* testsuite/17_intro/names.cc [__linux__ && __ia64__]: Undefine
	'u' as used in glibc headers.

c0bc9a15

aarch64: Fix SVE ACLE gimple folds for C++ LTO [PR116629] · fee3adba

Richard Sandiford authored 5 months ago

The SVE ACLE code has two ways of handling overloaded functions.
One, used by C, is to define a single dummy function for each unique
overloaded name, with resolve_overloaded_builtin then resolving calls
to real non-overloaded functions.  The other, used by C++, is to
define a separate function for each individual overload.

The builtins harness assigns integer function codes programmatically.
However, LTO requires it to use the same assignment for every
translation unit, regardless of language.  This means that C++ TUs
need to create (unused) slots for the C overloads and that C TUs
need to create (unused) slots for the C++ overloads.

In many ways, it doesn't matter whether the LTO frontend itself
uses the C approach or the C++ approach to defining overloaded
functions, since the LTO frontend never has to resolve source-level
overloading.  However, the C++ approach of defining a separate
function for each overload means that C++ calls never need to
be redirected to a different function.  Calls to an overload
can appear in the LTO dump and survive until expand.  In contrast,
calls to C's dummy overload functions are resolved by the front
end and never survive to LTO (or expand).

Some optimisations work by moving between sibling functions, such as _m
to _x.  If the source function is an overload, the expected destination
function is too.  The LTO frontend needs to define C++ overloads if it
wants to do this optimisation properly for C++.

The PR is about a tree checking failure caused by trying to use a
stubbed-out C++ overload in LTO.  Dealing with that by detecting the
stub (rather than changing which overloads are defined) would have
turned this from an ice-on-valid to a missed optimisation.

In future, it would probably make sense to redirect overloads to
non-overloaded functions during gimple folding, in case that exposes
more CSE opportunities.  But it'd probably be of limited benefit, since
it should be rare for code to mix overloaded and non-overloaded uses of
the same operation.  It also wouldn't be suitable for backports.

gcc/
	PR target/116629
	* config/aarch64/aarch64-sve-builtins.cc
	(function_builder::function_builder): Use direct overloads for LTO.

gcc/testsuite/
	PR target/116629
	* gcc.target/aarch64/sve/acle/general/pr106326_2.c: New test.

fee3adba

testsuite: Make check-function-bodies work with LTO · b94331d9

Richard Sandiford authored 5 months ago

This patch tries to make check-function-bodies automatically
choose between reading the regular assembly file and reading the
LTO assembly file.  There should only ever be one right answer,
since check-function-bodies doesn't make sense on slim LTO output.

Maybe this will turn out to be impossible to get right, but I'd like
to try at least.

gcc/testsuite/
	* lib/scanasm.exp (check-function-bodies): Look in ltrans0.ltrans.s
	if the test appears to be using LTO.

b94331d9

libstdc++: Ignore _GLIBCXX_USE_POSIX_SEMAPHORE if not supported [PR116992] · 9a5ac633

Jonathan Wakely authored 5 months ago

If _GLIBCXX_HAVE_POSIX_SEMAPHORE is undefined then users get an error
when defining _GLIBCXX_USE_POSIX_SEMAPHORE. We can just ignore it
instead (and warn them it's being ignored).

This fixes a testsuite failure on hppa64-hp-hpux11.11 (and probably some
other targets):

FAIL: 30_threads/semaphore/platform_try_acquire_for.cc  -std=gnu++20 (test for excess errors)
Excess errors:
semaphore:49: error: '__semaphore_impl' has not been declared

libstdc++-v3/ChangeLog:

	PR libstdc++/116992
	* include/bits/semaphore_base.h (_GLIBCXX_USE_POSIX_SEMAPHORE):
	Undefine and issue a warning if POSIX sem_t is not supported.
	* testsuite/30_threads/semaphore/platform_try_acquire_for.cc:
	Prune new warning.

9a5ac633

libstdc++: Fix -Wnarrowing in <complex> [PR116991] · e998014d

Jonathan Wakely authored 5 months ago

When _GLIBCXX_USE_C99_COMPLEX_ARC is undefined we use the generic
__complex_acos function template for _Float32 etc. and that gives a
-Wnarrowing warning:

complex:2043: warning: ISO C++ does not allow converting to '_Float32' from 'long double' with greater conversion rank [-Wnarrowing]

Use a cast to do the conversion so that it doesn't warn.

libstdc++-v3/ChangeLog:

	PR libstdc++/116991
	* include/std/complex (__complex_acos): Cast literal to
	destination type.

e998014d

libstdc++: Fix -Wsign-compare in std::latch::count_down · f5021ce9

Jonathan Wakely authored 5 months ago

Also add assertions for the precondition on the parameter's value.

libstdc++-v3/ChangeLog:

	* include/std/latch (latch::count_down): Add assertions for
	preconditions. Cast parameter to avoid -Wsign-compare on some
	targets.

f5021ce9

libstdc++: Enable _GLIBCXX_ASSERTIONS by default for -O0 [PR112808] · 361d230f

Jonathan Wakely authored 5 months ago

Too many users don't know about -D_GLIBCXX_ASSERTIONS and so are missing
valuable checks for C++ standard library preconditions. This change
enables libstdc++ assertions by default when compiling with -O0 so that
we diagnose more bugs by default.

When users enable optimization we don't add the assertions by default
(because they have non-zero overhead) so they still need to enable them
manually.

For users who really don't want the assertions even in unoptimized
builds, defining _GLIBCXX_NO_ASSERTIONS will prevent them from being
enabled automatically.
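
The opt-in/opt-out logic described above can be sketched with the preprocessor. This is only an illustration: MYLIB_ASSERTIONS and MYLIB_NO_ASSERTIONS are hypothetical stand-ins for _GLIBCXX_ASSERTIONS and _GLIBCXX_NO_ASSERTIONS, and the actual conditions in bits/c++config may differ. __OPTIMIZE__ is the macro GCC predefines when optimization is enabled.

```cpp
#include <cassert>

// Hypothetical stand-ins for the libstdc++ macros; not the real header code.
// Assertions are switched on only for unoptimized builds, and only if the
// user has neither requested them already nor opted out.
#if !defined(__OPTIMIZE__) && !defined(MYLIB_NO_ASSERTIONS) \
    && !defined(MYLIB_ASSERTIONS)
# define MYLIB_ASSERTIONS 1
#endif

// Reports whether the (hypothetical) assertion macro ended up defined.
constexpr bool mylib_assertions_enabled()
{
#ifdef MYLIB_ASSERTIONS
  return true;
#else
  return false;
#endif
}
```

Compiling this with -O0 should make mylib_assertions_enabled() return true, while -O2 or -DMYLIB_NO_ASSERTIONS should make it return false.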

libstdc++-v3/ChangeLog:

	PR libstdc++/112808
	* doc/xml/manual/using.xml (_GLIBCXX_ASSERTIONS): Document
	implicit definition for -O0 compilation.
	(_GLIBCXX_NO_ASSERTIONS): Document.
	* doc/html/manual/using_macros.html: Regenerate.
	* include/bits/c++config [!__OPTIMIZE__] (_GLIBCXX_ASSERTIONS):
	Define for unoptimized builds.

361d230f

libstdc++: Simplify std::aligned_storage and fix for versioned namespace [PR61458] · 6ce1df37

Jonathan Wakely authored 5 months ago

This simplifies the implementation of std::aligned_storage. For the
unstable ABI it also fixes the bug where its size is too large when the
default alignment is used. We can't fix that for the stable ABI though,
so just add a comment about the bug.
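
As a rough illustration of the "struct containing an aligned buffer" approach named in the ChangeLog below, here is a minimal sketch. The name my_aligned_storage is hypothetical and this is not the libstdc++ definition; it only shows how size and alignment behave for such a struct.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simplified analogue of std::aligned_storage: the nested type
// is a plain struct whose only member is a suitably aligned byte buffer.
template<std::size_t Len, std::size_t Align>
struct my_aligned_storage
{
  struct type
  {
    alignas(Align) unsigned char data[Len];
  };
};

// The struct is exactly as large as the buffer when Len is a multiple of
// Align, and is padded up to the next multiple of Align otherwise.
static_assert(sizeof(my_aligned_storage<16, 8>::type) == 16, "no padding");
static_assert(alignof(my_aligned_storage<16, 8>::type) == 8, "alignment");
static_assert(sizeof(my_aligned_storage<1, 4>::type) == 4, "padded to 4");
```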

libstdc++-v3/ChangeLog:

	PR libstdc++/61458
	* doc/doxygen/user.cfg.in (GENERATE_BUGLIST): Set to NO.
	* include/std/type_traits (__aligned_storage_msa): Remove.
	(__aligned_storage_max_align_t): New struct.
	(__aligned_storage_default_alignment): New function.
	(aligned_storage): Use __aligned_storage_default_alignment for
	default alignment. Replace union with a struct containing an
	aligned buffer. Improve Doxygen comment.
	(aligned_storage_t): Use __aligned_storage_default_alignment for
	default alignment.

6ce1df37

libstdc++: Do not cast away const-ness in std::construct_at (LWG 3870) · 2eaae1bd

Jonathan Wakely authored 8 months ago

This change also requires implementing the proposed resolution of LWG
3216 so that std::make_shared and std::allocate_shared still work, and
the proposed resolution of LWG 3891 so that std::expected still works.

libstdc++-v3/ChangeLog:

	* include/bits/shared_ptr_base.h: Remove cv-qualifiers from
	type managed by _Sp_counted_ptr_inplace, as per LWG 3210.
	* include/bits/stl_construct.h: Do not cast away cv-qualifiers
	when passing pointer to placement new.
	* include/std/expected: Use remove_cv_t for union member, as per
	LWG 3891.
	* testsuite/20_util/allocator/void.cc: Do not test construction
	via const pointer.

2eaae1bd

libstdc++: Make std::construct_at support arrays (LWG 3436) · 993deb3a

Jonathan Wakely authored 1 year ago

The issue was approved at the recent St. Louis meeting, requiring
support for bounded arrays, but only without arguments to initialize the
array elements.
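
The behaviour required by LWG 3436 can be sketched as follows. The helper my_construct_at is hypothetical and modelled on the issue's wording (value-initialize via `T[1]()` for array types, no initializer arguments allowed); it is not the code in stl_construct.h and assumes C++17.

```cpp
#include <cassert>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical helper, not libstdc++'s implementation.
template<typename T, typename... Args>
T* my_construct_at(T* location, Args&&... args)
{
  void* where = static_cast<void*>(location);
  if constexpr (std::is_array_v<T>)
    {
      // Bounded arrays are supported only without initializer arguments.
      static_assert(sizeof...(Args) == 0,
                    "array elements cannot be given initializers");
      ::new (where) T[1]();          // value-initializes every element
      return std::launder(location);
    }
  else
    return ::new (where) T(std::forward<Args>(args)...);
}
```

Calling my_construct_at on a pointer to `int[3]` zeroes all three elements, while passing any argument for an array type is rejected at compile time.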

libstdc++-v3/ChangeLog:

	* include/bits/stl_construct.h (construct_at): Support array
	types (LWG 3436).
	* testsuite/20_util/specialized_algorithms/construct_at/array.cc:
	New test.
	* testsuite/20_util/specialized_algorithms/construct_at/array_neg.cc:
	New test.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/initlist-opt1.C: Adjust for different diagnostics
	from std::construct_at by adding -fconcepts-diagnostics-depth=2.

993deb3a

libstdc++: Tweak %c formatting for chrono types · ce89d2f3

Jonathan Wakely authored 5 months ago

libstdc++-v3/ChangeLog:

	* include/bits/chrono_io.h (__formatter_chrono::_M_c): Add
	[[unlikely]] attribute to condition for missing %c format in
	locale. Use %T instead of %H:%M:%S in fallback.

ce89d2f3

libstdc++: Fix formatting of chrono::duration with character rep [PR116755] · b349c651

Jonathan Wakely authored 6 months ago

Implement Peter Dimov's suggestion for resolving LWG 4118, which is to
use +d.count() so that character types are promoted to an integer type
before formatting them. This didn't have unanimous consensus in the
committee as Howard Hinnant proposed that we should format the rep
consistently with std::format("{}", d.count()) instead. That ends up
being more complicated, because it makes std::formattable a precondition
of operator<< which was not previously the case, and it means that
ios_base::fmtflags from the stream would be ignored because std::format
doesn't use them.
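
The effect of the unary plus can be seen with a plain ostream, without relying on the duration inserter itself. In this small sketch the helper names count_promoted and count_raw are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <sstream>
#include <string>

// Streams +d.count(): a character rep is promoted to int and printed as a
// number.
template<typename Rep, typename Period>
std::string count_promoted(const std::chrono::duration<Rep, Period>& d)
{
  std::ostringstream os;
  os << +d.count();
  return os.str();
}

// Streams d.count() unchanged: a character rep is printed as a character.
template<typename Rep, typename Period>
std::string count_raw(const std::chrono::duration<Rep, Period>& d)
{
  std::ostringstream os;
  os << d.count();
  return os.str();
}
```

For a `std::chrono::duration<signed char>` holding 65, the promoted form yields the digits of 65, whereas the raw form yields the single character with that code.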

libstdc++-v3/ChangeLog:

	PR libstdc++/116755
	* include/bits/chrono_io.h (operator<<): Use +d.count() for
	duration inserter.
	(__formatter_chrono::_M_format): Likewise for %Q format.
	* testsuite/20_util/duration/io.cc: Test durations with
	character types as reps.

b349c651

Clear DR_GROUP_NEXT_ELEMENT upon group dissolving · 55dbb4b5

Richard Biener authored 5 months ago

I've tried to sanitize DR_GROUP_NEXT_ELEMENT accesses but there are too
many so the following instead makes sure DR_GROUP_NEXT_ELEMENT is never
non-NULL for !STMT_VINFO_GROUPED_ACCESS.

	* tree-vect-data-refs.cc (vect_analyze_data_ref_access): When
	cancelling a DR group also clear DR_GROUP_NEXT_ELEMENT.

55dbb4b5

tree-optimization/117041 - fix load classification of former grouped load · 72c83f64

Richard Biener authored 5 months ago

When we first detect a grouped load but later dis-associate it we
only set DR_GROUP_FIRST_ELEMENT to NULL, indicating it is not a
STMT_VINFO_GROUPED_ACCESS but leave DR_GROUP_NEXT_ELEMENT set.  This
causes a stray DR_GROUP_NEXT_ELEMENT access in get_group_load_store_type
to go wrong, indicating a load isn't single_element_p when it actually
is, leading to wrong classification and an ICE.

	PR tree-optimization/117041
	* tree-vect-stmts.cc (get_group_load_store_type): Only
	check DR_GROUP_NEXT_ELEMENT for STMT_VINFO_GROUPED_ACCESS.

	* gcc.dg/torture/pr117041.c: New testcase.

72c83f64

testsuite: arm: use effective-target for vsel*, mod* and pr65647.c tests · cf08dd29

Torbjörn SVENSSON authored 5 months ago


Update test cases to use -mcpu=unset/-march=unset feature introduced in
r15-3606-g7d6c6a0d15c.

gcc/testsuite/ChangeLog

	* gcc.target/arm/pr65647.c: Use effective-target arm_arch_v6m.
	Removed unneeded dg-skip-if.
	* gcc.target/arm/mod_2.c: Use effective-target arm_cpu_cortex_a57.
	* gcc.target/arm/mod_256.c: Likewise.
	* gcc.target/arm/vseleqdf.c: Likewise.
	* gcc.target/arm/vseleqsf.c: Likewise.
	* gcc.target/arm/vselgedf.c: Likewise.
	* gcc.target/arm/vselgesf.c: Likewise.
	* gcc.target/arm/vselgtdf.c: Likewise.
	* gcc.target/arm/vselgtsf.c: Likewise.
	* gcc.target/arm/vselledf.c: Likewise.
	* gcc.target/arm/vsellesf.c: Likewise.
	* gcc.target/arm/vselltdf.c: Likewise.
	* gcc.target/arm/vselltsf.c: Likewise.
	* gcc.target/arm/vselnedf.c: Likewise.
	* gcc.target/arm/vselnesf.c: Likewise.
	* gcc.target/arm/vselvcdf.c: Likewise.
	* gcc.target/arm/vselvcsf.c: Likewise.
	* gcc.target/arm/vselvsdf.c: Likewise.
	* gcc.target/arm/vselvssf.c: Likewise.
	* lib/target-supports.exp: Define effective-target arm_cpu_cortex_a57.
	Update effective-target arm_v8_1_lob_ok to use -mcpu=unset.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

cf08dd29

libcpp: Use ' instead of %< and %> [PR117039] · f7099903

Ken Matsui authored 5 months ago


	PR bootstrap/117039

libcpp/ChangeLog:

	* directives.cc (do_pragma_once): Use ' instead of %< and %>.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>

f7099903

Enable LRA for ia64 · 68afc7ac

René Rebe authored 9 months ago

This was tested by bootstrapping GCC natively on ia64-t2-linux-gnu and
running the testsuite (based on 23611606):

https://gcc.gnu.org/pipermail/gcc-testresults/2024-June/817268.html

For comparison, the same with just 23611606:

https://gcc.gnu.org/pipermail/gcc-testresults/2024-June/817267.html



gcc/
	* config/ia64/ia64.cc: Enable LRA for ia64.
	* config/ia64/ia64.md: Likewise.
	* config/ia64/predicates.md: Likewise.

Signed-off-by: René Rebe <rene@exactcode.de>

68afc7ac

Remove ia64*-*-linux from the list of obsolete targets · 452b12ce

René Rebe authored 9 months ago


The following un-deprecates ia64*-*-linux for GCC 15, since we plan to
support this for some years to come.

gcc/
	* config.gcc: Only list ia64*-*-(hpux|vms|elf) in the list of
	obsoleted targets.

contrib/
	* config-list.mk (LIST): no --enable-obsolete for ia64-linux.

Signed-off-by: René Rebe <rene@exactcode.de>

452b12ce

tree-optimization/116974 - Handle single-lane SLP for OMP scan store · 9df0772d

Richard Biener authored 1 year ago

The following massages the GIMPLE matching way of handling scan
stores to work with single-lane SLP.  I do not fully understand all
the cases that can happen and the stmt matching at vectorizable_store
time is less than ideal - but the following gets me all the testcases
to pass with and without forced SLP.

Long term we want to perform the matching at SLP discovery time,
properly chaining the various SLP instances the current state ends
up with.

	PR tree-optimization/116974
	* tree-vect-stmts.cc (check_scan_store): Pass in the SLP node
	instead of just a flag.  Allow single-lane scan stores.
	(vectorizable_store): Adjust.
	* tree-vect-loop.cc (vect_analyze_loop_2): Empty scan_map
	before re-trying.

9df0772d

tree-optimization/116575 - handle SLP of permuted masked loads · dc90578f

Richard Biener authored 5 months ago

The following handles SLP discovery of permuted masked loads which
was prohibited (because wrongly handled) for PR114375.  In particular
with single-lane SLP at the moment all masked group loads appear
permuted and we fail to use masked load lanes as well.  The following
addresses parts of the issues, starting with doing correct basic
discovery - namely discover an unpermuted mask load followed by
a permute node.  In particular groups with gaps do not support masking
yet (and didn't before w/o SLP IIRC).  There are still issues with
how we represent masked load/store-lanes I think, but I first have to
get my hands on a good testcase.

	PR tree-optimization/116575
	PR tree-optimization/114375
	* tree-vect-slp.cc (vect_build_slp_tree_2): Do not reject
	permuted mask loads without gaps but instead discover a
	node for the full unpermuted load and permute that with
	a VEC_PERM node.

	* gcc.dg/vect/vect-pr114375.c: Expect vectorization now with avx2.

dc90578f

tree-optimization/117000 - elide .REDUC_IOR with compare against zero · 5977b746

Richard Biener authored 5 months ago

The following adds a pattern to elide a .REDUC_IOR operation when
the result is compared against zero with a cbranch.  I've resorted
to using can_compare_p since that's what RTL expansion eventually
checks.  While GIMPLE has long allowed whole-vector equality compares,
I notice that vector lowering won't lower unsupported ones and RTL
expansion doesn't seem to try using [u]cmp<vector-mode> optabs
(and neither x86 nor aarch64 implements those).  There's cstore
but no target implements that for vector modes either.
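
The equivalence behind the pattern can be illustrated in scalar C++: OR-reducing all lanes and testing the result against zero gives the same answer as comparing the whole vector against the zero vector. This is only a model using std::array (the type alias and function names are hypothetical), not the match.pd pattern.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using vec4 = std::array<std::uint32_t, 4>;  // hypothetical 4-lane "vector"

// OR-reduce all lanes, then compare the scalar result against zero.
bool nonzero_via_reduction(const vec4& v)
{
  std::uint32_t acc = 0;
  for (std::uint32_t lane : v)
    acc |= lane;
  return acc != 0;
}

// Compare the whole vector against the all-zero vector; no reduction needed.
bool nonzero_via_compare(const vec4& v)
{
  return v != vec4{};
}
```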

	PR tree-optimization/117000
	* match.pd (.REDUC_IOR !=/== 0): New pattern.
	* gimple-match-head.cc: Include memmodel.h and optabs.h.
	* generic-match-head.cc: Likewise.

	* gcc.target/i386/pr117000.c: New testcase.

5977b746

Fix memory leak in vect_cse_slp_nodes · fd883919

Richard Biener authored 5 months ago

The following avoids copying scalar stmts again for the re-lookup
of the slot to replace the NULL guard with node.

	* tree-vect-slp.cc (vect_cse_slp_nodes): Fix memory leak.

fd883919

gcc/doc: adjust __builtin_choose_expr() description · 4b152f62

Jan Beulich authored 5 months ago

Present wording has misled people to believe the ?: operator would be
evaluating all three of the involved expressions.
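
For reference, the ?: operator evaluates the condition and exactly one of the other two operands. The sketch below shows this in C++ with hypothetical counter functions; __builtin_choose_expr itself is only available in GNU C, so it is not used here.

```cpp
#include <cassert>

int calls_to_a = 0;
int calls_to_b = 0;

int a() { ++calls_to_a; return 1; }
int b() { ++calls_to_b; return 2; }

// Only the selected operand is evaluated, so only one counter changes.
int pick(bool cond)
{
  return cond ? a() : b();
}
```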

gcc/

	* doc/extend.texi: Clarify __builtin_choose_expr()
	(dis)similarity to the ?: operator.

4b152f62

gcc, libcpp: Add warning switch for "#pragma once in main file" [PR89808] · 821d5610

Ken Matsui authored 1 year ago


This patch adds a warning switch for "#pragma once in main file".  The
warning option name is Wpragma-once-outside-header, which is the same
as Clang provides.

	PR preprocessor/89808

gcc/c-family/ChangeLog:

	* c.opt (Wpragma_once_outside_header): Define new option.
	* c.opt.urls: Regenerate.

gcc/ChangeLog:

	* doc/invoke.texi (Warning Options): Document
	-Wno-pragma-once-outside-header.

libcpp/ChangeLog:

	* include/cpplib.h (cpp_warning_reason): Define
	CPP_W_PRAGMA_ONCE_OUTSIDE_HEADER.
	* directives.cc (do_pragma_once): Use
	CPP_W_PRAGMA_ONCE_OUTSIDE_HEADER.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Wno-pragma-once-outside-header.C: New test.
	* g++.dg/warn/Wpragma-once-outside-header.C: New test.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Marek Polacek <polacek@redhat.com>

821d5610

Daily bump. · 41179a32
GCC Administrator authored 5 months ago

41179a32

tree-optimization/116024 - simplify some cases of X +- C1 cmp C2 · 52fdf1e7

Artemiy Volkov authored 5 months ago

Whenever C1 and C2 are integer constants, X is of a wrapping type, and
cmp is a relational operator, the expression X +- C1 cmp C2 can be
simplified in the following cases:

(a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial
comparison in the following way:
   X +- C1 <= C2
   -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
   -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
   -INF -+ C1 <= X <= +INF (due to (1))
   -INF -+ C1 <= X (eliminate the right hand side since it holds for any X)

(b) By analogy, if cmp is >= and C2 -+ C1 == -INF(1), use the following
sequence of transformations:

   X +- C1 >= C2
   +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
   +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
   +INF -+ C1 >= X >= -INF (due to (1))
   +INF -+ C1 >= X (eliminate the right hand side since it holds for any X)

(c) The > and < cases are negations of (a) and (b), respectively.
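
Cases (a) and (b) can be checked exhaustively on a small wrapping type. The sketch below uses an 8-bit unsigned type (so -INF is 0 and +INF is 255) and counts disagreements between the original and simplified comparisons; the function names are made up for illustration, and wrapping signed types are not covered.

```cpp
#include <cassert>
#include <cstdint>

using u8 = std::uint8_t;

// Reduces a value modulo 256, i.e. wraps it to the 8-bit unsigned type.
static u8 wrap(unsigned v) { return static_cast<u8>(v); }

// Counts (X, C1) pairs where a simplified form disagrees with the original.
int count_mismatches_add_sub()
{
  const unsigned NEG_INF = 0, POS_INF = 255;
  int bad = 0;
  for (unsigned c1 = 0; c1 <= 255; ++c1)
    for (unsigned x = 0; x <= 255; ++x)
      {
        // (a) X + C1 <= C2 with C2 - C1 == +INF  ->  -INF - C1 <= X
        bad += (wrap(x + c1) <= wrap(POS_INF + c1))
               != (wrap(NEG_INF - c1) <= wrap(x));
        // (a) X - C1 <= C2 with C2 + C1 == +INF  ->  -INF + C1 <= X
        bad += (wrap(x - c1) <= wrap(POS_INF - c1))
               != (wrap(NEG_INF + c1) <= wrap(x));
        // (b) X + C1 >= C2 with C2 - C1 == -INF  ->  +INF - C1 >= X
        bad += (wrap(x + c1) >= wrap(NEG_INF + c1))
               != (wrap(POS_INF - c1) >= wrap(x));
        // (b) X - C1 >= C2 with C2 + C1 == -INF  ->  +INF + C1 >= X
        bad += (wrap(x - c1) >= wrap(NEG_INF - c1))
               != (wrap(POS_INF + c1) >= wrap(x));
      }
  return bad;
}
```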

This transformation occasionally saves add / sub instructions; for
instance, the expression

3 + (uint32_t)f() < 2

compiles to

cmn     w0, #4
cset    w0, ls

instead of

add     w0, w0, 3
cmp     w0, 2
cset    w0, ls

on aarch64.

Testcases that go together with this patch have been split into two
separate files, one containing testcases for unsigned variables and the
other for wrapping signed ones (and thus compiled with -fwrapv).
Additionally, one aarch64 test has been adjusted since the patch has
caused the generated code to change from

cmn     w0, #2
csinc   w0, w1, wzr, cc   (x < -2)

to

cmn     w0, #3
csinc   w0, w1, wzr, cs   (x <= -3)

This patch has been bootstrapped and regtested on aarch64, x86_64, and
i386, and additionally regtested on riscv32.

gcc/ChangeLog:

	PR tree-optimization/116024
	* match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr116024-2.c: New test.
	* gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto.
	* gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.

52fdf1e7

tree-optimization/116024 - simplify C1-X cmp C2 for wrapping signed types · e5f5cffb

Artemiy Volkov authored 5 months ago

Implement a match.pd transformation inverting the sign of X in
C1 - X cmp C2, where C1 and C2 are integer constants and X is
of a wrapping signed type, by observing that:

(a) If cmp is == or !=, simply move X and C2 to opposite sides of
the comparison to arrive at X cmp C1 - C2.

(b) If cmp is <:
	- C1 - X < C2 means that C1 - X spans the values of -INF,
	  -INF + 1, ..., C2 - 1;
        - Therefore, X is one of C1 - -INF, C1 - (-INF + 1), ...,
	  C1 - C2 + 1;
	- Subtracting (C1 + 1), X - (C1 + 1) is one of - (-INF) - 1,
          - (-INF) - 2, ..., -C2;
        - Using the fact that - (-INF) - 1 is +INF, derive that
          X - (C1 + 1) spans the values +INF, +INF - 1, ..., -C2;
        - Thus, the original expression can be simplified to
          X - (C1 + 1) > -C2 - 1.

(c) Similarly, C1 - X <= C2 is equivalent to X - (C1 + 1) >= -C2 - 1.

(d) The >= and > cases are negations of (b) and (c), respectively.

(e) In all cases, the expression -C2 - 1 can be shortened to
bit_not (C2).
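
Cases (b), (c) and (e) can be verified by brute force on an 8-bit wrapping signed type. This sketch emulates -fwrapv arithmetic by reducing through uint8_t; the helper names are hypothetical, and the conversion back to int8_t assumes two's complement (guaranteed since C++20, and what GCC does in earlier modes).

```cpp
#include <cassert>
#include <cstdint>

// Wraps an int to 8-bit two's complement, emulating -fwrapv on int8_t.
static std::int8_t wrap8(int v)
{
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(v));
}

// Counts (C1, C2, X) triples where the rewritten comparison disagrees.
int count_mismatches_signed()
{
  int bad = 0;
  for (int c1 = -128; c1 <= 127; ++c1)
    for (int c2 = -128; c2 <= 127; ++c2)
      for (int x = -128; x <= 127; ++x)
        {
          std::int8_t lhs = wrap8(c1 - x);             // C1 - X
          std::int8_t rhs = wrap8(x - wrap8(c1 + 1));  // X - (C1 + 1)
          std::int8_t not_c2 = wrap8(~c2);             // bit_not (C2)
          bad += (lhs < c2) != (rhs > not_c2);         // case (b)
          bad += (lhs <= c2) != (rhs >= not_c2);       // case (c)
        }
  return bad;
}
```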

This transformation occasionally saves load-immediate /
subtraction instructions, e.g. the following statement:

10 - (int)f() >= 20;

now compiles to

addi    a0,a0,-11
slti    a0,a0,-20

instead of

li      a5,10
sub     a0,a5,a0
slti    t0,a0,20
xori    a0,t0,1

on 32-bit RISC-V when compiled with -fwrapv.

Additional examples can be found in the newly added test file.  This
patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
and additionally regtested on riscv32.

gcc/ChangeLog:

	PR tree-optimization/116024
	* match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr116024-1-fwrapv.c: New test.

e5f5cffb

Oct 08, 2024

tree-optimization/116024 - simplify C1-X cmp C2 for unsigned types · 65b33d43

Artemiy Volkov authored 5 months ago

Implement a match.pd transformation inverting the sign of X in
C1 - X cmp C2, where C1 and C2 are integer constants and X is
of an unsigned type, by observing that:

(a) If cmp is == or !=, simply move X and C2 to opposite sides of the
comparison to arrive at X cmp C1 - C2.

(b) If cmp is <:
	- C1 - X < C2 means that C1 - X spans the range of 0, 1, ..., C2 - 1;
        - This means that X spans the range of C1 - (C2 - 1),
	  C1 - (C2 - 2), ..., C1;
	- Subtracting C1 - (C2 - 1), X - (C1 - (C2 - 1)) is one of 0, 1,
	  ..., C1 - (C1 - (C2 - 1));
        - Simplifying the above, X - (C1 - C2 + 1) is one of 0, 1, ...,
         C2 - 1;
        - Summarizing, the expression C1 - X < C2 can be transformed
	  into X - (C1 - C2 + 1) < C2.

(c) Similarly, if cmp is <=:
	- C1 - X <= C2 means that C1 - X is one of 0, 1, ..., C2;
	- It follows that X is one of C1 - C2, C1 - (C2 - 1), ..., C1;
        - Subtracting C1 - C2, X - (C1 - C2) has range 0, 1, ..., C2;
        - Thus, the expression C1 - X <= C2 can be transformed into
	  X - (C1 - C2) <= C2.

(d) The >= and > cases are negations of (b) and (c), respectively.
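
The unsigned rewrites in (b) and (c) can likewise be checked exhaustively on an 8-bit type. This is a sketch with made-up helper names; the two small functions at the end model the `300 - x < 100` example from this message at 32 bits.

```cpp
#include <cassert>
#include <cstdint>

using u8 = std::uint8_t;

// Reduces a value modulo 256, i.e. wraps it to the 8-bit unsigned type.
static u8 wrap(unsigned v) { return static_cast<u8>(v); }

// Counts (C1, C2, X) triples where the rewritten comparison disagrees.
int count_mismatches_unsigned()
{
  int bad = 0;
  for (unsigned c1 = 0; c1 <= 255; ++c1)
    for (unsigned c2 = 0; c2 <= 255; ++c2)
      for (unsigned x = 0; x <= 255; ++x)
        {
          u8 lhs = wrap(c1 - x);  // C1 - X
          u8 C2 = wrap(c2);
          // (b) C1 - X < C2   ->  X - (C1 - C2 + 1) < C2
          bad += (lhs < C2) != (wrap(x - (c1 - c2 + 1)) < C2);
          // (c) C1 - X <= C2  ->  X - (C1 - C2) <= C2
          bad += (lhs <= C2) != (wrap(x - (c1 - c2)) <= C2);
        }
  return bad;
}

// The 32-bit example: 300 - x < 100 becomes x - 201 < 100.
bool original_example(std::uint32_t x) { return 300u - x < 100u; }
bool rewritten_example(std::uint32_t x) { return x - 201u < 100u; }
```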

This transformation occasionally saves load-immediate /
subtraction instructions, e.g. the following statement:

300 - (unsigned int)f() < 100;

now compiles to

addi    a0,a0,-201
sltiu   a0,a0,100

instead of

li      a5,300
sub     a0,a5,a0
sltiu   a0,a0,100

on 32-bit RISC-V.

Additional examples can be found in the newly added test file.  This
patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
and additionally regtested on riscv32.

gcc/ChangeLog:

	PR tree-optimization/116024
	* match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr116024-1.c: New test.

65b33d43

tree-optimization/116024 - simplify C1-X cmp C2 for UB-on-overflow types · 0883c886

Artemiy Volkov authored 5 months ago

Implement a match.pd pattern for C1 - X cmp C2, where C1 and C2 are
integer constants and X is of a UB-on-overflow type.  The pattern is
simplified to X rcmp C1 - C2 by moving X and C2 to the other side of the
comparison (with opposite signs).  If C1 - C2 happens to overflow,
replace the whole expression with either a constant 0 or a constant 1
node, depending on the comparison operator and the sign of the overflow.
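
When nothing overflows, the rewrite is ordinary algebra: moving X and C2 across the comparison reverses its direction. The sketch below checks `<=` and `<` over a small range of int values where neither C1 - X nor C1 - C2 can overflow; the overflow case that folds to a constant is not modelled, and the function names are hypothetical.

```cpp
#include <cassert>

// Original form: C1 - X <= C2.
constexpr bool original(int c1, int x, int c2) { return c1 - x <= c2; }

// Rewritten form: X >= C1 - C2 (note the reversed comparison).
constexpr bool rewritten(int c1, int x, int c2) { return x >= c1 - c2; }

// Counts disagreements for <= and < over a range that cannot overflow.
int count_mismatches_small_range()
{
  int bad = 0;
  for (int c1 = -20; c1 <= 20; ++c1)
    for (int c2 = -20; c2 <= 20; ++c2)
      for (int x = -20; x <= 20; ++x)
        {
          bad += original(c1, x, c2) != rewritten(c1, x, c2);
          bad += (c1 - x < c2) != (x > c1 - c2);
        }
  return bad;
}
```

For the example in this message, `10 - x <= 9` corresponds to `x >= 1`, which is the same as `x > 0`.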

This transformation occasionally saves load-immediate /
subtraction instructions, e.g. the following statement:

10 - (int) x <= 9;

now compiles to

sgt     a0,a0,zero

instead of

li      a5,10
sub     a0,a5,a0
slti    a0,a0,10

on 32-bit RISC-V.

Additional examples can be found in the newly added test file. This
patch has been bootstrapped and regtested on aarch64, x86_64, and
i386, and additionally regtested on riscv32.  Existing tests were
adjusted where necessary.

gcc/ChangeLog:

	PR tree-optimization/116024
	* match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr116024.c: New test.
	* gcc.dg/pr67089-6.c: Adjust.

0883c886

RISC-V: Enable builtin __riscv_mul with Zmmul extension. · 2990f580

Tsung Chun Lin authored 5 months ago

From d5b254e19d1f37fe27c7e98a0160e5c22446cfea Mon Sep 17 00:00:00 2001
From: Jim Lin <jim@andestech.com>
Date: Tue, 8 Oct 2024 13:14:32 +0800
Subject: [PATCH] RISC-V: Enable builtin __riscv_mul with Zmmul extension.

gcc/ChangeLog:

	* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
	Enable builtin __riscv_mul with Zmmul extension.

2990f580

RISC-V: Add implication for M extension. · 0a193466

Tsung Chun Lin authored 5 months ago

The M extension implies Zmmul.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: M implies Zmmul.

0a193466

RISC-V: Implement TARGET_CAN_INLINE_P · 517d344e

Yangyu Chen authored 5 months ago

Currently, we lack support for TARGET_CAN_INLINE_P on the RISC-V
ISA. As a result, certain functions cannot be inlined when specific
options, such as __attribute__((target("arch=+v"))), are used.
This can lead to potential performance issues when building
retargetable binaries for RISC-V.

To address this, I have implemented the riscv_can_inline_p function.
This addition enables inlining when the callee either has no special
options or when some options match, while also ensuring that the
callee's ISA is a subset of the caller's. I also check some other
options when always_inline is not set.
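
The "callee's ISA is a subset of the caller's" condition can be pictured as a bitmask test. The following is a minimal sketch under that assumption: the extension bits and the function isa_is_subset are hypothetical and do not reflect how riscv_ext_is_subset is actually written.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical extension bits, for illustration only.
enum : std::uint32_t
{
  EXT_M = 1u << 0,
  EXT_A = 1u << 1,
  EXT_F = 1u << 2,
  EXT_V = 1u << 3
};

// True when every extension the callee needs is also enabled in the caller,
// i.e. the callee has no bit set that the caller lacks.
constexpr bool isa_is_subset(std::uint32_t callee, std::uint32_t caller)
{
  return (callee & ~caller) == 0;
}
```

Under this model a callee built for M alone could be inlined into a caller built for M and V, but not the other way round.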

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (cl_opt_var_ref_t): Add
	cl_opt_var_ref_t pointer to member of cl_target_option.
	(struct riscv_ext_flag_table_t): Add new cl_opt_var_ref_t field.
	(RISCV_EXT_FLAG_ENTRY): New macro to simplify the definition of
	riscv_ext_flag_table.
	(riscv_ext_is_subset): New function to check if the callee's ISA
	is a subset of the caller's.
	(riscv_x_target_flags_isa_mask): New function to get the mask of
	ISA extension in x_target_flags of gcc_options.
	* config/riscv/riscv-subset.h (riscv_ext_is_subset): Declare
	riscv_ext_is_subset function.
	(riscv_x_target_flags_isa_mask): Declare
	riscv_x_target_flags_isa_mask function.
	* config/riscv/riscv.cc (riscv_can_inline_p): New function.
	(TARGET_CAN_INLINE_P): Implement TARGET_CAN_INLINE_P.

517d344e

Add regression test · 5f0a3818
Eric Botcazou authored 5 months ago
```
gcc/testsuite/
	PR ada/116190
	* gnat.dg/aggr31.adb: New test.
```
5f0a3818
Add regression test · 8da27c7b
Eric Botcazou authored 5 months ago
```
gcc/testsuite/
	PR ada/115535
	* gnat.dg/put_image1.adb: New test.
```
8da27c7b

Add regression test · 0c002cce

Eric Botcazou authored 5 months ago

gcc/testsuite/
	PR ada/114636
	* gnat.dg/specs/generic_inst1.ads: New test.

0c002cce