Commits · ac2949574da9a668daad421d7edb79f172f73c6f · COBOLworx / gcc-cobol

Feb 09, 2023

OpenMP/Fortran: Partially fix non-rect loop nests [PR107424] · ac294957

Tobias Burnus authored 2 years ago

This patch ensures that loop bounds depending on outer loop vars use the
proper TREE_VEC format. It additionally gives a sorry if such an outer
var has a non-one/non-minus-one increment as currently a count variable
is used in this case (see PR).

Finally, it avoids 'count' and just uses a local loop variable if the
step increment is +/-1.

	PR fortran/107424

gcc/fortran/ChangeLog:

	* trans-openmp.cc (struct dovar_init_d): Add 'sym' and
	'non_unit_incr' members.
	(gfc_nonrect_loop_expr): New.
	(gfc_trans_omp_do): Call it; use normal loop bounds
	for unit stride - and only create local loop var.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
	* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.

ac294957

docs: add caveat for __builtin_cpu_supports · 1189d1b3

Martin Liska authored 2 years ago

Document that the function does not work correctly for old
VIA processors.

	PR target/100758

gcc/ChangeLog:

	* doc/extend.texi: Document that the function
	does not work correctly for old VIA processors.

1189d1b3

OpenMP: Parse align clause in allocate directive in C/C++ · 1eb78a93

Tobias Burnus authored 2 years ago

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_allocate): Parse align
	clause and check for restrictions.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_allocate): Parse align
	clause and check for restrictions.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/allocate-5.c: Extend for align clause.

1eb78a93

Fortran/OpenMP: Fix -fopenmp-simd for 'omp assume(s)' · ae091a44

Tobias Burnus authored 2 years ago

While 'omp assume' is enabled by -fopenmp-simd, 'omp assumes' is not;
however, due to the way parsing works in Fortran (esp. for fixed-form
source code), 'assumes' was parsed by 'assume' which then stumbled over
the trailing 's'.

gcc/fortran/

	* parse.cc (decode_omp_directive): Really ignore 'assumes' with
	-fopenmp-simd.

gcc/testsuite/

	* gfortran.dg/gomp/openmp-simd-8.f90: New test.

ae091a44

lto-wrapper: Pass through -funwind-tables and -fasynchronous-unwind-tables · 9453e3cd

Andreas Schwab authored 2 years ago

The -funwind-tables and -fasynchronous-unwind-tables options are relevant
for the output pass, so they need to be passed through by the LTO wrapper.
Otherwise, dwarf2out_assembly_start may output a ".cfi_sections
.debug_frame" directive when debug info is enabled even if every
translation unit was compiled with -funwind-tables.

gcc/
	* lto-wrapper.cc (merge_and_complain): Handle
	-funwind-tables and -fasynchronous-unwind-tables.
	(append_compiler_options): Likewise.

9453e3cd

c++: Mangle EXCESS_PRECISION_EXPR <REAL_CST> as fold_convert REAL_CST [PR108698] · b1ed0c96

Jakub Jelinek authored 2 years ago

For standard excess precision, like the C FE we parse floating
point constants as EXCESS_PRECISION_EXPR of promoted REAL_CST
rather than the nominal REAL_CST, and as the following testcase
shows the constants might need mangling.

The following patch mangles those as fold_convert of the REAL_CST
to EXCESS_PRECISION_EXPR type, i.e. how they were mangled before.

I'm not really sure EXCESS_PRECISION_EXPR can appear elsewhere
in expressions that would need mangling, tried various testcases
but haven't managed to come up with one.  If that is possible,
we'd keep ICEing on it without/with this patch, and the big question
is how to mangle those; they could be mangled as casts from the
promoted type back to nominal, but then in the mangled expressions
one could see the effects of excess precision.  Until we have
a reproducer, that is just theoretical though.

2023-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR c++/108698
	* mangle.cc (write_expression, write_template_arg): Handle
	EXCESS_PRECISION_EXPR with REAL_CST operand as
	write_template_arg_literal on fold_convert of the REAL_CST
	to EXCESS_PRECISION_EXPR type.

	* g++.dg/cpp0x/pr108698.C: New test.

b1ed0c96

tree-optimization/26854 - slow bitmap operations · 4b19ff1b

Richard Biener authored 2 years ago

With the compiler.i testcase from the PR one can see bitmap_set_bit
very high in the profile, originating from SSA update and alias
stmt walking.  For SSA update mark_block_for_update essentially
performs redundant bitmap_set_bits and is called via
insert_updated_phi_nodes_for as

      EXECUTE_IF_SET_IN_BITMAP (pruned_idf, 0, i, bi)
...
          mark_block_for_update (bb);
          FOR_EACH_EDGE (e, ei, bb->preds)
            if (e->src->index >= 0)
              mark_block_for_update (e->src);

which is quite random in the access pattern and runs into the
O(n) case of the linked list bitmap representation.  Switching
blocks_to_update to tree view around insert_updated_phi_nodes_for
improves SSA update time from

 tree SSA incremental               :   4.26 (  3%)

to

 tree SSA incremental               :   2.98 (  2%)

Likewise the visited bitmap allocated by the alias walker benefits
from using the tree view in case of large CFGs and we see an
improvement from

 alias stmt walking                 :  10.53 (  9%)

to

 alias stmt walking                 :   4.05 (  4%)

	PR tree-optimization/26854
	* tree-into-ssa.cc (update_ssa): Turn blocks_to_update to tree
	view around insert_updated_phi_nodes_for.
	* tree-ssa-alias.cc (maybe_skip_until): Allocate visited bitmap
	in tree view.
	(walk_aliased_vdefs_1): Likewise.

4b19ff1b

Daily bump. · f6fc79d0
GCC Administrator authored 2 years ago

f6fc79d0

Feb 08, 2023

c: Update checks on constexpr pointer initializers · 53678f7f

Joseph Myers authored 2 years ago

WG14 has agreed a change of the rules on constexpr pointer
initializers, so that a (constant) null value that is not a null
pointer constant is accepted in that context, rather than only
accepting null pointer constants.  (In particular, this means that a
constexpr variable of pointer type can be used to initialize another
such variable.)  Remove the null pointer constant restriction in GCC,
instead checking just whether the value is null.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
	* c-typeck.cc (check_constexpr_init): Remove argument
	null_pointer_constant.  Only check pointer initializers for being
	null.
	(digest_init): Update calls to check_constexpr_init.

gcc/testsuite/
	* gcc.dg/c2x-constexpr-1.c: Test initialization of constexpr
	pointers with null values that are not null pointer constants.
	* gcc.dg/c2x-constexpr-3.c: Test initialization of constexpr
	pointers with non-null values, not with null values that are not
	null pointer constants.

53678f7f

doc: Change fsf.org to www.fsf.org · 1a49390f

Gerald Pfeifer authored 2 years ago

fsf.org has been serving a 301 (permanent redirect) http response for
a long while.

gcc/ChangeLog:

	* doc/include/gpl_v3.texi: Change fsf.org to www.fsf.org.

1a49390f

testsuite: Fix asm-goto-with-outputs tests; limit to lra targets · 70888d09

Hans-Peter Nilsson authored 2 years ago

These tests spuriously lacked a "lra" limiter.  Code using
"asm goto" with outputs gets a:
 error: the target does not support 'asm goto' with outputs in 'asm'
compilation error when compiled for a non-LRA target.  Limit
to LRA targets, like the other asm-goto-with-outputs tests.

	* gcc.dg/torture/pr100398.c: Limit to lra targets.
	* gcc.dg/pr100590.c: Ditto.

70888d09

analyzer: fix overzealous state purging with on-stack structs [PR108704] · 77bb54b1

David Malcolm authored 2 years ago


PR analyzer/108704 reports many false positives seen from
-Wanalyzer-use-of-uninitialized-value on qemu's softfloat.c on code like
the following:

   struct st s;
   s = foo ();
   s = bar (s); // bogusly reports that s is uninitialized here

where e.g. "struct st" is "floatx80" in the qemu examples.

The root cause is overzealous purging of on-stack structs in the code I
added in r12-7718-gfaacafd2306ad7, where at:

	s = bar (s);

state_purge_per_decl::process_point_backwards "sees" the assignment to 's'
and stops processing, effectively treating 's' as unneeded before this
stmt, not noticing the use of 's' in the argument.

Fixed thusly.

The patch greatly reduces the number of
-Wanalyzer-use-of-uninitialized-value warnings from my integration tests:
  ImageMagick-7.1.0-57:  10 ->  6   (-4)
              qemu-7.2: 858 -> 87 (-771)
         haproxy-2.7.1:   1 ->  0   (-1)
All of the above that I've examined appear to be false positives.

gcc/analyzer/ChangeLog:
	PR analyzer/108704
	* state-purge.cc (state_purge_per_decl::process_point_backwards):
	Don't stop processing the decl if it's fully overwritten by
	this stmt if it's also used by this stmt.

gcc/testsuite/ChangeLog:
	PR analyzer/108704
	* gcc.dg/analyzer/uninit-7.c: New test.
	* gcc.dg/analyzer/uninit-pr108704.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

77bb54b1

arm: Optimize arm-mlib.h header inclusion [pr108505]. · 2eeda82d

Srinath Parvathaneni authored 2 years ago

I have committed a fix [1] into gcc trunk for a build
issue mentioned in pr108505 and later received a few upstream
comments proposing a more robust fix for this issue.

In this patch I'm addressing those comments and sending this
as a follow-up patch.

gcc/ChangeLog:

2023-01-27  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	PR target/108505
	* config.gcc (tm_mlib_file): Define new variable.

2eeda82d

Fortran: error handling of global entity appearing in COMMON block [PR103259] · 7e9f20f5

Steve Kargl authored 2 years ago

gcc/fortran/ChangeLog:

	PR fortran/103259
	* resolve.cc (resolve_common_vars): Avoid NULL pointer dereference
	when a symbol's location is not set.

gcc/testsuite/ChangeLog:

	PR fortran/103259
	* gfortran.dg/pr103259.f90: New test.

7e9f20f5

vect-patterns: Fix up vect_widened_op_tree [PR108692] · 6ad1c102

Jakub Jelinek authored 2 years ago

The following testcase is miscompiled on aarch64-linux since r11-5160.
Given
  <bb 3> [local count: 955630225]:
  # i_22 = PHI <i_20(6), 0(5)>
  # r_23 = PHI <r_19(6), 0(5)>
...
  a.0_5 = (unsigned char) a_15;
  _6 = (int) a.0_5;
  b.1_7 = (unsigned char) b_17;
  _8 = (int) b.1_7;
  c_18 = _6 - _8;
  _9 = ABS_EXPR <c_18>;
  r_19 = _9 + r_23;
...
where SSA_NAMEs 15/17 have signed char, 5/7 unsigned char and the rest
are int, we first pattern recognize c_18 as
patt_34 = (a.0_5) w- (b.1_7);
which is still correct, 5/7 are unsigned char subtracted in wider type,
but then vect_recog_sad_pattern turns it into
SAD_EXPR <a_15, b_17, r_23>
which is incorrect, because 15/17 are signed char and so it is
sum of absolute signed differences rather than unsigned sum of
absolute unsigned differences.
The reason why this happens is that vect_recog_sad_pattern calls
vect_widened_op_tree with MINUS_EXPR, WIDEN_MINUS_EXPR on the
patt_34 = (a.0_5) w- (b.1_7); statement's vinfo and vect_widened_op_tree
calls vect_look_through_possible_promotion on the operands of the
WIDEN_MINUS_EXPR, which looks through the further casts.
vect_look_through_possible_promotion has careful code to stop when there
would be nested casts that need to be preserved, but the problem here
is that the WIDEN_*_EXPR operation itself has an implicit cast on the
operands already - in this case of WIDEN_MINUS_EXPR the unsigned char
5/7 SSA_NAMEs are widened to unsigned short before the subtraction,
and vect_look_through_possible_promotion obviously isn't told about that.

Now, I think when we see those WIDEN_{MULT,MINUS,PLUS}_EXPR codes, we had
to look through possible promotions already when creating those, and so
calling vect_look_through_possible_promotion again isn't really needed; all
we need to do is arrange what that function would do if the operand isn't
the result of any cast.  The other option would be to let
vect_look_through_possible_promotion know about the implicit promotion from
the WIDEN_*_EXPR, but I'm afraid that would be much harder.

2023-02-08  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/108692
	* tree-vect-patterns.cc (vect_widened_op_tree): If rhs_code is
	widened_code which is different from code, don't call
	vect_look_through_possible_promotion but instead just check op is
	SSA_NAME with integral type for which vect_is_simple_use is true
	and call set_op on this_unprom.

	* gcc.dg/pr108692.c: New test.

6ad1c102

aarch64: Fix return_address_sign_ab_exception.C regression · b1d26458

Andrea Corallo authored 2 years ago

Hi all,

this is to fix the regression of
g++.target/aarch64/return_address_sign_ab_exception.C that I
introduced with d8dadbc9.

'aarch_ra_sign_key' for aarch64 ended up not being defined in the opt
file and the function attribute "branch-protection=pac-ret+leaf+b-key"
stopped working as expected.

This patch moves the definition of 'aarch_ra_sign_key' to the opt
files for both Arm back-ends.

Regards

  Andrea Corallo

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (aarch_ra_sign_key): Remove
	declaration.
	* config/aarch64/aarch64.cc (aarch_ra_sign_key): Remove
	definition.
	* config/aarch64/aarch64.opt (aarch64_ra_sign_key): Rename
	to 'aarch_ra_sign_key'.
	* config/arm/aarch-common.cc (aarch_ra_sign_key): Remove
	declaration.
	* config/arm/arm-protos.h (aarch_ra_sign_key): Likewise.
	* config/arm/arm.cc (enum aarch_key_type): Remove definition.
	* config/arm/arm.opt: Define.

b1d26458

testsuite: Import objc-dg-prune in execute.exp · 3d451c42

Richard Sandiford authored 2 years ago

The GCC-local definition of gcc-dg-prune removes extra error messages,
such as one from the linker warning about executable stacks.  This is
then used by tool-specific pruners like objc-dg-prune, defined in
objc-dg.exp.  However, objc/execute/execute.exp didn't include
objc-dg.exp, meaning that the linker warning could trigger a
failure in objc/execute/nested-func-1.m.

gcc/testsuite/
	* objc/execute/execute.exp: Load objc-dg.exp.

3d451c42

vect: Check gather/scatter offset types [PR108316] · 740a3be7

Richard Sandiford authored 2 years ago

The gather/scatter support can over-widen an offset if the target
requires it, but this relies on using a pattern sequence to add
the widening conversion.  That failed in the testcase because an
earlier pattern (bool) took priority.

I think we should allow patterns to be applied to other patterns,
but that's quite an invasive change and isn't suitable for stage 4.
This patch instead punts if the offset type doesn't match the
expected one.

If we switched to using the SLP representation for everything,
we would probably handle both patterns by rewriting the graph,
which should be much easier.

gcc/
	PR tree-optimization/108316
	* tree-vect-stmts.cc (get_load_store_type): When using
	internal functions for gather/scatter, make sure that the type
	of the offset argument is consistent with the offset vector type.

gcc/testsuite/
	PR tree-optimization/108316
	* gcc.dg/vect/pr108316.c: New test.

740a3be7

Revert "RA: Implement reuse of equivalent memory for caller saves optimization" · ad2bd0ad
Vladimir N. Makarov authored 2 years ago

This reverts commit f661c0bb.

ad2bd0ad

testsuite: Fix up PR108525 test [PR108525] · a58a4a57

Jakub Jelinek authored 2 years ago

Seems when committing the PR108525 fix I've missed that a test with
the same name had been added a few hours before for PR108526.

This patch separates the PR108525 test into a new file.

2023-02-08  Jakub Jelinek  <jakub@redhat.com>

	PR c++/108525
	* g++.dg/cpp23/static-operator-call5.C: Move PR108525 testcase
	incorrectly applied into PR108526 testcase ...
	* g++.dg/cpp23/static-operator-call6.C: ... here.  New test.

a58a4a57

tree.def: Remove outdated comment on SAD_EXPR · aa12d1b1

Jakub Jelinek authored 2 years ago

While looking at PR108692, I've noticed SAD_EXPR comment mentions that
WIDEN_MINUS_EXPR is missing, which is not true anymore since r11-5160.

The following patch just removes that part of the comment.

2023-02-08  Jakub Jelinek  <jakub@redhat.com>

	* tree.def (SAD_EXPR): Remove outdated comment about missing
	WIDEN_MINUS_EXPR.

aa12d1b1

Daily bump. · 8f3b85ef
GCC Administrator authored 2 years ago

8f3b85ef

Feb 07, 2023

Fix 'libgomp.fortran/reverse-offload-6.f90' nvptx offloading compilation · 7ab75a6e

Thomas Schwinge authored 2 years ago

Fix-up for recent commit 0b1ce70a
"libgomp: Fix reverse offload issues".

	libgomp/
	* testsuite/libgomp.fortran/reverse-offload-6.f90: Fix nvptx
	offloading compilation.

7ab75a6e

analyzer: fix -Wanalyzer-use-of-uninitialized-value false +ve on "read" [PR108661] · c300e251

David Malcolm authored 2 years ago


My integration testing shows many false positives from
-Wanalyzer-use-of-uninitialized-value.

One cause turns out to be that as of r13-1404-g97baacba963c06
fd_state_machine::on_stmt recognizes calls to "read", and returns true,
so that region_model::on_call_post doesn't call handle_unrecognized_call
on them, and so the analyzer erroneously "thinks" that the buffer
pointed to by "read" is never touched by the "read" call.

This works for "fread" because sm-file.cc implements kf_fread, which
handles calls to "fread" by clobbering the buffer pointed to.  In the
long term we should probably be smarter about this and bifurcate the
analysis to consider e.g. errors vs full reads vs partial reads, etc
(which I'm tracking in PR analyzer/108689).

In the meantime, this patch adds a kf_read for "read" analogous to the
one for "fread", fixing 6 false positives seen in git-2.39.0 and
2 in haproxy-2.7.1.

gcc/analyzer/ChangeLog:
	PR analyzer/108661
	* sm-fd.cc (class kf_read): New.
	(register_known_fd_functions): Register "read".
	* sm-file.cc (class kf_fread): Update comment.

gcc/testsuite/ChangeLog:
	PR analyzer/108661
	* gcc.dg/analyzer/fread-pr108661.c: New test.
	* gcc.dg/analyzer/read-pr108661.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c300e251

Fortran: ASSOCIATE variables should not be TREE_STATIC [PR95107] · c36f3da5

Harald Anlauf authored 2 years ago

gcc/fortran/ChangeLog:

	PR fortran/95107
	* trans-decl.cc (gfc_finish_var_decl): With -fno-automatic, do not
	make ASSOCIATE variables TREE_STATIC.

gcc/testsuite/ChangeLog:

	PR fortran/95107
	* gfortran.dg/save_7.f90: New test.

c36f3da5

doc: Update -fchar8_t documentation · 8bc87173

Marek Polacek authored 2 years ago

Since C++20 P2513R4, "char8_t Compatibility and Portability Fix", it is
no longer true that

  char ca[] = u8"xx";

causes an error, so adjust the example for -fchar8_t.

gcc/ChangeLog:

	* doc/invoke.texi: Update -fchar8_t documentation.

8bc87173

RA: Implement reuse of equivalent memory for caller saves optimization · f661c0bb

Vladimir N. Makarov authored 2 years ago

The test case shows an opportunity to reuse memory with a constant address
for the caller saves optimization around a constant or pure function call.
The patch implements the memory reuse.

        PR rtl-optimization/103541

gcc/ChangeLog:

	* ira.h (struct ira_reg_equiv_s): Add new field caller_save_p.
	* ira.cc (validate_equiv_mem): Check memref address variance.
	(update_equiv_regs): Define caller save equivalence for
	valid_combine.
	(setup_reg_equiv): Clear defined_p flag for caller save equivalence.
	* lra-constraints.cc (lra_copy_reg_equiv): Add new arg
	call_save_p.  Use caller save equivalence depending on the arg.
	(split_reg): Adjust the call.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr103541.c: New.

f661c0bb

tree-optimization/26854 - compile-time hog in SSA forwprop · 295adfc9

Richard Biener authored 2 years ago

The following addresses

 tree forward propagate             :  12.41 (  9%)

seen with the compile.i testcase of this PR, which points at
the has_use_on_stmt function; it is slow for SSA names with many
uses.  The solution is to look at stmt operands instead of
immediate uses to identify whether a name has a use on a stmt.
That improves SSA forwprop to

 tree forward propagate             :   1.30 (  0%)

for this testcase.

	PR tree-optimization/26854
	* gimple-fold.cc (has_use_on_stmt): Look at stmt operands
	instead of immediate uses.

295adfc9

ipa-split: Don't split returns_twice functions [PR106923] · 5321d532

Jakub Jelinek authored 2 years ago

As discussed in the PR, returns_twice functions are rare/special beasts
that need special treatment in the cfg, and inside of their bodies
we don't know which part actually works the weird returns twice way
(either in the fork/vfork sense, or in the setjmp) and aren't updating
ab edges to reflect that.

I think the easiest approach is just to never split these, like we already
never split noreturn or malloc functions.

2023-02-07  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/106923
	* ipa-split.cc (execute_split_functions): Don't split returns_twice
	functions.

	* gcc.dg/pr106923.c: New test.

5321d532

cgraph: Handle simd clones in cgraph_node::set_{const,pure}_flag [PR106433] · cad2412c

Jakub Jelinek authored 2 years ago

The following testcase ICEs, because we determine only in late pure const
pass that bar is const (the content of the function loses a store to a
global var during dse3 and read from it during cddce2) and local-pure-const2
makes it const.  The cgraph ordering is that post IPA (in late IPA simd
clones are created) bar is processed first, then foo as its caller, then
foo.simdclone* and finally bar.simdclone*.  Conceptually I think that is the
right ordering which allows for static simd clones to be removed.

The reason for the ICE is that because bar was marked const, the call to
it lost vops before vectorization, and when we in foo.simdclone* try to
vectorize the call to bar, we replace it with bar.simdclone* which hasn't
been marked const and so needs vops, which we don't add.

Now, because the simd clones are created from the same IL, just in a loop
with different argument/return value passing, I think generally if the base
function is determined to be const or pure, the simd clones should be too,
unless e.g. the vectorization causes different optimization decisions, but
then still the global memory reads if any shouldn't affect what the function
does and global memory stores shouldn't be reachable at runtime.

So, the following patch changes set_{const,pure}_flag to mark also simd
clones.

2023-02-07  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/106433
	* cgraph.cc (set_const_flag_1): Recurse on simd clones too.
	(cgraph_node::set_pure_flag): Call set_pure_flag_1 on simd clones too.

	* gcc.c-torture/compile/pr106433.c: New test.

cad2412c

testsuite: Expect -Wdeprecated warning in warn/Wstrict-aliasing-bogus-union-2.C for C++23 · 64b5ca43

Jakub Jelinek authored 2 years ago

On Mon, Feb 06, 2023 at 02:26:01PM +0000, Jonathan Wakely via Gcc-patches wrote:
> With the recent change to deprecate std::aligned_storage and
> std::aligned_union we need to adjust some tests that now fail with
> -std=c++23.

The g++.dg/warn/Wstrict-aliasing-bogus-union-2.C test is also affected:
PASS: g++.dg/warn/Wstrict-aliasing-bogus-union-2.C  -std=gnu++2b  (test for bogus messages, line 12)
FAIL: g++.dg/warn/Wstrict-aliasing-bogus-union-2.C  -std=gnu++2b (test for excess errors)
Excess errors:
.../gcc/testsuite/g++.dg/warn/Wstrict-aliasing-bogus-union-2.C:8:8: warning: 'template<long unsigned int _Len, long unsigned int _Align> struct std::aligned_storage' is deprecated [-

The following patch adds dg-warning for it.

2023-02-07  Jakub Jelinek  <jakub@redhat.com>

	* g++.dg/warn/Wstrict-aliasing-bogus-union-2.C: Expect
	-Wdeprecated warning for C++23.

64b5ca43

Enable 512 bit vector for zen4 · a7502c4a

Jan Hubicka authored 2 years ago

While 512-bit registers are internally split into two 256-bit halves,
512-bit vectors reduce the number of instructions to retire and have a
chance to improve parallelism.  A few tsvc benchmarks improve significantly:

           runtime
benchmark  256bit  512bit   change
s2275      48.57   20.67    -58%
s311       32.29   16.06    -50%
s312       32.30   16.07    -50%
vsumr      32.30   16.07    -50%
s314       10.77   5.42     -50%
s313       21.52   10.85    -50%
vdotr      43.05   21.69    -50%
s316       10.80   5.64     -48%
s235       61.72   33.91    -45%
s161       15.91   9.95     -38%
s3251      32.13   20.31    -36%

And there are no benchmarks with off-noise regression.  The basic matrix
multiplication loop improves by 32%.  It is also expected that 512-bit
vectors are more power efficient (I can't measure that).

The downside is that loops with low trip counts may get slower when the
unvectorized prologue and epilogue are hit more often.  With SPECfp this
problem happens with x264 (12% regression) and bwaves (6% regression)
and this is tracked in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410
and will need more work on the vectorizer to support masked epilogues.

After some additional testing it seems that using 512-bit vectors by
default is now the better choice overall.

Bootstrapped/regtested x86_64-linux. Plan to commit it tomorrow.

	* config/i386/x86-tune.def (X86_TUNE_AVX256_OPTIMAL): Turn off
	for znver4.

a7502c4a

Daily bump. · f0e73dd0
GCC Administrator authored 2 years ago

f0e73dd0

Feb 06, 2023

Modula2 meets clang [PR108135] · d5f933d2

Gaius Mulley authored 2 years ago


Remove unused function (and build warnings).

gcc/m2/ChangeLog:

	* gm2-compiler/M2Search.mod (DSdbEnter): Comment out.
	(DSdbExit): Comment out.

	PR modula2/108135

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

d5f933d2

libstdc++: Document P1642 and extensions · 9f4baed6

Arsen Arsenović authored 2 years ago

libstdc++-v3/ChangeLog:

	* doc/xml/manual/using.xml: Document newly-freestanding
	headers and the effect of the -ffreestanding flag.
	* doc/xml/manual/status_cxx2023.xml: Document P1642R11 as
	completed.
	* doc/xml/manual/configure.xml: Document that hosted installs
	respect __STDC_HOSTED__.
	* doc/xml/manual/test.xml: Document how to run tests in
	freestanding mode.
	* doc/html/*: Regenerate.

9f4baed6

Format error in m2pp.cc (m2pp_integer_cst) [PR107234] · 17d0892d

Gaius Mulley authored 2 years ago


Use HOST_WIDE_INT_PRINT_UNSIGNED instead of hardcoding a
specific format.

gcc/m2/ChangeLog:

	* m2pp.cc (m2pp_integer_cst): Use
	HOST_WIDE_INT_PRINT_UNSIGNED as the format specifier.

	PR modula2/107234
	    Co-Authored by: Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

17d0892d

amdgcn: Pass -mstack-size through to runtime · 45e01229

Andrew Stubbs authored 2 years ago

But only for the offload case.

gcc/ChangeLog:

	* config/gcn/mkoffload.cc (gcn_stack_size): New global variable.
	(process_asm): Create a constructor for GCN_STACK_SIZE.
	(main): Parse the -mstack-size option.

45e01229

Remove unused variables and procedures. · 74337475

Gaius Mulley authored 2 years ago


Remove unused variables and procedures (and remove build
warning clutter).

gcc/m2/ChangeLog:

	* gm2-compiler/M2Preprocess.mod (BaseName): Comment out.
	* gm2-lang.cc (opt): Remove.
	* gm2spec.cc (add_include): Remove.
	(full_libraries): Remove.
	(concat_option): Remove.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

74337475

aarch64: Fix up bfmlal lane pattern [PR104921] · 277e1f30

Alex Coplan authored 2 years ago

As the testcase shows, this pattern had an incorrect constraint leading
to GCC's output getting rejected by the assembler.

This patch fixes the constraint accordingly.

The test is split into two: one that can run without bf16 support from
the assembler and another that checks that the output actually assembles
when such support is available.

Bootstrapped/regtested on aarch64-linux-gnu.

OK for GCC 13? Or better to wait for next stage 1? What about backports?

Thanks,
Alex

gcc/ChangeLog:

	PR target/104921
	* config/aarch64/aarch64-simd.md (aarch64_bfmlal<bt>_lane<q>v4sf):
	Use correct constraint for operand 3.

gcc/testsuite/ChangeLog:

	PR target/104921
	* gcc.target/aarch64/pr104921-1.c: New test.
	* gcc.target/aarch64/pr104921-2.c: New test.
	* gcc.target/aarch64/pr104921.x: Include file for new tests.

277e1f30

libstdc++: Fix non-reserved name for template parameter · 0afcb713

Jonathan Wakely authored 2 years ago

libstdc++-v3/ChangeLog:

	* include/bits/ranges_algo.h (__find_last_fn): Rename T to _Tp.
	(__find_last_if_fn): Likewise.

0afcb713