Commits · cf5327b89ab610649c5faab78ea7907bb74b103c · COBOLworx / gcc-cobol

Dec 12, 2022

Fortran: improve checking of assumed-size array spec [PR102180] · cf5327b8

Harald Anlauf authored 2 years ago

gcc/fortran/ChangeLog:

	PR fortran/102180
	* array.cc (match_array_element_spec): Add check for bad
	assumed-implied-spec.
	(gfc_match_array_spec): Reorder logic so that the first bad array
	element spec may trigger an error.

gcc/testsuite/ChangeLog:

	PR fortran/102180
	* gfortran.dg/pr102180.f90: New test.

cf5327b8

d: Fix undefined reference to nested lambda in template (PR108055) · 9fe7d3de

Iain Buclaw authored 2 years ago

Sometimes, nested lambdas of templated functions get no code generation
due to them being marked as instantiated outside of all modules being
compiled in the current compilation unit.  This despite enclosing
template instances being marked as instantiated inside the current
compilation unit.  To fix, all enclosing templates are now checked in
`function_defined_in_root_p'.

Because of this change, `function_needs_inline_definition_p' has also
been fixed up to only check whether the regular function definition
itself is to be emitted in the current compilation unit.

	PR d/108055

gcc/d/ChangeLog:

	* decl.cc (function_defined_in_root_p): Check all enclosing template
	instances for definition in a root module.
	(function_needs_inline_definition_p): Replace call to
	function_defined_in_root_p with test for outer module `isRoot'.

gcc/testsuite/ChangeLog:

	* gdc.dg/torture/imports/pr108055conv.d: New.
	* gdc.dg/torture/imports/pr108055spec.d: New.
	* gdc.dg/torture/imports/pr108055write.d: New.
	* gdc.dg/torture/pr108055.d: New test.

9fe7d3de

AArch64: Enable TARGET_CONST_ANCHOR · 2d7c73ee

Wilco Dijkstra authored 2 years ago

Enable TARGET_CONST_ANCHOR to allow complex constants to be created via
immediate add/sub.  Use a 24-bit range as that enables a 3 or 4-instruction
immediate to be replaced by 2 add/sub instructions.  Fix the costing of
add/sub to support 24-bit and 12-bit shifted immediates.
The generated code for the testcase is now the same or better than LLVM.
It also results in a small codesize reduction on SPEC.
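
As a rough illustration of why a 24-bit range maps onto two add/sub
instructions: the AArch64 ADD/SUB immediate form encodes a 12-bit unsigned
value, optionally shifted left by 12.  The sketch below splits a 24-bit
delta from an already-materialized anchor constant into those two
immediates.  The names split_imm24 and add_imm24 are hypothetical and only
model the arithmetic; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* The two immediates of an ADD/SUB pair (illustrative type, not GCC's). */
struct imm_pair {
    uint32_t lo12;  /* used as:  add x0, x0, #lo12          */
    uint32_t hi12;  /* used as:  add x0, x0, #hi12, lsl #12 */
};

/* Split a delta in [0, 2^24) into a plain and a shifted 12-bit immediate. */
static struct imm_pair split_imm24(uint32_t delta)
{
    struct imm_pair p;

    assert(delta < (1u << 24));
    p.lo12 = delta & 0xfffu;
    p.hi12 = (delta >> 12) & 0xfffu;
    return p;
}

/* Model reaching anchor + delta with two additions. */
static uint64_t add_imm24(uint64_t anchor, uint32_t delta)
{
    struct imm_pair p = split_imm24(delta);
    uint64_t r = anchor + p.lo12;       /* first add  */

    r += (uint64_t)p.hi12 << 12;        /* second add */
    return r;
}
```

For example, a delta of 0xabcdef splits into 0xdef and 0xabc, so any
constant within that distance of the anchor needs at most two adds (or two
subs for a negative delta).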

gcc/
	* config/aarch64/aarch64.cc (aarch64_rtx_costs): Add correct costs
	for 24-bit and 12-bit shifted immediate add/sub.
	(TARGET_CONST_ANCHOR): Define.
	* config/aarch64/predicates.md (aarch64_pluslong_immediate):
	Fix range check.

gcc/testsuite/
	* gcc.target/aarch64/movk_3.c: New test.

2d7c73ee

middle-end: simplify complex if expressions where comparisons are inverse of one another. · 4d9db4bd

Tamar Christina authored 2 years ago

This optimizes the following sequence

  ((a < b) & c) | ((a >= b) & d)

into

  (a < b ? c : d) & 1

for scalars; for vectors the & 1 can be omitted.

Also recognizes

  (-(a < b) & c) | (-(a >= b) & d)

into

  a < b ? c : d

This changes the code generation from

zoo2:
	cmp     w0, w1
	cset    w0, lt
	cset    w1, ge
	and     w0, w0, w2
	and     w1, w1, w3
	orr     w0, w0, w1
	ret

into

	cmp	w0, w1
	csel	w0, w2, w3, lt
	and	w0, w0, 1
	ret

and significantly reduces the number of selects we have to do in the vector
code.
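
The equivalence behind both rewrites can be checked with a small standalone
program.  This is only a sketch of the scalar identities quoted above; the
function names are made up for illustration and the test ranges are
arbitrary.

```c
#include <assert.h>

/* Original scalar form: two comparisons, two ANDs and an OR. */
static int select_before(int a, int b, int c, int d)
{
    return ((a < b) & c) | ((a >= b) & d);
}

/* Simplified scalar form: one comparison feeding a select, masked to bit 0. */
static int select_after(int a, int b, int c, int d)
{
    return (a < b ? c : d) & 1;
}

/* Negated-comparison form: -(cond) is all-ones or zero, so it acts as a mask. */
static int mask_before(int a, int b, int c, int d)
{
    return (-(a < b) & c) | (-(a >= b) & d);
}

/* Simplified form of the mask variant: a plain select, no "& 1" needed. */
static int mask_after(int a, int b, int c, int d)
{
    return a < b ? c : d;
}

/* Compare both pairs over a small range of inputs. */
static void check_equivalence(void)
{
    for (int a = -2; a <= 2; a++)
        for (int b = -2; b <= 2; b++)
            for (int c = -3; c <= 3; c++)
                for (int d = -3; d <= 3; d++) {
                    assert(select_before(a, b, c, d) == select_after(a, b, c, d));
                    assert(mask_before(a, b, c, d) == mask_after(a, b, c, d));
                }
}
```

Because (a < b) and (a >= b) are exact inverses, exactly one of the two AND
terms survives, which is what lets a single select replace both.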

gcc/ChangeLog:

	* match.pd: Add new rule.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/if-compare_1.c: New test.
	* gcc.target/aarch64/if-compare_2.c: New test.

4d9db4bd

AArch64: Fix vector re-interpretation between partial SIMD modes · 594264e9

Tamar Christina authored 2 years ago

While writing a patch series I started getting incorrect codegen out from
VEC_PERM on partial struct types.

It turns out that this was happening because the TARGET_CAN_CHANGE_MODE_CLASS
implementation has a slight bug in it.  The hook only checked for SIMD to
Partial but never Partial to SIMD.  This resulted in incorrect subregs being
generated from the fallback code in VEC_PERM_EXPR expansions.

I have unfortunately not been able to trigger it using a standalone testcase as
the mid-end optimizes away the permute every time I try to describe a permute
that would result in the bug.

The patch now rejects any conversion of partial SIMD struct types, unless they
are both partial structures of the same number of registers or one is a SIMD
type whose size is less than 8 bytes.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_can_change_mode_class): Restrict
	conversions between partial struct types properly.

594264e9

AArch64: Support new tbranch optab. · 17ae956c

Tamar Christina authored 2 years ago

This implements the new tbranch optab for AArch64.

We cannot emit one big RTL for the final instruction immediately.
The reason that all comparisons in the AArch64 backend expand to separate CC
compares, and separate testing of the operands is for ifcvt.

The separate CC compare is needed so ifcvt can produce csel, cset etc from the
compares.  Unlike say combine, ifcvt can not do recog on a parallel with a
clobber.  Should we emit the instruction directly then ifcvt will not be able
to, say, make a csel, because we have no patterns which handle zero_extract and
compare (unlike combine, ifcvt cannot transform the extract into an AND).

While you could provide various patterns for this (and I did try) you end up
with broken patterns because you can't add the clobber to the CC register.  If
you do, ifcvt recog fails.

i.e.

int
f1 (int x)
{
  if (x & 1)
    return 1;
  return x;
}

We lose csel here.

Secondly the reason the compare with an explicit CC mode is needed is so that
ifcvt can transform the operation into a version that doesn't require the flags
to be set.  But it only does so if it knows the explicit usage of the CC reg.

For instance

int
foo (int a, int b)
{
  return ((a & (1 << 25)) ? 5 : 4);
}

Doesn't require a comparison, the optimal form is:

foo(int, int):
        ubfx    x0, x0, 25, 1
        add     w0, w0, 4
        ret

and no compare is actually needed.  If you represent the instruction using an
ANDS instead of a zero_extract then you get close, but you end up with an ands
followed by an add, which is a slower operation.
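
The ubfx/add sequence corresponds to extracting bit 25 and adding it to 4.
A sketch of that identity in C, with hypothetical function names and the
unused second parameter of foo dropped for brevity:

```c
#include <assert.h>
#include <limits.h>

/* The source form from the example: test bit 25, then select 5 or 4. */
static int foo_branchy(int a)
{
    return (a & (1 << 25)) ? 5 : 4;
}

/* Flag-free form: extract bit 25 (ubfx) and add it to 4 (add). */
static int foo_extract(int a)
{
    return (int)(((unsigned)a >> 25) & 1u) + 4;
}

/* Both forms agree on a few boundary values. */
static void check_foo(void)
{
    const int samples[] = {
        0, 1, 1 << 25, (1 << 25) - 1, (1 << 25) + 1, -1, INT_MIN, INT_MAX
    };

    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(foo_branchy(samples[i]) == foo_extract(samples[i]));
}
```

Since the two results differ by exactly the value of the tested bit, no
condition flags are needed at all.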

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*tb<optab><mode>1): Rename to...
	(*tb<optab><ALLI:mode><GPI:mode>1): ... this.
	(tbranch_<code><mode>4): New.
	* config/aarch64/iterators.md(ZEROM, zerom): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/tbz_1.c: New test.

17ae956c

middle-end: Add new tbranch optab to add support for bit-test-and-branch operations · dc582d2e

Tamar Christina authored 2 years ago

This adds a new test-and-branch optab that can be used to do a conditional test
of a bit and branch.   This is similar to the cbranch optab but instead can
test any arbitrary bit inside the register.

This patch recognizes boolean comparisons and single bit mask tests.

gcc/ChangeLog:

	* dojump.cc (do_jump): Pass along value.
	(do_jump_by_parts_greater_rtx): Likewise.
	(do_jump_by_parts_zero_rtx): Likewise.
	(do_jump_by_parts_equality_rtx): Likewise.
	(do_compare_rtx_and_jump): Likewise.
	(do_compare_and_jump): Likewise.
	* dojump.h (do_compare_rtx_and_jump): New.
	* optabs.cc (emit_cmp_and_jump_insn_1): Refactor to take optab to check.
	(validate_test_and_branch): New.
	(emit_cmp_and_jump_insns): Optionally take a value, and when value is
	supplied then check if it's suitable for tbranch.
	* optabs.def (tbranch_eq$a4, tbranch_ne$a4): New.
	* doc/md.texi (tbranch_@var{op}@var{mode}4): Document it.
	* optabs.h (emit_cmp_and_jump_insns): New.
	* tree.h (tree_zero_one_valued_p): New.

dc582d2e

aarch64: Make existing V2HF be usable. · 2cba118e

Tamar Christina authored 2 years ago

The backend has an existing V2HFmode that is used by pairwise operations.
This mode was however never made fully functional.  Amongst other things it was
never declared as a vector type which made it unusable from the mid-end.

It's also lacking an implementation for load/stores so reload ICEs if this mode
is ever used.  This finishes the implementation by providing the above.

Note that I have created a new iterator VHSDF_P instead of extending VHSDF
because the previous iterator is used in far more things than just load/stores.

It's also used for instance in intrinsics and extending this would force me to
provide support for mangling the type while we never expose it through
intrinsics.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): New.
	(mov<mode>, movmisalign<mode>, aarch64_dup_lane<mode>,
	aarch64_store_lane0<mode>, aarch64_simd_vec_set<mode>,
	@aarch64_simd_vec_copy_lane<mode>, vec_set<mode>,
	reduc_<optab>_scal_<mode>, reduc_<fmaxmin>_scal_<mode>,
	aarch64_reduc_<optab>_internal<mode>, aarch64_get_lane<mode>,
	vec_init<mode><Vel>, vec_extract<mode><Vel>): Support V2HF.
	(aarch64_simd_dupv2hf): New.
	* config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
	Add E_V2HFmode.
	* config/aarch64/iterators.md (VHSDF_P): New.
	(V2F, VMOVE, nunits, Vtype, Vmtype, Vetype, stype, VEL,
	Vel, q, vp): Add V2HF.
	* config/arm/types.md (neon_fp_reduc_add_h): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/slp_1.c: Update testcase.

2cba118e

libstdc++: Add a test checking for chrono::duration overflows · dc94eaab

Jonathan Wakely authored 2 years ago

This test fails if chrono::days::rep or chrono::years::rep is a 32-bit
type, because a large days or years value silently overflows a 32-bit
integer when converted to seconds. It would be conforming to implement
chrono::days as chrono::duration<int32_t, ratio<86400>>, but would make
this overflow case more likely. Similarly for chrono::years,
chrono::months and chrono::weeks. This test is here to remind us not to
make that change lightly.

libstdc++-v3/ChangeLog:

	* testsuite/20_util/duration/arithmetic/overflow_c++20.cc: New
	test.

dc94eaab

libstdc++: Fix constraint on std::basic_format_string [PR108024] · 6c0f9584

Jonathan Wakely authored 2 years ago

Also remove some redundant std::move calls for return statements.

libstdc++-v3/ChangeLog:

	PR libstdc++/108024
	* include/std/format (basic_format_string): Fix constraint.
	* testsuite/std/format/format_string.cc: New test.

6c0f9584

libstdc++: Change names that clash with Win32 or Clang · cb363fd9

Jonathan Wakely authored 2 years ago

Clang now defines an __is_unsigned built-in, and Windows defines an
_Out_ macro. Replace uses of those as identifiers.

There might also be a problem with __is_signed, which we use in several
places.

libstdc++-v3/ChangeLog:

	* include/std/chrono (hh_mm_ss): Rename __is_unsigned member to
	_S_is_unsigned.
	* include/std/format (basic_format_context): Rename _Out_
	template parameter to _Out2.
	* testsuite/17_intro/names.cc: Add Windows SAL annotation
	macros.

cb363fd9

libstdc++: Define atomic lock-free type aliases for C++20 [PR98034] · 320ac807

Jonathan Wakely authored 2 years ago

libstdc++-v3/ChangeLog:

	PR libstdc++/98034
	* include/std/atomic (__cpp_lib_atomic_lock_free_type_aliases):
	Define macro.
	(atomic_signed_lock_free, atomic_unsigned_lock_free): Define
	aliases.
	* include/std/version (__cpp_lib_atomic_lock_free_type_aliases):
	Define macro.
	* testsuite/29_atomics/atomic/lock_free_aliases.cc: New test.

320ac807

libstdc++: Make operator<< for stacktraces less templated (LWG 3515) · 2327d933

Jonathan Wakely authored 2 years ago

This change was approved for C++23 last month.

libstdc++-v3/ChangeLog:

	* include/std/stacktrace (operator<<): Only output to narrow
	ostreams (LWG 3515).
	* testsuite/19_diagnostics/stacktrace/synopsis.cc:

2327d933

mklog: do not parse binary file for PR entry · 14d0f82c
Martin Liska authored 2 years ago
```
contrib/ChangeLog:

	* mklog.py: Do not search PR entry in a file that is binary.
```
14d0f82c

aarch64: Add __ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI ACLE defines · 688f4eb2

Kyrylo Tkachov authored 2 years ago

Recent ACLE additions specified the __ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI macros [1] that the compiler
should define when the pointer authentication and BTI instructions are available (and don't act as NOPs).
We've received requests to enable them in GCC for aarch64, similar to clang [2].
It's a fairly simple patch and should be non-intrusive at this stage.
Pointer authentication has its own "pauth" feature flag, whereas BTI depends on an architecture level
of Armv8.5-a or later.

Bootstrapped and tested on aarch64-none-linux-gnu.

[1] https://github.com/ARM-software/acle/blob/main/main/acle.md#pointer-authentication
[2] https://reviews.llvm.org/rG7d40baa82b1f272f68de63f3c4f68d970bdcd6ed

gcc/ChangeLog:

	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define
	__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI when appropriate.
	* config/aarch64/aarch64.h (TARGET_BTI): Define.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/acle/bti_def.c: New test.
	* gcc.target/aarch64/acle/pauth_def.c: New test.

688f4eb2

Revert parts of ADDR_EXPR/CONSTRUCTOR treatment change in match.pd · 49bf49bb

Richard Biener authored 2 years ago

This reverts the part that substitutes from the definition of an
SSA name to the capture, thus ADDR_EXPR@0 eventually yielding
&y_1->a[i_2] instead of _3.  That's because I didn't think of
how to deal with substituting @0 in the result pattern.  So
the following re-instantiates the SSA def CONSTRUCTOR handling
and in the ADDR_EXPR helpers used by match.pd handles SSA names
defined to ADDR_EXPRs transparently.

	* genmatch.cc (dt_simplify::gen): Revert last change.
	* match.pd: Revert simplification of CONSTRUCTOR leaf handling.
	(&x cmp SSA_NAME): Handle ADDR_EXPR in SSA defs.
	* fold-const.cc (split_address_to_core_and_offset): Handle
	ADDR_EXPRs in SSA defs.
	(address_compare): Likewise.

49bf49bb

tree-optimization/89317 - another pattern for &p->x != p + 4 · 2dc5d6b1

Richard Biener authored 2 years ago

As seen in the original testcase for PR89317 we are missing
comparison simplification patterns for &p->x != p + 4.  Fixed
by making an existing one apply.  To make the pattern apply
during CCP we need to simplify ccp_fold to not use GENERIC
folding of conditions but also use GIMPLE folding.

	PR tree-optimization/89317
	* tree-ssa-ccp.cc (ccp_fold): Handle GIMPLE_COND via
	gimple_fold_stmt_to_constant_1.
	* match.pd (&a != &a + c): Apply to pointer_plus with non-ADDR_EXPR
	base as well.

	* gcc.dg/tree-ssa/pr89317.c: Amend.

2dc5d6b1

Daily bump. · 324e9953
GCC Administrator authored 2 years ago

324e9953

Dec 11, 2022

Fortran: fix ICE on bad use of statement function [PR107995] · 8f72249f

Steve Kargl authored 2 years ago

gcc/fortran/ChangeLog:

	PR fortran/107995
	* interface.cc (gfc_check_dummy_characteristics): Reject statement
	function dummy arguments.

gcc/testsuite/ChangeLog:

	PR fortran/107995
	* gfortran.dg/pr107995.f90: New test.

8f72249f

d: Fix internal compiler error: in visit, at d/imports.cc:72 (PR108050) · d9d8c967

Iain Buclaw authored 2 years ago

The visitor for lowering IMPORTED_DECLs did not have an override for
dealing with importing OverloadSet symbols.  This has now been
implemented in the code generator.

	PR d/108050

gcc/d/ChangeLog:

	* decl.cc (DeclVisitor::visit (Import *)): Handle build_import_decl
	returning a TREE_LIST.
	* imports.cc (ImportVisitor::visit (OverloadSet *)): New override.

gcc/testsuite/ChangeLog:

	* gdc.dg/imports/pr108050/mod1.d: New.
	* gdc.dg/imports/pr108050/mod2.d: New.
	* gdc.dg/imports/pr108050/package.d: New.
	* gdc.dg/pr108050.d: New test.

d9d8c967

unidiff: use newline='\n' argument · b0451799

Martin Liska authored 2 years ago

In order to support CR on a line, we need to open files
with newline='\n' as our line endings are supposed to be of UNIX style.

contrib/ChangeLog:

	* check_GNU_style.py: Use newline=\n.
	* check_GNU_style_lib.py: Simplify.
	* gcc-changelog/git_commit.py: Fix issues seen in
	Rust patchset.
	* gcc-changelog/git_email.py: Use newline argument.
	* gcc-changelog/test_email.py: New test.
	* gcc-changelog/test_patches.txt: New test.
	* mklog.py: Use newline argument.

b0451799

d: Merge upstream dmd, druntime c8ae4adb2e, phobos 792c8b7c1. · 6d799f0a

Iain Buclaw authored 2 years ago

D front-end changes:

	- Import dmd v2.101.0.
	- Deprecate the ability to call `__traits(getAttributes)' on
	  overload sets.
	- Deprecate non-empty `for' statement increment clause with no
	  effect.
	- Array literals assigned to `scope' array variables can now be
	  allocated on the stack.

D runtime changes:

	- Import druntime v2.101.0.

Phobos changes:

	- Import phobos v2.101.0.

gcc/d/ChangeLog:

	* dmd/MERGE: Merge upstream dmd c8ae4adb2e.
	* typeinfo.cc (check_typeinfo_type): Update for new front-end
	interface.
	(TypeInfoVisitor::visit (TypeInfoStructDeclaration *)): Remove warning
	that toHash() must be declared `nothrow @safe'.

libphobos/ChangeLog:

	* libdruntime/MERGE: Merge upstream druntime c8ae4adb2e.
	* src/MERGE: Merge upstream phobos 792c8b7c1.

6d799f0a

d: Expand bsr intrinsic as `clz(arg) ^ (argsize - 1)' · cc7f509d

Iain Buclaw authored 2 years ago

As well as removing unnecessary casts, this results in fewer temporaries
being generated during the initial gimple lowering pass.  Otherwise the
code generated is identical to the former intrinsic expansion.
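
The two forms compute the same value because, for a power-of-two bit width,
argsize - 1 is an all-ones mask and subtracting a value in [0, argsize - 1]
from it equals XOR-ing with it.  A minimal sketch in C, assuming the
GCC/Clang builtin __builtin_clz and hypothetical function names; it models
the arithmetic only, not the D front end's tree building:

```c
#include <assert.h>
#include <limits.h>

/* Highest set bit index via subtraction: (argsize - 1) - clz(x). */
static int bsr_minus(unsigned x)
{
    assert(x != 0);  /* __builtin_clz is undefined for 0 */
    return (int)(sizeof x * CHAR_BIT - 1) - __builtin_clz(x);
}

/* Same result via XOR: clz(x) ^ (argsize - 1). */
static int bsr_xor(unsigned x)
{
    assert(x != 0);
    return __builtin_clz(x) ^ (int)(sizeof x * CHAR_BIT - 1);
}

/* Both forms agree for every single-bit input. */
static void check_bsr(void)
{
    for (unsigned i = 0; i < sizeof(unsigned) * CHAR_BIT; i++) {
        assert(bsr_minus(1u << i) == (int)i);
        assert(bsr_xor(1u << i) == (int)i);
    }
}
```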

gcc/d/ChangeLog:

	* intrinsics.cc (expand_intrinsic_bsf): Fix comment.
	(expand_intrinsic_bsr): Use BIT_XOR_EXPR instead of MINUS_EXPR.

cc7f509d

tree-optimization/89317 - missed folding of (p + 4) - &p->d · d13b86f9

Richard Biener authored 2 years ago

The PR notices we fail to simplify

  a_4 = &x_3(D)->data;
  b_5 = x_3(D) + 16;
  _1 = b_5 - a_4;

together with the enabler handling ADDR_EXPR leafs in separate
stmts in match.pd the suggested patterns work.
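
In C terms, the folded result is a byte offset minus the member's offset.
The sketch below uses a hypothetical struct blob and spells out the GIMPLE
byte arithmetic with char pointers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; only the position of "data" matters here. */
struct blob {
    long header;
    char data[24];
};

/* The unsimplified computation: (x + 16) - &x->data, in bytes. */
static ptrdiff_t diff_direct(struct blob *x)
{
    assert(x != NULL);
    char *a = x->data;           /* a = &x->data */
    char *b = (char *)x + 16;    /* b = x + 16   */
    return b - a;
}

/* The folded form: b - offsetof(c), independent of the pointer value. */
static ptrdiff_t diff_folded(void)
{
    return 16 - (ptrdiff_t)offsetof(struct blob, data);
}
```

diff_direct returns the same value as diff_folded for any valid pointer,
which is why the subtraction can be folded to a constant.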

	PR tree-optimization/89317
	* match.pd ((p + b) - &p->c -> b - offsetof(c)): New patterns.

	* gcc.dg/tree-ssa/pr89317.c: New testcase.

d13b86f9

Treat ADDR_EXPR and CONSTRUCTOR as GIMPLE/GENERIC magically · 26295a06

Richard Biener authored 2 years ago

The following allows matching ADDR_EXPR for both the invariant
&a.b case as well as the &p->d case in a separate definition
transparently.  This also allows removing the hack we employ
for CONSTRUCTOR which we handle for example with

 (match vec_same_elem_p
  CONSTRUCTOR@0
  (if (TREE_CODE (@0) == SSA_NAME
       && uniform_vector_p (gimple_assign_rhs1 (SSA_NAME_DEF_STMT (@0))))))

Note CONSTRUCTORs always appear as separate definition in GIMPLE,
but I continue to play safe and ADDR_EXPRs are now matched in
both places where previously ADDR_EXPR@0 would have missed
the &p->x case.

This is a prerequisite for the PR89317 fix.

	* genmatch.cc (dt_node::gen_kids): Handle ADDR_EXPR in both
	the GENERIC and GIMPLE op position.
	(dt_simplify::gen): Capture both GENERIC and GIMPLE op
	position for ADDR_EXPR and CONSTRUCTOR.
	* match.pd: Simplify CONSTRUCTOR leaf handling.

	* gcc.dg/tree-ssa/forwprop-3.c: Adjust.
	* g++.dg/tree-ssa/pr31146-2.C: Likewise.

26295a06

tree-optimization/106904 - bogus -Wstringop-overflow with vectors · f8d136e5

Richard Biener authored 2 years ago

The following avoids CSE of &ps->wp to &ps->wp.hwnd confusing
-Wstringop-overflow by making sure to produce addresses to the
biggest container from vectorization.  For this I introduce
strip_zero_offset_components which turns &ps->wp.hwnd into
&(*ps) and use that to base the vector data references on.
That will also work for addresses with variable components,
alternatively emitting pointer arithmetic via calling
get_inner_reference and gimplifying that would be possible
but likely more intrusive.
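
The stripping is valid because components at offset zero share the address
of their container.  A small C sketch of that fact, with struct layouts
invented for illustration (the real testcase's types may differ):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical nested layout with zero-offset leading members. */
struct wparam {
    void *hwnd;
    int msg;
};

struct state {
    struct wparam wp;
    int other;
};

/* Returns 1 when &ps->wp.hwnd, &ps->wp and &(*ps) are the same address. */
static int same_address(struct state *ps)
{
    assert(ps != NULL);
    void *innermost = &ps->wp.hwnd;  /* smallest object                */
    void *middle = &ps->wp;          /* after stripping one component  */
    void *outermost = ps;            /* &(*ps), the biggest container  */
    return innermost == middle && middle == outermost;
}
```

All three pointers compare equal, but the size of the object they nominally
refer to differs, which is what an object-size based warning is sensitive
to.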

This is by no means a complete fix for all of those issues
(avoiding ADDR_EXPRs in favor of pointer arithmetic might be).
Other passes will have similar issues.

In theory that might now cause false negatives.

	PR tree-optimization/106904
	* tree.h (strip_zero_offset_components): Declare.
	* tree.cc (strip_zero_offset_components): Define.
	* tree-vect-data-refs.cc (vect_create_addr_base_for_vector_ref):
	Strip zero offset components before building the address.

	* gcc.dg/Wstringop-overflow-pr106904.c: New testcase.

f8d136e5

fortran/openmp.cc: Remove 's' that slipped in during %<..%> replacement · 045592f6

Tobias Burnus authored 2 years ago

Seemingly, 's' (in VI that's the 's'ubstitute command) appeared verbatim in
a gfc_error message when doing the '...' to %<...%> replacements in commit
r13-4590-g84f6f8a2a97f88be01e223c9c9dbab801a4f501f

gcc/fortran/
	* openmp.cc (gfc_match_omp_context_selector_specification):
	Remove spurious 's' in an error message.

045592f6

Daily bump. · c6b12b80
GCC Administrator authored 2 years ago

c6b12b80

Dec 10, 2022

Fortran: reject bad SIZE argument while simplifying ISHFTC [PR106911] · ae443853

Harald Anlauf authored 2 years ago

gcc/fortran/ChangeLog:

	PR fortran/106911
	* simplify.cc (gfc_simplify_ishftc): If the SIZE argument is known
	to be outside the allowed range, terminate simplification.

gcc/testsuite/ChangeLog:

	PR fortran/106911
	* gfortran.dg/pr106911.f90: New test.

ae443853

ivopts: Fix IP_END handling for asm goto [PR107997] · 7676235f

Jakub Jelinek authored 2 years ago

The following testcase ICEs, because the latch bb ends with
asm goto which has both fallthrough to the header and one or more labels
in the header too.  In that case there is just a single edge out of the
latch block, but still the asm goto is stmt_ends_bb_p statement, yet
ivopts decides to emit an IV bump at the IP_END position and inserts
it into the same bb as the asm goto after it, which then fails verification
(control flow in the middle of bb).

The following patch fixes it by splitting the latch -> header edge in that
case and inserting into the newly created bb, where split_edge ->
redirect_edge_and_branch is able to deal with this case correctly.

2022-12-10  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/107997
	* tree-ssa-loop-ivopts.cc: Include cfganal.h.
	(create_new_iv) <case IP_END>: If ip_end_pos bb is non-empty and ends
	with a stmt which ends bb, instead of adding iv update after it split
	the latch edge and insert iterator into the new latch bb.

	* gcc.c-torture/compile/pr107997.c: New test.

7676235f

libgomp: Handle OpenMP's reverse offloads · ea4b23d9

Tobias Burnus authored 2 years ago

This commit enabled reverse offload for nvptx such that gomp_target_rev
actually gets called.  And it fills the latter function to do all of
the following: finding the host function to the device func ptr and
copying the arguments to the host, processing the mapping/firstprivate,
calling the host function, copying back the data and freeing as needed.

The data handling is made easier by assuming that all host variables
either existed before (and are in the mapping) or that those are
device variables not yet available on the host. Thus, the reverse
mapping can do without refcounts etc. Note that the spec disallows
inside a target region device-affecting constructs other than target
plus ancestor device-modifier and it also limits the clauses permitted
on this construct.

For the function addresses, an additional splay tree is used; for
the lookup of mapped variables, the existing splay-tree is used.
Unfortunately, its data structure requires a full walk of the tree;
additionally, the just mapped variables are recorded in a separate
data structure, requiring an extra lookup. While the lookup is slow, assuming
that only few variables get mapped in each reverse offload construct
and that reverse offload is the exception and not performance critical,
this seems to be acceptable.
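
The full-walk lookup with early exit can be pictured as follows.  This is a
simplified sketch on a plain binary tree, not libgomp's splay tree; all
names here (node, walk_lazy, find_cb, find_data) are hypothetical.  The
tree is ordered by key while the search is by value, so every node may have
to be visited, and a callback returning nonzero stops the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Tree ordered by "key"; the reverse lookup searches by "value". */
struct node {
    int key;
    int value;
    struct node *left;
    struct node *right;
};

/* Callback type: a nonzero return value stops the traversal. */
typedef int (*visit_stop_fn)(struct node *n, void *data);

/* In-order walk that exits as soon as the callback returns nonzero. */
static int walk_lazy(struct node *n, visit_stop_fn cb, void *data)
{
    assert(cb != NULL);
    if (n == NULL)
        return 0;
    if (walk_lazy(n->left, cb, data))
        return 1;
    if (cb(n, data))
        return 1;
    return walk_lazy(n->right, cb, data);
}

/* Search state handed through the void pointer. */
struct find_data {
    int wanted;          /* value to look for           */
    struct node *found;  /* result, NULL if not present */
    int visited;         /* number of nodes inspected   */
};

/* Record the first node whose value matches and request a stop. */
static int find_cb(struct node *n, void *data)
{
    struct find_data *fd = data;

    fd->visited++;
    if (n->value == fd->wanted) {
        fd->found = n;
        return 1;
    }
    return 0;
}
```

The lookup is linear in the worst case, which matches the commit's remark
that it is slow but acceptable when only a few variables are mapped.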

libgomp/ChangeLog:

	* libgomp.h (struct target_mem_desc): Predeclare; move
	below after 'reverse_splay_tree_node' and add rev_array
	member.
	(struct reverse_splay_tree_key_s, reverse_splay_compare): New.
	(reverse_splay_tree_node, reverse_splay_tree,
	reverse_splay_tree_key): New typedef.
	(struct gomp_device_descr): Add mem_map_rev member.
	* oacc-host.c (host_dispatch): NULL init .mem_map_rev.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim
	support for GOMP_REQUIRES_REVERSE_OFFLOAD.
	* splay-tree.h (splay_tree_callback_stop): New typedef; like
	splay_tree_callback but returning int not void.
	(splay_tree_foreach_lazy): Define; like splay_tree_foreach but
	taking splay_tree_callback_stop as argument.
	* splay-tree.c (splay_tree_foreach_internal_lazy,
	splay_tree_foreach_lazy): New; but early exit if callback returns
	nonzero.
	* target.c: Instantiate splay_tree_c with splay_tree_prefix 'reverse'.
	(gomp_map_lookup_rev): New.
	(gomp_load_image_to_device): Handle reverse-offload function
	lookup table.
	(gomp_unload_image_from_device): Free devicep->mem_map_rev.
	(struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup,
	gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int,
	gomp_map_cdata_lookup): New auxiliary structs and functions for
	gomp_target_rev.
	(gomp_target_rev): Implement reverse offloading and its mapping.
	(gomp_target_init): Init current_device.mem_map_rev.root.
	* testsuite/libgomp.fortran/reverse-offload-2.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-3.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-4.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-5.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without
	mapping of on-device allocated variables.

ea4b23d9

Add initial ChangeLogs for modula2. · 68ee8a64

Gaius Mulley authored 2 years ago


Add initial ChangeLog file in libgm2 and gcc/m2.

ChangeLog:

	* libgm2: (New directory).
	* libgm2/ChangeLog: (New file).

gcc/ChangeLog:

	* m2: (New directory).
	* m2/ChangeLog: (New file).

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

68ee8a64

Add stub 'gcc/rust/ChangeLog' · 24ff0b3e
Thomas Schwinge authored 2 years ago

24ff0b3e

Fortran: Replace simple '.' quotes by %<.%> · 84f6f8a2

Tobias Burnus authored 2 years ago

Using %qs instead of '%s' or %<=%> instead of '=' looks nicer
by having nicer quotes and bold text, if the terminal supports it;
otherwise, plain quotes are used.

gcc/fortran/ChangeLog:

	* match.cc (gfc_match_member_sep): Use %<...%> in gfc_error.
	* openmp.cc (gfc_match_oacc_routine, gfc_match_omp_context_selector,
	gfc_match_omp_context_selector_specification,
	gfc_match_omp_declare_variant, resolve_omp_clauses): Likewise;
	use %qs instead of '%s'.
	* primary.cc (match_real_constant, gfc_match_varspec): Likewise.
	* resolve.cc (gfc_resolve_formal_arglist, resolve_operator,
	resolve_ordinary_assign): Likewise.

84f6f8a2

Prepare 'contrib/gcc-changelog/git_commit.py' for GCC/Rust · 325529e2

Thomas Schwinge authored 2 years ago

	contrib/
	* gcc-changelog/git_commit.py (default_changelog_locations): Add
	'gcc/rust'.
	(bug_components): Add 'rust'.

325529e2

Add ChangeLog directories for modula2 into git_commit.py. · 7e4aa710

Gaius Mulley authored 2 years ago


Prepare to add changelogs for the Modula2 front end by changing
the contrib git_commit.py script.

contrib/ChangeLog:

	* gcc-changelog/git_commit.py (default_changelog_locations):
	New entry for gcc/m2.  New entry for libgm2.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

7e4aa710

libbacktrace: rewrite and simplify main zstd loop · 1bdba731

Ian Lance Taylor authored 2 years ago

	* elf.c (ZSTD_TABLE_*): Use elf_zstd_fse_baseline_entry.
	(ZSTD_ENCODE_BASELINE_BITS): Define.
	(ZSTD_DECODE_BASELINE, ZSTD_DECODE_BASEBITS): Define.
	(elf_zstd_literal_length_base): New static const array.
	(elf_zstd_match_length_base): Likewise.
	(struct elf_zstd_fse_baseline_entry): Define.
	(elf_zstd_make_literal_baseline_fse): New static function.
	(elf_zstd_make_offset_baseline_fse): Likewise.
	(elf_zstd_make_match_baseline_fse): Likewise.
	(print_table, main): Use elf_zstd_fse_baseline_entry.
	(elf_zstd_lit_table, elf_zstd_match_table): Likewise.
	(elf_zstd_offset_table): Likewise.
	(struct elf_zstd_seq_decode): Likewise.  Remove use_rle and rle
	fields.
	(elf_zstd_unpack_seq_decode): Use elf_zstd_fse_baseline_entry,
	taking a conversion function.  Convert RLE to FSE.
	(elf_zstd_literal_length_baseline): Remove.
	(elf_zstd_literal_length_bits): Remove.
	(elf_zstd_match_length_baseline): Remove.
	(elf_zstd_match_length_bits): Remove.
	(elf_zstd_decompress): Use elf_zstd_fse_baseline_entry.  Rewrite
	and simplify main loop.

1bdba731

Daily bump. · 40ce6485
GCC Administrator authored 2 years ago

40ce6485

Dec 09, 2022

Fortran: ICE on recursive derived types with allocatable components [PR107872] · 01254aa2

Paul Thomas authored 2 years ago

gcc/fortran/ChangeLog:

	PR fortran/107872
	* resolve.cc (derived_inaccessible): Skip over allocatable components
	to prevent an infinite loop.

gcc/testsuite/ChangeLog:

	PR fortran/107872
	* gfortran.dg/pr107872.f90: New test.

01254aa2

Fortran/OpenMP: align/allocator modifiers to the allocate clause · b2e1c49b

Tobias Burnus authored 2 years ago

gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE
	output.
	* gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'.
	(gfc_free_omp_namelist): Add bool arg.
	* match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'.
	* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction,
	gfc_match_omp_flush): Update call.
	(gfc_match_omp_clauses): Match align/allocator modifiers in
	'allocate' clause.
	(resolve_omp_clauses): Resolve align.
	* st.cc (gfc_free_statement): Update call.
	* trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'.

libgomp/ChangeLog:

	* libgomp.texi (5.1 Impl. Status): Split allocate clause/directive
	item about 'align'; mark clause as 'Y' and directive as 'N'.
	* testsuite/libgomp.fortran/allocate-2.f90: New test.
	* testsuite/libgomp.fortran/allocate-3.f90: New test.

b2e1c49b