Commits · 63bd36be990f3b08fcee5b69718ef97c055fbb31 · COBOLworx / gcc-cobol

Aug 11, 2023

c++: dependently scoped template-id in type-req [PR110927] · 63bd36be

Patrick Palka authored 1 year ago

Here we're incorrectly rejecting the first type-requirement at parse
time with

  concepts-requires35.C:14:56: error: ‘typename A<T>::B’ is not a template [-fpermissive]

We also incorrectly reject the second type-requirement at satisfaction time
with

  concepts-requires35.C:17:34: error: ‘typename A<int>::B’ names ‘template<class U> struct A<int>::B’, which is not a type

and similarly for the third type-requirement.  This seems to happen only
within a type-requirement; if we instead use e.g. an alias template then
it works as expected.

The difference ultimately seems to be that during parsing of a using-decl,
we pass check_dependency_p=true to cp_parser_nested_name_specifier_opt
whereas for a type-requirement we pass check_dependency_p=false.
Passing =false causes cp_parser_template_id for the dependently-scoped
template-id B<bool> to create a TYPE_DECL of TYPENAME_TYPE (with
TYPENAME_IS_CLASS_P unexpectedly set in the last two cases) whereas
passing =true causes it to return a TEMPLATE_ID_EXPR.  We then call
make_typename_type on this TYPE_DECL which does the wrong thing.

Since there seems to be no justification for using check_dependency_p=false
here, the simplest fix seems to be to pass check_dependency_p=true instead,
matching the behavior of cp_parser_elaborated_type_specifier.

	PR c++/110927

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_type_requirement): Pass
	check_dependency_p=true instead of =false.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-requires35.C: New test.


c++: recognize in-class var tmpl partial spec [PR71954] · ca267665

Patrick Palka authored 1 year ago

This makes us recognize member variable template partial specializations
defined directly inside the class body.  It seems we mainly just need to
call check_explicit_specialization when we see a static TEMPLATE_ID_EXPR
data member, which sets SET_DECL_TEMPLATE_SPECIALIZATION for us and which
we otherwise don't call (for the out-of-class case we call it from
grokvardecl).

We also need to make finish_member_template_decl return NULL_TREE for
such partial specializations, matching its behavior for class template
partial specializations, so that later we don't try to register it as a
separate member declaration.

	PR c++/71954

gcc/cp/ChangeLog:

	* decl.cc (grokdeclarator): Pass 'dname' instead of
	'unqualified_id' as the name when building the VAR_DECL for a
	static data member.  Call check_explicit_specialization for a
	TEMPLATE_ID_EXPR such member.
	* pt.cc (finish_member_template_decl): Return NULL_TREE
	instead of 'decl' when DECL_TEMPLATE_SPECIALIZATION is not
	set.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/var-templ84.C: New test.
	* g++.dg/cpp1y/var-templ84a.C: New test.


libstdc++: Do not call log10(0.0) in std::format [PR110860] · 9e33d718

Jonathan Wakely authored 1 year ago

Calling log10(0.0) returns -inf which has undefined behaviour when
converted to an integer. We only need to use log10 for large values
anyway. If the value is zero then the larger buffer is only needed due
to a large precision, so we don't need to use log10 to estimate the
number of digits for the significand.
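
The hazard can be illustrated with a minimal sketch. `estimate_integer_digits` is a hypothetical helper, not the code in `__formatter_fp::format`; it assumes IEEE floating point and only shows the idea of handling zero before calling log10.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not libstdc++ code): estimate how many decimal
// digits the integer part of a finite value needs.
int estimate_integer_digits(double value)
{
  assert(std::isfinite(value));
  const double magnitude = std::fabs(value);
  // log10(0.0) is -inf, and converting -inf to int is undefined
  // behaviour, so values below 1 (including zero) are handled first.
  if (magnitude < 1.0)
    return 1;
  // For magnitude >= 1 the logarithm is non-negative and the
  // truncating conversion is well defined.
  return static_cast<int>(std::log10(magnitude)) + 1;
}
```

With this guard a call such as `estimate_integer_digits(0.0)` never reaches log10.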

libstdc++-v3/ChangeLog:

	PR libstdc++/110860
	* include/std/format (__formatter_fp::format): Do not call log10
	with zero values.


MAINTAINERS: Add myself to write after approval · 20db5cab
Eric Feng authored 1 year ago
```
ChangeLog:

	* MAINTAINERS: Add myself.

Signed-off-by: Eric Feng <ef2648@columbia.edu>
```

c++: improve debug_tree for templated types/decls · 1531de63

Patrick Palka authored 1 year ago

gcc/cp/ChangeLog:

	* ptree.cc (cxx_print_decl): Check for DECL_LANG_SPECIFIC and
	TS_DECL_COMMON only when necessary.  Print DECL_TEMPLATE_INFO
	for all decls that have it, not just VAR_DECL or FUNCTION_DECL.
	Also print DECL_USE_TEMPLATE.
	(cxx_print_type): Print TYPE_TEMPLATE_INFO.
	<case BOUND_TEMPLATE_TEMPLATE_PARM>: Don't print TYPE_TI_ARGS
	anymore.
	<case TEMPLATE_TYPE/TEMPLATE_PARM>: Print TEMPLATE_TYPE_PARM_INDEX
	instead of printing the index, level and original level
	individually.


tree-pretty-print: handle COMPONENT_REF with non-decl RHS · a4238f6d

Patrick Palka authored 1 year ago

In the C++ front end, a COMPONENT_REF's second operand isn't always a
decl (at least at template parse time).  This patch makes the generic
pretty printer not ICE when printing such a COMPONENT_REF.

gcc/ChangeLog:

	* tree-pretty-print.cc (dump_generic_node) <case COMPONENT_REF>:
	Don't call component_ref_field_offset if the RHS isn't a decl.


Use strtol instead of std::stoi [PR110646] · 834d1422

John David Anglin authored 1 year ago

Implementation of std::stoi was overlooked on hppa-hpux, so use
strtol instead.
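
As a rough illustration of the replacement, the sketch below parses an int with strtol. `parse_int` is a hypothetical helper, not the code in gensupport.cc; unlike std::stoi, strtol reports errors through the end pointer and errno rather than exceptions, so both are checked.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: parse a base-10 int with strtol. Returns true
// and stores the result in `out` only if the whole string is a valid
// int; otherwise `out` is left unchanged.
bool parse_int(const char *text, int &out)
{
  assert(text != nullptr);
  errno = 0;
  char *end = nullptr;
  const long value = std::strtol(text, &end, 10);
  if (end == text || *end != '\0')  // no digits, or trailing characters
    return false;
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    return false;                   // out of range for long or for int
  out = static_cast<int>(value);
  return true;
}
```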

2023-08-11  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

	PR bootstrap/110646
	* gensupport.cc (class conlist): Use strtol instead of std::stoi.


preserve base pointer for __deregister_frame [PR110956] · c46bded7

Thomas Neumann authored 1 year ago

Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110956
Rainer Orth successfully tested the patch on Solaris with a full bootstrap.

Some uncommon unwinding table encodings need to access the base pointer
for address computations. We do not have that information in calls to
__deregister_frame_info_bases, and previously simply used nullptr as
base pointer. That is usually fine, but for some Solaris i386 shared
libraries that results in wrong address computations.

To fix this problem we now associate the unwinding object with
the table pointer itself, which is always known, in addition to
the PC range. When deregistering a frame, we first locate the object
using the table pointer, and then use the base pointer stored within
the object to compute the PC range.
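
The lookup scheme can be sketched as follows. This is a simplified model under stated assumptions: `UnwindObject`, `FrameRegistry` and the use of std::map are all hypothetical, and the data structures in unwind-dw2-fde.c differ. It only shows why keying the object by table pointer lets deregistration recover the base pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for an unwinding object.
struct UnwindObject {
  const void *table;           // address of the unwinding table (always known)
  std::uintptr_t base;         // base pointer needed by some encodings
  std::uintptr_t start_offset; // start of the PC range, relative to base
  std::uintptr_t length;       // size of the PC range
};

class FrameRegistry {
public:
  void register_frame(const UnwindObject &object)
  {
    assert(object.table != nullptr);
    by_table_[object.table] = object;
    by_pc_[pc_begin(object)] = object.table;
  }

  // Deregistration receives only the table pointer. The object found
  // through it carries the base pointer, so the PC range is computed
  // from the stored base rather than from a null base.
  bool deregister_frame(const void *table)
  {
    auto it = by_table_.find(table);
    if (it == by_table_.end())
      return false;
    by_pc_.erase(pc_begin(it->second));
    by_table_.erase(it);
    return true;
  }

  const UnwindObject *find_by_pc(std::uintptr_t pc) const
  {
    auto it = by_pc_.upper_bound(pc);
    if (it == by_pc_.begin())
      return nullptr;
    --it;
    const UnwindObject &object = by_table_.at(it->second);
    return pc - it->first < object.length ? &object : nullptr;
  }

private:
  static std::uintptr_t pc_begin(const UnwindObject &object)
  {
    return object.base + object.start_offset;
  }

  std::map<const void *, UnwindObject> by_table_;
  std::map<std::uintptr_t, const void *> by_pc_;
};
```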

libgcc/ChangeLog:
	PR libgcc/110956
	* unwind-dw2-fde.c: Associate object with address of unwinding
	table.


[LRA]: Implement output stack pointer reloads · ef96754d

Vladimir N. Makarov authored 1 year ago

LRA prohibited output stack pointer reloads, but this resulted in an
LRA failure for the AVR target, which has no arithmetic insns working
with the stack pointer register.  This patch implements output stack
pointer reloads.

gcc/ChangeLog:

	* lra-constraints.cc (goal_alt_out_sp_reload_p): New flag.
	(process_alt_operands): Set the flag.
	(curr_insn_transform): Modify stack pointer offsets if output
	stack pointer reload is generated.


libstdc++: Handle invalid values in std::chrono pretty printers · c19b542a

Jonathan Wakely authored 1 year ago

This avoids an IndexError exception when printing invalid chrono::month
or chrono::weekday values.
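
The fix itself is in the Python pretty printers; the C++ sketch below only illustrates the same kind of range check before indexing a name table. All function names and the wording of the fallback message are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: look up a name, falling back to a message for
// values outside [first, last] instead of indexing past the table.
std::string describe(const char *const *names, unsigned first, unsigned last,
                     unsigned value, const char *what)
{
  assert(names != nullptr && what != nullptr && first <= last);
  if (value < first || value > last)
    return std::to_string(value) + " is not a valid " + what;
  return names[value - first];
}

std::string month_to_string(unsigned month)
{
  static const char *const names[] = {
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December"
  };
  return describe(names, 1, 12, month, "month");  // months are 1..12
}

std::string weekday_to_string(unsigned weekday)
{
  static const char *const names[] = {
    "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
    "Saturday"
  };
  return describe(names, 0, 6, weekday, "weekday");  // weekdays are 0..6
}
```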

libstdc++-v3/ChangeLog:

	* python/libstdcxx/v6/printers.py (StdChronoCalendarPrinter):
	Check for out-of-range month and weekday indices.
	* testsuite/libstdc++-prettyprinters/chrono.cc: Check invalid
	month and weekday values.


libstdc++: Revert accidentally committed change to bits/stl_iterator.h · 7723684f

Jonathan Wakely authored 1 year ago

In commit r14-3134-g9cb2a7c8d54b1f I only meant to change some uses of
__clamp_iter_cat to use __iter_category_t, I didn't mean to commit the
additional change introducing __clamped_iter_cat_t. This reverts that
part.

libstdc++-v3/ChangeLog:

	* include/bits/stl_iterator.h (__clamped_iter_cat_t): Remove.


config: Fix host -rdynamic detection for build != host != target · 4d9bc81a

Joseph Myers authored 1 year ago

The GCC_ENABLE_PLUGINS configure logic for detecting whether -rdynamic
is necessary and supported uses an appropriate objdump for $host
binaries (running on $build) in cases where $host is $build or
$target.

However, it is missing such logic in the case where $host is neither
$build nor $target, resulting in the compilers not being linked with
-rdynamic and plugins not being usable with such a compiler.  In fact
$ac_cv_prog_OBJDUMP, as used when $build = $host, is always an objdump
for $host binaries that runs on $build; that is, it's appropriate to
use in this case as well.

Tested in such a configuration that it does result in cc1 being linked
with -rdynamic as expected.  Also bootstrapped with no regressions for
x86_64-pc-linux-gnu.

config/
	* gcc-plugin.m4 (GCC_ENABLE_PLUGINS): Use
	export_sym_check="$ac_cv_prog_OBJDUMP -T" also when host is not
	build or target.

gcc/
	* configure: Regenerate.

libcc1/
	* configure: Regenerate.


tree-optimization/110979 - fold-left reduction and partial vectors · 798a880a

Richard Biener authored 1 year ago

When we vectorize fold-left reductions with partial vectors but
no target operation available we use a vector conditional to force
excess elements to zero.  But that doesn't correctly preserve
the sign of zero.  The following patch disables partial vector
support when we have to do that and also need to honor rounding
modes other than round-to-nearest.  When round-to-nearest is in
effect and we have to preserve the sign of zero instead use
negative zero for the excess elements.
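
The sign-of-zero issue can be reproduced with a scalar model. This is a sketch, not vectorizer code: `fold_left_sum` is a hypothetical function that sums lanes in order and substitutes a padding value for the excess lanes. Under round-to-nearest, adding -0.0 leaves every value unchanged, whereas (-0.0) + (+0.0) is +0.0.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Scalar model of a fold-left (in-order) sum over `count` lanes where
// only the first `active` lanes hold real data and the remaining
// lanes are replaced by `padding`.
double fold_left_sum(double init, const double *lanes, std::size_t count,
                     std::size_t active, double padding)
{
  assert(lanes != nullptr && active <= count);
  double sum = init;
  for (std::size_t i = 0; i < count; ++i)
    sum += (i < active) ? lanes[i] : padding;
  return sum;
}
```

Summing two active lanes of -0.0 starting from -0.0 gives -0.0. Padding two excess lanes with +0.0 turns the result into +0.0, while padding with -0.0 keeps the sign, assuming the default rounding mode and no fast-math flags.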

	PR tree-optimization/110979
	* tree-vect-loop.cc (vectorizable_reduction): For
	FOLD_LEFT_REDUCTION without target support make sure
	we don't need to honor signed zeros and sign dependent rounding.

	* gcc.dg/torture/pr110979.c: New testcase.


Improve BB vectorization opt-info · 3a13884b

Richard Biener authored 1 year ago

The following makes us more correctly print the used vector size
when doing BB vectorization and also print all involved SLP graph
roots, not just the random one we ended up picking as leader.
In particular the last bit improves diffing opt-info between
different GCC revs but it also requires some testsuite adjustments.

	* tree-vect-slp.cc (vect_slp_region): Provide opt-info for all SLP
	subgraph entries.  Dump the used vector size based on the
	SLP subgraph entry root vector type.

	* g++.dg/vect/slp-pr87105.cc: Adjust.
	* gcc.dg/vect/bb-slp-17.c: Likewise.
	* gcc.dg/vect/bb-slp-20.c: Likewise.
	* gcc.dg/vect/bb-slp-21.c: Likewise.
	* gcc.dg/vect/bb-slp-22.c: Likewise.
	* gcc.dg/vect/bb-slp-subgroups-2.c: Likewise.


RISC-V: Support RVV VFMSUB rounding mode intrinsic API · 6a8203b7

Pan Li authored 1 year ago


This patch supports the rounding mode API for VFMSUB, as in the
samples below.

* __riscv_vfmsub_vv_f32m1_rm
* __riscv_vfmsub_vv_f32m1_rm_m
* __riscv_vfmsub_vf_f32m1_rm
* __riscv_vfmsub_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfmsub_frm): New class for vfmsub frm.
	(vfmsub_frm): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfmsub_frm): New function declaration.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-msub.c: New test.


VECT: Add vec_mask_len_{load_lanes,store_lanes} patterns · 59d789b3

Juzhe-Zhong authored 1 year ago

This patch adds the vec_mask_len_{load_lanes,store_lanes} autovectorization patterns.

Here we want to support the following autovectorization:

void
foo (int8_t *__restrict a,
     int8_t *__restrict b,
     int8_t *__restrict cond,
     int n)
{
  for (intptr_t i = 0; i < n; ++i)
    {
      if (cond[i])
        a[i] = b[i * 2] + b[i * 2 + 1];
    }
}

ARM SVE IR:

https://godbolt.org/z/cro1Eqc6a

  # loop_mask_60 = PHI <next_mask_82(4), max_mask_81(3)>
  ...
  mask__39.12_63 = vect__3.11_61 != { 0, ... };
  vec_mask_and_66 = loop_mask_60 & mask__39.12_63;
  ...
  vect_array.15 = .MASK_LOAD_LANES (_57, 8B, vec_mask_and_66);
  ...

For RVV, we would like to see IR:

  loop_len = SELECT_VL;
  ...
  mask__39.12_63 = vect__3.11_61 != { 0, ... };
  ...
  vect_array.15 = .MASK_LEN_LOAD_LANES (_57, 8B, mask__39.12_63, loop_len, bias);
  ...
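
A scalar model may help when reading the IR above. The sketch below is an assumption about the intended semantics for the two-lane case, not GCC's definition of the internal function: it loads only elements that are both below the length and enabled by the mask, leaves the others untouched, and omits the bias operand. The name `mask_len_load_lanes2` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a two-lane masked, length-controlled load: for each
// element i below `len` whose mask entry is set, de-interleave
// src[2*i] and src[2*i + 1] into lane0[i] and lane1[i].
void mask_len_load_lanes2(const std::int8_t *src, const bool *mask,
                          std::size_t len, std::int8_t *lane0,
                          std::int8_t *lane1)
{
  assert(src && mask && lane0 && lane1);
  for (std::size_t i = 0; i < len; ++i)
    {
      if (!mask[i])
        continue;  // inactive element: destination is left unchanged
      lane0[i] = src[2 * i];
      lane1[i] = src[2 * i + 1];
    }
}
```

In the loop from the commit message, `a[i] = b[i * 2] + b[i * 2 + 1]` would then correspond to adding the two lanes for the active elements.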

Bootstrap and regression tests on x86 passed.

Ok for trunk ?

gcc/ChangeLog:

	* doc/md.texi: Add vec_mask_len_{load_lanes,store_lanes} patterns.
	* internal-fn.cc (expand_partial_load_optab_fn): Ditto.
	(expand_partial_store_optab_fn): Ditto.
	* internal-fn.def (MASK_LEN_LOAD_LANES): Ditto.
	(MASK_LEN_STORE_LANES): Ditto.
	* optabs.def (OPTAB_CD): Ditto.


RISC-V: Support RVV VFNMADD rounding mode intrinsic API · bcda361d

Pan Li authored 1 year ago


This patch supports the rounding mode API for VFNMADD, as in the
samples below.

* __riscv_vfnmadd_vv_f32m1_rm
* __riscv_vfnmadd_vv_f32m1_rm_m
* __riscv_vfnmadd_vf_f32m1_rm
* __riscv_vfnmadd_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfnmadd_frm): New class for vfnmadd frm.
	(vfnmadd_frm): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfnmadd_frm): New function declaration.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-nmadd.c: New test.


match.pd: Implement missed optimization ((x ^ y) & z) | x -> (z & y) | x [PR109938] · 9f933492

Drew Ross authored 1 year ago

Adds a simplification that folds ((x ^ y) & z) | x into
(z & y) | x. Merges this simplification with ((x | y) & z) | x -> (z & y) | x
to avoid a duplicate pattern.
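
The identity can be checked by brute force. The sketch below is independent of match.pd; the function names are hypothetical. Both sides are purely bitwise, so testing all 4-bit inputs already covers every per-bit combination of x, y and z.

```cpp
#include <cassert>

// Left-hand side of the simplification.
constexpr unsigned original_form(unsigned x, unsigned y, unsigned z)
{
  return ((x ^ y) & z) | x;
}

// Right-hand side of the simplification.
constexpr unsigned simplified_form(unsigned x, unsigned y, unsigned z)
{
  return (z & y) | x;
}

// Where a bit of x is 1, both sides are 1. Where it is 0, x ^ y is
// just y, so both sides reduce to y & z.
void check_identity()
{
  for (unsigned x = 0; x < 16; ++x)
    for (unsigned y = 0; y < 16; ++y)
      for (unsigned z = 0; z < 16; ++z)
        assert(original_form(x, y, z) == simplified_form(x, y, z));
}
```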

2023-08-11  Drew Ross  <drross@redhat.com>
	    Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/109938
	* match.pd (((x ^ y) & z) | x -> (z & y) | x): New simplification.

	* gcc.c-torture/execute/pr109938.c: New test.
	* gcc.dg/tree-ssa/pr109938.c: New test.


RISC-V: Support RVV VFMADD rounding mode intrinsic API · 797334e9

Pan Li authored 1 year ago


This patch supports the rounding mode API for VFMADD, as in the
samples below.

* __riscv_vfmadd_vv_f32m1_rm
* __riscv_vfmadd_vv_f32m1_rm_m
* __riscv_vfmadd_vf_f32m1_rm
* __riscv_vfmadd_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfmadd_frm): New class for vfmadd frm.
	(vfmadd_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfmadd_frm): New function definition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-madd.c: New test.


RISC-V: Support RVV VFNMSAC rounding mode intrinsic API · cd9150e2

Pan Li authored 1 year ago


This patch supports the rounding mode API for VFNMSAC, as in the
samples below.

* __riscv_vfnmsac_vv_f32m1_rm
* __riscv_vfnmsac_vv_f32m1_rm_m
* __riscv_vfnmsac_vf_f32m1_rm
* __riscv_vfnmsac_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfnmsac_frm): New class for vfnmsac frm.
	(vfnmsac_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfnmsac_frm): New function definition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-nmsac.c: New test.


c: Add __typeof_unqual__ and __typeof_unqual support · 607d9d50

Jakub Jelinek authored 1 year ago

As I mentioned in my stdckdint.h mail, I think having __ prefixed
spellings of the typeof_unqual keyword which can be used in earlier
language modes can be useful, since not all code can be switched to C23
right away.

The following patch implements that.  It keeps the non-C23 behavior
for _Noreturn functions, to stay compatible with how __typeof__
behaves.

I think we don't need it for C++, since in C++ we have standard
traits to remove qualifiers etc.
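
For comparison, the C++ route mentioned above might look like the sketch below. The alias `unqualified_t` and the function `copy_unqualified` are hypothetical names; the only library facilities used are `std::remove_cv_t` and `std::is_same` from `<type_traits>`.

```cpp
#include <type_traits>

// Hypothetical alias: the type of T with top-level const and volatile
// removed, similar in spirit to what typeof_unqual yields in C.
template <class T>
using unqualified_t = std::remove_cv_t<T>;

int copy_unqualified()
{
  const volatile int counter = 7;
  // decltype keeps the qualifiers of the declared type...
  static_assert(std::is_same<decltype(counter), const volatile int>::value,
                "decltype preserves const volatile");
  // ...while the trait strips them, so the copy can be modified.
  unqualified_t<decltype(counter)> copy = counter;
  copy += 1;
  return copy;
}
```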

2023-08-11  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* doc/extend.texi (Typeof): Document typeof_unqual
	and __typeof_unqual__.
gcc/c-family/
	* c-common.cc (c_common_reswords): Add __typeof_unqual
	and __typeof_unqual__ spellings of typeof_unqual.
gcc/c/
	* c-parser.cc (c_parser_typeof_specifier): Handle
	__typeof_unqual and __typeof_unqual__ as !is_std.
gcc/testsuite/
	* gcc.dg/c11-typeof-2.c: New test.
	* gcc.dg/c11-typeof-3.c: New test.
	* gcc.dg/gnu11-typeof-3.c: New test.
	* gcc.dg/gnu11-typeof-4.c: New test.


Fix PR 110954: wrong code with cmp | !cmp · f956c232

Andrew Pinski authored 1 year ago

This was an oversight on my part: I forgot that
cmp might have a true value other than all ones,
and will have a value of 1 in most cases.
This means if we have `(f < 0) | !(f < 0)` we would
optimize this to -1 rather than just 1.

This is version 2 of the patch.
Decided to go down a different route than just checking if
the precision was 1 inside bitwise_inverted_equal_p.
So instead bitwise_inverted_equal_p gets passed an argument
that will be set if a comparison was being compared,
and the user of bitwise_inverted_equal_p decides what needs to be done.
In most uses of bitwise_inverted_equal_p, the check will be
`!wascmp || element_precision (type) == 1` .
But in the case of `a & ~a` and `a ^| ~a` we can handle the case
of wascmp by using constant_boolean_node instead.
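
The expected source-level result is easy to state in code. The sketch below is not the testcase from the commit; `always_one` is a hypothetical function showing that a comparison converted to int is 0 or 1, so OR-ing it with its negation must give 1 and not all ones.

```cpp
#include <cassert>

// (f < 0) and !(f < 0) are 0/1 values, exactly one of which is 1, so
// their bitwise OR is 1 for every input.
int always_one(int f)
{
  const int result = (f < 0) | !(f < 0);
  assert(result == 1);  // -1 here would be the wrong-code result
  return result;
}
```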

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/110954

gcc/ChangeLog:

	* generic-match-head.cc (bitwise_inverted_equal_p): Add
	wascmp argument and set it accordingly.
	* gimple-match-head.cc (bitwise_inverted_equal_p): Add
	wascmp argument to the macro.
	(gimple_bitwise_inverted_equal_p): Add
	wascmp argument and set it accordingly.
	* match.pd (`a & ~a`, `a ^| ~a`): Update call
	to bitwise_inverted_equal_p and handle wascmp case.
	(`(~x | y) & x`, `(~x | y) & x`, `a?~t:t`): Update
	call to bitwise_inverted_equal_p and check to see
	if was !wascmp or if precision was 1.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/pr110954-1.c: New test.


c: Support for -Wuseless-cast [PR84510] · 68783211

Martin Uecker authored 1 year ago

Add support for -Wuseless-cast in C (and ObjC).

	PR c/84510

gcc/c/:
	* c-typeck.cc (build_c_cast): Add warning.

gcc/c-family/:
	* c.opt: Enable warning for C and ObjC.

gcc/:
	* doc/invoke.texi: Update.

gcc/testsuite/:
	* gcc.dg/Wuseless-cast.c: New test.


RISC-V: Support RVV VFMSAC rounding mode intrinsic API · ee8a844d

Pan Li authored 1 year ago


This patch supports the rounding mode API for VFMSAC, as in the
samples below.

* __riscv_vfmsac_vv_f32m1_rm
* __riscv_vfmsac_vv_f32m1_rm_m
* __riscv_vfmsac_vf_f32m1_rm
* __riscv_vfmsac_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfmsac_frm): New class for vfmsac frm.
	(vfmsac_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfmsac_frm): New function definition

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-msac.c: New test.


Daily bump. · 4271b742
GCC Administrator authored 1 year ago


Aug 10, 2023

libstdc++: Fix out-of-bounds read in format string "{:{}." [PR110974] · ecfd8c7f

Jonathan Wakely authored 1 year ago

libstdc++-v3/ChangeLog:

	PR libstdc++/110974
	* include/std/format (_Spec::_S_parse_width_or_precision): Check
	for empty range before dereferencing iterator.
	* testsuite/std/format/string.cc: Check for expected exception.
	Fix expected exception message in test_pr110862() and actually
	call it.
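
The kind of check described in the first ChangeLog entry can be sketched as below. `parse_digits` is a hypothetical parser fragment, not `_Spec::_S_parse_width_or_precision`, and it signals failure with a null pointer where the library reports a format error.

```cpp
#include <cassert>

// Hypothetical fragment: parse a run of decimal digits in
// [first, last). Returns the position after the digits, or nullptr if
// the range is empty or does not start with a digit.
const char *parse_digits(const char *first, const char *last,
                         unsigned &value)
{
  assert(first != nullptr && first <= last);
  // The empty-range test must come before any use of *first; a format
  // string that ends right after '.' leaves nothing to read.
  if (first == last)
    return nullptr;
  if (*first < '0' || *first > '9')
    return nullptr;
  value = 0;
  while (first != last && *first >= '0' && *first <= '9')
    {
      value = value * 10 + static_cast<unsigned>(*first - '0');
      ++first;
    }
  return first;
}
```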


libstdc++: Fix std::format for localized floats [PR110968] · f48a5423

Jonathan Wakely authored 1 year ago

The __formatter_fp::_M_localize function just returns an empty string if
the formatting locale is the C locale, as there is nothing to do. But
the caller was assuming that the returned string contains the localized
string. The caller should use the original string if _M_localize returns
an empty string.
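
The calling convention can be modelled with two small functions. Both are hypothetical stand-ins, not the libstdc++ members: `localize` returns an empty string when there is nothing to do, and `format_localized` shows the caller falling back to the original text in that case.

```cpp
#include <cassert>
#include <string>

// Stand-in for a localizing step: returns an empty string for the
// classic locale, otherwise a copy with the decimal point replaced.
std::string localize(const std::string &text, bool classic_locale,
                     char decimal_point)
{
  if (classic_locale)
    return std::string();
  std::string result = text;
  for (char &c : result)
    if (c == '.')
      c = decimal_point;
  return result;
}

std::string format_localized(const std::string &text, bool classic_locale,
                             char decimal_point)
{
  assert(!text.empty());
  const std::string localized = localize(text, classic_locale, decimal_point);
  // An empty result means "nothing to localize", not "empty output".
  return localized.empty() ? text : localized;
}
```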

libstdc++-v3/ChangeLog:

	PR libstdc++/110968
	* include/std/format (__formatter_fp::format): Check return
	value of _M_localize.
	* testsuite/std/format/functions/format.cc: Check classic
	locale.


libstdc++: Use alias template for iterator_category [PR110970] · 9cb2a7c8

Jonathan Wakely authored 1 year ago

This renames __iterator_category_t to __iter_category_t, for consistency
with std::iter_value_t, std::iter_difference_t and std::iter_reference_t
in C++20. Then use __iter_category_t in <bits/stl_iterator.h>, which
fixes the problem of the missing 'typename' that Clang 15 incorrectly
still requires.
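
The effect of such an alias can be shown outside the library. In the sketch below, `iter_category_t` and `is_random_access` are hypothetical user-level names (the library's own name is reserved); the alias contains the one `typename`, so dependent uses do not need to repeat it.

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical alias in the spirit of __iter_category_t.
template <class Iter>
using iter_category_t =
  typename std::iterator_traits<Iter>::iterator_category;

// A dependent use: no 'typename' is needed here because the alias
// already names a type.
template <class Iter>
struct is_random_access
  : std::is_base_of<std::random_access_iterator_tag, iter_category_t<Iter>>
{
};

static_assert(is_random_access<std::vector<int>::iterator>::value,
              "vector iterators are random access");
static_assert(!is_random_access<std::list<int>::iterator>::value,
              "list iterators are only bidirectional");
```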

libstdc++-v3/ChangeLog:

	PR libstdc++/110970
	* include/bits/stl_iterator.h (__detail::__move_iter_cat): Use
	__iter_category_t.
	(iterator_traits<common_iterator<I, S>>::_S_iter_cat): Likewise.
	(__detail::__basic_const_iterator_iter_cat): Likewise.
	* include/bits/stl_iterator_base_types.h (__iterator_category_t):
	Rename to __iter_category_t.


Fix division by zero in loop splitting · 39204ae9

Jan Hubicka authored 1 year ago

The profile update I added to tree-ssa-loop-split can divide by zero
when the conditional is predicted with 0 probability, which
is triggered by the jump threading update in the testcase.

gcc/ChangeLog:

	PR middle-end/110923
	* tree-ssa-loop-split.cc (split_loop): Watch for division by zero.

gcc/testsuite/ChangeLog:

	PR middle-end/110923
	* gcc.dg/tree-ssa/pr110923.c: New test.


RISC-V: Add Ztso atomic mappings · 0ac32323

Patrick O'Neill authored 1 year ago

The RISC-V Ztso extension currently has no effect on generated code.
With the additional ordering constraints guaranteed by Ztso, we can emit
more optimized atomic mappings than the RVWMO mappings.

This PR implements the Ztso psABI mappings[1].

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/391



2023-08-08 Patrick O'Neill <patrick@rivosinc.com>

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add Ztso and mark Ztso as
	dependent on 'a' extension.
	* config/riscv/riscv-opts.h (MASK_ZTSO): New mask.
	(TARGET_ZTSO): New target.
	* config/riscv/riscv.cc (riscv_memmodel_needs_amo_acquire): Add
	Ztso case.
	(riscv_memmodel_needs_amo_release): Add Ztso case.
	(riscv_print_operand): Add Ztso case for LR/SC annotations.
	* config/riscv/riscv.md: Import sync-rvwmo.md and sync-ztso.md.
	* config/riscv/riscv.opt: Add Ztso target variable.
	* config/riscv/sync.md (mem_thread_fence_1): Expand to RVWMO or
	Ztso specific insn.
	(atomic_load<mode>): Expand to RVWMO or Ztso specific insn.
	(atomic_store<mode>): Expand to RVWMO or Ztso specific insn.
	* config/riscv/sync-rvwmo.md: New file. Separate out RVWMO
	specific load/store/fence mappings.
	* config/riscv/sync-ztso.md: New file. Separate out Ztso
	specific load/store/fence mappings.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/amo-table-ztso-amo-add-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-5.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-5.c: New test.
	* gcc.target/riscv/amo-table-ztso-load-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-load-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-load-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-store-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-store-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-store-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>


Fix profile update in duplicate_loop_body_to_header_edge for loops with 0 count_in · 937591d2

Jan Hubicka authored 1 year ago

This patch makes duplicate_loop_body_to_header_edge not drop profile counts to
uninitialized when count_in is 0.  This happens because profile_probability in a 0 count
is undefined.

gcc/ChangeLog:

	* cfgloopmanip.cc (duplicate_loop_body_to_header_edge): Special case loops with
	0 iteration count.


Fix profile updating bug in tree-ssa-threadupdate · 546bf79b

Jan Hubicka authored 1 year ago

ssa_fix_duplicate_block_edges later calls update_profile to correct the profile after threading.
In the testcase this does not work since we lose track of the duplicated edge.  This
happens because redirect_edge_and_branch returns NULL if the edge already has the correct
destination, which is the case here.

gcc/ChangeLog:

	* tree-ssa-threadupdate.cc (ssa_fix_duplicate_block_edges): Fix profile update.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/phi_on_compare-1.c: Check profile consistency.


Fix undefined behaviour in profile_count::differs_from_p · e4110308

Jan Hubicka authored 1 year ago

This patch avoids overflow in profile_count::differs_from_p and also makes it
return false if one of the values is undefined while the other is defined.

gcc/ChangeLog:

	* profile-count.cc (profile_count::differs_from_p): Fix overflow and
	handling of undefined values.


phiopt: Fix phiopt ICE on vops [PR102989] · 8afe9d5d

Jakub Jelinek authored 1 year ago

I ran into an ICE on gcc.dg/torture/bitint-42.c with -O1 or -Os
when enabling expensive tests, and unfortunately I can't reproduce it without
_BitInt.  The IL before phiopt3 has:
  <bb 87> [local count: 203190070]:
  # .MEM_428 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC3);
  goto <bb 89>; [100.00%]

  <bb 88> [local count: 203190070]:
  # .MEM_427 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC4);

  <bb 89> [local count: 406380139]:
  # .MEM_368 = PHI <.MEM_428(87), .MEM_427(88)>
  # VUSE <.MEM_368>
  _123 = VIEW_CONVERT_EXPR<unsigned long[8]>(r495[i_107].D.2780)[0];
and factor_out_conditional_operation is called on the vop PHI, it
sees it has exactly two operands and defining statements of both
PHI arguments are converts (VCEs in this case), so it thinks it is
a good idea to try to optimize that and while doing that it constructs
void type SSA_NAMEs and the like.

2023-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR c/102989
	* tree-ssa-phiopt.cc (single_non_singleton_phi_for_edges): Never
	return virtual phis and return NULL if there is a virtual phi
	where the arguments from E0 and E1 edges aren't equal.


Make ISEL used internal functions const/nothrow where appropriate · b0894a12

Richard Biener authored 1 year ago

The .VEC_SET, .VEC_EXTRACT and the various .VCOND internal functions
operate on registers only and are not supposed to raise
any exceptions.  The following makes them const/nothrow.  I've
verified this avoids useless SSA updates in ISEL.

	* internal-fn.def (VCOND, VCONDU, VCONDEQ, VCOND_MASK,
	VEC_SET, VEC_EXTRACT): Make ECF_CONST | ECF_NOTHROW.


RISC-V: Add MASK vec_duplicate pattern[PR110962] · da7b43fb

Juzhe-Zhong authored 1 year ago

This patch fixes bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110962

SUBROUTINE a(b,c,d)
  LOGICAL,DIMENSION(INOUT)  :: b
  LOGICAL e
  REAL, DIMENSION(IN)     ::  c
  REAL, DIMENSION(INOUT)  ::  d
  REAL, DIMENSION(SIZE(c))   :: f
  WHERE (b.AND.e)
     WHERE (f>=0.)
        d = g
     ENDWHERE
  ENDWHERE
END SUBROUTINE a

   PR target/110962

gcc/ChangeLog:
	PR target/110962
	* config/riscv/autovec.md (vec_duplicate<mode>): New pattern.


RISC-V: Support RVV VFNMACC rounding mode intrinsic API · 6176527a

Pan Li authored 1 year ago


This patch supports the rounding mode API for VFNMACC, as in the
samples below.

* __riscv_vfnmacc_vv_f32m1_rm
* __riscv_vfnmacc_vv_f32m1_rm_m
* __riscv_vfnmacc_vf_f32m1_rm
* __riscv_vfnmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfnmacc_frm): New class for vfnmacc.
	(vfnmacc_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfnmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-nmacc.c: New test.


RISC-V: Support RVV VFMACC rounding mode intrinsic API · 07e93224

Pan Li authored 1 year ago


This patch supports the rounding mode API for VFMACC, as in the
samples below.

* __riscv_vfmacc_vv_f32m1_rm
* __riscv_vfmacc_vv_f32m1_rm_m
* __riscv_vfmacc_vf_f32m1_rm
* __riscv_vfmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfmacc_frm): New class for vfmacc frm.
	(vfmacc_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-macc.c: New test.


RISC-V: Support TU for integer ternary OP[PR110964] · 887f1391

Juzhe-Zhong authored 1 year ago

PR target/110964

gcc/ChangeLog:
	PR target/110964
	* config/riscv/riscv-v.cc (expand_cond_len_ternop): Add integer ternary.

gcc/testsuite/ChangeLog:
	PR target/110964
	* gcc.target/riscv/rvv/autovec/pr110964.c: New test.


Remove insert location argument from vectorizable_live_operation · 9b8ebdb6

Richard Biener authored 1 year ago

The insert location argument isn't actually used but we compute
that ourselves.  There's a single spot, namely when asking
for the loop mask via vect_get_loop_mask that the passed argument
is used but that looks like an oversight.  The following fixes that
and adjusts vectorizable_live_operation and can_vectorize_live_stmts
to no longer take a stmt iterator argument.

	* tree-vectorizer.h (vectorizable_live_operation): Remove
	gimple_stmt_iterator * argument.
	* tree-vect-loop.cc (vectorizable_live_operation): Likewise.
	Adjust plumbing around vect_get_loop_mask.
	(vect_analyze_loop_operations): Adjust.
	* tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Likewise.
	(vect_bb_slp_mark_live_stmts): Likewise.
	(vect_schedule_slp_node): Likewise.
	* tree-vect-stmts.cc (can_vectorize_live_stmts): Likewise.
	Remove gimple_stmt_iterator * argument.
	(vect_transform_stmt): Adjust.
