Commits · 3709ca091bec43ee3203b96146585652c5d84728 · COBOLworx / gcc-cobol

Aug 18, 2023

RISC-V: Add the missed half floating-point mode patterns of... · 3709ca09

Lehua Ding authored 1 year ago

RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxmin

Hi,

There is a new failed RISC-V testcase(testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c)
on the current trunk branch when use medany as default cmodel.
The reason is the load of half floating-point imm is convert from RTL 1 to RTL
2 as the cmodel be changed from medlow to medany. This change let insn 7 be
combineed with @pred_broadcast patterns (insn 8) at combine pass. However,
insn 6 and insn 7 are combined for SF and DF mode, but not for HF mode, and
the fail combined leads to insn 7 and insn 8 be combined. The reason of the
fail combined is the local_pic_loadhf pattern doesn't exist when only enable
zfhmin(implied by zvfh).

Therefore, when only zfhmin but not zfh is enabled, the define_insn of
*local_pic_load<ANYF:mode> must also be able to produce the pattern for
*load_pic_loadhf pattern, since the zfhmin extension also includes a
half floating-point load/store instructions. So, I added an ANFLSF Iterator
and applied it to local_pic_load/store define_insns. I have checked other ANYF
usage scenarios and feel that this is the only place that needs to be corrected.
I may have missed something, please correct. Thanks.

RTL 1:

(insn 6 3 7 2 (set (reg:DI 137)
        (high:DI (symbol_ref/u:DI ("*.LC0") [flags 0x82]))) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 179 {*movdi_64bit}
     (nil))
(insn 7 6 8 2 (set (reg:HF 136)
        (mem/u/c:HF (lo_sum:DI (reg:DI 137)
                (symbol_ref/u:DI ("*.LC0") [flags 0x82])) [0  S2 A16])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 126 {*movhf_hardfloat}
     (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4])
        (nil)))

RTL 2:

(insn 6 3 7 2 (set (reg/f:DI 137)
        (symbol_ref/u:DI ("*.LC0") [flags 0x82])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 179 {*movdi_64bit}
     (nil))
(insn 7 6 8 2 (set (reg:HF 136)
        (mem/u/c:HF (reg/f:DI 137) [0  S2 A16])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 126 {*movhf_hardfloat}
     (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4])
        (nil)))
(insn 8 7 9 2 (set (reg:V2HF 135)
        (if_then_else:V2HF (unspec:V2BI [
                    (const_vector:V2BI [
                            (const_int 1 [0x1]) repeated x2
                        ])
                    (const_int 2 [0x2]) repeated x3
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (vec_duplicate:V2HF (reg:HF 136))
            (unspec:V2HF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":6:1 discrim 3 1389 {*pred_broadcastv2hf}
     (nil))

Best,
Lehua

gcc/ChangeLog:

	* config/riscv/iterators.md (TARGET_HARD_FLOAT || TARGET_ZFINX): New.
	* config/riscv/pic.md (*local_pic_load<ANYF:mode>): Change ANYF.
	(*local_pic_load<ANYLSF:mode>): To ANYLSF.
	(*local_pic_load_32d<ANYF:mode>): Ditto.
	(*local_pic_load_32d<ANYLSF:mode>): Ditto.
	(*local_pic_store<ANYF:mode>): Ditto.
	(*local_pic_store<ANYLSF:mode>): Ditto.
	(*local_pic_store_32d<ANYF:mode>): Ditto.
	(*local_pic_store_32d<ANYLSF:mode>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/_Float16-zfhmin-4.c: New test.
	* gcc.target/riscv/_Float16-zhinxmin-4.c: New test.

3709ca09

RISC-V: Revert the convert from vmv.s.x to vmv.v.i · 86d80395

Lehua Ding authored 1 year ago


Hi,

This patch revert the convert from vmv.s.x to vmv.v.i and add new pattern
optimize the special case when the scalar operand is zero.

Currently, the broadcast pattern where the scalar operand is a imm
will be converted to vmv.v.i from vmv.s.x and the mask operand will be
converted from 00..01 to 11..11. There are some advantages and
disadvantages before and after the conversion after discussing
with Juzhe offline and we chose not to do this transform.

Before:

  Advantages: The vsetvli info required by vmv.s.x has better compatibility since
  vmv.s.x only required SEW and VLEN be zero or one. That mean there
  is more opportunities to combine with other vsetlv infos in vsetvl pass.

  Disadvantages: For non-zero scalar imm, one more `li rd, imm` instruction
  will be needed.

After:

  Advantages: No need `li rd, imm` instruction since vmv.v.i support imm operand.

  Disadvantages: Like before's advantages. Worse compatibility leads to more
  vsetvl instrunctions need.

Consider the bellow C code and asm after autovec.
there is an extra insn (vsetivli zero, 1, e32, m1, ta, ma)
after converted vmv.s.x to vmv.v.i.

```
int foo1(int* restrict a, int* restrict b, int *restrict c, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
      sum += a[i] * b[i];

    return sum;
}
```

asm (Before):

```
foo1:
        ble     a3,zero,.L7
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.v.i v1,0
.L6:
        vsetvli a5,a3,e32,m1,tu,ma
        slli    a4,a5,2
        sub     a3,a3,a5
        vle32.v v2,0(a0)
        vle32.v v3,0(a1)
        add     a0,a0,a4
        add     a1,a1,a4
        vmacc.vv        v1,v3,v2
        bne     a3,zero,.L6
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.s.x v2,zero
        vredsum.vs      v1,v1,v2
        vmv.x.s a0,v1
        ret
.L7:
        li      a0,0
        ret
```

asm (After):

```
foo1:
        ble     a3,zero,.L4
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.v.i v1,0
.L3:
        vsetvli a5,a3,e32,m1,tu,ma
        slli    a4,a5,2
        sub     a3,a3,a5
        vle32.v v2,0(a0)
        vle32.v v3,0(a1)
        add     a0,a0,a4
        add     a1,a1,a4
        vmacc.vv        v1,v3,v2
        bne     a3,zero,.L3
        vsetivli        zero,1,e32,m1,ta,ma
        vmv.v.i v2,0
        vsetvli a2,zero,e32,m1,ta,ma
        vredsum.vs      v1,v1,v2
        vmv.x.s a0,v1
        ret
.L4:
        li      a0,0
        ret
```

Best,
Lehua

Co-Authored-By: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

	* config/riscv/predicates.md (vector_const_0_operand): New.
	* config/riscv/vector.md (*pred_broadcast<mode>_zero): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar_move-5.c: Update.
	* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.

86d80395

RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl · c4391685

Lehua Ding authored 1 year ago

Hi,

This little patch fix the fail testcase
(gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c)
after apply this patch
(https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627121.html).
The specific reason is that the vsetvl pass has bug and this patch
forbidden the fuse of this case. This patch needs to be committed
before that patch to work.

Best,
Lehua

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion):
	Forbidden.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c:
	Address failure due to uninitialized vtype register.

c4391685

Daily bump. · 1eb2433f
GCC Administrator authored 1 year ago

1eb2433f

Aug 17, 2023

Revert "libstdc++: Reuse double overload of __convert_to_v if possible" · b860e657
Jonathan Wakely authored 1 year ago
```
This reverts commit aad83d61.

libstdc++-v3/ChangeLog:

	* config/locale/generic/c_locale.cc:
```
b860e657

libstdc++: Replace global std::string objects in tzdb.cc · d82a85b6

Jonathan Wakely authored 1 year ago

When the library is built with --disable-libstdcxx-dual-abi the only
type of std::string supported is the COW string, and the two global
std::string objects in tzdb.cc have to allocate memory. I added them
thinking they would fit in the SSO string buffer, but that's not the
case when the library only uses COW strings.

Replace them with string_view objects to avoid any allocations.

libstdc++-v3/ChangeLog:

	* src/c++20/tzdb.cc (tzdata_file, leaps_file): Change type to
	std::string_view.

d82a85b6

libstdc++: Reuse double overload of __convert_to_v if possible · aad83d61

Jonathan Wakely authored 1 year ago

For targets where double and long double have the same representation we
can reuse the same __convert_to_v code for both types. This will
slightly reduce the size of the compiled code in the library.

libstdc++-v3/ChangeLog:

	* config/locale/generic/c_locale.cc (__convert_to_v): Reuse
	double overload for long double if possible.

aad83d61

libstdc++: Micro-optimize construction of named std::locale · 74c019b5

Jonathan Wakely authored 1 year ago

This shaves about 100ns off the std::locale constructor for named
locales (which is only about 1% of the total time).

Using !*s instead of !strcmp(s, "") doesn't make any difference as GCC
optimizes that already even at -O1. !strcmp(s, "C") is optimized at -O2
so replacing that with s[0] == 'C' && s[1] == '\0' only matters for the
--enable-libstdcxx-debug builds. But !strcmp(s, "POSIX") always makes a
call to strcmp at any optimization level. We make that strcmp call,
maybe several times, for any locale name except for "C" (which will be
matched before we get to the check for "POSIX").

For most targets, locale names begin with a lowercase letter and the
only one that begins with 'P' is "POSIX". Replacing !strcmp(s, "POSIX")
with s[0] == 'P' && !strcmp(s+1, "OSIX") means that we avoid calling
strcmp unless the string really does match "POSIX".

Maybe more importantly, I find is_C_locale(s) easier to read than
strcmp(s, "C") == 0 || strcmp(s, "POSIX") == 0, and !is_C_locale(s)
easier to read than strcmp(s, "C") != 0 && strcmp(s, "POSIX") != 0.

libstdc++-v3/ChangeLog:

	* src/c++98/localename.cc (is_C_locale): New function.
	(locale::locale(const char*)): Use is_C_locale.

74c019b5

libstdc++: Optimize std::string::assign(Iter, Iter) [PR110945] · cc3d7baf

Jonathan Wakely authored 1 year ago

Calling string::assign(Iter, Iter) with "foreign" iterators (not the
string's own iterator or pointer types) currently constructs a temporary
string and then calls replace to copy the characters from it. That means
we copy from the iterators twice, and if the replace operation has to
grow the string then we also allocate twice.

By using *this = basic_string(first, last, get_allocator()) we only
perform a single allocation+copy and then do a cheap move assignment
instead of a second copy (and possible allocation). But that alternative
has to be done conditionally, so that we don't pessimize the native
iterator case (the string's own iterator and pointer types) which
currently select efficient overloads of replace which will not allocate
at all if the string already has sufficient capacity. For C++20 we can
extend that efficient case to work for any contiguous iterator with the
right value type, not just for the string's native iterators.

So the change is to inline the code that decides whether to work in
place or to allocate+copy (instead of deciding that via overload
resolution for replace), and for the allocate+copy case do a move
assignment instead of another call to replace.

For C++98 there is no change, as we can't do an efficient move
assignment anyway, so keep the current code.

We can also simplify assign(initializer_list<CharT>) because the backing
array for an initializer_list is always disjunct with *this, so most of
the code in _M_replace is not needed.

libstdc++-v3/ChangeLog:

	PR libstdc++/110945
	* include/bits/basic_string.h (basic_string::assign(Iter, Iter)):
	Dispatch to _M_replace or move assignment from a temporary,
	based on the iterator type.

cc3d7baf

libstdc++: Add std::formatter specializations for extended float types · 6cf214b4

Jonathan Wakely authored 1 year ago

This makes it possible to format _Float32, _Float64 etc. in C++20 mode.
Previously it was only possible to format them in C++23 when the
<stdfloat> typedefs and the std::to_chars overloads were defined.

Instead of relying on std::to_chars for those types, we can just reuse
the formatters for float, double and long double. This also avoids
template bloat by reusing the same specializations instead of
instantiating __formatter_fp for every different type.

libstdc++-v3/ChangeLog:

	* include/std/format (formatter): Add partial specializations
	for extended floating-point types.
	* testsuite/std/format/functions/format.cc: Move test_float128()
	to ...
	* testsuite/std/format/formatter/ext_float.cc: New test.

6cf214b4

libstdc++: Define std::numeric_limits<_FloatNN> before C++23 · 1a566fdd

Jonathan Wakely authored 1 year ago

The extended floating-point types such as _Float32 are supported by GCC
prior to C++23, you just can't use the standard-conforming names from
<stdfloat> to refer to them. This change defines the specializations of
std::numeric_limits for those types for older dialects, not only for
C++23.

libstdc++-v3/ChangeLog:

	* include/bits/c++config (__gnu_cxx::__bfloat16_t): Define
	whenever __BFLT16_DIG__ is defined, not only for C++23.
	* include/std/limits (numeric_limits<bfloat16_t>): Likewise.
	(numeric_limits<_Float16>, numeric_limits<_Float32>)
	(numeric_limits<_Float64>): Likewise for other extended
	floating-point types.

1a566fdd

libstdc++: Fix -Wunused-parameter in <experimental/internet> · 8ee74c5a

Jonathan Wakely authored 1 year ago

libstdc++-v3/ChangeLog:

	* include/experimental/internet (address_v4::to_string): Remove
	unused parameter name.

8ee74c5a

libstdc++: Make __cmp_cat::__unseq constructor consteval · 84cff28f

Jonathan Wakely authored 1 year ago

This constructor should only ever be used with a literal 0 as the
argument, so we can make it consteval. This has the nice advantage that
it is expanded immediately in the front end, and so GDB will never step
into the __cmp_cat::__unseq::__unseq(__unseq*) constructor that is
uninteresting and probably confusing to users.

libstdc++-v3/ChangeLog:

	* libsupc++/compare (__cmp_cat::__unseq): Make ctor consteval.
	* testsuite/18_support/comparisons/categories/zero_neg.cc: Prune
	excess errors caused by invalid consteval calls.

84cff28f

libstdc++: Simplify chrono::__units_suffix using std::format · c992acdc

Jonathan Wakely authored 1 year ago

For std::chrono formatting we can simplify __units_suffix by using
std::format_to to generate the "[n/m]s" suffix with the correct
character type and write directly to the output iterator, so it doesn't
need to be widened using ctype. We can't remove the use of ctype::widen
for formatting a time zone abbreviation as a wide string, because that
can contain arbitrary characters that can't be widened by
__to_wstring_numeric.

This also fixes a bug in the chrono formatter for %Z which created a
dangling wstring_view.

libstdc++-v3/ChangeLog:

	* include/bits/chrono_io.h (__units_suffix_misc): Remove.
	(__units_suffix): Return a known suffix as string view, do not
	write unknown suffixes to a buffer.
	(__fmt_units_suffix): New function that formats the suffix using
	std::format_to.
	(operator<<, __chrono_formatter::_M_q): Use __fmt_units_suffix.
	(__chrono_formatter::_M_Z): Correct lifetime of wstring.

c992acdc

libstdc++: Rework std::format support for wchar_t · 023a62b7

Jonathan Wakely authored 1 year ago

This changes how std::format creates wide strings, by replacing uses of
std::ctype<wchar_t>::widen with the recently-added __to_wstring_numeric
helper function. This removes the dependency on the locale, which should
only be used for locale-specific formats such as {:Ld}.

Also disable all the wide string formatting support if the
_GLIBCXX_USE_WCHAR_T macro is not defined. This is consistent with other
wchar_t support being disabled if the library is built without that
macro defined.

libstdc++-v3/ChangeLog:

	* include/std/format [_GLIBCXX_USE_WCHAR_T]: Guard all wide
	string formatters with this macro.
	(__formatter_int::_M_format_int, __formatter_fp::format)
	(formatter<const void*, C>::format): Use __to_wstring_numeric
	instead of std::ctype::widen.
	(__formatter_fp::_M_localize): Use hardcoded wchar_t values
	instead of std::ctype::widen.
	* testsuite/std/format/functions/format.cc: Add more checks for
	wstring formatting of arithmetic types.

023a62b7

libstdc++: Implement std::to_string in terms of std::format (P2587R3) · aeed687f

Jonathan Wakely authored 1 year ago

This change for C++26 affects std::to_string for floating-point
arguments, so that they should be formatted using std::format("{}", v)
instead of using sprintf. The modified specification in the standard
also affects integral arguments, but there's no observable difference
for them, and we already use std::to_chars for them anyway.

To avoid <string> depending on all of <format>, this change actually
just uses std::to_chars directly instead of using std::format. This is
equivalent, because the format spec "{}" doesn't use any of the other
features of std::format.

libstdc++-v3/ChangeLog:

	* include/bits/basic_string.h (to_string(floating-point-type)):
	Implement using std::to_chars for C++26.
	* include/bits/version.def (__cpp_lib_to_string): Define.
	* include/bits/version.h: Regenerate.
	* testsuite/21_strings/basic_string/numeric_conversions/char/dr1261.cc:
	Adjust expected result in C++26 mode.
	* testsuite/21_strings/basic_string/numeric_conversions/char/to_string.cc:
	Likewise.
	* testsuite/21_strings/basic_string/numeric_conversions/wchar_t/dr1261.cc:
	Likewise.
	* testsuite/21_strings/basic_string/numeric_conversions/wchar_t/to_wstring.cc:
	Likewise.
	* testsuite/21_strings/basic_string/numeric_conversions/char/to_string_float.cc:
	New test.
	* testsuite/21_strings/basic_string/numeric_conversions/wchar_t/to_wstring_float.cc:
	New test.
	* testsuite/21_strings/basic_string/numeric_conversions/version.cc:
	New test.

aeed687f

libstdc++: Optimize std::to_string using std::string::resize_and_overwrite · 51ec07b1

Jonathan Wakely authored 1 year ago

This uses std::string::__resize_and_overwrite to avoid initializing the
string buffer with characters that are immediately overwritten. This
results in about 6% better performance for the std_to_string case in
int-benchmark.cc from https://github.com/fmtlib/format-benchmark

This requires a change to a testcase. The previous implementation
guaranteed that the string returned from std::to_string(integral-type)
would have no excess capacity, because it was constructed with the
correct length. The new implementation constructs an empty string and
then resizes it with resize_and_overwrite, which over-allocates. This
means that the "no-excess capacity" guarantee no longer holds.

We can also greatly improve the performance of std::to_wstring by using
std::to_string and then widening it with a new helper function, instead
of using std::swprintf to do the formatting.

libstdc++-v3/ChangeLog:

	* include/bits/basic_string.h (to_string(integral-type)): Use
	resize_and_overwrite when available.
	(__to_wstring_numeric): New helper functions.
	(to_wstring): Use std::to_string then __to_wstring_numeric.
	* testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc:
	Remove check for no excess capacity.

51ec07b1

libstdc++: Define std::string::resize_and_overwrite for C++11 and COW string · 95c2b0cc

Jonathan Wakely authored 1 year ago

There are several places in the library where we can improve performance
using resize_and_overwrite so it's inconvenient only being able to use
it in C++23 mode, and only for cxx11 strings. This adds it for COW
strings, and also adds __resize_and_overwrite as an extension for C++11
mode.

The new __resize_and_overwrite is available for C++11 and later, so
within the library we can use that consistently even in C++23.  In order
to avoid making a copy (which might not be possible for non-copyable,
non-movable types) the callable is passed to resize_and_overwrite as an
lvalue reference.  Unlike wrapping it in std::ref(op) this ensures that
invoking it as std::move(op)(n, p) will use the correct value category.
It also avoids any overhead that would be added by wrapping it in a
lambda like [&op](auto p, auto n) { return std::move(op)(p, n); }.

Adjust std::format to use the new __resize_and_overwrite, which we can
assume exists because we only use std::basic_string<char> and
std::basic_string<wchar_t>, so no program-defined specializations.

The uses in <experimental/internet> cannot be replaced, because those
are type-dependent on an Allocator template parameter, which could mean
they use program-defined specializations of std::basic_string that don't
have the __resize_and_overwrite extension.

libstdc++-v3/ChangeLog:

	* include/bits/basic_string.h (__resize_and_overwrite): New
	function.
	* include/bits/basic_string.tcc (__resize_and_overwrite): New
	function.
	(resize_and_overwrite): Simplify by using reserve instead of
	growing the string manually. Adjust for C++11 compatibility.
	* include/bits/cow_string.h (resize_and_overwrite): New
	function.
	(__resize_and_overwrite): New function.
	* include/bits/version.def (__cpp_lib_string_resize_and_overwrite):
	Do not depend on cxx11abi.
	* include/bits/version.h: Regenerate.
	* include/std/format (__formatter_fp::_S_resize_and_overwrite):
	Remove.
	(__formatter_fp::format, __formatter_fp::_M_localize): Use
	__resize_and_overwrite instead of _S_resize_and_overwrite.
	* testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite.cc:
	Adjust for C++11 compatibility when included by ...
	* testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite_ext.cc:
	New test.

95c2b0cc

Fix range-ops operator_addr. · dc48d1d1

Andrew MacLeod authored 1 year ago

Lack of symbolic information prevents op1_range from beig able to draw
the same conclusions as fold_range can.

	PR tree-optimization/111009
	gcc/
	* range-op.cc (operator_addr_expr::op1_range): Be more restrictive.

	gcc/testsuite/
	* gcc.dg/pr111009.c: New.

dc48d1d1

RISCV: Add rotate immediate regression test · d7b6cad9

Patrick O'Neill authored 1 year ago


This adds new regression tests to ensure half-register rotations are
correctly optimized into rori instructions.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zbb-rol-ror-08.c: New test.
	* gcc.target/riscv/zbb-rol-ror-09.c: New test.

Co-authored-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>

d7b6cad9

libstdc++: Implement P2770R0 changes to join_view / join_with_view · bad357dd

Patrick Palka authored 1 year ago


This C++23 paper fixes an issue in these views when adapting a certain
kind of non-forward range, and we treat it as a DR against C++20.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/regex.h (regex_iterator::iterator_concept):
	Define for C++20 as per P2770R0.
	(regex_token_iterator::iterator_concept): Likewise.
	* include/std/ranges (__detail::__as_lvalue): Define.
	(join_view::_Iterator): Befriend join_view.
	(join_view::_Iterator::_M_satisfy): Use _M_get_outer
	instead of _M_outer.
	(join_view::_Iterator::_M_get_outer): Define.
	(join_view::_Iterator::_Iterator): Split constructor taking
	_Parent argument into two as per P2770R0.  Remove constraint on
	default constructor.
	(join_view::_Iterator::_M_outer): Make this data member present
	only when the underlying range is forward.
	(join_view::_Iterator::operator++): Use _M_get_outer instead of
	_M_outer.
	(join_view::_Iterator::operator--): Use __as_lvalue helper.
	(join_view::_Iterator::operator==): Adjust constraints as per
	P2770R0.
	(join_view::_Sentinel::__equal): Use _M_get_outer instead of
	_M_outer.
	(join_view::_M_outer): New data member when the underlying range
	is non-forward.
	(join_view::begin): Adjust definition as per P2770R0.
	(join_view::end): Likewise.
	(join_with_view::_M_outer_it): New data member when the
	underlying range is non-forward.
	(join_with_view::begin): Adjust definition as per P2770R0.
	(join_with_view::end): Likewise.
	(join_with_view::_Iterator::_M_outer_it): Make this data member
	present only when the underlying range is forward.
	(join_with_view::_Iterator::_M_get_outer): Define.
	(join_with_view::_Iterator::_Iterator): Split constructor
	taking _Parent argument into two as per P2770R0.  Remove
	constraint on default constructor.
	(join_with_view::_Iterator::_M_update_inner): Adjust definition
	as per P2770R0.
	(join_with_view::_Iterator::_M_get_inner): Likewise.
	(join_with_view::_Iterator::_M_satisfy): Adjust calls to
	_M_get_inner.  Use _M_get_outer instead of _M_outer_it.
	(join_with_view::_Iterator::operator==): Adjust constraints
	as per P2770R0.
	(join_with_view::_Sentinel::operator==): Use _M_get_outer
	instead of _M_outer_it.
	* testsuite/std/ranges/adaptors/p2770r0.cc: New test.

bad357dd

libstdc++: Convert _RangeAdaptorClosure into a CRTP base [PR108827] · 4a6f3676

Patrick Palka authored 1 year ago


Using the CRTP idiom for this base class avoids bloating the size of a
pipeline when adding distinct empty range adaptor closure objects to it,
as detailed in section 4.1 of P2387R3.

But it means we can no longer define its operator| overloads as hidden
friends, since it'd mean each instantiation of _RangeAdaptorClosure
introduces its own distinct set of hidden friends.  So e.g. for the
outer | in

  x | (views::reverse | views::join)

ADL would find 6 distinct hidden operator| friends:

  two from _RangeAdaptorClosure<_Reverse>
  two from _RangeAdaptorClosure<_Join>
  two from _RangeAdaptorClosure<_Pipe<_Reverse, _Join>>

but we really only want to consider the last two.

We avoid this issue by instead defining the operator| overloads at
namespace scope alongside _RangeAdaptorClosure.  This should be fine
because the only types defined in this namespace are _RangeAdaptorClosure,
_RangeAdaptor, _Pipe and _Partial, so we don't have to worry about
unintentional ADL.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

	PR libstdc++/108827

libstdc++-v3/ChangeLog:

	* include/std/ranges (__adaptor::_RangeAdaptorClosure):
	Convert into a CRTP class template.  Move hidden operator|
	friends into namespace scope and adjust their constraints.
	(__closure::__is_range_adaptor_closure_fn): Define.
	(__closure::__is_range_adaptor_closure): Define.
	(__adaptor::_Partial): Adjust use of _RangeAdaptorClosure.
	(__adaptor::_Pipe): Likewise.
	(views::_All): Likewise.
	(views::_Join): Likewise.
	(views::_Common): Likewise.
	(views::_Reverse): Likewise.
	(views::_Elements): Likewise.
	(views::_Adjacent): Likewise.
	(views::_AsRvalue): Likewise.
	(views::_Enumerate): Likewise.
	(views::_AsConst): Likewise.
	* testsuite/std/ranges/adaptors/all.cc: Reinstate assertion
	expecting that adding empty range adaptor closure objects to a
	pipeline doesn't increase the size of a pipeline.

4a6f3676

[LRA]: When assigning stack slots to pseudos previously assigned to fp... · bd7257f0

Vladimir N. Makarov authored 1 year ago

[LRA]: When assigning stack slots to pseudos previously assigned to fp consider other spilled pseudos

The previous LRA patch can assign slot of conflicting pseudos to
pseudos spilled after prohibiting fp->sp elimination.  This patch
fixes this problem.

gcc/ChangeLog:

	* lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Moving
	slots_num initialization from here ...
	(lra_spill): ... to here before the 1st call of
	assign_stack_slot_num_and_sort_pseudos.  Add the 2nd call after
	fp->sp elimination.

bd7257f0

Add warning options -W[no-]compare-distinct-pointer-types · e1f45bea

Jose E. Marchesi authored 1 year ago

GCC emits pedwarns unconditionally when comparing pointers of
different types, for example:

  int xdp_context (struct xdp_md *xdp)
    {
        void *data = (void *)(long)xdp->data;
        __u32 *metadata = (void *)(long)xdp->data_meta;
        __u32 ret;

        if (metadata + 1 > data)
          return 0;
        return 1;
   }

  /home/jemarch/foo.c: In function ‘xdp_context’:
  /home/jemarch/foo.c:15:20: warning: comparison of distinct pointer types lacks a cast
         15 |   if (metadata + 1 > data)
                 |                    ^

LLVM supports an option -W[no-]compare-distinct-pointer-types that can
be used in order to enable or disable the emission of such warnings.
It is enabled by default.

This patch adds the same options to GCC.

Documentation and testsuite updated included.
Regtested in x86_64-linu-gnu.
No regressions observed.

gcc/ChangeLog:

	PR c/106537
	* doc/invoke.texi (Option Summary): Mention
	-Wcompare-distinct-pointer-types under `Warning Options'.
	(Warning Options): Document -Wcompare-distinct-pointer-types.

gcc/c-family/ChangeLog:

	PR c/106537
	* c.opt (Wcompare-distinct-pointer-types): New option.

gcc/c/ChangeLog:

	PR c/106537
	* c-typeck.cc (build_binary_op): Warning on comparing distinct
	pointer types only when -Wcompare-distinct-pointer-types.

gcc/testsuite/ChangeLog:

	PR c/106537
	* gcc.c-torture/compile/pr106537-1.c: New test.
	* gcc.c-torture/compile/pr106537-2.c: Likewise.
	* gcc.c-torture/compile/pr106537-3.c: Likewise.

e1f45bea

Fix code_helper unused argument warning for fr30 · ee40bdbf

Jan-Benedict Glaw authored 1 year ago

fr30 is the only target defining GO_IF_LEGITIMATE_ADDRESS right now, in
which case the `code_helper ch` argument to memory_address_addr_space_p()
is unused and emits a new warning.

gcc/ChangeLog:
	* recog.cc (memory_address_addr_space_p): Mark possibly unused
	argument as unused.

ee40bdbf

[PATCH] RISC-V: Deduplicate #error messages in testsuite · 1aaf3a64

Tsukasa OI authored 1 year ago

"#error Feature macro not defined" is required to test the existence of an
extension through the preprocessor.  However, multiple occurrence of the
exact same error message will confuse the developer once an error is
encountered.

This commit replaces such error messages to
"#error Feature macro for `EXT' not defined" to make which
macro is missing.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zvkn.c: Deduplicate #error messages.
	* gcc.target/riscv/zvkn-1.c: Ditto.
	* gcc.target/riscv/zvknc.c: Ditto.
	* gcc.target/riscv/zvknc-1.c: Ditto.
	* gcc.target/riscv/zvknc-2.c: Ditto.
	* gcc.target/riscv/zvkng.c: Ditto.
	* gcc.target/riscv/zvkng-1.c: Ditto.
	* gcc.target/riscv/zvkng-2.c: Ditto.
	* gcc.target/riscv/zvks.c: Ditto.
	* gcc.target/riscv/zvks-1.c: Ditto.
	* gcc.target/riscv/zvksc.c: Ditto.
	* gcc.target/riscv/zvksc-1.c: Ditto.
	* gcc.target/riscv/zvksc-2.c: Ditto.
	* gcc.target/riscv/zvksg.c: Ditto.
	* gcc.target/riscv/zvksg-1.c: Ditto.
	* gcc.target/riscv/zvksg-2.c: Ditto.

1aaf3a64

tree-optimization/111039 - abnormals and bit test merging · 482551a7

Richard Biener authored 1 year ago

The following guards the bit test merging code in if-combine against
the appearance of SSA names used in abnormal PHIs.

	PR tree-optimization/111039
	* tree-ssa-ifcombine.cc (ifcombine_ifandif): Check for
	SSA_NAME_OCCURS_IN_ABNORMAL_PHI.

	* gcc.dg/pr111039.c: New testcase.

482551a7

libgomp: call numa_available first when using libnuma · 8f3c4517

Tobias Burnus authored 1 year ago

The documentation requires that numa_available() is called and only
when successful, other libnuma function may be called. Internally,
it does a syscall to get_mempolicy with flag=0 (which would return
the default policy if mode were not NULL). If this returns -1 (and
not 0) and errno == ENOSYS, the Linux kernel does not have the
get_mempolicy syscall function; if so, numa_available() returns -1
(otherwise: 0).

libgomp/

	PR libgomp/111024
	* allocator.c (gomp_init_libnuma): Call numa_available; if
	not available or not returning 0, disable libnuma usage.

8f3c4517

doc: Fixes to RTL-SSA sample code · 84a5be47

Alex Coplan authored 1 year ago

This patch fixes up the code examples in the RTL-SSA documentation (the
sections on making insn changes) to reflect the current API.

The main issues are as follows:
 - rtl_ssa::recog takes an obstack_watermark & as the first parameter.
   Presumably this is intended to be the change attempt, so I've updated
   the examples to pass this through.
 - The variants of recog and restrict_movement that take an ignore
   predicate have been renamed with an _ignoring suffix, so I've
   updated callers to use those names.
 - A couple of minor "obvious" fixes to add a missing address-of
   operator and correct a variable name.

gcc/ChangeLog:

	* doc/rtl.texi: Fix up sample code for RTL-SSA insn changes.

84a5be47

RISC-V: Fix XPASS slp testcases · 903d9375

Lehua Ding authored 1 year ago

This patch fixs XPASS slp testcases on trunk by
making the conditions for xfail stricter.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/partial/slp-1.c: Fix.
	* gcc.target/riscv/rvv/autovec/partial/slp-16.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-17.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-18.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-19.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-6.c: Ditto.

903d9375

bpf: support `naked' function attributes in BPF targets · b7c50f68

Jose E. Marchesi authored 1 year ago

The kernel selftests and other BPF programs make extensive use of the
`naked' function attribute with bodies written using basic inline
assembly.  This patch adds support for the attribute to
bpf-unkonwn-none, makes it to inhibit warnings due to lack of explicit
`return' statement, and updates documentation and testsuite
accordingly.

Tested in x86_64-linux-gnu host and bpf-unknown-none target.

gcc/ChangeLog

	PR target/111046
	* config/bpf/bpf.cc (bpf_attribute_table): Add entry for the
	`naked' function attribute.
	(bpf_warn_func_return): New function.
	(TARGET_WARN_FUNC_RETURN): Define.
	(bpf_expand_prologue): Add preventive comment.
	(bpf_expand_epilogue): Likewise.
	* doc/extend.texi (BPF Function Attributes): Document the `naked'
	function attribute.

gcc/testsuite/ChangeLog

	* gcc.target/bpf/naked-1.c: New test.

b7c50f68

libstdc++: Fix std::format("{:F}", inf) to use uppercase · d07bce47

Jonathan Wakely authored 1 year ago

std::format was treating {:f} and {:F} identically on the basis that for
the fixed 1.234567 format there are no alphabetical characters that need
to be in uppercase. But that's wrong for infinities and NaNs, which
should be formatted as "INF" and "NAN" for {:F}.

libstdc++-v3/ChangeLog:

	* include/std/format (__format::_Pres_type): Add _Pres_F.
	(__formatter_fp::parse): Use _Pres_F for 'F'.
	(__formatter_fp::format): Set __upper for _Pres_F.
	* testsuite/std/format/functions/format.cc: Check formatting of
	infinity and NaN for each presentation type.

d07bce47

libstdc++: Regenerate Makefile.in · b10dfbb5
Jonathan Wakely authored 1 year ago
```
libstdc++-v3/ChangeLog:

	* include/Makefile.in: Regenerate.
```
b10dfbb5

Handle TYPE_OVERFLOW_UNDEFINED vectorized BB reductions · 99b5921b

Richard Biener authored 1 year ago

The following changes the gate to perform vectorization of BB reductions
to use needs_fold_left_reduction_p which in turn requires handling
TYPE_OVERFLOW_UNDEFINED types in the epilogue code generation by
promoting any operations generated there to use unsigned arithmetic.

The following does this, there's currently only v16qi where x86
supports a .REDUC_PLUS reduction for integral modes so I had to
add a x86 specific testcase using GIMPLE IL.

	* tree-vect-slp.cc (vect_slp_check_for_roots): Use
	!needs_fold_left_reduction_p to decide whether we can
	handle the reduction with association.
	(vectorize_slp_instance_root_stmt): For TYPE_OVERFLOW_UNDEFINED
	reductions perform all arithmetic in an unsigned type.

	* gcc.target/i386/vect-reduc-2.c: New testcase.

99b5921b

testsuite: Remove unused dg-line in · 17d670db

benjamin priour authored 1 year ago


Test case g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C
introduced by patch ce8cdf5b
emitted a warning for an unused dg-line variable.
This fixes up the blunder.

Signed-off-by: benjamin priour <vultkayn@gcc.gnu.org>

gcc/testsuite/ChangeLog:

	* g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C:
	Remove dg-line var declare_a.

17d670db

fixincludes: Update darwin_flt_eval_method for macOS 14 · 93f803d5

Rainer Orth authored 1 year ago

On macOS 14, a guard in <math.h> changed:

-- MacOSX13.3.sdk/usr/include/math.h	2023-04-19 01:54:44
+++ MacOSX14.0.sdk/usr/include/math.h	2023-08-01 08:42:43
@@ -22,0 +23 @@
+
@@ -43 +44 @@
-#if __FLT_EVAL_METHOD__ == 0
+#if __FLT_EVAL_METHOD__ == 0 || __FLT_EVAL_METHOD__ == -1
@@ -49 +50 @@
-#elif __FLT_EVAL_METHOD__ == 2 || __FLT_EVAL_METHOD__ == -1
+#elif __FLT_EVAL_METHOD__ == 2

Therefore the darwin_flt_eval_method fixincludes fix doesn't match any
longer, leading to a large number of testsuite failures like

/private/var/gcc/regression/master/14-gcc/build/gcc/include-fixed/math.h:69:5:
error: #error "Unsupported value of __FLT_EVAL_METHOD__."

where __FLT_EVAL_METHOD__ = 16.

This patch adjusts the fix to allow for both forms.

Tested with make check in fixincludes on x86_64-apple-darwin23.0.0 and
verifying that <math.h> has indeed been fixed as expected.

2023-08-16  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	fixincludes:
	* inclhack.def (darwin_flt_eval_method): Handle macOS 14 guard
	variant.
	* fixincl.x: Regenerate.
	* tests/base/math.h [DARWIN_FLT_EVAL_METHOD_CHECK]: Update test.

93f803d5

build: Allow for Xcode 15 ld -v output · 0beac920

Rainer Orth authored 1 year ago

Since Xcode 15 beta 6, ld -v output differs from previous versions:

* macOS 13/Xcode 14:

  @(#)PROGRAM:ld  PROJECT:ld64-857.1

* macOS 14/Xcode 15:

  @(#)PROGRAM:ld  PROJECT:dyld-1015.1

configure cannot handle the new form, so LD64_VERSION isn't set.

This patch fixes this.  The autoconf manual states that sed doesn't
portably support alternation, so I'm using two separate expressions to
extract the version number.

Tested on x86_64-apple-darwin23.0.0.

2023-08-16  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc:
	* configure.ac (gcc_cv_ld64_version): Allow for dyld in ld -v
	output.
	* configure: Regenerate.

0beac920

libstdc++: Disable PCH for tests that rely on include order · 51d702f3

Jonathan Wakely authored 1 year ago

These tests expect to be able to #undef a feature test macro and then
include <version> to get it redefined. But if <version> has already been
included by the <bits/stdc++.h> PCH then including it again does nothing
and the macro remains undefined.

libstdc++-v3/ChangeLog:

	* testsuite/24_iterators/move_iterator/p2520r0.cc: Add no_pch.
	* testsuite/std/format/functions/format.cc: Likewise.
	* testsuite/std/format/functions/format_c++23.cc: Likewise.

51d702f3

libstdc++: Fix testsuite no_pch directive · 91315f23

Jonathan Wakely authored 1 year ago

The { dg-add-options no_pch } directive is supposed to add a macro
definition that invalidates the PCH file, and ensures that the #include
directives in the test file are processed as written. But the proc that
adds the options actually removes all existing options, cancelling out
any previous dg-options directive.

This means that using no_pch will cause FAILs in a file that relies on
other options set by an earlier dg-options.

The no_pch directive was added for PR libstdc++/21769 where Janis
suggested adding it as return "$flags -D__GLIBCXX__=99999999" but what
was actually committed didn't include the $flags so replaced them.

Additionally, using no_pch  only prevents the precompiled version of
<bits/stdc++.h> from being included, it doesn't prevent the
non-precompiled version being included by -include bits/stdc++.h in the
test flags. Use regsub to filter that out of the options as well.

libstdc++-v3/ChangeLog:

	* testsuite/lib/dg-options.exp (add_options_for_no_pch): Remove
	any "-include bits/stdc++.h" from options and add the macro to
	the existing options instead of replacing them.

91315f23

RISC-V: Support RVV VFWREDOSUM.VS rounding mode intrinsic API · c6259c49

Pan Li authored 1 year ago


This patch would like to support the rounding mode API for the
VFWREDOSUM.VS as the below samples

* __riscv_vfwredosum_vs_f32m1_f64m1_rm
* __riscv_vfwredosum_vs_f32m1_f64m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(widen_freducop): Add frm_opt_type template arg.
	(vfwredosum_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfwredosum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-wredosum.c: New test.

c6259c49