Nov 19, 2020
Nathan Sidwell authored
This exposes the template specialization table, so the modules machinery may access it.  The hashed entity (tmpl, args & spec) is available, along with a hash table walker.  We also need a way of finding or inserting entries, along with some bookkeeping fns to deal with the instantiation and (partial) specialization lists.

gcc/cp/
	* cp-tree.h (struct spec_entry): Moved from pt.c.
	(walk_specializations, match_mergeable_specialization)
	(get_mergeable_specialization_flags)
	(add_mergeable_specialization): Declare.
	* pt.c (struct spec_entry): Moved to cp-tree.h.
	(walk_specializations, match_mergeable_specialization)
	(get_mergeable_specialization_flags)
	(add_mergeable_specialization): New.
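For orientation, a standalone sketch of the hashed entity and walker shape described above; GCC's internal 'tree' type is stubbed out here, and the walker signature is an assumption, not the exact GCC one.

    // Minimal sketch, self-contained for illustration only.
    typedef void *tree;   // stand-in for GCC's internal tree type

    // The hashed entity: the (tmpl, args & spec) triple named above.
    struct spec_entry
    {
      tree tmpl;   // the template being specialized
      tree args;   // the template argument vector
      tree spec;   // the resulting specialization
    };

    // Hypothetical walker shape: the modules machinery visits every
    // entry in the specialization table.
    typedef void (*spec_walker) (spec_entry *entry, void *data);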
-
Jonathan Wakely authored
Since glibc 2.27 the pthread_self symbol has been defined in libc rather than libpthread.  Because we only call pthread_self through a weak alias it's possible for statically linked executables to end up without a definition of pthread_self.  This crashes when trying to call an undefined weak symbol.

We can use the __GLIBC_PREREQ version check to detect the version of glibc where pthread_self is no longer in libpthread, and call it directly rather than through the weak reference.

It would be better to check for pthread_self in libc during configure instead of hardcoding the __GLIBC_PREREQ check.  That would be complicated by the fact that prior to glibc 2.27 libc.a didn't have the pthread_self symbol, but libc.so.6 did.  The configure checks would need to try to link both statically and dynamically, and the result would depend on whether the static libc.a happens to be installed during configure (which could vary between different systems using the same version of glibc).  Doing it properly is left for a future date, as that will be needed anyway after glibc moves all pthread symbols from libpthread to libc.  When that happens we should revisit the whole approach of using weak symbols for pthread symbols.

For the purposes of std::this_thread::get_id() we call pthread_self() directly when using glibc 2.27 or later.  Otherwise, if __gthread_active_p() is true then we know the libpthread symbol is available so we call that.  Otherwise, we are single-threaded and just use ((__gthread_t)1) as the thread ID.

An undesirable consequence of this change is that code compiled prior to the change might inline the old definition of this_thread::get_id() which always returns (__gthread_t)1 in a program that isn't linked to libpthread.  Code compiled after the change will use pthread_self() and so get a real TID.  That could result in the main thread having different thread::id values in different translation units.  This seems acceptable, as there are not expected to be many uses of thread::id in programs that aren't linked to libpthread.

An earlier version of this patch also changed __gthread_self() to use __GLIBC_PREREQ(2, 27) and only use the weak symbol for older glibc.  That might still make sense to do, but isn't needed by libstdc++ now.

libstdc++-v3/ChangeLog:

	PR libstdc++/95989
	* config/os/gnu-linux/os_defines.h (_GLIBCXX_NATIVE_THREAD_ID):
	Define new macro to get reliable thread ID.
	* include/bits/std_thread.h (this_thread::get_id): Use new macro
	if it's defined.
	* testsuite/30_threads/jthread/95989.cc: New test.
	* testsuite/30_threads/this_thread/95989.cc: New test.
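A self-contained sketch of the three-way selection described above; this is not the verbatim libstdc++ code, and gthread_active() is a hypothetical stand-in for __gthread_active_p().

    #include <pthread.h>   // on glibc this pulls in __GLIBC_PREREQ

    // Hypothetical stand-in for __gthread_active_p(), which tests a
    // weak symbol to see whether libpthread is really linked in.
    // Hard-wired here so the sketch is self-contained.
    static bool gthread_active () { return true; }

    #ifdef __GLIBC_PREREQ
    # if __GLIBC_PREREQ (2, 27)
    #  define PTHREAD_SELF_IN_LIBC 1
    # endif
    #endif

    pthread_t
    get_id_sketch ()
    {
    #ifdef PTHREAD_SELF_IN_LIBC
      // pthread_self lives in libc itself: safe to call directly even
      // in a statically linked executable without libpthread.
      return pthread_self ();
    #else
      if (gthread_active ())
        return pthread_self ();   // the libpthread weak symbol exists
      return (pthread_t) 1;       // single-threaded: dummy thread ID
    #endif
    }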
-
Nathan Sidwell authored
This patch exposes the constexpr hash table so that the modules machinery can save and load constexpr bodies.  While there I noticed that we could do a little constification of the hasher and comparator functions.  Also combine the saving machinery to a single function returning void -- nothing ever looked at its return value.

gcc/cp/
	* cp-tree.h (struct constexpr_fundef): Moved from constexpr.c.
	(maybe_save_constexpr_fundef): Declare.
	(register_constexpr_fundef): Take constexpr_fundef object, return
	void.
	* decl.c (maybe_save_function_definition): Delete, functionality
	moved to maybe_save_constexpr_fundef.
	(emit_coro_helper, finish_function): Adjust.
	* constexpr.c (struct constexpr_fundef): Moved to cp-tree.h.
	(constexpr_fundef_hasher::equal): Constify.
	(constexpr_fundef_hasher::hash): Constify.
	(retrieve_constexpr_fundef): Make non-static.
	(maybe_save_constexpr_fundef): Break out checking and duplication
	from ...
	(register_constexpr_fundef): ... here.  Just register the
	constexpr.
-
Jan Hubicka authored
	* fold-const.c (operand_compare::operand_equal_p): Fix thinko in
	COMPONENT_REF handling and guard types_same_for_odr by
	virtual_method_call_p.
	(operand_compare::hash_operand): Likewise.
-
Jakub Jelinek authored
The C and C++ FEs handle zero sized arrays differently: C uses NULL TYPE_MAX_VALUE on non-NULL TYPE_DOMAIN on complete ARRAY_TYPEs with bitsize_zero_node TYPE_SIZE, while the C++ FE likes to set TYPE_MAX_VALUE to the largest value (and min to the lowest).  Martin has used array_type_nelts in get_parm_array_spec, where the function on the C form of [0] arrays returns error_mark_node and the code crashes soon afterwards.

The following patch teaches array_type_nelts about this (e.g. dwarf2out already handles that as [0]).  While it will change what is_empty_type returns for certain types (e.g. struct S { int a[0]; };), as those types occupy zero bits in C, it shouldn't make an ABI difference.  So, the tree.c change makes the c-decl.c code handle the [0] arrays like any other constant extents, and the c-decl.c change just makes sure that if we'd run into error_mark_node e.g. from the VLA expressions, we don't crash on those.

2020-11-19  Jakub Jelinek  <jakub@redhat.com>

	PR c/97860
	* tree.c (array_type_nelts): For complete arrays with zero min
	and NULL max and zero size return -1.

	* c-decl.c (get_parm_array_spec): Bail out if nelts is
	error_operand_p.

	* gcc.dg/pr97860.c: New test.
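A standalone illustration of the [0] arrays in question (an assumed example, not the committed pr97860.c test):

    /* Zero-length arrays are a GNU extension; the C and C++ front ends
       encode the [0] domain differently, but both give it zero size.  */
    struct S { int a[0]; };

    int
    f (struct S *s)
    {
      /* The member occupies zero bits, so treating [0] like any other
         constant extent shouldn't change the ABI; array_type_nelts now
         returns -1 (nelts minus one, i.e. zero elements) for both
         front ends' forms of the type.  */
      return sizeof (s->a);   /* 0 as a GNU extension */
    }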
-
Marek Polacek authored
Since my r11-3092 the following is rejected with -std=c++20:

    struct T { explicit T(); };
    void fn(int n) { new T[1](); }

with "would use explicit constructor 'T::T()'".  It is because since that change we go into the P1009 block in build_new (array_p is false, but nelts is non-null and we're in C++20).  Since we only have (), we build a {} and continue to build_new_1, which then calls build_vec_init and then we error because the {} isn't CONSTRUCTOR_IS_DIRECT_INIT.

For (), which is value-initializing, we want to do what we were doing before: pass empty init and let build_value_init take care of it.

For various reasons I wanted to dig a little bit deeper into this, and as a result, I'm adding a test for [expr.new]/24 (and checked that our current behavior matches clang++).

gcc/cp/ChangeLog:

	PR c++/97523
	* init.c (build_new): When value-initializing an array new,
	leave the INIT as an empty vector.

gcc/testsuite/ChangeLog:

	PR c++/97523
	* g++.dg/expr/anew5.C: New test.
	* g++.dg/expr/anew6.C: New test.
-
Marek Polacek authored
Unfortunately, the otherwise beautiful

    for (constructor_elt &elt : *CONSTRUCTOR_ELTS (init))

is not immune to an empty constructor, so we have to check CONSTRUCTOR_ELTS first.

gcc/cp/ChangeLog:

	PR c++/97895
	* pt.c (do_auto_deduction): Don't crash when the constructor has
	zero elements.

gcc/testsuite/ChangeLog:

	PR c++/97895
	* g++.dg/cpp0x/auto54.C: New test.
-
Nathan Sidwell authored
This adds configure tests for features that modules can take advantage of -- and if they are not present has reduced or fallback functionality.

gcc/
	* configure.ac: Add tests for fstatat, sighandler_t, O_CLOEXEC,
	unix-domain and ipv6 sockets.
	* config.in: Rebuilt.
	* configure: Rebuilt.
-
Nathan Sidwell authored
It turns out there are legitimate cases for the new decl to not have lang-specific.

	PR c++/97905
gcc/cp/
	* decl.c (duplicate_decls): Relax new assert.
gcc/testsuite/
	* g++.dg/lookup/pr97905.C: New.
-
Dimitar Dimitrov authored
Add builtins for HALT and LMBD, per Texas Instruments document SPRUHV7C.  Use the new LMBD pattern to define an expand for clz.

Binutils [1] and sim [2] support for the LMBD instruction is merged now.

[1] https://sourceware.org/pipermail/binutils/2020-October/113901.html
[2] https://sourceware.org/pipermail/gdb-patches/2020-November/173141.html

gcc/ChangeLog:

	* config/pru/alu-zext.md: Add lmbd patterns for zero_extend
	variants.
	* config/pru/pru.c (enum pru_builtin): Add HALT and LMBD.
	(pru_init_builtins): Ditto.
	(pru_builtin_decl): Ditto.
	(pru_expand_builtin): Ditto.
	* config/pru/pru.h (CLZ_DEFINED_VALUE_AT_ZERO): Define PRU value
	for CLZ with zero value parameter.
	* config/pru/pru.md: Add halt, lmbd and clz patterns.
	* doc/extend.texi: Document PRU builtins.

gcc/testsuite/ChangeLog:

	* gcc.target/pru/halt.c: New test.
	* gcc.target/pru/lmbd.c: New test.
-
Richard Sandiford authored
Currently we have three vector cost models: cheap, dynamic and unlimited.  -O2 -ftree-vectorize uses “cheap” by default, but that's still relatively aggressive about peeling and aliasing checks, and can lead to significant code size growth.

This patch adds an even more conservative choice, which for lack of imagination I've called “very cheap”.  It only allows vectorisation if the vector code entirely replaces the scalar code.  It also requires one iteration of the vector loop to pay for itself, regardless of how often the loop iterates.  (If the vector loop needs multiple iterations to be beneficial then things are probably too close to call, and the conservative thing would be to stick with the scalar code.)

The idea is that this should be suitable for -O2, although the patch doesn't change any defaults itself.

I tested this by building and running a bunch of workloads for SVE, with three options:

  (1) -O2
  (2) -O2 -ftree-vectorize -fvect-cost-model=very-cheap
  (3) -O2 -ftree-vectorize [-fvect-cost-model=cheap]

All three builds used the default -msve-vector-bits=scalable and ran with the minimum vector length of 128 bits, which should give a worst-case bound for the performance impact.

The workloads included a mixture of microbenchmarks and full applications.  Because it's quite an eclectic mix, there's not much point giving exact figures.  The aim was more to get a general impression.

Code size growth with (2) was much lower than with (3).  Only a handful of tests increased by more than 5%, and all of them were microbenchmarks.

In terms of performance, (2) was significantly faster than (1) on microbenchmarks (as expected) but also on some full apps.  Again, performance only regressed on a handful of tests.

As expected, the performance of (3) vs. (1) and (3) vs. (2) is more of a mixed bag.  There are several significant improvements with (3) over (2), but also some (smaller) regressions.  That seems to be in line with -O2 -ftree-vectorize being a kind of -O2.5.

The patch reorders vect_cost_model so that values are in order of increasing aggressiveness, which makes it possible to use range checks.  The value 0 still represents “unlimited”, so “if (flag_vect_cost_model)” is still a meaningful check.

gcc/
	* doc/invoke.texi (-fvect-cost-model): Add a very-cheap model.
	* common.opt (fvect-cost-model=): Add very-cheap as a possible
	option.
	(fsimd-cost-model=): Likewise.
	(vect_cost_model): Add very-cheap.
	* flag-types.h (vect_cost_model): Add VECT_COST_MODEL_VERY_CHEAP.
	Put the values in order of increasing aggressiveness.
	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use
	range checks when comparing against VECT_COST_MODEL_CHEAP.
	(vect_prune_runtime_alias_test_list): Do not allow any alias
	checks for the very-cheap cost model.
	* tree-vect-loop.c (vect_analyze_loop_costing): Do not allow any
	peeling for the very-cheap cost model.  Also require one iteration
	of the vector loop to pay for itself.

gcc/testsuite/
	* gcc.dg/vect/vect-cost-model-1.c: New test.
	* gcc.dg/vect/vect-cost-model-2.c: Likewise.
	* gcc.dg/vect/vect-cost-model-3.c: Likewise.
	* gcc.dg/vect/vect-cost-model-4.c: Likewise.
	* gcc.dg/vect/vect-cost-model-5.c: Likewise.
	* gcc.dg/vect/vect-cost-model-6.c: Likewise.
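As a sketch of the kind of loop the very-cheap model is willing to vectorize (an assumed example, not one of the new vect-cost-model-*.c tests):

    /* { dg-options "-O2 -ftree-vectorize -fvect-cost-model=very-cheap" } */

    /* A fixed, vector-friendly trip count: the vector code can fully
       replace the scalar loop, with no peeling and no alias checks.  */
    void
    f (int *__restrict x, const int *__restrict y, const int *__restrict z)
    {
      for (int i = 0; i < 1024; ++i)
        x[i] = y[i] + z[i];
    }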
-
Jonathan Wakely authored
These tests use std::this_thread::sleep_for without including <thread>.

libstdc++-v3/ChangeLog:

	* testsuite/30_threads/async/async.cc: Include <thread>.
	* testsuite/30_threads/future/members/93456.cc: Likewise.
-
Wilco Dijkstra authored
Add an initial cost table for Cortex-A76 - this is copied from cortexa57_extra_costs but updated based on the Optimization Guide.  Use the new cost table on all Neoverse tunings and ensure the tunings are consistent for all.

As a result more compact code is generated with more combined shift+alu operations.  Eg. -mcpu=cortex-a76 will now merge the shifts in:

    int f(int x, int y) { return (x & y << 3) * (x | y << 3); }

	and	w2, w0, w1, lsl 3
	orr	w0, w0, w1, lsl 3
	mul	w0, w2, w0
	ret

SPEC2017 codesize improves by 0.02% and SPECINT2017 shows 0.24% gain.

2020-11-18  Wilco Dijkstra  <wdijkstr@arm.com>

gcc/
	* config/aarch64/aarch64.c (neoversen1_tunings): Use new
	cortexa76_extra_costs.
	(neoversev1_tunings): Likewise.
	(neoversen2_tunings): Likewise.
	* config/arm/aarch-cost-tables.h (cortexa76_extra_costs): Add
	new costs.
-
Wilco Dijkstra authored
Improve the inline memcpy expansion.  Use integer load/store for copies <= 24 bytes instead of SIMD.  Set the maximum copy to expand to 256 by default, except that -Os or no Neon expands up to 128 bytes.  When using LDP/STP of Q-registers, also use Q-register accesses for the unaligned tail, saving 2 instructions (eg. all sizes up to 48 bytes emit exactly 4 instructions).  Cleanup code and comments.

The codesize gain vs the GCC10 expansion is 0.05% on SPECINT2017.

2020-11-03  Wilco Dijkstra  <wdijkstr@arm.com>

gcc/
	* config/aarch64/aarch64.c (aarch64_expand_cpymem): Cleanup code
	and comments, tweak expansion decisions and improve tail
	expansion.
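For illustration, a fixed-size copy that exercises the improved expansion (an assumed example, not from the patch):

    /* A 48-byte copy: with the Q-register LDP/STP strategy described
       above this can expand to exactly 4 instructions, with the
       unaligned tail also using Q-register accesses.  */
    void
    copy48 (char *dst, const char *src)
    {
      __builtin_memcpy (dst, src, 48);
    }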
-
Eric Botcazou authored
We need to include limits.h (or <climits>) in adaint.c because of LLONG_MIN.

gcc/ada/ChangeLog:

	PR ada/97805
	* adaint.c: Include climits in C++ and limits.h otherwise.
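A sketch of the conditional include this describes, assuming the usual __cplusplus guard (adaint.c is compiled as both C and C++):

    /* Pick the right header for LLONG_MIN depending on the language
       adaint.c is being compiled as.  */
    #ifdef __cplusplus
    #include <climits>
    #else
    #include <limits.h>
    #endif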
-
Nathan Sidwell authored
This adds the capability to locate the main file on the user or system include paths.  That's extremely useful to users building header units.  Searching has to be requested (plain header-unit compilation will not search).  Also, to make include_next work as expected when building a header unit, we add a mechanism to retrofit a non-searched source file as one on the include path.

libcpp/
	* include/cpplib.h (enum cpp_main_search): New.
	(struct cpp_options): Add main_search field.
	(cpp_main_loc): Declare.
	(cpp_retrofit_as_include): Declare.
	* internal.h (struct cpp_reader): Add main_loc field.
	(_cpp_in_main_source_file): Not main if main is a header.
	* init.c (cpp_read_main_file): Use main_search option to locate
	main file.  Set main_loc.
	* files.c (cpp_retrofit_as_include): New.
-
Jonathan Wakely authored
This makes it possible to use std::thread without including the whole of <thread>.  It also makes this_thread::get_id() and this_thread::yield() available even when there is no gthreads support (e.g. when GCC is built with --disable-threads or --enable-threads=single).

In order for the std::thread::id return type of this_thread::get_id() to be defined, std::thread itself is defined unconditionally.  However the constructor that creates new threads is not defined for single-threaded builds.  The thread::join() and thread::detach() member functions are defined inline for single-threaded builds and just throw an exception (because we know the thread cannot be joinable if the constructor that creates joinable threads doesn't exist).

The thread::hardware_concurrency() member function is also defined inline and returns 0 (as suggested by the standard when the value "is not computable or well-defined").

The main benefit for most targets is that other headers such as <future> do not need to include the whole of <thread> just to be able to create a std::thread.  That avoids including <stop_token> and std::jthread where not required.  This is another partial fix for PR 92546.

This also means we can use this_thread::get_id() and this_thread::yield() in <stop_token> instead of using the gthread functions directly.  This removes some preprocessor conditionals, simplifying the code.

libstdc++-v3/ChangeLog:

	PR libstdc++/92546
	* include/Makefile.am: Add new <bits/std_thread.h> header.
	* include/Makefile.in: Regenerate.
	* include/std/future: Include new header instead of <thread>.
	* include/std/stop_token: Include new header instead of
	<bits/gthr.h>.
	(stop_token::_S_yield()): Use this_thread::yield().
	(_Stop_state_t::_M_requester): Change type to std::thread::id.
	(_Stop_state_t::_M_request_stop()): Use this_thread::get_id().
	(_Stop_state_t::_M_remove_callback(_Stop_cb*)): Likewise.
	Use __is_single_threaded() to decide whether to synchronize.
	* include/std/thread (thread, operator==, this_thread::get_id)
	(this_thread::yield): Move to new header.
	(operator<=>, operator!=, operator<, operator<=, operator>)
	(operator>=, hash<thread::id>, operator<<): Define even when
	gthreads not available.
	* src/c++11/thread.cc: Include <memory>.
	* include/bits/std_thread.h: New file.
	(thread, operator==, this_thread::get_id, this_thread::yield):
	Define even when gthreads not available.
	[!_GLIBCXX_HAS_GTHREADS] (thread::join, thread::detach)
	(thread::hardware_concurrency): Define inline.
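A user-visible sketch of the benefit: after this change a translation unit that needs only <future> can still construct a std::thread without pulling in all of <thread> (assumed example; link with -pthread):

    #include <future>
    #include <iostream>

    int
    main ()
    {
      std::promise<int> p;
      std::future<int> f = p.get_future ();
      // std::thread comes from the new <bits/std_thread.h>, included
      // by <future>, without dragging in <stop_token>/std::jthread.
      std::thread t ([&p] { p.set_value (42); });
      std::cout << f.get () << '\n';
      t.join ();
    }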
-
Jonathan Wakely authored
I recently added overflow checks to src/c++11/futex.cc for PR 93456, but then changed the type of the timespec for PR 93421.  This meant the overflow checks were no longer using the right range, because the variable being written to might be smaller than time_t.  This introduces a new typedef that corresponds to the tv_sec member of the struct being passed to the syscall, and uses that typedef in the range checks.

libstdc++-v3/ChangeLog:

	PR libstdc++/93421
	PR libstdc++/93456
	* src/c++11/futex.cc (syscall_time_t): New typedef for the type
	of the syscall_timespec::tv_sec member.
	(relative_timespec, _M_futex_wait_until)
	(_M_futex_wait_until_steady): Use syscall_time_t in overflow
	checks, not time_t.
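A self-contained sketch of the corrected range check; the syscall_timespec layout below is an assumption for illustration, not the futex.cc definition.

    #include <ctime>
    #include <limits>

    struct syscall_timespec { long tv_sec; long tv_nsec; }; // assumed
    using syscall_time_t = decltype (syscall_timespec::tv_sec);

    // Range-check against the type actually passed to the syscall,
    // which may be narrower than time_t, before narrowing.
    inline bool
    fits_in_syscall_time (std::time_t s)
    {
      return s <= static_cast<std::time_t>
                    (std::numeric_limits<syscall_time_t>::max ());
    }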
-
Nathan Sidwell authored
In preparing module patch 7 I realized there was a cleanup I could make to simplify it.  This is that cleanup.  Also, when doing the cleanup I noticed some macros had been turned into inline functions, but not renamed to the preprocessor's internal namespace (_cpp_$INTERNAL rather than cpp_$USER).  Thus, this renames those functions, deletes an internal field of the file structure, and determines whether we're in the main file by comparing to pfile->main_file, the _cpp_file of the main file.

libcpp/
	* internal.h (cpp_in_system_header): Rename to ...
	(_cpp_in_system_header): ... here.
	(cpp_in_primary_file): Rename to ...
	(_cpp_in_main_source_file): ... here.  Compare main_file equality
	and check main_search value.
	* lex.c (maybe_va_opt_error, _cpp_lex_direct): Adjust for rename.
	* macro.c (_cpp_builtin_macro_text): Likewise.
	(replace_args): Likewise.
	* directives.c (do_include_next): Likewise.
	(do_pragma_once, do_pragma_system_header): Likewise.
	* files.c (struct _cpp_file): Delete main_file field.
	(pch_open): Check pfile->main_file equality.
	(make_cpp_file): Drop cpp_reader parm, don't set main_file.
	(_cpp_find_file): Adjust.
	(_cpp_stack_file): Check pfile->main_file equality.
	(struct report_missing_guard_data): Add cpp_reader field.
	(report_missing_guard): Check pfile->main_file equality.
	(_cpp_report_missing_guards): Adjust.
-
Richard Biener authored
This fixes a typo in the TREE_CODE compare which should compare against TYPE_DECL, not TYPE_NAME.

2020-11-19  Richard Biener  <rguenther@suse.de>

	* fold-const.c (operand_compare::hash_operand): Fix typo.
-
Richard Biener authored
This adds dg-options "" to avoid the pedantic error on _Complex int.

2020-11-19  Richard Biener  <rguenther@suse.de>

	* gcc.dg/pr97897.c: Add dg-options.
-
Richard Biener authored
This refactors things so assigned ranks are dumped and the cache is consistently used also for PHIs.

2020-11-19  Richard Biener  <rguenther@suse.de>

	* tree-ssa-reassoc.c (get_rank): Refactor to consistently use the
	cache and dump ranks assigned.
-
Jan Hubicka authored
	* fold-const.c (operand_compare::operand_equal_p): Move
	OBJ_TYPE_REF matching to correct place; drop OEP_ADDRESS_OF for
	TOKEN, OBJECT and class.
	(operand_compare::hash_operand): Hash ODR type for OBJ_TYPE_REF.
-
Joel Hutton authored
Add aarch64 vec_widen_lshift_lo/hi patterns and fix a bug it triggers in the mid-end.  This pattern takes one vector with N elements of size S, shifts each element left by the element width and stores the results as N elements of size 2*S (in 2 result vectors).  The aarch64 backend implements this with the shll, shll2 instruction pair.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo<mode>
	patterns.
	* tree-vect-stmts.c (vectorizable_conversion): Fix for
	widen_lshift case.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-lshift.c: New test.
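A loop of the shape the new pattern targets, sketched as an assumed example (not the committed vect-widen-lshift.c test):

    /* Each 16-bit element is shifted left by the element width and
       widened to 32 bits, matching aarch64's shll/shll2 pair.  */
    void
    widen_shift (unsigned short *__restrict in,
                 unsigned int *__restrict out, int n)
    {
      for (int i = 0; i < n; ++i)
        out[i] = (unsigned int) in[i] << 16;
    }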
-
Joel Hutton authored
Add widening add and subtract patterns to tree-vect-patterns.  Update the widened code of patterns that detect PLUS_EXPR to also detect WIDEN_PLUS_EXPR.  These patterns take 2 vectors with N elements of size S and perform an add/subtract on the elements, storing the results as N elements of size 2*S (in 2 result vectors).  This is implemented in the aarch64 backend as addl,addl2 and subl,subl2 respectively.  Add aarch64 tests for the patterns.

gcc/ChangeLog:

	* doc/generic.texi: Document new widen_plus/minus_lo/hi tree
	codes.
	* doc/md.texi: Document new widening add/subtract hi/lo optabs.
	* expr.c (expand_expr_real_2): Add widen_add, widen_subtract
	cases.
	* optabs-tree.c (optab_for_tree_code): Add case for widening
	optabs.
	* optabs.def (OPTAB_D): Define vectorized widen add, subtracts.
	* tree-cfg.c (verify_gimple_assign_binary): Add case for widening
	adds, subtracts.
	* tree-inline.c (estimate_operator_cost): Add case for widening
	adds, subtracts.
	* tree-vect-generic.c (expand_vector_operations_1): Add case for
	widening adds, subtracts.
	* tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog
	pattern.
	(vect_recog_widen_sub_pattern): New recog pattern.
	(vect_recog_average_pattern): Update widened add code.
	* tree-vect-stmts.c (vectorizable_conversion): Add case for
	widened add, subtract.
	(supportable_widening_operation): Add case for widened add,
	subtract.
	* tree.def (WIDEN_PLUS_EXPR): New tree code.
	(WIDEN_MINUS_EXPR): New tree code.
	(VEC_WIDEN_PLUS_HI_EXPR): New tree code.
	(VEC_WIDEN_PLUS_LO_EXPR): New tree code.
	(VEC_WIDEN_MINUS_HI_EXPR): New tree code.
	(VEC_WIDEN_MINUS_LO_EXPR): New tree code.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: New test.
	* gcc.target/aarch64/vect-widen-sub.c: New test.
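For illustration, a loop of the shape the new recog patterns detect (an assumed example, not the committed vect-widen-add.c test):

    /* 16-bit inputs added into 32-bit results: N elements of size S
       in, N elements of size 2*S out, i.e. addl/addl2 on aarch64.  */
    void
    widen_add (unsigned short *__restrict a, unsigned short *__restrict b,
               unsigned int *__restrict c, int n)
    {
      for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
    }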
-
Joel Hutton authored
Add widening add and subtract patterns to the aarch64 backend.  These allow taking vectors of N elements of size S and performing an add/subtract on the high or low half, widening the resulting elements and storing N/2 elements of size 2*S.  These correspond to the addl,addl2,subl,subl2 instructions.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md: New patterns
	vec_widen_saddl_lo/hi_<mode>.
-
Richard Biener authored
We need to fold the stmt to canonicalize MEM_REFs which means we're back to using replace_uses_by.  Which means we need dominators to not require a CFG cleanup upthread.

2020-11-19  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/97901
	* tree-ssa-propagate.c (clean_up_loop_closed_phi): Compute
	dominators and use replace_uses_by.

	* gcc.dg/torture/pr97901.c: New testcase.
-
Eric Botcazou authored
The Ada language supports fixed-point types as first-class citizens so they need to be described as-is in the debug info.  We devised the langhook get_fixed_point_type_info for this purpose a few years ago, but it comes with a limitation for the representation of the scale factor that we would need to lift in order to be able to represent more fixed-point types.

gcc/ChangeLog:

	* dwarf2out.h (struct fixed_point_type_info) <scale_factor>: Turn
	numerator and denominator into a tree.
	* dwarf2out.c (base_type_die): In the case of a fixed-point type
	with arbitrary scale factor, call add_scalar_info on numerator and
	denominator to emit the appropriate attributes.

gcc/ada/ChangeLog:

	* exp_dbug.adb (Is_Handled_Scale_Factor): Delete.
	(Get_Encoded_Name): Do not call it.
	* gcc-interface/decl.c (gnat_to_gnu_entity) <Fixed_Point_Type>:
	Tidy up and always use a meaningful description for arbitrary
	scale factors.
	* gcc-interface/misc.c (gnat_get_fixed_point_type_info): Remove
	obsolete block and adjust the description of the scale factor.
-
Richard Biener authored
This fixes complex lowering to not put constants into abnormal edge PHI values by making sure abnormally used SSA names are VARYING in its propagation lattice.

2020-11-19  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/97897
	* tree-complex.c (complex_propagate::visit_stmt): Make sure
	abnormally used SSA names are VARYING.
	(complex_propagate::visit_phi): Likewise.
	* tree-ssa.c (verify_phi_args): Verify PHI arguments on abnormal
	edges are SSA names.

	* gcc.dg/pr97897.c: New testcase.
-
Uros Bizjak authored
This pattern interferes with *<absneg:code><mode>2_1 when TARGET_SSE_MATH modes are active.  The combine pass is able to remove (use) RTXes and transforms *<absneg:code><mode>2_1 to *<absneg:code><mode>2_i387_1 where SSE alternatives are not available.

2020-11-19  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	* config/i386/i386.md (*<absneg:code><mode>2_i387_1): Disable for
	TARGET_SSE_MATH modes.

gcc/testsuite/
	* gcc.target/i386/pr97887.c: New test.
-
Jeff Law authored
So I didn't stay up late to work from Pago Pago this year and beat the stage1 close, but I do want to flush out the removal of cc0 from the H8 port this cycle.  Given these patches only affect the H8 and the H8 would be killed this cycle without the conversion, I think this is suitable even though we're past stage1 close.

This patch addresses an initial codegen issue that would have resulted in regressions after removal of cc0.  The compare/test eliminate pass is unable to handle multiple clobbers.  So patterns that clobber a scratch and also clobber a condition code are never used to eliminate a compare/test.

The H8 can shift 1 or 2 bits at a time depending on the precise model.  Not surprisingly we have multiple strategies to implement shifts, some of which clobber scratch registers -- but we have a clobber on every shift insn and as a result they can not participate in compare/test removal once cc0 is removed from the port.

This patch removes the clobber in the initial code generation in cases where it's obviously not needed, allowing those shifts to participate in compare/test removal in a future patch.  It has the advantage that it also generates slightly better code.  By installing this now the removal of cc0 is a smaller patch, but more importantly, it allows for a more direct comparison of the generated code before/after cc0 removal.

I've had my tester test before/after this patch with no regressions on the major H8 multilibs.  I've also spot checked the generated code and as expected it's ever-so-slightly better after this patch.

I'll be installing this on the trunk momentarily.  More patches will follow, though probably not in rapid succession as my time to push this stuff is very limited.

gcc/
	* config/h8300/constraints.md (R constraint): Add argument to call
	to h8300_shift_needs_scratch_p.
	(S and T constraints): Similarly.
	* config/h8300/h8300-protos.h: Update h8300_shift_needs_scratch_p
	prototype.
	* config/h8300/h8300.c (expand_a_shift): Emit a different pattern
	if the shift does not require a scratch register.
	(h8300_shift_needs_scratch_p): Refine to be more accurate.
	* config/h8300/shiftrotate.md (shiftqi_noscratch): New pattern.
	(shifthi_noscratch, shiftsi_noscratch): Similarly.
-
GCC Administrator authored
-
Nov 18, 2020
Roger Sayle authored
The motivation for this patch is PR middle-end/85811, a wrong-code regression entitled "Invalid optimization with fmax, fabs and nan".  The optimization involves assuming max(x,y) is non-negative if (say) y is non-negative, i.e. max(x,2.0).  Unfortunately, this is an invalid assumption in the presence of NaNs.  Hence max(x,+qNaN), with IEEE fmax semantics, will always return x even though the qNaN is non-negative.  Worse, max(x,2.0) may return a negative value if x is -sNaN.

I'll quote Joseph Myers (many thanks) who describes things clearly as:

> (a) When both arguments are NaNs, the return value should be a qNaN,
> but sometimes it is an sNaN if at least one argument is an sNaN.
> (b) Under TS 18661-1 semantics, if either argument is an sNaN then the
> result should be a qNaN (whereas if one argument is a qNaN and the
> other is not a NaN, the result should be the non-NaN argument).
> Various implementations treat sNaNs like qNaNs here.

Under this logic, the tree_expr_nonnegative_p for IEEE fmax should be:

    CASE_CFN_FMAX:
    CASE_CFN_FMAX_FN:
      /* Usually RECURSE (arg0) || RECURSE (arg1) but NaNs complicate
	 things.  In the presence of sNaNs, we're only guaranteed to be
	 non-negative if both operands are non-negative.  In the presence
	 of qNaNs, we're non-negative if either operand is non-negative
	 and can't be a qNaN, or if both operands are non-negative.  */
      if (tree_expr_maybe_signaling_nan_p (arg0)
	  || tree_expr_maybe_signaling_nan_p (arg1))
	return RECURSE (arg0) && RECURSE (arg1);
      return RECURSE (arg0) ? (!tree_expr_maybe_nan_p (arg0)
			       || RECURSE (arg1))
			    : (RECURSE (arg1)
			       && !tree_expr_maybe_nan_p (arg1));

Which indeed resolves the wrong code in the PR.

The infrastructure that makes this possible are the two new functions tree_expr_maybe_nan_p and tree_expr_maybe_signaling_nan_p which test whether a value may potentially be a NaN or a signaling NaN respectively.  In fact, this patch adds seven new predicates to the middle-end:

    bool tree_expr_finite_p (const_tree);
    bool tree_expr_infinite_p (const_tree);
    bool tree_expr_maybe_infinite_p (const_tree);
    bool tree_expr_signaling_nan_p (const_tree);
    bool tree_expr_maybe_signaling_nan_p (const_tree);
    bool tree_expr_nan_p (const_tree);
    bool tree_expr_maybe_nan_p (const_tree);

These functions correspond to the "must" and "may" operators in modal logic, and allow us to triage expressions in the middle-end; definitely a NaN, definitely not a NaN, and unknown at compile-time, etc.  A prime example of the utility of these functions is that an IEEE floating point value promoted from an integer type can't be a NaN or infinite.  Hence (double)i+0.0 where i is an integer can be simplified to (double)i even with -fsignaling-nans.

Currently in GCC optimizations are enabled/disabled based on whether the expression's type supports NaNs or sNaNs; with these new predicates they can be controlled by whether the actual operands may or may not be NaNs.  Having added these extremely useful helper functions to the middle-end, I couldn't help but use them in a few places in fold-const.c, builtins.c and match.pd.  In the near term, these can/should be used in places where the tree optimizers test for HONOR_NANS, HONOR_INFINITIES or HONOR_SNANS, or explicitly test whether a REAL_CST is a NaN or Inf.  In the longer term (I'm not volunteering) these predicates could perhaps be hooked into the middle-end's SSA chaining and/or VRP machinery, allowing finiteness to be propagated around the CFG, much like we currently propagate value ranges.

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check".  Ok for mainline?

2020-08-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR middle-end/85811
	* fold-const.c (tree_expr_finite_p): New function to test whether
	a tree expression must be finite, i.e. not a FP NaN or infinity.
	(tree_expr_infinite_p): New function to test whether a tree
	expression must be infinite, i.e. a FP infinity.
	(tree_expr_maybe_infinite_p): New function to test whether a tree
	expression may be infinite, i.e. a FP infinity.
	(tree_expr_signaling_nan_p): New function to test whether a tree
	expression must evaluate to a signaling NaN (sNaN).
	(tree_expr_maybe_signaling_nan_p): New function to test whether a
	tree expression may be a signaling NaN (sNaN).
	(tree_expr_nan_p): New function to test whether a tree expression
	must evaluate to a (quiet or signaling) NaN.
	(tree_expr_maybe_nan_p): New function to test whether a tree
	expression may be a (quiet or signaling) NaN.
	(tree_binary_nonnegative_warnv_p) [MAX_EXPR]: In the presence of
	NaNs, MAX_EXPR is only guaranteed to be non-negative, if both
	operands are non-negative.
	(tree_call_nonnegative_warnv_p) [CASE_CFN_FMAX,CASE_CFN_FMAX_FN]:
	In the presence of signaling NaNs, fmax is only guaranteed to be
	non-negative if both operands are non-negative.  In the presence
	of quiet NaNs, fmax is non-negative if either operand is
	non-negative and not a qNaN, or both operands are non-negative.
	* fold-const.h (tree_expr_finite_p, tree_expr_infinite_p,
	tree_expr_maybe_infinite_p, tree_expr_signaling_nan_p,
	tree_expr_maybe_signaling_nan_p, tree_expr_nan_p,
	tree_expr_maybe_nan_p): Prototype new functions here.
	* builtins.c (fold_builtin_classify) [BUILT_IN_ISINF]: Fold to a
	constant if argument is known to be (or not to be) an Infinity.
	[BUILT_IN_ISFINITE]: Fold to a constant if argument is known to
	be (or not to be) finite.
	[BUILT_IN_ISNAN]: Fold to a constant if argument is known to be
	(or not to be) a NaN.
	(fold_builtin_fpclassify): Check tree_expr_maybe_infinite_p and
	tree_expr_maybe_nan_p instead of HONOR_INFINITIES and HONOR_NANS
	respectively.
	(fold_builtin_unordered_cmp): Fold UNORDERED_EXPR to a constant
	when its arguments are known to be (or not be) NaNs.  Check
	tree_expr_maybe_nan_p instead of HONOR_NANS when choosing between
	unordered and regular forms of comparison operators.
	* match.pd (ordered(x,y)->true/false): Constant fold ORDERED_EXPR
	if its operands are known to be (or not to be) NaNs.
	(unordered(x,y)->true/false): Constant fold UNORDERED_EXPR if its
	operands are known to be (or not to be) NaNs.
	(sqrt(x)*sqrt(x)->x): Check tree_expr_maybe_signaling_nan_p
	instead of HONOR_SNANS.

gcc/testsuite/ChangeLog
	PR middle-end/85811
	* gcc.dg/pr85811.c: New test.
	* gcc.dg/fold-isfinite-1.c: New test.
	* gcc.dg/fold-isfinite-2.c: New test.
	* gcc.dg/fold-isinf-1.c: New test.
	* gcc.dg/fold-isinf-2.c: New test.
	* gcc.dg/fold-isnan-1.c: New test.
	* gcc.dg/fold-isnan-2.c: New test.
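A minimal reproducer in the spirit of the PR (an assumed example, not the committed pr85811.c test):

    #include <cmath>
    #include <cstdio>

    int
    main ()
    {
      // IEEE fmax returns the non-NaN operand, so fmax(x, +qNaN) is x,
      // which may be negative: assuming the maximum is non-negative
      // because one operand (the qNaN) is non-negative is wrong.
      double r = std::fmax (-5.0, std::nan (""));
      std::printf ("%f\n", r);   // prints -5.000000, not nan
      return 0;
    }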
-
Jerry Clcanny authored
* lto-symtab.c (lto_symtab_merge_symbols): Fix typos in comment.
-
Jakub Jelinek authored
As mentioned in the PR, in (x % y) >= 0 && y >= 0, we can't deduce x's range to be x >= 0, as e.g. -7 % 7 is 0.  But we can deduce it from (x % y) > 0.  The patch also fixes up the comments.

2020-11-18  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/91029
	PR tree-optimization/97888
	* range-op.cc (operator_trunc_mod::op1_range): Only set op1
	range to >= 0 if lhs is > 0, rather than >= 0.  Fix up comments.

	* gcc.dg/pr91029.c: Add comment with PR number.
	(f2): Use > 0 rather than >= 0.
	* gcc.c-torture/execute/pr97888-1.c: New test.
	* gcc.c-torture/execute/pr97888-2.c: New test.
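The counterexample, spelled out as a runnable sketch (illustrative, not one of the committed pr97888 tests):

    int
    main ()
    {
      int x = -7, y = 7;
      // Truncating division makes -7 % 7 == 0, so (x % y) >= 0 holds
      // while x < 0: a non-negative remainder must not be used to
      // deduce x >= 0; only (x % y) > 0 implies that.
      return !(x % y == 0 && x < 0);   // exits 0: deduction disproved
    }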
-
Jakub Jelinek authored
Any time somebody adds or removes an option in some *.opt file (which e.g. on the 10 branch after branching off 11 happened 7 times already), many offsets in the global_options variable change and so plugins that ever access GCC options or other global_options values are ABI dependent on it.  It is true we don't guarantee ABI stability for plugins, but we change the most often used data structures on the release branches only very rarely and so the options changes are the most problematic for ABI stability of plugins.

Annobin uses a way to remap accesses to some of the global_options.x_* by looking them up in the cl_options array where we have offsetof (struct gcc_options, x_flag_lto) etc. remembered, but sadly doesn't do it for all options (e.g. some flag_* etc. option accesses may be hidden in various macros like POINTER_SIZE), and more importantly some struct gcc_options offsets are not covered at all.  E.g. there is no offsetof (struct gcc_options, x_optimize), offsetof (struct gcc_options, x_flag_sanitize) etc.  Those are usually:

    Variable
    int optimize

in the *.opt files.  The following patch allows the plugins to deal with reshuffling of even the global_options fields that aren't tracked in cl_options by adding another array that describes those, which adds an 816 bytes long array and 1039 bytes in string literals, so 1855 .rodata bytes in total ATM.  And adds it only if --enable-plugin (the default); with --disable-plugin it will not be compiled in.

2020-11-18  Jakub Jelinek  <jakub@redhat.com>

	* opts.h (struct cl_var): New type.
	(cl_vars): Declare.
	* optc-gen.awk: Generate cl_vars array.
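A sketch of what such a descriptor record could look like; the exact field names and terminator convention in opts.h are assumptions here.

    // One record per global_options field not tracked in cl_options:
    // plugins can look the offset up by name at run time instead of
    // baking offsets in at compile time.
    struct cl_var
    {
      const char *var_name;        // e.g. "optimize", "flag_sanitize"
      unsigned short var_offset;   // offsetof (struct gcc_options, x_...)
    };

    // Generated into options.c by optc-gen.awk (assumed declaration).
    extern const struct cl_var cl_vars[];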
-
David Malcolm authored
CWE-690 is only for dereferencing an unchecked return value; for other kinds of NULL dereference, use the parent classification, CWE-476.

gcc/analyzer/ChangeLog:

	PR analyzer/97893
	* sm-malloc.cc (null_deref::emit): Use CWE-476 rather than
	CWE-690, as this isn't due to an unchecked return value.
	(null_arg::emit): Likewise.

gcc/testsuite/ChangeLog:

	PR analyzer/97893
	* gcc.dg/analyzer/malloc-1.c: Add CWE-690 and CWE-476 codes to
	expected output.
-
Iain Sandoe authored
Empty prefix attributes like:

    __attribute__ (())
    @interface MyClass
    @end

cause an ICE at present; check for that case and skip them.

gcc/cp/ChangeLog:

	* parser.c (cp_parser_objc_valid_prefix_attributes): Check for
	empty attributes.
-
Eugene Rozenfeld authored
gcc/
	PR tree-optimization/96671
	* match.pd (three xor patterns): New patterns.
-
Kwok Cheung Yeung authored
This removes the nest-var ICV, expressing nesting in terms of the max-active-levels-var ICV instead.  The max-active-levels-var ICV is now per data environment rather than per device.

2020-11-18  Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
	* env.c (gomp_global_icv): Remove nest_var field.  Add
	max_active_levels_var field.
	(gomp_max_active_levels_var): Remove.
	(parse_boolean): Return true on success.
	(handle_omp_display_env): Express OMP_NESTED in terms of
	max_active_levels_var.  Change format specifier for
	max_active_levels_var.
	(initialize_env): Set max_active_levels_var from
	OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
	OMP_PROC_BIND.
	* icv.c (omp_set_nested): Express in terms of
	max_active_levels_var.
	(omp_get_nested): Likewise.
	(omp_set_max_active_levels): Use max_active_levels_var field
	instead of gomp_max_active_levels_var.
	(omp_get_max_active_levels): Likewise.
	* libgomp.h (struct gomp_task_icv): Remove nest_var field.  Add
	max_active_levels_var field.
	(gomp_supported_active_levels): Set to UCHAR_MAX.
	(gomp_max_active_levels_var): Delete.
	* libgomp.texi (omp_get_nested): Update documentation.
	(omp_set_nested): Likewise.
	(OMP_MAX_ACTIVE_LEVELS): Likewise.
	(OMP_NESTED): Likewise.
	(OMP_NUM_THREADS): Likewise.
	(OMP_PROC_BIND): Likewise.
	* parallel.c (gomp_resolve_num_threads): Replace reference to
	nest_var with max_active_levels_var.  Use max_active_levels_var
	field instead of gomp_max_active_levels_var.
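A sketch of the user-visible equivalence this implies (assumed illustration; compile with -fopenmp):

    #include <omp.h>
    #include <cstdio>

    int
    main ()
    {
      // With nest-var gone, "nested" is just max-active-levels > 1.
      omp_set_max_active_levels (2);
      std::printf ("nested: %d\n", omp_get_nested ());  // expected: 1

      omp_set_max_active_levels (1);
      std::printf ("nested: %d\n", omp_get_nested ());  // expected: 0
      return 0;
    }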
-