Commits · 9522fc8bb7812f2ad50eb038e0938bfd958e730f · COBOLworx / gcc-cobol

Aug 27, 2024

MIPS: Include missing mips16.S in libgcc/lib1funcs.S · 9522fc8b

YunQiang Su authored 7 months ago

mips16.S was missing since
commit 29b74545
Date:   Thu Jun 1 10:14:24 2023 +0800

    MIPS: Add speculation_barrier support

Without mips16.S included, some symbols are missing for mips16, and
so some software fails to build.

libgcc/ChangeLog:

	* config/mips/lib1funcs.S: Include mips16.S.

9522fc8b

Aug 26, 2024

combine.cc (make_more_copies): Copy attributes from the original pseudo, PR115883 · 5031df5d

Hans-Peter Nilsson authored 8 months ago

The first of the late-combine passes propagates some of the copies
made during the (in-time-)combine pass in make_more_copies into the
users of the "original" pseudo registers and removes the "old"
pseudos.  That effectively removes attributes such as REG_POINTER,
which matter to LRA.  The quoted PR is for an ICE-manifesting bug that
was exposed by the late-combine pass and went back to hiding with this
patch until commit r15-2937-g3673b7054ec2, the fix for PR116236, when
it was actually fixed.  To wit, this patch is only incidentally
related to that bug.

In other words, the REG_POINTER attribute should not be required for
LRA to work correctly.  This patch merely restores the state of those
propagated register uses to what it was before late-combine.

For reasons not investigated, this fixes a failing test
"FAIL: gcc.dg/guality/pr54200.c -Og -DPREVENT_OPTIMIZATION line 20 z == 3"
for x86_64-linux-gnu.

	PR middle-end/115883
	* combine.cc (make_more_copies): Copy attributes from the original
	pseudo to the new copy.

5031df5d

c++/coros: do not assume coros don't nest [PR113457] · 5cca7517

Arsen Arsenović authored 7 months ago

In the testcase presented in the PR, during template expansion, a
tsubst of an operand causes a lambda coroutine to be processed, causing
it to get an initial suspend and final suspend.  The code for assigning
awaitable var names (get_awaitable_var) assumed that the sequence Is ->
Is -> Fs -> Fs is impossible (i.e. that one could only 'open' one
coroutine at a time before closing it), and reset the counter used for
unique numbering each time a final suspend occurred.  This assumption is
false in a few cases, usually when lambdas are involved.

Instead of storing this counter in a static-storage variable, we can
store it in coroutine_info.  This struct is local to each function, so
we don't need to worry about "cross-contamination" nor resetting.
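
The difference between the two schemes can be illustrated with a toy model.
This is only a sketch under stated assumptions, not GCC's code: coro_sketch,
shared_counter, the "Aw" name prefix and every function below are
hypothetical stand-ins for the static counter and the per-function
coroutine_info field described above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model only; none of these names are GCC's.
struct coro_sketch {
  int awaitable_number = 0;     // per-coroutine counter (the patched scheme)
  std::set<std::string> names;  // awaitable names already used in this coroutine
  bool saw_duplicate = false;
};

static int shared_counter = 0;  // models the old static-storage counter

static void record(coro_sketch &c, int n) {
  if (!c.names.insert("Aw" + std::to_string(n)).second)
    c.saw_duplicate = true;
}

// Old scheme: all coroutines draw from one counter, reset at each final suspend.
static void await_shared(coro_sketch &c) { record(c, shared_counter++); }
static void final_suspend_shared() { shared_counter = 0; }

// Patched scheme: each coroutine numbers its own awaitables; nothing to reset.
static void await_local(coro_sketch &c) { record(c, c.awaitable_number++); }

// Replays a nested sequence: the inner coroutine opens and closes while the
// outer one is still being processed.  Returns true if any name repeated.
bool nested_sequence_collides(bool use_shared_counter) {
  shared_counter = 0;
  coro_sketch outer, inner;
  if (use_shared_counter) {
    await_shared(outer);     // outer gets Aw0
    await_shared(inner);     // inner gets Aw1
    final_suspend_shared();  // inner's final suspend resets the counter
    await_shared(outer);     // outer gets Aw0 again
  } else {
    await_local(outer);      // outer gets Aw0
    await_local(inner);      // inner gets its own Aw0
    await_local(outer);      // outer gets Aw1
  }
  return outer.saw_duplicate || inner.saw_duplicate;
}

void demo_nested_coroutines() {
  assert(nested_sequence_collides(true));
  assert(!nested_sequence_collides(false));
}
```

In this model the shared counter hands the outer coroutine the same name
twice once the inner one has closed, while the per-coroutine counter never
repeats a name within one coroutine.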

	PR c++/113457

gcc/cp/ChangeLog:

	* coroutines.cc (struct coroutine_info): Add integer field
	awaitable_number.  This is a counter used for assigning unique
	names to awaitable temporaries.
	(get_awaitable_var): Use awaitable_number from coroutine_info
	instead of the static int awn.

gcc/testsuite/ChangeLog:

	* g++.dg/coroutines/pr113457-1.C: New test.
	* g++.dg/coroutines/pr113457.C: New test.

5cca7517

coroutines: diagnose usage of alloca in coroutines · c73d7f3c

Arsen Arsenović authored 7 months ago

We do not support it currently, and the resulting memory can only be
used inside a single resumption, so best not confuse the user with it.

	PR c++/115858 - Incompatibility of coroutines and alloca()

gcc/ChangeLog:

	* coroutine-passes.cc (execute_early_expand_coro_ifns): Emit a
	sorry if a statement is an alloca call.

gcc/testsuite/ChangeLog:

	* g++.dg/coroutines/pr115858.C: New test.

c73d7f3c

diagnostics: move output formats from diagnostic.{c,h} to their own files · 92c5265d

David Malcolm authored 7 months ago


In particular, move the classic text output code to
diagnostic-format-text.cc (analogous to diagnostic-format-json.cc and
diagnostic-format-sarif.cc).

No functional change intended.

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add diagnostic-format-text.o.
	* diagnostic-format-json.cc: Include "diagnostic-format.h".
	* diagnostic-format-sarif.cc: Likewise.
	* diagnostic-format-text.cc: New file, using material from
	diagnostic.cc.
	* diagnostic-global-context.cc: Include
	"diagnostic-format.h".
	* diagnostic-format-text.h: New file, using material from
	diagnostic.h.
	* diagnostic-format.h: New file, using material from
	diagnostic.h.
	* diagnostic.cc: Include "diagnostic-format.h" and
	"diagnostic-format-text.h".
	(diagnostic_text_output_format::~diagnostic_text_output_format):
	Move to diagnostic-format-text.cc.
	(diagnostic_text_output_format::on_report_diagnostic): Likewise.
	(diagnostic_text_output_format::on_diagram): Likewise.
	(diagnostic_text_output_format::print_any_cwe): Likewise.
	(diagnostic_text_output_format::print_any_rules): Likewise.
	(diagnostic_text_output_format::print_option_information):
	Likewise.
	* diagnostic.h (class diagnostic_output_format): Move to
	diagnostic-format.h.
	(class diagnostic_text_output_format): Move to
	diagnostic-format-text.h.
	(diagnostic_output_format_init): Move to
	diagnostic-format.h.
	(diagnostic_output_format_init_json_stderr): Likewise.
	(diagnostic_output_format_init_json_file): Likewise.
	(diagnostic_output_format_init_sarif_stderr): Likewise.
	(diagnostic_output_format_init_sarif_file): Likewise.
	(diagnostic_output_format_init_sarif_stream): Likewise.
	* gcc.cc: Include "diagnostic-format.h".
	* opts.cc: Include "diagnostic-format.h".

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic_group_plugin.c: Include
	"diagnostic-format-text.h".

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

92c5265d

diagnostics: consolidate on_{begin,end}_diagnostic into on_report_diagnostic · ac707d30

David Malcolm authored 7 months ago


Previously diagnostic_context::report_diagnostic had, after the call to
pp_format (phases 1 and 2 of formatting the message):

  m_output_format->on_begin_diagnostic (*diagnostic);
  pp_output_formatted_text (this->printer, m_urlifier);
  if (m_show_cwe)
    print_any_cwe (*diagnostic);
  if (m_show_rules)
    print_any_rules (*diagnostic);
  if (m_show_option_requested)
    print_option_information (*diagnostic, orig_diag_kind);
  m_output_format->on_end_diagnostic (*diagnostic, orig_diag_kind);

This patch replaces all of the above with a single call to

  m_output_format->on_report_diagnostic (*diagnostic, orig_diag_kind);

moving responsibility for phase 3 of formatting and printing the result
from diagnostic_context to the output format.

This simplifies diagnostic_context::report_diagnostic and allows us to
move the code that prints CWEs, rules, and option information in textual
form from diagnostic_context to diagnostic_text_output_format, where it
belongs.

No functional change intended.
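
The shape of the consolidated hook can be sketched as follows.  This is a
heavily simplified illustration, assuming made-up types: diagnostic_sketch,
output_format_sketch and the two derived classes are hypothetical and do
not mirror the real class layouts or signatures in diagnostic.h.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical, heavily simplified types; not GCC's classes.
struct diagnostic_sketch {
  std::string message;
  int cwe;  // 0 means "no CWE attached"
};

class output_format_sketch {
public:
  virtual ~output_format_sketch() = default;
  // Single hook: the format owns the final formatting and printing step.
  virtual void on_report_diagnostic(const diagnostic_sketch &d) = 0;
  std::string str() const { return m_out.str(); }
protected:
  std::ostringstream m_out;
};

class text_format_sketch : public output_format_sketch {
public:
  void on_report_diagnostic(const diagnostic_sketch &d) override {
    m_out << d.message;
    print_any_cwe(d);  // text-only decoration lives in the text format
    m_out << '\n';
  }
private:
  void print_any_cwe(const diagnostic_sketch &d) {
    if (d.cwe != 0)
      m_out << " [CWE-" << d.cwe << "]";
  }
};

class json_format_sketch : public output_format_sketch {
public:
  void on_report_diagnostic(const diagnostic_sketch &d) override {
    m_out << "{\"message\": \"" << d.message << "\", \"cwe\": " << d.cwe << "}";
  }
};

// The reporting path shrinks to one virtual call.
void report_diagnostic_sketch(output_format_sketch &fmt,
                              const diagnostic_sketch &d) {
  fmt.on_report_diagnostic(d);
}

void demo_report() {
  diagnostic_sketch d = {"use after free", 416};
  text_format_sketch text;
  report_diagnostic_sketch(text, d);
  assert(text.str() == "use after free [CWE-416]\n");
}
```

With one hook per format, the caller no longer needs flags to switch the
textual decorations off for non-text formats; each format simply does not
emit them.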

gcc/ChangeLog:
	* diagnostic-format-json.cc
	(json_output_format::on_begin_diagnostic): Delete.
	(json_output_format::on_end_diagnostic): Rename to...
	(json_output_format::on_report_diagnostic): ...this and add call
	to pp_output_formatted_text.
	(diagnostic_output_format_init_json): Drop unnecessary calls
	to disable textual printing of CWEs, rules, and options.
	* diagnostic-format-sarif.cc (sarif_builder::end_diagnostic):
	Rename to...
	(sarif_builder::on_report_diagnostic): ...this and add call to
	pp_output_formatted_text.
	(sarif_output_format::on_begin_diagnostic): Delete.
	(sarif_output_format::on_end_diagnostic): Rename to...
	(sarif_output_format::on_report_diagnostic): ...this and update
	call to m_builder accordingly.
	(diagnostic_output_format_init_sarif): Drop unnecessary calls
	to disable textual printing of CWEs, rules, and options.
	* diagnostic.cc (diagnostic_context::print_any_cwe): Convert to...
	(diagnostic_text_output_format::print_any_cwe): ...this.
	(diagnostic_context::print_any_rules): Convert to...
	(diagnostic_text_output_format::print_any_rules): ...this.
	(diagnostic_context::print_option_information): Convert to...
	(diagnostic_text_output_format::print_option_information):
	...this.
	(diagnostic_context::report_diagnostic): Replace calls to the
	output format's on_begin_diagnostic, to pp_output_formatted_text,
	printing CWE, rules, option info, and the call to the format's
	on_end_diagnostic with a call to the format's
	on_report_diagnostic.
	(diagnostic_text_output_format::on_begin_diagnostic): Delete.
	(diagnostic_text_output_format::on_end_diagnostic): Delete.
	(diagnostic_text_output_format::on_report_diagnostic): New vfunc,
	which effectively does the on_begin_diagnostic, the call to
	pp_output_formatted_text, the calls for printing CWE, rules,
	option info, and the call to the diagnostic_finalizer.
	* diagnostic.h (diagnostic_output_format::on_begin_diagnostic):
	Delete.
	(diagnostic_output_format::on_end_diagnostic): Delete.
	(diagnostic_output_format::on_report_diagnostic): New.
	(diagnostic_text_output_format::on_begin_diagnostic): Delete.
	(diagnostic_text_output_format::on_end_diagnostic): Delete.
	(diagnostic_text_output_format::on_report_diagnostic): New.
	(class diagnostic_context): Add friend class
	diagnostic_text_output_format.
	(diagnostic_context::get_urlifier): New accessor.
	(diagnostic_context::print_any_cwe): Move decl...
	(diagnostic_text_output_format::print_any_cwe): ...to here.
	(diagnostic_context::print_any_rules): Move decl...
	(diagnostic_text_output_format::print_any_rules): ...to here.
	(diagnostic_context::print_option_information): Move decl...
	(diagnostic_text_output_format::print_option_information): ...to
	here.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

ac707d30

testsuite: add event IDs to multithreaded event plugin test · 6a1c359e

David Malcolm authored 7 months ago


Add test coverage of "%@" in event messages in a multithreaded
execution path.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-paths-multithreaded-inline-events.c:
	Update expected output.
	* gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py:
	Likewise.
	* gcc.dg/plugin/diagnostic-test-paths-multithreaded-separate-events.c:
	Likewise.
	* gcc.dg/plugin/diagnostic_plugin_test_paths.c
	(test_diagnostic_path::add_event_2): Return the id of the added
	event.
	(test_diagnostic_path::add_event_2_with_event_id): New.
	(example_4): Add event IDs to the deadlock messages indicating
	where the locks were acquired.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

6a1c359e

testsuite: generalize support for Python tests for SARIF output · aa3b9502

David Malcolm authored 7 months ago


In r15-2354-g4d1f71d49e396c I added the ability to use Python to write
tests of SARIF output via a new "run-sarif-pytest" based
on "run-gcov-pytest", with a sarif.py support script in
testsuite/gcc.dg/sarif-output.

This followup patch:
(a) removes the limitation of such tests needing to be in
testsuite/gcc.dg/sarif-output by moving sarif.py to testsuite/lib
and adding logic to add that directory to PYTHONPATH when invoking
pytest.

(b) uses this to replace fragile regexp-based tests in
gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c with
Python logic that verifies the structure within the generated JSON,
and to add test coverage for SARIF output relating to GCC plugins.

gcc/ChangeLog:
	* diagnostic-format-sarif.cc: Add comments noting that we don't
	yet capture any diagnostic_metadata::rules associated with a
	diagnostic.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-metadata-sarif.c: New test,
	based on diagnostic-test-metadata.c.
	* gcc.dg/plugin/diagnostic-test-metadata-sarif.py: New script.
	* gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c:
	Replace scan-sarif-file directives with run-sarif-pytest, to
	run...
	* gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py:
	...this new test.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	diagnostic-test-metadata-sarif.c.
	* gcc.dg/sarif-output/sarif.py: Move to...
	* lib/sarif.py: ...here.
	* lib/scansarif.exp (run-sarif-pytest): Prepend "lib" to
	PYTHONPATH before running python scripts.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

aa3b9502

pretty-print: fixes to selftests · 276cc432

David Malcolm authored 7 months ago


Add selftest coverage for %{ and %} in pretty-print.cc

No functional change intended.

gcc/ChangeLog:
	* pretty-print.cc (selftest::test_urls): Make static.
	(selftest::test_urls_from_braces): New.
	(selftest::test_null_urls): Make static.
	(selftest::test_urlification): Likewise.
	(selftest::pretty_print_cc_tests): Call test_urls_from_braces.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

276cc432

json.h: fix typo in comment · b8357103

David Malcolm authored 7 months ago


gcc/ChangeLog:
	* json.h: Fix typo in comment about missing INCLUDE_MEMORY.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

b8357103

c++: Check template parameters in member class template specialization [PR115716] · 26ee9544

Simon Martin authored 7 months ago

We currently ICE upon the following invalid code, because we don't check
that the template parameters in a member class template specialization
are correct.

=== cut here ===
template <typename T> struct x {
  template <typename U> struct y {
    typedef T result2;
  };
};
template<> template<typename U, typename> struct x<int>::y {
  typedef double result2;
};
int main() {
  x<int>::y<int>::result2 xxx2;
}
=== cut here ===

This patch fixes the PR by calling redeclare_class_template.

	PR c++/115716

gcc/cp/ChangeLog:

	* pt.cc (maybe_process_partial_specialization): Call
	redeclare_class_template.

gcc/testsuite/ChangeLog:

	* g++.dg/template/spec42.C: New test.
	* g++.dg/template/spec43.C: New test.

26ee9544

Remove an unneeded include that was added by mistake. · cc372be5

Andi Kleen authored 7 months ago

gcc/ChangeLog:

	* tree-if-conv.cc: Remove unneeded include from last change.

cc372be5

Fix bootstrap errors due to enabling -gvariable-location-views · eb63f958

Bernd Edlinger authored 7 months ago

This recent change triggered various bootstrap errors, mostly on
x86 targets, because line info advance address entries were output
in the wrong section table.
The switch to the wrong line table happened in dwarf2out_set_ignored_loc.
It must use the same section as the earlier-called
dwarf2out_switch_text_section.

ft32-elf was also affected, because the assembler choked on
something as simple as ".2byte .LM2-.LM1".  Fortunately it is
able to use native location views; the configure test was just
not executed because the ft32 "nop" instruction was missing.

gcc/ChangeLog:

	PR debug/116470
	* configure.ac: Add the "nop" instruction for cpu type ft32.
	* configure: Regenerate.
	* dwarf2out.cc (dwarf2out_set_ignored_loc): Use the correct
	line info section.

eb63f958

libcpp: deduplicate definition of padding size · a8260ebe

Alexander Monakov authored 7 months ago

Tie together the two functions that ensure tail padding with
search_line_ssse3 via CPP_BUFFER_PADDING macro.

libcpp/ChangeLog:

	* internal.h (CPP_BUFFER_PADDING): New macro; use it ...
	* charset.cc (_cpp_convert_input): ...here, and ...
	* files.cc (read_file_guts): ...here, and ...
	* lex.cc (search_line_ssse3): ...here.

a8260ebe

tree-optimization/116460 - improve forwprop compile-time · 0ceeb992

Richard Biener authored 7 months ago

The following improves forwprop block reachability, a shortcoming I
noticed when debugging PR116460 and which is also noted in the comment.
It avoids processing blocks in natural loops determined unreachable,
thereby making the issue in PR116460 latent.

	PR tree-optimization/116460
	* tree-ssa-forwprop.cc (pass_forwprop::execute): Do not
	process blocks in unreachable natural loops.

0ceeb992

Delay edge removal in forwprop · 03b802e1

Richard Biener authored 7 months ago

SSA forwprop has switch simplification code that removes edges
and, as a side effect, releases dominator info.  For a followup we want
to retain that info, so the following delays removing edges until the
end of the pass.  As usual we have to deal with parts of the edge
vanishing due to EH/abnormal pruning, so record edges as basic-block
index pairs and remove them only when they are still there.
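
The bookkeeping described above can be sketched with a toy CFG.  This is an
illustration only: cfg_sketch and remove_queued_edges are hypothetical
names, and the real pass works on GCC's basic_block and edge structures
rather than on plain index pairs in a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy CFG: an edge is a (source block index, destination block index) pair.
// All names here are hypothetical; this is not GCC's CFG API.
struct cfg_sketch {
  std::vector<std::pair<int, int>> edges;

  bool has_edge(int src, int dst) const {
    return std::find(edges.begin(), edges.end(),
                     std::make_pair(src, dst)) != edges.end();
  }
  void remove_edge(int src, int dst) {
    edges.erase(std::remove(edges.begin(), edges.end(),
                            std::make_pair(src, dst)),
                edges.end());
  }
};

// Run once at the end of the pass: remove each queued edge only if it is
// still present, since other cleanups may already have deleted it.
// Returns the number of edges actually removed.
int remove_queued_edges(cfg_sketch &cfg,
                        const std::vector<std::pair<int, int>> &edges_to_remove) {
  int removed = 0;
  for (const auto &e : edges_to_remove)
    if (cfg.has_edge(e.first, e.second)) {
      cfg.remove_edge(e.first, e.second);
      ++removed;
    }
  return removed;
}

void demo_delayed_removal() {
  cfg_sketch cfg;
  cfg.edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}};
  std::vector<std::pair<int, int>> queued = {{0, 2}, {1, 3}};
  cfg.remove_edge(1, 3);  // models an edge vanishing before the pass ends
  assert(remove_queued_edges(cfg, queued) == 1);
  assert(!cfg.has_edge(0, 2));
}
```

Storing index pairs instead of edge pointers means a queued entry cannot
dangle if the edge it names has been deleted in the meantime.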

	* tree-ssa-forwprop.cc (simplify_gimple_switch_label_vec):
	Delay removing edges and releasing dominator info, instead
	record into edges_to_remove vector.
	(simplify_gimple_switch): Pass through vector of edges to
	remove.
	(pass_forwprop::execute): Likewise.  Remove queued edges.

03b802e1

vect: Fix STMT_VINFO_DEF_TYPE check for odd/even widen mult [PR116348] · d3e71b99

Xi Ruoyao authored 7 months ago


After fixing PR116142 some code started to trigger an ICE with -O3
-march=znver4.  Per Richard Biener, who actually made this fix:

"supportable_widening_operation fails at transform time - that's likely
because vectorizable_reduction "puns" defs to internal_def"

so the check should use STMT_VINFO_REDUC_DEF instead of checking if
STMT_VINFO_DEF_TYPE is vect_reduction_def.

gcc/ChangeLog:

	PR tree-optimization/116348
	* tree-vect-stmts.cc (supportable_widening_operation): Use
	STMT_VINFO_REDUC_DEF (x) instead of
	STMT_VINFO_DEF_TYPE (x) == vect_reduction_def.

gcc/testsuite/ChangeLog:

	PR tree-optimization/116348
	* gcc.c-torture/compile/pr116438.c: New test.

Co-authored-by: Richard Biener <rguenther@suse.de>

d3e71b99

Match: Add int type fits check for .SAT_ADD imm operand · 3b78aa3e

Pan Li authored 7 months ago


This patch adds a strict check for the imm operand of .SAT_ADD
matching.  Previously there was no type check for the imm operand, which
could result in unexpected IL being caught by the .SAT_ADD pattern.

We leverage int_fits_type_p here to make sure the imm operand is
an integer that fits the result type of the .SAT_ADD.  For example:

Fits uint8_t:
uint8_t a;
uint8_t sum = .SAT_ADD (a, 12);
uint8_t sum = .SAT_ADD (a, 12u);
uint8_t sum = .SAT_ADD (a, 126u);
uint8_t sum = .SAT_ADD (a, 128u);
uint8_t sum = .SAT_ADD (a, 228);
uint8_t sum = .SAT_ADD (a, 223u);

Does not fit uint8_t:
uint8_t a;
uint8_t sum = .SAT_ADD (a, -1);
uint8_t sum = .SAT_ADD (a, 256u);
uint8_t sum = .SAT_ADD (a, 257);
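
The range test behind these two lists can be approximated in plain C++.
This is a sketch, not the match.pd code: imm_fits_type_sketch is a
hypothetical helper that only mimics the idea of int_fits_type_p for small
integer types, and sat_add_u8 is a reference saturating add written for
illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical analogue of the check: does the immediate lie within the
// value range of the result type T?  (Not GCC's int_fits_type_p.)
template <typename T>
constexpr bool imm_fits_type_sketch(long long imm) {
  return imm >= static_cast<long long>(std::numeric_limits<T>::min())
      && imm <= static_cast<long long>(std::numeric_limits<T>::max());
}

// Reference unsigned saturating add for uint8_t: a wrapped sum smaller
// than an operand means the true sum overflowed, so clamp to 255.
constexpr std::uint8_t sat_add_u8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(a + b) < a
             ? static_cast<std::uint8_t>(0xFF)
             : static_cast<std::uint8_t>(a + b);
}

void demo_imm_fits() {
  assert(imm_fits_type_sketch<std::uint8_t>(228));   // fits
  assert(!imm_fits_type_sketch<std::uint8_t>(-1));   // does not fit
  assert(!imm_fits_type_sketch<std::uint8_t>(256));  // does not fit
  assert(sat_add_u8(200, 100) == 255);               // saturates
}
```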

The following test suites passed for this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

	* match.pd: Add int_fits_type_p check for .SAT_ADD imm operand.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat_arith.h: Add test helper macros.
	* gcc.target/riscv/sat_u_add_imm-11.c: Adjust test case for imm.
	* gcc.target/riscv/sat_u_add_imm-12.c: Ditto.
	* gcc.target/riscv/sat_u_add_imm-15.c: Ditto.
	* gcc.target/riscv/sat_u_add_imm-16.c: Ditto.
	* gcc.target/riscv/sat_u_add_imm_type_check-1.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-10.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-11.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-12.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-13.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-14.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-15.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-16.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-17.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-18.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-19.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-2.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-20.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-21.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-22.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-23.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-24.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-25.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-26.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-27.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-28.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-29.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-3.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-30.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-31.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-32.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-33.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-34.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-35.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-36.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-37.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-38.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-39.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-4.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-40.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-41.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-42.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-43.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-44.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-45.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-46.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-47.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-48.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-49.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-5.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-50.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-51.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-52.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-6.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-7.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-8.c: New test.
	* gcc.target/riscv/sat_u_add_imm_type_check-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

3b78aa3e

expand: Use the correct mode for store flags for popcount [PR116480] · 53b86cac

Andrew Pinski authored 7 months ago


When expanding popcount compared for equality with 1 (or rather
__builtin_stdc_has_single_bit), the wrong mode was being used for the
store flags.  We were using the mode of the argument to popcount, but
since popcount's return value is always int, the mode of the expansion
here should have been the mode of the return type rather than the
argument.
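
At the source level the pattern looks like the sketch below.  It assumes a
GCC-compatible compiler, since __builtin_popcountll is a GCC/Clang builtin;
the two helper functions are hypothetical names introduced only for this
illustration and say nothing about how expand_POPCOUNT is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The builtin's result type is int, whatever the width of its argument.
static_assert(
    std::is_same<decltype(__builtin_popcountll(0ULL)), int>::value,
    "popcount returns int");

// The pattern in question: a popcount result compared for equality with 1.
inline bool has_single_bit_popcount(std::uint64_t x) {
  return __builtin_popcountll(x) == 1;
}

// Classic bit trick computing the same predicate without a popcount.
inline bool has_single_bit_trick(std::uint64_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

void demo_single_bit() {
  assert(has_single_bit_popcount(1ULL << 63));
  assert(has_single_bit_trick(1ULL << 63));
  assert(!has_single_bit_popcount(3));
  assert(!has_single_bit_trick(0));
}
```

The argument here is 64 bits wide while the comparison result derives from
an int, which is the kind of width mismatch the message describes.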

Built and tested on aarch64-linux-gnu with no regressions.
Also bootstrapped and tested on x86_64-linux-gnu.

	PR middle-end/116480

gcc/ChangeLog:

	* internal-fn.cc (expand_POPCOUNT): Use the correct mode
	for store flags.

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/pr116480-1.c: New test.
	* gcc.dg/torture/pr116480-2.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

53b86cac

i386: Add bf8 -> fp16 intrin · b4ac2c23

Haochen Jiang authored 7 months ago

Since BF8 and FP16 have the same number of exponent bits, the type
conversion between them is just a cast of the fraction part.  We use a
sequence of instructions rather than new instructions to do that.  For
convenience, intrinsics are also provided.
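
On the bit level the idea can be sketched as below.  The sketch assumes
that BF8 is the E5M2 layout (1 sign, 5 exponent, 2 fraction bits) and that
FP16 is IEEE binary16 (1 sign, 5 exponent, 10 fraction bits); under that
assumption the conversion is a left shift by 8.  bf8_to_fp16_bits_sketch is
a hypothetical scalar helper, not one of the new intrinsics, and it says
nothing about the instruction sequence the intrinsics actually emit.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: BF8 is E5M2, so sign and exponent already sit where FP16
// expects them once the byte is moved to the top half.  The two fraction
// bits become the top two of FP16's ten; the low eight are zero.
constexpr std::uint16_t bf8_to_fp16_bits_sketch(std::uint8_t bf8) {
  return static_cast<std::uint16_t>(bf8 << 8);
}

void demo_bf8_to_fp16() {
  assert(bf8_to_fp16_bits_sketch(0x3C) == 0x3C00);  // 1.0 in both encodings
  assert(bf8_to_fp16_bits_sketch(0xC0) == 0xC000);  // -2.0 in both encodings
}
```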

gcc/ChangeLog:

	* config/i386/avx10_2-512convertintrin.h
	(_mm512_cvtpbf8_ph): New.
	(_mm512_mask_cvtpbf8_ph): Ditto.
	(_mm512_maskz_cvtpbf8_ph): Ditto.
	* config/i386/avx10_2convertintrin.h
	(_mm_cvtpbf8_ph): Ditto.
	(_mm_mask_cvtpbf8_ph): Ditto.
	(_mm_maskz_cvtpbf8_ph): Ditto.
	(_mm256_cvtpbf8_ph): Ditto.
	(_mm256_mask_cvtpbf8_ph): Ditto.
	(_mm256_maskz_cvtpbf8_ph): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_2-512-convert-1.c: Add tests for new
	intrin.
	* gcc.target/i386/avx10_2-convert-1.c: Ditto.

b4ac2c23

AVX10.2: Support compare instructions · 576bd309

Zhang, Jun authored 7 months ago


gcc/ChangeLog:

	* config/i386/i386-expand.cc
	(ix86_ssecom_setcc): Mention behavior change on flags.
	(ix86_expand_sse_comi): Handle AVX10.2 behavior.
	(ix86_expand_sse_comi_round): Ditto.
	(ix86_expand_round_builtin): Ditto.
	(ix86_expand_builtin): Change function call.
	* config/i386/i386.md (UNSPEC_COMX): New unspec.
	* config/i386/sse.md
	(avx10_2_v<unord>comx<ssemodesuffix><round_saeonly_name>): New.
	(<sse>_<unord>comi<round_saeonly_name>): Add HFmode.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_2-compare-1.c: New test.

Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>

576bd309

AVX10.2: Support vector copy instructions · f6fe2962

Zhang, Jun authored 7 months ago

gcc/ChangeLog:

	* config.gcc: Add avx10_2copyintrin.h.
	* config/i386/i386.md (avx10_2): New isa attribute.
	* config/i386/immintrin.h: Include avx10_2copyintrin.h.
	* config/i386/sse.md
	(sse_movss_<mode>): Add new constraints to handle AVX10.2.
	(vec_set<mode>_0): Ditto.
	(@vec_set<mode>_0): Ditto.
	(vec_set<mode>_0): Ditto.
	(avx512fp16_mov<mode>): Ditto.
	(*vec_set<mode>_0_1): New split.
	* config/i386/avx10_2copyintrin.h: New file.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_2-vmovd-1.c: New test.
	* gcc.target/i386/avx10_2-vmovd-2.c: Ditto.
	* gcc.target/i386/avx10_2-vmovw-1.c: Ditto.
	* gcc.target/i386/avx10_2-vmovw-2.c: Ditto.

f6fe2962

AVX10.2: Support minmax instructions · 889f6dd0

Mo, Zewei authored 7 months ago


gcc/ChangeLog:

	* config.gcc: Add avx10_2-512minmaxintrin.h and
	avx10_2minmaxintrin.h.
	* config/i386/i386-builtin-types.def:
	Add DEF_FUNCTION_TYPE (V8BF, V8BF, V8BF, INT, V8BF, UQI),
	(V16BF, V16BF, V16BF, INT, V16BF, UHI),
	(V32BF, V32BF, V32BF, INT, V32BF, USI),
	(V8HF, V8HF, V8HF, INT, V8HF, UQI),
	(V8DF, V8DF, V8DF, INT, V8DF, UQI, INT),
	(V32HF, V32HF, V32HF, INT, V32HF, USI, INT),
	(V16HF, V16HF, V16HF, INT, V16HF, UHI, INT),
	(V16SF, V16SF, V16SF, INT, V16SF, UHI, INT).
	* config/i386/i386-builtin.def (BDESC): Add new builtins.
	* config/i386/i386-expand.cc
	(ix86_expand_args_builtin): Handle V8BF_FTYPE_V8BF_V8BF_INT_V8BF_UQI,
	V16BF_FTYPE_V16BF_V16BF_INT_V16BF_UHI,
	V32BF_FTYPE_V32BF_V32BF_INT_V32BF_USI,
	V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI.
	(ix86_expand_round_builtin): Handle V8DF_FTYPE_V8DF_V8DF_INT_V8DF_UQI_INT,
	V32HF_FTYPE_V32HF_V32HF_INT_V32HF_USI_INT,
	V16HF_FTYPE_V16HF_V16HF_INT_V16HF_UHI_INT,
	V16SF_FTYPE_V16SF_V16SF_INT_V16SF_UHI_INT.
	* config/i386/immintrin.h: Include avx10_2-512minmaxintrin.h and
	avx10_2minmaxintrin.h.
	* config/i386/sse.md (VFH_AVX10_2): New.
	(avx10_2_vminmaxnepbf16_<mode><mask_name>): New define_insn.
	(avx10_2_minmaxp<mode><mask_name><round_saeonly_name>): Ditto.
	(avx10_2_minmaxs<mode><mask_scalar_name><round_saeonly_scalar_name>): Ditto.
	* config/i386/avx10_2-512minmaxintrin.h: New file.
	* config/i386/avx10_2minmaxintrin.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add macros.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/avx512f-helper.h: Add helper function.
	* gcc.target/i386/avx10-minmax-helper.h: New helper file.
	* gcc.target/i386/avx10_2-512-minmax-1.c: New test.
	* gcc.target/i386/avx10_2-512-vminmaxnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto.
	* gcc.target/i386/avx10_2-minmax-1.c: Ditto.
	* gcc.target/i386/avx10_2-vminmaxnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto.
	* gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto.
	* gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto.
	* gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto.
	* gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto.
	* gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto.

Co-authored-by: Hu, Lin1 <lin1.hu@intel.com>
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>

889f6dd0

[PATCH 2/2] AVX10.2: Support saturating convert instructions · 3a97ce17

Hu, Lin1 authored 7 months ago

gcc/ChangeLog:

	* config/i386/avx10_2-512satcvtintrin.h: Add new intrin.
	* config/i386/avx10_2satcvtintrin.h: Ditto.
	* config/i386/i386-builtin.def (BDESC): Add new builtins.
	* config/i386/sse.md (VF1_VF2_AVX10_2): New iterator.
	(VF2_AVX10_2): Ditto.
	(VI8_AVX10_2): Ditto.
	(sat_cvt_sign_prefix): Add new UNSPEC.
	(UNSPEC_SAT_CVT_DS_SIGN_ITER): New iterator.
	(pd2dqssuff): Ditto.
	(avx10_2_vcvtt<castmode>2<sat_cvt_sign_prefix>dqs<mode><mask_name><round_saeonly_name>):
	New.
	(avx10_2_vcvttpd2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>):
	Ditto.
	(avx10_2_vcvttps2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>):
	Ditto.
	(avx10_2_vcvttsd2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>):
	Ditto.
	(avx10_2_vcvttss2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>):
	Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add macros.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/avx10_2-satcvt-1.c: Add test.
	* gcc.target/i386/avx10_2-512-satcvt-1.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttpd2dqs-2.c: New test.
	* gcc.target/i386/avx10_2-512-vcvttpd2qqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttpd2udqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttpd2uqqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttps2dqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttps2qqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttps2udqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttps2uqqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttsd2sis-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttsd2usis-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttss2sis-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttss2usis-2.c: Ditto.

3a97ce17

[PATCH 1/2] AVX10.2: Support saturating convert instructions · e2c80d23

Hu, Lin1 authored 7 months ago

gcc/ChangeLog:

	* config.gcc: Add avx10_2satcvtintrin.h and
	avx10_2-512satcvtintrin.h.
	* config/i386/i386-builtin-types.def:
	Add DEF_FUNCTION_TYPE (V8HI, V8BF, V8HI, UQI),
	(V16HI, V16BF, V16HI, UHI), (V32HI, V32BF, V32HI, USI),
	(V16SI, V16SF, V16SI, UHI, INT), (V16HI, V16BF, V16HI, UHI, INT),
	(V32HI, V32BF, V32HI, USI, INT).
	* config/i386/i386-builtin.def (BDESC): Add new builtins.
	* config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle
	V32HI_FTYPE_V32BF_V32HI_USI, V16HI_FTYPE_V16BF_V16HI_UHI,
	V8HI_FTYPE_V8BF_V8HI_UQI.
	(ix86_expand_round_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI_INT,
	V16SI_FTYPE_V16SF_V16SI_UHI_INT, V16HI_FTYPE_V16BF_V16HI_UHI_INT.
	* config/i386/immintrin.h: Include avx10_2satcvtintrin.h and
	avx10_2-512satcvtintrin.h.
	* config/i386/sse.md:
	(UNSPEC_CVTNE_BF16_IBS_ITER): New iterator.
	(sat_cvt_sign_prefix): Ditto.
	(sat_cvt_trunc_prefix): Ditto.
	(UNSPEC_CVT_PH_IBS_ITER): Ditto.
	(UNSPEC_CVTT_PH_IBS_ITER): Ditto.
	(UNSPEC_CVT_PS_IBS_ITER): Ditto.
	(UNSPEC_CVTT_PS_IBS_ITER): Ditto.
	(avx10_2_cvt<sat_cvt_trunc_prefix>nebf162i<sat_cvt_sign_prefix>bs<mode><mask_name>):
	New define_insn.
	(avx10_2_cvtph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>):
	Ditto.
	(avx10_2_cvttph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>):
	Ditto.
	(avx10_2_cvtps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>):
	Ditto.
	(avx10_2_cvttps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>):
	Ditto.
	* config/i386/avx10_2-512satcvtintrin.h: New file.
	* config/i386/avx10_2satcvtintrin.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add macros.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/avx512f-helper.h: Add new test macro.
	* gcc.target/i386/m512-check.h: Add new type.
	* gcc.target/i386/avx10_2-512-satcvt-1.c: New test.
	* gcc.target/i386/avx10_2-512-vcvtnebf162ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtnebf162iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtph2iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtps2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtps2iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttnebf162ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttnebf162iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttph2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttps2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvttps2iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-satcvt-1.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtnebf162ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtnebf162iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttnebf162ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttnebf162iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto.

e2c80d23

[PATCH 2/2] AVX10.2: Support BF16 instructions · 5cb67ddd

konglin1 authored 7 months ago


gcc/ChangeLog:

	* config/i386/avx10_2-512bf16intrin.h: Add new intrinsics.
	* config/i386/avx10_2bf16intrin.h: Ditto.
	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE
	for new type.
	* config/i386/i386-builtin.def (BDESC): Add new builtin.
	* config/i386/i386-expand.cc (ix86_expand_args_builtin):
	Handle new type.
	* config/i386/sse.md (vecmemsuffix): Add vector BF mode.
	(avx10_2_rsqrtpbf16_<mode><mask_name>): New define_insn.
	(avx10_2_sqrtnepbf16_<mode><mask_name>): Ditto.
	(avx10_2_rcppbf16_<mode><mask_name>): Ditto.
	(avx10_2_getexppbf16_<mode><mask_name>): Ditto.
	(BF16IMMOP): New iterator.
	(bf16immop): Ditto.
	(avx10_2_<bf16immop>pbf16_<mode><mask_name>): New define_insn.
	(avx10_2_fpclasspbf16_<mode><mask_scalar_merge_name>): Ditto.
	(avx10_2_cmppbf16_<mode><mask_scalar_merge_name>): Ditto.
	(avx10_2_comsbf16_v8bf): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10-check.h: Add AVX10_SCALAR.
	* gcc.target/i386/avx10-helper.h: Add helper functions.
	* gcc.target/i386/avx10_2-512-bf16-1.c: Add new tests.
	* gcc.target/i386/avx10_2-bf16-1.c: Ditto.
	* gcc.target/i386/avx-1.c: Add macros.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcmppbf16-2.c: New test.
	* gcc.target/i386/avx10_2-512-vfpclasspbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vgetexppbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vgetmantpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vrcppbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vreducenepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vrndscalenepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vrsqrtpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vsqrtnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcmppbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcomsbf16-1.c: Ditto.
	* gcc.target/i386/avx10_2-vcomsbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vfpclasspbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vgetexppbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vgetmantpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vrcppbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vreducenepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vrndscalenepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vrsqrtpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vsqrtnepbf16-2.c: Ditto.

Co-authored-by: Levy Hsu <admin@levyhsu.com>

5cb67ddd

[PATCH 1/2] AVX10.2: Support BF16 instructions · 90236624

konglin1 authored 7 months ago


gcc/ChangeLog:

	* config.gcc: Add avx10_2-512bf16intrin.h and avx10_2bf16intrin.h.
	* config/i386/i386-builtin-types.def: Add new
	DEF_FUNCTION_TYPE for V32BF_FTYPE_V32BF_V32BF,
	V16BF_FTYPE_V16BF_V16BF, V8BF_FTYPE_V8BF_V8BF,
	V8BF_FTYPE_V8BF_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_UHI,
	V32BF_FTYPE_V32BF_V32BF_USI, V32BF_FTYPE_V32BF_V32BF_V32BF_USI,
	V8BF_FTYPE_V8BF_V8BF_V8BF_UQI and V16BF_FTYPE_V16BF_V16BF_V16BF_UHI.
	* config/i386/i386-builtin.def (BDESC): Add new builtins.
	* config/i386/i386-expand.cc (ix86_expand_args_builtin):
	Handle new DEF_FUNCTION_TYPE.
	* config/i386/immintrin.h: Include avx10_2-512bf16intrin.h and
	avx10_2bf16intrin.h.
	* config/i386/sse.md
	(VBF_AVX10_2): New iterator.
	(avx10_2_scalefpbf16_<mode><mask_name>): New define_insn.
	(avx10_2_<code>nepbf16_<mode><mask_name>): Ditto.
	(avx10_2_<insn>nepbf16_<mode><mask_name>): Ditto.
	(avx10_2_fmaddnepbf16_<mode>_maskz): New expander.
	(avx10_2_fnmaddnepbf16_<mode>_maskz): Ditto.
	(avx10_2_fmsubnepbf16_<mode>_maskz): Ditto.
	(avx10_2_fnmsubnepbf16_<mode>_maskz): Ditto.
	(avx10_2_fmaddnepbf16_<mode><sd_maskz_name>): New define_insn.
	(avx10_2_fmaddnepbf16_<mode>_mask): Ditto.
	(avx10_2_fmaddnepbf16_<mode>_mask3): Ditto.
	(avx10_2_fnmaddnepbf16_<mode><sd_maskz_name>): Ditto.
	(avx10_2_fnmaddnepbf16_<mode>_mask): Ditto.
	(avx10_2_fnmaddnepbf16_<mode>_mask3): Ditto.
	(avx10_2_fmsubnepbf16_<mode><sd_maskz_name>): Ditto.
	(avx10_2_fmsubnepbf16_<mode>_mask): Ditto.
	(avx10_2_fmsubnepbf16_<mode>_mask3): Ditto.
	(avx10_2_fnmsubnepbf16_<mode><sd_maskz_name>): Ditto.
	(avx10_2_fnmsubnepbf16_<mode>_mask): Ditto.
	(avx10_2_fnmsubnepbf16_<mode>_mask3): Ditto.
	* config/i386/avx10_2-512bf16intrin.h: New file.
	* config/i386/avx10_2bf16intrin.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512f-helper.h: Add MAKE_MASK_MERGE and MAKE_MASK_ZERO
	for bf16_uw.
	* gcc.target/i386/m512-check.h: Add union512bf16_uw, union256bf16_uw,
	union128bf16_uw and CHECK_EXP for them.
	* gcc.target/i386/avx10-helper.h: New file.
	* gcc.target/i386/avx10_2-512-bf16-1.c: New test.
	* gcc.target/i386/avx10_2-512-vaddnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vdivnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vfmaddXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vfmsubXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vfnmaddXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vfnmsubXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vmaxpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vminpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vscalefpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vsubnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-bf16-1.c: Ditto.
	* gcc.target/i386/avx10_2-vaddnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vdivnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vfmaddXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vfmsubXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vfnmaddXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vfnmsubXXXnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vmaxpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vminpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vmulnepbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vscalefpbf16-2.c: Ditto.
	* gcc.target/i386/avx10_2-vsubnepbf16-2.c: Ditto.

Co-authored-by: Levy Hsu <admin@levyhsu.com>

90236624

AVX10.2: Support convert instructions · 2a046117

Levy Hsu authored 7 months ago


gcc/ChangeLog:

	* config.gcc: Add avx10_2-512convertintrin.h and
	avx10_2convertintrin.h.
	* config/i386/i386-builtin-types.def: Add new DEF_POINTER_TYPE
	and DEF_FUNCTION_TYPE.
	* config/i386/i386-builtin.def (BDESC): Add new builtins.
	* config/i386/i386-expand.cc (ix86_expand_args_builtin):
	Handle AVX10.2.
	(ix86_expand_round_builtin): Ditto.
	* config/i386/immintrin.h: Include avx10_2-512convertintrin.h,
	avx10_2convertintrin.h.
	* config/i386/sse.md (VHF_AVX10_2): New iterator.
	(bf16_ph): Add 512 bit mode.
	(avx10_2_cvt2ps2phx_<mode><mask_name><round_name>): New define_insn.
	(ssebvecmode): New iterator.
	(UNSPEC_NECONVERTFP8_PACK): Ditto.
	(neconvertfp8_pack): Ditto.
	(vcvt<neconvertfp8_pack><mode><mask_name>): New define_insn.
	(ssebvecmode_2): New iterator.
	(UNSPEC_VCVTBIASPH2FP8_PACK): Ditto.
	(biasph2fp8_pack): Ditto.
	(vcvt<biasph2fp8_pack>v8hf): New expander.
	(vcvt<biasph2fp8_pack>v8hf_mask): Ditto.
	(*vcvt<biasph2fp8_pack>v8hf): New define_insn.
	(*vcvt<biasph2fp8_pack>v8hf_mask): Ditto.
	(VHF_AVX10_2_2): New iterator.
	(vcvt<biasph2fp8_pack><mode><mask_name>): New define_insn.
	(VHF_256_512): New iterator.
	(ph2fp8suff): Ditto.
	(UNSPEC_NECONVERTPH2FP8_PACK): Ditto.
	(neconvertph2fp8): Ditto.
	(vcvt<neconvertph2fp8>v8hf_mask): New expander.
	(*vcvt<neconvertph2fp8>v8hf): New define_insn.
	(*vcvt<neconvertph2fp8>v8hf_mask): Ditto.
	(vcvt<neconvertph2fp8><mode><mask_name>): Ditto.
	(vcvthf82ph<mode><mask_name>): Ditto.
	* config/i386/avx10_2-512convertintrin.h: New file.
	* config/i386/avx10_2convertintrin.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add macros for const.
	* gcc.target/i386/avx-2.c: Ditto.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/avx10_2-512-convert-1.c: New test.
	* gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvthf82ph-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtne2ph2bf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtne2ph2bf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtne2ph2hf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtne2ph2hf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-convert-1.c: Ditto.
	* gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtne2ph2bf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtne2ph2bf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtne2ph2hf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtne2ph2hf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtneph2bf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtneph2bf8s-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtneph2hf8-2.c: Ditto.
	* gcc.target/i386/avx10_2-vcvtneph2hf8s-2.c: Ditto.
	* gcc.target/i386/fp8-helper.h: New helper file.

Co-authored-by: Levy Hsu <admin@levyhsu.com>
Co-authored-by: Kong Lingling <lingling.kong@intel.com>

2a046117

[PATCH 2/2] AVX10.2: Support media instructions · af0a0627

Haochen Jiang authored 7 months ago


gcc/ChangeLog:

	* config/i386/avx10_2-512mediaintrin.h: Add new intrins.
	* config/i386/avx10_2mediaintrin.h: Ditto.
	* config/i386/i386-builtin.def: Add new builtins.
	* config/i386/i386-builtins.cc (def_builtin): Handle shared
	builtins between AVXVNNIINT16 and AVX10.2.
	* config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
	Ditto.
	* config/i386/sse.md (unspec): Add UNSPEC_VDPPHPS.
	(avx10_2_mpsadbw<mask_name>): New define_insn.
	(<mask_codefor><sse4_1_avx2>_mpsadbw<mask_name>): Ditto.
	(vpdp<vpdpwprodtype>_<mode>): Add AVX10_2_256.
	(vpdp<vpdpwprodtype>_v16si): New define_insn.
	(vpdp<vpdpwprodtype>_<mode>_mask): Ditto.
	(*vpdp<vpdpwprodtype>_<mode>_maskz): Ditto.
	(vpdp<vpdpwprodtype>_<mode>_maskz): New expander.
	(vdpphps_<mode>): New define_insn.
	(vdpphps_<mode>_mask): Ditto.
	(*vdpphps_<mode>_maskz): Ditto.
	(vdpphps_<mode>_maskz): New expander.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avxvnniint16-1.c: Add new macro test.
	* gcc.target/i386/avx-1.c: Ditto.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/avx10_2-512-media-1.c: Add test.
	* gcc.target/i386/avx10_2-media-1.c: Ditto.
	* gcc.target/i386/avxvnniint16-builtin.c: New test.
	* gcc.target/i386/avx10_2-512-vdpphps-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vmpsadbw-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpwsud-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpwsuds-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpwusd-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpwusds-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpwuud-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpwuuds-2.c: Ditto.
	* gcc.target/i386/avx10_2-builtin-2.c: Ditto.
	* gcc.target/i386/avx10_2-vdpphps-2.c: Ditto.
	* gcc.target/i386/avx10_2-vmpsadbw-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpwsud-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpwsuds-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpwusd-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpwusds-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpwuud-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpwuuds-2.c: Ditto.

Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>

af0a0627

[PATCH 1/2] AVX10.2: Support media instructions · 8db80b27

Hongyu Wang authored 7 months ago


gcc/ChangeLog:

	* config.gcc: Add avx10_2mediaintrin.h and
	avx10_2-512mediaintrin.h.
	* config/i386/i386-builtin.def: Add new builtins.
	* config/i386/i386-builtins.cc (def_builtin): Handle shared
	builtins between AVXVNNIINT8 and AVX10.2.
	* config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
	Ditto.
	* config/i386/immintrin.h: Include avx10_2mediaintrin.h and
	avx10_2-512mediaintrin.h.
	* config/i386/sse.md (VI4_AVX10_2): New.
	(vpdp<vpdotprodtype>_<mode>): Add AVX10_2_256.
	(vpdp<vpdotprodtype>_v16si): New define_insn.
	(vpdp<vpdotprodtype>_<mode>_mask): Ditto.
	(*vpdp<vpdotprodtype>_<mode>_maskz): Ditto.
	(vpdp<vpdotprodtype>_<mode>_maskz): New expander.
	* config/i386/avx10_2-512mediaintrin.h: New file.
	* config/i386/avx10_2mediaintrin.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512f-helper.h: Reuse AVX512F macros
	for AVX10.
	* gcc.target/i386/funcspec-56.inc: Add new target attribute.
	* lib/target-supports.exp
	(check_effective_target_avx10_2): New.
	(check_effective_target_avx10_2_512): Ditto.
	* gcc.target/i386/avx10-check.h: New test file.
	* gcc.target/i386/avx10-helper.h: Ditto.
	* gcc.target/i386/avx10_2-builtin-1.c: Ditto.
	* gcc.target/i386/avx10_2-512-media-1.c: Ditto.
	* gcc.target/i386/avx10_2-media-1.c: Ditto.
	* gcc.target/i386/avxvnniint8-builtin.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpbssd-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpbssds-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpbsud-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpbsuds-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpbuud-2.c: Ditto.
	* gcc.target/i386/avx10_2-512-vpdpbuuds-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto.
	* gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto.

Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>

8db80b27

i386: Refactor m512-check.h · cba45668

Haochen Jiang authored 7 months ago

After the introduction of AVX10, we still want to use the AVX512 helper
functions to avoid duplicating code. To reuse them, some refactoring is needed
so that each function definition sits under the correct ISA, which avoids ABI
warnings.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/m512-check.h: Wrap the function definitions with
	the correct vector size.

cba45668

RISC-V: Support IMM for operand 0 of ussub pattern · 17be0091

Pan Li authored 7 months ago


This patch allows an immediate (IMM) as operand 0 of the ussub pattern,
i.e. .SAT_SUB(1023, y), as in the example below.

Form 1:
  #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
  T __attribute__((noinline))             \
  sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
  {                                       \
    return (T)IMM >= y ? (T)IMM - y : 0;  \
  }

DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023)

Before this patch:
  sat_u_sub_imm82_uint64_t_fmt_1:
      li  a5,82
      bgtu    a0,a5,.L3
      sub a0,a5,a0
      ret
  .L3:
      li  a0,0
      ret

After this patch:
  sat_u_sub_imm82_uint64_t_fmt_1:
      li  a5,82
      sltu    a4,a5,a0
      addi    a4,a4,-1
      sub a0,a5,a0
      and a0,a4,a0
      ret
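
The "after" sequence replaces the branch with a mask: `sltu` yields 1 when the immediate is below y, `addi ... -1` turns that into 0 (saturate) or all-ones (keep), and `and` applies the mask to the wrapped difference. A minimal C sketch of the same idea follows, using the immediate 82 from the listings; both function names are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical reference form, mirroring DEF_SAT_U_SUB_IMM_FMT_1 with IMM = 82. */
uint64_t sat_u_sub_imm82_ref(uint64_t y)
{
    return (uint64_t)82 >= y ? (uint64_t)82 - y : 0;
}

/* Hypothetical branchless form, one statement per instruction of the
   "after" listing. */
uint64_t sat_u_sub_imm82_branchless(uint64_t y)
{
    uint64_t imm = 82;          /* li   a5,82                                  */
    uint64_t mask = imm < y;    /* sltu a4,a5,a0 : 1 if 82 - y would underflow */
    mask = mask - 1;            /* addi a4,a4,-1 : 0 on underflow, else ~0     */
    uint64_t diff = imm - y;    /* sub  a0,a5,a0 : wraps modulo 2^64           */
    return diff & mask;         /* and  a0,a4,a0                               */
}
```

Both functions should agree for every y; the branchless form simply trades the conditional jump for two extra ALU instructions.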

The following test suite passes with this patch:
1. The rv64gcv full regression test.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new
	func impl to gen xmode rtx reg from operand rtx.
	(riscv_expand_ussub): Gen xmode reg for operand 1.
	* config/riscv/riscv.md: Allow const_int for operand 1.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat_arith.h: Add test helper macro.
	* gcc.target/riscv/sat_u_sub_imm-1.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-1_1.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-1_2.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-2.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-2_1.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-2_2.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-3_1.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-3_2.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-4.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-run-1.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-run-2.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-run-3.c: New test.
	* gcc.target/riscv/sat_u_sub_imm-run-4.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

17be0091

RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 4 · 8f2f7aab

Pan Li authored 7 months ago


This patch adds test cases for the unsigned vector
.SAT_TRUNC form 4, i.e.:

Form 4:
  #define DEF_VEC_SAT_U_TRUNC_FMT_4(NT, WT)                             \
  void __attribute__((noinline))                                        \
  vec_sat_u_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        bool not_overflow = in[i] <= (WT)(NT)(-1);                      \
        out[i] = ((NT)in[i]) | (NT)((NT)not_overflow - 1);              \
      }                                                                 \
  }

DEF_VEC_SAT_U_TRUNC_FMT_4 (uint32_t, uint64_t)

The following test passes with this patch.
* The rv64gcv regression test.
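
For orientation, the form 4 loop can be exercised on a small array. The sketch below reproduces the macro from this message together with the headers it needs (`stdbool.h` and `stdint.h` are assumptions about what the test helper header includes) and instantiates it for uint32_t and uint64_t.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEF_VEC_SAT_U_TRUNC_FMT_4(NT, WT)                             \
void __attribute__((noinline))                                        \
vec_sat_u_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \
{                                                                     \
  unsigned i;                                                         \
  for (i = 0; i < limit; i++)                                         \
    {                                                                 \
      /* True when in[i] fits in the narrow type NT.  */              \
      bool not_overflow = in[i] <= (WT)(NT)(-1);                      \
      /* Mask is 0 when it fits and all-ones when it does not.  */    \
      out[i] = ((NT)in[i]) | (NT)((NT)not_overflow - 1);              \
    }                                                                 \
}

/* Defines vec_sat_u_trunc_uint32_t_uint64_t_fmt_4.  */
DEF_VEC_SAT_U_TRUNC_FMT_4 (uint32_t, uint64_t)
```

Calling the generated function on inputs such as 1, 0xFFFFFFFF and 0x100000000 keeps the first two values and saturates the third to 0xFFFFFFFF.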

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-19.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-20.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-21.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-22.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-23.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-24.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-19.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-20.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-21.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-22.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-23.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-24.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

8f2f7aab

RISC-V: Add testcases for unsigned scalar .SAT_TRUNC form 4 · 5ab1e238

Pan Li authored 7 months ago


This patch adds test cases for the unsigned scalar quad and
oct .SAT_TRUNC form 4, i.e.:

Form 4:
  #define DEF_SAT_U_TRUNC_FMT_4(NT, WT)          \
  NT __attribute__((noinline))                   \
  sat_u_trunc_##WT##_to_##NT##_fmt_4 (WT x)      \
  {                                              \
    bool not_overflow = x <= (WT)(NT)(-1);       \
    return ((NT)x) | (NT)((NT)not_overflow - 1); \
  }

The following test passes with this patch.
* The rv64gcv regression test.
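
The role of the casts in form 4 is easier to see in a hand expansion. The sketch below expands the macro for NT = uint8_t and WT = uint64_t, a pairing chosen here only for illustration, and adds a plain clamp (`sat_u_trunc_clamp`, a hypothetical helper) for cross-checking.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hand expansion of DEF_SAT_U_TRUNC_FMT_4 (uint8_t, uint64_t). */
uint8_t sat_u_trunc_uint64_t_to_uint8_t_fmt_4(uint64_t x)
{
    /* (uint64_t)(uint8_t)(-1) is 255, the largest value uint8_t holds. */
    bool not_overflow = x <= (uint64_t)(uint8_t)(-1);
    /* (uint8_t)not_overflow - 1 is evaluated in int: 0 when x fits,
       -1 otherwise. The outer cast narrows -1 to 0xFF, an all-ones mask. */
    uint8_t mask = (uint8_t)((uint8_t)not_overflow - 1);
    /* OR with all-ones saturates; OR with zero keeps the truncated value. */
    return (uint8_t)((uint8_t)x | mask);
}

/* Hypothetical reference: an explicit clamp, used only to cross-check. */
uint8_t sat_u_trunc_clamp(uint64_t x)
{
    return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}
```

For example, 200 is returned unchanged, while 300 truncates to 44 and is then forced to 255 by the mask.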

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/sat_arith.h: Add test helper macros.
	* gcc.target/riscv/sat_u_trunc-19.c: New test.
	* gcc.target/riscv/sat_u_trunc-20.c: New test.
	* gcc.target/riscv/sat_u_trunc-21.c: New test.
	* gcc.target/riscv/sat_u_trunc-22.c: New test.
	* gcc.target/riscv/sat_u_trunc-23.c: New test.
	* gcc.target/riscv/sat_u_trunc-24.c: New test.
	* gcc.target/riscv/sat_u_trunc-run-19.c: New test.
	* gcc.target/riscv/sat_u_trunc-run-20.c: New test.
	* gcc.target/riscv/sat_u_trunc-run-21.c: New test.
	* gcc.target/riscv/sat_u_trunc-run-22.c: New test.
	* gcc.target/riscv/sat_u_trunc-run-23.c: New test.
	* gcc.target/riscv/sat_u_trunc-run-24.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

5ab1e238

Daily bump. · 07b7098b

GCC Administrator authored 7 months ago

07b7098b

Aug 25, 2024

RISC-V: Fix double mode under RV32 not utilize vf · 7f65c38a

demin.han authored 7 months ago

Currently, some binops between a vector and a double scalar under RV32 are not
translated to the vf form; vfmv+vxx.vv is generated instead.

The cause is that vec_duplicate is also expanded to a broadcast for double mode
under RV32, and late-combine can't process the expanded broadcast.
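
As an illustration of the loop shape involved, here is a hypothetical C function (not copied from the listed tests) in which every vector element is combined with the same double scalar. When such a loop is auto-vectorized for RVV, the scalar can either be broadcast into a vector register and added with a vector-vector instruction, or consumed directly by a vector-scalar (vf) instruction; the message above concerns getting the second form on RV32.

```c
#include <stddef.h>

/* Vector-vs-scalar binop: each element of b is added to the same double x. */
void vadd_double_scalar(double *restrict a, const double *restrict b,
                        double x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + x;
}
```

Which instructions are actually emitted depends on the target and compiler options, so the sketch makes no claim about the generated code.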

gcc/ChangeLog:

	* config/riscv/vector.md: Add !FLOAT_MODE_P constraint.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Fix test.
	* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Ditto.

7f65c38a

[PATCH] Re-add calling emit_clobber in lower-subreg.cc's resolve_simple_move. · dba20679

Xianmiao Qu authored 7 months ago

The previous patch:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2b27953c8cd
aimed to eliminate redundant MOV instructions by removing the call to
emit_clobber in lower-subreg.cc's resolve_simple_move.

First, I found that another patch addresses this issue:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf2737cda53a83332db1a1a021653447b05a7e7
and that even without removing the emit_clobber call,
the generated instructions are still as expected.

Second, removing the CLOBBER expression has side effects.
When there is no CLOBBER expression and only SUBREG assignments exist,
according to the logic of the 'df_lr_bb_local_compute' function,
the register will be added to the basic block LR IN set.
This will cause the register's lifetime to span the entire function,
resulting in increased register pressure. Taking the newly added test case
'gcc/testsuite/gcc.target/riscv/pr43644.c' as an example,
removing the CLOBBER expression leads to spills of some registers.

gcc/:
	* lower-subreg.cc (resolve_simple_move): Re-add calling emit_clobber
	immediately before moving a multi-word register by parts.

gcc/testsuite/:
	* gcc.target/riscv/pr43644.c: New test case.

dba20679

testsuite: Run array54.C only for sync_int_long targets · b21d6474

Dimitar Dimitrov authored 7 months ago


The test case uses "atomic<int>", which fails to link on the
pru-unknown-elf target due to the missing __atomic_load_4 symbol.

Fix by filtering for the sync_int_long effective target.  Ensured that the
test still passes for x86_64-pc-linux-gnu.
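
To show where a symbol like __atomic_load_4 comes from, here is a hypothetical C analogue of a 4-byte atomic access using GCC's `__atomic` builtins (the test itself is C++ and uses atomic<int>; the names below are illustrative). On targets without native atomic support for this width, GCC typically emits a call to an out-of-line helper such as `__atomic_load_4` instead of inline code, and the link fails when no library provides it.

```c
/* Hypothetical C analogue of the C++ atomic<int> used by the test. */
static int shared_value;

void store_value(int v)
{
    /* 4-byte sequentially consistent store. */
    __atomic_store_n(&shared_value, v, __ATOMIC_SEQ_CST);
}

int load_value(void)
{
    /* 4-byte sequentially consistent load. */
    return __atomic_load_n(&shared_value, __ATOMIC_SEQ_CST);
}
```

On x86_64-pc-linux-gnu both operations are expected to compile to inline instructions, which is consistent with the test continuing to pass there.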

gcc/testsuite/ChangeLog:

	* g++.dg/init/array54.C: Require sync_int_long effective target.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

b21d6474

Support if conversion for switches · c9ccc396

Andi Kleen authored 7 months ago

The gimple-if-to-switch pass converts if statements with
multiple equality checks on the same value to a switch. This breaks
vectorization, which cannot handle switches.

Teach the tree-if-conv pass used by the vectorizer to handle
simple switch statements, like those created by if-to-switch earlier.
These are switches that have only a single non-default block.
They are handled similarly to COND in if conversion.

This makes the vect-bitfield-read-1-not test fail. The test
checks for a bitfield analysis failing, but it actually
relied on the ifcvt erroring out early because the test
is using a switch. The if conversion still does not
work because the switch is not in a form that this
patch can handle, but it fails much later and the bitfield
analysis succeeds, which makes the test fail. I marked
it xfail because it doesn't seem to be testing what it wants
to test.
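
The kind of switch described above, with a single non-default block, can be pictured as follows. This is a hypothetical sketch: the function names and case values are illustrative and are not taken from the new tests. The second function shows the predicated, branch-free shape that if conversion aims for, written by hand in C.

```c
#include <stddef.h>

/* Loop body contains a switch with exactly one non-default block. */
void scale_selected_switch(int *out, const int *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = in[i];
        switch (v) {
        case 1:
        case 3:
        case 5:
            v *= 2;     /* the single non-default block */
            break;
        default:
            break;      /* default falls straight through to the store */
        }
        out[i] = v;
    }
}

/* Hand-written predicated equivalent: the case labels become one
   predicate and the block becomes a select. */
void scale_selected_predicated(int *out, const int *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = in[i];
        int taken = (v == 1) | (v == 3) | (v == 5);
        out[i] = taken ? v * 2 : v;
    }
}
```

Whether GCC vectorizes this exact loop depends on the target and options; the sketch only illustrates the control-flow shape.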

	PR tree-optimization/115866

gcc/ChangeLog:

	* tree-if-conv.cc (if_convertible_switch_p): New function.
	(if_convertible_stmt_p): Check for switch.
	(get_loop_body_in_if_conv_order): Handle switch.
	(predicate_bbs): Likewise.
	(predicate_statements): Likewise.
	(remove_conditions_and_labels): Likewise.
	(ifcvt_split_critical_edges): Likewise.
	(ifcvt_local_dce): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-switch-ifcvt-1.c: New test.
	* gcc.dg/vect/vect-switch-ifcvt-2.c: New test.
	* gcc.dg/vect/vect-switch-search-line-fast.c: New test.
	* gcc.dg/vect/vect-bitfield-read-1-not.c: Change to xfail.

c9ccc396

Write CodeView information about static locals in optimized code · 382fcf03

Mark Harmstone authored 7 months ago

Write CodeView S_LDATA32 symbols for static locals in optimized code. We have
to handle these separately, as they come after the S_FRAMEPROC, plus you can't
have S_BLOCK32 symbols like you can in unoptimized code.

gcc/
	* dwarf2codeview.cc (write_optimized_static_local_vars): New function.
	(write_function): Call write_optimized_static_local_vars.

382fcf03