- Aug 27, 2024
-
-
YunQiang Su authored
mips16.S has not been included since commit 29b74545 (Date: Thu Jun 1 10:14:24 2023 +0800, "MIPS: Add speculation_barrier support"). Without mips16.S included, some symbols are missing for mips16, so some software fails to build. libgcc/ChangeLog: * config/mips/lib1funcs.S: Include mips16.S.
-
- Aug 26, 2024
-
-
Hans-Peter Nilsson authored
The first of the late-combine passes propagates some of the copies made during the (in-time-)combine pass in make_more_copies into the users of the "original" pseudo registers and removes the "old" pseudos. That effectively removes attributes such as REG_POINTER, which matter to LRA. The quoted PR is for an ICE-manifesting bug that was exposed by the late-combine pass and went back to hiding with this patch until commit r15-2937-g3673b7054ec2, the fix for PR116236, when it was actually fixed. To wit, this patch is only incidentally related to that bug. In other words, the REG_POINTER attribute should not be required for LRA to work correctly. This patch merely corrects state for those propagated register-uses to ante late-combine. For reasons not investigated, this fixes a failing test "FAIL: gcc.dg/guality/pr54200.c -Og -DPREVENT_OPTIMIZATION line 20 z == 3" for x86_64-linux-gnu. PR middle-end/115883 * combine.cc (make_more_copies): Copy attributes from the original pseudo to the new copy.
-
Arsen Arsenović authored
In the testcase presented in the PR, during template expansion, a tsubst of an operand causes a lambda coroutine to be processed, causing it to get an initial suspend and final suspend. The code for assigning awaitable var names (get_awaitable_var) assumed that the sequence Is -> Is -> Fs -> Fs is impossible (i.e. that only one coroutine could be 'open' at a time), and reset the counter used for unique numbering each time a final suspend occurred. This assumption is false in a few cases, usually when lambdas are involved. Instead of storing this counter in a static-storage variable, we can store it in coroutine_info. This struct is local to each function, so we don't need to worry about "cross-contamination" nor resetting. PR c++/113457 gcc/cp/ChangeLog: * coroutines.cc (struct coroutine_info): Add integer field awaitable_number. This is a counter used for assigning unique names to awaitable temporaries. (get_awaitable_var): Use awaitable_number from coroutine_info instead of the static int awn. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr113457-1.C: New test. * g++.dg/coroutines/pr113457.C: New test.
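The effect of moving the counter can be illustrated with a simplified model. This is a sketch only, not the code in coroutines.cc: `shared_counter_namer`, `coroutine_info_sketch`, the `Aw<N>` naming and both helper functions are hypothetical names chosen for illustration. It shows how a counter that is shared and reset on every final suspend hands out a duplicate name to an outer coroutine when an inner one is processed in the middle, and how a per-coroutine counter avoids that.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class suspend_kind { initial, await, final_ };

// Simplified model of the old scheme: one counter shared by all coroutines,
// reset whenever any final suspend is seen.
struct shared_counter_namer {
  int awn = 0;
  std::string next(suspend_kind k) {
    switch (k) {
      case suspend_kind::initial: return "Is";
      case suspend_kind::final_: awn = 0; return "Fs";
      default: return "Aw" + std::to_string(awn++);
    }
  }
};

// Hypothetical stand-in for coroutine_info: the counter belongs to one
// function, so it is never shared and never needs resetting.
struct coroutine_info_sketch {
  int awaitable_number = 0;
  std::string next(suspend_kind k) {
    switch (k) {
      case suspend_kind::initial: return "Is";
      case suspend_kind::final_: return "Fs";
      default: return "Aw" + std::to_string(awaitable_number++);
    }
  }
};

// Names given to the OUTER coroutine when an inner (lambda) coroutine is
// processed between two of its await points: Is -> Is -> Fs -> Fs.
std::vector<std::string> outer_names_shared() {
  shared_counter_namer n;
  std::vector<std::string> outer;
  outer.push_back(n.next(suspend_kind::initial));
  outer.push_back(n.next(suspend_kind::await));
  n.next(suspend_kind::initial);  // inner coroutine opens
  n.next(suspend_kind::await);
  n.next(suspend_kind::final_);   // inner coroutine closes: counter reset
  outer.push_back(n.next(suspend_kind::await));  // reuses an earlier number
  outer.push_back(n.next(suspend_kind::final_));
  return outer;
}

std::vector<std::string> outer_names_per_coroutine() {
  coroutine_info_sketch outer_info, inner_info;
  std::vector<std::string> outer;
  outer.push_back(outer_info.next(suspend_kind::initial));
  outer.push_back(outer_info.next(suspend_kind::await));
  inner_info.next(suspend_kind::initial);  // inner coroutine uses its own state
  inner_info.next(suspend_kind::await);
  inner_info.next(suspend_kind::final_);
  outer.push_back(outer_info.next(suspend_kind::await));
  outer.push_back(outer_info.next(suspend_kind::final_));
  return outer;
}
```

In this model the shared scheme names both outer awaitables identically, while the per-coroutine scheme keeps them distinct without any reset logic.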
-
Arsen Arsenović authored
We do not support it currently, and the resulting memory can only be used inside a single resumption, so best not confuse the user with it. PR c++/115858 - Incompatibility of coroutines and alloca() gcc/ChangeLog: * coroutine-passes.cc (execute_early_expand_coro_ifns): Emit a sorry if a statement is an alloca call. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr115858.C: New test.
-
David Malcolm authored
In particular, move the classic text output code to a diagnostic-text.cc (analogous to -json.cc and -sarif.cc). No functional change intended. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add diagnostic-format-text.o. * diagnostic-format-json.cc: Include "diagnostic-format.h". * diagnostic-format-sarif.cc: Likewise. * diagnostic-format-text.cc: New file, using material from diagnostics.cc. * diagnostic-global-context.cc: Include "diagnostic-format.h". * diagnostic-format-text.h: New file, using material from diagnostics.h. * diagnostic-format.h: New file, using material from diagnostics.h. * diagnostic.cc: Include "diagnostic-format.h" and "diagnostic-format-text.h". (diagnostic_text_output_format::~diagnostic_text_output_format): Move to diagnostic-format-text.cc. (diagnostic_text_output_format::on_report_diagnostic): Likewise. (diagnostic_text_output_format::on_diagram): Likewise. (diagnostic_text_output_format::print_any_cwe): Likewise. (diagnostic_text_output_format::print_any_rules): Likewise. (diagnostic_text_output_format::print_option_information): Likewise. * diagnostic.h (class diagnostic_output_format): Move to diagnostic-format.h. (class diagnostic_text_output_format): Move to diagnostic-format-text.h. (diagnostic_output_format_init): Move to diagnostic-format.h. (diagnostic_output_format_init_json_stderr): Likewise. (diagnostic_output_format_init_json_file): Likewise. (diagnostic_output_format_init_sarif_stderr): Likewise. (diagnostic_output_format_init_sarif_file): Likewise. (diagnostic_output_format_init_sarif_stream): Likewise. * gcc.cc: Include "diagnostic-format.h". * opts.cc: Include "diagnostic-format.h". gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_group_plugin.c: Include "diagnostic-format-text.h". Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
David Malcolm authored
Previously diagnostic_context::report_diagnostic had, after the call to pp_format (phases 1 and 2 of formatting the message): m_output_format->on_begin_diagnostic (*diagnostic); pp_output_formatted_text (this->printer, m_urlifier); if (m_show_cwe) print_any_cwe (*diagnostic); if (m_show_rules) print_any_rules (*diagnostic); if (m_show_option_requested) print_option_information (*diagnostic, orig_diag_kind); m_output_format->on_end_diagnostic (*diagnostic, orig_diag_kind); This patch replaces all of the above with a single call to m_output_format->on_report_diagnostic (*diagnostic, orig_diag_kind); moving responsibility for phase 3 of formatting and printing the result from diagnostic_context to the output format. This simplifies diagnostic_context::report_diagnostic and allows us to move the code that prints CWEs, rules, and option information in textual form from diagnostic_context to diagnostic_text_output_format, where it belongs. No functional change intended. gcc/ChangeLog: * diagnostic-format-json.cc (json_output_format::on_begin_diagnostic): Delete. (json_output_format::on_end_diagnostic): Rename to... (json_output_format::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (diagnostic_output_format_init_json): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic-format-sarif.cc (sarif_builder::end_diagnostic): Rename to... (sarif_builder::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (sarif_output_format::on_begin_diagnostic): Delete. (sarif_output_format::on_end_diagnostic): Rename to... (sarif_output_format::on_report_diagnostic): ...this and update call to m_builder accordingly. (diagnostic_output_format_init_sarif): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic.cc (diagnostic_context::print_any_cwe): Convert to... (diagnostic_text_output_format::print_any_cwe): ...this. 
(diagnostic_context::print_any_rules): Convert to... (diagnostic_text_output_format::print_any_rules): ...this. (diagnostic_context::print_option_information): Convert to... (diagnostic_text_output_format::print_option_information): ...this. (diagnostic_context::report_diagnostic): Replace calls to the output format's on_begin_diagnostic, to pp_output_formatted_text, printing CWE, rules, option info, and the call to the format's on_end_diagnostic with a call to the format's on_report_diagnostic. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New vfunc, which effectively does the on_begin_diagnostic, the call to pp_output_formatted_text, the calls for printing CWE, rules, option info, and the call to the diagnostic_finalizer. * diagnostic.h (diagnostic_output_format::on_begin_diagnostic): Delete. (diagnostic_output_format::on_end_diagnostic): Delete. (diagnostic_output_format::on_report_diagnostic): New. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New. (class diagnostic_context): Add friend class diagnostic_text_output_format. (diagnostic_context::get_urlifier): New accessor. (diagnostic_context::print_any_cwe): Move decl... (diagnostic_text_output_format::print_any_cwe): ...to here. (diagnostic_context::print_any_rules): Move decl... (diagnostic_text_output_format::print_any_rules): ...to here. (diagnostic_context::print_option_information): Move decl... (diagnostic_text_output_format::print_option_information): ...to here. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
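The shape of this refactoring can be sketched as follows. This is a minimal illustration, not GCC's diagnostics code: every type and member name ending in `_sketch` is hypothetical, and the output is accumulated in a string rather than sent through a pretty-printer. The point is that the context makes a single virtual call and each output format decides what "reporting" means, so text-only decorations such as the CWE suffix live in the text format.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily reduced diagnostic record.
struct diagnostic_sketch {
  std::string message;
  int cwe = 0;  // 0 means "no CWE attached"
};

// Base class: one entry point replaces the begin/print/end sequence.
class output_format_sketch {
public:
  virtual ~output_format_sketch() = default;
  virtual void on_report_diagnostic(const diagnostic_sketch &d) = 0;
  std::string out;  // collected output, for illustration only
};

// The text format owns the printing of CWE information.
class text_format_sketch : public output_format_sketch {
public:
  bool show_cwe = true;
  void on_report_diagnostic(const diagnostic_sketch &d) override {
    out += d.message;
    if (show_cwe && d.cwe != 0)
      out += " [CWE-" + std::to_string(d.cwe) + "]";
    out += "\n";
  }
};

// A structured format never needs the textual decorations at all.
// (No string escaping here; this is only a sketch.)
class json_format_sketch : public output_format_sketch {
public:
  void on_report_diagnostic(const diagnostic_sketch &d) override {
    out += "{\"message\": \"" + d.message + "\"}";
  }
};

// The context no longer knows how a diagnostic is rendered.
struct context_sketch {
  output_format_sketch *format = nullptr;
  void report_diagnostic(const diagnostic_sketch &d) {
    format->on_report_diagnostic(d);
  }
};
```

With this split, a structured format no longer has to switch off textual CWE, rule, and option printing, which matches the "Drop unnecessary calls" entries in the ChangeLog above.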
-
David Malcolm authored
Add test coverage of "%@" in event messages in a multithreaded execution path. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-paths-multithreaded-inline-events.c: Update expected output. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: Likewise. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-separate-events.c: Likewise. * gcc.dg/plugin/diagnostic_plugin_test_paths.c (test_diagnostic_path::add_event_2): Return the id of the added event. (test_diagnostic_path::add_event_2_with_event_id): New. (example_4): Add event IDs to the deadlock messages indicating where the locks were acquired. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
David Malcolm authored
In r15-2354-g4d1f71d49e396c I added the ability to use Python to write tests of SARIF output via a new "run-sarif-pytest" based on "run-gcov-pytest", with a sarif.py support script in testsuite/gcc.dg/sarif-output. This followup patch: (a) removes the limitation of such tests needing to be in testsuite/gcc.dg/sarif-output by moving sarif.py to testsuite/lib and adding logic to add that directory to PYTHONPATH when invoking pytest. (b) uses this to replace fragile regexp-based tests in gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c with Python logic that verifies the structure within the generated JSON, and to add test coverage for SARIF output relating to GCC plugins. gcc/ChangeLog: * diagnostic-format-sarif.cc: Add comments noting that we don't yet capture any diagnostic_metadata::rules associated with a diagnostic. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-metadata-sarif.c: New test, based on diagnostic-test-metadata.c. * gcc.dg/plugin/diagnostic-test-metadata-sarif.py: New script. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c: Replace scan-sarif-file directives with run-sarif-pytest, to run... * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: ...this new test. * gcc.dg/plugin/plugin.exp (plugin_test_list): Add diagnostic-test-metadata-sarif.c. * gcc.dg/sarif-output/sarif.py: Move to... * lib/sarif.py: ...here. * lib/scansarif.exp (run-sarif-pytest): Prepend "lib" to PYTHONPATH before running python scripts. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
David Malcolm authored
Add selftest coverage for %{ and %} in pretty-print.cc No functional change intended. gcc/ChangeLog: * pretty-print.cc (selftest::test_urls): Make static. (selftest::test_urls_from_braces): New. (selftest::test_null_urls): Make static. (selftest::test_urlification): Likewise. (selftest::pretty_print_cc_tests): Call test_urls_from_braces. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
David Malcolm authored
gcc/ChangeLog: * json.h: Fix typo in comment about missing INCLUDE_MEMORY. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
Simon Martin authored
We currently ICE upon the following invalid code, because we don't check that the template parameters in a member class template specialization are correct.

=== cut here ===
template <typename T>
struct x
{
  template <typename U>
  struct y
  {
    typedef T result2;
  };
};

template<>
template<typename U, typename>
struct x<int>::y
{
  typedef double result2;
};

int main()
{
  x<int>::y<int>::result2 xxx2;
}
=== cut here ===

This patch fixes the PR by calling redeclare_class_template. PR c++/115716 gcc/cp/ChangeLog: * pt.cc (maybe_process_partial_specialization): Call redeclare_class_template. gcc/testsuite/ChangeLog: * g++.dg/template/spec42.C: New test. * g++.dg/template/spec43.C: New test.
-
Andi Kleen authored
gcc/ChangeLog: * tree-if-conv.cc: Remove unneeded include from last change.
-
Bernd Edlinger authored
This recent change triggered various bootstrap errors, mostly on x86 targets, because line info advance address entries were output in the wrong section table. The switch to the wrong line table happened in dwarf2out_set_ignored_loc. It must use the same section as the earlier called dwarf2out_switch_text_section. But ft32-elf was also affected, because the assembler choked on something as simple as ".2byte .LM2-.LM1". Fortunately it is able to use native location views; the configure test was just not executed because the ft32 "nop" instruction was missing. gcc/ChangeLog: PR debug/116470 * configure.ac: Add the "nop" instruction for cpu type ft32. * configure: Regenerate. * dwarf2out.cc (dwarf2out_set_ignored_loc): Use the correct line info section.
-
Alexander Monakov authored
Tie together the two functions that ensure tail padding with search_line_ssse3 via the CPP_BUFFER_PADDING macro. libcpp/ChangeLog: * internal.h (CPP_BUFFER_PADDING): New macro; use it ... * charset.cc (_cpp_convert_input): ...here, and ... * files.cc (read_file_guts): ...here, and ... * lex.cc (search_line_ssse3): ...here.
-
Richard Biener authored
The following improves forwprop block reachability, a shortcoming I noticed when debugging PR116460 and which is also noted in the comment. It avoids processing blocks in natural loops determined to be unreachable, thereby making the issue in PR116460 latent. PR tree-optimization/116460 * tree-ssa-forwprop.cc (pass_forwprop::execute): Do not process blocks in unreachable natural loops.
-
Richard Biener authored
SSA forwprop has switch simplification code that removes edges and, as a side effect, releases dominator info. For a followup we want to retain that info, so the following delays removing edges until the end of the pass. As usual we have to deal with edges vanishing due to EH/abnormal pruning, so record edges as basic-block index pairs and remove them only when they are still there. * tree-ssa-forwprop.cc (simplify_gimple_switch_label_vec): Delay removing edges and releasing dominator info, instead record into edges_to_remove vector. (simplify_gimple_switch): Pass through vector of edges to remove. (pass_forwprop::execute): Likewise. Remove queued edges.
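The "queue now, remove later, and only if still present" idea can be sketched on a toy graph. This is not the forwprop implementation: `cfg_sketch` and `flush_queued_removals` are hypothetical names, and edges are modelled as plain (source, destination) index pairs in a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using edge_pair = std::pair<int, int>;  // (source block index, dest block index)

// Toy control-flow graph: just a list of edges between block indices.
struct cfg_sketch {
  std::vector<edge_pair> edges;

  bool has_edge(const edge_pair &e) const {
    return std::find(edges.begin(), edges.end(), e) != edges.end();
  }
  void remove_edge(const edge_pair &e) {
    edges.erase(std::remove(edges.begin(), edges.end(), e), edges.end());
  }
};

// Process the queue at the end of the "pass". An edge that some other
// cleanup already removed in the meantime is silently skipped.
// Returns the number of edges actually removed.
int flush_queued_removals(cfg_sketch &cfg,
                          const std::vector<edge_pair> &edges_to_remove) {
  int removed = 0;
  for (const edge_pair &e : edges_to_remove) {
    if (cfg.has_edge(e)) {
      cfg.remove_edge(e);
      ++removed;
    }
  }
  return removed;
}
```

Recording index pairs rather than pointers to edge objects is what makes the late check safe in this model: a stale pair simply fails the lookup instead of dangling.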
-
Xi Ruoyao authored
After fixing PR116142 some code started to trigger an ICE with -O3 -march=znver4. Per Richard Biener who actually made this fix: "supportable_widening_operation fails at transform time - that's likely because vectorizable_reduction "puns" defs to internal_def" so the check should use STMT_VINFO_REDUC_DEF instead of checking if STMT_VINFO_DEF_TYPE is vect_reduction_def. gcc/ChangeLog: PR tree-optimization/116348 * tree-vect-stmts.cc (supportable_widening_operation): Use STMT_VINFO_REDUC_DEF (x) instead of STMT_VINFO_DEF_TYPE (x) == vect_reduction_def. gcc/testsuite/ChangeLog: PR tree-optimization/116348 * gcc.c-torture/compile/pr116438.c: New test. Co-authored-by:
Richard Biener <rguenther@suse.de>
-
Pan Li authored
This patch adds a strict check for the imm operand of .SAT_ADD matching. Previously there was no type checking for the imm operand, which could result in unexpected IL being caught by the .SAT_ADD pattern. We leverage int_fits_type_p here to make sure the imm operand is an int that fits the result type of the .SAT_ADD. For example: Fits uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, 12); uint8_t sum = .SAT_ADD (a, 12u); uint8_t sum = .SAT_ADD (a, 126u); uint8_t sum = .SAT_ADD (a, 128u); uint8_t sum = .SAT_ADD (a, 228); uint8_t sum = .SAT_ADD (a, 223u); Does not fit uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, -1); uint8_t sum = .SAT_ADD (a, 256u); uint8_t sum = .SAT_ADD (a, 257); The following test suites passed for this patch: * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. gcc/ChangeLog: * match.pd: Add int_fits_type_p check for .SAT_ADD imm operand. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_add_imm-11.c: Adjust test case for imm. * gcc.target/riscv/sat_u_add_imm-12.c: Ditto. * gcc.target/riscv/sat_u_add_imm-15.c: Ditto. * gcc.target/riscv/sat_u_add_imm-16.c: Ditto. * gcc.target/riscv/sat_u_add_imm_type_check-1.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-10.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-11.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-12.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-13.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-14.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-15.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-16.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-17.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-18.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-19.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-2.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-20.c: New test. 
* gcc.target/riscv/sat_u_add_imm_type_check-21.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-22.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-23.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-24.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-25.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-26.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-27.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-28.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-29.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-3.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-30.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-31.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-32.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-33.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-34.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-35.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-36.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-37.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-38.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-39.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-4.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-40.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-41.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-42.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-43.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-44.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-45.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-46.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-47.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-48.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-49.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-5.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-50.c: New test. 
* gcc.target/riscv/sat_u_add_imm_type_check-51.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-52.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-6.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-7.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-8.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-9.c: New test. Signed-off-by:
Pan Li <pan2.li@intel.com>
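The "fits" and "does not fit" cases listed in this commit can be reproduced with a small standalone check. This is only an illustration of the idea behind the int_fits_type_p test, not the match.pd pattern itself; `fits_uint8` and `sat_add_u8` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Does the constant, taken at its mathematical value, fit in uint8_t?
// This mirrors the intent of the type check for the uint8_t case only.
constexpr bool fits_uint8(long long imm) {
  return imm >= 0 && imm <= std::numeric_limits<std::uint8_t>::max();
}

// Branch-style unsigned saturating add: if the wrapped sum is smaller than
// an operand, the addition overflowed and the result saturates to the maximum.
constexpr std::uint8_t sat_add_u8(std::uint8_t a, std::uint8_t b) {
  const std::uint8_t sum = static_cast<std::uint8_t>(a + b);
  return sum < a ? std::numeric_limits<std::uint8_t>::max() : sum;
}

// Constants from the commit message: these fit ...
static_assert(fits_uint8(12) && fits_uint8(126) && fits_uint8(128), "fits");
static_assert(fits_uint8(228) && fits_uint8(223), "fits");
// ... and these do not, so they should not be treated as uint8_t immediates.
static_assert(!fits_uint8(-1) && !fits_uint8(256) && !fits_uint8(257),
              "does not fit");
```

A constant such as 257 would silently wrap to 1 if it were narrowed to uint8_t first, which is why the fit has to be checked before the saturating form is recognised.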
-
Andrew Pinski authored
When expanding a popcount that is compared for equality with 1 (or rather __builtin_stdc_has_single_bit), the wrong mode was being used for the store flags. We were using the mode of the argument to popcount, but since popcount's return value is always int, the mode of the expansion here should have been the mode of the return type rather than the argument. Built and tested on aarch64-linux-gnu with no regressions. Also bootstrapped and tested on x86_64-linux-gnu. PR middle-end/116480 gcc/ChangeLog: * internal-fn.cc (expand_POPCOUNT): Use the correct mode for store flags. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr116480-1.c: New test. * gcc.dg/torture/pr116480-2.c: New test. Signed-off-by:
Andrew Pinski <quic_apinski@quicinc.com>
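The mismatch between argument type and result type is visible at the source level. The sketch below assumes a GCC-compatible compiler, since it uses the `__builtin_popcountll` builtin; `has_single_bit_sketch` is a hypothetical helper and is not how the expander is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The argument is 64 bits wide, but the result of popcount is an int.
static_assert(
    std::is_same<decltype(__builtin_popcountll(0ULL)), int>::value,
    "popcount returns int regardless of the argument width");

// The source-level idiom the commit refers to: "popcount (x) == 1".
bool has_single_bit_sketch(std::uint64_t x) {
  return __builtin_popcountll(x) == 1;
}
```

Because the comparison result is derived from the int-typed popcount value, the argument's width is the wrong reference for the comparison, which is the source-level analogue of the mode mix-up described above.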
-
Haochen Jiang authored
Since BF8 and FP16 have the same number of exponent bits, the type conversion between them is just a cast of the fraction part. We will use a sequence of instructions instead of new instructions to do that. For convenience, intrinsics are also provided. gcc/ChangeLog: * config/i386/avx10_2-512convertintrin.h (_mm512_cvtpbf8_ph): New. (_mm512_mask_cvtpbf8_ph): Ditto. (_mm512_maskz_cvtpbf8_ph): Ditto. * config/i386/avx10_2convertintrin.h (_mm_cvtpbf8_ph): Ditto. (_mm_mask_cvtpbf8_ph): Ditto. (_mm_maskz_cvtpbf8_ph): Ditto. (_mm256_cvtpbf8_ph): Ditto. (_mm256_mask_cvtpbf8_ph): Ditto. (_mm256_maskz_cvtpbf8_ph): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-convert-1.c: Add tests for new intrin.
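On raw bit patterns, the conversion described here reduces to a widening shift. The sketch below rests on a labeled assumption that is not spelled out in the commit message: BF8 is taken to be the E5M2 layout (1 sign bit, 5 exponent bits, 2 fraction bits) and FP16 the IEEE binary16 layout (1, 5, 10). `bf8_bits_to_fp16_bits` is a hypothetical scalar helper, not one of the new intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: BF8 = E5M2, FP16 = E5M10. Sign and exponent occupy the same
// top six bits in both formats, so widening only appends eight zero
// fraction bits.
constexpr std::uint16_t bf8_bits_to_fp16_bits(std::uint8_t bf8) {
  return static_cast<std::uint16_t>(static_cast<std::uint16_t>(bf8) << 8);
}

// 0x3C is 0|01111|00, i.e. 1.0 under the E5M2 assumption;
// 0x3C00 is 1.0 in binary16.
static_assert(bf8_bits_to_fp16_bits(0x3C) == 0x3C00, "1.0 maps to 1.0");
// 0xC0 is 1|10000|00, i.e. -2.0; 0xC000 is -2.0 in binary16.
static_assert(bf8_bits_to_fp16_bits(0xC0) == 0xC000, "-2.0 maps to -2.0");
```

Since no rounding or exponent rebias is involved under this assumption, a short sequence of existing widening and shift instructions is enough, which fits the commit's remark that no new instruction is needed.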
-
Zhang, Jun authored
gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_ssecom_setcc): Mention behavior change on flags. (ix86_expand_sse_comi): Handle AVX10.2 behavior. (ix86_expand_sse_comi_round): Ditto. (ix86_expand_round_builtin): Ditto. (ix86_expand_builtin): Change function call. * config/i386/i386.md (UNSPEC_COMX): New unspec. * config/i386/sse.md (avx10_2_v<unord>comx<ssemodesuffix><round_saeonly_name>): New. (<sse>_<unord>comi<round_saeonly_name>): Add HFmode. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-compare-1.c: New test. Co-authored-by:
Haochen Jiang <haochen.jiang@intel.com> Co-authored-by:
Hongtao Liu <hongtao.liu@intel.com>
-
Zhang, Jun authored
gcc/ChangeLog: * config.gcc: Add avx10_2copyintrin.h. * config/i386/i386.md (avx10_2): New isa attribute. * config/i386/immintrin.h: Include avx10_2copyintrin.h. * config/i386/sse.md (sse_movss_<mode>): Add new constraints to handle AVX10.2. (vec_set<mode>_0): Ditto. (@vec_set<mode>_0): Ditto. (vec_set<mode>_0): Ditto. (avx512fp16_mov<mode>): Ditto. (*vec_set<mode>_0_1): New split. * config/i386/avx10_2copyintrin.h: New file. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-vmovd-1.c: New test. * gcc.target/i386/avx10_2-vmovd-2.c: Ditto. * gcc.target/i386/avx10_2-vmovw-1.c: Ditto. * gcc.target/i386/avx10_2-vmovw-2.c: Ditto.
-
Mo, Zewei authored
gcc/ChangeLog: * config.gcc: Add avx10_2-512minmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V8BF, V8BF, V8BF, INT, V8BF, UQI), (V16BF, V16BF, V16BF, INT, V16BF, UHI), (V32BF, V32BF, V32BF, INT, V32BF, USI), (V8HF, V8HF, V8HF, INT, V8HF, UQI), (V8DF, V8DF, V8DF, INT, V8DF, UQI, INT), (V32HF, V32HF, V32HF, INT, V32HF, USI, INT), (V16HF, V16HF, V16HF, INT, V16HF, UHI, INT), (V16SF, V16SF, V16SF, INT, V16SF, UHI, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle V8BF_FTYPE_V8BF_V8BF_INT_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_INT_V16BF_UHI, V32BF_FTYPE_V32BF_V32BF_INT_V32BF_USI, V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI. (ix86_expand_round_builtin): Handle V8DF_FTYPE_V8DF_V8DF_INT_V8DF_UQI_INT, V32HF_FTYPE_V32HF_V32HF_INT_V32HF_USI_INT, V16HF_FTYPE_V16HF_V16HF_INT_V16HF_UHI_INT, V16SF_FTYPE_V16SF_V16SF_INT_V16SF_UHI_INT. * config/i386/immintrin.h: Include avx10_2-512minmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/sse.md (VFH_AVX10_2): New. (avx10_2_vminmaxnepbf16_<mode><mask_name>): New define_insn. (avx10_2_minmaxp<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_minmaxs<mode><mask_scalar_name><round_saeonly_scalar_name>): Ditto. * config/i386/avx10_2-512minmaxintrin.h: New file. * config/i386/avx10_2minmaxintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx512f-helper.h: Add helper function. * gcc.target/i386/avx10-minmax-helper.h: New helper file. * gcc.target/i386/avx10_2-512-minmax-1.c: New test. * gcc.target/i386/avx10_2-512-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto. 
* gcc.target/i386/avx10_2-minmax-1.c: Ditto. * gcc.target/i386/avx10_2-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto. Co-authored-by:
Hu, Lin1 <lin1.hu@intel.com> Co-authored-by:
Haochen Jiang <haochen.jiang@intel.com>
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2-512satcvtintrin.h: Add new intrin. * config/i386/avx10_2satcvtintrin.h: Ditto. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md (VF1_VF2_AVX10_2): New iterator. (VF2_AVX10_2): Ditto. (VI8_AVX10_2): Ditto. (sat_cvt_sign_prefix): Add new UNSPEC. (UNSPEC_SAT_CVT_DS_SIGN_ITER): New iterator. (pd2dqssuff): Ditto. (avx10_2_vcvtt<castmode>2<sat_cvt_sign_prefix>dqs<mode><mask_name><round_saeonly_name>): New. (avx10_2_vcvttpd2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttps2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttsd2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. (avx10_2_vcvttss2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Add test. * gcc.target/i386/avx10_2-512-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2dqs-2.c: New test. * gcc.target/i386/avx10_2-512-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttsd2sis-2.c: Ditto. 
* gcc.target/i386/avx10_2-vcvttsd2usis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2sis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2usis-2.c: Ditto.
-
Hu, Lin1 authored
gcc/ChangeLog: * config.gcc: Add avx10_2satcvtintrin.h and avx10_2-512satcvtintrin.h. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V8HI, V8BF, V8HI, UQI), (V16HI, V16BF, V16HI, UHI), (V32HI, V32BF, V32HI, USI), (V16SI, V16SF, V16SI, UHI, INT), (V16HI, V16BF, V16HI, UHI, INT), (V32HI, V32BF, V32HI, USI, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI, V16HI_FTYPE_V16BF_V16HI_UHI, V8HI_FTYPE_V8BF_V8HI_UQI. (ix86_expand_round_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI_INT, V16SI_FTYPE_V16SF_V16SI_UHI_INT, V16HI_FTYPE_V16BF_V16HI_UHI_INT. * config/i386/immintrin.h: Include avx10_2satcvtintrin.h and avx10_2-512satcvtintrin.h. * config/i386/sse.md (UNSPEC_CVTNE_BF16_IBS_ITER): New iterator. (sat_cvt_sign_prefix): Ditto. (sat_cvt_trunc_prefix): Ditto. (UNSPEC_CVT_PH_IBS_ITER): Ditto. (UNSPEC_CVTT_PH_IBS_ITER): Ditto. (UNSPEC_CVT_PS_IBS_ITER): Ditto. (UNSPEC_CVTT_PS_IBS_ITER): Ditto. (avx10_2_cvt<sat_cvt_trunc_prefix>nebf162i<sat_cvt_sign_prefix>bs<mode><mask_name>): New define_insn. (avx10_2_cvtph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>): Ditto. (avx10_2_cvttph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_cvtps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>): Ditto. (avx10_2_cvttps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>): Ditto. * config/i386/avx10_2-512satcvtintrin.h: New file. * config/i386/avx10_2satcvtintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx512f-helper.h: Add new test macro. * gcc.target/i386/m512-check.h: Add new type. * gcc.target/i386/avx10_2-512-satcvt-1.c: New test. * gcc.target/i386/avx10_2-512-vcvtnebf162ibs-2.c: Ditto. 
* gcc.target/i386/avx10_2-512-vcvtnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtps2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-vcvtnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto.
-
konglin1 authored
gcc/ChangeLog: * config/i386/avx10_2-512bf16intrin.h: Add new intrinsics. * config/i386/avx10_2bf16intrin.h: Ditto. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE for new type. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle new type. * config/i386/sse.md (vecmemsuffix): Add vector BF mode. (avx10_2_rsqrtpbf16_<mode><mask_name>): New define_insn. (avx10_2_sqrtnepbf16_<mode><mask_name>): Ditto. (avx10_2_rcppbf16_<mode><mask_name>): Ditto. (avx10_2_getexppbf16_<mode><mask_name>): Ditto. (BF16IMMOP): New iterator. (bf16immop): Ditto. (avx10_2_<bf16immop>pbf16_<mode><mask_name>): New define_insn. (avx10_2_fpclasspbf16_<mode><mask_scalar_merge_name>): Ditto. (avx10_2_cmppbf16_<mode><mask_scalar_merge_name>): Ditto. (avx10_2_comsbf16_v8bf): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10-check.h: Add AVX10_SCALAR. * gcc.target/i386/avx10-helper.h: Add helper functions. * gcc.target/i386/avx10_2-512-bf16-1.c: Add new tests. * gcc.target/i386/avx10_2-bf16-1.c: Ditto. * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-vcmppbf16-2.c: New test. * gcc.target/i386/avx10_2-512-vfpclasspbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vgetexppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vgetmantpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrcppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vreducenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrndscalenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrsqrtpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vsqrtnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcmppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcomsbf16-1.c: Ditto. * gcc.target/i386/avx10_2-vcomsbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfpclasspbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vgetexppbf16-2.c: Ditto. 
* gcc.target/i386/avx10_2-vgetmantpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrcppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vreducenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrndscalenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrsqrtpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsqrtnepbf16-2.c: Ditto. Co-authored-by:
Levy Hsu <admin@levyhsu.com>
-
konglin1 authored
gcc/ChangeLog: * config.gcc: Add avx10_2-512bf16intrin.h and avx10_2bf16intrin.h. * config/i386/i386-builtin-types.def : Add new DEF_FUNCTION_TYPE for V32BF_FTYPE_V32BF_V32BF, V16BF_FTYPE_V16BF_V16BF, V8BF_FTYPE_V8BF_V8BF, V8BF_FTYPE_V8BF_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_UHI, V32BF_FTYPE_V32BF_V32BF_USI, V32BF_FTYPE_V32BF_V32BF_V32BF_USI, V8BF_FTYPE_V8BF_V8BF_V8BF_UQI and V16BF_FTYPE_V16BF_V16BF_V16BF_UHI. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle new DEF_FUNCTION_TYPE. * config/i386/immintrin.h: Include avx10_2-512bf16intrin.h and avx10_2bf16intrin.h. * config/i386/sse.md (VBF_AVX10_2): New iterator. (avx10_2_scalefpbf16_<mode><mask_name>): New define_insn. (avx10_2_<code>nepbf16_<mode><mask_name>): Ditto. (avx10_2_<insn>nepbf16_<mode><mask_name>): Ditto. (avx10_2_fmaddnepbf16_<mode>_maskz): New expander. (avx10_2_fnmaddnepbf16_<mode>_maskz): Ditto. (avx10_2_fmsubnepbf16_<mode>_maskz): Ditto. (avx10_2_fnmsubnepbf16_<mode>_maskz): Ditto. (avx10_2_fmaddnepbf16_<mode><sd_maskz_name>): New define_insn. (avx10_2_fmaddnepbf16_<mode>_mask): Ditto. (avx10_2_fmaddnepbf16_<mode>_mask3): Ditto. (avx10_2_fnmaddnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fnmaddnepbf16_<mode>_mask): Ditto. (avx10_2_fnmaddnepbf16_<mode>_mask3): Ditto. (avx10_2_fmsubnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fmsubnepbf16_<mode>_mask): Ditto. (avx10_2_fmsubnepbf16_<mode>_mask3): Ditto. (avx10_2_fnmsubnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fnmsubnepbf16_<mode>_mask): Ditto. (avx10_2_fnmsubnepbf16_<mode>_mask3): Ditto. * config/i386/avx10_2-512bf16intrin.h: New file. * config/i386/avx10_2bf16intrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512f-helper.h: Add MAKE_MASK_MERGE and MAKE_MASK_ZERO for bf16_uw. * gcc.target/i386/m512-check.h: Add union512bf16_uw, union256bf16_uw, union128bf16_uw and CHECK_EXP for them. * gcc.target/i386/avx10-helper.h: New file. 
* gcc.target/i386/avx10_2-512-bf16-1.c: New test. * gcc.target/i386/avx10_2-512-vaddnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vdivnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfnmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfnmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vmaxpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vscalefpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vsubnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-bf16-1.c: Ditto. * gcc.target/i386/avx10_2-vaddnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vdivnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vmaxpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vmulnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vscalefpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsubnepbf16-2.c: Ditto. Co-authored-by:
Levy Hsu <admin@levyhsu.com>
-
Levy Hsu authored
gcc/ChangeLog: * config.gcc: Add avx10_2-512convertintrin.h and avx10_2convertintrin.h. * config/i386/i386-builtin-types.def: Add new DEF_POINTER_TYPE and DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle AVX10.2. (ix86_expand_round_builtin): Ditto. * config/i386/immintrin.h: Include avx10_2-512convertintrin.h, avx10_2convertintrin.h. * config/i386/sse.md (VHF_AVX10_2): New iterator. (bf16_ph): Add 512 bit mode. (avx10_2_cvt2ps2phx_<mode><mask_name<round_name>): New define_insn. (ssebvecmode): New iterator. (UNSPEC_NECONVERTFP8_PACK): Ditto. (neconvertfp8_pack): Ditto. (vcvt<neconvertfp8_pack><mode><mask_name>): New define_insn. (ssebvecmode_2): New iterator. (UNSPEC_VCVTBIASPH2FP8_PACK): Ditto. (biasph2fp8_pack): Ditto. (vcvt<biasph2fp8_pack>v8hf): New expander. (vcvt<biasph2fp8_pack>v8hf_mask): Ditto. (*vcvt<biasph2bf8_pack>v8hf): New define_insn. (*vcvt<biasph2fp8_pack>v8hf_mask): Ditto. (VHF_AVX10_2_2): New iterator. (vcvt<biasph2fp8_pack><mode><mask_name>): New define_insn. (VHF_256_512): New iterator. (ph2fp8suff): Ditto. (UNSPEC_NECONVERTPH2FP8_PACK): Ditto. (neconvertph2fp8): Ditto. (vcvt<neconvertph2fp8>v8hf_mask): New expander. (*vcvt<neconvertph2fp8>v8hf): New define_insn. (*vcvt<neconvertph2fp8>v8hf_mask): Ditto. (vcvt<neconvertph2fp8><mode><mask_name>): Ditto. (vcvthf82ph<mode><mask_name>): Ditto. * config/i386/avx10_2-512convertintrin.h: New file. * config/i386/avx10_2convertintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros for const. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-convert-1.c: New test. * gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto. 
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-convert-1.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2hf8s-2.c: Ditto. * gcc.target/i386/fp8-helper.h: New helper file. Co-authored-by:
Levy Hsu <admin@levyhsu.com> Co-authored-by:
Kong Lingling <lingling.kong@intel.com>
-
Haochen Jiang authored
gcc/ChangeLog: * config/i386/avx10_2-512mediaintrin.h: Add new intrins. * config/i386/avx10_2mediaintrin.h: Ditto. * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-builtins.cc (def_builtin): Handle shared builtins between AVXVNNIINT16 and AVX10.2. * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): Ditto. * config/i386/sse.md (unspec): Add UNSPEC_VDPPHPS. (avx10_2_mpsadbw<mask_name>): New define_insn. (<mask_codefor><sse4_1_avx2>_mpsadbw<mask_name>): Ditto. (vpdp<vpdpwprodtype>_<mode>): Add AVX10_2_256. (vpdp<vpdpwprodtype>_v16si): New define_insn. (vpdp<vpdpwprodtype>_<mode>_mask): Ditto. (*vpdp<vpdpwprodtype>_<mode>_maskz): Ditto. (vpdp<vpdpwprodtype>_<mode>_maskz): New expander. (vdpphps_<mode>): New define_insn. (vdpphps_<mode>_mask): Ditto. (*vdpphps_<mode>_maskz): Ditto. (vdpphps_<mode>_maskz): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/avxvnniint16-1.c: Add new macro test. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-media-1.c: Add test. * gcc.target/i386/avx10_2-media-1.c: Ditto. * gcc.target/i386/avxvnniint16-builtin.c: New test. * gcc.target/i386/avx10_2-512-vdpphps-2.c: Ditto. * gcc.target/i386/avx10_2-512-vmpsadbw-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwsud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwsuds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwusd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwusds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwuud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwuuds-2.c: Ditto. * gcc.target/i386/avx10_2-builtin-2.c: Ditto. * gcc.target/i386/avx10_2-vdpphps-2.c: Ditto. * gcc.target/i386/avx10_2-vmpsadbw-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwuuds-2.c: Ditto. Co-authored-by:
Hongyu Wang <hongyu.wang@intel.com>
-
Hongyu Wang authored
gcc/ChangeLog: * config.gcc: Add avx10_2mediaintrin.h and avx10_2-512mediaintrin.h. * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-builtins.cc (def_builtin): Handle shared builtins between AVXVNNIINT8 and AVX10.2. * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): Ditto. * config/i386/immintrin.h: Include avx10_2mediaintrin.h and avx10_2-512mediaintrin.h. * config/i386/sse.md (VI4_AVX10_2): New. (vpdp<vpdotprodtype>_<mode>): Add AVX10_2_256. (vpdp<vpdotprodtype>_v16si): New define_insn. (vpdp<vpdotprodtype>_<mode>_mask): Ditto. (*vpdp<vpdotprodtype>_<mode>_maskz): Ditto. (vpdp<vpdotprodtype>_<mode>_maskz): New expander. * config/i386/avx10_2-512mediaintrin.h: New file. * config/i386/avx10_2mediaintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512f-helper.h: Reuse AVX512F macros for AVX10. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * lib/target-supports.exp (check_effective_target_avx10_2): New. (check_effective_target_avx10_2_512): Ditto. * gcc.target/i386/avx10-check.h: New test file. * gcc.target/i386/avx10-helper.h: Ditto. * gcc.target/i386/avx10_2-builtin-1.c: Ditto. * gcc.target/i386/avx10_2-512-media-1.c: Ditto. * gcc.target/i386/avx10_2-media-1.c: Ditto. * gcc.target/i386/avxvnniint8-builtin.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbssd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbssds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbsud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbsuds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbuud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbuuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto. Co-authored-by:
Haochen Jiang <haochen.jiang@intel.com>
-
Haochen Jiang authored
After the introduction of AVX10, we still want to use the AVX512 helper functions to avoid duplicating code. To reuse them, we need some refactoring so that each function is defined under the correct ISA, which avoids ABI warnings. gcc/testsuite/ChangeLog: * gcc.target/i386/m512-check.h: Wrap the function define with correct vector size.
-
Pan Li authored
This patch allows an immediate (IMM) as operand 0 of the ussub pattern, i.e. .SAT_SUB(1023, y), as in the example below.

Form 1:
  #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \
  T __attribute__((noinline))             \
  sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)  \
  {                                       \
    return (T)IMM >= y ? (T)IMM - y : 0;  \
  }

  DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023)

Before this patch:
  sat_u_sub_imm82_uint64_t_fmt_1:
    li    a5,82
    bgtu  a0,a5,.L3
    sub   a0,a5,a0
    ret
  .L3:
    li    a0,0
    ret

After this patch:
  sat_u_sub_imm82_uint64_t_fmt_1:
    li    a5,82
    sltu  a4,a5,a0
    addi  a4,a4,-1
    sub   a0,a5,a0
    and   a0,a4,a0
    ret

The following test suite passes with this patch:
1. The rv64gcv full regression test.

gcc/ChangeLog: * config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new func impl to gen xmode rtx reg from operand rtx. (riscv_expand_ussub): Gen xmode reg for operand 1. * config/riscv/riscv.md: Allow const_int for operand 1. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macro. * gcc.target/riscv/sat_u_sub_imm-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-4.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-4.c: New test. Signed-off-by:
Pan Li <pan2.li@intel.com>
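The branch-free sequence in the "after" listing can be read back into C roughly as follows. This is only a sketch for illustration: the function names are hypothetical, and the immediate 82 is taken from the `li a5,82` in the listings rather than from the 1023 used in the macro invocation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical immediate for illustration; 82 matches the "li a5,82"
   in the listings of the commit message. */
#define IMM UINT64_C(82)

/* Branchy form, the shape written in DEF_SAT_U_SUB_IMM_FMT_1. */
uint64_t sat_u_sub_imm_branchy(uint64_t y)
{
    return IMM >= y ? IMM - y : 0;
}

/* Branchless form, one C statement per instruction of the new sequence. */
uint64_t sat_u_sub_imm_branchless(uint64_t y)
{
    uint64_t lt = IMM < y;      /* sltu a4,a5,a0: 1 if IMM < y, else 0 */
    uint64_t mask = lt - 1;     /* addi a4,a4,-1: 0 if IMM < y, else all ones */
    uint64_t diff = IMM - y;    /* sub a0,a5,a0: wraps modulo 2^64 */
    return diff & mask;         /* and a0,a4,a0: zero the wrapped result */
}

/* Both forms agree, including at the boundary y == IMM. */
void sat_u_sub_imm_check(void)
{
    for (uint64_t y = 0; y < 200; y++)
        assert(sat_u_sub_imm_branchy(y) == sat_u_sub_imm_branchless(y));
}
```

The mask trick replaces the conditional branch with a compare and an AND, so the wrapped difference is simply cleared when the subtraction would underflow.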
-
Pan Li authored
This patch adds test cases for the unsigned vector .SAT_TRUNC form 4, i.e.:

Form 4:
  #define DEF_VEC_SAT_U_TRUNC_FMT_4(NT, WT)                             \
  void __attribute__((noinline))                                        \
  vec_sat_u_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        bool not_overflow = in[i] <= (WT)(NT)(-1);                      \
        out[i] = ((NT)in[i]) | (NT)((NT)not_overflow - 1);              \
      }                                                                 \
  }

  DEF_VEC_SAT_U_TRUNC_FMT_4 (uint32_t, uint64_t)

The following test passes with this patch:
* The rv64gcv regression test.

gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-19.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-20.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-21.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-22.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-23.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-24.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-19.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-20.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-21.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-22.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-23.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-24.c: New test. Signed-off-by:
Pan Li <pan2.li@intel.com>
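To see what form 4 computes per element, here is a hand-expanded sketch of the uint64_t to uint32_t instance. The function and check names are hypothetical, and the noinline attribute is dropped for portability; it is not the test file itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form 4 written out for NT = uint32_t, WT = uint64_t. */
void vec_sat_u_trunc_u32_u64(uint32_t *out, const uint64_t *in, unsigned limit)
{
    for (unsigned i = 0; i < limit; i++) {
        /* True when in[i] fits in 32 bits. */
        bool not_overflow = in[i] <= (uint64_t)(uint32_t)(-1);
        /* not_overflow - 1 is 0 when the value fits and all ones when it
           does not, so the OR either keeps the truncated value or
           saturates it to UINT32_MAX. */
        out[i] = ((uint32_t)in[i]) | (uint32_t)((uint32_t)not_overflow - 1);
    }
}

void vec_sat_u_trunc_check(void)
{
    const uint64_t in[3] = { 5, UINT64_C(0xFFFFFFFF), UINT64_C(0x100000000) };
    uint32_t out[3];

    vec_sat_u_trunc_u32_u64(out, in, 3);
    assert(out[0] == 5);            /* fits: kept as-is */
    assert(out[1] == UINT32_MAX);   /* largest value that fits */
    assert(out[2] == UINT32_MAX);   /* overflows: saturated */
}
```

Because the loop body has no branch, it is the kind of shape an auto-vectorizer can match as a saturating truncation, which is what the new tests exercise.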
-
Pan Li authored
This patch adds test cases for the unsigned scalar quad and oct .SAT_TRUNC form 4, i.e.:

Form 4:
  #define DEF_SAT_U_TRUNC_FMT_4(NT, WT)          \
  NT __attribute__((noinline))                   \
  sat_u_trunc_##WT##_to_##NT##_fmt_4 (WT x)      \
  {                                              \
    bool not_overflow = x <= (WT)(NT)(-1);       \
    return ((NT)x) | (NT)((NT)not_overflow - 1); \
  }

The following test passes with this patch:
* The rv64gcv regression test.

gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_trunc-19.c: New test. * gcc.target/riscv/sat_u_trunc-20.c: New test. * gcc.target/riscv/sat_u_trunc-21.c: New test. * gcc.target/riscv/sat_u_trunc-22.c: New test. * gcc.target/riscv/sat_u_trunc-23.c: New test. * gcc.target/riscv/sat_u_trunc-24.c: New test. * gcc.target/riscv/sat_u_trunc-run-19.c: New test. * gcc.target/riscv/sat_u_trunc-run-20.c: New test. * gcc.target/riscv/sat_u_trunc-run-21.c: New test. * gcc.target/riscv/sat_u_trunc-run-22.c: New test. * gcc.target/riscv/sat_u_trunc-run-23.c: New test. * gcc.target/riscv/sat_u_trunc-run-24.c: New test. Signed-off-by:
Pan Li <pan2.li@intel.com>
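As a sketch of how the scalar macro might be instantiated, the block below assumes "quad" and "oct" mean a wide type four or eight times the width of the narrow one; that reading, the chosen type pairs, and the `clamp_u8` reference helper are assumptions, not taken from the test files. The noinline attribute is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEF_SAT_U_TRUNC_FMT_4(NT, WT)          \
NT sat_u_trunc_##WT##_to_##NT##_fmt_4 (WT x)   \
{                                              \
  bool not_overflow = x <= (WT)(NT)(-1);       \
  return ((NT)x) | (NT)((NT)not_overflow - 1); \
}

DEF_SAT_U_TRUNC_FMT_4(uint8_t, uint32_t)  /* assumed "quad": 32 -> 8 bits */
DEF_SAT_U_TRUNC_FMT_4(uint8_t, uint64_t)  /* assumed "oct": 64 -> 8 bits */

/* Hypothetical reference: an ordinary clamp to the uint8_t range. */
static uint8_t clamp_u8(uint64_t x)
{
    return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}

void sat_u_trunc_check(void)
{
    /* Values on both sides of the uint8_t limit. */
    for (uint32_t x = 0; x < 1024; x++) {
        assert(sat_u_trunc_uint32_t_to_uint8_t_fmt_4(x) == clamp_u8(x));
        assert(sat_u_trunc_uint64_t_to_uint8_t_fmt_4(x) == clamp_u8(x));
    }
}
```

Note that for a narrow NT such as uint8_t the subtraction happens in int after integer promotion; the outer (NT) cast is what turns the resulting -1 into an all-ones mask.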
-
GCC Administrator authored
-
- Aug 25, 2024
-
-
demin.han authored
Currently, some binops of a vector with a double scalar under RV32 can't be translated to the vf form and become vfmv+vxx.vv instead. The cause is that vec_duplicate is also expanded to a broadcast for double mode under RV32, and late-combine can't process the expanded broadcast. gcc/ChangeLog: * config/riscv/vector.md: Add !FLOAT_MODE_P constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Fix test. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Ditto.
-
Xianmiao Qu authored
The previous patch: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2b27953c8cd aimed to eliminate redundant MOV instructions by removing the call to emit_clobber in resolve_simple_move in lower-subreg.cc. First, I found that another patch addresses this issue: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf2737cda53a83332db1a1a021653447b05a7e7 and even with the call to emit_clobber kept, the generated instructions are still as expected. Second, removing the CLOBBER expression has side effects. When there is no CLOBBER expression and only SUBREG assignments exist, the logic of the 'df_lr_bb_local_compute' function adds the register to the basic block's LR IN set. This causes the register's lifetime to span the entire function, which increases register pressure. Taking the newly added test case 'gcc/testsuite/gcc.target/riscv/pr43644.c' as an example, removing the CLOBBER expression leads to spills of some registers. gcc/: * lower-subreg.cc (resolve_simple_move): Re-add calling emit_clobber immediately before moving a multi-word register by parts. gcc/testsuite/: * gcc.target/riscv/pr43644.c: New test case.
-
Dimitar Dimitrov authored
The test case uses "atomic<int>", which fails to link on pru-unknown-elf target due to missing __atomic_load_4 symbol. Fix by filtering for sync_int_long effective target. Ensured that the test still passes for x86_64-pc-linux-gnu. gcc/testsuite/ChangeLog: * g++.dg/init/array54.C: Require sync_int_long effective target. Signed-off-by:
Dimitar Dimitrov <dimitar@dinux.eu>
-
Andi Kleen authored
The gimple-if-to-switch pass converts if statements with multiple equality checks on the same value to a switch. This breaks vectorization, which cannot handle switches. Teach the tree-if-conv pass used by the vectorizer to handle simple switch statements, like those created by if-to-switch earlier. These are switches that have only a single non-default block. They are handled similarly to COND in if conversion. This makes the vect-bitfield-read-1-not test fail. The test checks for a bitfield analysis failing, but it actually relied on ifcvt erroring out early because the test uses a switch. The if conversion still does not work because the switch is not in a form that this patch can handle, but it fails much later and the bitfield analysis succeeds, which makes the test fail. I marked it xfail because it doesn't seem to be testing what it wants to test. PR tree-optimization/115866 gcc/ChangeLog: * tree-if-conv.cc (if_convertible_switch_p): New function. (if_convertible_stmt_p): Check for switch. (get_loop_body_in_if_conv_order): Handle switch. (predicate_bbs): Likewise. (predicate_statements): Likewise. (remove_conditions_and_labels): Likewise. (ifcvt_split_critical_edges): Likewise. (ifcvt_local_dce): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-switch-ifcvt-1.c: New test. * gcc.dg/vect/vect-switch-ifcvt-2.c: New test. * gcc.dg/vect/vect-switch-search-line-fast.c: New test. * gcc.dg/vect/vect-bitfield-read-1-not.c: Change to xfail.
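A source-level sketch of the kind of loop this concerns: a switch with a single non-default block, and a predicated equivalent with no control flow in the body. Both functions and their names are hypothetical illustrations of the shape; they are not taken from the new tests and do not show the GIMPLE the pass actually produces.

```c
#include <assert.h>

/* Loop body is a switch with one non-default block, the shape that
   if-to-switch produces from a chain of equality tests on one value. */
void scale_switch(const int *a, int *out, int n)
{
    for (int i = 0; i < n; i++) {
        switch (a[i]) {
        case 1:
        case 3:
        case 5:
            out[i] = a[i] * 2;
            break;
        default:
            out[i] = a[i];
            break;
        }
    }
}

/* Predicated form: the case labels become one predicate and the two
   blocks become a select, similar to how a COND is if-converted. */
void scale_predicated(const int *a, int *out, int n)
{
    for (int i = 0; i < n; i++) {
        int x = a[i];
        int match = (x == 1) | (x == 3) | (x == 5);
        out[i] = match ? x * 2 : x;
    }
}

void scale_check(void)
{
    const int a[6] = { 0, 1, 2, 3, 5, 7 };
    int s[6], p[6];

    scale_switch(a, s, 6);
    scale_predicated(a, p, 6);
    for (int i = 0; i < 6; i++)
        assert(s[i] == p[i]);
}
```

Once the body is a straight-line select, the loop has a single basic block, which is the form the vectorizer needs.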
-
Mark Harmstone authored
Write CodeView S_LDATA32 symbols for static locals in optimized code. We have to handle these separately, as they come after the S_FRAMEPROC, plus you can't have S_BLOCK32 symbols like you can in unoptimized code. gcc/ * dwarf2codeview.cc (write_optimized_static_local_vars): New function. (write_function): Call write_optimized_static_local_vars.
-