- Aug 16, 2023
-
-
liuhongt authored
For more details of GDS (Gather Data Sampling), refer to https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/gather-data-sampling.html After the microcode update, there is a performance regression. To avoid that, the patch disables gather generation in autovectorization and uses scalar emulation of gathers instead. gcc/ChangeLog: * config/i386/i386-options.cc (m_GDS): New macro. * config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Don't enable for m_GDS. (X86_TUNE_USE_GATHER_4PARTS): Ditto. (X86_TUNE_USE_GATHER): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx2-gather-2.c: Adjust options to keep gather vectorization. * gcc.target/i386/avx2-gather-6.c: Ditto. * gcc.target/i386/avx512f-pr88464-1.c: Ditto. * gcc.target/i386/avx512f-pr88464-5.c: Ditto. * gcc.target/i386/avx512vl-pr88464-1.c: Ditto. * gcc.target/i386/avx512vl-pr88464-11.c: Ditto. * gcc.target/i386/avx512vl-pr88464-3.c: Ditto. * gcc.target/i386/avx512vl-pr88464-9.c: Ditto. * gcc.target/i386/pr88531-1b.c: Ditto. * gcc.target/i386/pr88531-1c.c: Ditto.
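For readers unfamiliar with the access pattern involved, here is a minimal sketch of a loop with an indirect (indexed) load, which is the kind of loop a vectorizer can implement with a gather instruction. The function name `gather_copy` is hypothetical and is not taken from the testsuite files listed above; whether a given compiler emits a gather or element-wise scalar loads for it depends on the target options, tuning, and cost model.

```c
#include <stddef.h>

/* Hypothetical example: copy elements of `data` selected through an
   index array.  The indirect load data[idx[i]] is the access that a
   vectorizer may turn into a gather instruction, or, when gathers are
   disabled by tuning, into a sequence of scalar loads.  */
void gather_copy(double *out, const double *data, const int *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = data[idx[i]];
}
```

The result is the same either way; only the instruction selection differs.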
-
liuhongt authored
vmovapd can enable register renaming and has the same code size as vmovsd. Similarly for vmovsh vs. vmovaps: vmovaps is 1 byte shorter than vmovsh. When TARGET_AVX512VL is not available, still generate vmovsd/vmovss/vmovsh to avoid vmovapd/vmovaps zmm16-31. gcc/ChangeLog: * config/i386/i386.md (movdf_internal): Generate vmovapd instead of vmovsd when moving DFmode between SSE_REGS. (movhi_internal): Generate vmovdqa instead of vmovsh when moving HImode between SSE_REGS. (mov<mode>_internal): Use vmovaps instead of vmovsh when moving HF/BFmode between SSE_REGS. gcc/testsuite/ChangeLog: * gcc.target/i386/pr89229-4a.c: Adjust testcase.
-
GCC Administrator authored
-
- Aug 15, 2023
-
-
David Faust authored
This define_insn is never used, since a sign-extend to the same mode is just a move, so delete it. gcc/ * config/bpf/bpf.md (extendsisi2): Delete useless define_insn.
-
David Faust authored
In the BPF pseudo-c assembly dialect, registers treated as 32-bits rather than the full 64 in various instructions ought to be printed as "wN" rather than "rN". But bpf_print_register () was only doing this for specifically SImode registers, meaning smaller modes were printed incorrectly. This caused assembler errors like: Error: unrecognized instruction `w2 =(s8)r1' for a 32-bit sign-extending register move instruction, where the source register is used in QImode. Fix bpf_print_register () to print the "w" version of register when specified by the template for any mode 32-bits or smaller. PR target/111029 gcc/ * config/bpf/bpf.cc (bpf_print_register): Print 'w' registers for any mode 32-bits or smaller, not just SImode. gcc/testsuite/ * gcc.target/bpf/smov-2.c: New test. * gcc.target/bpf/smov-pseudoc-2.c: New test.
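The naming rule described above can be sketched as follows. This is an illustration only, not the actual bpf_print_register () code: the function `format_bpf_reg` and its parameters (`mode_bits` for the width of the operand's mode, `want_w` for "the template asked for the 32-bit name") are hypothetical names introduced here.

```c
#include <stdio.h>

/* Hypothetical sketch of the register-naming rule: use the "wN" name
   when the template requests it and the operand's mode is 32 bits or
   narrower (not only exactly 32 bits); otherwise use "rN".
   Returns the value of snprintf.  */
int format_bpf_reg(char *buf, size_t len, int regno,
                   unsigned mode_bits, int want_w)
{
    char prefix = (want_w && mode_bits <= 32) ? 'w' : 'r';
    return snprintf(buf, len, "%c%d", prefix, regno);
}
```

Under this rule an 8-bit (QImode-sized) source register requested in its "w" form is printed as `w1` rather than `r1`.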
-
Martin Jambor authored
PRs 68930 and 92497 show that when IPA-CP figures out constants in aggregate parameters or when passed by reference but the loads happen in an inlined function the information is lost. This happens even when the inlined function itself was known to have - or even cloned to have - such constants in incoming parameters because the transform phase of IPA passes is not run on them. See discussion in the bugs for reasons why. Honza suggested that we can plug the results of IPA-CP analysis into value numbering, so that FRE can figure out that some loads fetch known constants. This is what this patch attempts to do. The patch does not attempt to populate partial_defs with information from IPA-CP, this can be hopefully added as a follow-up. gcc/ChangeLog: 2023-08-11 Martin Jambor <mjambor@suse.cz> PR ipa/68930 PR ipa/92497 * ipa-prop.h (ipcp_get_aggregate_const): Declare. * ipa-prop.cc (ipcp_get_aggregate_const): New function. (ipcp_transform_function): Do not deallocate transformation info. * tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and ipa-prop.h. (vn_reference_lookup_2): When hitting default-def vuse, query IPA-CP transformation info for any known constants. gcc/testsuite/ChangeLog: 2023-06-07 Martin Jambor <mjambor@suse.cz> PR ipa/68930 PR ipa/92497 * gcc.dg/ipa/pr92497-1.c: New test. * gcc.dg/ipa/pr92497-2.c: Likewise.
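A minimal sketch of the situation described above, assuming a shape similar to (but not copied from) the new pr92497 tests: constants are stored in an aggregate, the aggregate is passed by reference, and the loads happen in a function that gets inlined into the callee. All names here are hypothetical, and whether the loads are actually folded depends on the GCC version and optimization options.

```c
/* Hypothetical aggregate whose fields are known constants at the
   only call site.  */
struct config { int scale; int offset; };

/* The loads of c->scale and c->offset live here; after inlining into
   process() they are the loads that value numbering could resolve
   from IPA-CP's aggregate constants.  */
static inline int apply(const struct config *c, int x)
{
    return x * c->scale + c->offset;
}

/* Kept out of line so that the constants reach it only through the
   by-reference parameter.  */
__attribute__((noinline)) static int process(const struct config *c, int x)
{
    return apply(c, x);
}

int run(int x)
{
    struct config c = { 2, 5 };
    return process(&c, x);
}
```

The program computes the same value with or without the optimization; the difference is whether `process` still loads `scale` and `offset` from memory.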
-
Iain Buclaw authored
This ICE is specific to the D front-end language version in GDC 12, however a test has been added to mainline to catch the unlikely event of a regression. PR d/110959 gcc/testsuite/ChangeLog: * gdc.dg/pr110959.d: New test.
-
Martin Jambor authored
This patch addresses an issue uncovered by the undefined behavior sanitizer. In function resolve_structure_cons in resolve.cc there is a test starting with: if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl && comp->ts.u.cl->length && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of integer value 1818451807, which is outside the value range of the expr_t enum. If I understand the code correctly, the entire load was unwanted because comp->ts.type in those cases is BT_CLASS and not BT_CHARACTER. This patch simply adds a check to make sure it is only accessed when comp->ts.type is BT_CHARACTER. During review, Harald Anlauf noticed that length types also need to be checked, so I also added the checks that he suggested to the condition. Co-authored-by: Harald Anlauf <anlauf@gmx.de> gcc/fortran/ChangeLog: 2023-08-14 Martin Jambor <mjambor@suse.cz> PR fortran/110677 * resolve.cc (resolve_structure_cons): Check comp->ts is character type before accessing stuff through comp->ts.u.cl.
-
Chung-Lin Tang authored
This patch implements the OpenACC 2.7 addition of default(none|present) support for data constructs. Now, specifying "default(none|present)" on a data construct turns on same default clause behavior for all lexically enclosed compute constructs (which don't already themselves have a default clause). gcc/c/ChangeLog: * c-parser.cc (OACC_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DEFAULT. gcc/cp/ChangeLog: * parser.cc (OACC_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DEFAULT. gcc/fortran/ChangeLog: * openmp.cc (OACC_DATA_CLAUSES): Add OMP_CLAUSE_DEFAULT. gcc/ChangeLog: * gimplify.cc (oacc_region_type_name): New function. (oacc_default_clause): If no 'default' clause appears on this compute construct, see if one appears on a lexically containing 'data' construct. (gimplify_scan_omp_clauses): Upon OMP_CLAUSE_DEFAULT case, set ctx->oacc_default_clause_ctx to current context. gcc/testsuite/ChangeLog: * c-c++-common/goacc/default-3.c: Adjust testcase. * c-c++-common/goacc/default-4.c: Adjust testcase. * c-c++-common/goacc/default-5.c: Adjust testcase. * gfortran.dg/goacc/default-3.f95: Adjust testcase. * gfortran.dg/goacc/default-4.f: Adjust testcase. * gfortran.dg/goacc/default-5.f: Adjust testcase. Co-authored-by:
Thomas Schwinge <thomas@codesourcery.com>
-
Juzhe-Zhong authored
An incorrect configuration of the autovec_length_operand predicate was discovered in PR110989 in the following situation: vect__6.24_107 = .MASK_LEN_LOAD (vectp.22_105, 32B, mask__49.21_99, POLY_INT_CST [2, 2], 0); ---> dummy length = VF. The current autovec length operand fails to recognize the VF dummy length. With -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=scalable -Ofast -fno-schedule-insns -fno-schedule-insns2: Before this patch: srli a4,s0,2 addi a4,a4,-3 srli s0,s0,3 vsetvli a5,zero,e64,m1,ta,ma vid.v v1 vmul.vx v1,v1,a4 addi a4,s0,-2 vadd.vx v1,v1,a4 addi a4,s0,-1 vslide1up.vx v2,v1,a4 vmv.v.x v1,a4 vand.vv v1,v2,v1 vl1re64.v v3,0(t2) vrgather.vv v2,v3,v1 vmv.v.i v1,0 vmfeq.vv v0,v2,v1 vsetvli zero,s0,e32,mf2,ta,ma ---> s0 = POLY (2,2) vle32.v v3,0(t3),v0.t vsetvli a5,zero,e64,m1,ta,ma vmfne.vv v0,v2,v1 vsetvli zero,zero,e32,mf2,ta,ma vfwcvt.f.x.v v1,v3 vsetvli zero,zero,e64,m1,ta,ma vmerge.vvm v1,v1,v2,v0 vslidedown.vx v1,v1,a4 vfmv.f.s fa5,v1 j .L6 After this patch: srli a4,s0,2 addi a4,a4,-3 srli s0,s0,3 vsetvli a5,zero,e64,m1,ta,ma vid.v v1 vmul.vx v1,v1,a4 addi a4,s0,-2 vadd.vx v1,v1,a4 addi s0,s0,-1 vslide1up.vx v2,v1,s0 vmv.v.x v1,s0 vand.vv v1,v2,v1 vl1re64.v v3,0(t2) vrgather.vv v2,v3,v1 vmv.v.i v1,0 vmfeq.vv v0,v2,v1 vle32.v v3,0(t3),v0.t vmfne.vv v0,v2,v1 vsetvli zero,zero,e32,mf2,ta,ma vfwcvt.f.x.v v1,v3 vsetvli zero,zero,e64,m1,ta,ma vmerge.vvm v1,v1,v2,v0 vslidedown.vx v1,v1,s0 vfmv.f.s fa5,v1 j .L6 Two vsetvli insns are eliminated. gcc/ChangeLog: PR target/110989 * config/riscv/predicates.md: Fix predicate. gcc/testsuite/ChangeLog: PR target/110989 * gcc.target/riscv/rvv/autovec/pr110989.c: Add vsetvli assembly check.
-
Richard Biener authored
The following moves CONSTRUCTOR handling into the generic BB vectorization roots handling, removing a special case and finally renaming the function now consisting of more than just constructor detection. * tree-vect-slp.cc (vect_analyze_slp_instance): Remove slp_inst_kind_ctor handling. (vect_analyze_slp): Simplify. (vect_build_slp_instance): Dump when we analyze a CTOR. (vect_slp_check_for_constructors): Rename to ... (vect_slp_check_for_roots): ... this. Register a slp_root for CONSTRUCTORs instead of shoving them to the set of grouped stores. (vect_slp_analyze_bb_1): Adjust.
-
Richard Biener authored
The following supports vectorizing BB reductions involving a constant or an invariant. * tree-vectorizer.h (_slp_instance::remain_stmts): Change to ... (_slp_instance::remain_defs): ... this. (SLP_INSTANCE_REMAIN_STMTS): Rename to ... (SLP_INSTANCE_REMAIN_DEFS): ... this. (slp_root::remain): New. (slp_root::slp_root): Adjust. * tree-vect-slp.cc (vect_free_slp_instance): Adjust. (vect_build_slp_instance): Get extra remain parameter, adjust former handling of a cut off stmt. (vect_analyze_slp_instance): Adjust. (vect_analyze_slp): Likewise. (_bb_vec_info::~_bb_vec_info): Likewise. (vectorizable_bb_reduc_epilogue): Dump something if we fail. (vect_slp_check_for_constructors): Handle non-internal defs as remain defs of a reduction. (vectorize_slp_instance_root_stmt): Adjust. * gcc.dg/vect/bb-slp-75.c: New testcase.
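As a rough illustration of what a basic-block reduction "involving a constant or an invariant" can look like in source form, consider the sketch below. It is an assumption about the general shape, not a copy of bb-slp-75.c, and both function names are hypothetical: four adjacent loads are summed together with one extra operand that is not a load from the same group.

```c
/* Hypothetical example: the four array elements form the vectorizable
   part of the reduction; the constant 7 is the leftover operand that
   has to be added separately.  */
int sum4_plus_const(const int *a)
{
    return a[0] + a[1] + a[2] + a[3] + 7;
}

/* Same idea with an invariant (a value defined outside the summed
   group) instead of a constant.  */
int sum4_plus_invariant(const int *a, int bias)
{
    return a[0] + a[1] + a[2] + a[3] + bias;
}
```

Whether either function is actually vectorized depends on the target and the cost model.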
-
Richard Biener authored
The following uses the common find_loop_location as implemented by the vectorizer to query a loop location also for unrolling. That results in a more consistent reporting of locations. * tree-ssa-loop-ivcanon.cc: Include tree-vectorizer.h (canonicalize_loop_induction_variables): Use find_loop_location.
-
Hans-Peter Nilsson authored
While there's another patch that fixes the immediate error in the PR by other means, the include of tree.h here is something I prefer to avoid. PR bootstrap/111021 * config/cris/cris-protos.h: Revert recent change. * config/cris/cris.cc (cris_legitimate_address_p): Remove code_helper unused parameter. (cris_legitimate_address_p_hook): New wrapper function. (TARGET_LEGITIMATE_ADDRESS_P): Change to cris_legitimate_address_p_hook.
-
Richard Biener authored
The following adjusts the heuristic when we perform PHI insertion during GIMPLE PRE from requiring at least one edge that is supposed to be optimized for speed to also doing insertion when the expression is available on all edges (but possibly with different value) and we'd at most have one copy from a constant. The first ensures we optimize two computations on all paths to one plus a possible copy due to the PHI, the second makes sure we do not need to insert many possibly large copies from constants, disregarding the cumulative size cost of the register copies when they are not coalesced. The case in the testcase is <bb 5> _14 = h; if (_14 == 0B) goto <bb 7>; else goto <bb 6>; <bb 6> h = 0B; <bb 7> h.6_12 = h; and we want to optimize that to <bb 7> # h.6_12 = PHI <_14(5), 0B(6)> If we want to consider the cost of the register copies I think the only simplistic enough way would be to restrict the special-case to two incoming edges - we'd assume one register copy is coalesced leaving one copy from a register or from a constant. As with every optimization the downstream effects are probably bigger than what we can locally estimate. PR tree-optimization/110963 * tree-ssa-pre.cc (do_pre_regular_insertion): Also insert a PHI node when the expression is available on all edges and we insert at most one copy from a constant. * gcc.dg/tree-ssa/ssa-pre-34.c: New testcase.
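A C source shape that could plausibly lower to GIMPLE like the fragment quoted above is sketched here. It is a guess for illustration, not the contents of ssa-pre-34.c, and the function name is hypothetical. The second load of `h` is available on both incoming paths: as the value already loaded when the branch is not taken, and as the constant null pointer when the store happens, so it can be replaced by a PHI of those two.

```c
void *h;   /* global pointer, named as in the GIMPLE above */

/* Hypothetical example: load h, conditionally clear it, then load it
   again.  The final load can be replaced by a PHI of the first load
   and the constant 0.  */
void *clear_and_reload(void)
{
    if (h != 0)
        h = 0;
    return h;
}
```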
-
Richard Biener authored
The following testcase shows that we are bad at identifying inductions that will be optimized away after vectorizing them because SCEV doesn't handle vectorized defs. The following rolls a simpler identification of SSA cycles covering a PHI and an assignment with a binary operator with a constant second operand. PR tree-optimization/110991 * tree-ssa-loop-ivcanon.cc (constant_after_peeling): Handle VIEW_CONVERT_EXPR <op>, handle more simple IV-like SSA cycles that will end up constant. * gcc.dg/tree-ssa/cunroll-16.c: New testcase.
-
Kewen Lin authored
Commit r14-3093 introduced a random build failure when building build/gencondmd.cc. Since r14-3093 makes recog.h include tree.h, which in turn includes (depends on) some files that are generated during the build, such as all-tree.def and tree-check.h, building build/gencondmd.cc can fail if these dependencies are not ready. This patch teaches the Makefile about this dependency. Thanks to Jan-Benedict Glaw for testing this! PR bootstrap/111021 gcc/ChangeLog: * Makefile.in (RECOG_H): Add $(TREE_H) as dependence.
-
Kewen Lin authored
Following Richi's suggestion [1], this patch moves the handling of VMAT_LOAD_STORE_LANES in the final loop nest of function vectorizable_load to its own loop. Basically it duplicates the final loop nest, cleans up some useless setup code for the VMAT_LOAD_STORE_LANES case, and removes some unreachable code. It also removes the corresponding handling from the final loop nest. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623329.html gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Move the handlings on VMAT_LOAD_STORE_LANES in the final loop nest to its own loop, and update the final nest accordingly.
-
Kewen Lin authored
In function vectorizable_load, there is one hunk dedicated to handling VMAT_INVARIANT that returns early, which means we shouldn't encounter any cases with memory_access_type VMAT_INVARIANT in the code after it. This patch cleans up several useless checks on VMAT_INVARIANT. There should be no functional changes. gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Remove some useless checks on VMAT_INVARIANT.
-
Pan Li authored
In some cases, like gcc/testsuite/gcc.dg/pr78148.c on RISC-V, the insn that create_pre_exit applies SET_SRC to has only 1 operand. For example: (insn 13 9 14 2 (clobber (reg/i:TI 10 a0)) "gcc/testsuite/gcc.dg/pr78148.c":24:1 -1 (expr_list:REG_UNUSED (reg/i:TI 10 a0) (nil))) Unfortunately, SET_SRC requires at least 2 operands, so this results in a segmentation fault. As for the SH4 part that results in the segmentation fault, it looks like it is only valid when the return_copy_pat is a load or something like that. Thus, this patch tries to fix it by restricting the use of SET_SRC to SET insns. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * mode-switching.cc (create_pre_exit): Add SET insn check. gcc/testsuite/ChangeLog: * gcc.target/riscv/mode-switch-ice-1.c: New test.
-
Pan Li authored
Update in v2: 1. Remove the template of vfrec7 frm class. 2. Update the vfrec7_frm_obj declaration. Original logs: This patch would like to support the rounding mode API for the VFREC7 as the below samples. * __riscv_vfrec7_v_f32m1_rm * __riscv_vfrec7_v_f32m1_rm_m Signed-off-by:
Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfrec7_frm): New class for frm. (vfrec7_frm_obj): New declaration. (BASE): Ditto. * config/riscv/riscv-vector-builtins-bases.h: Ditto. * config/riscv/riscv-vector-builtins-functions.def (vfrec7_frm): New intrinsic function definition. * config/riscv/vector-iterators.md (VFMISC): Remove VFREC7. (misc_op): Ditto. (float_insn_type): Ditto. (VFMISC_FRM): New int iterator. (misc_frm_op): New op for frm. (float_frm_insn_type): New type for frm. * config/riscv/vector.md (@pred_<misc_frm_op><mode>): New pattern for misc frm. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-rec7.c: New test.
-
GCC Administrator authored
-
- Aug 14, 2023
-
-
Vladimir N. Makarov authored
Previous patch setting up asserts for processing stack pointer reloads caught an error in code moving sp offset. This resulted in failure of building aarch64 port. The code wrongly processed insns beyond the output reloads of the current insn. This patch fixes it. gcc/ChangeLog: * lra-constraints.cc (curr_insn_transform): Process output stack pointer reloads before emitting reload insns.
-
Mikael Morin authored
Use distinct error codes, so that we can spot directly from the testsuite log which case is failing. gcc/testsuite/ChangeLog: * gfortran.dg/value_9.f90 (val, val4, sub, sub4): Take the error codes from the arguments. (p): Update calls: pass explicit distinct error codes.
-
Mikael Morin authored
Revision r14-2171-g8736d6b14a4dfdfb58c80ccd398981b0fb5d00aa changed the argument passing convention for length 1 value dummy arguments to pass just the single character by value. However, the procedure declarations weren't updated to reflect the change in the argument types. This change does the missing argument type update. The change of argument types generated an internal error in gfc_conv_string_parameter with value_9.f90. Indeed, that function is not prepared for bare character type, so it is updated as well. The condition guarding the single character argument passing code is loosened to not exclude non-interoperable kind (this fixes a regression with c_char_tests_2.f03). Finally, the constant string argument passing code is updated as well to extract the single char and pass it instead of passing it as a length one string. As the code taking care of non-constant arguments was already doing this, the condition guarding it is just removed. With these changes, value_9.f90 passes on 32 bits big-endian powerpc. PR fortran/110360 PR fortran/110419 gcc/fortran/ChangeLog: * trans-types.cc (gfc_sym_type): Use a bare character type for length one value character dummy arguments. * trans-expr.cc (gfc_conv_string_parameter): Handle single character case. (gfc_conv_procedure_call): Don't exclude interoperable kinds from single character handling. For single character dummy arguments, extend the existing handling of non-constant expressions to constant expressions. gcc/testsuite/ChangeLog: * gfortran.dg/bind_c_usage_13.f03: Update tree dump patterns.
-
Mikael Morin authored
Introduce a new predicate to simplify conditionals checking for a character type whose length is the constant one. gcc/fortran/ChangeLog: * gfortran.h (gfc_length_one_character_type_p): New inline function. * check.cc (is_c_interoperable): Use gfc_length_one_character_type_p. * decl.cc (verify_bind_c_sym): Same. * trans-expr.cc (gfc_conv_procedure_call): Same.
-
benjamin priour authored
This patch introduces -fanalyzer-show-events-in-system-headers, disabled by default. This option reduces the noise of the analyzer emitted diagnostics when dealing with system headers. The new option only affects the display of the diagnostics, but doesn't hinder the actual analysis. Given a diagnostics path diving into a system header in the form [ prefix events..., system header call, system header entry, events within system headers..., system header return, suffix events... ] then disabling the option (either by default or explicitly) will shorten the path into: [ prefix events..., system header call, system header return, suffix events... ] Signed-off-by:
benjamin priour <priour.be@gmail.com> gcc/analyzer/ChangeLog: PR analyzer/110543 * analyzer.opt: Add new option. * diagnostic-manager.cc (diagnostic_manager::prune_path): Call prune_system_headers. (prune_frame): New function that deletes all events in a frame. (diagnostic_manager::prune_system_headers): New function. * diagnostic-manager.h: Add prune_system_headers declaration. gcc/ChangeLog: PR analyzer/110543 * doc/invoke.texi: Add documentation of fanalyzer-show-events-in-system-headers gcc/testsuite/ChangeLog: PR analyzer/110543 * g++.dg/analyzer/fanalyzer-show-events-in-system-headers-default.C: New test. * g++.dg/analyzer/fanalyzer-show-events-in-system-headers-no.C: New test. * g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C: New test.
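The path shortening described above can be sketched as a simple filter over the event list. This is an illustration only and does not reflect the real diagnostic_manager data structures: the `enum ev_kind` tags and the function `prune_system_header_events` are hypothetical names introduced here.

```c
#include <stddef.h>

/* Hypothetical event tags mirroring the path shape in the commit
   message.  */
enum ev_kind {
    EV_USER,        /* prefix or suffix event in user code */
    EV_SYS_CALL,    /* call into a system header */
    EV_SYS_ENTRY,   /* entry of the system header function */
    EV_SYS_INNER,   /* event within system headers */
    EV_SYS_RETURN   /* return from the system header */
};

/* Drop the entry event and all events within system headers in place,
   keeping the call and return events.  Returns the new path length.  */
size_t prune_system_header_events(enum ev_kind *path, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++)
        if (path[i] != EV_SYS_ENTRY && path[i] != EV_SYS_INNER)
            path[out++] = path[i];
    return out;
}
```

Only the displayed path is shortened this way; as the commit message notes, the analysis itself is unaffected.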
-
gnaggnoyil authored
DR 2386 updated the tuple_size requirements for structured binding and it now requires tuple_size to be considered only if std::tuple_size<TYPE> names a complete class type with member value. GCC before this patch does not follow the updated requirements, and this patch is intended to implement them. (jason) Accepting pseudonym sign-off because a change this small is not legally significant for copyright. DR 2386 PR c++/110216 gcc/cp/ChangeLog: * decl.cc (get_tuple_size): Update implementation for DR 2386. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/decomp10.C: Update expected error for DR 2386. * g++.dg/cpp1z/pr110216.C: New test. Signed-off-by: gnaggnoyil <gnaggnoyil@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
-
Jason Merrill authored
Since -fconcepts no longer implies -fconcepts-ts, we shouldn't advertise TS support with __cpp_concepts=201507L. Also fix one case where -std=c++14 -fconcepts wasn't working (as found by range-v3 calendar). Fixing other cases is not a priority, probably better to reject that flag combination if there are further issues. gcc/c-family/ChangeLog: * c-cppbuiltin.cc (c_cpp_builtins): Adjust __cpp_concepts. gcc/cp/ChangeLog: * parser.cc (cp_parser_simple_type_specifier): Handle -std=c++14 -fconcepts.
-
Paul Dreik authored
If abs(__v) is smaller than one, the result will be of the form 0.xxxxx. It is only if the magnitude is large that more digits are needed before the decimal dot. This uses frexp instead of log10 which should be less expensive and have sufficient precision for the desired purpose. It removes the problematic cases where log10 will be negative or not fit in an int. Signed-off-by:
Paul Dreik <gccpatches@pauldreik.se> libstdc++-v3/ChangeLog: PR libstdc++/110860 * include/std/format (__formatter_fp::format): Use frexp instead of log10.
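The idea of bounding the number of digits before the decimal point from the binary exponent can be sketched in C as below. This is not the libstdc++ code: the function name is hypothetical, finite input is assumed, and the rational 4004/13301 is an assumption chosen here as an approximation slightly above log10(2) so that the estimate never comes out too small.

```c
#include <math.h>

/* Hypothetical sketch: upper bound on the number of decimal digits
   before the decimal point of a finite value v, derived from the
   binary exponent instead of log10.  */
unsigned estimate_integer_digits(double v)
{
    int exp2 = 0;
    frexp(v, &exp2);            /* |v| = m * 2^exp2 with 0.5 <= m < 1 */
    if (exp2 <= 0)
        return 1;               /* |v| < 1 prints as 0.xxxxx */
    /* |v| < 2^exp2, so it has at most floor(exp2 * log10(2)) + 1
       integer digits; 4004/13301 slightly overestimates log10(2).  */
    return 1u + (unsigned)exp2 * 4004u / 13301u;
}
```

The estimate may exceed the true digit count by one (for example it gives 4 for 999.0), which is acceptable when it is only used to size a buffer.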
-
Jan Hubicka authored
My patch to fix the profile after folding an internal call is missing a check for the case where the profile was already zero before if-conversion. gcc/ChangeLog: PR gcov-profile/110988 * tree-cfg.cc (fold_loop_internal_call): Avoid division by zero.
-
Jiawei authored
Add ZC* extensions march args tests for error input cases. Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com> Co-Authored by: Jiawei <jiawei@iscas.ac.cn> Co-Authored by: Mary Bennett <mary.bennett@embecosm.com> Co-Authored by: Simon Cook <simon.cook@embecosm.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/arch-24.c: New test. * gcc.target/riscv/arch-25.c: New test.
-
Jiawei authored
This patch enables the compressible features with ZC* extensions. Since all ZC* extensions depend on the Zca extension, it's sufficient to only add the target Zca to extend the target RVC. Co-Authored by: Mary Bennett <mary.bennett@embecosm.com> Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com> Co-Authored by: Simon Cook <simon.cook@embecosm.com> gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Enable compressed builtins when ZC* extensions enabled. * config/riscv/riscv-shorten-memrefs.cc: Enable shorten_memrefs pass when ZC* extensions enabled. * config/riscv/riscv.cc (riscv_compressed_reg_p): Enable compressible registers when ZC* extensions enabled. (riscv_rtx_costs): Allow adjusting rtx costs when ZC* extensions enabled. (riscv_address_cost): Allow adjusting address cost when ZC* extensions enabled. (riscv_first_stack_step): Allow compression of the register saves without adding extra instructions. * config/riscv/riscv.h (FUNCTION_BOUNDARY): Adjusts function boundary to 16 bits when ZC* extensions enabled.
-
Jiawei authored
This patch is the minimal support for ZC* extensions, including the extension names, masks and target definitions. It also defines the dependencies with the Zca and Zce extensions. Note that all ZC* extensions depend on the Zca extension. Zce includes all relevant ZC* extensions for use in microcontrollers. Zce will imply zcf when the 'f' extension is enabled in rv32. Co-Authored by: Charlie Keaney <charlie.keaney@embecosm.com> Co-Authored by: Mary Bennett <mary.bennett@embecosm.com> Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com> Co-Authored by: Simon Cook <simon.cook@embecosm.com> Co-Authored by: Sinan Lin <sinan.lin@linux.alibaba.com> Co-Authored by: Shihua Liao <shihua@iscas.ac.cn> Co-Authored by: Yulong Shi <yulong@iscas.ac.cn> gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_subset_list::parse): New extensions. * config/riscv/riscv-opts.h (MASK_ZCA): New mask. (MASK_ZCB): Ditto. (MASK_ZCE): Ditto. (MASK_ZCF): Ditto. (MASK_ZCD): Ditto. (MASK_ZCMP): Ditto. (MASK_ZCMT): Ditto. (TARGET_ZCA): New target. (TARGET_ZCB): Ditto. (TARGET_ZCE): Ditto. (TARGET_ZCF): Ditto. (TARGET_ZCD): Ditto. (TARGET_ZCMP): Ditto. (TARGET_ZCMT): Ditto. * config/riscv/riscv.opt: New target variable.
-
Juzhe-Zhong authored
This reverts commit 6c6f9604.
-
Richard Biener authored
It ICEs when invoked via debug_loops and dump_file clear. * tree-cfg.cc (print_loop_info): Dump to 'file', not 'dump_file'.
-
Pan Li authored
This patch would like to support the rounding mode API for the VFSQRT as the below samples. * __riscv_vfsqrt_v_f32m1_rm * __riscv_vfsqrt_v_f32m1_rm_m Signed-off-by:
Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class unop_frm): New class for frm. (vfsqrt_frm_obj): New declaration. (BASE): Ditto. * config/riscv/riscv-vector-builtins-bases.h: Ditto. * config/riscv/riscv-vector-builtins-functions.def (vfsqrt_frm): New intrinsic function definition. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-sqrt.c: New test.
-
Pan Li authored
This patch would like to support the rounding mode API for the VFWNMSAC as the below samples. * __riscv_vfwnmsac_vv_f64m2_rm * __riscv_vfwnmsac_vv_f64m2_rm_m * __riscv_vfwnmsac_vf_f64m2_rm * __riscv_vfwnmsac_vf_f64m2_rm_m Signed-off-by:
Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfwnmsac_frm): New class for frm. (vfwnmsac_frm_obj): New declaration. (BASE): Ditto. * config/riscv/riscv-vector-builtins-bases.h: Ditto. * config/riscv/riscv-vector-builtins-functions.def (vfwnmsac_frm): New intrinsic function definition. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-wnmsac.c: New test.
-
Pan Li authored
This patch would like to support the rounding mode API for the VFWMSAC as the below samples. * __riscv_vfwmsac_vv_f64m2_rm * __riscv_vfwmsac_vv_f64m2_rm_m * __riscv_vfwmsac_vf_f64m2_rm * __riscv_vfwmsac_vf_f64m2_rm_m Signed-off-by:
Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfwmsac_frm): New class for frm. (vfwmsac_frm_obj): New declaration. (BASE): Ditto. * config/riscv/riscv-vector-builtins-bases.h: Ditto. * config/riscv/riscv-vector-builtins-functions.def (vfwmsac_frm): New intrinsic function definition. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-wmsac.c: New test.
-
Pan Li authored
This patch would like to support the rounding mode API for the VFWNMACC as the below samples. * __riscv_vfwnmacc_vv_f64m2_rm * __riscv_vfwnmacc_vv_f64m2_rm_m * __riscv_vfwnmacc_vf_f64m2_rm * __riscv_vfwnmacc_vf_f64m2_rm_m Signed-off-by:
Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfwnmacc_frm): New class for frm. (vfwnmacc_frm_obj): New declaration. (BASE): Ditto. * config/riscv/riscv-vector-builtins-bases.h: Ditto. * config/riscv/riscv-vector-builtins-functions.def (vfwnmacc_frm): New intrinsic function definition. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-wnmacc.c: New test.
-