  1. Oct 12, 2023
    • testsuite: Avoid uninit var in pr60510.f [PR111427] · 610b845a
      Kewen Lin authored
      The uninitialized variable a in pr60510.f can cause
      some random failures as exposed in PR111427.  This
      patch is to make it initialized accordingly.
      
      	PR testsuite/111427
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/vect/pr60510.f (test): Init variable a.
    • vect: Consider vec_perm costing for VMAT_CONTIGUOUS_REVERSE · f1a05dc1
      Kewen Lin authored
      For VMAT_CONTIGUOUS_REVERSE, the transform code in function
      vectorizable_store generates a VEC_PERM_EXPR stmt before
      storing, but it's never considered in costing.
      
      This patch is to make it consider vec_perm in costing, it
      adjusts the order of transform code a bit to make it easy
      to early return for costing_p.
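      As a rough illustration of why the permute belongs in the cost, the
      sketch below models a reversed contiguous store in plain C: the lanes
      are reversed first (standing in for the VEC_PERM_EXPR) and then written
      with one contiguous store.  NUNITS, reverse_lanes and store_reversed
      are hypothetical names for illustration, not GCC internals.

```c
#include <string.h>

#define NUNITS 4  /* hypothetical number of vector lanes */

/* Stand-in for a VEC_PERM_EXPR with a lane-reversing selector.  */
void reverse_lanes(const int *src, int *dst)
{
    for (int i = 0; i < NUNITS; i++)
        dst[i] = src[NUNITS - 1 - i];
}

/* One reversed store: a permute followed by a single contiguous
   store that starts at the lowest address covered by the vector.  */
void store_reversed(int *lowest_addr, const int *vec)
{
    int permuted[NUNITS];
    reverse_lanes(vec, permuted);                    /* the vec_perm */
    memcpy(lowest_addr, permuted, sizeof permuted);  /* the store */
}
```

      In this model each vector store costs one permute plus one store,
      which is the pairing the patch adds to the costing.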
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vectorizable_store): Consider generated
      	VEC_PERM_EXPR stmt for VMAT_CONTIGUOUS_REVERSE in costing as
      	vec_perm.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/vect/costmodel/ppc/costmodel-vect-store-2.c: New test.
    • vect: Get rid of vect_model_store_cost · 0bdb9bb5
      Kewen Lin authored
      This patch finally removes vect_model_store_cost.  It
      adjusts the costing for the remaining memory access types
      VMAT_CONTIGUOUS{, _DOWN, _REVERSE} by moving the costing
      close to the transform code.  Note that vect_model_store_cost
      has one special case for vectorizing a store into the
      function result.  Since that is an extra penalty with no
      counterpart in the transform code, this patch leaves it
      unchanged.
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vect_model_store_cost): Remove.
      	(vectorizable_store): Adjust the costing for the remaining memory
      	access types VMAT_CONTIGUOUS{, _DOWN, _REVERSE}.
    • vect: Adjust vectorizable_store costing on VMAT_CONTIGUOUS_PERMUTE · 0a96eedb
      Kewen Lin authored
      This patch adjusts the cost handling on VMAT_CONTIGUOUS_PERMUTE
      in function vectorizable_store.  We no longer call function
      vect_model_store_cost for it.  This is the interleaving
      store case, so costing skips every stmt except
      first_stmt_info and accounts for the whole group when
      costing first_stmt_info.  This patch shouldn't have any
      functional changes.
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vect_model_store_cost): Assert it will never
      	get VMAT_CONTIGUOUS_PERMUTE and remove VMAT_CONTIGUOUS_PERMUTE related
      	handlings.
      	(vectorizable_store): Adjust the cost handling on
      	VMAT_CONTIGUOUS_PERMUTE without calling vect_model_store_cost.
    • vect: Adjust vectorizable_store costing on VMAT_LOAD_STORE_LANES · 6a88202e
      Kewen Lin authored
      This patch adjusts the cost handling on VMAT_LOAD_STORE_LANES
      in function vectorizable_store.  We no longer call function
      vect_model_store_cost for it.  This is the interleaving
      store case, so costing skips every stmt except
      first_stmt_info and accounts for the whole group when
      costing first_stmt_info.  This patch shouldn't have any
      functional changes.
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vect_model_store_cost): Assert it will never
      	get VMAT_LOAD_STORE_LANES.
      	(vectorizable_store): Adjust the cost handling on VMAT_LOAD_STORE_LANES
      	without calling vect_model_store_cost.  Factor out new lambda function
      	update_prologue_cost.
    • vect: Adjust vectorizable_store costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP · 8b151eb9
      Kewen Lin authored
      This patch adjusts the cost handling on VMAT_ELEMENTWISE
      and VMAT_STRIDED_SLP in function vectorizable_store.  We
      don't call function vect_model_store_cost for them any more.
      
      As with the PR82255 improvement on the load side, this change
      removes unnecessary vec_to_scalar costing for some
      VMAT_STRIDED_SLP cases.  A typical test case,
      gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c, is
      added.  The change also fixes some cases where the costing
      was inconsistent.

      In addition, this special-cases interleaving stores for the
      two affected memory access types.  For interleaving stores
      the whole chain is vectorized when the last store in the
      chain is reached, and the other stores in the group are
      skipped.  To stay consistent with this, and to follow the
      transform code, which iterates over the whole group, the
      costing is done only for the first store in the group.
      Ideally we can only cost for the last one but it's not
      trivial and using the first one is actually equivalent.
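      The group costing rule can be sketched in a few lines of C.  This is
      only a model under stated assumptions: struct store_stmt, its fields
      and the flat cost_per_store unit are hypothetical and do not
      correspond to the vectorizer's real data structures.

```c
/* Hypothetical stand-in for one store inside an interleaving group.  */
struct store_stmt {
    int is_first_in_group;  /* 1 for the group's first store */
    int group_size;         /* number of stores in the whole group */
};

/* Cost charged when this statement is visited: the first store pays
   for the whole group, every other store in the group pays nothing.  */
int store_cost_units(const struct store_stmt *s, int cost_per_store)
{
    if (!s->is_first_in_group)
        return 0;
    return s->group_size * cost_per_store;
}
```

      Summed over all statements of a group, the total is the same whether
      the first or the last store carries the cost, which is why the commit
      calls the two choices equivalent.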
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vect_model_store_cost): Assert it won't get
      	VMAT_ELEMENTWISE and VMAT_STRIDED_SLP any more, and remove their
      	related handlings.
      	(vectorizable_store): Adjust the cost handling on VMAT_ELEMENTWISE
      	and VMAT_STRIDED_SLP without calling vect_model_store_cost.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/vect/costmodel/ppc/costmodel-vect-store-1.c: New test.
    • vect: Simplify costing on vectorizable_scan_store · 7184d225
      Kewen Lin authored
      This patch simplifies the costing for the
      vectorizable_scan_store case so that it no longer calls
      function vect_model_store_cost.

      I considered moving the costing into function
      vectorizable_scan_store, but that would require passing
      down several variables that are only used for costing.
      For now the goal is to keep the costing as it was, without
      trying to make it consistent with what the transform does,
      so I think we can leave it for now.
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vectorizable_store): Adjust costing on
      	vectorizable_scan_store without calling vect_model_store_cost
      	any more.
    • vect: Adjust vectorizable_store costing on VMAT_GATHER_SCATTER · e00820c8
      Kewen Lin authored
      This patch adjusts the cost handling on VMAT_GATHER_SCATTER
      in function vectorizable_store (all three cases), so that we
      no longer depend on vect_model_store_cost for its costing.
      This patch shouldn't have any functional changes.
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vect_model_store_cost): Assert it won't get
      	VMAT_GATHER_SCATTER any more, remove VMAT_GATHER_SCATTER related
      	handlings and the related parameter gs_info.
      	(vect_build_scatter_store_calls): Add the handlings on costing with
      	one more argument cost_vec.
      	(vectorizable_store): Adjust the cost handling on VMAT_GATHER_SCATTER
      	without calling vect_model_store_cost any more.
    • vect: Move vect_model_store_cost next to the transform in vectorizable_store · 3bf23666
      Kewen Lin authored
      This is an initial patch to move costing next to the
      transform.  It still uses vect_model_store_cost for costing,
      but moves and duplicates the call down according to the
      handling of the different vect_memory_access_types or other
      special handling needs, which should make the subsequent
      patches easier to review.  This patch should not have any
      functional changes.
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vectorizable_store): Move and duplicate the call
      	to vect_model_store_cost down to some different transform paths
      	according to the handlings of different vect_memory_access_types
      	or some special handling need.
    • vect: Ensure vect store is supported for some VMAT_ELEMENTWISE case · 32207b15
      Kewen Lin authored
      When making/testing patches to move costing next to the
      transform code for vectorizable_store, some ICEs got
      exposed when I further refined the costing handlings on
      VMAT_ELEMENTWISE.  The apparent cause is triggering the
      assertion in rs6000 specific function for costing
      rs6000_builtin_vectorization_cost:
      
        if (TARGET_ALTIVEC)
           /* Misaligned stores are not supported.  */
           gcc_unreachable ();
      
      I used vect_get_store_cost instead of the original
      record_stmt_cost with scalar_store for costing, that is, one
      unaligned_store.  This matches what the transform uses, which
      is a vector store as below:
      
        else if (group_size >= const_nunits
                 && group_size % const_nunits == 0)
          {
             nstores = 1;
             lnel = const_nunits;
             ltype = vectype;
             lvectype = vectype;
          }
      
      So IMHO it's more consistent with vector store instead of
      scalar store, with the given compilation option
      -mno-allow-movmisalign, the misaligned vector store is
      unexpected to be used in vectorizer, but why it's still
      adopted?  In the current implementation of function
      get_group_load_store_type, we always set alignment support
      scheme as dr_unaligned_supported for VMAT_ELEMENTWISE, it
      is true if we always adopt scalar stores, but as the above
      code shows, we could use vector stores for some cases, so
      we should use the correct alignment support scheme for it.
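      The condition quoted above can be looked at in isolation.  The helper
      below is a hypothetical sketch that only restates that one branch (it
      ignores the earlier branches of the same if/else chain); it shows which
      group sizes end up with a whole-vector store and therefore need a real
      alignment check.

```c
/* Returns 1 when the quoted branch would pick a whole-vector store,
   i.e. the group size is a non-zero multiple of the lane count.
   Hypothetical helper; not a function in tree-vect-stmts.cc.  */
int elementwise_uses_vector_store(unsigned group_size, unsigned const_nunits)
{
    return const_nunits != 0
           && group_size >= const_nunits
           && group_size % const_nunits == 0;
}
```

      For example, with 4 lanes a group of 8 stores qualifies, while groups
      of 6 or 2 stores do not.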
      
      This patch ensures the vector store is supported by
      further checking with vect_supportable_dr_alignment.  The
      ICEs are exposed by the patches that move costing next to the
      transform, which have not landed yet; the test coverage
      will be there once they land.  The affected test
      cases are:
        - gcc.dg/vect/slp-45.c
        - gcc.dg/vect/vect-alias-check-{10,11,12}.c
      
      By the way, I tried to write a correctness test case, but
      realized that -mno-allow-movmisalign mainly affects the
      movmisalign optab and does not guard the actual hardware
      vector memory access insns, so I could not construct one
      without also altering some conditions for those insns.
      
      gcc/ChangeLog:
      
      	* tree-vect-stmts.cc (vectorizable_store): Ensure the generated
      	vector store for some case of VMAT_ELEMENTWISE is supported.
    • x86: set spincount 1 for x86 hybrid platform · e1e127de
      Zhang, Jun authored
      Testing shows that a spincount of 1 performs better on
      hybrid platforms.

      With '-march=native -Ofast -funroll-loops -flto', the
      results are as follows:
      
      spec2017 speed   RPL     ADL
      657.xz_s         0.00%   0.50%
      603.bwaves_s     10.90%  26.20%
      607.cactuBSSN_s  5.50%   72.50%
      619.lbm_s        2.40%   2.50%
      621.wrf_s        -7.70%  2.40%
      627.cam4_s       0.50%   0.70%
      628.pop2_s       48.20%  153.00%
      638.imagick_s    -0.10%  0.20%
      644.nab_s        2.30%   1.40%
      649.fotonik3d_s  8.00%   13.80%
      654.roms_s       1.20%   1.10%
      Geomean-int      0.00%   0.50%
      Geomean-fp       6.30%   21.10%
      Geomean-all      5.70%   19.10%
      
      omp2012          RPL     ADL
      350.md           -1.81%  -1.75%
      351.bwaves       7.72%   12.50%
      352.nab          14.63%  19.71%
      357.bt331        -0.20%  1.77%
      358.botsalgn     0.00%   0.00%
      359.botsspar     0.00%   0.65%
      360.ilbdc        0.00%   0.25%
      362.fma3d        2.66%   -0.51%
      363.swim         10.44%  0.00%
      367.imagick      0.00%   0.12%
      370.mgrid331     2.49%   25.56%
      371.applu331     1.06%   4.22%
      372.smithwa      0.74%   3.34%
      376.kdtree       10.67%  16.03%
      GEOMEAN          3.34%   5.53%
      
      include/ChangeLog:
      
      	PR target/109812
      	* spincount.h: New file.
      
      libgomp/ChangeLog:
      
      	* env.c (initialize_env): Use do_adjust_default_spincount.
      	* config/linux/x86/spincount.h: New file.
    • RISC-V: Support FP llrint auto vectorization · 6a3302a4
      Pan Li authored
      
      This patch supports FP llrint auto-vectorization.

      * long long llrint (double)

      From the standard name's perspective this is the CVT from DF => DI,
      which has been covered by previous patches.  Thus, this patch only adds
      some test cases.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/unop/test-math.h: Add type int64_t.
      	* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: New test.
      	* gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • [APX] Support Intel APX PUSH2POP2 · 180b08f6
      Mo, Zewei authored
      
      This feature requires the stack to be 16-byte aligned.  Therefore, in
      the prologue/epilogue, a standalone push/pop is emitted before any
      push2/pop2 if the stack is not 16-byte aligned.
      Also for current implementation we only support push2/pop2 usage in
      function prologue/epilogue for those callee-saved registers.
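      The alignment rule can be modelled with a small planning function.
      This is a simplified sketch, assuming 8-byte registers so that one
      standalone push turns an 8-mod-16 stack pointer into a 16-byte aligned
      one; struct save_plan and plan_saves are hypothetical names and not
      part of i386.cc.

```c
/* Hypothetical summary of how callee-saved registers get pushed.  */
struct save_plan {
    int leading_push;   /* 1 if a single push is emitted first to align */
    int push2_count;    /* number of push2 instructions, two regs each */
    int trailing_push;  /* 1 if one register is left over at the end */
};

/* Plan the saves for NREGS registers.  SP_ALIGNED_16 is non-zero when
   the stack pointer is already 16-byte aligned at the first save.  */
struct save_plan plan_saves(int nregs, int sp_aligned_16)
{
    struct save_plan p = {0, 0, 0};
    if (nregs <= 0)
        return p;
    if (!sp_aligned_16) {
        p.leading_push = 1;  /* standalone push restores alignment */
        nregs--;
    }
    p.push2_count = nregs / 2;
    p.trailing_push = nregs % 2;
    return p;
}
```

      With five registers and a misaligned stack, the model yields one
      standalone push followed by two push2 instructions; with an aligned
      stack it yields two push2 instructions and one trailing push.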
      
      gcc/ChangeLog:
      
      	* config/i386/i386.cc (gen_push2): New function to emit push2
      	and adjust cfa offset.
      	(ix86_pro_and_epilogue_can_use_push2_pop2): New function to
      	determine whether push2/pop2 can be used.
      	(ix86_compute_frame_layout): Adjust preferred stack boundary
      	and stack alignment needed for push2/pop2.
      	(ix86_emit_save_regs): Emit push2 when available.
      	(ix86_emit_restore_reg_using_pop2): New function to emit pop2
      	and adjust cfa info.
      	(ix86_emit_restore_regs_using_pop2): New function to loop
      	through the saved regs and call above.
      	(ix86_expand_epilogue): Call ix86_emit_restore_regs_using_pop2
      	when push2pop2 available.
      	* config/i386/i386.md (push2_di): New pattern for push2.
      	(pop2_di): Likewise for pop2.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/apx-push2pop2-1.c: New test.
      	* gcc.target/i386/apx-push2pop2_force_drap-1.c: Likewise.
      	* gcc.target/i386/apx-push2pop2_interrupt-1.c: Likewise.
      
      Co-authored-by: Hu Lin1 <lin1.hu@intel.com>
      Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
    • RISC-V: Support FP irintf auto vectorization · d6b7fe11
      Pan Li authored
      
      This patch would like to support the FP irintf auto vectorization.
      
      * int irintf (float)
      
      Because the vectorizer only allows data types of the same size,
      the standard name lrintmn2 only acts on SF => SI.
      
      Given we have code like:
      
      void
      test_irintf (int *out, float *in, unsigned count)
      {
        for (unsigned i = 0; i < count; i++)
          out[i] = __builtin_irintf (in[i]);
      }
      
      Before this patch:
      .L3:
        ...
        flw      fa5,0(a1)
        fcvt.w.s a5,fa5,dyn
        sw       a5,-4(a0)
        ...
        bne      a1,a4,.L3
      
      After this patch:
      .L3:
        ...
        vle32.v     v1,0(a1)
        vfcvt.x.f.v v1,v1
        vse32.v     v1,0(a0)
        ...
        bne         a2,zero,.L3
      
      The remaining cases, such as DF => SI and HF => SI, will be covered
      by the hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.
      
      gcc/ChangeLog:
      
      	* config/riscv/autovec.md (lrint<mode><vlconvert>2): Rename from.
      	(lrint<mode><v_i_l_ll_convert>2): Rename to.
      	* config/riscv/vector-iterators.md: Rename and remove TARGET_64BIT.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: New test.
      	* gcc.target/riscv/rvv/autovec/vls/math-irint-0.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • Daily bump. · 6febf76c
      GCC Administrator authored
  2. Oct 11, 2023
    • RISC-V: Add TARGET_MIN_VLEN_OPTS to fix the build · 06f36c1d
      Kito Cheng authored
      gcc/ChangeLog:
      
      	* config/riscv/riscv-opts.h (TARGET_MIN_VLEN_OPTS): New.
    • RISC-V Adjust long unconditional branch sequence · a3e50ee9
      Jeff Law authored
      Andrew and I independently noted the long unconditional branch sequence was
      using the "call" pseudo op.  Technically it works, but it's a bit odd.  This
      patch flips it to use the "jump" pseudo-op.
      
      This was tested with a hacked-up local compiler which forced all branches/jumps
      to be long jumps.  Naturally it triggered some failures for scan-asm tests but
      no execution regressions (which is mostly what I was testing for).
      
      I've updated the long branch support item in the RISE wiki to indicate that we
      eventually want a register scavenging approach with a fallback to $ra in the
      future so that we don't muck up the return address predictors.  It's not
      super-high priority and shouldn't be terrible to implement given we've got the
      $ra fallback when a suitable register can not be found.
      
      gcc/
      	* config/riscv/riscv.md (jump): Adjust sequence to use a "jump"
      	pseudo op instead of a "call" pseudo op.
    • RISC-V: Extend riscv_subset_list, preparatory for target attribute support · faae30c4
      Kito Cheng authored
      Previously riscv_subset_list only accepted a full arch string, but
      target attribute support needs to parse a single extension.  We may
      also want to set a riscv_subset_list directly rather than re-parsing
      the ISA string.
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv-subset.h (riscv_subset_list::parse_single_std_ext):
      	New.
      	(riscv_subset_list::parse_single_multiletter_ext): Ditto.
      	(riscv_subset_list::clone): Ditto.
      	(riscv_subset_list::parse_single_ext): Ditto.
      	(riscv_subset_list::set_loc): Ditto.
      	(riscv_set_arch_by_subset_list): Ditto.
      	* common/config/riscv/riscv-common.cc
      	(riscv_subset_list::parse_single_std_ext): New.
      	(riscv_subset_list::parse_single_multiletter_ext): Ditto.
      	(riscv_subset_list::clone): Ditto.
      	(riscv_subset_list::parse_single_ext): Ditto.
      	(riscv_subset_list::set_loc): Ditto.
      	(riscv_set_arch_by_subset_list): Ditto.
    • RISC-V: Refactor riscv_option_override and riscv_convert_vector_bits. [NFC] · 9452d13b
      Kito Cheng authored
      Allow these functions to operate on a local gcc_options rather than
      the global options.

      This is preparatory for the target attribute; the change is separated
      for easier review since it's NFC.
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv.cc (riscv_convert_vector_bits): Get setting
      	from argument rather than from global setting.
      	(riscv_override_options_internal): New, split from
      	riscv_override_options, also takes a gcc_options argument.
      	(riscv_option_override): Split most of it into
      	riscv_override_options_internal.
    • options: Define TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P macro for Mask and InverseMask · 0363bba8
      Kito Cheng authored
      We use the TARGET_<NAME>_P macro to test a Mask or InverseMask against
      a user-specified target_variable; however, we may want to test against a
      specific gcc_options variable rather than target_variable.

      For example, RISC-V defines lots of Masks with TargetVariable, which is
      not easy to use because we need to know which Mask is associated with
      which TargetVariable, so taking a gcc_options variable is a better
      interface for such a use case.
      
      gcc/ChangeLog:
      
      	* doc/options.texi (Mask): Document TARGET_<NAME>_P and
      	TARGET_<NAME>_OPTS_P.
      	(InverseMask): Ditto.
      	* opth-gen.awk (Mask): Generate TARGET_<NAME>_P and
      	TARGET_<NAME>_OPTS_P macro.
      	(InverseMask): Ditto.
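
      A minimal sketch of what such a macro pair might look like for a
      made-up option FOO follows.  The miniature struct gcc_options, the
      field x_target_flags as the holder of the bit, MASK_FOO and
      foo_enabled are assumptions for illustration; the exact text that
      opth-gen.awk generates may differ.

```c
/* Hypothetical miniature of gcc_options with a single flags field.  */
struct gcc_options {
    int x_target_flags;
};

#define MASK_FOO (1 << 0)  /* hypothetical mask bit for option FOO */

/* Tests the mask against a flags value the caller passes explicitly.  */
#define TARGET_FOO_P(flags) (((flags) & MASK_FOO) != 0)

/* Tests the mask through a gcc_options pointer, so the caller does not
   need to know which variable holds the bit.  */
#define TARGET_FOO_OPTS_P(opts) (((opts)->x_target_flags & MASK_FOO) != 0)

/* Example caller that works on a local options object.  */
int foo_enabled(const struct gcc_options *opts)
{
    return TARGET_FOO_OPTS_P(opts);
}
```

      The _OPTS_P form is the one that lets code query a local gcc_options
      instead of the global state.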
    • MATCH: [PR111282] Simplify `a & (b ^ ~a)` to `a & b` · e8d418df
      Andrew Pinski authored
      While `a & (b ^ ~a)` is optimized to `a & b` on the rtl level,
      it is always good to optimize this at the gimple level and allows
      us to match a few extra things including where a is a comparison.
      
      Note I had to update/change the testcase and-1.c to avoid matching
      this case as we can match -2 and 1 as bitwise inversions.
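
      The identity behind the simplification can be checked exhaustively
      for 8-bit values.  The sketch below is a standalone check, not code
      from match.pd; the function names are made up for illustration.

```c
#include <stdint.h>

/* Left-hand side of the simplification, written out literally.  */
uint8_t and_xor_not(uint8_t a, uint8_t b)
{
    return (uint8_t)(a & (b ^ (uint8_t)~a));
}

/* Returns 1 if a & (b ^ ~a) == a & b for every pair of 8-bit values.  */
int identity_holds_for_all_u8(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            if (and_xor_not((uint8_t)a, (uint8_t)b) != (uint8_t)(a & b))
                return 0;
    return 1;
}
```

      Bit by bit: where a is 1, ~a is 0 and b ^ 0 is b; where a is 0, the
      outer AND forces 0.  Both cases match a & b.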
      
      	PR tree-optimization/111282
      
      gcc/ChangeLog:
      
      	* match.pd (`a & ~(a ^ b)`, `a & (a == b)`,
      	`a & ((~a) ^ b)`): New patterns.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/and-1.c: Update testcase to avoid
      	matching `~1 & (a ^ 1)` simplification.
      	* gcc.dg/tree-ssa/bitops-6.c: New test.
    • modula2: Narrow subranges to int or unsigned int if ZTYPE is the base type. · acfca27e
      Gaius Mulley authored
      
      This patch narrows the subrange base type to INTEGER or CARDINAL
      providing the range is satisfied.  It only does this when the subrange
      base type is the ZTYPE.
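
      A minimal sketch of the narrowing decision, written in C with int and
      unsigned int standing in for INTEGER and CARDINAL.  It assumes that
      INTEGER is preferred whenever both bounds fit and that CARDINAL is the
      fallback for non-negative ranges; the commit message does not state
      the order of the checks.  The enum and function names are
      hypothetical.

```c
#include <limits.h>

enum base_type { BASE_ZTYPE, BASE_INTEGER, BASE_CARDINAL };

/* Pick a base type for the subrange [low..high] whose declared base
   type is the ZTYPE.  Assumed order: INTEGER first, then CARDINAL.  */
enum base_type narrow_subrange(long long low, long long high)
{
    if (low >= INT_MIN && high <= INT_MAX)
        return BASE_INTEGER;
    if (low >= 0 && high <= (long long)UINT_MAX)
        return BASE_CARDINAL;
    return BASE_ZTYPE;  /* range fits neither type: keep the ZTYPE */
}
```

      Under these assumptions [-5..10] narrows to INTEGER, [0..4000000000]
      narrows to CARDINAL, and [-1..4000000000] keeps the ZTYPE.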
      
      gcc/m2/ChangeLog:
      
      	* gm2-compiler/M2GCCDeclare.mod (DeclareSubrange): Check
      	the base type of the subrange against the ZTYPE and call
      	DeclareSubrangeNarrow if necessary.
      	(DeclareSubrangeNarrow): New procedure function.
      
      Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
    • [PATCH v4 2/2] RISC-V: Add support for XCValu extension in CV32E40P · 5ef248c1
      Mary Bennett authored
      Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
      
      Contributors:
        Mary Bennett <mary.bennett@embecosm.com>
        Nandni Jamnadas <nandni.jamnadas@embecosm.com>
        Pietra Ferreira <pietra.ferreira@embecosm.com>
        Charlie Keaney
        Jessica Mills
        Craig Blackmore <craig.blackmore@embecosm.com>
        Simon Cook <simon.cook@embecosm.com>
        Jeremy Bennett <jeremy.bennett@embecosm.com>
        Helene Chelin <helene.chelin@embecosm.com>
      
      gcc/ChangeLog:
      
      	* common/config/riscv/riscv-common.cc: Add the XCValu
      	extension.
      	* config/riscv/constraints.md: Add builtins for the XCValu
      	extension.
      	* config/riscv/predicates.md (immediate_register_operand):
      	Likewise.
      	* config/riscv/corev.def: Likewise.
      	* config/riscv/corev.md: Likewise.
      	* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
      	(RISCV_ATYPE_UHI): Likewise.
      	* config/riscv/riscv-ftypes.def: Likewise.
      	* config/riscv/riscv.opt: Likewise.
      	* config/riscv/riscv.cc (riscv_print_operand): Likewise.
      	* doc/extend.texi: Add XCValu documentation.
      	* doc/sourcebuild.texi: Likewise.
      
      gcc/testsuite/ChangeLog:
      
      	* lib/target-supports.exp: Add proc for the XCValu extension.
      	* gcc.target/riscv/cv-alu-compile.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-addn.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-addrn.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-addun.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-addurn.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-clip.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-clipu.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-subn.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-subrn.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-subun.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile-suburn.c: New test.
      	* gcc.target/riscv/cv-alu-fail-compile.c: New test.
    • [PATCH v4 1/2] RISC-V: Add support for XCVmac extension in CV32E40P · 400efddd
      Mary Bennett authored
      Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
      
      Contributors:
        Mary Bennett <mary.bennett@embecosm.com>
        Nandni Jamnadas <nandni.jamnadas@embecosm.com>
        Pietra Ferreira <pietra.ferreira@embecosm.com>
        Charlie Keaney
        Jessica Mills
        Craig Blackmore <craig.blackmore@embecosm.com>
        Simon Cook <simon.cook@embecosm.com>
        Jeremy Bennett <jeremy.bennett@embecosm.com>
        Helene Chelin <helene.chelin@embecosm.com>
      
      gcc/ChangeLog:
      
      	* common/config/riscv/riscv-common.cc: Add XCVmac.
      	* config/riscv/riscv-ftypes.def: Add XCVmac builtins.
      	* config/riscv/riscv-builtins.cc: Likewise.
      	* config/riscv/riscv.md: Likewise.
      	* config/riscv/riscv.opt: Likewise.
      	* doc/extend.texi: Add XCVmac builtin documentation.
      	* doc/sourcebuild.texi: Likewise.
      	* config/riscv/corev.def: New file.
      	* config/riscv/corev.md: New file.
      
      gcc/testsuite/ChangeLog:
      
      	* lib/target-supports.exp: Add new effective target check.
      	* gcc.target/riscv/cv-mac-compile.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mac.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-machhsn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-machhsrn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-machhun.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-machhurn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-macsn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-macsrn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-macun.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-macurn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-msu.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulhhsn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulhhsrn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulhhun.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulhhurn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulsn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulsrn.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulun.c: New test.
      	* gcc.target/riscv/cv-mac-fail-compile-mulurn.c: New test.
      	* gcc.target/riscv/cv-mac-test-autogeneration.c: New test.
    • MAINTAINERS: Fix write after approval name order · 70b02dfd
      Filip Kastl authored
      
      ChangeLog:
      
      	* MAINTAINERS: Fix name order.
      
      Signed-off-by: Filip Kastl <fkastl@suse.cz>
    • PR modula2/111675 Incorrect packed record field value passed to a procedure · 2b783fe2
      Gaius Mulley authored
      
      This patch allows a packed field to be extracted and passed to a
      procedure.  It ensures that the subrange type is the same for both the
      procedure and record field.  It also extends the <* bytealignment (0) *>
      to cover packed subrange types.
      
      gcc/m2/ChangeLog:
      
      	PR modula2/111675
      	* gm2-compiler/M2CaseList.mod (appendTree): Replace
      	InitStringCharStar with InitString.
      	* gm2-compiler/M2GCCDeclare.mod: Import AreConstantsEqual.
      	(DeclareSubrange): Add zero alignment test and call
      	BuildSmallestTypeRange if necessary.
      	(WalkSubrangeDependants): Walk the align expression.
      	(IsSubrangeDependants): Test the align expression.
      	* gm2-compiler/M2Quads.mod (BuildStringAdrParam): Correct end name.
      	* gm2-compiler/P2SymBuild.mod (BuildTypeAlignment): Allow subranges
      	to be zero aligned (packed).
      	* gm2-compiler/SymbolTable.mod (Subrange): Add Align field.
      	(MakeSubrange): Set Align to NulSym.
      	(PutAlignment): Assign Subrange.Align to align.
      	(GetAlignment): Return Subrange.Align.
      	* gm2-gcc/m2expr.cc (noBitsRequired): Rewrite.
      	(calcNbits): Rename ...
      	(m2expr_calcNbits): ... to this and test for negative values.
      	(m2expr_BuildTBitSize): Replace calcNBits with m2expr_calcNbits.
      	* gm2-gcc/m2expr.def (calcNbits): Export.
      	* gm2-gcc/m2expr.h (m2expr_calcNbits): New prototype.
      	* gm2-gcc/m2type.cc (noBitsRequired): Remove.
      	(m2type_BuildSmallestTypeRange): Call m2expr_calcNbits.
      	(m2type_BuildSubrangeType): Create range_type from
      	build_range_type (type, lowval, highval).
      
      gcc/testsuite/ChangeLog:
      
      	PR modula2/111675
      	* gm2/extensions/run/pass/packedrecord3.mod: New test.
      
      Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
    • RISC-V: Fix incorrect index(offset) of gather/scatter · f6c5e247
      Juzhe-Zhong authored
      I discovered a mistake of mine that, by luck, had not been exposed.
      
      https://godbolt.org/z/c3jzrh7or
      
      GCC is using 32 bit index offset:
      
              vsll.vi v1,v1,2
              vsetvli zero,a5,e32,m1,ta,ma
              vluxei32.v      v1,(a1),v1
      
      This is wrong since v1 may overflow 32bit after vsll.vi.
      
      After this patch:
      
              vsext.vf2       v8,v4
              vsll.vi v8,v8,2
              vluxei64.v      v8,(a1),v8
      
      Same as Clang.
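
      The overflow can be reproduced in scalar C.  This sketch compares a
      byte offset computed entirely in 32 bits with one computed after
      sign-extending the index to 64 bits, mirroring the vsext.vf2 that
      precedes vsll.vi in the fixed sequence.  The function names and the
      4-byte element size (taken from the shift by 2) are illustrative
      assumptions.

```c
#include <stdint.h>

/* Byte offset computed entirely in 32 bits: the shift can wrap.  */
uint32_t byte_offset_32(int32_t index)
{
    return (uint32_t)index << 2;
}

/* Byte offset computed after sign-extending the index to 64 bits.  */
int64_t byte_offset_64(int32_t index)
{
    return (int64_t)index * 4;
}
```

      For an index of 0x40000000 the 32-bit version wraps to 0, while the
      64-bit version yields 4294967296, the offset the gather actually
      needs.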
      
      Regression passed. Ok for trunk ?
      
      gcc/ChangeLog:
      
      	* config/riscv/autovec.md: Fix index bug.
      	* config/riscv/riscv-protos.h (gather_scatter_valid_offset_mode_p): New function.
      	* config/riscv/riscv-v.cc (expand_gather_scatter): Fix index bug.
      	(gather_scatter_valid_offset_mode_p): New function.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/gather-scatter/offset_extend-1.c: New test.
    • RISC-V: Support FP lrint/lrintf auto vectorization · d1e55666
      Pan Li authored
      
      This patch would like to support the FP lrint/lrintf auto vectorization.
      
      * long lrint (double) for rv64
      * long lrintf (float) for rv32
      
      Because the vectorizer only allows data types of the same size,
      the standard name lrintmn2 only acts on DF => DI for
      rv64, and SF => SI for rv32.
      
      Given we have code like:
      
      void
      test_lrint (long *out, double *in, unsigned count)
      {
        for (unsigned i = 0; i < count; i++)
          out[i] = __builtin_lrint (in[i]);
      }
      
      Before this patch:
      .L3:
        ...
        fld      fa5,0(a1)
        fcvt.l.d a5,fa5,dyn
        sd       a5,-8(a0)
        ...
        bne      a1,a4,.L3
      
      After this patch:
      .L3:
        ...
        vsetvli     a3,zero,e64,m1,ta,ma
        vfcvt.x.f.v v1,v1
        vsetvli     zero,a2,e64,m1,ta,ma
        vse32.v     v1,0(a0)
        ...
        bne         a2,zero,.L3
      
      The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI,
      will be covered by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.
      
      gcc/ChangeLog:
      
      	* config/riscv/autovec.md (lrint<mode><vlconvert>2): New pattern
      	for lrint/lrintf.
      	* config/riscv/riscv-protos.h (expand_vec_lrint): New func decl
      	for expanding lrint.
      	* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): New helper func impl
      	for vfcvt.x.f.v.
      	(expand_vec_lrint): New function impl for expanding lrint.
      	* config/riscv/vector-iterators.md: New mode attr and iterator.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/autovec/unop/test-math.h: New define for
      	CVT like test case.
      	* gcc.target/riscv/rvv/autovec/vls/def.h: Ditto.
      	* gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-0.c: New test.
      	* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-1.c: New test.
      	* gcc.target/riscv/rvv/autovec/vls/math-lrint-0.c: New test.
      	* gcc.target/riscv/rvv/autovec/vls/math-lrint-1.c: New test.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
      d1e55666
    • Juzhe-Zhong's avatar
      RISC-V: Remove XFAIL of ssa-dom-cse-2.c · d4de593d
      Juzhe-Zhong authored
      Confirm RISC-V is able to CSE this case no matter whether we enable RVV or not.
      
      Remove XFAIL,  to fix:
      XPASS: gcc.dg/tree-ssa/ssa-dom-cse-2.c scan-tree-dump optimized "return 28;"
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Remove riscv.
      d4de593d
    • Jakub Jelinek's avatar
      tree-ssa-strlen: optimization skips clobbering store [PR111519] · e75bf198
      Jakub Jelinek authored
      The following testcase is miscompiled, because count_nonzero_bytes incorrectly
      uses get_strinfo information on a pointer from which an earlier instruction
      loads SSA_NAME stored at the current instruction.  get_strinfo shows a state
      right before the current store though, so if there are some stores in between
      the current store and the load, the string length information might have
      changed.
      
      The patch passes around gimple_vuse from the store and punts instead of using
      strinfo on loads from MEM_REF which have different gimple_vuse from that.
      
      2023-10-11  Richard Biener  <rguenther@suse.de>
      	    Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/111519
      	* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Add vuse
      	argument and pass it through to recursive calls and
      	count_nonzero_bytes_addr calls.  Don't shadow the stmt argument, but
      	change stmt for gimple_assign_single_p statements for which we don't
      	immediately punt.
      	(strlen_pass::count_nonzero_bytes_addr): Add vuse argument and pass
      	it through to recursive calls and count_nonzero_bytes calls.  Don't
      	use get_strinfo if gimple_vuse (stmt) is different from vuse.  Don't
      	shadow the stmt argument.
      
      	* gcc.dg/torture/pr111519.c: New testcase.
      e75bf198
    • Roger Sayle's avatar
      Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1). · c4149242
      Roger Sayle authored
      This patch is the middle-end piece of an improvement to PRs 101955 and
      106245, that adds a missing simplification to the RTL optimizers.
      This transformation is to simplify (char)(x << 7) != 0 as x & 1.
      Technically, the cast can be any truncation, where shift is by one
      less than the narrower type's precision, setting the most significant
      (only) bit from the least significant bit.
      
      This transformation applies to any target, but it's easy to see
      (and add a new test case) on x86, where the following function:
      
      int f(int a) { return (a << 31) >> 31; }
      
      currently gets compiled with -O2 to:
      
      foo:    movl    %edi, %eax
              sall    $7, %eax
              sarb    $7, %al
              movsbl  %al, %eax
              ret
      
      but with this patch, we now generate the slightly simpler:
      
      foo:    movl    %edi, %eax
              sall    $31, %eax
              sarl    $31, %eax
              ret
      
      2023-10-11  Roger Sayle  <roger@nextmovesoftware.com>
      
      gcc/ChangeLog
      	PR middle-end/101955
      	PR tree-optimization/106245
      	* simplify-rtx.cc (simplify_relational_operation_1): Simplify
      	the RTL (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) to (and:SI x 1).
      
      gcc/testsuite/ChangeLog
      	* gcc.target/i386/pr106245-1.c: New test case.
      c4149242
    • Juzhe-Zhong's avatar
      RISC-V: Enable full coverage vect tests · 23aabded
      Juzhe-Zhong authored
      I have analyzed all existing FAILs.
      
      Only the following FAILs need to be addressed:
      FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects execution test
      FAIL: gcc.dg/vect/slp-reduc-7.c execution test
      FAIL: gcc.dg/vect/vect-cond-arith-2.c -flto -ffat-lto-objects  scan-tree-dump optimized " = \\.COND_(LEN_)?SUB"
      FAIL: gcc.dg/vect/vect-cond-arith-2.c scan-tree-dump optimized " = \\.COND_(LEN_)?SUB"
      
      All other FAILs are dump failures that can be ignored (confirmed that ARM SVE also has such FAILs and they were not fixed in either the tests or the implementation).

      Now it's time to enable full coverage vect tests, including vec_unpack, vec_pack, vec_interleave, etc.
      
      To see what we are still missing:
      
      Before this patch:
      
                      === gcc Summary ===
      
      # of expected passes            182839
      # of unexpected failures        79
      # of unexpected successes       11
      # of expected failures          1275
      # of unresolved testcases       4
      # of unsupported tests          4223
      
      After this patch:
      
                      === gcc Summary ===
      
      # of expected passes            183411
      # of unexpected failures        93
      # of unexpected successes       7
      # of expected failures          1285
      # of unresolved testcases       4
      # of unsupported tests          4157
      
      I have noticed one important new issue after this patch:
      
      FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
      FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts"
      FAIL: gcc.dg/vect/vect-gather-3.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
      FAIL: gcc.dg/vect/vect-gather-3.c scan-tree-dump vect "Loop contains only SLP stmts"
      
      It has a related PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111721
      
      I am going to fix this first in the middle-end after committing this patch.
      
      Ok for trunk ?
      
      gcc/testsuite/ChangeLog:
      
      	* lib/target-supports.exp: Add RVV.
      23aabded
    • liuhongt's avatar
      Refine predicate of operands[2] in divv4hf3 with register_operand. · 4efe9085
      liuhongt authored
      The expander emits the insn below:
      
      rtx tmp = gen_rtx_VEC_CONCAT (V4SFmode, operands[2],
      			force_reg (V2SFmode, CONST1_RTX (V2SFmode)));
      
      but *vec_concat<mode> only allows register_operand.
      
      gcc/ChangeLog:
      
      	PR target/111745
      	* config/i386/mmx.md (divv4hf3): Refine predicate of
      	operands[2] with register_operand.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/pr111745.c: New test.
      4efe9085
    • Juzhe-Zhong's avatar
      RISC-V Regression: Fix FAIL of vect-multitypes-16.c for RVV · cfe89942
      Juzhe-Zhong authored
      As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632288.html
      
      Add vect_ext_char_longlong to fix FAIL for RVV.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/vect/vect-multitypes-16.c: Adapt check for RVV.
      	* lib/target-supports.exp: Add vect_ext_char_longlong property.
      cfe89942
    • GCC Administrator's avatar
      Daily bump. · 69e3072c
      GCC Administrator authored
      69e3072c
  3. Oct 10, 2023
    • Andrew Waterman's avatar
      RISC-V: far-branch: Handle far jumps and branches for functions larger than 1MB · 71f90649
      Andrew Waterman authored
      
      On RISC-V, branches further than +/-1MB require a longer instruction
      sequence (3 instructions): we can reuse the jump-construction in the
      assembler (which clobbers $ra) and a temporary to set up the jump
      destination.
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv.cc (struct machine_function): Track if a
      	far-branch/jump is used within a function (and $ra needs to be
      	saved).
      	(riscv_print_operand): Implement 'N' (inverse integer branch).
      	(riscv_far_jump_used_p): Implement.
      	(riscv_save_return_addr_reg_p): New function.
      	(riscv_save_reg_p): Use riscv_save_return_addr_reg_p.
      	* config/riscv/riscv.h (FIXED_REGISTERS): Update $ra.
      	(CALL_USED_REGISTERS): Update $ra.
      	* config/riscv/riscv.md: Add new types "ret" and "jalr".
      	(length attribute): Handle long conditional and unconditional
      	branches.
      	(conditional branch pattern): Handle case where jump can not
      	reach the intended target.
      	(indirect_jump, tablejump): Use new "jalr" type.
      	(simple_return): Use new "ret" type.
      	(simple_return_internal, eh_return_internal): Likewise.
      	(gpr_restore_return, riscv_mret): Likewise.
      	(riscv_uret, riscv_sret): Likewise.
      	* config/riscv/generic.md (generic_branch): Also recognize jalr & ret
      	types.
      	* config/riscv/sifive-7.md (sifive_7_jump): Likewise.
      
      Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
      Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
      71f90649
    • Jason Merrill's avatar
      c++: mangle multiple levels of template parms [PR109422] · bd5719bd
      Jason Merrill authored
      This becomes more important with concepts, but can also be seen with
      generic lambdas.
      
      	PR c++/109422
      
      gcc/cp/ChangeLog:
      
      	* mangle.cc (write_template_param): Also mangle level.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/lambda-generic-mangle1.C: New test.
      	* g++.dg/cpp2a/lambda-generic-mangle1a.C: New test.
      bd5719bd
    • Andrew Pinski's avatar
      MATCH: [PR111679] Add alternative simplification of `a | ((~a) ^ b)` · 975da6fa
      Andrew Pinski authored
      Currently we have a simplification for `a | ~(a ^ b)`, but it does not
      match the case where the original was `(~a) | (a ^ b)`. Add a new pattern
      that matches that form and uses bitwise_inverted_equal_p, which also
      catches comparisons.
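
      As a sanity check on the pattern named in the subject line, the following C sketch compares `a | ((~a) ^ b)` against `a | ~b` over all 8-bit inputs. The message does not spell out the simplified form; `a | ~b` is assumed here because it follows algebraically from `(~a) ^ b == ~(a ^ b)`. The function names are hypothetical and unrelated to match.pd.

```c
#include <stdint.h>

/* The form named in the subject line: a | ((~a) ^ b).  */
static uint8_t original_form(uint8_t a, uint8_t b)
{
    return (uint8_t)(a | ((uint8_t)~a ^ b));
}

/* Assumed simplified form: a | ~b.  */
static uint8_t simplified_form(uint8_t a, uint8_t b)
{
    return (uint8_t)(a | (uint8_t)~b);
}

/* Returns 1 if both forms agree for every pair of 8-bit values.  */
static int identity_holds(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            if (original_form((uint8_t)a, (uint8_t)b)
                != simplified_form((uint8_t)a, (uint8_t)b))
                return 0;
    return 1;
}
```

      Because the identity is bitwise, checking all 8-bit pairs covers every per-bit combination, so the result carries over to wider integer types.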
      
      OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
      
      	PR tree-optimization/111679
      
      gcc/ChangeLog:
      
      	* match.pd (`a | ((~a) ^ b)`): New pattern.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/bitops-5.c: New test.
      975da6fa
    • Juzhe-Zhong's avatar
      RISC-V Regression: Make match patterns more accurate · 5bb6a876
      Juzhe-Zhong authored
      This patch fixes the following 2 FAILs in the RVV regression run, which occur because the check is not accurate.
      
      It's inspired by Robin's previous patch:
      https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac11b@gmail.com/
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/vect/no-scevccp-outer-7.c: Adjust regex pattern.
      	* gcc.dg/vect/no-scevccp-vect-iv-3.c: Ditto.
      5bb6a876