  1. Apr 20, 2023
    • Arsen Arsenović's avatar
      update_web_docs_git: Allow setting TEXI2*, add git build default · fa3a5663
      Arsen Arsenović authored
      maintainer-scripts/ChangeLog:
      
      	* update_web_docs_git: Add a mechanism to override makeinfo,
      	texi2dvi and texi2pdf, and default them to
      	/home/gccadmin/texinfo/install-git/bin/${tool}, if present.
      fa3a5663
    • Patrick Palka's avatar
      c++: simplify TEMPLATE_TYPE_PARM level lowering · afc7e20e
      Patrick Palka authored
      1. Don't bother recursing when level lowering a cv-qualified type
         template parameter.
      2. Get rid of the recursive loop breaker when level lowering a
         constrained auto, and enable the TEMPLATE_PARM_DESCENDANTS cache in
         this case too.  This should be safe to do so now that we no longer
         substitute constraints on an auto.
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (tsubst) <case TEMPLATE_TYPE_PARM>: Don't recurse when
      	level lowering a cv-qualified type template parameter.  Remove
      	recursive loop breaker in the level lowering case for constrained
      	autos.  Use the TEMPLATE_PARM_DESCENDANTS cache in this case as
      	well.
      afc7e20e
    • Patrick Palka's avatar
      c++: use TREE_VEC for trailing args of variadic built-in traits · 76fa66ea
      Patrick Palka authored
      This patch makes us use TREE_VEC instead of TREE_LIST to represent the
      trailing arguments of a variadic built-in trait.  These built-ins are
      typically passed a simple pack expansion as the second argument, e.g.
      
         __is_constructible(T, Ts...)
      
      and the main benefit of this representation change is that substituting
      into this argument list is now basically free since tsubst_template_args
      makes sure we reuse the TREE_VEC of the corresponding ARGUMENT_PACK when
      expanding such a pack expansion.  In the previous TREE_LIST representation
      we would need to convert the expanded pack expansion into a TREE_LIST
      (via tsubst_tree_list).
      
      Note that an empty set of trailing arguments is now represented as an
      empty TREE_VEC instead of NULL_TREE, so now TRAIT_TYPE/EXPR_TYPE2 will
      be empty only for unary traits.
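For readers unfamiliar with the user-visible side of this, here is a minimal sketch of a variadic built-in trait receiving a simple pack expansion as its trailing arguments. The wrapper name can_construct is hypothetical and not part of the patch; __is_constructible is a compiler extension, and std::is_constructible is the portable spelling.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper (not from the patch): forwards a type and a
// parameter pack to the variadic built-in trait.  The trailing
// arguments "Ts..." are the simple pack expansion the commit message
// refers to; an empty pack leaves the trait with no trailing arguments.
template <typename T, typename... Ts>
constexpr bool can_construct() {
    return __is_constructible(T, Ts...);
}
```

Instantiating can_construct<std::string, const char*>() makes the compiler substitute into the trailing argument list, which is the operation the representation change is meant to make cheap.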
      
      gcc/cp/ChangeLog:
      
      	* constraint.cc (diagnose_trait_expr): Convert a TREE_VEC
      	of arguments into a TREE_LIST for sake of pretty printing.
      	* cxx-pretty-print.cc (pp_cxx_trait): Handle TREE_VEC
      	instead of TREE_LIST of trailing variadic trait arguments.
      	* method.cc (constructible_expr): Likewise.
      	(is_xible_helper): Likewise.
      	* parser.cc (cp_parser_trait): Represent trailing variadic trait
      	arguments as a TREE_VEC instead of TREE_LIST.
      	* pt.cc (value_dependent_expression_p): Handle TREE_VEC
      	instead of TREE_LIST of trailing variadic trait arguments.
      	* semantics.cc (finish_type_pack_element): Likewise.
      	(check_trait_type): Likewise.
      76fa66ea
    • Patrick Palka's avatar
      c++: make strip_typedefs generalize strip_typedefs_expr · d180a552
      Patrick Palka authored
      Currently if we have a TREE_VEC of types that we want to strip of typedefs,
      we unintuitively need to call strip_typedefs_expr instead of strip_typedefs
      since only strip_typedefs_expr handles TREE_VEC, and it also dispatches
      to strip_typedefs when given a type.  But this seems backwards: arguably
      strip_typedefs_expr should be the more specialized function, which
      strip_typedefs dispatches to (and thus generalizes).
      
      So this patch makes strip_typedefs subsume strip_typedefs_expr rather
      than vice versa, which allows for some simplifications.
      
      gcc/cp/ChangeLog:
      
      	* tree.cc (strip_typedefs): Move TREE_LIST handling to
      	strip_typedefs_expr.  Dispatch to strip_typedefs_expr for
      	non-type 't'.
      	<case TYPENAME_TYPE>: Remove manual dispatching to
      	strip_typedefs_expr.
      	<case TRAIT_TYPE>: Likewise.
      	(strip_typedefs_expr): Replaces calls to strip_typedefs_expr
      	with strip_typedefs throughout.  Don't dispatch to strip_typedefs
      	for type 't'.
      	<case TREE_LIST>: Replace this with the better version from
      	strip_typedefs.
      d180a552
    • Alejandro Colomar's avatar
      doc: Remove repeated word (typo) · d4e8523b
      Alejandro Colomar authored
      
      gcc/ChangeLog:
      
      	* doc/extend.texi (Common Function Attributes): Remove duplicate
      	word.
      
      Signed-off-by: Alejandro Colomar <alx@kernel.org>
      d4e8523b
    • Andrew MacLeod's avatar
      Do not ignore UNDEFINED ranges when determining PHI equivalences. · 17aa9ddb
      Andrew MacLeod authored
      Do not ignore UNDEFINED name arguments when registering two-way equivalences
      from PHIs.
      
      	PR tree-optimization/109564
      	gcc/
      	* gimple-range-fold.cc (fold_using_range::range_of_phi): Do not ignore
      	UNDEFINED range names when deciding if all PHI arguments are the same.
      
      	gcc/testsuite/
      	* gcc.dg/torture/pr109564-1.c: New testcase.
      	* gcc.dg/torture/pr109564-2.c: Likewise.
      	* gcc.dg/tree-ssa/evrp-ignore.c: XFAIL.
      	* gcc.dg/tree-ssa/vrp06.c: Likewise.
      17aa9ddb
    • Jakub Jelinek's avatar
      tree-vect-patterns: One small vect_recog_ctz_ffs_pattern tweak [PR109011] · 87c9bae4
      Jakub Jelinek authored
      I've noticed I made a typo: this late in this function, ifn
      is always only IFN_CTZ or IFN_FFS, never IFN_CLZ.
      
      Due to this typo, we weren't using the originally intended
      .CTZ (X) = .POPCOUNT ((X - 1) & ~X)
      but
      .CTZ (X) = PREC - .POPCOUNT (X | -X)
      instead when we want to emit __builtin_ctz*/.CTZ using .POPCOUNT.
      Both compute the same value, both are defined at 0 with the
      same value (PREC), both have the same number of GIMPLE statements,
      but I think the former ought to be preferred, because lots of targets
      have andn as a single operation rather than two, and also putting
      a -1 constant into a vector register is often cheaper than a vector
      with a broadcast PREC power of two value.
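The equivalence of the two expansions can be checked with a small scalar sketch. This is only an illustration assuming a 32-bit precision; the function names and the portable popcount loop are made up for the example and are not the vectorizer's code.

```cpp
#include <cassert>
#include <cstdint>

constexpr int PREC = 32;  // assumed precision for this sketch

// Portable population count: clears the lowest set bit each iteration.
inline int popcount32(std::uint32_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;
        ++n;
    }
    return n;
}

// .CTZ (X) = .POPCOUNT ((X - 1) & ~X)
inline int ctz_via_andn(std::uint32_t x) {
    return popcount32((x - 1u) & ~x);
}

// .CTZ (X) = PREC - .POPCOUNT (X | -X)
inline int ctz_via_or_neg(std::uint32_t x) {
    return PREC - popcount32(x | (0u - x));
}
```

Both functions return PREC for a zero input and agree for every non-zero input; they differ only in which operations and constants they need.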
      
      2023-04-20  Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/109011
      	* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern): Use
      	.CTZ (X) = .POPCOUNT ((X - 1) & ~X) in preference to
      	.CTZ (X) = PREC - .POPCOUNT (X | -X).
      87c9bae4
    • Jakub Jelinek's avatar
      c: Avoid -Wenum-int-mismatch warning for redeclaration of builtin acc_on_device [PR107041] · 3d7ab53d
      Jakub Jelinek authored
      The new -Wenum-int-mismatch warning triggers with -Wsystem-headers in
      <openacc.h>.  For obvious reasons, the builtin acc_on_device uses an int
      type argument rather than an enum, which isn't defined yet when the builtin
      is created, while the OpenACC spec requires it to have an acc_device_t
      enum argument.  The header makes sure it has int underlying type by using
      negative and __INT_MAX__ enumerators.
      
      I've tried to make the builtin typegeneric or just varargs, but that
      changes behavior e.g. when one calls it with some C++ class which has
      cast operator to acc_device_t, so the following patch instead disables
      the warning for this builtin.
      
      2023-04-20  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/107041
      	* c-decl.cc (diagnose_mismatched_decls): Avoid -Wenum-int-mismatch
      	warning on acc_on_device declaration.
      
      	* gcc.dg/goacc/pr107041.c: New test.
      3d7ab53d
    • Vladimir N. Makarov's avatar
      [LRA]: Exclude some hard regs for multi-reg inout reload pseudos used in asm in different mode · 1d2aa9a8
      Vladimir N. Makarov authored
      See gcc.c-torture/execute/20030222-1.c.  Consider the code for 32-bit (e.g. BE) target:
        int i, v; long x; x = v; asm ("" : "=r" (i) : "0" (x));
      We generate the following RTL with reload insns:
        1. subreg:si(x:di, 0) = 0;
        2. subreg:si(x:di, 4) = v:si;
        3. t:di = x:di, dead x;
        4. asm ("" : "=r" (subreg:si(t:di,4)) : "0" (t:di))
        5. i:si = subreg:si(t:di,4);
      If we assign the hard reg of x to t, dead code elimination will remove insn #2
      and we will use an uninitialized hard reg.  So exclude the hard reg of x for t.
      We could ignore this problem for a non-empty asm using the whole x value, but it is hard to
      check that the asm is expanded into insns really using x and setting r.
      The old reload pass used the same approach.
      
      gcc/ChangeLog
      
      	* lra-constraints.cc (match_reload): Exclude some hard regs for
      	multi-reg inout reload pseudos used in asm in different mode.
      1d2aa9a8
    • Uros Bizjak's avatar
      arch: Use VIRTUAL_REGISTER_P predicate. · cae48a9d
      Uros Bizjak authored
      gcc/ChangeLog:
      
      	* config/arm/arm.cc (thumb1_legitimate_address_p):
      	Use VIRTUAL_REGISTER_P predicate.
      	(arm_eliminable_register): Ditto.
      	* config/avr/avr.md (push<mode>_1): Ditto.
      	* config/bfin/predicates.md (register_no_elim_operand): Ditto.
      	* config/h8300/predicates.md (register_no_sp_elim_operand): Ditto.
      	* config/i386/predicates.md (register_no_elim_operand): Ditto.
      	* config/iq2000/predicates.md (call_insn_operand): Ditto.
      	* config/microblaze/microblaze.h (CALL_INSN_OP): Ditto.
      cae48a9d
    • Uros Bizjak's avatar
      i386: Handle sign-extract for QImode operations with high registers [PR78952] · 272484da
      Uros Bizjak authored
      Introduce extract_operator predicate to handle both zero-extract and
      sign-extract operations with expressions like:
      
          (subreg:QI
            (zero_extract:SWI248
              (match_operand 1 "int248_register_operand" "0")
      	(const_int 8)
      	(const_int 8)) 0)
      
      As shown in the testcase, this will enable generation of QImode
      instructions with high registers when signed arguments are used.
      
      gcc/ChangeLog:
      
      	PR target/78952
      	* config/i386/predicates.md (extract_operator): New predicate.
      	* config/i386/i386.md (any_extract): Remove code iterator.
      	(*cmpqi_ext<mode>_1_mem_rex64): Use extract_operator predicate.
      	(*cmpqi_ext<mode>_1): Ditto.
      	(*cmpqi_ext<mode>_2): Ditto.
      	(*cmpqi_ext<mode>_3_mem_rex64): Ditto.
      	(*cmpqi_ext<mode>_3): Ditto.
      	(*cmpqi_ext<mode>_4): Ditto.
      	(*extzvqi_mem_rex64): Ditto.
      	(*extzvqi): Ditto.
      	(*insvqi_2): Ditto.
      	(*extendqi<SWI24:mode>_ext_1): Ditto.
      	(*addqi_ext<mode>_0): Ditto.
      	(*addqi_ext<mode>_1): Ditto.
      	(*addqi_ext<mode>_2): Ditto.
      	(*subqi_ext<mode>_0): Ditto.
      	(*subqi_ext<mode>_2): Ditto.
      	(*testqi_ext<mode>_1): Ditto.
      	(*testqi_ext<mode>_2): Ditto.
      	(*andqi_ext<mode>_0): Ditto.
      	(*andqi_ext<mode>_1): Ditto.
      	(*andqi_ext<mode>_1_cc): Ditto.
      	(*andqi_ext<mode>_2): Ditto.
      	(*<any_or:code>qi_ext<mode>_0): Ditto.
      	(*<any_or:code>qi_ext<mode>_1): Ditto.
      	(*<any_or:code>qi_ext<mode>_2): Ditto.
      	(*xorqi_ext<mode>_1_cc): Ditto.
      	(*negqi_ext<mode>_2): Ditto.
      	(*ashlqi_ext<mode>_2): Ditto.
      	(*<any_shiftrt:insn>qi_ext<mode>_2): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/78952
      	* gcc.target/i386/pr78952-4.c: New test.
      272484da
    • Raphael Zinsly's avatar
      [PR target/108248] [RISC-V] Break down some bitmanip insn types · 07e2576d
      Raphael Zinsly authored
      This is primarily Raphael's work.  All I did was adjust it to apply to the
      trunk and add the new types to generic.md's scheduling model.
      
      The basic idea here is to make sure we have the ability to schedule the
      bitmanip instructions with a finer degree of control.  Some of the bitmanip
      instructions are likely to have differing scheduler characteristics across
      different implementations.
      
      So rather than assign these instructions a generic "bitmanip" type, this
      patch assigns them a type based on their RTL code by using the <bitmanip_insn>
      iterator for the type.  Naturally we have to add a few new types.  It affects
      clz, ctz, cpop, min, max.
      
      We didn't do this for things like shNadd, single bit manipulation, etc. We
      certainly could if the need presents itself.
      
      I threw all the new types into the generic_alu bucket in the generic
      scheduling model.  Seems as good a place as any. Someone who knows the
      sifive uarch should probably add these types (and bitmanip) to the sifive
      scheduling model.
      
      We also noticed that the recently added orc.b didn't have a type at all.
      So we added it as a generic bitmanip type.
      
      This has been bootstrapped in a gcc-12 base and I've built and run the
      testsuite without regressions on the trunk.
      
      Given it was primarily Raphael's work I could probably approve & commit it.
      But I'd like to give the other RISC-V folks a chance to chime in.
      
      	PR target/108248
      gcc/
      	* config/riscv/bitmanip.md (clz, ctz, pcnt, min, max patterns): Use
      	<bitmanip_insn> as the type to allow for fine grained control of
      	scheduling these insns.
      	* config/riscv/generic.md (generic_alu): Add bitmanip, clz, ctz, pcnt,
      	min, max.
      	* config/riscv/riscv.md (type attribute): Add types for clz, ctz,
      	pcnt, signed and unsigned min/max.
      07e2576d
    • Juzhe-Zhong's avatar
      RISC-V: Fix RVV register order · 7b206ae7
      Juzhe-Zhong authored
      
      This patch fixes the incorrect register order of RVV.
      The new register order comes from Kito's original RVV GCC implementation.
      
      Consider this case:
      void f (void *base,void *base2,void *out,size_t vl, int n)
      {
          vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
          for (int i = 0; i < n; i++){
            vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
            vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl);
            vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
            vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl);
            vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl);
            __riscv_vse8_v_i8m1 (out + 100*i,v3,vl);
            __riscv_vse8_v_i8m1 (out + 222*i,v4,vl);
          }
      }
      
      Before this patch:
      f:
      	csrr    t0,vlenb
      	slli    t1,t0,3
      	sub     sp,sp,t1
      	addi    a5,a0,100
      	vsetvli zero,a3,e64,m8,ta,ma
      	vle64.v v24,0(a5)
      	vs8r.v  v24,0(sp)
      	ble     a4,zero,.L1
      	mv      a6,a0
      	add     a4,a4,a0
      	mv      a5,a2
      .L3:
      	vsetvli zero,zero,e64,m8,ta,ma
      	vl8re64.v       v24,0(sp)
      	vlm.v   v0,0(a6)
      	vluxei64.v      v24,(a0),v24,v0.t
      	addi    a6,a6,1
      	vsetvli zero,zero,e8,m1,tu,ma
      	vmv8r.v v16,v24
      	vluxei64.v      v8,(a0),v24,v0.t
      	vle64.v v16,0(a1)
      	vluxei64.v      v24,(a0),v16,v0.t
      	vse8.v  v8,0(a2)
      	vse8.v  v24,0(a5)
      	addi    a1,a1,1
      	addi    a2,a2,100
      	addi    a5,a5,222
      	bne     a4,a6,.L3
      .L1:
      	csrr    t0,vlenb
      	slli    t1,t0,3
      	add     sp,sp,t1
      	jr      ra
      
      After this patch:
      f:
      	addi    a5,a0,100
      	vsetvli zero,a3,e64,m8,ta,ma
      	vle64.v v24,0(a5)
      	ble     a4,zero,.L1
      	mv      a6,a0
      	add     a4,a4,a0
      	mv      a5,a2
      .L3:
      	vsetvli zero,zero,e64,m8,ta,ma
      	vlm.v   v0,0(a6)
      	addi    a6,a6,1
      	vluxei64.v      v8,(a0),v24,v0.t
      	vsetvli zero,zero,e8,m1,tu,ma
      	vmv8r.v v16,v8
      	vluxei64.v      v2,(a0),v8,v0.t
      	vle64.v v16,0(a1)
      	vluxei64.v      v1,(a0),v16,v0.t
      	vse8.v  v2,0(a2)
      	vse8.v  v1,0(a5)
      	addi    a1,a1,1
      	addi    a2,a2,100
      	addi    a5,a5,222
      	bne     a4,a6,.L3
      .L1:
      	ret
      
      The redundant register spills are eliminated.
      However, there is one more issue that needs to be addressed: the redundant
      move instruction "vmv8r.v". This is another story, and it will be fixed by another
      patch (Fine tune RVV machine description RA constraint).
      
      gcc/ChangeLog:
      
      	* config/riscv/riscv.h (enum reg_class): Fix RVV register order.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/rvv/base/spill-4.c: Adapt testcase.
      	* gcc.target/riscv/rvv/base/spill-6.c: Adapt testcase.
      	* gcc.target/riscv/rvv/base/reg_order-1.c: New test.
      
      Signed-off-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
      Co-authored-by: kito-cheng <kito.cheng@sifive.com>
      7b206ae7
    • Kito Cheng's avatar
      RISC-V: Fix riscv/arch-19.c with different ISA spec version · 9fde76a3
      Kito Cheng authored
      In newer ISA specs, F implies zicsr; add that to the -march option to
      prevent different test results with different default -misa-spec versions.
      
      gcc/testsuite/
      
      	* gcc.target/riscv/arch-19.c: Add -misa-spec.
      9fde76a3
    • Ju-Zhe Zhong's avatar
      RISC-V: Fix wrong check of register occurrences [PR109535] · a2d12abe
      Ju-Zhe Zhong authored
      
      count_occurrences will only count the same RTX (same code and same mode),
      but what we want to track is the occurrence of a register; a register
      might appear in the insn with a different mode or be contained in a SUBREG.
      
      Testcase coming from Kito.
      
      gcc/ChangeLog:
      
      	PR target/109535
      	* config/riscv/riscv-vsetvl.cc (count_regno_occurrences): New function.
      	(pass_vsetvl::cleanup_insns): Fix bug.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/109535
      	* g++.target/riscv/rvv/base/pr109535.C: New test.
      	* gcc.target/riscv/rvv/base/pr109535.c: New test.
      
      Signed-off-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
      Co-authored-by: kito-cheng <kito.cheng@sifive.com>
      a2d12abe
    • Kito Cheng's avatar
      RISC-V: Fix simplify_ior_optimization.c on rv32 · 98ebdda3
      Kito Cheng authored
      GCC will complain if the target ABI doesn't have a corresponding multi-lib on
      a glibc toolchain; use stdint-gcc.h to suppress that.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/simplify_ior_optimization.c: Use stdint-gcc.h
      	rather than stdint.h
      98ebdda3
    • Andrew Stubbs's avatar
      amdgcn: bug fix ldexp insn · 0be4fbea
      Andrew Stubbs authored
      The vop3 instructions don't support B constraint immediates.
      Also, use the SV_FP iterator to delete a redundant pattern.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn-valu.md (vnsi, VnSI): Add scalar modes.
      	(ldexp<mode>3): Delete.
      	(ldexp<mode>3<exec>): Change "B" to "A".
      0be4fbea
    • Andrew Stubbs's avatar
      amdgcn: update target-supports.exp · 09751f52
      Andrew Stubbs authored
      The backend can now vectorize more things.
      
      gcc/testsuite/ChangeLog:
      
      	* lib/target-supports.exp
      	(check_effective_target_vect_call_copysignf): Add amdgcn.
      	(check_effective_target_vect_call_sqrtf): Add amdgcn.
      	(check_effective_target_vect_call_ceilf): Add amdgcn.
      	(check_effective_target_vect_call_floor): Add amdgcn.
      	(check_effective_target_vect_logical_reduc): Add amdgcn.
      09751f52
    • Jakub Jelinek's avatar
      tree: Add 3+ argument fndecl_built_in_p · 1edcb2ea
      Jakub Jelinek authored
      On Wed, Feb 22, 2023 at 09:52:06AM +0000, Richard Biener wrote:
      > > The following testcase ICEs because we still have some spots that
      > > treat BUILT_IN_UNREACHABLE specially but not BUILT_IN_UNREACHABLE_TRAP
      > > the same.
      
      This patch uses (fndecl_built_in_p (node, BUILT_IN_UNREACHABLE)
                       || fndecl_built_in_p (node, BUILT_IN_UNREACHABLE_TRAP))
      a lot and from grepping around, we do something like that in lots of
      other places, or in some spots instead as
      (fndecl_built_in_p (node, BUILT_IN_NORMAL)
       && (DECL_FUNCTION_CODE (node) == BUILT_IN_WHATEVER1
           || DECL_FUNCTION_CODE (node) == BUILT_IN_WHATEVER2))
      The following patch adds an overload for this case, so we can write
      it in a shorter way, using C++11 argument packs so that it supports
      as many codes as one needs.
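The shape of such an overload can be sketched outside of GCC as follows. Everything here is a hypothetical stand-in: builtin_code, fn_decl, code_matches and is_builtin_p are invented for the illustration and are not the declarations added to tree.h.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's built_in_function enum and a
// function declaration node.
enum builtin_code { BI_NONE, BI_UNREACHABLE, BI_UNREACHABLE_TRAP, BI_TRAP };

struct fn_decl {
    bool is_normal_builtin;  // stands in for the BUILT_IN_NORMAL class check
    builtin_code code;       // stands in for DECL_FUNCTION_CODE
};

// Base case: compare against a single code.
inline bool code_matches(builtin_code have, builtin_code want) {
    return have == want;
}

// Recursive case: compare against the first code, then the remaining ones.
template <typename... Rest>
inline bool code_matches(builtin_code have, builtin_code want,
                         builtin_code next, Rest... rest) {
    return have == want || code_matches(have, next, rest...);
}

// Accepts one or more codes via a C++11 argument pack.
template <typename... Codes>
inline bool is_builtin_p(const fn_decl &node, builtin_code first,
                         Codes... rest) {
    return node.is_normal_builtin && code_matches(node.code, first, rest...);
}
```

With this shape, a check such as is_builtin_p (node, BI_UNREACHABLE, BI_UNREACHABLE_TRAP) replaces the two-call disjunction quoted above.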
      
      2023-04-20  Jakub Jelinek  <jakub@redhat.com>
      	    Jonathan Wakely  <jwakely@redhat.com>
      
      	* tree.h (built_in_function_equal_p): New helper function.
      	(fndecl_built_in_p): Turn into variadic template to support
      	1 or more built_in_function arguments.
      	* builtins.cc (fold_builtin_expect): Use 3 argument fndecl_built_in_p.
      	* gimplify.cc (goa_stabilize_expr): Likewise.
      	* cgraphclones.cc (cgraph_node::create_clone): Likewise.
      	* ipa-fnsummary.cc (compute_fn_summary): Likewise.
      	* omp-low.cc (setjmp_or_longjmp_p): Likewise.
      	* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee,
      	cgraph_update_edges_for_call_stmt_node,
      	cgraph_edge::verify_corresponds_to_fndecl,
      	cgraph_node::verify_node): Likewise.
      	* tree-stdarg.cc (optimize_va_list_gpr_fpr_size): Likewise.
      	* gimple-ssa-warn-access.cc (matching_alloc_calls_p): Likewise.
      	* ipa-prop.cc (try_make_edge_direct_virtual_call): Likewise.
      1edcb2ea
    • Jakub Jelinek's avatar
      tree-vect-patterns: Pattern recognize ctz or ffs using clz, popcount or ctz [PR109011] · 705b0d2b
      Jakub Jelinek authored
      The following patch allows vectorizing __builtin_ffs*/.FFS even if
      we just have vector .CTZ support, or __builtin_ffs*/.FFS/__builtin_ctz*/.CTZ
      if we just have vector .CLZ or .POPCOUNT support.
      It uses various expansions from Hacker's Delight book as well as GCC's
      expansion, in particular:
      .CTZ (X) = PREC - .CLZ ((X - 1) & ~X)
      .CTZ (X) = .POPCOUNT ((X - 1) & ~X)
      .CTZ (X) = (PREC - 1) - .CLZ (X & -X)
      .FFS (X) = PREC - .CLZ (X & -X)
      .CTZ (X) = PREC - .POPCOUNT (X | -X)
      .FFS (X) = (PREC + 1) - .POPCOUNT (X | -X)
      .FFS (X) = .CTZ (X) + 1
      where the first one can be only used if both CTZ and CLZ have value
      defined at zero (kind 2) and both have value of PREC there.
      If the original has value defined at zero and the latter doesn't
      for other forms or if it doesn't have matching value for that case,
      a COND_EXPR is added for that afterwards.
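A scalar sketch of three of the CLZ-based expansions listed above, assuming a 32-bit precision and a reference clz that is defined at zero with value PREC. The names kPrec, clz_ref and the *_from_clz helpers are made up for the illustration; the explicit select on zero only mimics the role the COND_EXPR plays and is not the vectorizer's code.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kPrec = 32;  // PREC in the formulas above (assumed 32 here)

// Reference count-leading-zeros, defined at zero with value kPrec.
inline int clz_ref(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// .CTZ (X) = PREC - .CLZ ((X - 1) & ~X); yields kPrec for X == 0.
inline int ctz_from_clz_andn(std::uint32_t x) {
    return kPrec - clz_ref((x - 1u) & ~x);
}

// .CTZ (X) = (PREC - 1) - .CLZ (X & -X); the raw formula gives -1 at
// zero, so a select supplies the wanted value there.
inline int ctz_from_clz_isolate(std::uint32_t x, int value_at_zero) {
    return x == 0 ? value_at_zero : (kPrec - 1) - clz_ref(x & (0u - x));
}

// .FFS (X) = PREC - .CLZ (X & -X); yields 0 for X == 0 without a fix-up.
inline int ffs_from_clz(std::uint32_t x) {
    return kPrec - clz_ref(x & (0u - x));
}
```

For X = 40 (binary 101000) the two ctz variants return 3 and ffs_from_clz returns 4.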
      
      The patch also modifies vect_recog_popcount_clz_ctz_ffs_pattern
      such that the two can work together.
      
      2023-04-20  Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/109011
      	* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern): New function.
      	(vect_recog_popcount_clz_ctz_ffs_pattern): Move vect_pattern_detected
      	call later.  Don't punt for IFN_CTZ or IFN_FFS if it doesn't have
      	direct optab support, but has instead IFN_CLZ, IFN_POPCOUNT or
      	for IFN_FFS IFN_CTZ support, use vect_recog_ctz_ffs_pattern for that
      	case.
      	(vect_vect_recog_func_ptrs): Add ctz_ffs entry.
      
      	* gcc.dg/vect/pr109011-1.c: Remove -mpower9-vector from
      	dg-additional-options.
      	(baz, qux): Remove functions and corresponding dg-final.
      	* gcc.dg/vect/pr109011-2.c: New test.
      	* gcc.dg/vect/pr109011-3.c: New test.
      	* gcc.dg/vect/pr109011-4.c: New test.
      	* gcc.dg/vect/pr109011-5.c: New test.
      705b0d2b
    • Richard Biener's avatar
      Remove duplicate DFS walks from DF init · 974326fd
      Richard Biener authored
      The following removes unused CFG order computes from
      rest_of_handle_df_initialize.  The CFG orders are computed from df_analyze ().
      This also removes code duplication that would have to be kept in sync.
      
      	* df-core.cc (rest_of_handle_df_initialize): Remove
      	computation of df->postorder, df->postorder_inverted and
      	df->n_blocks.
      974326fd
    • Jakub Jelinek's avatar
      testsuite: Fix up g++.dg/ext/int128-8.C testcase [PR109560] · bd4a1a54
      Jakub Jelinek authored
      The testcase needs to be restricted to int128 effective targets,
      it expectedly fails on i386 and other 32-bit targets.
      
      2023-04-20  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/108099
      	PR testsuite/109560
      	* g++.dg/ext/int128-8.C: Require int128 effective target.
      bd4a1a54
    • Jiufu Guo's avatar
      PR testsuite/106879 FAIL: gcc.dg/vect/bb-slp-layout-19.c on powerpc64 · 57e7229a
      Jiufu Guo authored
      On P7, option -mno-allow-movmisalign is added during testing, which
      prevents SLP from happening in this case.
      
      Like PR65484 and PR87306, this patch uses vect_hw_misalign to guard
      the case on powerpc targets.
      
      gcc/testsuite/ChangeLog:
      
      	PR testsuite/106879
      	* gcc.dg/vect/bb-slp-layout-19.c: Modify to guard the check with
      	vect_hw_misalign on POWERs.
      57e7229a
    • Haochen Jiang's avatar
      i386: Share AES xmm intrin with VAES · 24a8acc1
      Haochen Jiang authored
      Currently in GCC, the 128 bit intrin for instruction vaes{enc,dec}{last,}
      is under AES ISA. Because there is no dependency between ISA set AES
      and VAES, the 128 bit intrin is not available when we use compiler flag
      -mvaes -mavx512vl and there is no other way to use that intrin. But it
      should be, according to the Intel SDM.
      
      Although VAES aims to be a VEX/EVEX promotion for AES, it is only part
      of it. Therefore, we share the AES xmm intrin with VAES.
      
      Also, since -mvaes indicates that we could use VEX encoding for ymm, we
      should imply AVX for VAES.
      
      gcc/ChangeLog:
      
      	* common/config/i386/i386-common.cc
      	(OPTION_MASK_ISA2_AVX_UNSET): Add OPTION_MASK_ISA2_VAES_UNSET.
      	(ix86_handle_option): Set AVX flag for VAES.
      	* config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins):
      	Add OPTION_MASK_ISA2_VAES_UNSET.
      	(def_builtin): Share builtin between AES and VAES.
      	* config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
      	Ditto.
      	* config/i386/i386.md (aes): New isa attribute.
      	* config/i386/sse.md (aesenc): Add pattern for VAES with xmm.
      	(aesenclast): Ditto.
      	(aesdec): Ditto.
      	(aesdeclast): Ditto.
      	* config/i386/vaesintrin.h: Remove redundant avx target push.
      	* config/i386/wmmintrin.h (_mm_aesdec_si128): Change to macro.
      	(_mm_aesdeclast_si128): Ditto.
      	(_mm_aesenc_si128): Ditto.
      	(_mm_aesenclast_si128): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx512fvl-vaes-1.c: Add VAES xmm test.
      	* gcc.target/i386/pr109117-1.c: Modify error message.
      24a8acc1
    • Hu, Lin1's avatar
      Add reduce_*_ep[i|u][8|16] series intrinsics · ca3bd377
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx2intrin.h
      	(_MM_REDUCE_OPERATOR_BASIC_EPI16): New macro.
      	(_MM_REDUCE_OPERATOR_MAX_MIN_EP16): Ditto.
      	(_MM256_REDUCE_OPERATOR_BASIC_EPI16): Ditto.
      	(_MM256_REDUCE_OPERATOR_MAX_MIN_EP16): Ditto.
      	(_MM_REDUCE_OPERATOR_BASIC_EPI8): Ditto.
      	(_MM_REDUCE_OPERATOR_MAX_MIN_EP8): Ditto.
      	(_MM256_REDUCE_OPERATOR_BASIC_EPI8): Ditto.
      	(_MM256_REDUCE_OPERATOR_MAX_MIN_EP8): Ditto.
      	(_mm_reduce_add_epi16): New intrinsics.
      	(_mm_reduce_mul_epi16): Ditto.
      	(_mm_reduce_and_epi16): Ditto.
      	(_mm_reduce_or_epi16): Ditto.
      	(_mm_reduce_max_epi16): Ditto.
      	(_mm_reduce_max_epu16): Ditto.
      	(_mm_reduce_min_epi16): Ditto.
      	(_mm_reduce_min_epu16): Ditto.
      	(_mm256_reduce_add_epi16): Ditto.
      	(_mm256_reduce_mul_epi16): Ditto.
      	(_mm256_reduce_and_epi16): Ditto.
      	(_mm256_reduce_or_epi16): Ditto.
      	(_mm256_reduce_max_epi16): Ditto.
      	(_mm256_reduce_max_epu16): Ditto.
      	(_mm256_reduce_min_epi16): Ditto.
      	(_mm256_reduce_min_epu16): Ditto.
      	(_mm_reduce_add_epi8): Ditto.
      	(_mm_reduce_mul_epi8): Ditto.
      	(_mm_reduce_and_epi8): Ditto.
      	(_mm_reduce_or_epi8): Ditto.
      	(_mm_reduce_max_epi8): Ditto.
      	(_mm_reduce_max_epu8): Ditto.
      	(_mm_reduce_min_epi8): Ditto.
      	(_mm_reduce_min_epu8): Ditto.
      	(_mm256_reduce_add_epi8): Ditto.
      	(_mm256_reduce_mul_epi8): Ditto.
      	(_mm256_reduce_and_epi8): Ditto.
      	(_mm256_reduce_or_epi8): Ditto.
      	(_mm256_reduce_max_epi8): Ditto.
      	(_mm256_reduce_max_epu8): Ditto.
      	(_mm256_reduce_min_epi8): Ditto.
      	(_mm256_reduce_min_epu8): Ditto.
      	* config/i386/avx512vlbwintrin.h:
      	(_mm_mask_reduce_add_epi16): Ditto.
      	(_mm_mask_reduce_mul_epi16): Ditto.
      	(_mm_mask_reduce_and_epi16): Ditto.
      	(_mm_mask_reduce_or_epi16): Ditto.
      	(_mm_mask_reduce_max_epi16): Ditto.
      	(_mm_mask_reduce_max_epu16): Ditto.
      	(_mm_mask_reduce_min_epi16): Ditto.
      	(_mm_mask_reduce_min_epu16): Ditto.
      	(_mm256_mask_reduce_add_epi16): Ditto.
      	(_mm256_mask_reduce_mul_epi16): Ditto.
      	(_mm256_mask_reduce_and_epi16): Ditto.
      	(_mm256_mask_reduce_or_epi16): Ditto.
      	(_mm256_mask_reduce_max_epi16): Ditto.
      	(_mm256_mask_reduce_max_epu16): Ditto.
      	(_mm256_mask_reduce_min_epi16): Ditto.
      	(_mm256_mask_reduce_min_epu16): Ditto.
      	(_mm_mask_reduce_add_epi8): Ditto.
      	(_mm_mask_reduce_mul_epi8): Ditto.
      	(_mm_mask_reduce_and_epi8): Ditto.
      	(_mm_mask_reduce_or_epi8): Ditto.
      	(_mm_mask_reduce_max_epi8): Ditto.
      	(_mm_mask_reduce_max_epu8): Ditto.
      	(_mm_mask_reduce_min_epi8): Ditto.
      	(_mm_mask_reduce_min_epu8): Ditto.
      	(_mm256_mask_reduce_add_epi8): Ditto.
      	(_mm256_mask_reduce_mul_epi8): Ditto.
      	(_mm256_mask_reduce_and_epi8): Ditto.
      	(_mm256_mask_reduce_or_epi8): Ditto.
      	(_mm256_mask_reduce_max_epi8): Ditto.
      	(_mm256_mask_reduce_max_epu8): Ditto.
      	(_mm256_mask_reduce_min_epi8): Ditto.
      	(_mm256_mask_reduce_min_epu8): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx512vlbw-reduce-op-1.c: New test.
      ca3bd377
    • Haochen Jiang's avatar
      i386: Add PCLMUL dependency for VPCLMULQDQ · 4246611d
      Haochen Jiang authored
      Currently in GCC, the 128 bit intrin for instruction vpclmulqdq is
      under PCLMUL ISA. Because there is no dependency between ISA set PCLMUL
      and VPCLMULQDQ, the 128 bit intrin is not available when we just use
      compiler flag -mvpclmulqdq. But it should be, according to the Intel SDM.
      
      Since VPCLMULQDQ is a VEX/EVEX promotion for PCLMUL, it is natural to
      add dependency between them.
      
      Also, with -mvpclmulqdq, we can use ymm under VEX encoding, so
      VPCLMULQDQ should imply AVX.
      
      gcc/ChangeLog:
      
      	* common/config/i386/i386-common.cc
      	(OPTION_MASK_ISA_VPCLMULQDQ_SET):
      	Add OPTION_MASK_ISA_PCLMUL_SET and OPTION_MASK_ISA_AVX_SET.
      	(OPTION_MASK_ISA_AVX_UNSET):
      	Add OPTION_MASK_ISA_VPCLMULQDQ_UNSET.
      	(OPTION_MASK_ISA_PCLMUL_UNSET): Ditto.
      	* config/i386/i386.md (vpclmulqdqvl): New.
      	* config/i386/sse.md (pclmulqdq): Add evex encoding.
      	* config/i386/vpclmulqdqintrin.h: Remove redundant avx target
      	push.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/vpclmulqdq.c: Add compile test for xmm.
      4246611d
    • Haochen Jiang's avatar
      i386: Fix vpblendm{b,w} intrins and insns · e8571019
      Haochen Jiang authored
      The vpblendm{b,w} intrins do not actually take constant parameters,
      so there is no need for them to be wrapped in __OPTIMIZE__.
      
      Also, we should check TARGET_AVX512VL for 128/256 bit vectors.
      
      gcc/ChangeLog:
      
      	* config/i386/avx512vlbwintrin.h
      	(_mm_mask_blend_epi16): Remove __OPTIMIZE__ wrapper.
      	(_mm_mask_blend_epi8): Ditto.
      	(_mm256_mask_blend_epi16): Ditto.
      	(_mm256_mask_blend_epi8): Ditto.
      	* config/i386/avx512vlintrin.h
      	(_mm256_mask_blend_pd): Ditto.
      	(_mm256_mask_blend_ps): Ditto.
      	(_mm256_mask_blend_epi64): Ditto.
      	(_mm256_mask_blend_epi32): Ditto.
      	(_mm_mask_blend_pd): Ditto.
      	(_mm_mask_blend_ps): Ditto.
      	(_mm_mask_blend_epi64): Ditto.
      	(_mm_mask_blend_epi32): Ditto.
      	* config/i386/sse.md (VF_AVX512BWHFBF16): Removed.
      	(VF_AVX512HFBFVL): Move it before the first usage.
      	(<avx512>_blendm<mode>): Change iterator from VF_AVX512BWHFBF16
      	to VF_AVX512HFBFVL.
      e8571019
    • Haochen Jiang's avatar
      i386: Add AVX512BW dependency to AVX512VBMI2 · 4fb12ae9
      Haochen Jiang authored
      gcc/ChangeLog:
      
      	* common/config/i386/i386-common.cc
      	(OPTION_MASK_ISA_AVX512VBMI2_SET): Change OPTION_MASK_ISA_AVX512F_SET
      	to OPTION_MASK_ISA_AVX512BW_SET.
      	(OPTION_MASK_ISA_AVX512F_UNSET):
      	Remove OPTION_MASK_ISA_AVX512VBMI2_UNSET.
      	(OPTION_MASK_ISA_AVX512BW_UNSET):
      	Add OPTION_MASK_ISA_AVX512VBMI2_UNSET.
      	* config/i386/avx512vbmi2intrin.h: Do not push avx512bw.
      	* config/i386/avx512vbmi2vlintrin.h: Ditto.
      	* config/i386/i386-builtin.def: Remove OPTION_MASK_ISA_AVX512BW.
      	* config/i386/sse.md (VI12_AVX512VLBW): Removed.
      	(VI12_VI48F_AVX512VLBW): Rename to VI12_VI48F_AVX512VL.
      	(compress<mode>_mask): Change iterator from VI12_AVX512VLBW to
      	VI12_AVX512VL.
      	(compressstore<mode>_mask): Ditto.
      	(expand<mode>_mask): Ditto.
      	(expand<mode>_maskz): Ditto.
      	(*expand<mode>_mask): Change iterator from VI12_VI48F_AVX512VLBW to
      	VI12_VI48F_AVX512VL.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx512bw-pr100267-1.c: Remove avx512f and avx512bw.
      	* gcc.target/i386/avx512bw-pr100267-b-2.c: Ditto.
      	* gcc.target/i386/avx512bw-pr100267-d-2.c: Ditto.
      	* gcc.target/i386/avx512bw-pr100267-q-2.c: Ditto.
      	* gcc.target/i386/avx512bw-pr100267-w-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpcompressb-1.c: Ditto.
      	* gcc.target/i386/avx512f-vpcompressb-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpcompressw-1.c: Ditto.
      	* gcc.target/i386/avx512f-vpcompressw-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpexpandb-1.c: Ditto.
      	* gcc.target/i386/avx512f-vpexpandb-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpexpandw-1.c: Ditto.
      	* gcc.target/i386/avx512f-vpexpandw-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshld-1.c: Ditto.
      	* gcc.target/i386/avx512f-vpshldd-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshldq-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshldv-1.c: Ditto.
      	* gcc.target/i386/avx512f-vpshldvd-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshldvq-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshldvw-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshrdd-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshrdq-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshrdv-1.c: Ditto.
      	* gcc.target/i386/avx512f-vpshrdvd-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshrdvq-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshrdvw-2.c: Ditto.
      	* gcc.target/i386/avx512f-vpshrdw-2.c: Ditto.
      	* gcc.target/i386/avx512vbmi2-vpshld-1.c: Ditto.
      	* gcc.target/i386/avx512vbmi2-vpshrd-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vpcompressb-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vpcompressb-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpcompressw-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpexpandb-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vpexpandb-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpexpandw-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vpexpandw-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshldd-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshldq-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshldv-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshldvd-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshldvq-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshldvw-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshrdd-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshrdq-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshrdv-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshrdvd-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshrdvq-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshrdvw-2.c: Ditto.
      	* gcc.target/i386/avx512vl-vpshrdw-2.c: Ditto.
      	* gcc.target/i386/avx512vlbw-pr100267-1.c: Ditto.
      	* gcc.target/i386/avx512vlbw-pr100267-b-2.c: Ditto.
      	* gcc.target/i386/avx512vlbw-pr100267-w-2.c: Ditto.
      4fb12ae9
    • Haochen Jiang's avatar
      i386: Add AVX512BW dependency to AVX512BITALG · d08b0559
      Haochen Jiang authored
      Since some of the AVX512BITALG intrins use 32/64 bit mask,
      AVX512BW should be implied.
      
      gcc/ChangeLog:
      
      	* common/config/i386/i386-common.cc
      	(OPTION_MASK_ISA_AVX512BITALG_SET):
      	Change OPTION_MASK_ISA_AVX512F_SET
      	to OPTION_MASK_ISA_AVX512BW_SET.
      	(OPTION_MASK_ISA_AVX512F_UNSET):
      	Remove OPTION_MASK_ISA_AVX512BITALG_SET.
      	(OPTION_MASK_ISA_AVX512BW_UNSET):
      	Add OPTION_MASK_ISA_AVX512BITALG_SET.
      	* config/i386/avx512bitalgintrin.h: Do not push avx512bw.
      	* config/i386/i386-builtin.def:
      	Remove redundant OPTION_MASK_ISA_AVX512BW.
      	* config/i386/sse.md (VI1_AVX512VLBW): Removed.
      	(avx512vl_vpshufbitqmb<mode><mask_scalar_merge_name>):
      	Change the iterator from VI1_AVX512VLBW to VI1_AVX512VL.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx512bitalg-vpopcntb-1.c:
      	Remove avx512bw.
      	* gcc.target/i386/avx512bitalg-vpopcntb.c: Ditto.
      	* gcc.target/i386/avx512bitalg-vpopcntbvl.c: Ditto.
      	* gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto.
      	* gcc.target/i386/avx512bitalg-vpopcntw.c: Ditto.
      	* gcc.target/i386/avx512bitalg-vpopcntwvl.c: Ditto.
      	* gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Ditto.
      	* gcc.target/i386/avx512bitalg-vpshufbitqmb.c: Ditto.
      	* gcc.target/i386/avx512bitalgvl-vpopcntb-1.c: Ditto.
      	* gcc.target/i386/avx512bitalgvl-vpopcntw-1.c: Ditto.
      	* gcc.target/i386/avx512bitalgvl-vpshufbitqmb-1.c: Ditto.
      	* gcc.target/i386/pr93696-1.c: Ditto.
      	* gcc.target/i386/pr93696-2.c: Ditto.
      d08b0559
    • Haochen Jiang's avatar
      i386: Use macro to wrap up share builtin exceptions in builtin isa check · 5ebdbdb9
      Haochen Jiang authored
      gcc/ChangeLog:
      
      	* config/i386/i386-expand.cc
      	(ix86_check_builtin_isa_match): Correct wrong comments.
      	Add a new macro SHARE_BUILTIN and refactor the current if
      	clauses to macro.
      5ebdbdb9
    • Mo, Zewei's avatar
      Re-arrange sections of i386 cpuid · fd7ecd80
      Mo, Zewei authored
      gcc/ChangeLog:
      
      	* config/i386/cpuid.h: Open a new section for Extended Features
      	Leaf (%eax == 7, %ecx == 0) and Extended Features Sub-leaf (%eax == 7,
      	%ecx == 1).
      fd7ecd80
    • Hu, Lin1's avatar
      Optimize vshuf{i,f}{32x4,64x2} ymm and vperm{i,f}128 ymm · c2dac2e5
      Hu, Lin1 authored
      vshuf{i,f}{32x4,64x2} ymm and vperm{i,f}128 ymm take 3 clocks.
      We can optimize them to vblend or vmovaps when there's no cross-lane
      movement.
      
      gcc/ChangeLog:
      
      	* config/i386/sse.md: Modify insn vperm{i,f}
      	and vshuf{i,f}.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx512vl-vshuff32x4-1.c: Modify test.
      	* gcc.target/i386/avx512vl-vshuff64x2-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vshufi32x4-1.c: Ditto.
      	* gcc.target/i386/avx512vl-vshufi64x2-1.c: Ditto.
      	* gcc.target/i386/opt-vperm-vshuf-1.c: New test.
      	* gcc.target/i386/opt-vperm-vshuf-2.c: Ditto.
      	* gcc.target/i386/opt-vperm-vshuf-3.c: Ditto.
      c2dac2e5
    • GCC Administrator's avatar
      Daily bump. · cf0d9dbc
      GCC Administrator authored
      cf0d9dbc
  2. Apr 19, 2023
    • Max Filippov's avatar
      gcc: xtensa: add -m[no-]strict-align option · 675b390e
      Max Filippov authored
      gcc/
      	* config/xtensa/xtensa-opts.h: New header.
      	* config/xtensa/xtensa.h (STRICT_ALIGNMENT): Redefine as
      	xtensa_strict_align.
      	* config/xtensa/xtensa.cc (xtensa_option_override): When
      	-m[no-]strict-align is not specified in the command line set
      	xtensa_strict_align to 0 if the hardware supports both unaligned
      	loads and stores or to 1 otherwise.
      	* config/xtensa/xtensa.opt (mstrict-align): New option.
      	* doc/invoke.texi (Xtensa Options): Document -m[no-]strict-align.
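
The default described for xtensa_option_override can be modelled in a few lines. This is a sketch under stated assumptions rather than the code from the patch: `StrictAlignOpt`, `HwAlignment` and `effective_strict_align` are hypothetical names, and the two booleans stand in for whatever the target configuration reports about hardware support for unaligned loads and stores.

```cpp
// Hypothetical model of how the command-line option was given.
enum class StrictAlignOpt { Unspecified, Off, On };

// Hypothetical view of the hardware's unaligned access support.
struct HwAlignment {
  bool unaligned_load_hw;
  bool unaligned_store_hw;
};

// An explicit -mstrict-align or -mno-strict-align wins. Otherwise strict
// alignment is off only if the hardware handles both unaligned loads
// and unaligned stores.
constexpr bool effective_strict_align(StrictAlignOpt opt, HwAlignment hw) {
  return opt == StrictAlignOpt::On    ? true
         : opt == StrictAlignOpt::Off ? false
                                      : !(hw.unaligned_load_hw &&
                                          hw.unaligned_store_hw);
}
```

Hardware that supports only one of the two unaligned access kinds therefore still defaults to strict alignment in this model.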
      675b390e
    • Max Filippov's avatar
      gcc: xtensa: add data alignment properties to dynconfig · ec9b3087
      Max Filippov authored
      gcc/
      	* config/xtensa/xtensa-dynconfig.cc (xtensa_get_config_v4): New
      	function.
      
      include/
      	* xtensa-dynconfig.h (xtensa_config_v4): New struct.
      	(XCHAL_DATA_WIDTH, XCHAL_UNALIGNED_LOAD_EXCEPTION)
      	(XCHAL_UNALIGNED_STORE_EXCEPTION, XCHAL_UNALIGNED_LOAD_HW)
      	(XCHAL_UNALIGNED_STORE_HW, XTENSA_CONFIG_V4_ENTRY_LIST): New
      	definitions.
      	(XTENSA_CONFIG_INSTANCE_LIST): Add xtensa_config_v4 instance.
      	(XTENSA_CONFIG_ENTRY_LIST): Add XTENSA_CONFIG_V4_ENTRY_LIST.
      ec9b3087
    • Patrick Palka's avatar
      c++: Define built-in for std::tuple_element [PR100157] · 58b7dbf8
      Patrick Palka authored
      
      This adds a new built-in to replace the recursive class template
      instantiations done by traits such as std::tuple_element and
      std::variant_alternative.  The purpose is to select the Nth type from a
      list of types, e.g. __type_pack_element<1, char, int, float> is int.
      We implement it as a special kind of TRAIT_TYPE.
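
As a rough illustration of the difference, the sketch below selects the Nth type either through the built-in, when the compiler reports it via `__has_builtin`, or through the kind of recursive instantiation the built-in is meant to replace. The names `nth_type`, `nth_type_impl` and `nth_type_t` are made up for this example and are not the libstdc++ implementation.

```cpp
#include <cstddef>
#include <type_traits>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__type_pack_element)
// One step: the compiler picks the Nth type directly.
template <std::size_t N, typename... Ts>
struct nth_type {
  using type = __type_pack_element<N, Ts...>;
};
#else
// Fallback: one class template instantiation per index step.
template <std::size_t N, typename T, typename... Ts>
struct nth_type_impl : nth_type_impl<N - 1, Ts...> {};

template <typename T, typename... Ts>
struct nth_type_impl<0, T, Ts...> {
  using type = T;
};

template <std::size_t N, typename... Ts>
struct nth_type : nth_type_impl<N, Ts...> {};
#endif

template <std::size_t N, typename... Ts>
using nth_type_t = typename nth_type<N, Ts...>::type;

// Matches the example in the text above.
static_assert(std::is_same<nth_type_t<1, char, int, float>, int>::value,
              "index 1 of <char, int, float> is int");
```

Both branches give the same result; only the amount of work the compiler does for large indices differs.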
      
      For a pathological example tuple_element_t<1000, tuple<2000 types...>>
      the compilation time is reduced by more than 90% and the memory used by
      the compiler is reduced by 97%.  In realistic examples the gains will be
      much smaller, but still relevant.
      
      Unlike the other built-in traits, __type_pack_element uses template-id
      syntax instead of call syntax and is SFINAE-enabled, matching Clang's
      implementation.  And like the other built-in traits, it's not mangleable
      so we can't use it directly in function signatures.
      
      N.B. Clang seems to implement __type_pack_element as a first-class
      template that can e.g. be used as a template-template argument.  For
      simplicity we implement it in a more ad-hoc way.
      
      Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
      
      	PR c++/100157
      
      gcc/cp/ChangeLog:
      
      	* cp-trait.def (TYPE_PACK_ELEMENT): Define.
      	* cp-tree.h (finish_trait_type): Add complain parameter.
      	* cxx-pretty-print.cc (pp_cxx_trait): Handle
      	CPTK_TYPE_PACK_ELEMENT.
      	* parser.cc (cp_parser_constant_expression): Document default
      	arguments.
      	(cp_parser_trait): Handle CPTK_TYPE_PACK_ELEMENT.  Pass
      	tf_warning_or_error to finish_trait_type.
      	* pt.cc (tsubst) <case TRAIT_TYPE>: Handle non-type first
      	argument.  Pass complain to finish_trait_type.
      	* semantics.cc (finish_type_pack_element): Define.
      	(finish_trait_type): Add complain parameter.  Handle
      	CPTK_TYPE_PACK_ELEMENT.
      	* tree.cc (strip_typedefs): Handle non-type first argument.
      	Pass tf_warning_or_error to finish_trait_type.
      	* typeck.cc (structural_comptypes) <case TRAIT_TYPE>: Use
      	cp_tree_equal instead of same_type_p for the first argument.
      
      libstdc++-v3/ChangeLog:
      
      	* include/bits/utility.h (_Nth_type): Conditionally define in
      	terms of __type_pack_element if available.
      	* testsuite/20_util/tuple/element_access/get_neg.cc: Prune
      	additional errors from the new built-in.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/ext/type_pack_element1.C: New test.
      	* g++.dg/ext/type_pack_element2.C: New test.
      	* g++.dg/ext/type_pack_element3.C: New test.
      58b7dbf8
    • Patrick Palka's avatar
      c++: bad ggc_free in try_class_unification [PR109556] · 5e284ebb
      Patrick Palka authored
      Aside from correcting how try_class_unification copies multi-dimensional
      'targs', r13-377-g3e948d645bc908 also made it ggc_free this copy as an
      optimization.  But this is wrong since the call to unify within might've
      captured the args in persistent memory such as the satisfaction cache
      (as part of constrained auto deduction).
      
      	PR c++/109556
      
      gcc/cp/ChangeLog:
      
      	* pt.cc (try_class_unification): Don't ggc_free the copy of
      	'targs'.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/concepts-placeholder13.C: New test.
      5e284ebb
    • Harald Anlauf's avatar
      testsuite: fix scan-tree-dump patterns [PR83904,PR100297] · 6fc8e25c
      Harald Anlauf authored
      Adjust scan-tree-dump patterns so that they do not accidentally match a
      valid path.
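
The problem can be sketched with plain substring counting. The dump excerpt below is invented for illustration (in particular the path containing "freebsd"), and `count_matches` and `sample_dump` are hypothetical helpers. The real directives use regular expressions, which this sketch does not model.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of a non-empty needle in text.
std::size_t count_matches(const std::string& text, const std::string& needle) {
  std::size_t count = 0;
  for (std::size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + needle.size())) {
    ++count;
  }
  return count;
}

// Hypothetical dump excerpt: a header line that mentions a source path,
// followed by the one call the test actually cares about.
std::string sample_dump() {
  return ";; Function foo (/home/user/freebsd/src/test.f90)\n"
         "  __builtin_free ((void *) tmp.data);\n";
}
```

With this input, the naive pattern "free" matches twice (once inside the path), while "__builtin_free " matches only the real call.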
      
      gcc/testsuite/ChangeLog:
      
      	PR testsuite/83904
      	PR fortran/100297
      	* gfortran.dg/allocatable_function_1.f90: Use "__builtin_free "
      	instead of the naive "free".
      	* gfortran.dg/reshape_8.f90: Extend pattern from a simple "data".
      6fc8e25c
    • Andrew Pinski's avatar
      i386: Add new pattern for zero-extend cmov · 04a9209d
      Andrew Pinski authored
      After a phiopt change, I got a failure of cmov9.c.
      The RTL IR has zero_extend on the outside of
      the if_then_else rather than on the side. Both
      ways are considered canonical as mentioned in
      PR 66588.
      
      This fixes the failure I got and also adds a testcase
      which fails before even my phiopt patch but will pass
      with this patch.
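
For orientation, source of roughly the following shape combines a 32-bit conditional select with a zero extension to 64 bits. It is not the testcase added by this commit, `select_zext` is a hypothetical name, and whether GCC emits a cmov for it depends on the target, the optimization level and the compiler version.

```cpp
#include <cstdint>

// Select one of two 32-bit values, then widen the result to 64 bits.
// The widening is a zero extension because the operands are unsigned.
std::uint64_t select_zext(bool cond, std::uint32_t a, std::uint32_t b) {
  return cond ? a : b;
}
```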
      
      OK? Bootstrapped and tested on x86_64-linux-gnu with
      no regressions.
      
      gcc/ChangeLog:
      
      	* config/i386/i386.md (*movsicc_noc_zext_1): New pattern.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/cmov10.c: New test.
      	* gcc.target/i386/cmov11.c: New test.
      04a9209d
    • Jason Merrill's avatar
      c++: fix 'unsigned __int128_t' semantics [PR108099] · ed32ec26
      Jason Merrill authored
      My earlier patch for 108099 made us accept this non-standard pattern but
      messed up the semantics, so that e.g. unsigned __int128_t was not a 128-bit
      type.
      
      	PR c++/108099
      
      gcc/cp/ChangeLog:
      
      	* decl.cc (grokdeclarator): Keep typedef_decl for __int128_t.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/ext/int128-8.C: New test.
      ed32ec26