  1. Aug 09, 2024
    • Will Hawkins's avatar
      btf: Protect BTF_KIND_INFO against invalid kind · d0bc1cbf
      Will Hawkins authored
      
      If the user provides a kind value that is more than 5 bits, the
      BTF_KIND_INFO macro would emit incorrect values for info (by clobbering
      values of the kind flag).
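      As a rough illustration (a sketch, not the actual contents of btf.h), the two functions below model an unmasked and a masked encoding of the info word. The bit layout (vlen in bits 0-15, kind in bits 24-28, kind flag in bit 31) follows the BTF type encoding; the function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical model of the BTF info word: vlen in bits 0-15, kind in
// bits 24-28, kind flag in bit 31.  These are not the macros from btf.h.
constexpr std::uint32_t
info_unmasked (std::uint32_t kind, std::uint32_t kflag, std::uint32_t vlen)
{
  // A kind wider than 5 bits spills into bits 29-31.
  return ((kflag ? 1u : 0u) << 31) | (kind << 24) | (vlen & 0xffffu);
}

constexpr std::uint32_t
info_masked (std::uint32_t kind, std::uint32_t kflag, std::uint32_t vlen)
{
  // Masking with 0x1f keeps only the 5 bits reserved for the kind.
  return ((kflag ? 1u : 0u) << 31) | ((kind & 0x1fu) << 24) | (vlen & 0xffffu);
}

// Kind 0x80 has bit 7 set, which lands on bit 31 (the kind flag).
static_assert (info_unmasked (0x80, 0, 0) == 0x80000000u, "kind flag clobbered");
static_assert (info_masked (0x80, 0, 0) == 0u, "kind flag left alone");
```

      For a valid kind the two encodings agree, so the mask only changes the result for out-of-range input.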
      
      Tested on x86_64-redhat-linux.
      
      include/ChangeLog:
      
      	* btf.h (BTF_TYPE_INFO): Protect against user providing invalid
      	kind.
      
      Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
      d0bc1cbf
    • Simon Martin's avatar
      c++: Don't accept multiple enum definitions within template class [PR115806] · 786ebbd6
      Simon Martin authored
      We have been accepting the following invalid code since revision 557831a9
      
      === cut here ===
      template <typename T> struct S {
        enum E { a };
        enum E { b };
      };
      S<int> s;
      === cut here ===
      
      The problem is that start_enum will set OPAQUE_ENUM_P to true even if it
      retrieves an existing definition for the enum, which causes the redefinition
      check in cp_parser_enum_specifier to be bypassed.
      
      This patch only sets OPAQUE_ENUM_P and ENUM_FIXED_UNDERLYING_TYPE_P when
      actually pushing a new tag for the enum.
      
      	PR c++/115806
      
      gcc/cp/ChangeLog:
      
      	* decl.cc (start_enum): Only set OPAQUE_ENUM_P and
      	ENUM_FIXED_UNDERLYING_TYPE_P when pushing a new tag.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/parse/enum15.C: New test.
      786ebbd6
    • Raphael Moreira Zinsly's avatar
      RISC-V: Enable stack clash in alloca · 180ede35
      Raphael Moreira Zinsly authored
      Add the TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE to riscv in
      order to enable stack clash protection when using alloca.
      The code and tests are the same as those used by aarch64.
      
      gcc/ChangeLog:
      	* config/riscv/riscv.cc (riscv_compute_frame_info): Update
      	outgoing args size.
      	(riscv_stack_clash_protection_alloca_probe_range): New.
      	(TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): New.
      	* config/riscv/riscv.h
      	(STACK_CLASH_MIN_BYTES_OUTGOING_ARGS): New.
      	(STACK_DYNAMIC_OFFSET): New.
      
      gcc/testsuite/ChangeLog:
      	* gcc.target/riscv/stack-check-14.c: New test.
      	* gcc.target/riscv/stack-check-15.c: New test.
      	* gcc.target/riscv/stack-check-alloca-1.c: New test.
      	* gcc.target/riscv/stack-check-alloca-2.c: New test.
      	* gcc.target/riscv/stack-check-alloca-3.c: New test.
      	* gcc.target/riscv/stack-check-alloca-4.c: New test.
      	* gcc.target/riscv/stack-check-alloca-5.c: New test.
      	* gcc.target/riscv/stack-check-alloca-6.c: New test.
      	* gcc.target/riscv/stack-check-alloca-7.c: New test.
      	* gcc.target/riscv/stack-check-alloca-8.c: New test.
      	* gcc.target/riscv/stack-check-alloca-9.c: New test.
      	* gcc.target/riscv/stack-check-alloca-10.c: New test.
      	* gcc.target/riscv/stack-check-alloca.h: New.
      180ede35
    • Raphael Moreira Zinsly's avatar
      RISC-V: Add support to vector stack-clash protection · 2862d99b
      Raphael Moreira Zinsly authored
      Adds basic support for vector stack-clash protection using a loop to do
      the probing and stack adjustments.
      
      gcc/ChangeLog:
      	* config/riscv/riscv.cc
      	(riscv_allocate_and_probe_stack_loop): New function.
      	(riscv_v_adjust_scalable_frame): Add stack-clash protection
      	support.
      	(riscv_allocate_and_probe_stack_space): Move the probe loop
      	implementation to riscv_allocate_and_probe_stack_loop.
      	* config/riscv/riscv.h: Define RISCV_STACK_CLASH_VECTOR_CFA_REGNUM.
      
      gcc/testsuite/ChangeLog:
      	* gcc.target/riscv/stack-check-cfa-3.c: New test.
      	* gcc.target/riscv/stack-check-prologue-16.c: New test.
      	* gcc.target/riscv/struct_vect_24.c: New test.
      2862d99b
    • Raphael Moreira Zinsly's avatar
      RISC-V: Stack-clash protection implementation · b82d173d
      Raphael Moreira Zinsly authored
      This implements stack-clash protection for riscv, with
      riscv_allocate_and_probe_stack_space being based on
      aarch64_allocate_and_probe_stack_space from aarch64's implementation.
      We enforce that the probing interval and the guard size are always equal;
      their default value is 4KB, which is the riscv page size.
      
      We also probe up by 1024 bytes in the general case when a probe is required.
      
      gcc/ChangeLog:
      	* config/riscv/riscv.cc
      	(riscv_option_override): Enforce that interval is the same size as
      	guard size.
      	(riscv_allocate_and_probe_stack_space): New function.
      	(riscv_expand_prologue): Call riscv_allocate_and_probe_stack_space
      	to the final allocation of the stack and add stack-clash dump
      	information.
      	* config/riscv/riscv.h: Define STACK_CLASH_CALLER_GUARD and
      	STACK_CLASH_MAX_UNROLL_PAGES.
      
      gcc/testsuite/ChangeLog:
      	* gcc.dg/params/blocksort-part.c: Skip riscv for
      	stack-clash protection intervals.
      	* gcc.dg/pr82788.c: Skip riscv.
      	* gcc.dg/stack-check-6.c: Skip residual check for riscv.
      	* gcc.dg/stack-check-6a.c: Skip riscv.
      	* gcc.target/riscv/stack-check-12.c: New test.
      	* gcc.target/riscv/stack-check-13.c: New test.
      	* gcc.target/riscv/stack-check-cfa-1.c: New test.
      	* gcc.target/riscv/stack-check-cfa-2.c: New test.
      	* gcc.target/riscv/stack-check-prologue-1.c: New test.
      	* gcc.target/riscv/stack-check-prologue-10.c: New test.
      	* gcc.target/riscv/stack-check-prologue-11.c: New test.
      	* gcc.target/riscv/stack-check-prologue-12.c: New test.
      	* gcc.target/riscv/stack-check-prologue-13.c: New test.
      	* gcc.target/riscv/stack-check-prologue-14.c: New test.
      	* gcc.target/riscv/stack-check-prologue-15.c: New test.
      	* gcc.target/riscv/stack-check-prologue-2.c: New test.
      	* gcc.target/riscv/stack-check-prologue-3.c: New test.
      	* gcc.target/riscv/stack-check-prologue-4.c: New test.
      	* gcc.target/riscv/stack-check-prologue-5.c: New test.
      	* gcc.target/riscv/stack-check-prologue-6.c: New test.
      	* gcc.target/riscv/stack-check-prologue-7.c: New test.
      	* gcc.target/riscv/stack-check-prologue-8.c: New test.
      	* gcc.target/riscv/stack-check-prologue-9.c: New test.
      	* gcc.target/riscv/stack-check-prologue.h: New file.
      	* lib/target-supports.exp
      	(check_effective_target_supports_stack_clash_protection):
      	Add riscv.
      	(check_effective_target_caller_implicit_probes): Likewise.
      b82d173d
    • Raphael Moreira Zinsly's avatar
      RISC-V: Move riscv_v_adjust_scalable_frame · 5694fcf7
      Raphael Moreira Zinsly authored
      Move riscv_v_adjust_scalable_frame () in preparation for the stack clash
      protection support.
      
      gcc/ChangeLog:
      	* config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Move
      	closer to riscv_expand_prologue.
      5694fcf7
    • Raphael Moreira Zinsly's avatar
      RISC-V: Small stack tie changes · 0e604d0e
      Raphael Moreira Zinsly authored
      Enable the register used by riscv_emit_stack_tie () to be passed as
      an argument so we can tie the stack with other registers besides
      hard_frame_pointer_rtx.
      Also don't allow operand 1 of stack_tie<mode> to be optimized to sp
      in preparation for the stack clash protection support.
      
      gcc/ChangeLog:
      	* config/riscv/riscv.cc (riscv_emit_stack_tie): Pass the
      	register to be tied to the stack pointer as argument.
      	* config/riscv/riscv.md (stack_tie<mode>): Don't match equal
      	operands.
      0e604d0e
    • Patrick Palka's avatar
      c-family: regenerate c.opt.urls · f91f7201
      Patrick Palka authored
      The addition of -Wtemplate-body in r15-2774-g596d1ed9d40b10 means
      we need to regenerate c.opt.urls.
      
      gcc/c-family/ChangeLog:
      
      	* c.opt.urls: Regenerate.
      f91f7201
    • Patrick Palka's avatar
      c++: add fixed testcase [PR116289] · 4aa89bad
      Patrick Palka authored
      Fully fixed since r14-6724-gfced59166f95e9.
      
      	PR c++/116289
      	PR c++/113063
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/spaceship-synth16a.C: New test.
      4aa89bad
    • Jakub Jelinek's avatar
      i386: Fix up __builtin_ia32_b{extr{,i}_u{32,64},zhi_{s,d}i} folding [PR116287] · 6e7088db
      Jakub Jelinek authored
      The GENERIC folding of these builtins has cases where it folds to a
      constant regardless of the value of the first operand.  If so, we need
      to use omit_one_operand to avoid throwing away side-effects in the first
      operand if any.  The cases which verify the first argument is INTEGER_CST
      don't need that, INTEGER_CST doesn't have side-effects.
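      To make the side-effect concern concrete, here is a small software model of BEXTR (start in bits 0-7 of the control word, length in bits 8-15). It is only a sketch with hypothetical names, not GCC's folding code: it shows a case where the result is 0 regardless of the first argument, yet the first argument's side effect still has to happen.

```cpp
#include <cstdint>

// Software model of BEXTR: the control word holds the start bit in bits
// 0-7 and the field length in bits 8-15.
constexpr std::uint32_t
bextr_model (std::uint32_t src, std::uint32_t control)
{
  const std::uint32_t start = control & 0xffu;
  const std::uint32_t len = (control >> 8) & 0xffu;
  if (len == 0 || start >= 32)
    return 0;                       // result does not depend on src
  const std::uint32_t shifted = src >> start;
  return len >= 32 ? shifted : (shifted & ((1u << len) - 1u));
}

// The first argument has a side effect (++calls).  Even though the result
// is the constant 0, that argument must still be evaluated exactly once.
constexpr int
calls_after_fold ()
{
  int calls = 0;
  const std::uint32_t r
    = bextr_model (static_cast<std::uint32_t> (++calls), 0);
  return r == 0 ? calls : -1;
}

static_assert (calls_after_fold () == 1, "side effect must survive the fold");
```

      A fold that replaced the whole call with the constant 0 would leave the counter at 0, which is the kind of miscompile the commit message describes.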
      
      2024-08-09  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/116287
      	* config/i386/i386.cc (ix86_fold_builtin) <case IX86_BUILTIN_BEXTR32>:
      	When folding into zero without checking whether first argument is
      	constant, use omit_one_operand.
      	(ix86_fold_builtin) <case IX86_BUILTIN_BZHI32>: Likewise.
      
      	* gcc.target/i386/bmi-pr116287.c: New test.
      	* gcc.target/i386/bmi2-pr116287.c: New test.
      	* gcc.target/i386/tbm-pr116287.c: New test.
      6e7088db
    • Andrew Stubbs's avatar
      amdgcn: Add padding to trampoline · b5a09a68
      Andrew Stubbs authored
      This avoids a -Wpadded warning (testcase gcc.dg/20050607-1.c).
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn.cc (gcn_asm_trampoline_template): Add .align.
      	* config/gcn/gcn.h (TRAMPOLINE_SIZE): Increase to 40.
      b5a09a68
    • Thomas Schwinge's avatar
      OpenMP: Constructors and destructors for "declare target" static aggregates: Fix effective-target keyword in test cases · 9f5d22e3
      Thomas Schwinge authored
      
      (Most of) the tests added in commit f1bfba3a
      "OpenMP: Constructors and destructors for "declare target" static aggregates"
      had a mismatch between dump file production and its scanning; the former needs
      to use 'offload_target_nvptx' (like 'offload_target_amdgcn'), not
      'offload_device_nvptx'.
      
      	libgomp/
      	* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C:
      	Fix effective-target keyword.
      	* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C:
      	Likewise.
      	* testsuite/libgomp.c-c++-common/target-is-initial-host-2.c:
      	Likewise.
      	* testsuite/libgomp.c-c++-common/target-is-initial-host.c:
      	Likewise.
      	* testsuite/libgomp.fortran/target-is-initial-host-2.f90:
      	Likewise.
      	* testsuite/libgomp.fortran/target-is-initial-host.f: Likewise.
      	* testsuite/libgomp.fortran/target-is-initial-host.f90: Likewise.
      9f5d22e3
    • Georg-Johann Lay's avatar
      AVR: Tidy up code for __[x]load insns. · a90c74ab
      Georg-Johann Lay authored
      gcc/
      	* config/avr/avr.md (*load_<mode>_libgcc, *xload_<mode>_libgcc):
      	Tidy up code.
      a90c74ab
    • Jakub Jelinek's avatar
      c-family: Add some more ARRAY_SIZE uses · 723e0f72
      Jakub Jelinek authored
      These two spots were just non-standard, because they divided
      sizeof (omp_pragmas_simd) by sizeof (*omp_pragmas) and not
      the expected sizeof (*omp_pragmas_simd) and so weren't converted
      into ARRAY_SIZE.  Both of the latter sizes are the same though,
      as both arrays have the same type, so this patch doesn't change
      anything but readability.
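      The equivalence can be checked with a small stand-alone sketch. The element type and table contents below are hypothetical; only the array names echo the commit message, and ARRAY_SIZE is spelled out locally in its usual form.

```cpp
#include <cstddef>

// Usual shape of an ARRAY_SIZE helper macro, defined locally for the sketch.
#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

// Hypothetical element type and contents.
struct pragma_entry
{
  const char *name;
  unsigned int id;
};

const pragma_entry omp_pragmas[] = { { "barrier", 1 }, { "flush", 2 } };
const pragma_entry omp_pragmas_simd[]
  = { { "for", 3 }, { "simd", 4 }, { "parallel", 5 } };

// Old spelling: divides by the element size of a different array.
constexpr std::size_t n_simd_old
  = sizeof (omp_pragmas_simd) / sizeof (*omp_pragmas);
// New spelling.
constexpr std::size_t n_simd_new = ARRAY_SIZE (omp_pragmas_simd);

static_assert (n_simd_old == n_simd_new, "same element type, same count");
```

      The two spellings only agree because both arrays share one element type; the ARRAY_SIZE form stays correct even if that ever stops being true.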
      
      2024-08-09  Jakub Jelinek  <jakub@redhat.com>
      
      	* c-pragma.cc (c_pp_lookup_pragma): Use ARRAY_SIZE in
      	n_omp_pragmas_simd initializer.
      	(init_pragmas): Likewise.
      723e0f72
    • Kyrylo Tkachov's avatar
      aarch64: Check CONSTM1_RTX in definition of Dm constraint · 19e565ed
      Kyrylo Tkachov authored
      
      The constraint Dm is intended to match vectors of minus 1, but actually
      checks for CONST1_RTX. This doesn't have a bad effect in practice as its
      only use is in the aarch64_wrffr pattern for the setffr instruction, which
      is a VNx16BI operation, and -1 and 1 are the same there. That pattern
      can currently only be generated through intrinsics anyway, which create it
      with a CONSTM1_RTX constant.
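      One way to see why -1 and 1 coincide for single-bit elements but not for wider ones is the sketch below. It assumes element values are simply truncated to the element width; the helper name is hypothetical and this is not how GCC represents the constants internally.

```cpp
// Keep only the low `bits` bits of a value (bits must be below 32).
constexpr unsigned int
truncate_to_bits (int value, unsigned int bits)
{
  return static_cast<unsigned int> (value) & ((1u << bits) - 1u);
}

// With 1-bit elements, -1 and 1 are the same bit pattern.
static_assert (truncate_to_bits (-1, 1) == truncate_to_bits (1, 1),
	       "indistinguishable at 1 bit");
// With 8-bit elements they differ (0xff versus 0x01).
static_assert (truncate_to_bits (-1, 8) != truncate_to_bits (1, 8),
	       "distinct at 8 bits");
```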
      
      Fix the constraint definition so that it doesn't become a footgun if it's
      used in some other pattern.
      
      Bootstrapped and tested on aarch64-none-linux-gnu.
      
      Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
      
      gcc/ChangeLog:
      
      	* config/aarch64/constraints.md (Dm): Match CONSTM1_RTX rather than
      	CONST1_RTX.
      19e565ed
    • GCC Administrator's avatar
      Daily bump. · 77ccfa6a
      GCC Administrator authored
      77ccfa6a
  2. Aug 08, 2024
    • Andrew Pinski's avatar
      aarch64/testsuite: Fix if-compare_2.c for removing vcond{,u,eq} patterns [PR116041] · 7223c647
      Andrew Pinski authored
      
      For bar1 and bar2, we currently expect the bsl instruction, but
      with slightly different register allocation inside the loop (which happens after
      the removal of the vcond{,u,eq} patterns), we get the bit instruction.  The pattern that
      outputs the bsl instruction will output bit and bif too, depending on register allocation.
      
      So let's check for bsl, bit or bif instructions instead of just the bsl instruction.
      
      Tested on aarch64 both with an unmodified compiler and one which has the patch to disable
      these optabs.
      
      gcc/testsuite/ChangeLog:
      
      	PR testsuite/116041
      	* gcc.target/aarch64/if-compare_2.c: Support bit and bif for
      	both bar1 and bar2; add comment on why too.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      7223c647
    • Tamar Christina's avatar
      AArch64: Fix signbit mask creation after late combine [PR116229] · 2c24e056
      Tamar Christina authored
      The optimization to generate a Di signbit constant by using fneg was relying
      on nothing being able to push the constant into the negate.  It's run quite
      late for this reason.
      
      However late combine now runs after it and triggers RTL simplification based on
      the neg.  With -fno-signed-zeros this ends up dropping the - from the -0.0 and
      thus producing incorrect code.
      
      This change adds a new unspec FNEG on DI mode which prevents this simplification.
      
      gcc/ChangeLog:
      
      	PR target/116229
      	* config/aarch64/aarch64-simd.md (aarch64_fnegv2di2<vczle><vczbe>): New.
      	* config/aarch64/aarch64.cc (aarch64_maybe_generate_simd_constant):
      	Update call to gen_aarch64_fnegv2di2.
      	* config/aarch64/iterators.md: New UNSPEC_FNEG.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/116229
      	* gcc.target/aarch64/pr116229.c: New test.
      2c24e056
    • Georg-Johann Lay's avatar
      AVR: target/116295 - Fix unrecognizable insn with __flash read. · c4d3dba2
      Georg-Johann Lay authored
      Some loads from non-generic address-spaces are performed by
      libgcc calls, and they don't have a POST_INC form.  Don't consider
      such insns when running -mfuse-add.
      
           PR target/116295
      gcc/
      	* config/avr/avr.cc (Mem_Insn::Mem_Insn): Don't consider MEMs
      	that are avr_mem_memx_p or avr_load_libgcc_p.
      
      gcc/testsuite/
      	* gcc.target/avr/torture/pr116295.c: New test.
      c4d3dba2
    • Georg-Johann Lay's avatar
      AVR: Fix a typo in __builtin_avr_mask1 documentation. · f6a41ebb
      Georg-Johann Lay authored
      gcc/
      	* doc/extend.texi (AVR Built-in Functions) <mask1>: Fix a typo.
      f6a41ebb
    • Georg-Johann Lay's avatar
      AVR: Improve POST_INC output in some rare cases. · ef697f83
      Georg-Johann Lay authored
      gcc/
      	* config/avr/avr.cc (avr_insn_has_reg_unused_note_p): New function.
      	(_reg_unused_after): Use it to recognize more cases.
      	(avr_out_lpm_no_lpmx) [POST_INC]: Use reg_unused_after.
      ef697f83
    • Andrew Stubbs's avatar
      amdgcn: Fix VGPR max count · 71531733
      Andrew Stubbs authored
      The metadata for RDNA3 kernels allocates VGPRs in blocks of 12, which means the
      maximum usable number of registers is 252.  This patch prevents the compiler
      from exceeding this artificial limit.
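      The 252 figure follows from rounding down to a whole number of blocks. The sketch below assumes a register file of 256 VGPRs, which the commit message does not state; the helper name is hypothetical.

```cpp
// Largest register count that is a whole number of allocation blocks.
constexpr int
max_usable_registers (int total, int granularity)
{
  return total / granularity * granularity;
}

// Assumption: 256 VGPRs in total.  256 / 12 = 21 whole blocks, 21 * 12 = 252.
static_assert (max_usable_registers (256, 12) == 252,
	       "blocks of 12 leave 4 registers unusable");
```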
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn.cc (gcn_conditional_register_usage): Fix registers
      	remaining after maximum allocation using TARGET_VGPR_GRANULARITY.
      71531733
    • Tobias Burnus's avatar
      libgomp.texi: Update implementation status table for OpenMP TR13 · 89d2f3fe
      Tobias Burnus authored
      libgomp/ChangeLog:
      
      	* libgomp.texi (OpenMP Technical Report 13): Renamed from
      	'OpenMP Technical Report 12'; updated for TR13 changes.
      89d2f3fe
    • Steve Baird's avatar
      ada: Missing legality check when type completed · fc49ee59
      Steve Baird authored
      An access discriminant is allowed to have a default value only if the
      discriminated type is immutably limited. In the case of a discriminated
      limited private type declaration, this rule needs to be checked when
      the completion of the type is seen.
      
      gcc/ada/
      
      	* sem_ch6.adb (Check_Discriminant_Conformance): Perform check for
      	illegal access discriminant default values when the completion of
      	a limited private type is analyzed.
      	* sem_aux.adb (Is_Immutably_Limited): If passed the
      	not-yet-analyzed entity for the full view of a record type, test
      	the Limited_Present flag
      	(which is set by the parser).
      fc49ee59
    • Steve Baird's avatar
      ada: Etype missing for raise expression · 480819c9
      Steve Baird authored
      If the primitive equality operator of the component type of an array type is
      abstract, then a call to that abstract function raises Program_Error (when
      such a call is legal). The FE generates a raise expression to implement this.
      That raise expression is an expression so it should have a valid Etype.
      
      gcc/ada/
      
      	* exp_ch4.adb (Build_Eq_Call): In the abstract callee case, copy
      	the Etype of the callee onto the Make_Raise_Program_Error result.
      480819c9
    • Gary Dismukes's avatar
      ada: Run-time error with GNAT-LLVM on container aggregate with finalization · 85f2ffd8
      Gary Dismukes authored
      When unnesting is enabled, the compiler was failing to copy the At_End_Proc
      field from a block statement to the procedure created to replace it when
      unnesting of top-level blocks is done.  At run time this could lead to
      exceptions due to missing finalization calls.
      
      gcc/ada/
      
      	* exp_ch7.adb (Unnest_Block): Copy the At_End_Proc from the block
      	statement to the newly created subprogram body.
      85f2ffd8
    • Justin Squirek's avatar
      ada: Further refinements to mutably tagged types · 352d1478
      Justin Squirek authored
      This patch further enhances the mutably tagged type implementation by fixing
      several oversights relating to generic instantiations, attributes, and
      type conversions.
      
      gcc/ada/
      
      	* exp_put_image.adb (Append_Component_Attr): Obtain the mutably
      	tagged type for the component type.
      	* mutably_tagged.adb (Make_Mutably_Tagged_Conversion): Add more
      	cases to avoid conversion generation.
      	* sem_attr.adb (Check_Put_Image_Attribute): Add mutably tagged
      	type conversion.
      	* sem_ch12.adb (Analyze_One_Association): Add rewrite for formal
      	type declarations which are mutably tagged type to their
      	equivalent type.
      	(Instantiate_Type): Add condition to obtain class wide equivalent
      	types.
      	(Validate_Private_Type_Instance): Add check for class wide
      	equivalent types which are considered "definite".
      	* sem_util.adb (Is_Variable): Add condition to handle selected
      	components of view conversions. Add missing check for selected
      	components.
      	(Is_View_Conversion): Add condition to handle class wide
      	equivalent types.
      352d1478
    • Justin Squirek's avatar
      ada: Spurious maximum nesting level warnings · c5420753
      Justin Squirek authored
      This patch fixes an issue in the compiler whereby disabling style checks via
      pragma Style_Checks ("-L") resulted in the minimum nesting level being zero
      but the style still being enabled - leading to spurious maximum nesting level
      exceeded warnings.
      
      gcc/ada/
      
      	* stylesw.adb (Set_Style_Check_Options): Disable max nesting level
      	when unspecified
      c5420753
    • Javier Miranda's avatar
      ada: Finalization_Size raises Constraint_Error · 90b3826d
      Javier Miranda authored
      When the attribute Finalization_Size is applied to an interface type
      object, the compiler-generated code fails at runtime, raising a
      Constraint_Error exception.
      
      gcc/ada/
      
      	* exp_attr.adb (Expand_N_Attribute_Reference) <Finalization_Size>:
      	If the prefix is an interface type, generate code to obtain its
      	address and displace it to reference the base of the object.
      90b3826d
    • Christoph Müllner's avatar
      RISC-V: rv32/DF: Prevent 2 SImode loads using XTheadMemIdx · 33aca37e
      Christoph Müllner authored
      
      When enabling XTheadFmv/Zfa and XThead(F)MemIdx, we might end up
      with the following insn (registers are examples, but of correct class):
      
      (set (reg:DF a4)
           (mem:DF (plus:SI (mult:SI (reg:SI a0)
      			       (const_int 8))
      		      (reg:SI a5))))
      
      This is a result of an attempt to load the DF register via two SI
      register loads followed by XTheadFmv/Zfa instructions to move the
      contents of the two SI registers into the DF register.
      
      The two loads are generated in riscv_split_doubleword_move(),
      where the second load adds an offset of 4 to load address.
      While this works fine for RVI loads, this can't be handled
      for XTheadMemIdx addresses.  Coming back to the example above,
      we would end up with the following insn, which can't be simplified
      or matched:
      
      (set (reg:SI a4)
           (mem:SI (plus:SI (plus:SI (mult:SI (reg:SI a0)
      					(const_int 8))
      			       (reg:SI a5))
      		      (const_int 4))))
      
      This triggered an ICE in the past, which was resolved in b79cd204,
      which also added the test xtheadfmemidx-medany.c, where the examples
      are from.  The patch postponed the optimization insn_and_split pattern
      for XThead(F)MemIdx, so that the situation could effectively be avoided.
      
      Since we don't want to rely on these optimization patterns in the future,
      we need a different solution.  Therefore, this patch restricts the
      movdf_hardfloat_rv32 insn to not match for split-double-word-moves
      with XThead(F)MemIdx operands.  This ensures we don't need to split
      them up later.
      
      When looking at the code generation of the test file, we can see that
      we have fewer GP<->FP conversions, but cannot use the indexed loads.
      The new sequence is identical to rv32gc_xtheadfmv (similar to rv32gc_zfa).
      
      Old:
      [...]
      	lla     a5,.LANCHOR0
      	th.flrd fa5,a5,a0,3
      	fmv.x.w a4,fa5
      	th.fmv.x.hw     a5,fa5
      .L1:
      	fmv.w.x fa0,a4
      	th.fmv.hw.x     fa0,a5
      	ret
      [...]
      
      New:
      [...]
      	lla     a5,.LANCHOR0
      	slli    a4,a0,3
      	add     a4,a4,a5
      	lw      a5,4(a4)
      	lw      a4,0(a4)
      .L1:
      	fmv.w.x fa0,a4
      	th.fmv.hw.x     fa0,a5
      	ret
      [...]
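      The address arithmetic of the new sequence (slli, add, then two lw at offsets 0 and 4) can be modeled as follows. This is a sketch with hypothetical names that only mirrors the instructions shown above; it does not model riscv_split_doubleword_move itself.

```cpp
#include <cstdint>

// Addresses of the two 32-bit halves of one DF value.
struct word_addresses
{
  std::uint32_t low;   // loaded by "lw a4,0(a4)"
  std::uint32_t high;  // loaded by "lw a5,4(a4)"
};

// Mirrors "slli a4,a0,3" and "add a4,a4,a5": base + index * 8, after
// which the second word sits at a plain constant offset of 4.
constexpr word_addresses
split_df_load_addresses (std::uint32_t base, std::uint32_t index)
{
  return { base + (index << 3), base + (index << 3) + 4u };
}

static_assert (split_df_load_addresses (0x1000u, 2u).low == 0x1010u,
	       "low word at base + 2 * 8");
static_assert (split_df_load_addresses (0x1000u, 2u).high == 0x1014u,
	       "high word 4 bytes further");
```

      Once the scaled index is folded into a single register, the extra offset of 4 fits an ordinary base-plus-immediate load, which is why the indexed form is no longer needed here.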
      
      This was tested (together with the patch that eliminates the
      XTheadMemIdx optimization patterns) with SPEC CPU 2017 intrate
      on QEMU (RV64/lp64d).
      
      gcc/ChangeLog:
      
      	* config/riscv/constraints.md (th_m_noi): New constraint.
      	* config/riscv/riscv.md: Adjust movdf_hardfloat_rv32 for
      	XTheadMemIdx.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/xtheadfmemidx-xtheadfmv-medany.c: Adjust.
      	* gcc.target/riscv/xtheadfmemidx-zfa-medany.c: Likewise.
      
      Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
      33aca37e
    • Christoph Müllner's avatar
      RISC-V: xthead(f)memidx: Eliminate optimization patterns · 31c3c5d1
      Christoph Müllner authored
      
      We have a large number of optimization patterns (insn_and_split) for
      XTheadMemIdx and XTheadFMemIdx that attempt to do something that can be
      done more efficiently by generic GCC passes, if we have proper support code.
      
      A key function in eliminating the optimization patterns is
      th_memidx_classify_address_index(), which needs to identify each possible
      memory expression that can be lowered into a XTheadMemIdx/XTheadFMemIdx
      instruction.  This patch adds all memory expressions that were
      previously only recognized by the optimization patterns.
      
      Now that the address classification is complete, we can finally remove
      all optimization patterns, with the side-effect of getting rid of the
      non-canonical memory expression they produced: (plus (reg) (ashift (reg) (imm))).
      
      A positive side-effect of this change is that we address an RV32 ICE
      that was caused by the th_memidx_I_c pattern, which did not properly
      handle SUBREGs (more details are in PR116131).
      
      A temporary negative side-effect of this change is that we cause a
      regression of the xtheadfmemidx + xtheadfmv/zfa tests (initially
      introduced as part of b79cd204 to address an ICE).
      As this issue cannot be addressed in the code parts that are
      adjusted in this patch, we just accept the regression for now.
      
      	PR target/116131
      
      gcc/ChangeLog:
      
      	* config/riscv/thead.cc (th_memidx_classify_address_index):
      	Recognize all possible XTheadMemIdx memory operand structures.
      	(th_fmemidx_output_index): Do strict classification.
      	* config/riscv/thead.md (*th_memidx_operand): Remove.
      	(TARGET_XTHEADMEMIDX): Likewise.
      	(TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX): Likewise.
      	(!TARGET_64BIT && TARGET_XTHEADMEMIDX): Likewise.
      	(*th_memidx_I_a): Likewise.
      	(*th_memidx_I_b): Likewise.
      	(*th_memidx_I_c): Likewise.
      	(*th_memidx_US_a): Likewise.
      	(*th_memidx_US_b): Likewise.
      	(*th_memidx_US_c): Likewise.
      	(*th_memidx_UZ_a): Likewise.
      	(*th_memidx_UZ_b): Likewise.
      	(*th_memidx_UZ_c): Likewise.
      	(*th_fmemidx_movsf_hardfloat): Likewise.
      	(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
      	(*th_fmemidx_I_a): Likewise.
      	(*th_fmemidx_I_c): Likewise.
      	(*th_fmemidx_US_a): Likewise.
      	(*th_fmemidx_US_c): Likewise.
      	(*th_fmemidx_UZ_a): Likewise.
      	(*th_fmemidx_UZ_c): Likewise.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/pr116131.c: New test.
      
      Reported-by: Patrick O'Neill <patrick@rivosinc.com>
      Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
      31c3c5d1
    • Christoph Müllner's avatar
      RISC-V: testsuite: xtheadfmemidx: Rename test and add similar Zfa test · 8e6bc6dd
      Christoph Müllner authored
      
      Test file xtheadfmemidx-medany.c has been added in b79cd204 as a
      test case that provoked an ICE when loading DFmode registers via two
      SImode register loads followed by a SI->DF[63:32] move from XTheadFmv.
      Since Zfa is affected in the same way as XTheadFmv, even if both
      have slightly different instructions, let's add a test for Zfa as well
      and give the tests proper names.
      
      Let's also add a test into the test files that counts the SI->DF moves
      from XTheadFmv/Zfa.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/riscv/xtheadfmemidx-medany.c: Move to...
      	* gcc.target/riscv/xtheadfmemidx-xtheadfmv-medany.c: ...here.
      	* gcc.target/riscv/xtheadfmemidx-zfa-medany.c: New test.
      
      Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
      8e6bc6dd
    • Andrew Pinski's avatar
      vect: Small C++11-ification of vect_vect_recog_func_ptrs · ad7d4843
      Andrew Pinski authored
      
      This is a small C++11-ification for the use of vect_vect_recog_func_ptrs.
      It changes the loop into a range-based loop, which lets us remove the
      definition of NUM_PATTERNS. It also uses a const reference instead of a pointer.
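      The shape of the change can be sketched on a stand-in table. The struct, table, and function names below are hypothetical and much simpler than the real recognizer table in tree-vect-patterns.cc; the functions are constexpr (which needs C++14) only so the sketch can check itself at compile time.

```cpp
#include <cstddef>

// Hypothetical stand-in for a table of pattern recognizers.
struct recog_func
{
  int (*fn) (int);
  const char *name;
};

constexpr int twice (int x) { return 2 * x; }
constexpr int plus_one (int x) { return x + 1; }

constexpr recog_func recog_funcs[]
  = { { twice, "twice" }, { plus_one, "plus_one" } };

// Before: index loop that needs a separately computed element count.
constexpr int
apply_all_indexed (int x)
{
  constexpr std::size_t num_patterns
    = sizeof (recog_funcs) / sizeof (recog_funcs[0]);
  int sum = 0;
  for (std::size_t i = 0; i < num_patterns; i++)
    {
      const recog_func *f = &recog_funcs[i];
      sum += f->fn (x);
    }
  return sum;
}

// After: range-based loop over the table with a const reference.
constexpr int
apply_all_ranged (int x)
{
  int sum = 0;
  for (const recog_func &f : recog_funcs)
    sum += f.fn (x);
  return sum;
}

static_assert (apply_all_indexed (3) == apply_all_ranged (3),
	       "both loops visit the same entries");
```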
      
      Bootstrapped and tested on x86_64-linux-gnu.
      
      gcc/ChangeLog:
      
      	* tree-vect-patterns.cc (NUM_PATTERNS): Delete.
      	(vect_pattern_recog_1): Constify and change
      	recog_func to a reference.
      	(vect_pattern_recog): Use range-based loop over
      	vect_vect_recog_func_ptrs.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      ad7d4843
    • Jin Ma's avatar
      RISC-V: Delete duplicate '#define RISCV_DWARF_VLENB' · ecdf7a4e
      Jin Ma authored
      gcc/ChangeLog:
      
      	* config/riscv/riscv.h (RISCV_DWARF_VLENB): Delete.
      ecdf7a4e
    • Andrew Stubbs's avatar
      amdgcn: Re-enable trampolines · 6f71e050
      Andrew Stubbs authored
      The stacks are executable since the reverse-offload features were added, so
      trampolines actually do work.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn.cc (gcn_trampoline_init): Re-enable trampolines.
      6f71e050
    • Jeff Law's avatar
      [RISC-V][PR target/116240] Ensure object is a comparison before extracting arguments · 190ad812
      Jeff Law authored
      This was supposed to go out the door yesterday, but I kept getting interrupted.
      
      The target bits for rtx costing can't assume the rtl they're given actually
      matches a target pattern.   It's just kind of inherent in how the costing
      routines get called in various places.
      
      In this particular case we're trying to cost a conditional move:
      
      (set (dest) (if_then_else (cond) (true) (false)))
      
      On the RISC-V port the backend only allows actual conditionals for COND.  So
      something like (eq (reg) (const_int 0)).  In the costing code for if-then-else
      we did something like
      
      XEXP (XEXP (cond, 0), 0)
      
      Which fails miserably if COND is a terminal node like (reg) rather than (ne
      (reg) (const_int 0)).
      
      So this patch tightens up the RTL scanning to ensure that we have a comparison
      before we start looking at the comparison's arguments.
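      The idea of checking the node kind before touching its operands can be sketched on a toy expression tree. Everything below (the enum, the struct, the helper names) is hypothetical and far simpler than GCC's rtx representation; it only illustrates the guard.

```cpp
// Toy expression node: terminals have no operands, comparisons have two.
enum class code { reg, const_int, eq, ne };

struct node
{
  code c;
  const node *op0;
  const node *op1;
};

constexpr bool
is_comparison (const node &n)
{
  return n.c == code::eq || n.c == code::ne;
}

// Return the first operand of a comparison, or nullptr when the node is
// not a comparison and therefore has no operands to look at.
constexpr const node *
comparison_operand0 (const node &cond)
{
  return is_comparison (cond) ? cond.op0 : nullptr;
}

constexpr node reg_a { code::reg, nullptr, nullptr };
constexpr node zero { code::const_int, nullptr, nullptr };
constexpr node ne_cond { code::ne, &reg_a, &zero };   // (ne (reg) (const_int 0))

static_assert (comparison_operand0 (ne_cond) == &reg_a, "comparison: operand found");
static_assert (comparison_operand0 (reg_a) == nullptr, "bare reg: no operand access");
```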
      
      Run through my tester without incident, but I'll wait for the pre-commit tester
      to run through a cycle before pushing to the trunk.
      
      Jeff
      
      ps.   We probably could support a naked REG for the condition and internally convert it to (ne (reg) (const_int 0)), but I don't think it likely happens with any regularity.
      
      	PR target/116240
      gcc/
      	* config/riscv/riscv.cc (riscv_rtx_costs): Ensure object is a
      	comparison before looking at its arguments.
      
      gcc/testsuite
      	* gcc.target/riscv/pr116240.c: New test.
      190ad812
    • Manolis Tsamis's avatar
      Rearrange SLP nodes with duplicate statements [PR98138] · ab187858
      Manolis Tsamis authored
      This change checks when a two_operators SLP node has multiple occurrences of
      the same statement (e.g. {A, B, A, B, ...}) and tries to rearrange the operands
      so that there are no duplicates. Two vec_perm expressions are then introduced
      to recreate the original ordering. These duplicates can appear due to how
      two_operators nodes are handled, and they prevent vectorization in some cases.
      
      This targets the vectorization of the SPEC2017 x264 pixel_satd functions.
      On some processors a larger than 10% improvement on x264 has been observed.
      
      	PR tree-optimization/98138
      
      gcc/ChangeLog:
      
      	* tree-vect-slp.cc: Avoid duplicates in two_operators nodes.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/aarch64/vect-slp-two-operator.c: New test.
      ab187858
    • Nathaniel Shead's avatar
      c++: Propagate TREE_ADDRESSABLE in fixup_type_variants [PR115062] · 71aebb36
      Nathaniel Shead authored
      
      This has caused issues with modules when an import fills in the
      definition of a type already created with a typedef.
      
      	PR c++/115062
      
      gcc/cp/ChangeLog:
      
      	* class.cc (fixup_type_variants): Propagate TREE_ADDRESSABLE.
      	(finish_struct_bits): Cleanup now that TREE_ADDRESSABLE is
      	propagated by fixup_type_variants.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/modules/pr115062_a.H: New test.
      	* g++.dg/modules/pr115062_b.H: New test.
      	* g++.dg/modules/pr115062_c.C: New test.
      
      Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
      71aebb36
    • Nathaniel Shead's avatar
      c++/modules: Assume header bindings are global module · 0de1481a
      Nathaniel Shead authored
      
      While stepping through some code I noticed that we do some extra work
      (finding the originating module decl, stripping the template, and
      inspecting the attached-ness) for every declaration taken from a header
      unit.  This doesn't seem necessary though since no declaration in a
      header unit can be attached to anything but the global module, so we can
      just assume that global_p will be true.
      
      This was the original behaviour before I removed this assumption while
      refactoring for r15-2807-gc592310d5275e0.
      
      gcc/cp/ChangeLog:
      
      	* module.cc (module_state::read_cluster): Assume header module
      	declarations will require GM merging.
      
      Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
      0de1481a
    • Tobias Burnus's avatar
      libgomp/libgomp.texi: Mention -fno-builtin-omp_is_initial_device · 8b5a8b1f
      Tobias Burnus authored
      libgomp/ChangeLog:
      
      	* libgomp.texi (omp_is_initial_device): Mention
      	-fno-builtin-omp_is_initial_device and folding by default.
      8b5a8b1f