  1. Aug 19, 2024
    • aarch64: Fix ls64 intrinsic availability · fceecc51
      Andrew Carlotti authored
      The availability of ls64 intrinsics and data types was determined
      solely by the globally specified architecture features, which did not
      reflect any changes specified in target pragmas or attributes.
      
      This patch removes the initialisation-time guards for the intrinsics,
      and replaces them with checks at use time. We also get better error
      messages when ls64 is not available (matching the existing error
      messages for SVE intrinsics).
      
      The data512_t type is made always available; this is consistent with the
      present behaviour for Neon fp16/bf16 types.
      
      gcc/ChangeLog:
      
      	PR target/112108
      	* config/aarch64/aarch64-builtins.cc (handle_arm_acle_h): Remove
      	feature check at initialisation.
      	(aarch64_general_check_builtin_call): Check ls64 intrinsics.
      	* config/aarch64/arm_acle.h (data512_t): Make always available.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/112108
      	* gcc.target/aarch64/acle/ls64_guard-1.c: New test.
      	* gcc.target/aarch64/acle/ls64_guard-2.c: New test.
      	* gcc.target/aarch64/acle/ls64_guard-3.c: New test.
      	* gcc.target/aarch64/acle/ls64_guard-4.c: New test.
      fceecc51
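The move from initialisation-time guards to use-time checks can be pictured with a small portable model. This is only a sketch and not GCC's implementation: the `FEAT_*` bits, `struct fn_context`, and all three functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; GCC itself tracks these in aarch64_isa_flags.  */
#define FEAT_LS64   (UINT64_C(1) << 0)
#define FEAT_MEMTAG (UINT64_C(1) << 1)

/* Features selected globally (command line) and the features in effect
   for one function after target pragmas or attributes are applied.  */
struct fn_context {
  uint64_t global_flags;
  uint64_t function_flags;
};

/* Old scheme (modelled): an intrinsic exists only if the global flags
   had the feature when the header was processed.  */
bool available_at_init(const struct fn_context *ctx, uint64_t required)
{
  assert(ctx != NULL);
  return (ctx->global_flags & required) == required;
}

/* New scheme (modelled): the intrinsic is always declared, and each call
   is checked against the flags of the function that contains it.  */
bool available_at_use(const struct fn_context *ctx, uint64_t required)
{
  assert(ctx != NULL);
  return (ctx->function_flags & required) == required;
}

/* Models a file built without ls64 in which one function carries a
   target attribute that enables it.  */
struct fn_context make_attribute_example(void)
{
  struct fn_context ctx = { 0, FEAT_LS64 };
  return ctx;
}
```

In this model, `available_at_init` rejects the ls64 call in the attributed function while `available_at_use` accepts it, which is the kind of mismatch the commit message describes.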
    • aarch64: Fix memtag intrinsic availability · 4e1b617b
      Andrew Carlotti authored
      The availability of memtag intrinsics and data types was determined
      solely by the globally specified architecture features, which did not
      reflect any changes specified in target pragmas or attributes.
      
      This patch removes the initialisation-time guards for the intrinsics,
      and replaces them with checks at use time. It also removes the macro
      indirection from the header file - this simplifies the header, and
      allows the missing extension error reporting to find the user-facing
      intrinsic names.
      
      gcc/ChangeLog:
      
      	PR target/112108
      	* config/aarch64/aarch64-builtins.cc (aarch64_init_memtag_builtins):
      	Define intrinsic names directly.
      	(aarch64_general_init_builtins): Move memtag initialisation...
      	(handle_arm_acle_h): ...to here, and remove feature check.
      	(aarch64_general_check_builtin_call): Check memtag intrinsics.
      	* config/aarch64/arm_acle.h (__arm_mte_create_random_tag)
      	(__arm_mte_exclude_tag, __arm_mte_ptrdiff)
      	(__arm_mte_increment_tag, __arm_mte_set_tag, __arm_mte_get_tag):
      	Remove.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/112108
      	* gcc.target/aarch64/acle/memtag_guard-1.c: New test.
      	* gcc.target/aarch64/acle/memtag_guard-2.c: New test.
      	* gcc.target/aarch64/acle/memtag_guard-3.c: New test.
      	* gcc.target/aarch64/acle/memtag_guard-4.c: New test.
      4e1b617b
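Why removing the macro indirection helps error reporting can be sketched with a toy model. Everything here is an assumption made for illustration: `struct builtin_decl`, the internal name `__builtin_demo_get_tag`, and the message wording are hypothetical, and the text GCC actually prints may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a registered builtin: a diagnostic can only
   print the name under which the function was declared.  */
struct builtin_decl {
  const char *declared_name;
};

/* Modelled old layout: the header maps the user-facing name onto an
   internal builtin with a macro, so only the internal name is known.  */
const struct builtin_decl via_macro = { "__builtin_demo_get_tag" };

/* Modelled new layout: the builtin is registered directly under the
   user-facing intrinsic name.  */
const struct builtin_decl direct = { "__arm_mte_get_tag" };

/* Format a missing-extension message into BUF.  Returns the message
   length, or -1 if BUF is too small to hold it.  */
int format_missing_extension(char *buf, size_t size,
                             const struct builtin_decl *decl,
                             const char *extension)
{
  assert(buf != NULL && decl != NULL && extension != NULL);
  int n = snprintf(buf, size,
                   "ACLE function '%s' requires ISA extension '%s'",
                   decl->declared_name, extension);
  return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

With `direct`, the message names the intrinsic the user actually wrote; with `via_macro`, it can only name the internal builtin.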
    • aarch64: Fix tme intrinsic availability · 32afbb60
      Andrew Carlotti authored
      The availability of tme intrinsics was previously gated at both
      initialisation time (using global target options) and usage time
      (accounting for function-specific target options).  This patch removes
      the check at initialisation time, and also moves the intrinsics out of
      the header file to allow for better error messages (matching the
      existing error messages for SVE intrinsics).
      
      gcc/ChangeLog:
      
      	PR target/112108
      	* config/aarch64/aarch64-builtins.cc (aarch64_init_tme_builtins):
      	Define intrinsic names directly.
      	(aarch64_general_init_builtins): Move tme initialisation...
      	(handle_arm_acle_h): ...to here, and remove feature check.
      	(aarch64_general_check_builtin_call): Check tme intrinsics.
      	* config/aarch64/arm_acle.h (__tstart, __tcommit, __tcancel)
      	(__ttest): Remove.
      	(_TMFAILURE_*): Define unconditionally.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/112108
      	* gcc.target/aarch64/acle/tme_guard-1.c: New test.
      	* gcc.target/aarch64/acle/tme_guard-2.c: New test.
      	* gcc.target/aarch64/acle/tme_guard-3.c: New test.
      	* gcc.target/aarch64/acle/tme_guard-4.c: New test.
      32afbb60
    • aarch64: Move check_required_extensions · baf71ec5
      Andrew Carlotti authored
      Move SVE extension checking functionality to aarch64-builtins.cc, so
      that it can be shared by non-SVE intrinsics.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-sve-builtins.cc (check_builtin_call)
      	(expand_builtin): Update calls to the below.
      	(report_missing_extension, report_missing_registers)
      	(check_required_extensions): Move out of aarch64_sve namespace,
      	rename, and move into...
      	* config/aarch64/aarch64-builtins.cc (aarch64_report_missing_extension)
      	(aarch64_report_missing_registers)
      	(aarch64_check_required_extensions) ...here.
      	* config/aarch64/aarch64-protos.h (aarch64_check_required_extensions):
      	Add prototype.
      baf71ec5
    • aarch64: Refactor check_required_extensions · a4b39dc4
      Andrew Carlotti authored
      Replace TARGET_GENERAL_REGS_ONLY check with an explicit check that
      aarch64_isa_flags enables all required extensions.  This will be more
      flexible when repurposing this function for non-SVE intrinsics.
      
      gcc/ChangeLog:
      
      	* config/aarch64/aarch64-sve-builtins.cc
      	(check_required_registers): Remove target check and rename to...
      	(report_missing_registers): ...this.
      	(check_required_extensions): Refactor.
      a4b39dc4
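An explicit check that the active flags enable all required extensions amounts to a mask test plus a lookup of whichever bit is missing. The sketch below shows that idea in isolation. The extension table, its bit positions, and `first_missing_extension` are hypothetical and are not taken from GCC's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extension table; names and bit positions are illustrative.  */
struct extension {
  const char *name;
  uint64_t flag;
};

static const struct extension extensions[] = {
  { "sve",    UINT64_C(1) << 0 },
  { "sve2",   UINT64_C(1) << 1 },
  { "ls64",   UINT64_C(1) << 2 },
  { "memtag", UINT64_C(1) << 3 },
};

/* Return NULL if ENABLED covers every bit in REQUIRED.  Otherwise return
   the name of the first missing extension so a caller can report it.  */
const char *first_missing_extension(uint64_t enabled, uint64_t required)
{
  uint64_t missing = required & ~enabled;
  if (missing == 0)
    return NULL;

  for (size_t i = 0; i < sizeof extensions / sizeof extensions[0]; i++)
    if (missing & extensions[i].flag)
      return extensions[i].name;

  /* A required bit with no table entry is a bug in the caller.  */
  assert(!"required flag has no entry in the extension table");
  return NULL;
}
```

Because the test depends only on a flag word, the same helper can serve any intrinsic family, which fits the stated goal of reusing the function for non-SVE intrinsics.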
    • Allow coarrays in select type. [PR46371, PR56496] · 8871489c
      Andre Vehreschild authored
      Fix ICE when scalar coarrays are used in a select type. Prevent
      coindexing in associate/select type/select rank selector expression.
      
      gcc/fortran/ChangeLog:
      
      	PR fortran/46371
      	PR fortran/56496
      
      	* expr.cc (gfc_is_coindexed): Also detect coindexing when the
      	expression has been rewritten to caf_get.
      	* trans-stmt.cc (trans_associate_var): Always accept a
      	descriptor for coarrays.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/coarray/select_type_1.f90: New test.
      	* gfortran.dg/coarray/select_type_2.f90: New test.
      	* gfortran.dg/coarray/select_type_3.f90: New test.
      8871489c
    • gnat: fix lto-type-mismatch between C_Version_String and gnat_version_string [PR115917] · 9cbcf8d1
      Arsen Arsenović authored
      gcc/ada/ChangeLog:
      
      	PR ada/115917
      	* gnatvsn.ads: Add note about the duplication of this value in
      	version.c.
      	* version.c (VER_LEN_MAX): Define to the same value as
      	Gnatvsn.Ver_Len_Max.
      	(gnat_version_string): Use VER_LEN_MAX as bound.
      9cbcf8d1
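The ChangeLog above gives the C definition an explicit bound shared with the Ada side. The sketch below assumes the mismatch arose because the two declarations disagreed on the array length. The value 256 and the `demo_*` names are placeholders chosen for illustration, and the real bound must be whatever `Gnatvsn.Ver_Len_Max` is.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration only; the real bound must match
   Gnatvsn.Ver_Len_Max in gnatvsn.ads.  */
#define VER_LEN_MAX 256

/* Without an explicit bound, the array is sized by its initialiser, so
   its type can differ from a fixed-length declaration elsewhere.  */
char demo_version_unbounded[] = "1.2.3";

/* With the shared bound, the size no longer depends on the initialiser;
   the unused tail is zero-filled.  */
char demo_version_bounded[VER_LEN_MAX] = "1.2.3";

/* Compile-time check that the bounded definition has the agreed size.  */
static_assert(sizeof demo_version_bounded == VER_LEN_MAX,
              "version string must be exactly VER_LEN_MAX bytes");

/* Size helpers so callers can compare the two layouts at run time.  */
size_t unbounded_size(void) { return sizeof demo_version_unbounded; }
size_t bounded_size(void) { return sizeof demo_version_bounded; }
```

Keeping the bound in one macro, with a note on the Ada side pointing at it, is one way to stop the two declarations from drifting apart again.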
    • aarch64: Reduce FP reassociation width for Neoverse V2 and set AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA · cc572242
      Kyrylo Tkachov authored
      
      The fp reassociation width for Neoverse V2 was set to 6 since its
      introduction and I guess it was empirically tuned.  But since
      AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA was added the tree reassociation
      pass seems to be more deliberate in forming FMAs and when that flag is
      used it seems to more properly evaluate the FMA vs non-FMA reassociation
      widths.
      According to the Neoverse V2 SWOG the core has a throughput of 4 for
      most FP operations, so the value 6 is not accurate anyway.
      Also, the SWOG does state that FMADD operations are pipelined and the
      results can be forwarded from FP multiplies to the accumulation operands
      of FMADD instructions, which seems to be what
      AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA expresses.
      
      This patch sets the fp_reassoc_width field to 4 and enables
      AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA for -mcpu=neoverse-v2.
      
      On SPEC2017 fprate I see the following changes on a Grace system:
      503.bwaves_r       0.16%
      507.cactuBSSN_r   -0.32%
      508.namd_r         3.04%
      510.parest_r       0.00%
      511.povray_r       0.78%
      519.lbm_r          0.35%
      521.wrf_r          0.69%
      526.blender_r     -0.53%
      527.cam4_r         0.84%
      538.imagick_r      0.00%
      544.nab_r         -0.97%
      549.fotonik3d_r   -0.45%
      554.roms_r         0.97%
      Geomean            0.35%
      
      with -Ofast -mcpu=grace -flto.
      
      So slight overall improvement with a meaningful improvement in
      508.namd_r.
      
      I think other tunings in aarch64 should look into
      AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA as well, but I'll leave the
      benchmarking to someone else.
      
      Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
      
      gcc/ChangeLog:
      
      	* config/aarch64/tuning_models/neoversev2.h (fp_reassoc_width):
      	Set to 4.
      	(tune_flags): Add AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA.
      cc572242
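For readers unfamiliar with reassociation width, the sketch below writes out by hand what a width of 4 means for a floating-point sum. It is an illustration of the concept only, not output from GCC's reassociation pass. The function names are hypothetical. Regrouping FP additions can change rounding, which is why the compiler only does this under options such as -Ofast.

```c
#include <assert.h>
#include <stddef.h>

/* Serial sum: every addition depends on the previous one, so the loop
   is bound by the latency of a single FP add.  */
double sum_serial(const double *x, size_t n)
{
  assert(x != NULL || n == 0);
  double acc = 0.0;
  for (size_t i = 0; i < n; i++)
    acc += x[i];
  return acc;
}

/* Width-4 reassociation written by hand: four independent accumulators
   allow four additions to be in flight at once.  */
double sum_width4(const double *x, size_t n)
{
  assert(x != NULL || n == 0);
  double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    a0 += x[i];
    a1 += x[i + 1];
    a2 += x[i + 2];
    a3 += x[i + 3];
  }
  double acc = (a0 + a1) + (a2 + a3);
  for (; i < n; i++)   /* leftover elements */
    acc += x[i];
  return acc;
}
```

A width larger than the core can sustain only adds register pressure, which is one reading of why the commit lowers the value from 6 to match the stated throughput of 4.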
    • testsuite: Prune warning about size of enums · 6d8b9b77
      Torbjörn SVENSSON authored
      This fixes the regression reported at
      https://linaro.atlassian.net/browse/GNU-1315
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/warn/pr33738-2.C: dg-prune arm linker messages about
      	size of enums.
      
      Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
      6d8b9b77
    • rtl: Enable the use of rtx values with int and mode attributes · e57d3cce
      Andre Vieira authored
      The 'code' part of a 'define_code_attr' refers to the type of the key: the
      attribute uses a code_iterator to pick the 'value' from its list of
      (key "value") pairs.

      However, rtx_alloc_for_name required a code attribute whenever the 'value'
      had to be an rtx code.  In other words, before this patch no other kind of
      attribute could be used to produce a code-typed 'value'.

      This patch removes that restriction and allows the backend to use any kind of
      attribute, as long as that attribute always produces a valid code-typed 'value'.
      
      gcc/ChangeLog:
      
      	* read-rtl.cc (rtx_reader::rtx_alloc_for_name): Allow all attribute
      	types to produce code 'values'.
      	(check_code_attribute): Rename ...
      	(check_attribute_codes): ... to this.  And change comments to refer to
      	* doc/md.texi: Add paragraph to document that you can use int and mode
      	attributes to produce codes.
      e57d3cce
    • testsuite: Reduce cut-&-paste in scanltranstree.exp · 71059d26
      Richard Sandiford authored
      scanltranstree.exp defines some LTO wrappers around standard
      non-LTO scanners.  Four of them are cut-&-paste variants of
      one another, so this patch generates them from a single template.
      It also does the same for scan-ltrans-tree-dump-times, so that
      other *-times scanners can be added easily in future.
      
      The scanners seem to be lightly used.  gcc.dg/ipa/ipa-icf-38.c uses
      scan-ltrans-tree-dump{,-not} and libgomp.c/declare-variant-1.c
      uses scan-ltrans-tree-dump-{not,times}.  Nothing currently seems
      to use scan-ltrans-tree-dump-dem*.
      
      gcc/testsuite/
      	* lib/scanltranstree.exp: Redefine the routines using two
      	templates.
      71059d26
    • Fix ICE in recompute_tree_invariant_for_addr_expr, at tree.c:4535 [PR84244] · 661acde6
      Andre Vehreschild authored
      Declaring an unused function with a derived type having a pointer
      component, and using that derived type as a coarray, led the compiler to
      ICE because the caf_token for the pointer was not linked into the
      component correctly.
      
      	PR fortran/84244
      
      gcc/fortran/ChangeLog:
      
      	* trans-types.cc (gfc_get_derived_type): When a caf_sub_token is
      	generated for a component, link it to the component it is
      	generated for (the previous one).
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/coarray/ptr_comp_5.f08: New test.
      661acde6
    • aarch64: Implement 16-byte vector mode const0 store by TImode · 8d6c6fbc
      Haochen Gui authored
      gcc/
      	* config/aarch64/aarch64-simd.md (mov<mode> for VSTRUCT_QD):
      	Expand 16-byte vector mode const0 store by TImode.
      8d6c6fbc
    • AVX10.2 ymm rounding: Support vsqrtp{s,d,h} and vsubp{s,d,h} intrins · 7f62e710
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      7f62e710
    • AVX10.2 ymm rounding: Support vscalefp{s,d,h} intrins · 1f86cf06
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def: Add new builtins.
      	* config/i386/sse.md:
      	(<avx512>_scalef<mode><mask_name><round_name>): Add condition check.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      1f86cf06
    • AVX10.2 ymm rounding: Support vreducep{s,d,h} and vrndscalep{s,d,h} intrins · 9afa5081
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/sse.md:
      	(<mask_codefor>reducep<mode><mask_name><round_saeonly_name>):
      	Add condition check.
      	(<avx512>_rndscale<mode><mask_name><round_saeonly_name>): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      9afa5081
    • AVX10.2 ymm rounding: Support vmulp{s,d,h} and vrangep{s,d} intrins · 90cc5b0c
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin):
      	Handle V8SF_FTYPE_V8SF_V8SF_INT_V8SF_UQI_INT,
      	V4DF_FTYPE_V4DF_V4DF_INT_V4DF_UQI_INT.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      90cc5b0c
    • AVX10.2 ymm rounding: Support v{max,min}p{s,d,h} intrins · cc8a7596
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      cc8a7596
    • AVX10.2 ymm rounding: Support vgetexpp{s,d,h} and vgetmantp{s,d,h} intrins · 8d4f5429
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V8SF_FTYPE_V8SF_V8SF_UQI_INT, V4DF_FTYPE_V4DF_V4DF_UQI_INT,
      	V16HF_FTYPE_V16HF_V16HF_UHI_INT, V16HF_FTYPE_V16HF_INT_V16HF_UHI_INT,
      	V4DF_FTYPE_V4DF_INT_V4DF_UQI_INT, V8SF_FTYPE_V8SF_INT_V8SF_UQI_INT.
      	* config/i386/sse.md:
      	(<avx512>_getexp<mode><mask_name><round_saeonly_name>):
      	Add condition check.
      	(<avx512>_getmant<mode><mask_name><round_saeonly_name>):
      	Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      8d4f5429
    • AVX10.2 ymm rounding: Support vfnmsub{132,231,213}p{s,d,h} intrins · 0983d406
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/sse.md:
      	(<avx512>_fnmsub_<mode>_mask3<round_name>): Add condition check.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      0983d406
    • AVX10.2 ymm rounding: Support vfmulcph and vfnmadd{132,231,213}p{s,d,h} intrins · 6f0aa7ad
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      6f0aa7ad
    • AVX10.2 ymm rounding: Support vfm{sub,subadd}{132,231,213}p{s,d,h} intrins · dd48acbe
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/sse.md:
      	(<avx512>_fmsub_<mode>_mask<round_name>): Add condition check.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      dd48acbe
    • AVX10.2 ymm rounding: Support vfmaddcph and vfmaddsub{132,231,213}p{s,d,h} intrins · cfbc94ea
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/sse.md:
      	(<avx512>_fmaddsub_<mode>_mask<round_name>): Add condition check.
      	(<avx512>_fmaddsub_<mode>_mask3<round_name>): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: Add test.
      cfbc94ea
    • AVX10.2 ymm rounding: Support vfmadd{132,231,213}p{s,d,h} intrins · 0683ca35
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/sse.md:
      	(<avx512>_fmadd_<mode>_mask3<round_name>): Add condition check.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: New test.
      0683ca35
    • AVX10.2 ymm rounding: Support vfc{madd,mul}cph, vfixupimmp{s,d} intrins · 95980b29
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V16HF_FTYPE_V16HF_V16HF_INT, V16HF_FTYPE_V16HF_V16HF_V16HF_INT,
      	V16HF_FTYPE_V16HF_V16HF_V16HF_UQI_INT,
      	V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI_INT,
      	V8SF_FTYPE_V8SF_V8SF_V8SI_INT_UQI_INT.
      	* config/i386/sse.md:
      	(<avx512>_fixupimm<mode><sd_maskz_name><round_saeonly_name>):
      	Add condition check.
      	(<avx512>_fixupimm<mode>_mask<round_saeonly_name>): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: New test.
      95980b29
    • AVX10.2 ymm rounding: Support vcvt{,u}w2ph and vdivp{s,d,h} intrins · 3d1b5530
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V16HF_FTYPE_V16HI_V16HF_UHI_INT.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-3.c: New test.
      3d1b5530
    • AVX10.2 ymm rounding: Support vcvttps2{,u}{dq,qq} and vcvtu{dq,qq}2p{s,d,h} intrins · b2754227
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/sse.md
      	(unspec_fix_truncv8sfv8si2<mask_name>): Extend rounding control.
      	(<mask_codefor>fixuns_trunc<mode><sseintvecmodelower>2<mask_name>):
      	Ditto.
      	(<mask_codefor>floatuns<sseintvecmodelower><mode>2<mask_name><round_name>):
      	Add condition check.
      	(fix<fixunssuffix>_trunc<mode><sselongvecmodelower>2<mask_name><round_saeonly_name>):
      	Remove round_saeonly_name.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-2.c: Add test.
      b2754227
    • AVX10.2 ymm rounding: Support vcvttph2{,u}{dq,qq,w} intrins · 493c5096
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/sse.md (avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name>):
      	Extend round control for 256bit.
      	(unspec_avx512fp16_fix<vcvtt_uns_suffix>_trunc<mode>2<mask_name>):
      	Ditto.
      	(avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name><round_saeonly_name>):
      	Add condition check.
      	* config/i386/subst.md
      	(round_saeonly_mode_condition): Add V16HI check for 256bit.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-2.c: Add test.
      493c5096
    • AVX10.2 ymm rounding: Support vcvtqq2p{s,d,h} and vcvttpd2{,u}{dq,qq} intrins · 6e231f85
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V4DF_FTYPE_V4DI_V4DF_UQI_INT, V4SF_FTYPE_V4DI_V4SF_UQI_INT,
      	V8HF_FTYPE_V4DI_V8HF_UQI_INT.
      	* config/i386/sse.md:
      	(avx512fp16_vcvt<floatsuffix>qq2ph_v4di_mask_round): New expand.
      	(*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask):
      	Extend round control and add "_1" suffix.
      	(float<floatunssuffix><sseintvecmodelower><mode>2<mask_name><round_name>):
      	Add condition check.
      	(float<floatunssuffix><sselongvecmodelower><mode>2<mask_name><round_name>):
      	Ditto.
      	(float<floatunssuffix><mode><ssePSmode2lower>2<mask_name><round_name>):
      	Limit suffix output.
      	(unspec_fix_truncv4dfv4si2<mask_name>): Extend round control.
      	(unspec_fixuns_truncv4dfv4si2<mask_name>): Ditto.
      	* config/i386/subst.md (round_qq2pssuff): New iterator.
      	(round_saeonly_suff): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-2.c: New test.
      6e231f85
    • AVX10.2 ymm rounding: Support vcvtps2{,u}{dq,qq} intrins · 0f5a42d4
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V8SI_FTYPE_V8SF_V8SI_UQI_INT, V4DI_FTYPE_V4SF_V4DI_UQI_INT.
      	* config/i386/sse.md
      	(<sse2_avx_avx512f>_fix_notrunc<sf2simodelower><mode><mask_name>):
      	Extend to round.
      	(<mask_codefor><avx512>_fixuns_notrunc<sf2simodelower><mode><mask_name><round_name>):
      	Add round condition check.
      	* config/i386/subst.md (round_constraint4): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-1.c: Add test.
      0f5a42d4
    • AVX10.2 ymm rounding: Support vcvtph2{,u}w and vcvtps2p{d,hx} intrins · b70bb94a
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V16HI_FTYPE_V16HF_V16HI_UHI_INT, V4DF_FTYPE_V4SF_V4DF_UQI_INT,
      	V8HF_FTYPE_V8SF_V8HF_UQI_INT.
      	* config/i386/sse.md
      	(avx512fp16_vcvt<castmode>2ph_<mode><mask_name><round_name>):
      	Add round condition check.
      	* config/i386/subst.md (round_mode_condition): Add V16HI check for
      	256bit.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-1.c: Add test.
      b70bb94a
    • AVX10.2 ymm rounding: Support vcvtph2p{s,d,sx} and vcvtph2{,u}{dq,qq} intrins · 6f2eac53
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: New intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V8SF_FTYPE_V8HF_V8SF_UQI_INT, V8SI_FTYPE_V8HF_V8SI_UQI_INT,
      	V4DF_FTYPE_V8HF_V4DF_UQI_INT, V4DI_FTYPE_V8HF_V4DI_UQI_INT.
      	* config/i386/sse.md:
      	(avx512fp16_float_extend_ph<mode>2<mask_name><round_saeonly_name>):
      	Add condition check.
      	(avx512fp16_vcvtph2<sseintconvertsignprefix><sseintconvert>_<mode>
      	<mask_name><round_name>):
      	Ditto.
      	(avx512fp16_float_extend_ph<mode>2<mask_name>): Extend round saeonly.
      	(vcvtph2ps256<mask_name>): Ditto.
      	* config/i386/subst.md
      	(round_saeonly_applied): New condition.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Add new macro test.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-1.c: Add test.
      6f2eac53
    • AVX10.2 ymm rounding: Support vcvtpd2{,u}{dq,qq} intrins · 508ac49e
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: Add new intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V4DI_FTYPE_V4DF_V4DI_UQI_INT, V4SI_FTYPE_V4DF_V4SI_UQI_INT.
      	* config/i386/sse.md:
      	(avx_cvtpd2dq256<mask_name>): Change name to
      	avx_cvtpd2dq256<mask_name><round_name> and extend pattern to
      	generate 256bit insns.
      	(fixuns_notrunc<mode><si2dfmodelower>2<mask_name><round_name>):
      	Add round_mode_condition.
      	* config/i386/subst.md (round_pd2udqsuff): New iterator.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/sse-14.c: Add new macro test.
      	* gcc.target/i386/sse-22.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-1.c: Add test.
      508ac49e
    • Hu, Lin1's avatar
      AVX10.2 ymm rounding: Support vcvtdq2p{s,h} and vcvtpd2p{s,h} intrins · 85e874d1
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config/i386/avx10_2roundingintrin.h: Add new intrins.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V8SF_FTYPE_V8SI_V8SF_UQI_INT, V4SF_FTYPE_V4DF_V4SF_UQI_INT,
      	V8HF_FTYPE_V8SI_V8HF_UQI_INT, V8HF_FTYPE_V4DF_V8HF_UQI_INT.
      	* config/i386/sse.md:
      	(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode><mask_name><round_name>):
      	Add condition check.
      	(avx512fp16_vcvtpd2ph_v4df_mask_round): New expand.
      	(*avx512fp16_vcvt<castmode>2ph_<mode>_mask): Change name to
      	avx512fp16_vcvt<castmode>2ph_<mode>_mask<round_name>_1
      	and extend pattern to generate 256bit insns.
      	(avx_cvtpd2ps256<mask_name>): Change name to
      	avx_cvtpd2ps256<mask_name><round_name> and extend pattern to
      	generate 256bit insns.
      	* config/i386/subst.md (round_applied): New condition.
      	(round_suff): New iterator.
      	(round_mode_condition): Add V32HI check for 512bit.
      	(round_saeonly_mode_condition): Ditto.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add new builtin test.
      	* gcc.target/i386/sse-13.c: Ditto.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/sse-14.c: Add new macro test.
      	* gcc.target/i386/sse-22.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-1.c: Add test.
      85e874d1
    • Hu, Lin1's avatar
      AVX10.2 ymm rounding: Support vadd{s,d,h} and vcmp{s,d,h} intrins · e22e3af1
      Hu, Lin1 authored
      gcc/ChangeLog:
      
      	* config.gcc: Add avx10_2roundingintrin.h.
      	* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
      	* config/i386/i386-builtin.def (BDESC): Add new builtins.
      	* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
      	V4DF_FTYPE_V4DF_V4DF_V4DF_UQI_INT, V8SF_FTYPE_V8SF_V8SF_V8SF_UQI_INT,
      	V16HF_FTYPE_V16HF_V16HF_V16HF_UHI_INT, UQI_FTYPE_V4DF_V4DF_INT_UQI_INT,
      	UHI_FTYPE_V16HF_V16HF_INT_UHI_INT, UQI_FTYPE_V8SF_V8SF_INT_UQI_INT.
      	* config/i386/immintrin.h: Include avx10_2roundingintrin.h.
      	* config/i386/sse.md: Change subst_attr name due to renaming.
      	* config/i386/subst.md:
      	(<round_mode512bit_condition>): Add condition check for avx10.2
      	rounding control 256bit intrins and renamed to ...
      	(<round_mode_condition>): ...this.
      	(round_saeonly_mode512bit_condition): Add condition check for
      	avx10.2 rounding control 256 bit intrins and renamed to ...
      	(round_saeonly_mode_condition): ...this.
      	* config/i386/avx10_2roundingintrin.h: New file.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/avx-1.c: Add -mavx10.2 and new builtin test.
      	* gcc.target/i386/avx-2.c: Ditto.
      	* gcc.target/i386/sse-13.c: Add new tests.
      	* gcc.target/i386/sse-23.c: Ditto.
      	* gcc.target/i386/sse-14.c: Ditto.
      	* gcc.target/i386/sse-22.c: Ditto.
      	* gcc.target/i386/avx10_2-rounding-1.c: New test.
      e22e3af1
    • GCC Administrator's avatar
      Daily bump. · f11bc088
      GCC Administrator authored
      f11bc088
  2. Aug 18, 2024
    • Jeff Law's avatar
      [PR rtl-optimization/115876] Avoid ubsan in ext-dce.cc · f10d2ee9
      Jeff Law authored
      This fixes two general ubsan issues in ext-dce, both related to use-side
      processing of modes > DImode.
      
      In ext_dce_process_uses we can be presented with something like this as a use:
      (subreg:SI (reg:TF) 12)
      
      That will result in an out of range shift for a HOST_WIDE_INT object.  Where
      this happens, it is safe to just break from the SET context and process the
      subjects.  This will ultimately result in seeing (reg:TF) and we'll mark all
      bit groups as live.
      
      In carry_backpropagate we can be presented with a TImode shift (for example)
      and the shift count can be > 63 for such a shift.  This naturally trips ubsan
      as well, since we're operating on 64-bit objects.
      
      We can just return mmask in this case noting that every bit group is live.
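
The two guards described above can be sketched in plain C. This is only an illustration and not the code in ext-dce.cc: `wide_int_t`, `WIDE_INT_BITS`, `live_bits_before_left_shift` and `subreg_offset_fits` are hypothetical names, and a 64-bit `uint64_t` is assumed as a stand-in for HOST_WIDE_INT.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: a 64-bit stand-in for HOST_WIDE_INT and its bit width.  */
typedef uint64_t wide_int_t;
#define WIDE_INT_BITS ((unsigned) (sizeof (wide_int_t) * CHAR_BIT))

/* Hypothetical helper: given the live bits of (src << count), return the
   bits of src that must be treated as live.  Shifting a 64-bit object by
   64 or more is undefined behavior, so in that case give up and return
   mode_mask, i.e. conservatively treat every bit as live.  */
wide_int_t
live_bits_before_left_shift (wide_int_t live_after, unsigned count,
                             wide_int_t mode_mask)
{
  if (count >= WIDE_INT_BITS)
    return mode_mask;
  return live_after >> count;
}

/* Hypothetical helper: nonzero if a subreg byte offset converts to a bit
   offset that is a valid shift count for wide_int_t.  For the example
   (subreg:SI (reg:TF) 12), the bit offset is 96, which does not fit.  */
int
subreg_offset_fits (unsigned byte_offset)
{
  return byte_offset < WIDE_INT_BITS / CHAR_BIT;
}

/* Usage sketch: in-range counts shift normally, out-of-range counts fall
   back to the conservative answer.  */
void
check_guards (void)
{
  assert (live_bits_before_left_shift (0xF0u, 4, UINT64_MAX) == 0x0Fu);
  assert (live_bits_before_left_shift (0xF0u, 100, UINT64_MAX) == UINT64_MAX);
  assert (subreg_offset_fits (4));
  assert (!subreg_offset_fits (12));
}
```

The point of both helpers is the same: test the shift count against the object width before shifting, and pick the conservative "everything is live" result when the count does not fit.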
      
      The combination of these two fixes eliminates all the reported ubsan issues in
      ext-dce seen in a bootstrap and regression test on x86.
      
      While I was in there I went ahead and fixed the various hardcoded 63/64 values
      to be HOST_BITS_PER_WIDE_INT based.
      
      Bootstrapped and regression tested on x86 with no regressions.  Also built with
      ubsan enabled and verified the build logs and testsuite logs don't call out any
      issues in ext-dce anymore.
      
      Pushing to the trunk.
      
      	PR rtl-optimization/115876
      gcc
      	* ext-dce.cc (ext_dce_process_sets): Replace hardcoded 63/64 instances
      	with HOST_BITS_PER_WIDE_INT based values.
      	(carry_backpropagate): Handle modes with more bits than
      	HOST_BITS_PER_WIDE_INT gracefully, avoiding undefined behavior.
      	(ext_dce_process_uses): Handle subreg offsets which would result
      	in ubsan shifts gracefully, avoiding undefined behavior.
      f10d2ee9
    • Gerald Pfeifer's avatar
      libstdc++: Remove note from the GCC 4.0.1 days · fc412630
      Gerald Pfeifer authored
      libstdc++-v3:
      	* doc/xml/manual/prerequisites.xml: Remove note from the
      	GCC 4.0.1 days.
      	* doc/html/manual/setup.html: Regenerate.
      fc412630
    • Gerald Pfeifer's avatar
      doc: Tweak gm2 mailing list address · b9ac01d8
      Gerald Pfeifer authored
      gcc:
      	* doc/gm2.texi (Contributing): Tweak gm2 mailing list address.
      b9ac01d8
    • Andrew Pinski's avatar
      PHIOPT: move factor_out_conditional_operation over to use gimple_match_op · cd2f3944
      Andrew Pinski authored
      
      To start working with expressions that have more than one operand, converting
      over to use gimple_match_op is needed.
      A side effect is that factor_out_conditional_operation can now support
      builtins/internal calls that have one operand without any extra code.
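
As a rough source-level picture of what factoring a one-operand call out of a conditional means, consider the sketch below. It is an assumption-laden illustration, not the pass itself: the function names are hypothetical, `__builtin_popcount` is used only as a convenient one-operand GCC builtin, and whether GCC actually performs the transformation on a given input depends on checks inside the pass that are not described here.

```c
#include <assert.h>

/* Before: the same one-operand builtin is applied in both arms, so the
   merge point selects between two results.  */
int
popcount_in_both_arms (int cond, unsigned a, unsigned b)
{
  int r;
  if (cond)
    r = __builtin_popcount (a);
  else
    r = __builtin_popcount (b);
  return r;
}

/* After (conceptually): select the operand first, then apply the builtin
   once to the selected value.  */
int
popcount_factored_out (int cond, unsigned a, unsigned b)
{
  unsigned t = cond ? a : b;
  return __builtin_popcount (t);
}

/* Self-check: both forms agree for either value of cond.  */
void
check_equivalent (unsigned a, unsigned b)
{
  assert (popcount_in_both_arms (0, a, b) == popcount_factored_out (0, a, b));
  assert (popcount_in_both_arms (1, a, b) == popcount_factored_out (1, a, b));
}
```

The two functions compute the same value; the second form simply has one call instead of two.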
      
      Note on the changed testcases:
      * pr87007-5.c: the test checks that partial register stalls are avoided
      for the sqrt and that the register is zeroed only once before the
      branch; phiopt would now merge the sqrts, so phiopt is disabled for this test.
      
      Bootstrapped and tested on x86_64-linux-gnu with no regressions.
      
      gcc/ChangeLog:
      
      	* gimple-match-exports.cc (gimple_match_op::operands_occurs_in_abnormal_phi):
      	New function.
      	* gimple-match.h (gimple_match_op): Add operands_occurs_in_abnormal_phi.
      	* tree-ssa-phiopt.cc (factor_out_conditional_operation): Use gimple_match_op
      	instead of manually extracting from/creating the gimple.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/i386/pr87007-5.c: Disable phi-opt.
      
      Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
      cd2f3944