- Aug 19, 2024
-
-
Andrew Carlotti authored
The availability of ls64 intrinsics and data types was determined solely by the globally specified architecture features, which did not reflect any changes specified in target pragmas or attributes. This patch removes the initialisation-time guards for the intrinsics, and replaces them with checks at use time. We also get better error messages when ls64 is not available (matching the existing error messages for SVE intrinsics). The data512_t type is made always available; this is consistent with the present behaviour for Neon fp16/bf16 types. gcc/ChangeLog: PR target/112108 * config/aarch64/aarch64-builtins.cc (handle_arm_acle_h): Remove feature check at initialisation. (aarch64_general_check_builtin_call): Check ls64 intrinsics. * config/aarch64/arm_acle.h: (data512_t) Make always available. gcc/testsuite/ChangeLog: PR target/112108 * gcc.target/aarch64/acle/ls64_guard-1.c: New test. * gcc.target/aarch64/acle/ls64_guard-2.c: New test. * gcc.target/aarch64/acle/ls64_guard-3.c: New test. * gcc.target/aarch64/acle/ls64_guard-4.c: New test.
-
Andrew Carlotti authored
The availability of memtag intrinsics and data types was determined solely by the globally specified architecture features, which did not reflect any changes specified in target pragmas or attributes. This patch removes the initialisation-time guards for the intrinsics, and replaces them with checks at use time. It also removes the macro indirection from the header file - this simplifies the header, and allows the missing extension error reporting to find the user-facing intrinsic names. gcc/ChangeLog: PR target/112108 * config/aarch64/aarch64-builtins.cc (aarch64_init_memtag_builtins): Define intrinsic names directly. (aarch64_general_init_builtins): Move memtag initialisation... (handle_arm_acle_h): ...to here, and remove feature check. (aarch64_general_check_builtin_call): Check memtag intrinsics. * config/aarch64/arm_acle.h (__arm_mte_create_random_tag) (__arm_mte_exclude_tag, __arm_mte_ptrdiff) (__arm_mte_increment_tag, __arm_mte_set_tag, __arm_mte_get_tag): Remove. gcc/testsuite/ChangeLog: PR target/112108 * gcc.target/aarch64/acle/memtag_guard-1.c: New test. * gcc.target/aarch64/acle/memtag_guard-2.c: New test. * gcc.target/aarch64/acle/memtag_guard-3.c: New test. * gcc.target/aarch64/acle/memtag_guard-4.c: New test.
-
Andrew Carlotti authored
The availability of tme intrinsics was previously gated at both initialisation time (using global target options) and usage time (accounting for function-specific target options). This patch removes the check at initialisation time, and also moves the intrinsics out of the header file to allow for better error messages (matching the existing error messages for SVE intrinsics). gcc/ChangeLog: PR target/112108 * config/aarch64/aarch64-builtins.cc (aarch64_init_tme_builtins): Define intrinsic names directly. (aarch64_general_init_builtins): Move tme initialisation... (handle_arm_acle_h): ...to here, and remove feature check. (aarch64_general_check_builtin_call): Check tme intrinsics. * config/aarch64/arm_acle.h (__tstart, __tcommit, __tcancel) (__ttest): Remove. (_TMFAILURE_*): Define unconditionally. gcc/testsuite/ChangeLog: PR target/112108 * gcc.target/aarch64/acle/tme_guard-1.c: New test. * gcc.target/aarch64/acle/tme_guard-2.c: New test. * gcc.target/aarch64/acle/tme_guard-3.c: New test. * gcc.target/aarch64/acle/tme_guard-4.c: New test.
-
Andrew Carlotti authored
Move SVE extension checking functionality to aarch64-builtins.cc, so that it can be shared by non-SVE intrinsics. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins.cc (check_builtin_call) (expand_builtin): Update calls to the below. (report_missing_extension, report_missing_registers) (check_required_extensions): Move out of aarch64_sve namespace, rename, and move into... * config/aarch64/aarch64-builtins.cc (aarch64_report_missing_extension) (aarch64_report_missing_registers) (aarch64_check_required_extensions) ...here. * config/aarch64/aarch64-protos.h (aarch64_check_required_extensions): Add prototype.
-
Andrew Carlotti authored
Replace TARGET_GENERAL_REGS_ONLY check with an explicit check that aarch64_isa_flags enables all required extensions. This will be more flexible when repurposing this function for non-SVE intrinsics. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins.cc (check_required_registers): Remove target check and rename to... (report_missing_registers): ...this. (check_required_extensions): Refactor.
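The idea of checking that a set of ISA flags enables all required extensions can be sketched as a plain bitmask test. This is only an illustration: the type `feature_flags`, the `FEAT_*` bits and both function names are hypothetical and are not GCC's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; GCC's real aarch64 flag names and types differ.  */
typedef uint64_t feature_flags;
#define FEAT_SIMD (UINT64_C (1) << 0)
#define FEAT_SVE  (UINT64_C (1) << 1)
#define FEAT_LS64 (UINT64_C (1) << 2)

/* Return the subset of REQUIRED that ENABLED does not provide.  */
feature_flags
missing_extensions (feature_flags enabled, feature_flags required)
{
  return required & ~enabled;
}

/* Return true when every extension in REQUIRED is present in ENABLED.  */
bool
has_required_extensions (feature_flags enabled, feature_flags required)
{
  return missing_extensions (enabled, required) == 0;
}
```

Returning the missing bits, rather than only a yes/no answer, is what lets a caller name the absent extension in a diagnostic.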
-
Andre Vehreschild authored
Fix ICE when scalar coarrays are used in a select type. Prevent coindexing in associate/select type/select rank selector expression. gcc/fortran/ChangeLog: PR fortran/46371 PR fortran/56496 * expr.cc (gfc_is_coindexed): Detect is coindexed also when rewritten to caf_get. * trans-stmt.cc (trans_associate_var): Always accept a descriptor for coarrays. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/select_type_1.f90: New test. * gfortran.dg/coarray/select_type_2.f90: New test. * gfortran.dg/coarray/select_type_3.f90: New test.
-
Arsen Arsenović authored
gcc/ada/ChangeLog: PR ada/115917 * gnatvsn.ads: Add note about the duplication of this value in version.c. * version.c (VER_LEN_MAX): Define to the same value as Gnatvsn.Ver_Len_Max. (gnat_version_string): Use VER_LEN_MAX as bound.
-
Kyrylo Tkachov authored
aarch64: Reduce FP reassociation width for Neoverse V2 and set AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA

The fp reassociation width for Neoverse V2 was set to 6 since its introduction and I guess it was empirically tuned. But since AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA was added the tree reassociation pass seems to be more deliberate in forming FMAs and when that flag is used it seems to more properly evaluate the FMA vs non-FMA reassociation widths.

According to the Neoverse V2 SWOG the core has a throughput of 4 for most FP operations, so the value 6 is not accurate anyway. Also, the SWOG does state that FMADD operations are pipelined and the results can be forwarded from FP multiplies to the accumulation operands of FMADD instructions, which seems to be what AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA expresses.

This patch sets the fp_reassoc_width field to 4 and enables AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA for -mcpu=neoverse-v2.

On SPEC2017 fprate I see the following changes on a Grace system:

  503.bwaves_r       0.16%
  507.cactuBSSN_r   -0.32%
  508.namd_r         3.04%
  510.parest_r       0.00%
  511.povray_r       0.78%
  519.lbm_r          0.35%
  521.wrf_r          0.69%
  526.blender_r     -0.53%
  527.cam4_r         0.84%
  538.imagick_r      0.00%
  544.nab_r         -0.97%
  549.fotonik3d_r   -0.45%
  554.roms_r         0.97%
  Geomean            0.35%

with -Ofast -mcpu=grace -flto. So slight overall improvement with a meaningful improvement in 508.namd_r.

I think other tunings in aarch64 should look into AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA as well, but I'll leave the benchmarking to someone else.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

gcc/ChangeLog: * config/aarch64/tuning_models/neoversev2.h (fp_reassoc_width): Set to 4. (tune_flags): Add AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA.
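As a rough source-level picture of what a reassociation width of 4 means, the sketch below rewrites a serial floating-point sum into four independent accumulator chains. It is a hand-written illustration, not output of the reassociation pass, and the function names are made up for this example. Reassociating floating-point additions can change rounding in general, which is why a compiler only does this under options such as -Ofast.

```c
#include <stddef.h>

/* One dependency chain: every addition waits for the previous result.  */
double
sum_serial (const double *x, size_t n)
{
  double acc = 0.0;
  for (size_t i = 0; i < n; i++)
    acc += x[i];
  return acc;
}

/* Four independent chains: up to four additions can be in flight at once.  */
double
sum_width4 (const double *x, size_t n)
{
  double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
  size_t i = 0;

  for (; i + 4 <= n; i += 4)
    {
      a0 += x[i];
      a1 += x[i + 1];
      a2 += x[i + 2];
      a3 += x[i + 3];
    }

  /* Handle the elements left over when n is not a multiple of 4.  */
  for (; i < n; i++)
    a0 += x[i];

  return (a0 + a1) + (a2 + a3);
}
```

A width larger than the number of operations the core can sustain per cycle adds register pressure without adding parallelism, which is the concern the message raises about the old value of 6.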
-
Torbjörn SVENSSON authored
This fixes the regression reported at https://linaro.atlassian.net/browse/GNU-1315 . gcc/testsuite/ChangeLog: * g++.dg/warn/pr33738-2.C: dg-prune arm linker messages about size of enums. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
-
Andre Vieira authored
The 'code' part of a 'define_code_attr' refers to the type of the key, in other words, it uses a code_iterator to pick the 'value' from their (key "value") pair list. However, rtx_alloc_for_name requires a code_attribute to be used when the 'value' needs to be a type. In other words, no other type of attributes could be used, before this patch, to produce a rtx typed 'value'. This patch removes that restriction and allows the backend to use any kind of attribute as long as that attribute always produces a valid code typed 'value'. gcc/ChangeLog: * read-rtl.cc (rtx_reader::rtx_alloc_for_name): Allow all attribute types to produce code 'values'. (check_code_attribute): Rename ... (check_attribute_codes): ... to this. And change comments to refer to * doc/md.texi: Add paragraph to document that you can use int and mode attributes to produce codes.
-
Richard Sandiford authored
scanltranstree.exp defines some LTO wrappers around standard non-LTO scanners. Four of them are cut-&-paste variants of one another, so this patch generates them from a single template. It also does the same for scan-ltrans-tree-dump-times, so that other *-times scanners can be added easily in future. The scanners seem to be lightly used. gcc.dg/ipa/ipa-icf-38.c uses scan-ltrans-tree-dump{,-not} and libgomp.c/declare-variant-1.c uses scan-ltrans-tree-dump-{not,times}. Nothing currently seems to use scan-ltrans-tree-dump-dem*. gcc/testsuite/ * lib/scanltranstree.exp: Redefine the routines using two templates.
-
Andre Vehreschild authored
Declaring an unused function with a derived type having a pointer component, and using that derived type as a coarray, led the compiler to ICE because the caf_token for the pointer was not linked into the component correctly. PR fortran/84244 gcc/fortran/ChangeLog: * trans-types.cc (gfc_get_derived_type): When a caf_sub_token is generated for a component, link it to the component it is generated for (the previous one). gcc/testsuite/ChangeLog: * gfortran.dg/coarray/ptr_comp_5.f08: New test.
-
Haochen Gui authored
gcc/ * config/aarch64/aarch64-simd.md (mov<mode> for VSTRUCT_QD): Expand 16-byte vector mode const0 store by TImode.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def: Add new builtins. * config/i386/sse.md: (<avx512>_scalef<mode><mask_name><round_name>): Add condition check. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md: (<mask_codefor>reducep<mode><mask_name><round_saeonly_name>): Add condition check. (<avx512>_rndscale<mode><mask_name><round_saeonly_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V8SF_FTYPE_V8SF_V8SF_INT_V8SF_UQI_INT, V4DF_FTYPE_V4DF_V4DF_INT_V4DF_UQI_INT. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V8SF_FTYPE_V8SF_V8SF_UQI_INT, V4DF_FTYPE_V4DF_V4DF_UQI_INT, V16HF_FTYPE_V16HF_V16HF_UHI_INT, V16HF_FTYPE_V16HF_INT_V16HF_UHI_INT, V4DF_FTYPE_V4DF_INT_V4DF_UQI_INT, V8SF_FTYPE_V8SF_INT_V8SF_UQI_INT. * config/i386/sse.md: (<avx512>_getexp<mode><mask_name><round_saeonly_name>): Add condition check. (<avx512>_getmant<mode><mask_name><round_saeonly_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md: (<avx512>_fnmsub_<mode>_mask3<round_name>): Add condition check. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md: (<avx512>_fmsub_<mode>_mask<round_name>): Add condition check. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md: (<avx512>_fmaddsub_<mode>_mask<round_name>): Add condition check. (<avx512>_fmaddsub_<mode>_mask3<round_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md: (<avx512>_fmadd_<mode>_mask3<round_name>): Add condition check. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: New test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V16HF_FTYPE_V16HF_V16HF_INT, V16HF_FTYPE_V16HF_V16HF_V16HF_INT, V16HF_FTYPE_V16HF_V16HF_V16HF_UQI_INT, V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI_INT, V8SF_FTYPE_V8SF_V8SF_V8SI_INT_UQI_INT. * config/i386/sse.md: (<avx512>_fixupimm<mode><sd_maskz_name><round_saeonly_name>): Add condition check. (<avx512>_fixupimm<mode>_mask<round_saeonly_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: New test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V16HF_FTYPE_V16HI_V16HF_UHI_INT. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: New test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md (unspec_fix_truncv8sfv8si2<mask_name>): Extend rounding control. (<mask_codefor>fixuns_trunc<mode><sseintvecmodelower>2<mask_name>): Ditto. (<mask_codefor>floatuns<sseintvecmodelower><mode>2<mask_name><round_name>): Add condition check. (fix<fixunssuffix>_trunc<mode><sselongvecmodelower>2<mask_name><round_saeonly_name>): Remove round_saeonly_name. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-2.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md (avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name>): Extend round control for 256bit. (unspec_avx512fp16_fix<vcvtt_uns_suffix>_trunc<mode>2<mask_name>): Ditto. (avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name><round_saeonly_name>): Add condition check. * config/i386/subst.md (round_saeonly_mode_condition): Add V16HI check for 256bit. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-2.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V4DF_FTYPE_V4DI_V4DF_UQI_INT, V4SF_FTYPE_V4DI_V4SF_UQI_INT, V8HF_FTYPE_V4DI_V8HF_UQI_INT. * config/i386/sse.md: (avx512fp16_vcvt<floatsuffix>qq2ph_v4di_mask_round): New expand. (*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask): Extend round control and add "_1" suffix. (float<floatunssuffix><sseintvecmodelower><mode>2<mask_name><round_name>): Add condition check. (float<floatunssuffix><sselongvecmodelower><mode>2<mask_name><round_name>): Ditto. (float<floatunssuffix><mode><ssePSmode2lower>2<mask_name><round_name>): Limit suffix output. (unspec_fix_truncv4dfv4si2<mask_name>): Extend round control. (unspec_fixuns_truncv4dfv4si2<mask_name>): Ditto. * config/i386/subst.md (round_qq2pssuff): New iterator. (round_saeonly_suff): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-2.c: New test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V8SI_FTYPE_V8SF_V8SI_UQI_INT, V4DI_FTYPE_V4SF_V4DI_UQI_INT. * config/i386/sse.md (<sse2_avx_avx512f>_fix_notrunc<sf2simodelower><mode><mask_name>): Extend to round. (<mask_codefor><avx512>_fixuns_notrunc<sf2simodelower><mode><mask_name><round_name>): Add round condition check. * config/i386/subst.md (round_constraint4): New. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-1.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V16HI_FTYPE_V16HF_V16HI_UHI_INT, V4DF_FTYPE_V4SF_V4DF_UQI_INT V8HF_FTYPE_V8SF_V8HF_UQI_INT. * config/i386/sse.md (avx512fp16_vcvt<castmode>2ph_<mode><mask_name><round_name>): Add round condition check. * config/i386/subst.md (round_mode_condition): Add V16HI check for 256bit. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-1.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: New intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V8SF_FTYPE_V8HF_V8SF_UQI_INT, V8SI_FTYPE_V8HF_V8SI_UQI_INT, V4DF_FTYPE_V8HF_V4DF_UQI_INT, V4DI_FTYPE_V8HF_V4DI_UQI_INT. * config/i386/sse.md: (avx512fp16_float_extend_ph<mode>2<mask_name><round_saeonly_name>): Add condition check. (avx512fp16_vcvtph2<sseintconvertsignprefix><sseintconvert>_<mode> <mask_name><round_name>): Ditto. (avx512fp16_float_extend_ph<mode>2<mask_name>): Extend round saeonly. (vcvtph2ps256<mask_name>): Ditto. * config/i386/subst.md (round_saeonly_applied): New condition. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add new macro test. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-rounding-1.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: Add new intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V4DI_FTYPE_V4DF_V4DI_UQI_INT, V4SI_FTYPE_V4DF_V4SI_UQI_INT. * config/i386/sse.md: (avx_cvtpd2dq256<mask_name>): Change name to avx_cvtpd2dq256<mask_name><round_name> and extend pattern to generate 256bit insns. (fixuns_notrunc<mode><si2dfmodelower>2<mask_name><round_name>): Add round_mode_condition. * config/i386/subst.md (round_pd2udqsuff): New iterator. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add new macro test. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/avx10_2-rounding-1.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config/i386/avx10_2roundingintrin.h: Add new intrins. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V8SF_FTYPE_V8SI_V8SF_UQI_INT, V4SF_FTYPE_V4DF_V4SF_UQI_INT, V8HF_FTYPE_V8SI_V8HF_UQI_INT, V8HF_FTYPE_V4DF_V8HF_UQI_INT. * config/i386/sse.md: (avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode><mask_name><round_name>): Add condition check. (avx512fp16_vcvtpd2ph_v4df_mask_round): New expand. (*avx512fp16_vcvt<castmode>2ph_<mode>_mask): Change name to avx512fp16_vcvt<castmode>2ph_<mode>_mask<round_name>_1 and extend pattern to generate 256bit insns. (avx_cvtpd2ps256<mask_name>): Change name to avx_cvtpd2ps256<mask_name><round_name> and extend pattern to generate 256bit insns. * config/i386/subst.md (round_applied): New condition. (round_suff): New iterator. (round_mode_condition): Add V32HI check for 512bit. (round_saeonly_mode_condition): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add new builtin test. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add new macro test. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/avx10_2-rounding-1.c: Add test.
-
Hu, Lin1 authored
gcc/ChangeLog: * config.gcc: Add avx10_2roundingintrin.h. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle V4DF_FTYPE_V4DF_V4DF_V4DF_UQI_INT, V8SF_FTYPE_V8SF_V8SF_V8SF_UQI_INT, V16HF_FTYPE_V16HF_V16HF_V16HF_UHI_INT, UQI_FTYPE_V4DF_V4DF_INT_UQI_INT, UHI_FTYPE_V16HF_V16HF_INT_UHI_INT, UQI_FTYPE_V8SF_V8SF_INT_UQI_INT. * config/i386/immintrin.h: Include avx10_2roundingintrin.h. * config/i386/sse.md: Change subst_attr name due to renaming. * config/i386/subst.md: (<round_mode512bit_condition>): Add condition check for avx10.2 rounding control 256bit intrins and renamed to ... (<round_mode_condition>): ...this. (round_saeonly_mode512bit_condition): Add condition check for avx10.2 rounding control 256bit intrins and renamed to ... (round_saeonly_mode_condition): ...this. * config/i386/avx10_2roundingintrin.h: New file. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add -mavx10.2 and new builtin test. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/sse-13.c: Add new tests. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/avx10_2-rounding-1.c: New test.
-
GCC Administrator authored
-
- Aug 18, 2024
-
-
Jeff Law authored
This fixes two general ubsan issues in ext-dce, both related to use-side processing of modes > DImode.

In ext_dce_process_uses we can be presented with something like this as a use:

  (subreg:SI (reg:TF) 12)

That will result in an out of range shift for a HOST_WIDE_INT object. Where this happens it is safe to just break from the SET context and process the subjects. This will ultimately result in seeing (reg:TF) and we'll mark all bit groups as live.

In carry_backpropagate we can be presented with a TImode shift (for example) and the shift count can be > 63 for such a shift. This naturally trips ubsan as well, as we're operating on 64 bit objects. We can just return mmask in this case, noting that every bit group is live.

The combination of these two fixes eliminates all the reported ubsan issues in ext-dce seen in a bootstrap and regression test on x86.

While I was in there I went ahead and fixed the various hardcoded 63/64 values to be HOST_BITS_PER_WIDE_INT based.

Bootstrapped and regression tested on x86 with no regressions. Also built with ubsan enabled and verified the build logs and testsuite logs don't call out any issues in ext-dce anymore.

Pushing to the trunk.

PR rtl-optimization/115876 gcc * ext-dce.cc (ext_dce_process_sets): Replace hardcoded 63/64 instances with HOST_BITS_PER_WIDE_INT based values. (carry_backpropagate): Handle modes with more bits than HOST_BITS_PER_WIDE_INT gracefully, avoiding undefined behavior. (ext_dce_process_uses): Handle subreg offsets which would result in ubsan shifts gracefully, avoiding undefined behavior.
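The undefined behaviour described here comes from shifting a 64-bit object by 64 or more, which C leaves undefined. A minimal sketch of the guard, assuming a 64-bit mask type, might look like the following. The function name and the `all_live` parameter (standing in for the role the message gives to mmask) are hypothetical, and this is not the actual ext-dce code.

```c
#include <limits.h>
#include <stdint.h>

/* Width of the mask type in bits, computed rather than hardcoded as 64.  */
#define WORD_BITS ((unsigned) (sizeof (uint64_t) * CHAR_BIT))

/* Map the live bits of a left shift's result back to live bits of its
   input.  If COUNT is too large to shift by safely, give up and report
   every bit as live, which is conservative but always correct.  */
uint64_t
propagate_through_shift (uint64_t mask, unsigned count, uint64_t all_live)
{
  if (count >= WORD_BITS)
    return all_live;
  return mask >> count;
}
```

Deriving the limit from the type's size mirrors the message's replacement of hardcoded 63/64 values with HOST_BITS_PER_WIDE_INT based ones.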
-
Gerald Pfeifer authored
libstdc++-v3: * doc/xml/manual/prerequisites.xml: Remove note from the GCC 4.0.1 days. * doc/html/manual/setup.html: Regenerate.
-
Gerald Pfeifer authored
gcc: * doc/gm2.texi (Contributing): Tweak gm2 mailing list address.
-
Andrew Pinski authored
To start working with expressions that have more than one operand, converting over to use gimple_match_op is needed. The added side-effect here is that factor_out_conditional_operation can now support builtins/internal calls that have one operand, without any extra code added.

Note on the changed testcases: * pr87007-5.c: the test was testing for avoiding partial register stalls for the sqrt and making sure there is only one zero of the register before the branch; the phiopt would now merge the sqrt's, so disable phiopt.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog: * gimple-match-exports.cc (gimple_match_op::operands_occurs_in_abnormal_phi): New function. * gimple-match.h (gimple_match_op): Add operands_occurs_in_abnormal_phi. * tree-ssa-phiopt.cc (factor_out_conditional_operation): Use gimple_match_op instead of manually extracting from/creating the gimple. gcc/testsuite/ChangeLog: * gcc.target/i386/pr87007-5.c: Disable phi-opt.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
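At the source level, factoring a conditional operation means turning "apply the same one-operand operation in both arms" into "select the operand, then apply the operation once". The sketch below shows the two equivalent shapes using the GCC builtin `__builtin_popcount`. The function names are invented for this example, and whether the pass factors this particular builtin is an assumption, not something the message states.

```c
/* Before: the same one-operand builtin appears in both arms.  */
int
popcount_before (int cond, unsigned a, unsigned b)
{
  int r;
  if (cond)
    r = __builtin_popcount (a);
  else
    r = __builtin_popcount (b);
  return r;
}

/* After: the operand is selected first and the builtin is called once.  */
int
popcount_after (int cond, unsigned a, unsigned b)
{
  unsigned t = cond ? a : b;
  return __builtin_popcount (t);
}
```

The same merging is what the pr87007-5.c note refers to: with two sqrt calls folded into one, the test no longer exercises the code shape it was written for.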
-