- Sep 19, 2021
-
-
Matwey V. Kornilov authored
gcc/ * config/avr/avr-mcus.def: Add atmega324pb. * doc/avr-mmcu.texi: Corresponding changes.
-
Roger Sayle authored
This patch tackles PR middle-end/88173, where the order of operands in a comparison affects constant folding. As diagnosed by Jason Merrill, "match.pd handles these comparisons very differently". The history is that the middle end typically canonicalizes comparisons to place constants on the right, but when a comparison contains two constants we need to check/transform both constants, i.e. on both the left and the right. Hence the added lines below duplicate for @0 the same transform applied a few lines above for @1. Whilst preparing the testcase, I noticed that this transformation is incorrectly disabled with -fsignaling-nans even when both operands are known not to be signaling NaNs, so I've corrected that and added a second test case. Unfortunately, c-c++-common/pr57371-4.c then starts failing, as it doesn't distinguish QNaNs (which are quiet) from SNaNs (which signal), so this patch includes a minor tweak to the expected behaviour for QNaNs in that existing test. 2021-09-19 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR middle-end/88173 * match.pd (cmp @0 REAL_CST@1): When @0 is also REAL_CST, apply the same transformations as to @1. For comparisons against NaN, don't check HONOR_SNANS but confirm that neither operand is a signaling NaN. gcc/testsuite/ChangeLog PR middle-end/88173 * c-c++-common/pr57371-4.c: Tweak/correct test case for QNaNs. * g++.dg/pr88173-1.C: New test case. * g++.dg/pr88173-2.C: New test case.
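The property this fold depends on can be illustrated with a small sketch. Under IEEE 754, an ordered comparison involving a quiet NaN is false and `!=` is true whichever side the NaN is on, so a compiler may fold both spellings to the same constant. The function names below are invented for this illustration and are not taken from the new pr88173 test cases.

```cpp
#include <limits>

// A quiet NaN constant; comparisons against it do not depend on operand order.
constexpr double kQNaN = std::numeric_limits<double>::quiet_NaN();

// The same ordered comparison written with the NaN on the left and on the
// right. Both are false under IEEE 754.
bool nan_on_left_less()     { return kQNaN < 1.0; }
bool nan_on_right_greater() { return 1.0 > kQNaN; }

// Inequality is the one comparison that is true when a NaN is involved,
// again independent of which side the NaN appears on.
bool nan_on_left_not_equal()  { return kQNaN != 1.0; }
bool nan_on_right_not_equal() { return 1.0 != kQNaN; }
```

Whether a particular compiler version folds each of these at compile time is a separate question; the sketch only shows the results that any correct fold has to produce.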
-
Benjamin Peterson authored
gcc/ * attribs.c (make_unique_name): Delete. * attribs.h (make_unique_name): Delete.
-
Andrew Pinski authored
The simple fix is to extend the assert so that it also checks that sclass and dclass are both greater than or equal to NO_REGS. NO_REGS is documented as the first register class so it should have the value of 0. gcc/ChangeLog: * lra-constraints.c (check_and_process_move): Assert that dclass and sclass are greater than or equal to NO_REGS.
-
GCC Administrator authored
-
- Sep 18, 2021
-
-
Jakub Jelinek authored
This patch adds handling for unconstrained and reproducible modifiers on the order(concurrent) clause. For all static schedules (including auto and no schedule or dist_schedule clauses) I believe what we implement is reproducible, so the patch doesn't do much beyond recognizing those. Note, there is an OpenMP/spec issue that needs resolution on what should happen with the dynamic schedules (whether it should be an error to mix such clauses, or silently make it non-reproducible, and in which exact cases), so it might need some follow-up. Besides that, this patch allows the order(concurrent) clause on the distribute construct, which was also added in OpenMP 5.1, and finally checks the newly added restriction that at most one order clause can appear on a construct. Allowing the order clause on distribute has the side effect that order(concurrent) copyin(thrpriv) is no longer allowed on combined/composite constructs with distribute parallel for{, simd} in them; previously the order applied only to for/simd and so a threadprivate var could be seen in the construct, but now it also applies to distribute and so on the parallel we shouldn't refer to a threadprivate var. 2021-09-18 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.h (OMP_CLAUSE_ORDER_UNCONSTRAINED): Define. * tree-pretty-print.c (dump_omp_clause): Print unconstrained: for OMP_CLAUSE_ORDER_UNCONSTRAINED. gcc/c-family/ * c-omp.c (c_omp_split_clauses): Split order clause also to distribute construct. Copy over OMP_CLAUSE_ORDER_UNCONSTRAINED. gcc/c/ * c-parser.c (c_parser_omp_clause_order): Parse unconstrained and reproducible modifiers. (OMP_DISTRIBUTE_CLAUSE_MASK): Add order clause. gcc/cp/ * parser.c (cp_parser_omp_clause_order): Parse unconstrained and reproducible modifiers. (OMP_DISTRIBUTE_CLAUSE_MASK): Add order clause. gcc/testsuite/ * c-c++-common/gomp/order-1.c (f2): Add tests for distribute with order clause. (f3): Remove. * c-c++-common/gomp/order-2.c: Don't expect error for distribute with order clause. 
* c-c++-common/gomp/order-5.c: New test. * c-c++-common/gomp/order-6.c: New test. * c-c++-common/gomp/clause-dups-1.c (f1): Add tests for duplicated order clause. (f9): New function. * c-c++-common/gomp/clauses-1.c (baz, bar): Don't mix copyin and order(concurrent) clauses on the same composite construct combined with distribute, instead split it into two tests, one without copyin and one without order(concurrent). Add order(concurrent) clauses to {,{,target} teams} distribute. * g++.dg/gomp/attrs-1.C (baz, bar): Likewise. * g++.dg/gomp/attrs-2.C (baz, bar): Likewise.
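As a rough illustration of the syntax this commit starts to accept, the sketch below puts a modifier in front of `concurrent` in an order clause. The function name `scale` is made up for this example. It assumes a compiler that either parses the OpenMP 5.1 modifiers when built with `-fopenmp` (a GCC that includes this commit) or is run without OpenMP enabled, in which case the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cstddef>
#include <vector>

// Multiplies every element by k. The iterations are independent, so they
// may legally run in any order, which is what order(concurrent) asserts.
// "reproducible:" is one of the two modifiers added by the patch; the other
// spelling would be order(unconstrained: concurrent).
std::vector<int> scale(std::vector<int> v, int k) {
    const std::size_t n = v.size();
    #pragma omp parallel for order(reproducible: concurrent)
    for (std::size_t i = 0; i < n; ++i)
        v[i] *= k;
    return v;
}
```

Per the commit message, at most one order clause may appear on a construct, so writing the clause twice on the same directive is expected to be diagnosed.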
-
liuhongt authored
Besides conversion instructions, pass_rpad also handles scalar sqrt/rsqrt/rcp/round instructions, but r12-3614 was only meant to affect conversion instructions, so fix it. gcc/ChangeLog: * config/i386/i386-features.c (remove_partial_avx_dependency): Restrict TARGET_USE_VECTOR_FP_CONVERTS and TARGET_USE_VECTOR_CONVERTS to conversion instructions only.
-
Jakub Jelinek authored
OpenMP 5.1 allows default(private) or default(firstprivate) even in C/C++, but it behaves the same way as in Fortran only for variables not declared at namespace or file scope. For the namespace/file scope variables it instead behaves as default(none). 2021-09-18 Jakub Jelinek <jakub@redhat.com> gcc/ * gimplify.c (omp_default_clause): For C/C++ default({,first}private), if file/namespace scope variable doesn't have predetermined sharing, treat it as if there was default(none). gcc/c/ * c-parser.c (c_parser_omp_clause_default): Handle private and firstprivate arguments, adjust diagnostics on unknown argument. gcc/cp/ * parser.c (cp_parser_omp_clause_default): Handle private and firstprivate arguments, adjust diagnostics on unknown argument. * cp-gimplify.c (cxx_omp_finish_clause): Handle OMP_CLAUSE_PRIVATE. gcc/testsuite/ * c-c++-common/gomp/default-2.c: New test. * c-c++-common/gomp/default-3.c: New test. * g++.dg/gomp/default-1.C: New test. libgomp/ * testsuite/libgomp.c++/default-1.C: New test. * testsuite/libgomp.c-c++-common/default-1.c: New test. * libgomp.texi (OpenMP 5.1): Mark "private and firstprivate argument to default clause in C and C++" as implemented.
-
liuhongt authored
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfmaddXXXsh-1a.c: New test. * gcc.target/i386/avx512fp16-vfmaddXXXsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfnmaddXXXsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfnmaddXXXsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfnmsubXXXsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfnmsubXXXsh-1b.c: Ditto.
-
liuhongt authored
Add vfmadd[132,213,231]sh/vfnmadd[132,213,231]sh/ vfmsub[132,213,231]sh/vfnmsub[132,213,231]sh. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_fmadd_sh): New intrinsic. (_mm_mask_fmadd_sh): Likewise. (_mm_mask3_fmadd_sh): Likewise. (_mm_maskz_fmadd_sh): Likewise. (_mm_fmadd_round_sh): Likewise. (_mm_mask_fmadd_round_sh): Likewise. (_mm_mask3_fmadd_round_sh): Likewise. (_mm_maskz_fmadd_round_sh): Likewise. (_mm_fnmadd_sh): Likewise. (_mm_mask_fnmadd_sh): Likewise. (_mm_mask3_fnmadd_sh): Likewise. (_mm_maskz_fnmadd_sh): Likewise. (_mm_fnmadd_round_sh): Likewise. (_mm_mask_fnmadd_round_sh): Likewise. (_mm_mask3_fnmadd_round_sh): Likewise. (_mm_maskz_fnmadd_round_sh): Likewise. (_mm_fmsub_sh): Likewise. (_mm_mask_fmsub_sh): Likewise. (_mm_mask3_fmsub_sh): Likewise. (_mm_maskz_fmsub_sh): Likewise. (_mm_fmsub_round_sh): Likewise. (_mm_mask_fmsub_round_sh): Likewise. (_mm_mask3_fmsub_round_sh): Likewise. (_mm_maskz_fmsub_round_sh): Likewise. (_mm_fnmsub_sh): Likewise. (_mm_mask_fnmsub_sh): Likewise. (_mm_mask3_fnmsub_sh): Likewise. (_mm_maskz_fnmsub_sh): Likewise. (_mm_fnmsub_round_sh): Likewise. (_mm_mask_fnmsub_round_sh): Likewise. (_mm_mask3_fnmsub_round_sh): Likewise. (_mm_maskz_fnmsub_round_sh): Likewise. * config/i386/i386-builtin-types.def (V8HF_FTYPE_V8HF_V8HF_V8HF_UQI_INT): New builtin type. * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-expand.c: Handle new builtin type. * config/i386/sse.md (fmai_vmfmadd_<mode><round_name>): Adjust to support FP16. (fmai_vmfmsub_<mode><round_name>): Ditto. (fmai_vmfnmadd_<mode><round_name>): Ditto. (fmai_vmfnmsub_<mode><round_name>): Ditto. (*fmai_fmadd_<mode>): Ditto. (*fmai_fmsub_<mode>): Ditto. (*fmai_fnmadd_<mode><round_name>): Ditto. (*fmai_fnmsub_<mode><round_name>): Ditto. (avx512f_vmfmadd_<mode>_mask<round_name>): Ditto. (avx512f_vmfmadd_<mode>_mask3<round_name>): Ditto. (avx512f_vmfmadd_<mode>_maskz<round_expand_name>): Ditto. (avx512f_vmfmadd_<mode>_maskz_1<round_name>): Ditto. 
(*avx512f_vmfmsub_<mode>_mask<round_name>): Ditto. (avx512f_vmfmsub_<mode>_mask3<round_name>): Ditto. (*avx512f_vmfmsub_<mode>_maskz_1<round_name>): Ditto. (*avx512f_vmfnmsub_<mode>_mask<round_name>): Ditto. (*avx512f_vmfnmsub_<mode>_mask3<round_name>): Ditto. (*avx512f_vmfnmsub_<mode>_mask<round_name>): Ditto. (*avx512f_vmfnmadd_<mode>_mask<round_name>): Renamed to ... (avx512f_vmfnmadd_<mode>_mask<round_name>) ... this, and adjust to support FP16. (avx512f_vmfnmadd_<mode>_mask3<round_name>): Ditto. (avx512f_vmfnmadd_<mode>_maskz_1<round_name>): Ditto. (avx512f_vmfnmadd_<mode>_maskz<round_expand_name>): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add test for new builtins. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add test for new intrinsics. * gcc.target/i386/sse-22.c: Ditto.
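The 132/213/231 suffixes in the instruction names encode which operands are multiplied and which one is added, and the fnmadd/fmsub/fnmsub variants only change signs. The sketch below models that arithmetic on plain `float` with `std::fma`. It is not the FP16 intrinsics themselves: it ignores masking, rounding-mode arguments, and the untouched upper vector elements, and the function names are invented for this illustration.

```cpp
#include <cmath>

// Operand-order model: the digits say "multiply the first two listed
// operands, add the third". op1 is also the destination register.
float fmadd132(float op1, float op2, float op3) { return std::fma(op1, op3, op2); }
float fmadd213(float op1, float op2, float op3) { return std::fma(op2, op1, op3); }
float fmadd231(float op1, float op2, float op3) { return std::fma(op2, op3, op1); }

// Sign variants, written for a generic a*b (+/-) c.
float fmsub(float a, float b, float c)  { return std::fma(a, b, -c); }   //   a*b - c
float fnmadd(float a, float b, float c) { return std::fma(-a, b, c); }   // -(a*b) + c
float fnmsub(float a, float b, float c) { return std::fma(-a, b, -c); }  // -(a*b) - c
```

For example, with operands 2, 3 and 4 the three orderings give 2*4+3, 3*2+4 and 3*4+2 respectively, which is why the compiler can pick whichever form avoids an extra register move.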
-
H.J. Lu authored
gcc/ChangeLog: * config/i386/sse.md (avx512fmaskmodelower): Extend to support HF modes. (maskload<mode><avx512fmaskmodelower>): Ditto. (maskstore<mode><avx512fmaskmodelower>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-xorsign-1.c: New test.
-
liuhongt authored
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-neg-1a.c: New test. * gcc.target/i386/avx512fp16-neg-1b.c: Ditto. * gcc.target/i386/avx512fp16-scalar-bitwise-1a.c: Ditto. * gcc.target/i386/avx512fp16-scalar-bitwise-1b.c: Ditto. * gcc.target/i386/avx512fp16-vector-bitwise-1a.c: Ditto. * gcc.target/i386/avx512fp16-vector-bitwise-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-neg-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-neg-1b.c: Ditto.
-
H.J. Lu authored
1. FP16 vector xor/ior/and/andnot/abs/neg 2. FP16 scalar abs/neg/copysign/xorsign gcc/ChangeLog: * config/i386/i386-expand.c (ix86_expand_fp_absneg_operator): Handle HFmode. (ix86_expand_copysign): Ditto. (ix86_expand_xorsign): Ditto. * config/i386/i386.c (ix86_build_const_vector): Handle HF vector modes. (ix86_build_signbit_mask): Ditto. (ix86_can_change_mode_class): Ditto. * config/i386/i386.md (SSEMODEF): Add HFmode. (ssevecmodef): Ditto. (<code>hf2): New define_expand. (*<code>hf2_1): New define_insn_and_split. (copysign<mode>): Extend to support HFmode under AVX512FP16. (xorsign<mode>): Ditto. * config/i386/sse.md (VFB): New mode iterator. (VFB_128_256): Ditto. (VFB_512): Ditto. (sseintvecmode2): Support HF vector mode. (<code><mode>2): Use new mode iterator. (*<code><mode>2): Ditto. (copysign<mode>3): Ditto. (xorsign<mode>3): Ditto. (<code><mode>3<mask_name>): Ditto. (<code><mode>3<mask_name>): Ditto. (<sse>_andnot<mode>3<mask_name>): Adjust for HF vector mode. (<sse>_andnot<mode>3<mask_name>): Ditto. (*<code><mode>3<mask_name>): Ditto. (*<code><mode>3<mask_name>): Ditto.
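The scalar operations listed here are all manipulations of the sign bit, which is why they sit alongside the bitwise vector operations. A minimal bit-level sketch follows, working on the raw 16-bit encoding of an IEEE binary16 value. The helper names are hypothetical, and this models the semantics only, not the RTL patterns GCC emits.

```cpp
#include <cstdint>

// IEEE binary16: bit 15 is the sign, bits 14..0 are exponent and mantissa.
constexpr std::uint16_t kSignMask = 0x8000;
constexpr std::uint16_t kMagnitudeMask = 0x7FFF;

// neg flips the sign bit; abs clears it.
constexpr std::uint16_t half_neg(std::uint16_t x) {
    return static_cast<std::uint16_t>(x ^ kSignMask);
}
constexpr std::uint16_t half_abs(std::uint16_t x) {
    return static_cast<std::uint16_t>(x & kMagnitudeMask);
}

// copysign takes the magnitude of x and the sign of y.
constexpr std::uint16_t half_copysign(std::uint16_t x, std::uint16_t y) {
    return static_cast<std::uint16_t>((x & kMagnitudeMask) | (y & kSignMask));
}

// xorsign flips the sign of x when y is negative, i.e. x * copysign(1, y).
constexpr std::uint16_t half_xorsign(std::uint16_t x, std::uint16_t y) {
    return static_cast<std::uint16_t>(x ^ (y & kSignMask));
}
```

In this encoding 1.0 is 0x3C00 and -2.0 is 0xC000, so `half_copysign(0x3C00, 0xC000)` yields 0xBC00, the encoding of -1.0.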
-
liuhongt authored
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfmaddXXXph-1a.c: New test. * gcc.target/i386/avx512fp16-vfmaddXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfnmaddXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfnmaddXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfnmsubXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfnmsubXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmaddXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmaddXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmsubXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmsubXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vfnmaddXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vfnmaddXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vfnmsubXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vfnmsubXXXph-1b.c: Ditto.
-
liuhongt authored
Add vfmadd[132,213,231]ph/vfnmadd[132,213,231]ph/vfmsub[132,213,231]ph/ vfnmsub[132,213,231]ph. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm512_mask_fmadd_ph): New intrinsic. (_mm512_mask3_fmadd_ph): Likewise. (_mm512_maskz_fmadd_ph): Likewise. (_mm512_fmadd_round_ph): Likewise. (_mm512_mask_fmadd_round_ph): Likewise. (_mm512_mask3_fmadd_round_ph): Likewise. (_mm512_maskz_fmadd_round_ph): Likewise. (_mm512_fnmadd_ph): Likewise. (_mm512_mask_fnmadd_ph): Likewise. (_mm512_mask3_fnmadd_ph): Likewise. (_mm512_maskz_fnmadd_ph): Likewise. (_mm512_fnmadd_round_ph): Likewise. (_mm512_mask_fnmadd_round_ph): Likewise. (_mm512_mask3_fnmadd_round_ph): Likewise. (_mm512_maskz_fnmadd_round_ph): Likewise. (_mm512_fmsub_ph): Likewise. (_mm512_mask_fmsub_ph): Likewise. (_mm512_mask3_fmsub_ph): Likewise. (_mm512_maskz_fmsub_ph): Likewise. (_mm512_fmsub_round_ph): Likewise. (_mm512_mask_fmsub_round_ph): Likewise. (_mm512_mask3_fmsub_round_ph): Likewise. (_mm512_maskz_fmsub_round_ph): Likewise. (_mm512_fnmsub_ph): Likewise. (_mm512_mask_fnmsub_ph): Likewise. (_mm512_mask3_fnmsub_ph): Likewise. (_mm512_maskz_fnmsub_ph): Likewise. (_mm512_fnmsub_round_ph): Likewise. (_mm512_mask_fnmsub_round_ph): Likewise. (_mm512_mask3_fnmsub_round_ph): Likewise. (_mm512_maskz_fnmsub_round_ph): Likewise. * config/i386/avx512fp16vlintrin.h (_mm256_fmadd_ph): New intrinsic. (_mm256_mask_fmadd_ph): Likewise. (_mm256_mask3_fmadd_ph): Likewise. (_mm256_maskz_fmadd_ph): Likewise. (_mm_fmadd_ph): Likewise. (_mm_mask_fmadd_ph): Likewise. (_mm_mask3_fmadd_ph): Likewise. (_mm_maskz_fmadd_ph): Likewise. (_mm256_fnmadd_ph): Likewise. (_mm256_mask_fnmadd_ph): Likewise. (_mm256_mask3_fnmadd_ph): Likewise. (_mm256_maskz_fnmadd_ph): Likewise. (_mm_fnmadd_ph): Likewise. (_mm_mask_fnmadd_ph): Likewise. (_mm_mask3_fnmadd_ph): Likewise. (_mm_maskz_fnmadd_ph): Likewise. (_mm256_fmsub_ph): Likewise. (_mm256_mask_fmsub_ph): Likewise. (_mm256_mask3_fmsub_ph): Likewise. (_mm256_maskz_fmsub_ph): Likewise. 
(_mm_fmsub_ph): Likewise. (_mm_mask_fmsub_ph): Likewise. (_mm_mask3_fmsub_ph): Likewise. (_mm_maskz_fmsub_ph): Likewise. (_mm256_fnmsub_ph): Likewise. (_mm256_mask_fnmsub_ph): Likewise. (_mm256_mask3_fnmsub_ph): Likewise. (_mm256_maskz_fnmsub_ph): Likewise. (_mm_fnmsub_ph): Likewise. (_mm_mask_fnmsub_ph): Likewise. (_mm_mask3_fnmsub_ph): Likewise. (_mm_maskz_fnmsub_ph): Likewise. * config/i386/i386-builtin.def: Add corresponding new builtins. * config/i386/sse.md (<avx512>_fmadd_<mode>_maskz<round_expand_name>): Adjust to support HF vector modes. (<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name><round_name>): Ditto. (*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_1): Ditto. (*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_2): Ditto. (*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_3): Ditto. (<avx512>_fmadd_<mode>_mask<round_name>): Ditto. (<avx512>_fmadd_<mode>_mask3<round_name>): Ditto. (<avx512>_fmsub_<mode>_maskz<round_expand_name>): Ditto. (<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name><round_name>): Ditto. (*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_1): Ditto. (*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_2): Ditto. (*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_3): Ditto. (<avx512>_fmsub_<mode>_mask<round_name>): Ditto. (<avx512>_fmsub_<mode>_mask3<round_name>): Ditto. (<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name><round_name>): Ditto. (*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_1): Ditto. (*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_2): Ditto. (*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_3): Ditto. (<avx512>_fnmadd_<mode>_mask<round_name>): Ditto. (<avx512>_fnmadd_<mode>_mask3<round_name>): Ditto. (<avx512>_fnmsub_<mode>_maskz<round_expand_name>): Ditto. (<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name><round_name>): Ditto. (*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_1): Ditto. (*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_2): Ditto. 
(*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_3): Ditto. (<avx512>_fnmsub_<mode>_mask<round_name>): Ditto. (<avx512>_fnmsub_<mode>_mask3<round_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add test for new builtins. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add test for new intrinsics. * gcc.target/i386/sse-22.c: Ditto.
-
liuhongt authored
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfmaddsubXXXph-1a.c: New test. * gcc.target/i386/avx512fp16-vfmaddsubXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubaddXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubaddXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmaddsubXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmaddsubXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmsubaddXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vfmsubaddXXXph-1b.c: Ditto.
-
liuhongt authored
gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm512_fmaddsub_ph): New intrinsic. (_mm512_mask_fmaddsub_ph): Likewise. (_mm512_mask3_fmaddsub_ph): Likewise. (_mm512_maskz_fmaddsub_ph): Likewise. (_mm512_fmaddsub_round_ph): Likewise. (_mm512_mask_fmaddsub_round_ph): Likewise. (_mm512_mask3_fmaddsub_round_ph): Likewise. (_mm512_maskz_fmaddsub_round_ph): Likewise. (_mm512_mask_fmsubadd_ph): Likewise. (_mm512_mask3_fmsubadd_ph): Likewise. (_mm512_maskz_fmsubadd_ph): Likewise. (_mm512_fmsubadd_round_ph): Likewise. (_mm512_mask_fmsubadd_round_ph): Likewise. (_mm512_mask3_fmsubadd_round_ph): Likewise. (_mm512_maskz_fmsubadd_round_ph): Likewise. * config/i386/avx512fp16vlintrin.h (_mm256_fmaddsub_ph): New intrinsic. (_mm256_mask_fmaddsub_ph): Likewise. (_mm256_mask3_fmaddsub_ph): Likewise. (_mm256_maskz_fmaddsub_ph): Likewise. (_mm_fmaddsub_ph): Likewise. (_mm_mask_fmaddsub_ph): Likewise. (_mm_mask3_fmaddsub_ph): Likewise. (_mm_maskz_fmaddsub_ph): Likewise. (_mm256_fmsubadd_ph): Likewise. (_mm256_mask_fmsubadd_ph): Likewise. (_mm256_mask3_fmsubadd_ph): Likewise. (_mm256_maskz_fmsubadd_ph): Likewise. (_mm_fmsubadd_ph): Likewise. (_mm_mask_fmsubadd_ph): Likewise. (_mm_mask3_fmsubadd_ph): Likewise. (_mm_maskz_fmsubadd_ph): Likewise. * config/i386/i386-builtin.def: Add corresponding new builtins. * config/i386/sse.md (VFH_SF_AVX512VL): New mode iterator. * (<avx512>_fmsubadd_<mode>_maskz<round_expand_name>): New expander. * (<avx512>_fmaddsub_<mode>_maskz<round_expand_name>): Use VFH_SF_AVX512VL. * (<sd_mask_codefor>fma_fmaddsub_<mode><sd_maskz_name><round_name>): Ditto. * (<avx512>_fmaddsub_<mode>_mask<round_name>): Ditto. * (<avx512>_fmaddsub_<mode>_mask3<round_name>): Ditto. * (<sd_mask_codefor>fma_fmsubadd_<mode><sd_maskz_name><round_name>): Ditto. * (<avx512>_fmsubadd_<mode>_mask<round_name>): Ditto. * (<avx512>_fmsubadd_<mode>_mask3<round_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add test for new builtins. * gcc.target/i386/sse-13.c: Ditto. 
* gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add test for new intrinsics. * gcc.target/i386/sse-22.c: Ditto.
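The fmaddsub and fmsubadd families differ from plain FMA by alternating the sign of the addend across lanes. The sketch below models this per-lane behaviour on `float` arrays. It is a model of the arithmetic only, with invented function names, and it leaves out masking, rounding control, and the FP16 element type.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// fmaddsub: even-indexed lanes compute a*b - c, odd-indexed lanes a*b + c.
template <std::size_t N>
std::array<float, N> fmaddsub(const std::array<float, N>& a,
                              const std::array<float, N>& b,
                              const std::array<float, N>& c) {
    std::array<float, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = (i % 2 == 0) ? std::fma(a[i], b[i], -c[i])
                            : std::fma(a[i], b[i], c[i]);
    return r;
}

// fmsubadd: the opposite pattern, even lanes add and odd lanes subtract.
template <std::size_t N>
std::array<float, N> fmsubadd(const std::array<float, N>& a,
                              const std::array<float, N>& b,
                              const std::array<float, N>& c) {
    std::array<float, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = (i % 2 == 0) ? std::fma(a[i], b[i], c[i])
                            : std::fma(a[i], b[i], -c[i]);
    return r;
}
```

The alternating pattern is what makes these instructions useful for complex multiplication, where real and imaginary parts are interleaved in adjacent lanes.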
-
liuhongt authored
gcc/ChangeLog: PR target/87767 * config/i386/i386.c (ix86_print_operand): Handle V8HF/V16HF/V32HFmode. * config/i386/i386.h (VALID_BCST_MODE_P): Add HFmode. * config/i386/sse.md (avx512bcst): Remove. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-broadcast-1.c: New test. * gcc.target/i386/avx512fp16-broadcast-2.c: New test.
-
Jason Merrill authored
I've been working on the resolution of CWG1835 by P1787, which among many other things clarified that a name after -> or . is looked up first in the class of the object expression even if it's dependent. This patch does not make that change; this is a smaller change extracted from that work in progress to make the lookup in the object type work better in cases where unqualified lookup doesn't find anything. Basically, if we see "t.foo::" we know that looking up foo in t needs to find a type, so we build an implicit TYPENAME_TYPE for it. This also implements the change from P1787 to assume that a name followed by < in a type-only context names a template, since the less-than operator can't appear in a type context. This makes some of the lines in dtor11.C work. I introduce the predicate 'dependentish_scope_p' for the case where the current instantiation has dependent bases, so even though we can perform name lookup, we can't treat a lookup failure as conclusive. gcc/cp/ChangeLog: * cp-tree.h (dependentish_scope_p): Declare. * pt.c (dependentish_scope_p): New. * parser.c (cp_parser_lookup_name): Return a TYPENAME_TYPE for lookup of a type in a dependent object. (cp_parser_template_id): Handle TYPENAME_TYPE. (cp_parser_template_name): If we're looking for a type, a name followed by < names a template. gcc/testsuite/ChangeLog: * g++.dg/template/dtor5.C: Adjust expected error. * g++.dg/cpp23/lookup2.C: New test. * g++.dg/template/dtor11.C: New test.
-
Jason Merrill authored
gcc/cp/ChangeLog: * cp-tree.h: Fix typo in LANG_FLAG list.
-
GCC Administrator authored
-
- Sep 17, 2021
-
-
Martin Sebor authored
gcc/ChangeLog: * Makefile.in (OBJS): Add gimple-predicate-analysis.o. * tree-ssa-uninit.c (max_phi_args): Move to gimple-predicate-analysis. (MASK_SET_BIT, MASK_TEST_BIT, MASK_EMPTY): Same. (check_defs): Add comment. (can_skip_redundant_opnd): Update comment. (compute_uninit_opnds_pos): Adjust to namespace change. (find_pdom): Move to gimple-predicate-analysis.cc. (find_dom): Same. (struct uninit_undef_val_t): New. (is_non_loop_exit_postdominating): Move to gimple-predicate-analysis.cc. (find_control_equiv_block): Same. (MAX_NUM_CHAINS, MAX_CHAIN_LEN, MAX_POSTDOM_CHECK): Same. (MAX_SWITCH_CASES): Same. (compute_control_dep_chain): Same. (find_uninit_use): Use predicate analyzer. (struct pred_info): Move to gimple-predicate-analysis. (convert_control_dep_chain_into_preds): Same. (find_predicates): Same. (collect_phi_def_edges): Same. (warn_uninitialized_phi): Use predicate analyzer. (find_def_preds): Move to gimple-predicate-analysis. (dump_pred_info): Same. (dump_pred_chain): Same. (dump_predicates): Same. (destroy_predicate_vecs): Remove. (execute_late_warn_uninitialized): New. (get_cmp_code): Move to gimple-predicate-analysis. (is_value_included_in): Same. (value_sat_pred_p): Same. (find_matching_predicate_in_rest_chains): Same. (is_use_properly_guarded): Same. (prune_uninit_phi_opnds): Same. (find_var_cmp_const): Same. (use_pred_not_overlap_with_undef_path_pred): Same. (pred_equal_p): Same. (is_neq_relop_p): Same. (is_neq_zero_form_p): Same. (pred_expr_equal_p): Same. (is_pred_expr_subset_of): Same. (is_pred_chain_subset_of): Same. (is_included_in): Same. (is_superset_of): Same. (pred_neg_p): Same. (simplify_pred): Same. (simplify_preds_2): Same. (simplify_preds_3): Same. (simplify_preds_4): Same. (simplify_preds): Same. (push_pred): Same. (push_to_worklist): Same. (get_pred_info_from_cmp): Same. (is_degenerated_phi): Same. (normalize_one_pred_1): Same. (normalize_one_pred): Same. (normalize_one_pred_chain): Same. (normalize_preds): Same. 
(can_one_predicate_be_invalidated_p): Same. (can_chain_union_be_invalidated_p): Same. (uninit_uses_cannot_happen): Same. (pass_late_warn_uninitialized::execute): Define. * gimple-predicate-analysis.cc: New file. * gimple-predicate-analysis.h: New file.
-
Harald Anlauf authored
gcc/fortran/ChangeLog: PR fortran/102366 * trans-decl.c (gfc_finish_var_decl): Disable the warning message for variables moved from stack to static storage if they are declared in the main, but allow the move to happen. gcc/testsuite/ChangeLog: PR fortran/102366 * gfortran.dg/pr102366.f90: New test.
-
Jonathan Wakely authored
All path::iterator operations are non-throwing. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/fs_path.h (path::iterator): Add noexcept to all member functions and friend functions. (distance): Add noexcept. (advance): Add noexcept and inline. * include/experimental/bits/fs_path.h (path::iterator): Add noexcept to all member functions.
-
Jonathan Wakely authored
Also rename the test so it actually runs. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: PR libstdc++/102270 * include/std/tuple (_Tuple_impl): Add constexpr to constructor missed in previous patch. * testsuite/20_util/tuple/cons/102270.C: Moved to... * testsuite/20_util/tuple/cons/102270.cc: ...here. * testsuite/util/testsuite_allocator.h (SimpleAllocator): Add constexpr to constructor so it can be used for C++20 tests.
-
Julian Brown authored
This is an optimisation for middle-end worker-partitioning support (used to support multiple workers on AMD GCN). At present, barriers may be emitted in cases where they aren't needed and cannot be optimised away. This patch stops the extraneous barriers from being emitted in the first place. One exception to the above (where the barrier is still needed) is for predicated blocks of code that perform a write to gang-private shared memory from one worker. We must execute a barrier before other workers read that shared memory location. gcc/ * config/gcn/gcn.c (gimple.h): Include. (gcn_fork_join): Emit barrier for worker-level joins. * omp-oacc-neuter-broadcast.cc (find_local_vars_to_propagate): Add writes_gang_private bitmap parameter. Set bit for blocks containing gang-private variable writes. (worker_single_simple): Don't emit barrier after predicated block. (worker_single_copy): Don't emit barrier if we're not broadcasting anything and the block contains no gang-private writes. (neuter_worker_single): Don't predicate blocks that only contain NOPs or internal marker functions. Pass has_gang_private_write argument to worker_single_copy. (oacc_do_neutering): Add writes_gang_private bitmap handling.
-
Julian Brown authored
This patch implements an algorithm to lay out local data-share (LDS) space. It currently works for AMD GCN. At the moment, LDS is used for three things:

1. Gang-private variables
2. Reduction temporaries (accumulators)
3. Broadcasting for worker partitioning

After the patch is applied, (2) and (3) are placed at preallocated locations in LDS, and (1) continues to be handled by the backend (as it is at present prior to this patch being applied). LDS now looks like this:

  +--------------+ (gang-private size + 1024, = 1536)
  | free space   |
  |    ...       |
  | - - - - - - -|
  | worker bcast |
  +--------------+
  | reductions   |
  +--------------+ <<< -mgang-private-size=<number> (def. 512)
  | gang-private |
  |     vars     |
  +--------------+ (32)
  | low LDS vars |
  +--------------+ LDS base

So, gang-private space is fixed at a constant amount at compile time (which can be increased with a command-line switch if necessary for some given code). The layout algorithm takes out a slice of the remainder of usable space for reduction vars, and uses the rest for worker partitioning. The partitioning algorithm works as follows.

1. An "adjacency" set is built up for each basic block that might do a broadcast. This is calculated by starting at each such block, and doing a recursive DFS walk over successors to find the next block (or blocks) that *also* does a broadcast (dfs_broadcast_reachable_1).
2. The adjacency set is inverted to get adjacent predecessor blocks also.
3. Blocks that will perform a broadcast are sorted by size of that broadcast: the biggest blocks are handled first.
4. A splay tree structure is used to calculate the spans of LDS memory that are already allocated by the blocks adjacent to this one (merge_ranges{,_1}).
5. The current block's broadcast space is allocated from the first free span not allocated in the splay tree structure calculated above (first_fit_range). This seems to work quite nicely and efficiently with the splay tree structure.
6.
Continue with the next-biggest broadcast block until we're done. In this way, "adjacent" broadcasts will not use the same piece of LDS memory. PR96334 "openacc: Unshare reduction temporaries for GCN" got merged in: The GCN backend uses tree nodes like MEM((__lds TYPE *) <constant>) for reduction temporaries. Unlike e.g. var decls and SSA names, these nodes cannot be shared during gimplification, but are so in some circumstances. This is detected when appropriate --enable-checking options are used. This patch unshares such nodes when they are reused more than once. gcc/ * config/gcn/gcn-protos.h (gcn_goacc_create_worker_broadcast_record): Update prototype. * config/gcn/gcn-tree.c (gcn_goacc_get_worker_red_decl): Use preallocated block of LDS memory. Do not cache/share decls for reduction temporaries between invocations. (gcn_goacc_reduction_teardown): Unshare VAR on second use. (gcn_goacc_create_worker_broadcast_record): Add OFFSET parameter and return temporary LDS space at that offset. Return pointer in "sender" case. * config/gcn/gcn.c (acc_lds_size, gang_private_hwm, lds_allocs): New global vars. (ACC_LDS_SIZE): Define as acc_lds_size. (gcn_init_machine_status): Don't initialise lds_allocated, lds_allocs, reduc_decls fields of machine function struct. (gcn_option_override): Handle default size for gang-private variables and -mgang-private-size option. (gcn_expand_prologue): Use LDS_SIZE instead of LDS_SIZE-1 when initialising M0_REG. (gcn_shared_mem_layout): New function. (gcn_print_lds_decl): Update comment. Use global lds_allocs map and gang_private_hwm variable. (TARGET_GOACC_SHARED_MEM_LAYOUT): Define target hook. * config/gcn/gcn.h (machine_function): Remove lds_allocated, lds_allocs, reduc_decls. Add reduction_base, reduction_limit. * config/gcn/gcn.opt (gang_private_size_opt): New global. (mgang-private-size=): New option. * doc/tm.texi.in (TARGET_GOACC_SHARED_MEM_LAYOUT): Place documentation hook. * doc/tm.texi: Regenerate. 
* omp-oacc-neuter-broadcast.cc (targhooks.h, diagnostic-core.h): Add includes. (build_sender_ref): Handle sender_decl being pointer. (worker_single_copy): Add PLACEMENT and ISOLATE_BROADCASTS parameters. Pass placement argument to create_worker_broadcast_record hook invocations. Handle sender_decl being pointer and isolate_broadcasts inserting extra barriers. (blk_offset_map_t): Add typedef. (neuter_worker_single): Add BLK_OFFSET_MAP parameter. Pass preallocated range to worker_single_copy call. (dfs_broadcast_reachable_1): New function. (idx_decl_pair_t, used_range_vec_t): New typedefs. (sort_size_descending): New function. (addr_range): New class. (splay_tree_compare_addr_range, splay_tree_free_key) (first_fit_range, merge_ranges_1, merge_ranges): New functions. (execute_omp_oacc_neuter_broadcast): Rename to... (oacc_do_neutering): ... this. Add BOUNDS_LO, BOUNDS_HI parameters. Arrange layout of shared memory for broadcast operations. (execute_omp_oacc_neuter_broadcast): New function. (pass_omp_oacc_neuter_broadcast::gate): Remove num_workers==1 handling from here. Enable pass for all OpenACC routines in order to call shared memory-layout hook. * target.def (create_worker_broadcast_record): Add OFFSET parameter. (shared_mem_layout): New hook. libgomp/ * testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: Update.
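The numbered partitioning steps in this commit message can be sketched in a few dozen lines. The version below is a simplified stand-in, not the code in omp-oacc-neuter-broadcast.cc. It swaps the splay tree for a sorted `std::vector`, takes the step-1 reachability sets as an input instead of walking a CFG, and does not enforce an upper bound on the region. The names `Range`, `first_fit` and `layout_broadcasts` are invented for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Half-open byte range [lo, hi) inside the shared-memory window.
struct Range {
    std::size_t lo;
    std::size_t hi;
};

// Returns the lowest offset >= base at which `size` bytes fit without
// overlapping any range in `used`. No upper limit is checked here.
std::size_t first_fit(std::vector<Range> used, std::size_t base, std::size_t size) {
    std::sort(used.begin(), used.end(),
              [](const Range& a, const Range& b) { return a.lo < b.lo; });
    std::size_t pos = base;
    for (const Range& r : used) {
        if (pos + size <= r.lo) break;  // the gap in front of r is large enough
        pos = std::max(pos, r.hi);      // otherwise move past r
    }
    return pos;
}

// sizes[i]: bytes block i broadcasts (0 means it does not broadcast).
// next[i] : broadcasting blocks reachable from block i (step 1, assumed to
//           be precomputed by the caller).
// Returns the offset chosen for each block; blocks of size 0 keep `base`.
std::vector<std::size_t> layout_broadcasts(
        const std::vector<std::size_t>& sizes,
        const std::vector<std::vector<std::size_t>>& next,
        std::size_t base) {
    const std::size_t n = sizes.size();

    // Step 2: make adjacency symmetric by adding the inverse edges.
    std::vector<std::vector<std::size_t>> adj(n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j : next[i]) {
            adj[i].push_back(j);
            adj[j].push_back(i);
        }

    // Step 3: visit blocks from the largest broadcast to the smallest.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return sizes[a] > sizes[b]; });

    std::vector<std::size_t> offset(n, base);
    std::vector<bool> placed(n, false);
    for (std::size_t b : order) {
        if (sizes[b] == 0) continue;
        // Step 4: collect the spans already taken by adjacent blocks.
        std::vector<Range> used;
        for (std::size_t j : adj[b])
            if (placed[j]) used.push_back({offset[j], offset[j] + sizes[j]});
        // Step 5: take the first free span that fits.
        offset[b] = first_fit(used, base, sizes[b]);
        placed[b] = true;
    }
    return offset;
}
```

For a chain of three blocks broadcasting 64, 32 and 16 bytes with a base of 512, the first and third blocks are not adjacent, so both can start at offset 512 while the middle block is pushed to 576.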
-
Julian Brown authored
This patch turns off the middle-end worker-partitioning support if the number of workers for an outlined offload function is one. In that case, we do not need to perform the broadcasting/neutering code transformation. gcc/ * omp-oacc-neuter-broadcast.cc (pass_omp_oacc_neuter_broadcast::gate): Disable if num_workers is 1. (execute_omp_oacc_neuter_broadcast): Adjust. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
-
Julian Brown authored
libgomp/ * testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: New test.
-
Andrew MacLeod authored
This provides a path_oracle class which can optionally be used in conjunction with another oracle to track relations on a path as it is walked. * value-relation.cc (class equiv_chain): Move to header file. (path_oracle::path_oracle): New. (path_oracle::~path_oracle): New. (path_oracle::register_relation): New. (path_oracle::query_relation): New. (path_oracle::reset_path): New. (path_oracle::dump): New. * value-relation.h (class equiv_chain): Move to here. (class path_oracle): New.
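As a rough illustration of layering a path-local oracle on top of another oracle, here is a toy sketch. It assumes string names instead of SSA names and a tiny `Relation` enum; `ToyOracle`, `TableOracle` and `ToyPathOracle` are hypothetical and do not reproduce the signatures in value-relation.h. Letting path-local facts take precedence over the root oracle is also an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

enum class Relation { None, Eq, Lt, Gt };

// Minimal oracle interface (hypothetical; not GCC's relation_oracle API).
struct ToyOracle {
  virtual ~ToyOracle() = default;
  virtual Relation query(const std::string &a, const std::string &b) const = 0;
};

// A fixed table of known relations, standing in for a "root" oracle.
struct TableOracle : ToyOracle {
  std::map<std::pair<std::string, std::string>, Relation> known;
  Relation query(const std::string &a, const std::string &b) const override {
    auto it = known.find(std::make_pair(a, b));
    return it == known.end() ? Relation::None : it->second;
  }
};

// Tracks relations discovered while walking one path and defers to an
// optional root oracle for anything the path itself does not know.
class ToyPathOracle : public ToyOracle {
 public:
  explicit ToyPathOracle(const ToyOracle *root = nullptr) : root_(root) {}

  // Record "a R b" for the current path, plus the mirrored "b R' a".
  void register_relation(const std::string &a, const std::string &b,
                         Relation r) {
    path_[std::make_pair(a, b)] = r;
    path_[std::make_pair(b, a)] = swapped(r);
  }

  Relation query(const std::string &a, const std::string &b) const override {
    auto it = path_.find(std::make_pair(a, b));
    if (it != path_.end()) return it->second;  // path-local fact wins here
    return root_ ? root_->query(a, b) : Relation::None;
  }

  // Forget everything learned on the path; the root oracle is untouched.
  void reset_path() { path_.clear(); }

 private:
  static Relation swapped(Relation r) {
    if (r == Relation::Lt) return Relation::Gt;
    if (r == Relation::Gt) return Relation::Lt;
    return r;  // Eq and None are symmetric
  }

  const ToyOracle *root_;
  std::map<std::pair<std::string, std::string>, Relation> path_;
};
```

The point of the split is that `reset_path` can be called each time a new path is walked without disturbing whatever the underlying oracle already knows.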
-
Andrew MacLeod authored
Standardize equiv_oracle API onto the new relation_oracle virtual base, and then have dom_oracle inherit from that. equiv_set always returns an equivalency set now, never NULL. EQ_EXPR requires symmetry now. Each SSA name must be in the other equiv set. Shuffle some routines around, simplify. * gimple-range-cache.cc (ranger_cache::ranger_cache): Create a DOM based oracle. * gimple-range-fold.cc (fur_depend::register_relation): Use register_stmt/edge routines. * value-relation.cc (equiv_chain::find): Relocate from equiv_oracle. (equiv_oracle::equiv_oracle): Create self equivalence cache. (equiv_oracle::~equiv_oracle): Release same. (equiv_oracle::equiv_set): Return entry from self equiv cache if there are no equivalences. (equiv_oracle::find_equiv_block): Move list find to equiv_chain. (equiv_oracle::register_relation): Rename from register_equiv. (relation_chain_head::find_relation): Relocate from dom_oracle. (relation_oracle::register_stmt): New. (relation_oracle::register_edge): New. (dom_oracle::*): Rename from relation_oracle. (dom_oracle::register_relation): Adjust to call equiv_oracle. (dom_oracle::set_one_relation): Split from register_relation. (dom_oracle::register_transitives): Consolidate 2 methods. (dom_oracle::find_relation_block): Move core to relation_chain. (dom_oracle::query_relation): Rename from find_relation_dom and adjust. * value-relation.h (class relation_oracle): New pure virtual base. (class equiv_oracle): Inherit from relation_oracle and adjust. (class dom_oracle): Rename from old relation_oracle and adjust.
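The two behavioural points above (an equivalence set is always returned, and equality is symmetric) can be illustrated with a toy model. `ToyEquivOracle` is a hypothetical class over string names, written only for this illustration; it does not model dominance or GCC's actual data structures.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

class ToyEquivOracle {
 public:
  // Record a == b. Every member of the merged set is pointed at the same
  // contents, so each name appears in the other's equivalence set.
  void register_equiv(const std::string &a, const std::string &b) {
    std::set<std::string> merged = equiv_set(a);
    const std::set<std::string> other = equiv_set(b);
    merged.insert(other.begin(), other.end());
    for (const std::string &name : merged) sets_[name] = merged;
  }

  // Never "null": a name with no recorded equivalences maps to {name}.
  std::set<std::string> equiv_set(const std::string &name) const {
    auto it = sets_.find(name);
    if (it != sets_.end()) return it->second;
    return std::set<std::string>{name};
  }

 private:
  std::map<std::string, std::set<std::string>> sets_;
};
```

Callers of such an interface can iterate over the result unconditionally, which is the practical benefit of never returning NULL.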
-
qing zhao authored
This set of tests failed on many different combinations of -march and -mtune. Some of them failed with -fstack-protector-all or -mno-sse, and the pattern matches also differ between lp64 and ia32. The reason for these failures is that the RTL or assembly-level pattern matches are only valid for -march=x86-64 -mtune=generic. We restrict the testing to -march=x86-64 and -mtune=generic. Also add -fno-stack-protector or -msse for some of the test cases. gcc/testsuite/ChangeLog: 2021-09-17 qing zhao <qing.zhao@oracle.com> * gcc.target/i386/auto-init-1.c: Restrict the testing only for -march=x86-64 and -mtune=generic. Add -fno-stack-protector. * gcc.target/i386/auto-init-2.c: Restrict the testing only for -march=x86-64 and -mtune=generic -msse. * gcc.target/i386/auto-init-3.c: Likewise. * gcc.target/i386/auto-init-4.c: Likewise. * gcc.target/i386/auto-init-5.c: Different pattern match for lp64 and ia32. * gcc.target/i386/auto-init-6.c: Restrict the testing only for -march=x86-64 and -mtune=generic -msse. Add -fno-stack-protector. * gcc.target/i386/auto-init-7.c: Likewise. * gcc.target/i386/auto-init-8.c: Restrict the testing only for -march=x86-64 and -mtune=generic -msse. * gcc.target/i386/auto-init-padding-1.c: Likewise. * gcc.target/i386/auto-init-padding-10.c: Likewise. * gcc.target/i386/auto-init-padding-11.c: Likewise. * gcc.target/i386/auto-init-padding-12.c: Likewise. * gcc.target/i386/auto-init-padding-2.c: Likewise. * gcc.target/i386/auto-init-padding-3.c: Restrict the testing only for -march=x86-64. Different pattern match for lp64 and ia32. * gcc.target/i386/auto-init-padding-4.c: Restrict the testing only for -march=x86-64 and -mtune=generic -msse. * gcc.target/i386/auto-init-padding-5.c: Likewise. * gcc.target/i386/auto-init-padding-6.c: Likewise. * gcc.target/i386/auto-init-padding-7.c: Restrict the testing only for -march=x86-64 and -mtune=generic -msse. Add -fno-stack-protector. * gcc.target/i386/auto-init-padding-8.c: Likewise. 
* gcc.target/i386/auto-init-padding-9.c: Restrict the testing only for -march=x86-64. Different pattern match for lp64 and ia32.
-
Martin Sebor authored
Resolves: PR middle-end/102200 - ICE on a min of a decl and pointer in a loop gcc/ChangeLog: PR middle-end/102200 * pointer-query.cc (access_ref::inform_access): Handle MIN/MAX_EXPR. (handle_min_max_size): Change argument. Store original SSA_NAME for operands to potentially distinct (sub)objects. (compute_objsize_r): Adjust call to the above. gcc/testsuite/ChangeLog: PR middle-end/102200 * gcc.dg/Wstringop-overflow-62.c: Adjust text of an expected note. * gcc.dg/Warray-bounds-89.c: New test. * gcc.dg/Wstringop-overflow-74.c: New test. * gcc.dg/Wstringop-overflow-75.c: New test. * gcc.dg/Wstringop-overflow-76.c: New test.
-
Bill Schmidt authored
This patch just duplicates a couple of functions and adjusts them to use the new builtin names. There's no logical change otherwise. 2021-09-17 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000.c (rs6000-builtins.h): New include. (rs6000_new_builtin_vectorized_function): New function. (rs6000_new_builtin_md_vectorized_function): Likewise. (rs6000_builtin_vectorized_function): Call rs6000_new_builtin_vectorized_function. (rs6000_builtin_md_vectorized_function): Call rs6000_new_builtin_md_vectorized_function.
-
Bill Schmidt authored
Peter Bergner recently added two new builtins __builtin_vsx_lxvp and __builtin_vsx_stxvp. These happened to break a pattern in MMA builtins that I had been using to automate gimple folding of MMA builtins. Previously, every MMA function that could be folded had an associated internal function that it was folded into. The LXVP/STXVP builtins are just folded directly into memory operations. Instead of relying on this pattern, this patch adds a new attribute to builtins called "mmaint," which is set for all MMA builtins that have an associated internal builtin. The naming convention that adds _INTERNAL to the builtin index name remains. The rest of the patch is just duplicating Peter's patch, using the new builtin infrastructure. 2021-09-17 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-builtin-new.def (ASSEMBLE_ACC): Add mmaint flag. (ASSEMBLE_PAIR): Likewise. (BUILD_ACC): Likewise. (DISASSEMBLE_ACC): Likewise. (DISASSEMBLE_PAIR): Likewise. (PMXVBF16GER2): Likewise. (PMXVBF16GER2NN): Likewise. (PMXVBF16GER2NP): Likewise. (PMXVBF16GER2PN): Likewise. (PMXVBF16GER2PP): Likewise. (PMXVF16GER2): Likewise. (PMXVF16GER2NN): Likewise. (PMXVF16GER2NP): Likewise. (PMXVF16GER2PN): Likewise. (PMXVF16GER2PP): Likewise. (PMXVF32GER): Likewise. (PMXVF32GERNN): Likewise. (PMXVF32GERNP): Likewise. (PMXVF32GERPN): Likewise. (PMXVF32GERPP): Likewise. (PMXVF64GER): Likewise. (PMXVF64GERNN): Likewise. (PMXVF64GERNP): Likewise. (PMXVF64GERPN): Likewise. (PMXVF64GERPP): Likewise. (PMXVI16GER2): Likewise. (PMXVI16GER2PP): Likewise. (PMXVI16GER2S): Likewise. (PMXVI16GER2SPP): Likewise. (PMXVI4GER8): Likewise. (PMXVI4GER8PP): Likewise. (PMXVI8GER4): Likewise. (PMXVI8GER4PP): Likewise. (PMXVI8GER4SPP): Likewise. (XVBF16GER2): Likewise. (XVBF16GER2NN): Likewise. (XVBF16GER2NP): Likewise. (XVBF16GER2PN): Likewise. (XVBF16GER2PP): Likewise. (XVF16GER2): Likewise. (XVF16GER2NN): Likewise. (XVF16GER2NP): Likewise. (XVF16GER2PN): Likewise. (XVF16GER2PP): Likewise. 
(XVF32GER): Likewise. (XVF32GERNN): Likewise. (XVF32GERNP): Likewise. (XVF32GERPN): Likewise. (XVF32GERPP): Likewise. (XVF64GER): Likewise. (XVF64GERNN): Likewise. (XVF64GERNP): Likewise. (XVF64GERPN): Likewise. (XVF64GERPP): Likewise. (XVI16GER2): Likewise. (XVI16GER2PP): Likewise. (XVI16GER2S): Likewise. (XVI16GER2SPP): Likewise. (XVI4GER8): Likewise. (XVI4GER8PP): Likewise. (XVI8GER4): Likewise. (XVI8GER4PP): Likewise. (XVI8GER4SPP): Likewise. (XXMFACC): Likewise. (XXMTACC): Likewise. (XXSETACCZ): Likewise. (ASSEMBLE_PAIR_V): Likewise. (BUILD_PAIR): Likewise. (DISASSEMBLE_PAIR_V): Likewise. (LXVP): New. (STXVP): New. * config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_mma_builtin): Handle RS6000_BIF_LXVP and RS6000_BIF_STXVP. * config/rs6000/rs6000-gen-builtins.c (attrinfo): Add ismmaint. (parse_bif_attrs): Handle ismmaint. (write_decls): Add bif_mmaint_bit and bif_is_mmaint. (write_bif_static_init): Handle ismmaint.
-
Bill Schmidt authored
This is another patch that looks bigger than it really is. Because we have a new namespace for the builtins, allowing us to have both the old and new builtin infrastructure supported at once, we need versions of these functions that use the new builtin namespace. Otherwise the code is unchanged. 2021-09-17 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin): New forward decl. (rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin. (rs6000_new_builtin_valid_without_lhs): New function. (rs6000_gimple_fold_new_mma_builtin): Likewise. (rs6000_gimple_fold_new_builtin): Likewise.
-
Thomas Schwinge authored
Thus plugging potential memory leaks if these have non-trivial constructors/destructors. See <https://stackoverflow.com/questions/6730403/how-to-delete-object-constructed-via-placement-new-operator> and others. As one example, compilation of 'g++.dg/warn/Wmismatched-tags.C' per 'valgrind --leak-check=full' improves as follows: [...] - -104 bytes in 1 blocks are definitely lost in loss record 399 of 519 - at 0x483DFAF: realloc (vg_replace_malloc.c:836) - by 0x223B62C: xrealloc (xmalloc.c:179) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858) - by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967) - by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967) - by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941) - by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819) - by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227) - by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942) - by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517) - by 0xA43C71: cp_parser_parameter_declaration(cp_parser*, int, bool, bool*) (parser.c:24242) - -168 bytes in 3 blocks are definitely lost in loss record 422 of 519 - at 0x483DFAF: realloc (vg_replace_malloc.c:836) - by 0x223B62C: xrealloc (xmalloc.c:179) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - 
by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858) - by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967) - by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967) - by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941) - by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819) - by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227) - by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942) - by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517) - by 0xA53385: cp_parser_single_declaration(cp_parser*, vec<deferred_access_check, va_gc, vl_embed>*, bool, bool, bool*) (parser.c:31072) - -488 bytes in 7 blocks are definitely lost in loss record 449 of 519 - at 0x483DFAF: realloc (vg_replace_malloc.c:836) - by 0x223B62C: xrealloc (xmalloc.c:179) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858) - by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967) - by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967) - by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941) - by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819) - by 0xA3AD12: 
cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227) - by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942) - by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517) - by 0xA49508: cp_parser_member_declaration(cp_parser*) (parser.c:26440) - -728 bytes in 7 blocks are definitely lost in loss record 455 of 519 - at 0x483B7F3: malloc (vg_replace_malloc.c:309) - by 0x223B63F: xrealloc (xmalloc.c:177) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858) - by 0xA57508: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32980) - by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941) - by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819) - by 0xA48BC6: cp_parser_class_head(cp_parser*, bool*) (parser.c:26090) - by 0xA4674B: cp_parser_class_specifier_1(cp_parser*) (parser.c:25302) - by 0xA47D76: cp_parser_class_specifier(cp_parser*) (parser.c:25680) - by 0xA37E27: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18912) - by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517) - -832 bytes in 8 blocks are definitely lost in loss record 458 of 519 - at 0x483B7F3: malloc (vg_replace_malloc.c:309) - by 0x223B63F: xrealloc (xmalloc.c:177) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - by 0xA901ED: bool vec_safe_reserve<class_decl_loc_t::class_key_loc_t, 
va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:697) - by 0xA8F161: void vec_alloc<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int) (vec.h:718) - by 0xA8D18D: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>::copy() const (vec.h:979) - by 0xA8B0C3: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::copy() const (vec.h:1824) - by 0xA896B1: class_decl_loc_t::operator=(class_decl_loc_t const&) (parser.c:32697) - by 0xA571FD: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32899) - by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819) - by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227) - by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942) - -1,144 bytes in 11 blocks are definitely lost in loss record 466 of 519 - at 0x483B7F3: malloc (vg_replace_malloc.c:309) - by 0x223B63F: xrealloc (xmalloc.c:177) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - by 0xA901ED: bool vec_safe_reserve<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:697) - by 0xA8F161: void vec_alloc<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int) (vec.h:718) - by 0xA8D18D: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>::copy() const (vec.h:979) - by 0xA8B0C3: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::copy() const (vec.h:1824) - by 0xA896B1: class_decl_loc_t::operator=(class_decl_loc_t const&) (parser.c:32697) - by 0xA571FD: class_decl_loc_t::add(cp_parser*, unsigned int, 
tag_types, tree_node*, bool, bool) (parser.c:32899) - by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819) - by 0xA48BC6: cp_parser_class_head(cp_parser*, bool*) (parser.c:26090) - by 0xA4674B: cp_parser_class_specifier_1(cp_parser*) (parser.c:25302) - -1,376 bytes in 10 blocks are definitely lost in loss record 467 of 519 - at 0x483DFAF: realloc (vg_replace_malloc.c:836) - by 0x223B62C: xrealloc (xmalloc.c:179) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - by 0xA8B373: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1858) - by 0xA8B277: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::safe_push(class_decl_loc_t::class_key_loc_t const&) (vec.h:1967) - by 0xA57481: class_decl_loc_t::add_or_diag_mismatched_tag(tree_node*, tag_types, bool, bool) (parser.c:32967) - by 0xA573E1: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32941) - by 0xA56C52: cp_parser_check_class_key(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32819) - by 0xA3AD12: cp_parser_elaborated_type_specifier(cp_parser*, bool, bool) (parser.c:20227) - by 0xA37EF2: cp_parser_type_specifier(cp_parser*, int, cp_decl_specifier_seq*, bool, int*, bool*) (parser.c:18942) - by 0xA31CDD: cp_parser_decl_specifier_seq(cp_parser*, int, cp_decl_specifier_seq*, int*) (parser.c:15517) - by 0xA301E0: cp_parser_simple_declaration(cp_parser*, bool, tree_node**) (parser.c:14772) - -3,552 bytes in 33 blocks are definitely lost in loss record 483 of 519 - at 0x483B7F3: malloc (vg_replace_malloc.c:309) - by 0x223B63F: xrealloc (xmalloc.c:177) - by 0xA8D848: void va_heap::reserve<class_decl_loc_t::class_key_loc_t>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:290) - by 0xA901ED: bool 
vec_safe_reserve<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:697) - by 0xA8F161: void vec_alloc<class_decl_loc_t::class_key_loc_t, va_heap>(vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>*&, unsigned int) (vec.h:718) - by 0xA8D18D: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_embed>::copy() const (vec.h:979) - by 0xA8B0C3: vec<class_decl_loc_t::class_key_loc_t, va_heap, vl_ptr>::copy() const (vec.h:1824) - by 0xA8964A: class_decl_loc_t::class_decl_loc_t(class_decl_loc_t const&) (parser.c:32689) - by 0xA8F515: hash_table<hash_map<tree_decl_hash, class_decl_loc_t, simple_hashmap_traits<default_hash_traits<tree_decl_hash>, class_decl_loc_t> >::hash_entry, false, xcallocator>::expand() (hash-table.h:839) - by 0xA8D4B3: hash_table<hash_map<tree_decl_hash, class_decl_loc_t, simple_hashmap_traits<default_hash_traits<tree_decl_hash>, class_decl_loc_t> >::hash_entry, false, xcallocator>::find_slot_with_hash(tree_node* const&, unsigned int, insert_option) (hash-table.h:1008) - by 0xA8B1DC: hash_map<tree_decl_hash, class_decl_loc_t, simple_hashmap_traits<default_hash_traits<tree_decl_hash>, class_decl_loc_t> >::get_or_insert(tree_node* const&, bool*) (hash-map.h:200) - by 0xA57128: class_decl_loc_t::add(cp_parser*, unsigned int, tag_types, tree_node*, bool, bool) (parser.c:32888) [...] LEAK SUMMARY: - definitely lost: 8,440 bytes in 81 blocks + definitely lost: 48 bytes in 1 blocks indirectly lost: 12,529 bytes in 329 blocks possibly lost: 0 bytes in 0 blocks still reachable: 1,644,376 bytes in 768 blocks gcc/ * hash-table.h (hash_table<Descriptor, Lazy, Allocator>::expand): Destruct stale Value objects. * hash-map-tests.c (test_map_of_type_with_ctor_and_dtor_expand): Update.
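The underlying rule is that an object created with placement new must be destroyed with an explicit destructor call before its raw storage is released. The following is a minimal sketch of an "expand" step that respects that rule; `Tracked` and `expand` are hypothetical names for illustration and this is not the code in hash-table.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counts live objects so a missed destructor call shows up as a leak.
struct Tracked {
  inline static int live = 0;
  int value;
  explicit Tracked(int v) : value(v) { ++live; }
  Tracked(const Tracked &o) : value(o.value) { ++live; }
  ~Tracked() { --live; }
};

// Move N constructed elements from malloc'ed OLD_SLOTS into a fresh,
// larger malloc'ed array. Returns the new array; OLD_SLOTS is freed.
Tracked *expand(Tracked *old_slots, std::size_t n, std::size_t new_capacity) {
  assert(new_capacity > 0 && new_capacity >= n);
  void *raw = std::malloc(new_capacity * sizeof(Tracked));
  if (!raw) std::abort();
  Tracked *fresh = static_cast<Tracked *>(raw);
  for (std::size_t i = 0; i < n; ++i) {
    new (&fresh[i]) Tracked(old_slots[i]);  // copy-construct in place
    old_slots[i].~Tracked();                // destruct the stale original
  }
  std::free(old_slots);  // releases storage only; runs no destructors
  return fresh;
}
```

If the explicit `~Tracked()` call is dropped, `Tracked::live` stays too high after `expand`, which corresponds to the kind of "definitely lost" blocks that valgrind reports when the stale objects own heap memory.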
-
Sandra Loosemore authored
The GNU Fortran manual documents that the c_float128 kind corresponds to __float128, but in fact the implementation uses float128_type_node, which is _Float128. Both refer to the 128-bit IEEE/ISO encoding, but some targets including aarch64 only define _Float128 and not __float128, and do not provide quadmath.h. This caused errors in some test cases referring to __float128. This patch changes the documentation (including code comments) and test cases to use _Float128 to match the implementation. 2021-09-16 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * intrinsic.texi (ISO_C_BINDING): Change C_FLOAT128 to correspond to _Float128 rather than __float128. * iso-c-binding.def (c_float128): Update comments. * trans-intrinsic.c (gfc_builtin_decl_for_float_kind): Likewise. (build_round_expr): Likewise. (gfc_build_intrinsic_lib_fndcecls): Likewise. * trans-types.h (gfc_real16_is_float128): Likewise. gcc/testsuite/ * gfortran.dg/PR100914.c: Do not include quadmath.h. Use _Float128 _Complex instead of __complex128. * gfortran.dg/PR100914.f90: Add -Wno-pedantic to suppress error about use of _Float128. * gfortran.dg/c-interop/typecodes-array-float128-c.c: Use _Float128 instead of __float128. * gfortran.dg/c-interop/typecodes-sanity-c.c: Likewise. * gfortran.dg/c-interop/typecodes-scalar-float128-c.c: Likewise. * lib/target-supports.exp (check_effective_target_fortran_real_c_float128): Update comments. libgfortran/ * ISO_Fortran_binding.h: Update comments. * runtime/ISO_Fortran_binding.c: Likewise.
-
Roger Sayle authored
Respecting Jakub's suggestion that it may be better to warn-on-valid for "if (x << 0)" as the author might have intended "if (x < 0)" [which will also warn when x is _Bool], the simplest way to resolve this regression is to disable the recently added fold transformation for shifts by zero; these will be optimized later (elsewhere). Guarding against integer_zerop is the simplest of three alternatives; the second being to only apply this transformation to GIMPLE and not GENERIC, and the third (potentially) being to explicitly handle shifts by zero here, with an (if cond then else), optimizing the expression to a convert, but awkwardly duplicating a more general transformation earlier in match.pd's shift simplifications. 2021-09-17 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR c/102245 * match.pd (shift optimizations): Disable recent sign-changing optimization for shifts by zero, these will be folded later. gcc/testsuite/ChangeLog PR c/102245 * gcc.dg/Wint-in-bool-context-4.c: New test case.
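To make the "if (x << 0)" versus "if (x < 0)" concern concrete, here is a small sketch contrasting the two conditions. The function names are hypothetical, and the conditions are written as explicit comparisons inside helper functions rather than directly in an `if`, so the sketch itself does not rely on any particular diagnostic being issued.

```cpp
#include <cassert>

// The condition as written: shifting by zero leaves the value unchanged,
// so this is just "x != 0". Only call it with non-negative x, since
// left-shifting a negative value is undefined before C++20.
bool shift_by_zero_nonzero(int x) { return (x << 0) != 0; }

// The condition the author may have meant to type.
bool less_than_zero(int x) { return x < 0; }
```

For a positive `x` the two disagree (the first is true, the second false), which is why a warning on the valid but suspicious shift form can be more helpful than silently folding the shift away.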
-
Bill Schmidt authored
I over-restricted use of __builtin_mffsl, since I was unaware that it automatically uses mffs when mffsl is not available. Paul Clarke pointed this out in discussion of his SSE 4.1 compatibility patches. 2021-08-31 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-builtin-new.def (__builtin_mffsl): Move from [power9] to [always].
-