  1. Nov 22, 2024
    • OpenMP: Add 'interop' clause to 'dispatch' for C/C++ · f34422e0
      Tobias Burnus authored
      For now, this always fails with an error because no suitable 'append_args'
      has been specified; 'append_args' is not yet implemented.
      
      gcc/c-family/ChangeLog:
      
      	* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_INTEROP.
      
      gcc/c/ChangeLog:
      
      	* c-parser.cc (c_parser_omp_clause_interop): New.
      	(c_parser_omp_clause_name, c_parser_omp_all_clauses,
      	c_parser_omp_dispatch_body): Handle 'interop' clause.
      	* c-typeck.cc (c_finish_omp_clauses): Likewise.
      
      gcc/cp/ChangeLog:
      
      	* parser.cc (cp_parser_omp_clause_name, cp_parser_omp_all_clauses,
      	cp_parser_omp_dispatch_body): Handle 'interop' clause.
      	* pt.cc (tsubst_omp_clauses): Likewise.
      	* semantics.cc (finish_omp_clauses): Likewise.
      
      gcc/ChangeLog:
      
      	* gimplify.cc (gimplify_call_expr): Add initial support for
      	dispatch's 'interop' clause.
      	(gimplify_scan_omp_clauses): Handle interop clause.
      	* tree-pretty-print.cc (dump_omp_clause): Likewise.
      	* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_INTEROP.
      	* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add interop.
      
      gcc/testsuite/ChangeLog:
      
      	* c-c++-common/gomp/dispatch-11.c: New test.
      	* c-c++-common/gomp/dispatch-12.c: New test.
    • OpenMP: 'interop' construct - add C/C++ parser support, improve Fortran parsing · 8f0c8e57
      Tobias Burnus authored
      Add middle end support for the 'interop' directive and the 'init', 'use',
      and 'destroy' clauses - but fail with a sorry, unimplemented in gimplify.cc.
      
      For Fortran, generate the tree code, update the internal representation,
      add some more diagnostic checks and update for newer specification changes
      ('fr' only takes a single value, but integer expressions are permitted
      again [as with the old syntax], not only constant identifiers).
      
      For C and C++, this patch adds the full parser support for 'interop'.
      
      Still missing is actually handling the directive in the middle end and
      in libgomp.
      
      The GOMP_INTEROP_IFR_* internal values have been changed to have space
      for vendor specific values that are adjacent to the existing values
      but negative, if needed.
      
      gcc/c-family/ChangeLog:
      
      	* c-common.h (enum c_omp_region_type): Add C_ORT_INTEROP
      	and C_ORT_OMP_INTEROP.
      	(c_omp_interop_t_p): New prototype.
      	* c-omp.cc (c_omp_interop_t_p): Check whether the type is
      	omp_interop_t.
      	(c_omp_directives): Uncomment 'interop'.
      	* c-pragma.cc (omp_pragmas): Add 'interop'.
      	* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_INTEROP.
      	(enum pragma_omp_clause): Add init, use, and destroy clauses.
      
      gcc/c/ChangeLog:
      
      	* c-parser.cc (INCLUDE_STRING): Define.
      	(c_parser_pragma): Handle 'interop' directive.
      	(c_parser_omp_clause_name): Handle init, use, and destroy clauses.
      	(c_parser_omp_all_clauses): Likewise; use C_ORT_OMP_INTEROP, if
      	'use' is permitted, for c_finish_omp_clauses.
      	(c_parser_omp_clause_destroy, c_parser_omp_modifier_prefer_type,
      	c_parser_omp_clause_init, c_parser_omp_clause_use,
      	OMP_INTEROP_CLAUSE_MASK, c_parser_omp_interop): New.
      	* c-typeck.cc (c_finish_omp_clauses): Add missing OPT_Wopenmp to
      	a warning; handle new clauses.
      
      gcc/cp/ChangeLog:
      
      	* parser.cc (INCLUDE_STRING): Define.
      	(cp_parser_omp_clause_name): Handle init, use, and destroy clauses.
      	(cp_parser_omp_all_clauses): Likewise; use C_ORT_OMP_INTEROP, if
      	'use' is permitted, for c_finish_omp_clauses.
      	(cp_parser_omp_modifier_prefer_type, cp_parser_omp_clause_init,
      	OMP_INTEROP_CLAUSE_MASK, cp_parser_omp_interop): New.
      	(cp_parser_pragma): Handle 'interop' directive.
      	* pt.cc (tsubst_omp_clauses): Handle init, use, and destroy clauses.
      	(tsubst_stmt): Handle OMP_INTEROP.
      	* semantics.cc (cp_omp_init_prefer_type_update): New.
      	(finish_omp_clauses): Handle init, use, and destroy clauses
      	and add clause check for 'depend' on 'interop'.
      
      gcc/fortran/ChangeLog:
      
      	* gfortran.h (gfc_omp_namelist): Cleanup interop internal
      	representation.
      	* dump-parse-tree.cc (show_omp_namelist): Update for changed
      	internal representation.
      	* match.cc (gfc_free_omp_namelist): Likewise.
      	* openmp.cc (gfc_match_omp_prefer_type, gfc_match_omp_init):
      	Likewise; also handle some corner cases better and update for
      	newer 6.0 changes related to 'fr'.
      	(resolve_omp_clauses): Add type-check for interop variables.
      	* trans-openmp.cc (gfc_trans_omp_clauses): Handle init, use
      	and destroy clauses.
      	(gfc_trans_openmp_interop): New.
      	(gfc_trans_omp_directive): Call it.
      
      gcc/ChangeLog:
      
      	* gimplify.cc (gimplify_expr): Handle OMP_INTEROP by printing
      	"sorry, unimplemented".
      	* omp-api.h (omp_get_fr_id_from_name): Change return type to
      	'char'.
      	* omp-general.cc (omp_get_fr_id_from_name): Likewise; return
      	GOMP_INTEROP_IFR_UNKNOWN not 0 if not found.
      	(omp_get_name_from_fr_id): Return "<unknown>" not NULL
      	if not found (used for dumps).
      	* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DESTROY,
      	OMP_CLAUSE_USE, and OMP_CLAUSE_INIT.
      	* tree-pretty-print.cc (dump_omp_init_prefer_type): New.
      	(dump_omp_clause): Handle init, use and destroy clauses.
      	(dump_generic_node): Handle interop directive.
      	* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add new
      	init/use/destroy clauses.
      	* tree.def (OACC_LOOP): Fix comment.
      	(OMP_INTEROP): Add.
      	* tree.h (OMP_INTEROP_CLAUSES, OMP_CLAUSE_INIT_TARGET,
      	OMP_CLAUSE_INIT_TARGETSYNC, OMP_CLAUSE_INIT_PREFER_TYPE): New.
      
      include/ChangeLog:
      
      	* gomp-constants.h (GOMP_INTEROP_IFR_NONE): Rename ...
      	(GOMP_INTEROP_IFR_UNKNOWN): ... to this. And change value.
      	(GOMP_INTEROP_IFR_SEPARATOR): Likewise.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/gomp/interop-1.f90: Update for parser changes,
      	spec changes and add new tests.
      	* gfortran.dg/gomp/interop-2.f90: Likewise.
      	* gfortran.dg/gomp/interop-3.f90: Likewise.
      	* c-c++-common/gomp/interop-1.c: New test.
      	* c-c++-common/gomp/interop-2.c: New test.
      	* c-c++-common/gomp/interop-3.c: New test.
      	* c-c++-common/gomp/interop-4.c: New test.
      	* g++.dg/gomp/interop-5.C: New test.
      	* gfortran.dg/gomp/interop-4.f90: New test.
    • MAINTAINERS: Add myself to write after approval · 8d7f2d53
      Evgeny Karpov authored
      ChangeLog:
      
      	* MAINTAINERS: Add myself to write after approval.
    • i386: Make __builtin_ia32_f{nstenv,ldenv,nstsw,fnclex} builtins internal [PR117165] · d6d1fdcf
      Jakub Jelinek authored
      As the comment says, these builtins are meant to be internal for the atomic
      support and cause various ICEs when using them directly in various
      conditions.
      So the following patch makes them internal.
      We also have internal-fn.*, but then these target-specific builtins would
      have to live in generic code, so I've just added a space to their names,
      which is the old way to hide builtins, attributes, etc.
      
      2024-11-22  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/117165
      	* config/i386/i386-builtin.def (IX86_BUILTIN_FNSTENV,
      	IX86_BUILTIN_FLDENV, IX86_BUILTIN_FNSTSW, IX86_BUILTIN_FNCLEX): Add
      	space to the end of the builtin name to make it really internal.
      
      	* gcc.target/i386/pr117165.c: New test.
    • testsuite: Fix up vector-{8,9,10}.c tests · 77f4b109
      Jakub Jelinek authored
      On Thu, Nov 21, 2024 at 01:30:39PM +0100, Christoph Müllner wrote:
      > > >       * gcc.dg/tree-ssa/satd-hadamard.c: New test.
      > > >       * gcc.dg/tree-ssa/vector-10.c: New test.
      > > >       * gcc.dg/tree-ssa/vector-8.c: New test.
      > > >       * gcc.dg/tree-ssa/vector-9.c: New test.
      
      I see FAILs on i686-linux or on x86_64-linux (in the latter
      with -m32 testing).
      
      One problem is that vector-10.c doesn't use the -Wno-psabi option
      even though it uses a function which returns a vector and takes a
      vector as its first parameter.  The other problems are that the 3
      other tests don't arrange for at least basic vector ISA support and,
      non-standardly, test only on x86_64-*-*, while normally one would
      allow both i?86-*-* and x86_64-*-* and, if the test is e.g. specific
      to 64-bit, also check for lp64 or int128 or whatever else is needed.
      E.g. Solaris I think has an i?86-*-* triplet even for 64-bit code, etc.
      
      The following patch fixes these.
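
As a rough illustration of the kind of directives discussed above, here is a sketch of a test that passes vectors by value. It is not the content of the actual tests: the function name `add_v4si` and the exact option placement are assumptions made for this example.

```c
/* { dg-do compile } */
/* Suppress the ABI note GCC can emit when vectors are passed or returned
   by value without the matching vector ISA enabled.  */
/* { dg-additional-options "-Wno-psabi" } */
/* Ask for basic vector ISA support on both 32-bit and 64-bit x86 triplets.  */
/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */

/* GNU C vector extension: four ints in a 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical test function: takes and returns vectors by value.  */
v4si
add_v4si (v4si a, v4si b)
{
  return a + b;
}
```

Outside the testsuite the dg- comments are inert, so the file also compiles as ordinary GNU C.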
      
      2024-11-22  Jakub Jelinek  <jakub@redhat.com>
      
      	* gcc.dg/tree-ssa/satd-hadamard.c: Add -msse2 as dg-additional-options
      	on x86.  Also scan-tree-dump on i?86-*-*.
      	* gcc.dg/tree-ssa/vector-8.c: Likewise.
      	* gcc.dg/tree-ssa/vector-9.c: Likewise.
      	* gcc.dg/tree-ssa/vector-10.c: Add -Wno-psabi to dg-additional-options.
    • middle-end: For multiplication try swapping operands when matching complex multiply [PR116463] · a9473f9c
      Tamar Christina authored
      This commit fixes the failures of complex.exp=fast-math-complex-mls-*.c on the
      GCC 14 branch and some of the ones on the master.
      
      The current matching looks for only one operand order for multiplication and
      relied on canonicalization to always give the right order because of the
      TWO_OPERANDS.
      
      However, for multiplication, trying only one order is fragile because the
      operands can be flipped.
      
      The failing tests on the branch are:
      
      void fms180snd(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
                     _Complex TYPE c[restrict N]) {
        for (int i = 0; i < N; i++)
          c[i] -= a[i] * (b[i] * I * I);
      }
      
      void fms180fst(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
                     _Complex TYPE c[restrict N]) {
        for (int i = 0; i < N; i++)
          c[i] -= (a[i] * I * I) * b[i];
      }
      
      The issue is just a small difference in commutative operations:
      we look for {R,R} * {R,I} but found {R,I} * {R,R}.
      
      Since the DF analysis is cached, we should be able to swap operands and retry
      for multiply cheaply.
      
      There is a constraint being checked by vect_validate_multiplication for the data
      flow of the operands feeding the multiplications.  So e.g. between the nodes:
      
      note:   node 0x4d1d210 (max_nunits=2, refcnt=3) vector(2) double
      note:   op template: _27 = _10 * _25;
      note:      stmt 0 _27 = _10 * _25;
      note:      stmt 1 _29 = _11 * _25;
      note:   node 0x4d1d060 (max_nunits=2, refcnt=2) vector(2) double
      note:   op template: _26 = _11 * _24;
      note:      stmt 0 _26 = _11 * _24;
      note:      stmt 1 _28 = _10 * _24;
      
      we require the lanes to come from the same source which
      vect_validate_multiplication checks.  As such it doesn't make sense to flip them
      individually because that would invalidate the earlier linear_loads_p checks
      which have validated that the arguments all come from the same datarefs.
      
      This patch thus flips the operands in unison to still maintain this invariant,
      but also honor the commutative nature of multiplication.
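
To see why both operand orders have to be accepted, the two failing kernels can be reduced to scalar form. The sketch below fixes TYPE to double and uses helper names (`mul180_snd`, `mul180_fst`) invented for this example; it only illustrates the arithmetic, not the vectorizer's pattern matching.

```c
#include <complex.h>

/* Both forms compute -(a * b), because I * I == -1; they differ only in
   which multiplication operand carries the rotation by 180 degrees.  */
double complex
mul180_snd (double complex a, double complex b)
{
  return a * (b * I * I);
}

double complex
mul180_fst (double complex a, double complex b)
{
  return (a * I * I) * b;
}
```

Since the two functions agree for every input, a matcher that accepts only one operand order misses one of the two equivalent forms.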
      
      gcc/ChangeLog:
      
      	PR tree-optimization/116463
      	* tree-vect-slp-patterns.cc (complex_mul_pattern::matches,
      	complex_fms_pattern::matches): Try swapping operands on multiply.
    • LoongArch: Modify the document to remove options that don't exist. · 92864116
      Lulu Cheng authored
      gcc/ChangeLog:
      
      	* doc/invoke.texi: Remove the non-existent option
      	'-msmall-data-limit' and add a description of '-G'.
    • LoongArch: Remove redundant code. · a3a375b2
      Lulu Cheng authored
      TARGET_ASM_ALIGNED_{HI,SI,DI}_OP were defined twice; delete the redundant
      definitions.
      
      gcc/ChangeLog:
      
      	* config/loongarch/loongarch-builtins.cc
      	(loongarch_builtin_vectorized_function): Delete.
      	(LARCH_GET_BUILTIN): Delete.
      	* config/loongarch/loongarch-protos.h
      	(loongarch_builtin_vectorized_function): Delete.
      	* config/loongarch/loongarch.cc
      	(TARGET_ASM_ALIGNED_HI_OP): Delete.
      	(TARGET_ASM_ALIGNED_SI_OP): Delete.
      	(TARGET_ASM_ALIGNED_DI_OP): Delete.
    • i386/testsuite: Enhance AVX10.2 vmovd/w testcases · 45135f9d
      Haochen Jiang authored
      Under -fno-omit-frame-pointer, which is the Solaris/x86 default, %ebp
      will be used.  Check for both %ebp and %esp to avoid failures there.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/117697
      	* gcc.target/i386/avx10_2-vmovd-1.c: Check for both %esp and %ebp.
      	* gcc.target/i386/avx10_2-vmovw-1.c: Ditto.
    • LoongArch: Fix clerical errors in lasx_xvreplgr2vr_* and lsx_vreplgr2vr_*. · f0cb64fb
      Lulu Cheng authored
      [x]vldi.{b/h/w/d} is not implemented in LoongArch.
      Use the [x]vrepli.{b/h/w/d} macro instead.
      
      gcc/ChangeLog:
      
      	* config/loongarch/lasx.md: Fixed.
      	* config/loongarch/lsx.md: Fixed.
    • LoongArch: Make __builtin_lsx_vorn_v and __builtin_lasx_xvorn_v arguments and return values unsigned · ae7e2566
      Xi Ruoyao authored
      
      Align them with other vector bitwise builtins.
      
      This may break programs directly invoking __builtin_lsx_vorn_v or
      __builtin_lasx_xvorn_v, but doing so is not supported (as builtins are
      not documented, only intrinsics are documented and users should use them
      instead).
      
      gcc/ChangeLog:
      
      	* config/loongarch/loongarch-builtins.cc (vorn_v, xvorn_v): Use
      	unsigned vector modes.
      	* config/loongarch/lsxintrin.h (__lsx_vorn_v): Cast arguments to
      	v16u8.
      	* config/loongarch/lasxintrin.h (__lasx_xvorn_v): Cast arguments
      	to v32u8.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/loongarch/vector/lsx/lsx-builtin.c (__lsx_vorn_v):
      	Change arguments and return value to v16u8.
      	* gcc.target/loongarch/vector/lasx/lasx-builtin.c
      	(__lasx_xvorn_v): Change arguments and return value to v32u8.
    • Daily bump. · 8500a8c3
      GCC Administrator authored
  2. Nov 21, 2024
    • [RISC-V][PR target/117690] Add missing shift in constant synthesis · 9b7917b3
      Jeff Law authored
      As hinted at in the BZ, we were missing a left shift in the constant synthesis
      in the case where the upper 32 bits can be synthesized using a shNadd of the
      low 32 bits.
      
      This adjusts the synthesis to add the missing left shift and adjusts the cost
      to account for the additional instruction.
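
The role of the shift can be modelled in plain C. This is only a sketch: `shnadd` mimics the sh1add/sh2add/sh3add instructions, while `synthesize` and the way the two halves are combined are assumptions made for illustration, not the code in riscv_build_integer.

```c
#include <stdint.h>

/* Model of the shNadd instructions: (rs1 << n) + rs2, with n in 1..3.  */
static uint64_t
shnadd (uint64_t rs1, uint64_t rs2, unsigned n)
{
  return (rs1 << n) + rs2;
}

/* Hypothetical helper: build a 64-bit constant whose upper 32 bits equal
   shNadd (lo, lo).  Assumes that value fits in 32 bits.  */
uint64_t
synthesize (uint32_t lo, unsigned n)
{
  uint64_t hi = shnadd (lo, lo, n);   /* upper half derived from lower half */
  hi <<= 32;                          /* the left shift that was missing */
  return hi + lo;                     /* combine both halves */
}
```

Without the `hi <<= 32` step the derived value would land in the low bits instead of the upper half, and the extra instruction is why the cost had to be adjusted as well.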
      
      Regression tested on riscv64-elf in my tester.  Waiting for the pre-commit
      tester before moving forward.
      
      	PR target/117690
      gcc/
      	* config/riscv/riscv.cc (riscv_build_integer): Add missing left
      	shift when using shNadd to derive upper 32 bits from lower 32 bits.
      
      gcc/testsuite
      	* gcc.target/riscv/pr117690.c: New test.
      	* gcc.target/riscv/synthesis-13.c: Adjust expected output.
    • doc/cpp: Document __has_include_next · ffeee625
      Arsen Arsenović authored
      While hacking on an unrelated change, I noticed that __has_include_next
      hasn't been documented at all.  This patch adds it to the __has_include
      manual node.
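
For readers unfamiliar with the operators, here is a minimal sketch of how they are typically used. The macro `HAVE_STDINT_H` and the function `have_stdint_h` are names made up for this example, and the __has_include_next part is shown only as a comment because it is meant for wrapper headers.

```c
/* __has_include tests whether an #include of the named header would find
   a file; guard its use with #ifdef for older preprocessors.  */
#ifdef __has_include
# if __has_include (<stdint.h>)
#  include <stdint.h>
#  define HAVE_STDINT_H 1
# endif
#endif

#ifndef HAVE_STDINT_H
# define HAVE_STDINT_H 0
#endif

/* In a wrapper header that shadows a system header, the analogous check
   would be written with __has_include_next, paired with #include_next:

     #if __has_include_next (<stdint.h>)
     # include_next <stdint.h>
     #endif

   It is left as a comment here because this file is a primary source
   file, not a header found via the include path.  */

int
have_stdint_h (void)
{
  return HAVE_STDINT_H;
}
```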
      
      gcc/ChangeLog:
      
      	* doc/cpp.texi (__has_include): Document __has_include_next
      	also.
      	(Conditional Syntax): Mention __has_include_next in the
      	description for the __has_include menu entry.
    • c: Give errors more consistently for void parameters [PR114816] · 338d687e
      Joseph Myers authored
      Cases of void parameters, other than a parameter list of (void) (or
      equivalent with a typedef for void) in its entirety, have been made a
      constraint violation in C2Y (N3344 alternative 1 was adopted), as part
      of a series of changes to eliminate unnecessary undefined behavior by
      turning it into constraint violations, implementation-defined behavior
      or something else with stricter bounds on what behavior is allowed.
      Previously, these were implicitly undefined behavior (see DR#295),
      with only some cases listed in Annex J as undefined (but even those
      cases not having wording in the normative text to make them explicitly
      undefined).
      
      As discussed in bug 114816, GCC is not entirely consistent about
      diagnosing such usages; unnamed void parameters get errors when not
      the entire parameter list, while qualified and register void (the
      cases listed in Annex J) get errors as a single unnamed parameter, but
      named void parameters are accepted with a warning (in a declaration
      that's not a definition; it's not possible to define a function with
      incomplete parameter types).
      
      Following C2Y, make all these cases into errors.  The errors are not
      conditional on the standard version, given that this was previously
      implicit undefined behavior.  Since it wasn't possible anyway to
      define such functions, only declare them without defining them (or
      otherwise use such parameters in function type names that can't
      correspond to any defined function), hopefully the risks of
      compatibility issues are small.
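
The cases described above can be summarized in a short sketch. The function names are invented for this example, and the rejected declarations are kept inside `#if 0` so that the file still compiles.

```c
/* Still valid: a parameter list consisting of (void) in its entirety.  */
int
no_args (void)
{
  return 0;
}

/* Also valid: the same, via a typedef for void.  */
typedef void nothing;
int no_args_typedef (nothing);

/* Each declaration below is the kind of usage the commit turns into an
   error, so they are kept inside #if 0.  */
#if 0
void named (void x);            /* named void parameter (previously a warning) */
void not_alone (int, void);     /* unnamed void, not the entire list */
void qualified (const void);    /* qualified void as the single parameter */
void reg (register void);       /* register void as the single parameter */
#endif
```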
      
      Bootstrapped with no regressions for x86-64-pc-linux-gnu.
      
      	PR c/114816
      
      gcc/c/
      	* c-decl.cc (grokparms): Do not warn for void parameter type here.
      	(get_parm_info): Give errors for void parameters even when named.
      
      gcc/testsuite/
      	* gcc.dg/c2y-void-parm-1.c: New test.
      	* gcc.dg/noncompile/920616-2.c, gcc.dg/noncompile/921116-1.c,
      	gcc.dg/parm-incomplete-1.c: Update expected diagnostics.
    • json parsing: avoid relying on floating point equality [PR117677] · 4574f15b
      David Malcolm authored
      
      gcc/ChangeLog:
      	PR bootstrap/117677
      	* json-parsing.cc (selftest::test_parse_number): Replace
      	ASSERT_EQ of 'double' values with ASSERT_NEAR.  Eliminate
      	ASSERT_PRINT_EQ for such values.
      	* selftest.h (ASSERT_NEAR): New.
      	(ASSERT_NEAR_AT): New.
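
The idea behind comparing with a tolerance instead of exact equality can be sketched as follows. `nearly_equal` is a hypothetical helper written for this example; it is not the ASSERT_NEAR macro from selftest.h, whose definition is not shown in the commit message.

```c
#include <stdbool.h>

/* Hypothetical helper (not the selftest.h macro): true if A and B differ
   by at most EPS.  */
bool
nearly_equal (double a, double b, double eps)
{
  double diff = a - b;
  if (diff < 0)
    diff = -diff;               /* absolute difference without libm */
  return diff <= eps;
}
```

A check such as `nearly_equal (0.1 + 0.2, 0.3, 1e-9)` succeeds even where the exact comparison `0.1 + 0.2 == 0.3` may not, which is the kind of fragility an exact ASSERT_EQ on doubles is exposed to.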
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
    • testsuite: add print-stack.exp · b599498e
      David Malcolm authored
      
      I wrote this support file to help me debug Tcl issues in the
      testsuite.
      
      Adding a call to:
      
        print_stack_backtrace
      
      somewhere in a .exp file (along with "load_lib print-stack.exp") leads
      to the interpreter printing a backtrace in a form that e.g. Emacs can
      consume, with filename:linenum: lines, and quoting the line of .exp
      source code.
      
      For example, adding a print_stack_backtrace to scansarif.exp in
      run-sarif-pytest I get this output:
      
      VVV START OF BACKTRACE VVV
        /home/david/coding/gcc-newgit/src/gcc/testsuite/lib/scansarif.exp:142: frame 16 in proc print_stack_backtrace
          142 |     print_stack_backtrace
        <proc>: frame 15 in proc run-sarif-pytest
        <eval>: frame 14 in proc dg-final-proc
        /usr/share/dejagnu/dg.exp:851: frame 13 in proc dg-final-proc
          851 | 	if {[catch "dg-final-proc $prog" errmsg]} {
        <eval>: frame 12 in proc saved-dg-test
        /home/david/coding/gcc-newgit/src/gcc/testsuite/lib/gcc-dg.exp:1080: frame 11 in proc saved-dg-test
          1080 | 	if { [ catch { eval saved-dg-test $args } errmsg ] } {
        /usr/share/dejagnu/dg.exp:559: frame 10 in proc dg-test
          559 | 	dg-test $testcase $options ${default-extra-options}
        /home/david/coding/gcc-newgit/src/gcc/testsuite/gcc.dg/sarif-output/sarif-output.exp:28: frame 9
          28 | dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] "" ""
        <eval>: frame 8
        <eval>: frame 7
        /usr/share/dejagnu/runtest.exp:1460: frame 6
          1460 | 	if { [catch "uplevel #0 source $test_file_name"] == 1 } {
        /usr/share/dejagnu/runtest.exp:1886: frame 5 in proc dg-runtest
          1886 | 			runtest $test_name
        /usr/share/dejagnu/runtest.exp:1845: frame 4 in proc dg-runtest
          1845 | 		    foreach test_name [lsort [find ${dir} *.exp]] {
        /usr/share/dejagnu/runtest.exp:1788: frame 3 in proc dg-runtest
          1788 | 	    foreach dir "${test_top_dirs}" {
        /usr/share/dejagnu/runtest.exp:1669: frame 2 in proc dg-runtest
          1669 |     foreach pass $multipass {
        /usr/share/dejagnu/runtest.exp:1619: frame 1 in proc dg-runtest
          1619 | foreach current_target $target_list {
      ^^^  END OF BACKTRACE  ^^^
      
      and can click on the lines in Emacs's compilation buffer to take
      me to the relevant places.
      
      I found this made it *much* easier to debug my .exp files.  That
      said, I'm uncomfortable with Tcl, and so
      (a) there may be a better way of doing this
      (b) I may have made mistakes
      
      gcc/testsuite/ChangeLog:
      	* lib/print-stack.exp: New file.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
    • testsuite: tree-ssa: Limit targets for vec perm tests · ae0d842f
      Christoph Müllner authored
      
      Recently added test cases assume optimized code generation for certain
      vectorized code.  However, this optimization might not be applied if
      the backends don't support the optimized permutation.
      
      The tests are confirmed to work on aarch64 and x86-64, so this
      patch restricts the tests accordingly.
      
      Tested on x86-64.
      
      	PR117728
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/satd-hadamard.c: Restrict to aarch64 and x86-64.
      	* gcc.dg/tree-ssa/vector-8.c: Likewise.
      	* gcc.dg/tree-ssa/vector-9.c: Likewise.
      
      Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
    • c++: inline variables and modules · 819f67a2
      Jason Merrill authored
      We weren't writing out the definition of an inline variable, so the importer
      either got an undefined symbol or 0.
      
      gcc/cp/ChangeLog:
      
      	* module.cc (has_definition): Also true for inline vars.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/modules/inline-1_a.C: New test.
      	* g++.dg/modules/inline-1_b.C: New test.
    • c++: modules and debug marker stmts · 74498be0
      Jason Merrill authored
      21_strings/basic_string/operations/contains/nonnull.cc was failing because
      the module was built with debug markers and the testcase was built not
      expecting debug markers, so we crashed in lower_stmt.  Let's accommodate
      this by discarding debug marker statements we don't want.
      
      gcc/cp/ChangeLog:
      
      	* module.cc (trees_in::core_vals) [STATEMENT_LIST]: Skip
      	DEBUG_BEGIN_STMT if !MAY_HAVE_DEBUG_MARKER_STMTS.
    • c++: modules and tsubst_friend_class · 03c7145a
      Jason Merrill authored
      In 20_util/function_objects/mem_fn/constexpr.cc we start to instantiate
      _Mem_fn_base's friend declaration of _Bind_check_arity before we've loaded
      the namespace-scope declaration, so lookup_imported_hidden_friend doesn't
      find it.  But then we load the namespace-scope declaration in
      lookup_template_class during substitution, and so when we get around to
      pushing the result of substitution, they conflict.  Fixed by calling
      lazy_load_pendings in lookup_imported_hidden_friend.
      
      gcc/cp/ChangeLog:
      
      	* name-lookup.cc (lookup_imported_hidden_friend): Call
      	lazy_load_pendings.
    • AVR: target/117726 - Better optimizations of ASHIFT:SI insns. · 873cffc7
      Georg-Johann Lay authored
      This patch improves the 4-byte ASHIFT insns.
      1) It adds a "r,r,C15" alternative for improved long << 15.
      2) It adds 3-operand alternatives (depending on options) and
         splits them after peephole2 / before avr-fuse-move into
         a 3-operand byte shift and a 2-operand residual bit shift.
      For better control, it introduces new option -msplit-bit-shift
      that's activated at -O2 and higher per default.  2) is even
      performed with -Os, but not with -Oz.
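
The split in 2) rests on a simple identity, sketched below in portable C. The function name `split_shift_left` is made up for this example; it models only the arithmetic, not the RTL splitters or the generated AVR code.

```c
#include <stdint.h>

/* Shift X left by N (0..31) in two steps: a whole-byte shift followed by
   a residual shift of fewer than 8 bits.  */
uint32_t
split_shift_left (uint32_t x, unsigned n)
{
  unsigned byte_bits = n & ~7u;   /* multiple of 8: 0, 8, 16 or 24 */
  unsigned rest_bits = n & 7u;    /* residual bit shift: 0..7 */
  uint32_t t = x << byte_bits;    /* 3-operand step: t = x << 8k */
  t <<= rest_bits;                /* 2-operand step: t <<= r */
  return t;
}
```

On an 8-bit target the whole-byte step amounts to moving bytes between registers, so only the residual step needs actual bit shifting.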
      
      	PR target/117726
      gcc/
      	* config/avr/avr.opt (-msplit-bit-shift): Add new optimization option.
      	* common/config/avr/avr-common.cc (avr_option_optimization_table)
      	[OPT_LEVELS_2_PLUS]: Turn on -msplit-bit-shift.
      	* config/avr/avr.h (machine_function.n_avr_fuse_add_executed):
      	New bool component.
      	* config/avr/avr.md (attr "isa") <2op, 3op>: Add new values.
      	(attr "enabled"): Handle them.
      	(ashlsi3, *ashlsi3, *ashlsi3_const): Add "r,r,C15" alternative.
      	Add "r,0,C4l" and "r,r,C4l" alternatives (depending on 2op / 3op).
      	(define_split) [avr_split_bit_shift]: Add 2 new ashift:ALL4 splitters.
      	(define_peephole2) [ashift:ALL4]: Add (match_dup 3) so that the scratch
      	won't overlap with the output operand of the matched insn.
      	(*ashl<mode>3_const_split): Remove unused ashift:ALL4 splitter.
      	* config/avr/avr-passes.cc (emit_valid_insn)
      	(emit_valid_move_clobbercc): Move out of anonymous namespace.
      	(make_avr_pass_fuse_add) <gate>: Don't override.
      	<execute>: Set n_avr_fuse_add_executed according to
      	func->machine->n_avr_fuse_add_executed.
      	(pass_data avr_pass_data_split_after_peephole2): New object.
      	(avr_pass_split_after_peephole2): New rtl_opt_pass.
      	(avr_emit_shift): New static function.
      	(avr_shift_is_3op, avr_split_shift_p, avr_split_shift)
      	(make_avr_pass_split_after_peephole2): New functions.
      	* config/avr/avr-passes.def (avr_pass_split_after_peephole2):
      	Insert new pass after pass_peephole2.
      	* config/avr/avr-protos.h
      	(n_avr_fuse_add_executed, avr_shift_is_3op, avr_split_shift_p)
      	(avr_split_shift, avr_optimize_size_level)
      	(make_avr_pass_split_after_peephole2): New prototypes.
      	* config/avr/avr.cc (n_avr_fuse_add_executed): New global variable.
      	(avr_optimize_size_level): New function.
      	(avr_set_current_function): Set n_avr_fuse_add_executed
      	according to cfun->machine->n_avr_fuse_add_executed.
      	(ashlsi3_out) [case 15]: Output optimized code for this offset.
      	(avr_rtx_costs_1) [ASHIFT, SImode]: Adjust costs of offsets 15, 16.
      	* config/avr/constraints.md (C4a, C4r, C4l): New constraints.
      	* pass_manager.h (pass_manager): Adjust comments.
    • AVR: Fix a nit in avr-passes.cc::absint_t.dump(). · 938094ab
      Georg-Johann Lay authored
      gcc/
      	* config/avr/avr-passes.cc (absint_t::dump): Fix missing
      	newline in dump.
    • [RISC-V][PR target/116590] Avoid emitting multiple instructions from fmacc patterns · 41fb3a56
      Jeff Law authored
      So much like my patch from last week, this removes alternatives that
      create multiple instructions that we really should have never needed.
      
      In this case it fixes one of two bugs in pr116590.  In particular we
      don't want vmvNr instructions for thead-vector.  Those instructions were
      emitted as part of those two instruction sequences.
      
      I've tested this in my tester and assuming the pre-commit tester is
      happy, I'll push it to the trunk.
      
      	PR target/116590
      gcc
      	* config/riscv/vector.md (pred_mul_<optab>mode_undef): Drop
      	unnecessary alternatives.
      	(pred_<madd_msub><mode>): Likewise.
      	(pred_<macc_msac><mode>): Likewise.
      	(pred_<madd_msub><mode>_scalar): Likewise.
      	(pred_<macc_msac><mode>_scalar): Likewise.
      	(pred_mul_neg_<optab><mode>_undef): Likewise.
      	(pred_<nmsub_nmadd><mode>): Likewise.
      	(pred_<nmsac_nmacc><mode>): Likewise.
      	(pred_<nmsub_nmadd><mode>_scalar): Likewise.
      	(pred_<nmsac_nmacc><mode>_scalar): Likewise.
      
      gcc/testsuite
      	* gcc.target/riscv/pr116590.c: New test.
    • Match: Refactor the unsigned SAT_ADD match pattern [NFC] · fbca864a
      Pan Li authored
      
      This patch refactors the unsigned SAT_ADD pattern by:
      * Extract type check outside.
      * Extract common sub pattern.
      * Re-arrange the related match pattern forms together.
      * Remove unnecessary helper pattern matches.
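
For context, one common branchless way to write an unsigned saturating add in C is sketched below. The function name `sat_add_u32` is made up, and the commit message does not say which source forms match.pd recognizes, so treat this as an example of the operation rather than of a specific matched pattern.

```c
#include <stdint.h>

/* Branchless unsigned saturating add: if A + B wraps around, the sum is
   smaller than A, and the all-ones mask forces the result to UINT32_MAX.  */
uint32_t
sat_add_u32 (uint32_t a, uint32_t b)
{
  uint32_t sum = a + b;
  return sum | -(uint32_t) (sum < a);
}
```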
      
      The following test suites pass with this patch.
      * The rv64gcv full regression test.
      * The x86 bootstrap test.
      * The x86 full regression test.
      
      gcc/ChangeLog:
      
      	* match.pd: Refactor the various unsigned SAT_ADD match patterns.
      
      Signed-off-by: Pan Li <pan2.li@intel.com>
    • middle-end: Pass along SLP node when costing vector loads/stores · dbc38dd9
      Tamar Christina authored
      With SLP-only support we now pass the VMAT through the SLP node; however,
      the majority of the costing calls inside vectorizable_load and
      vectorizable_store do not pass the SLP node along.  Because of this, the backend
      costing never sees the VMAT for these cases anymore.
      
      Additionally, the helper around record_stmt_cost, when both SLP and stmt_vinfo
      are passed, would only pass the SLP node along.  However, the SLP node doesn't
      contain all the info available in the stmt_vinfo, and we'd have to go through
      the SLP_TREE_REPRESENTATIVE anyway.  As such I changed the function to just
      always pass both along.  Unlike the VMAT changes, I don't believe there to be a
      correctness issue here, but this minimizes churn in the backend
      costing until vectorizer costing as a whole is revisited in GCC 16.
      
      These changes re-enable the cost model on AArch64 and also correctly find the
      VMATs on loads and stores fixing testcases such as sve_iters_low_2.c.
      
      gcc/ChangeLog:
      
      	* tree-vect-data-refs.cc (vect_get_data_access_cost): Pass NULL for SLP
      	node.
      	* tree-vect-stmts.cc (record_stmt_cost): Expose.
      	(vect_get_store_cost, vect_get_load_cost): Extend with SLP node.
      	(vectorizable_store, vectorizable_load): Pass SLP node to all costing.
      	* tree-vectorizer.h (record_stmt_cost): Always pass both SLP node and
      	stmt_vinfo to costing.
      	(vect_get_load_cost, vect_get_store_cost): Extend with SLP node.
      dbc38dd9
    • Rainer Orth's avatar
      Use decl size in Solaris ASM_DECLARE_OBJECT_NAME [PR102296] · 116b1c54
      Rainer Orth authored
      Solaris has modified versions of ASM_DECLARE_OBJECT_NAME on both i386
      and sparc.  When
      
      commit ce597aed
      Author: Ilya Enkovich <ilya.enkovich@intel.com>
      Date:   Thu Aug 7 08:04:55 2014 +0000
      
          elfos.h (ASM_DECLARE_OBJECT_NAME): Use decl size instead of type size.
      
      was applied, those were missed.  At the same time, the testcase was
      restricted to Linux though there's nothing Linux-specific in there, so
      the error remained undetected.
      
      This patch fixes the definitions to match elfos.h and enables the test
      on Solaris, too.
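The difference between type size and decl size shows up, for example, with a statically initialized flexible array member. The sketch below is a hypothetical illustration, not the contents of struct-size.c: the names are invented, and it assumes GNU C, which permits such an initializer as an extension.

```c
#include <stddef.h>

/* Hypothetical example type with a flexible array member.  */
struct packet {
    int len;
    char data[];
};

/* GNU C extension: static initialization of the flexible array member.
   The object occupies more bytes than sizeof (struct packet), so the
   size recorded for the symbol should be the decl size, not the type size.  */
struct packet sample_packet = { 4, { 'a', 'b', 'c', 'd' } };

/* The type size covers only the fixed part of the structure.  */
static size_t type_size(void)
{
    return sizeof(struct packet);
}
```

Here the type size is that of the fixed part only, while the object itself also carries the four initialized bytes of data.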
      
      Bootstrapped without regressions on i386-pc-solaris2.11 and
      sparc-sun-solaris2.11.
      
      2024-11-19  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      
      	gcc/testsuite:
      	PR target/102296
      	* gcc.target/i386/struct-size.c: Enable on *-*-solaris*.
      
      	gcc:
      	PR target/102296
      	* config/i386/sol2.h (ASM_DECLARE_OBJECT_NAME): Use decl size
      	instead of type size.
      	* config/sparc/sol2.h (ASM_DECLARE_OBJECT_NAME): Likewise.
      116b1c54
    • Christoph Müllner's avatar
      forwprop: Try to blend two isomorphic VEC_PERM sequences · 1c4d39ad
      Christoph Müllner authored
      
      This extends forwprop by yet another VEC_PERM optimization:
      It attempts to blend two isomorphic vector sequences by using the
      redundancy in the lane utilization in these sequences.
      This redundancy in lane utilization comes from the way specific
      scalar statements end up vectorized: two VEC_PERMs on top, binary operations
      on both of them, and a final VEC_PERM to create the result.
      Here is an example of this sequence:
      
        v_in = {e0, e1, e2, e3}
        v_1 = VEC_PERM <v_in, v_in, {0, 2, 0, 2}>
        // v_1 = {e0, e2, e0, e2}
        v_2 = VEC_PERM <v_in, v_in, {1, 3, 1, 3}>
        // v_2 = {e1, e3, e1, e3}
      
        v_x = v_1 + v_2
        // v_x = {e0+e1, e2+e3, e0+e1, e2+e3}
        v_y = v_1 - v_2
        // v_y = {e0-e1, e2-e3, e0-e1, e2-e3}
      
        v_out = VEC_PERM <v_x, v_y, {0, 1, 6, 7}>
        // v_out = {e0+e1, e2+e3, e0-e1, e2-e3}
      
      To remove the redundancy, lanes 2 and 3 can be freed, which allows
      changing the last statement into:
        v_out' = VEC_PERM <v_x, v_y, {0, 1, 4, 5}>
        // v_out' = {e0+e1, e2+e3, e0-e1, e2-e3}
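The equivalence of the two final selectors can be checked with a small scalar model. This is only a sketch: vec_perm, run_sequence and selectors_agree are hypothetical helpers that emulate a 4-lane VEC_PERM with plain arrays, not code from the patch.

```c
#define LANES 4

/* Model of VEC_PERM <a, b, sel>: selector values 0..3 pick lanes from a,
   values 4..7 pick lanes from b.  */
static void vec_perm(const int *a, const int *b, const int *sel, int *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = sel[i] < LANES ? a[sel[i]] : b[sel[i] - LANES];
}

/* Runs the sequence from the example with the given final selector.  */
static void run_sequence(const int *v_in, const int *final_sel, int *v_out)
{
    static const int sel1[LANES] = {0, 2, 0, 2};
    static const int sel2[LANES] = {1, 3, 1, 3};
    int v_1[LANES], v_2[LANES], v_x[LANES], v_y[LANES];

    vec_perm(v_in, v_in, sel1, v_1);
    vec_perm(v_in, v_in, sel2, v_2);
    for (int i = 0; i < LANES; i++) {
        v_x[i] = v_1[i] + v_2[i];
        v_y[i] = v_1[i] - v_2[i];
    }
    vec_perm(v_x, v_y, final_sel, v_out);
}

/* Returns 1 if selectors {0, 1, 6, 7} and {0, 1, 4, 5} give the same result.  */
static int selectors_agree(int e0, int e1, int e2, int e3)
{
    static const int wide[LANES] = {0, 1, 6, 7};
    static const int narrow[LANES] = {0, 1, 4, 5};
    const int v_in[LANES] = {e0, e1, e2, e3};
    int out_wide[LANES], out_narrow[LANES];

    run_sequence(v_in, wide, out_wide);
    run_sequence(v_in, narrow, out_narrow);
    for (int i = 0; i < LANES; i++)
        if (out_wide[i] != out_narrow[i])
            return 0;
    return 1;
}
```

Because v_x and v_y each repeat their first two lanes, the narrowed selector reads the same values while leaving lanes 2 and 3 of both operands unused.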
      
      The cost of eliminating the redundancy in the lane utilization is that
      lowering the VEC_PERM expression could get more expensive because of
      tighter packing of the lanes.  Therefore this optimization is not done
      on its own, but only when we identify two such sequences that can be
      blended.
      
      Once all candidate sequences have been identified, we try to blend them,
      so that we can use the freed lanes for the second sequence.
      On success we convert 2x (2x BINOP + 1x VEC_PERM) to
      2x VEC_PERM + 2x BINOP + 2x VEC_PERM traded for 4x VEC_PERM + 2x BINOP.
      
      The implemented transformation reuses (rewrites) the statements
      of the first sequence and the last VEC_PERM of the second sequence.
      The remaining four statements of the second sequence are left untouched
      and will be eliminated by DCE later.
      
      This targets x264_pixel_satd_8x4, which calculates the sum of absolute
      transformed differences (SATD) using Hadamard transformation.
      We have seen 8% speedup on SPEC's x264 on a 5950X (x86-64) and 7%
      speedup on an AArch64 machine.
      
      Bootstrapped and reg-tested on x86-64 and AArch64 (all languages).
      
      gcc/ChangeLog:
      
      	* tree-ssa-forwprop.cc (struct _vec_perm_simplify_seq): New data
      	structure to store analysis results of a vec perm simplify sequence.
      	(get_vect_selector_index_map): Helper to get an index map from the
      	provided vector permute selector.
      	(recognise_vec_perm_simplify_seq): Helper to recognise a
      	vec perm simplify sequence.
      	(narrow_vec_perm_simplify_seq): Helper to pack the lanes more
      	tightly.
      	(can_blend_vec_perm_simplify_seqs_p): Test if two vec perm
      	sequences can be blended.
      	(calc_perm_vec_perm_simplify_seqs): Helper to calculate the new
      	permutation indices.
      	(blend_vec_perm_simplify_seqs): Helper to blend two vec perm
      	simplify sequences.
      	(process_vec_perm_simplify_seq_list): Helper to process a list
      	of vec perm simplify sequences.
      	(append_vec_perm_simplify_seq_list): Helper to add a vec perm
      	simplify sequence to the list.
      	(pass_forwprop::execute): Integrate new functionality.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.dg/tree-ssa/satd-hadamard.c: New test.
      	* gcc.dg/tree-ssa/vector-10.c: New test.
      	* gcc.dg/tree-ssa/vector-8.c: New test.
      	* gcc.dg/tree-ssa/vector-9.c: New test.
      	* gcc.target/aarch64/sve/satd-hadamard.c: New test.
      
      Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
      1c4d39ad
    • H.J. Lu's avatar
      apx-ndd-tls-1[ab].c: Add -std=gnu17 · 42a8005c
      H.J. Lu authored
      
      Since GCC 15 defaults to -std=gnu23, add -std=gnu17 to apx-ndd-tls-1[ab].c
      to avoid:
      
      gcc.target/i386/apx-ndd-tls-1a.c: In function ‘k’:
      gcc.target/i386/apx-ndd-tls-1a.c:29:7: error: too many arguments to function ‘l’
      gcc.target/i386/apx-ndd-tls-1a.c:25:5: note: declared here
      
      	* gcc.target/i386/apx-ndd-tls-1a.c: -std=gnu17.
      	* gcc.target/i386/apx-ndd-tls-1b.c: Likewise.
      
      Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
      42a8005c
    • Rainer Orth's avatar
      libgomp: testsuite: Fix libgomp.c/alloc-pinned-3.c etc. for C23 on non-Linux · 0f7def85
      Rainer Orth authored
      Since the switch to a C23 default, three libgomp tests FAIL on Solaris:
      
      FAIL: libgomp.c/alloc-pinned-3.c (test for excess errors)
      UNRESOLVED: libgomp.c/alloc-pinned-3.c compilation failed to produce executable
      FAIL: libgomp.c/alloc-pinned-4.c (test for excess errors)
      UNRESOLVED: libgomp.c/alloc-pinned-4.c compilation failed to produce executable
      FAIL: libgomp.c/alloc-pinned-6.c (test for excess errors)
      UNRESOLVED: libgomp.c/alloc-pinned-6.c compilation failed to produce executable
      
      Excess errors:
      /vol/gcc/src/hg/master/local/libgomp/testsuite/libgomp.c/alloc-pinned-3.c:104:3: error: too many arguments to function 'set_pin_limit'
      
      Fixed by adding the missing size argument to the stub functions.
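The underlying rule is that C23 treats an empty parameter list "()" as "(void)", so calling such a function with an argument is an error. A minimal sketch of the kind of stub change described here follows; the parameter type, the body and the call_stub helper are assumptions made for illustration, not the actual test source.

```c
/* Before (accepted up to C17, where "()" leaves the parameters unspecified):
       void set_pin_limit () { }
   Under C23 a call such as set_pin_limit (size) no longer matches
   this definition.  */

/* After: the stub takes the same argument as the real implementation.
   The "int" parameter type is an assumption made for this sketch.  */
static void set_pin_limit(int size)
{
    (void) size;    /* nothing to do in the stub */
}

/* Hypothetical caller showing that the call now matches the definition.  */
static int call_stub(void)
{
    set_pin_limit(1024);    /* arbitrary example value */
    return 0;
}
```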
      
      Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11.
      
      2024-11-20  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
      
      	libgomp:
      	* testsuite/libgomp.c/alloc-pinned-3.c [!__linux__]
      	(set_pin_limit): Add size arg.
      	* testsuite/libgomp.c/alloc-pinned-4.c [!__linux__]
      	(set_pin_limit): Likewise.
      	* testsuite/libgomp.c/alloc-pinned-6.c [!__linux__]
      	(set_pin_limit): Likewise.
      0f7def85
    • Jakub Jelinek's avatar
      include: Add new post-DWARF 5 DW_LANG_* enumerators · 806563f1
      Jakub Jelinek authored
      DWARF changed the language code assignment to be on a web page, and
      since DWARF 5 was published 27 codes have already been assigned.
      We have some of those already in the header, but most of them were missing,
      including one added just yesterday (DW_LANG_C23).
      Note, this is really post-DWARF 5 stuff rather than DWARF 6, because
      DWARF 6 plans to switch from DW_AT_language to DW_AT_language_{name,version}
      pair where we'll say DW_LNAME_C with 202311 version instead of this.
      
      2024-11-21  Jakub Jelinek  <jakub@redhat.com>
      
      	* dwarf2.h (enum dwarf_source_language): Add comment where
      	the post DWARF 5 additions start.  Refresh list from
      	https://dwarfstd.org/languages.html.
      806563f1
    • Richard Biener's avatar
      tree-optimization/117720 - check alignment for VMAT_STRIDED_SLP · 7e9b0d90
      Richard Biener authored
      While vectorizable_store was already checking the alignment requirement
      of the stores and falling back to elementwise accesses if not honored,
      the vectorizable_load path wasn't doing this.  After the previous
      change to disregard alignment checking for VMAT_STRIDED_SLP in
      get_group_load_store_type this now tripped on power.
      
      	PR tree-optimization/117720
      	* tree-vect-stmts.cc (vectorizable_load): For VMAT_STRIDED_SLP
      	verify the chosen load type is OK with regard to alignment.
      7e9b0d90
    • Jakub Jelinek's avatar
      c-family, docs: Adjust descriptions/documentation for C23 publication · ab8d3606
      Jakub Jelinek authored
      As C23 has been published already https://www.iso.org/standard/82075.html
      we don't need to say that it is expected to be published etc.
      
      Furthermore, standards.texi was still documenting that -std=gnu17
      is the default.
      
      2024-11-21  Jakub Jelinek  <jakub@redhat.com>
      
      gcc/
      	* doc/invoke.texi (-std=c23): Adjust documentation for
      	publication of the ISO/IEC 9899:2024 standard.
      	* doc/standards.texi: Likewise.  Document -std=gnu17 and
      	-std=gnu23 options.  Mention that -std=gnu23 rather than
      	-std=gnu17 is now the default for C.
      gcc/c-family/
      	* c.opt (std=c23, std=gnu23, std=iso9899:2024): Adjust description
      	for publication of the ISO/IEC 9899:2024 standard.
      ab8d3606
    • Jakub Jelinek's avatar
      phiopt: Improve spaceship_replacement for HONOR_NANS [PR117612] · 05ab9447
      Jakub Jelinek authored
      The following patch optimizes spaceship followed by comparisons of the
      spaceship value even for floating point spaceship when NaNs can appear.
      operator<=> for this emits roughly
      signed char c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2;
      and I believe the
      /* The optimization may be unsafe due to NaNs.  */
      comment just isn't true.
      Sure, the i == j comparison doesn't raise exceptions on qNaNs, but if
      one of the operands is qNaN, then i == j is false and i < j or i > j
      is then executed and raises exceptions even on qNaNs.
      And we can safely optimize say
      c == -1 comparison after the above into i < j, that also raises
      exceptions like before and handles NaNs the same way as the original.
      The only unsafe transformation would be c == 0 or c != 0, turning it
      into i == j or i != j wouldn't raise exception, so I'm not doing that
      optimization (but other parts of the compiler optimize the i < j comparison
      away anyway).
      
      Anyway, to match the HONOR_NANS case, we need to verify that the
      second comparison has true edge to the phi_bb (yielding there -1 or 1),
      it can't be the false edge because when NaNs are honored, the false
      edge is for both the case where the inverted comparison is true or when
      one of the operands is NaN.  Similarly we need to ensure that the two
      non-equality comparisons are the opposite, while for -ffast-math we can in
      some cases get one comparison x >= 5.0 and the other x > 5.0 and it is fine,
      because NaN is UB, when NaNs are honored, they must be different to leave
      the unordered case with 2 value as the last one remaining.
      The patch also punts if HONOR_NANS and the phi has just 3 arguments instead
      of 4.
      When NaNs are honored, we also in some cases need to perform some comparison
      and then invert its result (so that exceptions are properly thrown and we
      get the correct result).
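The equivalence argued above can be illustrated in plain C. The sketch below mirrors the quoted comparison chain and checks that testing the result against -1 gives the same answer as i < j, including for NaN operands; the function names are illustrative only, and the example does not observe floating-point exception flags.

```c
#include <math.h>

/* Mirrors the sequence emitted for a floating-point operator<=>.  */
static signed char spaceship(double i, double j)
{
    signed char c;
    if (i == j)
        c = 0;
    else if (i < j)
        c = -1;
    else if (i > j)
        c = 1;
    else
        c = 2;      /* unordered: at least one operand is NaN */
    return c;
}

/* c == -1 holds exactly when i < j.  */
static int less_via_spaceship(double i, double j)
{
    return spaceship(i, j) == -1;
}

/* The direct comparison the optimization turns the above into.  */
static int less_direct(double i, double j)
{
    return i < j;
}
```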
      
      2024-11-21  Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/94589
      	PR tree-optimization/117612
      	* tree-ssa-phiopt.cc (spaceship_replacement): Handle
      	HONOR_NANS (TREE_TYPE (lhs1)) case when possible.
      
      	* gcc.dg/pr94589-5.c: New test.
      	* gcc.dg/pr94589-6.c: New test.
      	* g++.dg/opt/pr94589-5.C: New test.
      	* g++.dg/opt/pr94589-6.C: New test.
      05ab9447
    • Jakub Jelinek's avatar
      phiopt: Fix a pasto in spaceship_replacement [PR117612] · ca7430f1
      Jakub Jelinek authored
      When working on the PR117612 fix, I've noticed a pasto in
      tree-ssa-phiopt.cc (spaceship_replacement).
      The code is
            if (absu_hwi (tree_to_shwi (arg2)) != 1)
              return false;
            if (e1->flags & EDGE_TRUE_VALUE)
              {
                if (tree_to_shwi (arg0) != 2
                    || absu_hwi (tree_to_shwi (arg1)) != 1
                    || wi::to_widest (arg1) == wi::to_widest (arg2))
                  return false;
              }
            else if (tree_to_shwi (arg1) != 2
                     || absu_hwi (tree_to_shwi (arg0)) != 1
                     || wi::to_widest (arg0) == wi::to_widest (arg1))
              return false;
      where arg{0,1,2,3} are PHI args and wants to ensure that if e1 is a
      true edge, then arg0 is 2 and one of arg{1,2} is -1 and one is 1,
      otherwise arg1 is 2 and one of arg{0,2} is -1 and one is 1.
      But due to a pasto, the latter case doesn't verify that arg0
      is different from arg2, it could be both -1 or both 1 and we wouldn't
      punt.  The wi::to_widest (arg0) == wi::to_widest (arg1) test
      is always false when we've made sure in the earlier conditions that
      arg1 is 2 and arg0 is -1 or 1, so never 2.
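The intended check can be modelled with plain integers instead of trees. This is a sketch under assumptions: phi_args_ok is a hypothetical function, and it assumes the fix compares arg0 with arg2 in the second branch, as the description above implies.

```c
#include <stdbool.h>
#include <stdlib.h>

/* e1_true says whether e1 is the true edge; arg0..arg2 model the PHI
   argument values.  Returns true if the arguments have the expected shape.  */
static bool phi_args_ok(bool e1_true, long arg0, long arg1, long arg2)
{
    if (labs(arg2) != 1)
        return false;
    if (e1_true)
        /* arg0 must be 2; arg1 and arg2 must be -1 and 1 in some order.  */
        return arg0 == 2 && labs(arg1) == 1 && arg1 != arg2;
    /* arg1 must be 2; arg0 and arg2 must be -1 and 1 in some order.
       Comparing arg0 with arg2 here is the assumed correction.  */
    return arg1 == 2 && labs(arg0) == 1 && arg0 != arg2;
}
```

With the original comparison of arg0 against arg1, a call modelled as phi_args_ok(false, -1, 2, -1) would have been accepted; with the comparison against arg2 it is rejected.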
      
      2024-11-21  Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/94589
      	PR tree-optimization/117612
      	* tree-ssa-phiopt.cc (spaceship_replacement): Fix up
      	a pasto in check when arg1 is 2.
      ca7430f1
    • Jakub Jelinek's avatar
      c: Add u{,l,ll,imax}abs builtins [PR117024] · 7272e09c
      Jakub Jelinek authored
      The following patch adds u{,l,ll,imax}abs builtins, which just fold
      to ABSU_EXPR, similarly to how {,l,ll,imax}abs builtins fold to
      ABS_EXPR.
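The semantics these builtins fold to can be sketched in portable C: the absolute value is produced in the unsigned type, so the most negative input is well defined. The functions below are hypothetical stand-ins written for illustration; they do not call the new builtins, which need a compiler that includes this patch.

```c
#include <limits.h>

/* Portable model of an int -> unsigned int absolute value.  The negation
   is done in unsigned arithmetic, so INT_MIN does not overflow.  */
static unsigned int uabs_sketch(int x)
{
    return x < 0 ? 0u - (unsigned int) x : (unsigned int) x;
}

/* The same idea for long -> unsigned long.  */
static unsigned long ulabs_sketch(long x)
{
    return x < 0 ? 0ul - (unsigned long) x : (unsigned long) x;
}
```

In contrast, abs (INT_MIN) has undefined behavior because the result does not fit in int.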
      
      2024-11-21  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/117024
      gcc/
      	* coretypes.h (enum function_class): Add function_c2y_misc
      	enumerator.
      	* builtin-types.def (BT_FN_UINTMAX_INTMAX, BT_FN_ULONG_LONG,
      	BT_FN_ULONGLONG_LONGLONG): New DEF_FUNCTION_TYPE_1s.
      	* builtins.def (DEF_C2Y_BUILTIN): Define.
      	(BUILT_IN_UABS, BUILT_IN_UIMAXABS, BUILT_IN_ULABS,
      	BUILT_IN_ULLABS): New builtins.
      	* builtins.cc (fold_builtin_abs): Handle also folding of u*abs
      	to ABSU_EXPR.
      	(fold_builtin_1): Handle BUILT_IN_U{,L,LL,IMAX}ABS.
      gcc/lto/ChangeLog:
      	* lto-lang.cc (flag_isoc2y): New variable.
      gcc/ada/ChangeLog:
      	* gcc-interface/utils.cc (flag_isoc2y): New variable.
      gcc/testsuite/
      	* gcc.c-torture/execute/builtins/lib/abs.c (uintmax_t): New typedef.
      	(uabs, ulabs, ullabs, uimaxabs): New functions.
      	* gcc.c-torture/execute/builtins/uabs-1.c: New test.
      	* gcc.c-torture/execute/builtins/uabs-1.x: New file.
      	* gcc.c-torture/execute/builtins/uabs-1-lib.c: New file.
      	* gcc.c-torture/execute/builtins/uabs-2.c: New test.
      	* gcc.c-torture/execute/builtins/uabs-2.x: New file.
      	* gcc.c-torture/execute/builtins/uabs-2-lib.c: New file.
      	* gcc.c-torture/execute/builtins/uabs-3.c: New test.
      	* gcc.c-torture/execute/builtins/uabs-3.x: New test.
      	* gcc.c-torture/execute/builtins/uabs-3-lib.c: New test.
      7272e09c
    • Kewen Lin's avatar
      rs6000: Adjust FLOAT128 signbit2 expander for P8 LE [PR114567] · 10e70278
      Kewen Lin authored
      As the associated test case shows, signbit generated assembly
      is sub-optimal for _Float128 argument from memory on P8 LE.
      On P8 LE, the p8swap pass puts an explicit AND -16 on the memory,
      which causes mode_dependent_address_p to consider it invalid
      to change its mode, so combine fails to make use of the
      existing pattern signbit<SIGNBIT:mode>2_dm_mem.  Considering
      it's always more efficient to make use of 8 bytes load and
      shift on P8 LE, this patch is to adjust the current expander
      and treat it specially.
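The "8 bytes load and shift" idea can be sketched in portable C: for an IEEE binary128 value stored in little-endian byte order, the sign is the top bit of the upper doubleword, so only bytes 8..15 need to be read. Both functions below are hypothetical illustrations working on a raw byte buffer; they are not the code the expander emits.

```c
#include <stdint.h>
#include <string.h>

/* buf holds the 16 bytes of a binary128 value in little-endian order.
   Assemble the upper doubleword (bytes 8..15) and shift out the sign.  */
static int signbit_from_high_dword(const unsigned char buf[16])
{
    uint64_t hi = 0;
    for (int i = 7; i >= 0; i--)
        hi = (hi << 8) | buf[8 + i];
    return (int) (hi >> 63);
}

/* Builds a zero value with the given sign and extracts the sign again.  */
static int demo_signbit(int negative)
{
    unsigned char buf[16];
    memset(buf, 0, sizeof buf);
    if (negative)
        buf[15] = 0x80;     /* most significant byte carries the sign bit */
    return signbit_from_high_dword(buf);
}
```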
      
      	PR target/114567
      
      gcc/ChangeLog:
      
      	* config/rs6000/rs6000.md (expander signbit<FLOAT128:mode>2): Adjust.
      	(*signbit<mode>2_dm_mem): Rename to ...
      	(signbit<mode>2_dm_mem): ... this.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/powerpc/pr114567.c: New test.
      10e70278
    • Kewen Lin's avatar
      rs6000: Use standard name {add,sub}v1ti3 for altivec_v{add,sub}uqm · baf53675
      Kewen Lin authored
      This patch is to adjust define_insn altivec_v{add,sub}uqm
      with standard names, as the associated test case shows, w/o
      this patch, it ends up with scalar {add,subf}c/{add,subf}e,
      the standard names help to exploit v{add,sub}uqm.
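For comparison, the scalar fallback mentioned above corresponds to a two-step carry chain over 64-bit halves. The sketch below models that chain in C; u128_pair and add_u128 are hypothetical names, and the example only illustrates the arithmetic that a single 128-bit vector add performs in one instruction.

```c
#include <stdint.h>

/* A 128-bit unsigned value split into two 64-bit halves.  */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_pair;

/* Add the low halves first, then add the high halves plus the carry
   out of the low addition (the add-with-carry / add-extended pair).  */
static u128_pair add_u128(u128_pair a, u128_pair b)
{
    u128_pair r;
    r.lo = a.lo + b.lo;
    uint64_t carry = r.lo < a.lo;   /* 1 if the low addition wrapped */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```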
      
      gcc/ChangeLog:
      
      	* config/rs6000/altivec.md (altivec_vadduqm): Rename to ...
      	(addv1ti3): ... this.
      	(altivec_vsubuqm): Rename to ...
      	(subv1ti3): ... this.
      	* config/rs6000/rs6000-builtins.def (__builtin_altivec_vadduqm):
      	Replace bif expander altivec_vadduqm with addv1ti3.
      	(__builtin_altivec_vsubuqm): Replace bif expander altivec_vsubuqm with
      	subv1ti3.
      
      gcc/testsuite/ChangeLog:
      
      	* gcc.target/powerpc/p8vector-int128-3.c: New test.
      baf53675
    • Kewen Lin's avatar
      rs6000: Remove entry for V1TImode from VI_unit · ca96c1d1
      Kewen Lin authored
      When making a patch to adjust VECTOR_P8_VECTOR rs6000_vector
      enum, I noticed that V1TImode's mode attribute in VI_unit,
      VECTOR_UNIT_ALTIVEC_P (V1TImode), is never true, since
      VECTOR_UNIT_ALTIVEC_P checks if vector_unit[V1TImode] is
      equal to VECTOR_ALTIVEC, but vector_unit[V1TImode] can only
      be VECTOR_NONE or VECTOR_P8_VECTOR, so it can never be
      VECTOR_ALTIVEC:
        rs6000_vector_unit[V1TImode]
            = (TARGET_P8_VECTOR) ? VECTOR_P8_VECTOR : VECTOR_NONE;
      
      By checking all uses of VI_unit, the used mode iterator is
      one of VI2, VI, VP_small and VP, none of them has V1TImode,
      so the entry for V1TImode is useless.  I guessed it was
      designed to have one mode attribute to cover all integer
      vector modes, but later we separated V1TI handlings to its
      own patterns (those guarded with TARGET_VADDUQM).  Anyway,
      this patch is to remove this useless and confusing entry.
      
      gcc/ChangeLog:
      
      	* config/rs6000/altivec.md (mode attr for V1TI in VI_unit): Remove.
      ca96c1d1
    • Kewen Lin's avatar
      rs6000: Add veqv support to *eqv<mode>3_internal1 · 2441dc24
      Kewen Lin authored
      When making a patch to replace TARGET_P8_VECTOR, I noticed that
      for *eqv<BOOL_128:mode>3_internal1, unlike the other logical
      operations, we only exploited the vsx version.  I think it
      is an oversight, so this patch considers veqv as well.
      
      gcc/ChangeLog:
      
      	* config/rs6000/rs6000.md (*eqv<BOOL_128:mode>3_internal1): Generate
      	insn veqv if TARGET_ALTIVEC and operands are altivec_register_operand.
      2441dc24