- Feb 07, 2025
-
-
Jakub Jelinek authored
My r15-3046 change regressed the first half of the following testcase. When it calls decl_attributes, it doesn't handle attributes with dependent arguments correctly, and so the code is now rejected during template parsing with a complaint that N is not a constant integer. I've actually followed the pointer/reference case, which did that too, and that one has been failing for a couple of years on the second part of the testcase.

Note, there is also a direct call to decl_attributes in grokdeclarator:

    if (decl_context != PARM && decl_context != TYPENAME)
      /* Assume that any attributes that get applied late to
         templates will DTRT when applied to the declaration
         as a whole.  */
      late_attrs = splice_template_attributes (&attrs, type);
    returned_attrs = decl_attributes (&type,
                                      attr_chainon (returned_attrs, attrs),
                                      attr_flags);
    returned_attrs = attr_chainon (late_attrs, returned_attrs);

but this one handles the splicing manually, so maybe it is ok as is (and I don't have a testcase of anything misbehaving for that).

2025-02-07 Jakub Jelinek <jakub@redhat.com>

PR c++/118773
* decl.cc (grokdeclarator): Use cplus_decl_attributes rather than decl_attributes for std_attributes on pointer and array types.
* g++.dg/cpp0x/gen-attrs-87.C: New test.
* g++.dg/gomp/attrs-3.C: Adjust expected diagnostics.
-
Jakub Jelinek authored
As mentioned in the PR, https://eel.is/c++draft/conv.lval#note-1 says that even volatile reads from std::nullptr_t typed objects actually don't read anything and https://eel.is/c++draft/expr.const#10.9 says that even those are ok in constant expressions. So, the following patch adjusts the r9-4793 changes to have an exception for NULLPTR_TYPE. As [conv.lval]/3 also talks about accessing an inactive member, I've added a testcase to cover that as well. 2025-02-07 Jakub Jelinek <jakub@redhat.com> PR c++/118661 * constexpr.cc (potential_constant_expression_1): Don't diagnose lvalue-to-rvalue conversion of volatile lvalue if it has NULLPTR_TYPE. * decl2.cc (decl_maybe_constant_var_p): Return true for constexpr decls with NULLPTR_TYPE even if they are volatile. * g++.dg/cpp0x/constexpr-volatile4.C: New test. * g++.dg/cpp0x/constexpr-union9.C: New test.
-
Paul Thomas authored
2025-02-07 Tomáš Trnka <trnka@scm.com> gcc/fortran PR fortran/116829 * trans-decl.cc (init_intent_out_dt): Always call gfc_init_default_dt() for BT_DERIVED to apply s->value if the symbol isn't allocatable. Also simplify the logic a bit. gcc/testsuite/ PR fortran/116829 * gfortran.dg/derived_init_7.f90: New test.
-
Richard Biener authored
The following fixes a latent issue where we use ranges to verify correctness of a vector conversion optimization. We rely on ranges from 'op0' which for SLP is extracted from the representative stmt which does not necessarily correspond to any actual scalar operation. We also do not verify that the ranges of all scalar lanes in the SLP operand match. The following rectifies this, restricting the support to single-lane SLP nodes at this point - on branches we'd simply not perform this optimization with SLP. PR tree-optimization/115538 * tree-vectorizer.h (vect_get_slp_scalar_def): Declare. * tree-vect-slp.cc (vect_get_slp_scalar_def): New helper. * tree-vect-generic.cc (expand_vector_conversion): Adjust. * tree-vect-stmts.cc (vectorizable_conversion): For SLP correctly look at ranges of the scalar defs of the SLP operand. (supportable_indirect_convert_operation): Likewise.
-
Tobias Burnus authored
The amdhsa.version depends on the code object version; while V3 had 1.0, V4 has 1.1 and V5 (and V6) have 1.2. GCC used 1.0 but has for a while generated either V4 or, with -march=gfx...-generic, V6. Now it uses the proper version again. gcc/ChangeLog: * config/gcn/gcn.cc (gcn_hsa_declare_function_name): Update 'amdhsa.version' output to match used code version. * config/gcn/gen-gcn-device-macros.awk: Add a comment to crosslink.
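The version mapping stated in the message above can be written down as a small table lookup. This is only a sketch: `HsaVersion` and `amdhsa_version_for` are hypothetical names, not code from gcn.cc, and treating anything below V4 as V3 is an assumption.

```cpp
// Hypothetical illustration of the mapping in the commit message:
// code object V3 -> 1.0, V4 -> 1.1, V5 and V6 -> 1.2.
struct HsaVersion {
  int major_part;
  int minor_part;
};

constexpr HsaVersion amdhsa_version_for(int code_object_version) {
  return code_object_version <= 3   ? HsaVersion{1, 0}   // assumption: V3 or older
         : code_object_version == 4 ? HsaVersion{1, 1}
                                    : HsaVersion{1, 2};  // V5 and V6
}
```

With this model, a V4 or V6 object labelled 1.0 is exactly the mismatch the commit corrects.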
-
Tobias Burnus authored
libgomp/ChangeLog: * plugin/plugin-gcn.c (ELFABIVERSION_AMDGPU_HSA_V6, EF_AMDGPU_GENERIC_VERSION_V, EF_AMDGPU_GENERIC_VERSION_OFFSET, GET_GENERIC_VERSION): New #define. (elf_gcn_isa_is_generic): New. (isa_matches_agent): Accept all generic code objects on the first go; extend the diagnostic and handle runtime-failed case. (create_and_finalize_hsa_program): Call it also after loading the code failed, pass the status.
-
Xi Ruoyao authored
For mask{eq,ne}z, rk is always compared with 0 in the full width, thus the mode for rk should be X. I found the issue reviewing a patch fixing a similar issue for RISC-V XTheadCondMov [1], but interestingly I cannot find a test case really blowing up on LoongArch. But as the issue is obvious enough let's fix it anyway so it won't blow up in the future. [1]: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674004.html gcc/ChangeLog: * config/loongarch/loongarch.md (*sel<code><GPR:mode>_using_<GPR2:mode>): Rename to ... (*sel<code><GPR:mode>_using_<X:mode>): ... here. (GPR2): Remove as nothing uses it now.
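A rough model of why the mode of rk matters, assuming a 64-bit target. The functions below are illustrative stand-ins for the instructions, not GCC code; `maskeqz_narrow` is a hypothetical variant showing what would go wrong if rk were only inspected in its low 32 bits.

```cpp
#include <cstdint>

// maskeqz rd, rj, rk: rd = (rk == 0) ? 0 : rj, comparing the full register.
constexpr std::uint64_t maskeqz(std::uint64_t rj, std::uint64_t rk) {
  return rk == 0 ? 0 : rj;
}

// masknez rd, rj, rk: rd = (rk != 0) ? 0 : rj, comparing the full register.
constexpr std::uint64_t masknez(std::uint64_t rj, std::uint64_t rk) {
  return rk != 0 ? 0 : rj;
}

// Hypothetical: what a 32-bit view of rk would compute instead.
constexpr std::uint64_t maskeqz_narrow(std::uint64_t rj, std::uint64_t rk) {
  return static_cast<std::uint32_t>(rk) == 0 ? 0 : rj;
}
```

A value of rk with only high bits set, such as 1 << 32, is non-zero for the real instruction but looks like zero in the narrow view, so the two disagree.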
-
Alexandre Oliva authored
If decode_field_reference finds a load that accesses past the inner object's size, bail out. Drop the too-strict assert. for gcc/ChangeLog PR tree-optimization/118514 PR tree-optimization/118706 * gimple-fold.cc (decode_field_reference): Refuse to consider merging out-of-bounds BIT_FIELD_REFs. (make_bit_field_load): Drop too-strict assert. * tree-eh.cc (bit_field_ref_in_bounds_p): Rename to... (access_in_bounds_of_type_p): ... this. Change interface, export. (tree_could_trap_p): Adjust. * tree-eh.h (access_in_bounds_of_type_p): Declare. for gcc/testsuite/ChangeLog PR tree-optimization/118514 PR tree-optimization/118706 * gcc.dg/field-merge-25.c: New.
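The bounds condition can be sketched as follows. `bit_range_fits` is a hypothetical helper and does not mirror the interface of access_in_bounds_of_type_p; it only shows the kind of check that makes a load "past the inner object's size" detectable.

```cpp
#include <cstdint>

// Hypothetical helper: does a bit range [bitpos, bitpos + bitsize) lie
// entirely inside an object that is object_bits wide?
constexpr bool bit_range_fits(std::uint64_t bitpos, std::uint64_t bitsize,
                              std::uint64_t object_bits) {
  // Phrased as a subtraction so that bitpos + bitsize cannot overflow.
  return bitpos <= object_bits && bitsize <= object_bits - bitpos;
}
```

Under this model a 16-bit access at bit 24 of a 32-bit object is rejected, which corresponds to the bail-out case rather than an assertion failure.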
-
Tobias Burnus authored
This patch adds gfx9-generic, completing the gfx*-generic support. It also adds all gfx* devices that are part of any of the gfx*-generic, i.e. gfx902, gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153. gcc/ChangeLog: * config/gcn/gcn-devices.def (GCN_DEVICE): Add gfx9-generic, gfx902, gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153. Add a currently unused column linking a specific ISA to a generic one (if it exists). * config/gcn/gcn-tables.opt: Regenerate. * doc/invoke.texi (AMD GCN): Add the new gfx... and the older gfx{10-3,11}-generic to -march= as 'experimental'.
-
Tobias Burnus authored
When compiling with -g, mkoffload.cc creates a device object file itself; however, so that the linker does not complain, the ELF flags must match what the compiler / linker does. For gfx906, the assembler defaults to sramecc = any, but gcn-devices.def contained unsupported, which is not the same - causing link errors. That's a regression caused by commit r15-4540-ga6b26e5ea09779 - which can be best seen by looking at the changes to mkoffload.cc. Additionally, this commit adds '...' to the GCN_DEVICE #define in gcn.cc to make it agnostic to the addition of fields. gcc/ChangeLog: * config/gcn/gcn-devices.def (GCN_DEVICE): Change sramecc for gfx906 to 'any'. * config/gcn/gcn.cc (GCN_DEVICE): Add trailing ... to #define.
-
Alexandre Oliva authored
vis3move-3.c expects fsmuld, which is not available on all variants of sparc. Select a cpu that supports it for the test. Now, -mfix-ut699 irrevocably disables fsmuld, so skip the test if the test configuration uses that option. for gcc/testsuite/ChangeLog * gcc.target/sparc/vis3move-3.c: Select ultrasparc. Skip with -mfix-ut699.
-
Alexandre Oliva authored
A number of tls tests expect TLS-specific relocations, that are not present when tls is emulated, as on e.g. leon3-elf. Skip the tests when tls is emulated. for gcc/testsuite/ChangeLog * gcc.target/sparc/tls-ld-int16.c: Skip when tls is emulated. * gcc.target/sparc/tls-ld-int32.c: Likewise. * gcc.target/sparc/tls-ld-int8.c: Likewise. * gcc.target/sparc/tls-ld-uint16.c: Likewise. * gcc.target/sparc/tls-ld-uint32.c: Likewise. * gcc.target/sparc/tls-ld-uint8.c: Likewise.
-
Alexandre Oliva authored
Option -mfix-ut699 changes the set of instructions that can be placed in the delay slot, preventing the expected insn placement. Skip the test if the option is present. for gcc/testsuite/ChangeLog * gcc.target/sparc/sparc-ret-1.c: Skip on -mfix-ut699.
-
Alexandre Oliva authored
If -mcpu=leon3 is present in the command line for a test run, overriding it with -mcpu=niagara7 is not enough to override the tuning for leon3 selected by the previous -mcpu option. niagara7-align.c tests for niagara7 alignment tuning, so use -mtune rather than -mcpu. for gcc/testsuite/ChangeLog * gcc.target/sparc/niagara7-align.c: Use -mtune.
-
H.J. Lu authored
commit 3b9b8d6c
Author: Surya Kumari Jangala <jskumari@linux.ibm.com>
Date: Tue Jun 25 08:37:49 2024 -0500

    ira: Scale save/restore costs of callee save registers with block frequency

scales the cost of saving/restoring a callee-save hard register in epilogue and prologue with the entry block frequency, which, if not optimizing for size, is 10000, for all targets. As a result, callee-saved registers may not be used to preserve local variable values across calls on some targets, like x86. Add a target hook for the callee-saved register cost scale in epilogue and prologue used by IRA. The default version of this target hook returns 1 if optimizing for size, otherwise returns the entry block frequency. Add an x86 version of this target hook to restore the old behavior prior to the above commit.

PR rtl-optimization/111673
PR rtl-optimization/115932
PR rtl-optimization/116028
PR rtl-optimization/117081
PR rtl-optimization/117082
PR rtl-optimization/118497
* ira-color.cc (assign_hard_reg): Call the target hook for the callee-saved register cost scale in epilogue and prologue.
* target.def (ira_callee_saved_register_cost_scale): New target hook.
* targhooks.cc (default_ira_callee_saved_register_cost_scale): New.
* targhooks.h (default_ira_callee_saved_register_cost_scale): Likewise.
* config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale): New. (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Likewise.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): New.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
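The behaviour of the default hook, as described in the message above, can be modelled in a few lines. The names here are hypothetical and the multiplication is a simplified assumption about how the scale is applied; the real hook signature in target.def is not reproduced.

```cpp
// Entry block frequency quoted in the commit message for the
// "not optimizing for size" case.
constexpr int kEntryBlockFrequency = 10000;

// Hypothetical model of the default hook: 1 when optimizing for size,
// otherwise the entry block frequency.
constexpr int default_callee_saved_cost_scale(bool optimize_for_size) {
  return optimize_for_size ? 1 : kEntryBlockFrequency;
}

// Simplified assumption: the prologue/epilogue save/restore cost of one
// callee-saved register is multiplied by the scale.
constexpr long scaled_save_restore_cost(int per_register_cost,
                                        bool optimize_for_size) {
  return static_cast<long>(per_register_cost) *
         default_callee_saved_cost_scale(optimize_for_size);
}
```

In this model a unit cost becomes 10000 when optimizing for speed, which suggests why the allocator would then avoid callee-saved registers on targets such as x86.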
-
GCC Administrator authored
-
- Feb 06, 2025
-
-
Craig Blackmore authored
stack_protect_{set,test}_<mode> were showing up in RTL dumps as UNSPEC_COPYSIGN and UNSPEC_FMV_X_W due to UNSPEC_SSP_SET and UNSPEC_SSP_TEST being put in the unspecv enum instead of unspec. gcc/ChangeLog: * config/riscv/riscv.md: Move UNSPEC_SSP_SET and UNSPEC_SSP_TEST to unspec enum.
-
Jeff Law authored
Richard S's recent change to iv increment insertion removed a reg->reg move (which was its intent AFAICT). This triggered a failure on a riscv test. That test was meant to verify that we didn't have an extraneous reg->reg move due to a buglet in the risc-v splitters. Before the 2023 change we had two vector reg->reg moves and after the 2023 fix we had just one. With Richard's change we have none ;-) Adjusting test accordingly. Pushed to the trunk. gcc/testsuite * gcc.target/riscv/rvv/autovec/madd-split2-1.c: Update expected output.
-
Georg-Johann Lay authored
gcc/ * config/avr/avr.opt.urls: Add mcvt.
-
Tamar Christina authored
It seems that after my IVopts patches the function contain_complex_addr_expr became unused and clang is rightfully complaining about it. This removes the unused internal function. gcc/ChangeLog: PR tree-optimization/118756 * tree-ssa-loop-ivopts.cc (contain_complex_addr_expr): Remove.
-
Jerry DeLisle authored
This patch is a partial fix for the handling of X edit descriptors when combined with certain T edit descriptors. PR libfortran/114618 libgfortran/ChangeLog: * io/transfer.c (formatted_transfer_scalar_write): Change name of variable 'pos' to 'tab_pos' to improve clarity. Add new variable next_pos when calculating the maximum position. Update the calculation of pending spaces. gcc/testsuite/ChangeLog: * gfortran.dg/pr114618.f90: New test.
-
Jakub Jelinek authored
Another non-problematic attribute. 2025-02-06 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * g++.dg/cpp0x/attr-no_unique_address1.C: New test.
-
Jakub Jelinek authored
Another non-problematic attribute. 2025-02-06 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * g++.dg/cpp0x/attr-noreturn1.C: New test.
-
Jakub Jelinek authored
Fairly non-problematic attribute. 2025-02-06 Jakub Jelinek <jakub@redhat.com> PR c++/110345 * g++.dg/cpp0x/attr-nodiscard1.C: New test.
-
Georg-Johann Lay authored
Some AVR devices support a CVT:
- Devices from the 0-series, 1-series, 2-series.
- AVR16, AVR32, AVR64, AVR128 devices.

The support is provided by means of a startup code file crt<mcu>-cvt.o from AVR-LibC v2.3 that can be linked instead of the traditional crt<mcu>.o. This patch adds a new command line option -mcvt that links that CVT startup code (or issues an error when the device doesn't support a CVT).

PR target/118764
gcc/
* config/avr/avr.opt (-mcvt): New target option.
* config/avr/avr-arch.h (AVR_CVT): New enum value.
* config/avr/avr-mcus.def: Add AVR_CVT flag for devices that support it.
* config/avr/avr.cc (avr_handle_isr_attribute) [TARGET_CVT]: Issue an error when a vector number larger than 3 is used.
* config/avr/gen-avr-mmcu-specs.cc (McuInfo.have_cvt): New property. (print_mcu) <*avrlibc_startfile>: Use crt<mcu>-cvt.o depending on -mcvt (or issue an error when the device doesn't support a CVT).
* doc/invoke.texi (AVR Options): Document -mcvt.
-
Paul Thomas authored
2025-02-06 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/118750 * resolve.cc (resolve_assoc_var): If the target expression has a rank, do not use gfc_expression_rank, since it will return 0 if the function is elemental. Resolution will have produced the correct rank. gcc/testsuite/ PR fortran/118750 * gfortran.dg/associate_72.f90: New test.
-
Jakub Jelinek authored
The following test ICEs on RISC-V at least latently since r14-1622-g99bfdb072e67fa3fe294d86b4b2a9f686f8d9705 which added RISC-V specific case to get_biv_step_1 to recognize also ({zero,sign}_extend:DI (plus:SI op0 op1)) The reason for the ICE is that op1 in this case is CONST_POLY_INT which unlike the really expected VOIDmode CONST_INTs has its own mode and still satisfies CONSTANT_P. GET_MODE (rhs) (SImode) is different from outer_mode (DImode), so the function later does *inner_step = simplify_gen_binary (code, outer_mode, *inner_step, op1); but that obviously ICEs because while *inner_step is either VOIDmode or DImode, op1 has SImode. The following patch fixes it by extending op1 using code so that simplify_gen_binary can handle it. Another option would be to change the !CONSTANT_P (op1) 3 lines above this to !CONST_INT_P (op1), I think it isn't very likely that we get something useful from other constants there. 2025-02-06 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/117506 * loop-iv.cc (get_biv_step_1): For {ZERO,SIGN}_EXTEND of PLUS apply {ZERO,SIGN}_EXTEND to op1. * gcc.dg/pr117506.c: New test. * gcc.target/riscv/pr117506.c: New test.
-
Georg-Johann Lay authored
gcc/ PR target/118768 * config/avr/genmultilib.awk: Parse the AVR_MCU lines in a more robust way w.r.t. white spaces.
-
Lulu Cheng authored
PR target/118561 gcc/ChangeLog: * config/loongarch/loongarch-builtins.cc (loongarch_expand_builtin_lsx_test_branch): NULL_RTX will not be returned when an error is detected. (loongarch_expand_builtin): Likewise. gcc/testsuite/ChangeLog: * gcc.target/loongarch/pr118561.c: New test.
-
Richard Sandiford authored
In this PR, we used to generate:

    .L6:
            mov     v30.16b, v31.16b
            fadd    v31.4s, v31.4s, v27.4s
            fadd    v29.4s, v30.4s, v28.4s
            stp     q30, q29, [x0]
            add     x0, x0, 32
            cmp     x1, x0
            bne     .L6

for an unrolled induction in:

    for (int i = 0; i < 1024; i++)
      {
        arr[i] = freq;
        freq += step;
      }

with the problem being the unnecessary MOV.

The main induction IV was incremented by VF * step == 2 * nunits * step, and then nunits * step was added for the second store to arr.

The original patch for the PR (r14-2367-g224fd59b2dc8) avoided the MOV by incrementing the IV by nunits * step twice. The problem with that approach is that it doubles the loop-carried latency. This change was deliberately not preserved when moving from loop-vect to SLP and so the test started failing again after r15-3509-gd34cda720988.

I think the main problem is that we put the IV increment in the wrong place. Normal IVs created by create_iv are placed before the exit condition where possible, but vectorizable_induction instead always inserted them at the start of the loop body. The only use of the incremented IV is by the phi node, so the effect is to make both the old and new IV values live for the whole loop body, which is why we need the MOV.

The simplest fix therefore seems to be to reuse the create_iv logic.

gcc/
PR tree-optimization/110449
* tree-ssa-loop-manip.h (insert_iv_increment): Declare.
* tree-ssa-loop-manip.cc (insert_iv_increment): New function, split out from... (create_iv): ...here and generalized to gimple_seqs.
* tree-vect-loop.cc (vectorizable_induction): Use standard_iv_increment_position and insert_iv_increment to insert the IV increment.

gcc/testsuite/
PR tree-optimization/110449
* gcc.target/aarch64/pr110449.c: Expect an increment by 8.0, but test that there is no MOV.
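The shape of the unrolled induction described in this message can be modelled in scalar C++. This is a sketch, not vectorizer output: `Buffer`, `fill_scalar`, `fill_unrolled` and `same_contents` are hypothetical names, nunits = 4 is read off the `.4s` operands in the quoted assembly, and the code assumes C++14 or later for the constexpr loops.

```cpp
#include <cstddef>

constexpr std::size_t kN = 1024;          // trip count of the quoted loop
constexpr std::size_t kNunits = 4;        // lanes per vector (assumed from .4s)
constexpr std::size_t kVF = 2 * kNunits;  // vectorization factor after unrolling

struct Buffer {
  float v[kN];
};

// The scalar loop quoted in the commit message.
constexpr Buffer fill_scalar(float freq, float step) {
  Buffer arr{};
  for (std::size_t i = 0; i < kN; i++) {
    arr.v[i] = freq;
    freq += step;
  }
  return arr;
}

// One vector IV advanced by VF * step per iteration; the second store is
// derived from it by adding nunits * step.
constexpr Buffer fill_unrolled(float freq, float step) {
  Buffer arr{};
  float iv[kNunits] = {};
  for (std::size_t lane = 0; lane < kNunits; lane++)
    iv[lane] = freq + static_cast<float>(lane) * step;
  for (std::size_t i = 0; i < kN; i += kVF) {
    for (std::size_t lane = 0; lane < kNunits; lane++) {
      arr.v[i + lane] = iv[lane];  // first store
      arr.v[i + kNunits + lane] =
          iv[lane] + static_cast<float>(kNunits) * step;  // second store
    }
    // The increment sits at the end of the body, next to the exit test,
    // so the old and new IV values are not both live across the body.
    for (std::size_t lane = 0; lane < kNunits; lane++)
      iv[lane] += static_cast<float>(kVF) * step;
  }
  return arr;
}

// Element-wise comparison of two buffers.
constexpr bool same_contents(const Buffer &a, const Buffer &b) {
  for (std::size_t i = 0; i < kN; i++)
    if (a.v[i] != b.v[i])
      return false;
  return true;
}
```

The two functions agree exactly only when every intermediate value is exactly representable (for example freq = 0 and step = 0.5), since the unrolled form reassociates the floating-point additions.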
-
Richard Biener authored
The PR shows fold-mem-offsets taking ages and a lot of memory computing DU/UD chains as that requires the RD problem. The issue is not so much the memory required for the pruned sets but the high CFG connectivity (and that the CFG is cyclic) which makes solving the dataflow problem expensive. The following adds the same limit as the one imposed by GCSE and CPROP. PR rtl-optimization/117922 * fold-mem-offsets.cc (pass_fold_mem_offsets::execute): Do nothing for a highly connected CFG.
-
Richard Biener authored
The vectorizer thinks it can align a vector access to 16 bytes when using a vectorization factor of 8 and 1 byte elements. That of course does not work for the 2nd vector iteration. Apparently we lack a guard against such nonsense. PR tree-optimization/118749 * tree-vect-data-refs.cc (vector_alignment_reachable_p): Pass in the vectorization factor, when that cannot maintain the DRs target alignment do not claim we can reach that by peeling. * gcc.dg/vect/pr118749.c: New testcase.
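The arithmetic behind the example in this message is short enough to write out. The function below is a simplified model under the assumption that each vector iteration advances the address by VF times the element size; it is not the check implemented in vector_alignment_reachable_p, and the name is hypothetical.

```cpp
#include <cstddef>

// Hypothetical model: once the first access has been peeled to the target
// alignment, later vector iterations stay aligned only if the per-iteration
// byte advance is a multiple of that alignment.
constexpr bool peeling_keeps_alignment(std::size_t vf, std::size_t elem_bytes,
                                       std::size_t target_align_bytes) {
  return (vf * elem_bytes) % target_align_bytes == 0;
}
```

For the case in the message, VF 8 with 1-byte elements advances 8 bytes per iteration, so a 16-byte alignment holds for the first vector iteration and is lost in the second.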
-
GCC Administrator authored
-
- Feb 05, 2025
-
-
Jeff Law authored
I was looking at a regression on the bfin port with a recent change to the IRA and stumbled across this just doing a general port health evaluation. The ABS instruction in the blackfin ISA is defined as saturating on INT_MIN, which is a bit unexpected. We certainly can't use it when -fwrapv is enabled. Given the failures on the C23 uabs tests, I'm inclined to just disable the pattern completely. Fixes pr23047, uabs-2 and uabs-3. While it's not a regression, it's the blackfin port, so I think we've got a higher degree of freedom here. Pushing to the trunk. gcc/ * config/bfin/bfin.md (abssi): Disable pattern.
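The mismatch between a saturating ABS and an unsigned absolute value can be shown with two small functions. Both are illustrative models with hypothetical names; that the saturated result is INT_MAX is an assumption about what "saturating on INT_MIN" means here.

```cpp
#include <cstdint>
#include <limits>

// Model of a saturating ABS: INT_MIN maps to INT_MAX (assumed).
constexpr std::int32_t abs_saturating(std::int32_t x) {
  return x == std::numeric_limits<std::int32_t>::min()
             ? std::numeric_limits<std::int32_t>::max()
             : (x < 0 ? -x : x);
}

// Unsigned absolute value: the magnitude of INT_MIN is representable.
constexpr std::uint32_t uabs_model(std::int32_t x) {
  return x < 0 ? 0u - static_cast<std::uint32_t>(x)
               : static_cast<std::uint32_t>(x);
}
```

For INT_MIN the saturating model yields 2147483647 while the unsigned magnitude is 2147483648, so an ABS instruction with saturating semantics cannot stand in for the latter.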
-
Simon Martin authored
We segfault upon the following invalid code

=== cut here ===
template <int>
struct S {
  friend void foo (int a = []{}());
};
void foo (int a) {}
int main () {
  S<0> t;
  foo ();
}
=== cut here ===

The problem is that we end up with a LAMBDA_EXPR callee in set_flags_from_callee, and dereference its NULL_TREE TREE_TYPE (TREE_TYPE (..)).

This patch sets the default argument to error_mark_node and gives a hard error for template class friend functions that do not meet the requirement in C++17 11.3.6/4 (the change is restricted to templates per discussion with Jason).

PR c++/118319
gcc/cp/ChangeLog:
* decl.cc (grokfndecl): Inspect all friend function parameters. If it's not valid for them to have a default value and we're processing a template, set the default value to error_mark_node and give a hard error.

gcc/testsuite/ChangeLog:
* g++.dg/parse/defarg18.C: New test.
* g++.dg/parse/defarg18a.C: New test.
-
Vladimir N. Makarov authored
In this PR case LRA rematerialized a value from an inheritance insn instead of the output reload one. This resulted in considering a rematerialization candidate value available when it was actually not. As a consequence an insn after rematerialization used the unexpected value and this use resulted in an fp exception. The patch fixes this bug. gcc/ChangeLog: PR rtl-optimization/115568 * lra-remat.cc (create_cands): Check that output reload insn is adjacent to given insn. Update a comment. gcc/testsuite/ChangeLog: PR rtl-optimization/115568 * gcc.target/i386/pr115568.c: New.
-
Ian Lance Taylor authored
PR go/118746 * go-gcc.cc (class Gcc_backend): Define builtin_cold, builtin_leaf, builtin_nonnull. Alphabetize constants. (Gcc_backend::Gcc_backend): Update attributes for builtin functions to match builtins.def. (Gcc_backend::define_builtin): Split out attribute setting into set_attributes. (Gcc_backend::set_attributes): New method split out of define_builtin. Support new flag values.
-
Richard Sandiford authored
gcc.target/aarch64/sve/acle/general/ldff1_8.c and gcc.target/aarch64/sve/ptest_1.c were failing because the aarch64 port was giving a zero (unknown) cost to instructions that compute two results in parallel. This was latent until r15-1575-gea8061f46a30, which fixed rtl-ssa to treat zero costs as unknown. A long-standing todo here is to make insn_cost derive costs from md information, rather than having to write a lot of matching code in aarch64_rtx_costs. But that's not something we can do for GCC 15. This patch instead treats the cost of a PARALLEL as being the maximum cost of its constituent sets. I don't like this very much, since it isn't really target-specific behaviour. If it were stage 1, I'd be trying to change pattern_cost instead. gcc/ * config/aarch64/aarch64.cc (aarch64_insn_cost): Give PARALLELs the same cost as the costliest SET.
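The costing rule described in this message, giving a PARALLEL the cost of its costliest SET, reduces to taking a maximum. The sketch below works on plain integers with a hypothetical function name; it does not use GCC's rtx types and assumes C++14 or later.

```cpp
#include <algorithm>
#include <initializer_list>

// Hypothetical model: the cost of a PARALLEL is the maximum cost of its
// constituent sets. An empty list yields 0, which stands for "unknown".
constexpr int parallel_cost(std::initializer_list<int> set_costs) {
  int cost = 0;
  for (int c : set_costs)
    cost = std::max(cost, c);
  return cost;
}
```

Compared with returning 0 for every two-result instruction, this gives such instructions a known, non-zero cost whenever at least one of their sets has one.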
-
Tobias Burnus authored
PR fortran/118740 gcc/fortran/ChangeLog: * openmp.cc (gfc_match_omp_context_selector, match_omp_metadirective): Change sorry to sorry_at and use gfc_current_locus as location. * trans-openmp.cc (gfc_trans_omp_clauses): Likewise, but use n->where. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/append_args-2.f90: Update for line change.
-
Jakub Jelinek authored
Sorry, our CI bot just notified me I broke SPARC build. There are two #ifdef STACK_ADDRESS_OFFSET guarded snippets and the macro is only defined on SPARC target, so I didn't notice there was a syntax error. Fixed thusly. 2025-02-05 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/117239 * cselib.cc (cselib_init): Remove spurious closing paren in the #ifdef STACK_ADDRESS_OFFSET specific code.
-