- Nov 17, 2023
-
-
Jonathan Wakely authored
libstdc++-v3/ChangeLog: * include/bits/chrono_io.h: Fix Doxygen markup.
-
Jakub Jelinek authored
ctz(ext(X)) is the same as ctz(X) when ctz of zero is undefined behavior (it could also hold in the two-argument case on large BITINT_TYPE by preserving the argument, but that is not implemented in this patch), popcount(zext(X)) is the same as popcount(X), parity(zext(X)) is the same as parity(X), parity(sext(X)) is the same as parity(X) provided the bit difference between the extended and unextended types is even, and ffs(ext(X)) is the same as ffs(X). The following patch optimizes those in match.pd when doing so is beneficial: always in the large BITINT_TYPE case, or if the narrower type has an optab and the wider doesn't, or if the wider type is larger than a word and the narrower is one of the standard argument sizes (tested with just int and long long, as long has on most targets the same bitsize as one of those two). Joseph mentioned in the PR that ctz(narrow(X)) is the same as ctz(X) if UB on 0, but that can be handled incrementally (and would need different decisions about when it is profitable). And clz(zext(X)) is clz(X) + bit_difference, but it is not clear we want to change that in match.pd at all; perhaps during insn selection? 2023-11-17 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/112566 PR tree-optimization/83171 * match.pd (ctz(ext(X)) -> ctz(X), popcount(zext(X)) -> popcount(X), parity(ext(X)) -> parity(X), ffs(ext(X)) -> ffs(X)): New simplifications. ( __builtin_ffs (X) == 0 -> X == 0): Use FFS rather than BUILT_IN_FFS BUILT_IN_FFSL BUILT_IN_FFSLL BUILT_IN_FFSIMAX. * gcc.dg/pr112566-1.c: New test. * gcc.dg/pr112566-2.c: New test. * gcc.target/i386/pr78057.c (foo): Pass another long long argument and use it in __builtin_ia32_*zcnt_u64 instead of the int one.
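The identities listed in this message can be checked directly at the source level. The following is a minimal sketch, not part of the patch: the function name `extension_identities_hold` is hypothetical, and it assumes a GCC-compatible compiler providing the `__builtin_*` functions with a 32-bit `int` and a 64-bit `long long`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical checker: returns true if the identities from the commit
// message hold for one 32-bit value widened to 64 bits.
bool extension_identities_hold(std::uint32_t x) {
    const unsigned long long z = x;                      // zext(X)
    const long long s = static_cast<std::int32_t>(x);    // sext(X)
    const unsigned long long su = static_cast<unsigned long long>(s);

    bool ok = true;
    // popcount(zext(X)) == popcount(X): zero extension adds no set bits.
    ok = ok && __builtin_popcountll(z) == __builtin_popcount(x);
    // parity(zext(X)) == parity(X)
    ok = ok && __builtin_parityll(z) == __builtin_parity(x);
    // parity(sext(X)) == parity(X): 64 - 32 = 32 copies of the sign bit
    // are added, which is an even number.
    ok = ok && __builtin_parityll(su) == __builtin_parity(x);
    // ffs(ext(X)) == ffs(X), including X == 0 where both return 0.
    ok = ok && __builtin_ffsll(static_cast<long long>(z)) ==
                   __builtin_ffs(static_cast<int>(x));
    ok = ok && __builtin_ffsll(s) == __builtin_ffs(static_cast<int>(x));
    // ctz(ext(X)) == ctz(X) is only checked for X != 0, because ctz of
    // zero is undefined for these builtins.
    if (x != 0) {
        ok = ok && __builtin_ctzll(z) == __builtin_ctz(x);
        ok = ok && __builtin_ctzll(su) == __builtin_ctz(x);
    }
    return ok;
}
```

Calling it on values such as `0`, `1`, `0x80000000u` and `0xFFFFFFFFu` exercises the zero case, the sign-bit case and the all-ones case.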
-
Jakub Jelinek authored
As mentioned in the PR, the intent of the r14-5076 changes was that it doesn't count one of the uses on the use_stmt, but what actually got implemented is that it does this processing on any op_use_stmt, even if it is not the use_stmt statement, which means that it can increase count even on debug stmts (-fcompare-debug failures), or if there would be some other use stmt with 2+ uses it could count that as a single use. Though, because it fails whenever cnt != 1 and I believe use_stmt must be one of the uses, it would probably fail in the latter case anyway. The following patch fixes that by doing this extra processing only when op_use_stmt is use_stmt, and using the normal processing otherwise (so ignore debug stmts, and increase on any uses on the stmt). 2023-11-17 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/112374 * tree-vect-loop.cc (check_reduction_path): Perform the cond_fn_p special case only if op_use_stmt == use_stmt, use as_a rather than dyn_cast in that case. * gcc.dg/pr112374-1.c: New test. * gcc.dg/pr112374-2.c: New test. * g++.dg/opt/pr112374.C: New test.
-
Richard Biener authored
The offending commit r14-5444-g5ea2965b499f9e was reverted. The following adds a testcase. PR tree-optimization/112585 * gcc.dg/torture/pr112585.c: New testcase.
-
Richard Biener authored
This reverts commit 5ea2965b.
-
Tobias Burnus authored
This patch accepts -std=f2023, uses it by default, and for free-form source raises the line length to 10,000 and the statement length, that is, the number of continuation lines, to unlimited. gcc/fortran/ChangeLog: * gfortran.texi (_gfortran_set_options): Document GFC_STD_F2023. * invoke.texi (std,pedantic,Wampersand,Wtabs): Add -std=2023. * lang.opt (std=f2023): Add. * libgfortran.h (GFC_STD_F2023, GFC_STD_OPT_F23): Add. * options.cc (set_default_std_flags): Add GFC_STD_F2023. (gfc_init_options): Set max_continue_free to 1,000,000. (gfc_post_options): Set flag_free_line_length if unset. (gfc_handle_option): Add OPT_std_f2023, set max_continue_free = 255 for -std=f2003, f2008 and f2018. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/warn_truncated.f90: Add -std=f2018 option. * gfortran.dg/gomp/warn_truncated.f90: Likewise. * gfortran.dg/line_length_10.f90: Likewise. * gfortran.dg/line_length_11.f90: Likewise. * gfortran.dg/line_length_2.f90: Likewise. * gfortran.dg/line_length_5.f90: Likewise. * gfortran.dg/line_length_6.f90: Likewise. * gfortran.dg/line_length_7.f90: Likewise. * gfortran.dg/line_length_8.f90: Likewise. * gfortran.dg/line_length_9.f90: Likewise. * gfortran.dg/continuation_17.f90: New test. * gfortran.dg/continuation_18.f90: New test. * gfortran.dg/continuation_19.f: New test. * gfortran.dg/line_length_12.f90: New test. * gfortran.dg/line_length_13.f90: New test.
-
Georg-Johann Lay authored
gcc/ PR target/53372 * config/avr/avr.cc (avr_asm_named_section) [AVR_SECTION_PROGMEM]: Only return some .progmem*.data section if the user did not specify a section attribute. (avr_section_type_flags) [avr_progmem_p]: Unset SECTION_NOTYPE in returned section flags. gcc/testsuite/ PR target/53372 * gcc.target/avr/pr53372-1.c: New test. * gcc.target/avr/pr53372-2.c: New test.
-
Francois-Xavier Coudert authored
We introduced in commit a0673ec5 some noisy messages, which clutter output with things like: dg set al ... revised FFLAGS ... and are not really useful information. Let's remove them. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/caf.exp: Remove some output. * gfortran.dg/dg.exp: Remove some output.
-
Xi Ruoyao authored
With LSX or LASX, copysign (x[i], -1) (or any negative constant) can be vectorized using [x]vbitseti.{w/d} instructions to directly set the sign bits. Inspired by Tamar Christina's "AArch64: Handle copysign (x, -1) expansion efficiently" (r14-5289). gcc/ChangeLog: * config/loongarch/lsx.md (copysign<mode>3): Allow operand[2] to be a reg_or_vector_same_val_operand. If it's a const vector with same negative elements, expand the copysign with a bitset instruction. Otherwise, force it into a register. * config/loongarch/lasx.md (copysign<mode>3): Likewise. gcc/testsuite/ChangeLog: * g++.target/loongarch/vect-copysign-negconst.C: New test. * g++.target/loongarch/vect-copysign-negconst-run.C: New test.
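The reason a bit-set instruction suffices is that copysign with a negative constant only forces the sign bit to 1. A minimal scalar sketch of that equivalence follows; the helper names `set_sign_bit` and `negate_magnitudes` are hypothetical, and the code assumes `float` is IEEE 754 binary32 with the sign in bit 31.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: force the sign bit of one float, which is what a
// per-lane "set bit 31" instruction would do.
float set_sign_bit(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined type punning
    bits |= 0x80000000u;                  // assumed sign-bit position
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// The loop shape the commit message describes: copysign (x[i], -1).
void negate_magnitudes(float* x, int n) {
    for (int i = 0; i < n; ++i)
        x[i] = std::copysign(x[i], -1.0f);
}
```

Both functions produce the same value for every element, including positive zero, which becomes negative zero.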
-
Haochen Gui authored
The previous patch enables 16-byte by-pieces move. Originally the 16-byte move was implemented via a pattern, and expand_block_move did an optimization on P8 LE to leverage V2DI reversed load/store for memory-to-memory moves. Now the 16-byte move is implemented via by-pieces move and finally split into two DI load/stores. This patch creates an insn_and_split pattern to restore the optimization. gcc/ PR target/111449 * config/rs6000/vsx.md (*vsx_le_mem_to_mem_mov_ti): New. gcc/testsuite/ PR target/111449 * gcc.target/powerpc/pr111449-2.c: New.
-
Haochen Gui authored
This patch adds a new expand pattern, cbranchv16qi4, to enable vector-mode by-pieces equality compare on rs6000. The macro MOVE_MAX_PIECES (COMPARE_MAX_PIECES) is set to 16 bytes when EFFICIENT_UNALIGNED_VSX is enabled and otherwise stays unchanged. The macro STORE_MAX_PIECES is set to the same value as MOVE_MAX_PIECES by default, so it is now explicitly defined to keep it unchanged. gcc/ PR target/111449 * config/rs6000/altivec.md (cbranchv16qi4): New expand pattern. * config/rs6000/rs6000.cc (rs6000_generate_compare): Generate insn sequence for V16QImode equality compare. * config/rs6000/rs6000.h (MOVE_MAX_PIECES): Define. (STORE_MAX_PIECES): Define. gcc/testsuite/ PR target/111449 * gcc.target/powerpc/pr111449-1.c: New. * gcc.dg/tree-ssa/sra-17.c: Add additional options for 32-bit powerpc. * gcc.dg/tree-ssa/sra-18.c: Likewise.
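As an illustration of the kind of source a by-pieces equality compare applies to, here is a minimal sketch; the function name `equal16` is hypothetical and the contents of the actual test case are not reproduced here. Only the equality result of a fixed 16-byte `memcmp` is used, which is what allows a compiler to expand the call inline as a single vector compare where the target supports it.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical example: a fixed-size, equality-only comparison.
// Whether this is expanded inline depends on target and options.
bool equal16(const void* a, const void* b) {
    return std::memcmp(a, b, 16) == 0;
}
```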
-
Li Wei authored
LoongArch defines ctz and clz in the backend, but for GCC to do the CTZ transformation in the forwprop2 pass it needs to know the value of c[lt]z at zero. This may be beneficial for some test cases (like SPEC2017 deepsjeng_r). After implementing the macros, we measured the dynamic instruction count on deepsjeng_r: - before 1688423249186 - after 1660311215745 (1.66% reduction) gcc/ChangeLog: * config/loongarch/loongarch.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement. (CTZ_DEFINED_VALUE_AT_ZERO): Same. gcc/testsuite/ChangeLog: * gcc.dg/pr90838.c: add clz/ctz test support on LoongArch.
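To see why the value at zero matters, consider a table-based count-trailing-zeros idiom of the sort such a transformation can recognize. This is a sketch of the well-known de Bruijn multiplication technique, not code taken from the patch or its test; the function name `debruijn_ctz` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Table-based count-trailing-zeros using a de Bruijn multiplication.
// For x == 0 the isolated bit is 0, so the lookup yields table[0] == 0.
// Replacing this function with a hardware ctz is therefore only safe if
// the compiler knows what the instruction produces for a zero input.
int debruijn_ctz(std::uint32_t x) {
    static const unsigned char table[32] = {
        0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
        31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9};
    const std::uint32_t lowest = x & (0u - x);  // isolate lowest set bit
    return table[(lowest * 0x077CB531u) >> 27];
}
```

For every nonzero input the result matches `__builtin_ctz`; the zero input is the case the target macros describe.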
-
Richard Biener authored
We have a support case that shows GCC 7 sometimes creates DW_TAG_label referring to itself via a DW_AT_abstract_origin when using LTO. This for example triggers the sanity check added below during LTO bootstrap. Making this check cover more than just DW_AT_abstract_origin breaks bootstrap on trunk for /* GNU extension: Record what type our vtable lives in. */ if (TYPE_VFIELD (type)) { tree vtype = DECL_FCONTEXT (TYPE_VFIELD (type)); gen_type_die (vtype, context_die); add_AT_die_ref (type_die, DW_AT_containing_type, lookup_type_die (vtype)); so the check is for now restricted to DW_AT_abstract_origin and DW_AT_specification both of which we follow within get_AT. * dwarf2out.cc (add_AT_die_ref): Assert we do not add a self-ref DW_AT_abstract_origin or DW_AT_specification.
-
Jiahao Xu authored
Based on SPEC2017 performance evaluation results, it's better to make them equal to the cost of unaligned store/load so as to avoid odd alignment peeling. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_builtin_vectorization_cost): Adjust.
-
Jiahao Xu authored
These tests failed when they were first added; this patch adjusts the scan-assembler-times counts to fix them. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vector/lasx/lasx-vcond-1.c: Adjust assembler times. * gcc.target/loongarch/vector/lasx/lasx-vcond-2.c: Ditto. * gcc.target/loongarch/vector/lsx/lsx-vcond-1.c: Ditto. * gcc.target/loongarch/vector/lsx/lsx-vcond-2.c: Ditto.
-
GCC Administrator authored
-
- Nov 16, 2023
-
-
Andrew Pinski authored
Only allow (copysign x, NEG_CONST) -> (fneg (fabs x)) simplification for constant folding [PR112483] On targets with native copysign instructions, (copysign x, -1) is usually more efficient than (fneg (fabs x)). Since r14-5284, in the middle end we always optimize (fneg (fabs x)) to (copysign x, -1), not vice versa. If the target does not support native fcopysign, expand_COPYSIGN will expand it as (fneg (fabs x)) anyway. gcc/ChangeLog: PR rtl-optimization/112483 * simplify-rtx.cc (simplify_binary_operation_1) <case COPYSIGN>: Call simplify_unary_operation for NEG instead of simplify_gen_unary.
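The two forms discussed here compute the same value, which is why either can stand in for the other. A minimal source-level sketch of that equivalence (the function names are hypothetical, and this says nothing about which RTL form a given target prefers):

```cpp
#include <cassert>
#include <cmath>

// (copysign x, -1): keep the magnitude of x, take the sign of -1.
double via_copysign(double x) { return std::copysign(x, -1.0); }

// (fneg (fabs x)): clear the sign, then negate.
double via_neg_abs(double x) { return -std::fabs(x); }
```

Both return a value with the sign bit set for any finite input, including zero.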
-
Eric Botcazou authored
gcc/testsuite/ * gnat.dg/varsize4.adb (Func): Initialize Byte_Read parameter.
-
Edwin Lu authored
Fix __riscv_unaligned_fast/slow/avoid macro name to __riscv_misaligned_fast/slow/avoid to be consistent with the RISC-V API Spec PR target/111557 gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): update macro name gcc/testsuite/ChangeLog: * gcc.target/riscv/attribute-1.c: update macro name * gcc.target/riscv/attribute-4.c: ditto * gcc.target/riscv/attribute-5.c: ditto * gcc.target/riscv/predef-align-1.c: ditto * gcc.target/riscv/predef-align-2.c: ditto * gcc.target/riscv/predef-align-3.c: ditto * gcc.target/riscv/predef-align-4.c: ditto * gcc.target/riscv/predef-align-5.c: ditto * gcc.target/riscv/predef-align-6.c: ditto Signed-off-by:
Edwin Lu <ewlu@rivosinc.com>
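User code can branch on the renamed macros at preprocessing time. The sketch below uses only the three macro names given in this message; the helper names `load_u32` and `misaligned_access_hint` are hypothetical, and on non-RISC-V targets or older compilers none of the macros is defined.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit value from a possibly misaligned
// address. memcpy is valid regardless of alignment; the compiler picks
// the actual instruction sequence.
std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Reports which of the renamed macros, if any, the compiler predefined.
const char* misaligned_access_hint() {
#if defined(__riscv_misaligned_fast)
    return "fast";
#elif defined(__riscv_misaligned_slow)
    return "slow";
#elif defined(__riscv_misaligned_avoid)
    return "avoid";
#else
    return "unknown";  // not RISC-V, or a compiler without these macros
#endif
}
```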
-
Uros Bizjak authored
Sometimes the compiler emits the following code with <insn>qi_ext<mode>_0: shrl $8, %eax addb %bh, %al Patch introduces new low part QImode insn patterns with both of their input arguments extracted from high register. This invalid insn is split after reload to a move from the high register and <insn>qi_ext<mode>_0 instruction. The combine pass is able to convert shift to zero/sign-extract sub-RTX, which we split to the optimal: movzbl %bh, %edx addb %ah, %dl PR target/78904 gcc/ChangeLog: * config/i386/i386.md (*addqi_ext2<mode>_0): New define_insn_and_split pattern. (*subqi_ext2<mode>_0): Ditto. (*<code>qi_ext2<mode>_0): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr78904-10.c: New test. * gcc.target/i386/pr78904-10a.c: New test. * gcc.target/i386/pr78904-10b.c: New test.
-
John David Anglin authored
In analyzing PR rtl-optimization/112415, I realized that restricting REG+D offsets to 5-bits before reload results in very poor code and complexities in optimizing these instructions after reload. The general problem is long displacements are not allowed for floating point accesses when generating PA 1.1 code. Even with PA 2.0, there is an ELF linker bug that prevents using long displacements for floating point loads and stores. In the past, enabling long displacements before reload caused issues in reload. However, there have been fixes in the handling of reloads for floating-point accesses. This change allows long displacements before reload and corrects a couple of issues in the constraint handling for integer and floating-point accesses. 2023-11-16 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: PR rtl-optimization/112415 * config/pa/pa.cc (pa_legitimate_address_p): Allow 14-bit displacements before reload. Simplify logic flow. Revise comments. * config/pa/pa.h (TARGET_ELF64): New define. (INT14_OK_STRICT): Update define and comment. * config/pa/pa64-linux.h (TARGET_ELF64): Define. * config/pa/predicates.md (base14_operand): Don't check alignment of short displacements. (integer_store_memory_operand): Don't return true when reload_in_progress is true. Remove INT_5_BITS check. (floating_point_store_memory_operand): Don't return true when reload_in_progress is true. Use INT14_OK_STRICT to check whether long displacements are always okay.
-
Eric Botcazou authored
This is a tree sharing issue for the internal return type synthesized for a function returning a dynamically-sized type and taking an Out or In/Out parameter passed by copy. gcc/ada/ * gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Also create a TYPE_DECL for the return type built for the CI/CO mechanism. gcc/testsuite/ * gnat.dg/varsize4.ads, gnat.dg/varsize4.adb: New test. * gnat.dg/varsize4_pkg.ads: New helper.
-
Jonathan Wakely authored
The formatter for std::thread::id should default to right-align, and the formatter for std::stacktrace_entry should not just ignore the fill-and-align and width from the format-spec! libstdc++-v3/ChangeLog: PR libstdc++/112564 * include/std/stacktrace (formatter::format): Format according to format-spec. * include/std/thread (formatter::format): Use _Align_right as default. * testsuite/19_diagnostics/stacktrace/output.cc: Check fill-and-align handling. Change compile test to run. * testsuite/30_threads/thread/id/output.cc: Check fill-and-align handling.
-
Michal Jires authored
ChangeLog: * MAINTAINERS: Add myself.
-
Jakub Jelinek authored
check_field_decls for DECL_C_BIT_FIELD FIELD_DECLs with error_mark_node TREE_TYPE continues early and doesn't call check_bitfield_decl which would either set DECL_BIT_FIELD, or clear DECL_C_BIT_FIELD. So, the following testcase ICEs after emitting tons of errors, because SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD asserts DECL_BIT_FIELD. The patch skips that for FIELD_DECLs with error_mark_node, another option would be to check DECL_BIT_FIELD in addition to DECL_C_BIT_FIELD. 2023-11-16 Jakub Jelinek <jakub@redhat.com> PR c++/112365 * class.cc (layout_class_type): Don't SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD on FIELD_DECLs with error_mark_node type. * g++.dg/cpp0x/pr112365.C: New test.
-
Uros Bizjak authored
Also fix some indentation inconsistencies. PR target/112567 gcc/ChangeLog: * config/i386/i386.md (*<any_logic:code>qi_ext<mode>_1_slp): Fix generation of invalid RTX in split pattern.
-
Patrick Palka authored
Both of these PRs are fixed by r12-1403-gc4e50e500da7692a. PR c++/98614 PR c++/104802 gcc/testsuite/ChangeLog: * g++.dg/cpp1z/nontype-auto22.C: New test. * g++.dg/cpp2a/concepts-partial-spec14.C: New test.
-
Patrick Palka authored
potential_constant_expression for CALL_EXPR tests FUNCTION_POINTER_TYPE_P on the callee rather than on the type of the callee, which means we always pass want_rval=any when recursing and so may fail to identify a non-constant function pointer callee as such. Fixing this turns out to further work around PR111703. PR c++/111703 PR c++/107939 gcc/cp/ChangeLog: * constexpr.cc (potential_constant_expression_1) <case CALL_EXPR>: Fix FUNCTION_POINTER_TYPE_P test. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-fn8.C: Extend test. * g++.dg/diagnostic/constexpr4.C: New test.
-
David Malcolm authored
No functional change intended. gcc/ChangeLog: * diagnostic.cc (diagnostic_context::set_option_hooks): Add "lang_mask" param. * diagnostic.h (diagnostic_context::option_enabled_p): Update for move of m_lang_mask. (diagnostic_context::set_option_hooks): Add "lang_mask" param. (diagnostic_context::get_lang_mask): New. (diagnostic_context::m_lang_mask): Move into m_option_callbacks, thus making private. * lto-wrapper.cc (main): Update for new lang_mask param of set_option_hooks. * toplev.cc (init_asm_output): Use get_lang_mask. (general_init): Move initialization of global_dc's lang_mask to new lang_mask param of set_option_hooks. Signed-off-by:
David Malcolm <dmalcolm@redhat.com>
-
Tamar Christina authored
Before my refactoring if the loop->latch was incorrect then find_loop_location skipped checking the edges and would eventually return a dummy location. It turns out that a loop can have loops_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS) but also not have a latch in which case get_loop_exit_edges traps. This restores the old behavior. gcc/ChangeLog: PR tree-optimization/111878 * tree-vect-loop-manip.cc (find_loop_location): Skip edges check if latch incorrect. gcc/testsuite/ChangeLog: PR tree-optimization/111878 * gcc.dg/graphite/pr111878.c: New test.
-
Florian Weimer authored
gcc/testsuite/ * gcc.c-torture/execute/931004-13.c (main): Fix mistakenly swapped int/void types.
-
Kito Cheng authored
The target attribute proposed in [1] allows the user to specify local settings on a per-function basis. The syntax of the target attribute is `__attribute__((target("<ATTR-STRING>")))`, and the syntax of `<ATTR-STRING>` is described below:
```
ATTR-STRING := ATTR-STRING ';' ATTR
             | ATTR

ATTR        := ARCH-ATTR
             | CPU-ATTR
             | TUNE-ATTR

ARCH-ATTR   := 'arch=' EXTENSIONS-OR-FULLARCH

EXTENSIONS-OR-FULLARCH := <EXTENSIONS>
                        | <FULLARCHSTR>

EXTENSIONS             := <EXTENSION> ',' <EXTENSIONS>
                        | <EXTENSION>

FULLARCHSTR            := <full-arch-string>

EXTENSION              := <OP> <EXTENSION-NAME> <VERSION>

OP                     := '+'

VERSION                := [0-9]+ 'p' [0-9]+
                        | [1-9][0-9]*
                        |

EXTENSION-NAME         := Naming rule is defined in RISC-V ISA manual

CPU-ATTR    := 'cpu=' <valid-cpu-name>
TUNE-ATTR   := 'tune=' <valid-tune-name>
```
Changes since v1: - Use std::unique_ptr rather than alloca to prevent memory issue. - Error rather than warning when attribute duplicated. [1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/35 gcc/ChangeLog: * config.gcc (riscv): Add riscv-target-attr.o. * config/riscv/riscv-protos.h (riscv_declare_function_size) New. (riscv_option_valid_attribute_p): New. (riscv_override_options_internal): New. (struct riscv_tune_info): New. (riscv_parse_tune): New. * config/riscv/riscv-target-attr.cc (class riscv_target_attr_parser): New. (struct riscv_attribute_info): New. (riscv_attributes): New. (riscv_target_attr_parser::parse_arch): New. (riscv_target_attr_parser::handle_arch): New. (riscv_target_attr_parser::handle_cpu): New. (riscv_target_attr_parser::handle_tune): New. (riscv_target_attr_parser::update_settings): New. (riscv_process_one_target_attr): New. (num_occurences_in_str): New. (riscv_process_target_attr): New. (riscv_option_valid_attribute_p): New. * config/riscv/riscv.cc: Include target-globals.h and riscv-subset.h. (struct riscv_tune_info): Move to riscv-protos.h. (get_tune_str): New. (riscv_parse_tune): New parameter null_p. (riscv_declare_function_size): New. 
(riscv_option_override): Build target_option_default_node and target_option_current_node. (riscv_save_restore_target_globals): New. (riscv_option_restore): New. (riscv_previous_fndecl): New. (riscv_set_current_function): Apply the target attribute. (TARGET_OPTION_RESTORE): Define. (TARGET_OPTION_VALID_ATTRIBUTE_P): Ditto. * config/riscv/riscv.h (SWITCHABLE_TARGET): Define to 1. (ASM_DECLARE_FUNCTION_SIZE) Define. * config/riscv/riscv.opt (mtune=): Add Save attribute. (mcpu=): Ditto. (mcmodel=): Ditto. * config/riscv/t-riscv: Add build rule for riscv-target-attr.o * doc/extend.texi: Add doc for target attribute. gcc/testsuite/ChangeLog: * gcc.target/riscv/target-attr-01.c: New. * gcc.target/riscv/target-attr-02.c: Ditto. * gcc.target/riscv/target-attr-03.c: Ditto. * gcc.target/riscv/target-attr-04.c: Ditto. * gcc.target/riscv/target-attr-05.c: Ditto. * gcc.target/riscv/target-attr-06.c: Ditto. * gcc.target/riscv/target-attr-07.c: Ditto. * gcc.target/riscv/target-attr-bad-01.c: Ditto. * gcc.target/riscv/target-attr-bad-02.c: Ditto. * gcc.target/riscv/target-attr-bad-03.c: Ditto. * gcc.target/riscv/target-attr-bad-04.c: Ditto. * gcc.target/riscv/target-attr-bad-05.c: Ditto. * gcc.target/riscv/target-attr-bad-06.c: Ditto. * gcc.target/riscv/target-attr-bad-07.c: Ditto. * gcc.target/riscv/target-attr-bad-08.c: Ditto. * gcc.target/riscv/target-attr-bad-09.c: Ditto. * gcc.target/riscv/target-attr-bad-10.c: Ditto. Reviewed-by:
Christoph Müllner <christoph.muellner@vrull.eu>
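A minimal sketch of how the attribute might be applied, following the grammar in this message (`'arch='` followed by `'+'` and an extension name with an empty version). The macro name `WITH_VECTOR_EXT`, the function `sum` and the choice of the `v` extension are illustrative assumptions; the guard keeps the attribute away from other targets and from compilers that predate this change, where the string would be rejected.

```cpp
#include <cassert>

// Assumption: only GCC 14 or later on RISC-V understands this string.
#if defined(__riscv) && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 14
#define WITH_VECTOR_EXT __attribute__((target("arch=+v")))
#else
#define WITH_VECTOR_EXT  // expands to nothing elsewhere
#endif

// Hypothetical function compiled with the extra extension enabled,
// independent of the command-line -march setting.
WITH_VECTOR_EXT
int sum(const int* data, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += data[i];
    return total;
}
```

Per the grammar, several settings would be separated by `;` inside one string, with `cpu=` and `tune=` taking names valid for the corresponding command-line options.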
-
Kito Cheng authored
ra is now a fixed register, but we still need to save/restore it in the prologue/epilogue if it has been used. gcc/ChangeLog: PR target/112478 * config/riscv/riscv.cc (riscv_save_return_addr_reg_p): Check ra is ever lived. gcc/testsuite/ChangeLog: PR target/112478 * gcc.target/riscv/pr112478.c: New. Reviewed-by:
Christoph Müllner <christoph.muellner@vrull.eu> Tested-by:
Christoph Müllner <christoph.muellner@vrull.eu>
-
liuhongt authored
The newly added splitter will generate (insn 58 56 59 2 (set (reg:V4HI 20 xmm0 [129]) (vec_duplicate:V4HI (reg:HI 22 xmm2 [123]))) "testcase.c":16:21 -1 but we only have (define_insn "*vec_dupv4hi" [(set (match_operand:V4HI 0 "register_operand" "=y,Yw") (vec_duplicate:V4HI (truncate:HI (match_operand:SI 1 "register_operand" "0,Yw"))))] The patch adds patterns for V4HI and V2HI. gcc/ChangeLog: PR target/112532 * config/i386/mmx.md (*vec_dup<mode>): Extend for V4HI and V2HI. gcc/testsuite/ChangeLog: * gcc.target/i386/pr112532.c: New test.
-
Jonathan Wakely authored
This implements the changes from P1132R8, including optimized paths for std::shared_ptr and std::unique_ptr. For std::shared_ptr we pre-allocate a new control block in the std::out_ptr_t constructor so that the destructor is non-throwing. This requires some care because unlike the shared_ptr(Y*, D, A) constructor, we don't want to invoke the deleter if allocating the control block throws, because we don't own any pointer yet. In order to avoid the unwanted deleter invocation, we create the control block manually. We also want to avoid invoking the deleter on a null pointer on destruction, so we destroy the control block manually if there is no pointer to take ownership of. For std::unique_ptr and for raw pointers, the out_ptr_t object hands out direct access to the pointer, so that we don't have anything to do (except possibly assign a new deleter) in the ~out_ptr_t destructor. These optimizations avoid requiring additional temporary storage for the pointer (and optional arguments), and avoid additional instructions to copy that pointer into the smart pointer at the end. libstdc++-v3/ChangeLog: PR libstdc++/111667 * include/Makefile.am: Add new header. * include/Makefile.in: Regenerate. * include/bits/out_ptr.h: New file. * include/bits/shared_ptr.h (__is_shared_ptr): Move definition to here ... * include/bits/shared_ptr_atomic.h (__is_shared_ptr): ... from here. * include/bits/shared_ptr_base.h (__shared_count): Declare out_ptr_t as a friend. (_Sp_counted_deleter, __shared_ptr): Likewise. * include/bits/unique_ptr.h (unique_ptr, unique_ptr<T[], D>): Declare out_ptr_t and inout_ptr_t as friends. (__is_unique_ptr): Define new variable template. * include/bits/version.def (out_ptr): Define. * include/bits/version.h: Regenerate. * include/std/memory: Include new header. * testsuite/20_util/smartptr.adapt/inout_ptr/1.cc: New test. * testsuite/20_util/smartptr.adapt/inout_ptr/2.cc: New test. * testsuite/20_util/smartptr.adapt/inout_ptr/shared_ptr_neg.cc: New test. 
* testsuite/20_util/smartptr.adapt/inout_ptr/void_ptr.cc: New test. * testsuite/20_util/smartptr.adapt/out_ptr/1.cc: New test. * testsuite/20_util/smartptr.adapt/out_ptr/2.cc: New test. * testsuite/20_util/smartptr.adapt/out_ptr/shared_ptr_neg.cc: New test. * testsuite/20_util/smartptr.adapt/out_ptr/void_ptr.cc: New test.
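A minimal usage sketch of what std::out_ptr is for: letting a smart pointer receive ownership from a C-style function that returns a pointer through an out-parameter. The names `create_value` and `read_created_value` are hypothetical. The fallback branch keeps the example compiling where the library does not yet provide `std::out_ptr`, and it also shows the temporary raw pointer that the adaptor removes.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API that hands back a heap object through an
// out-parameter and returns a status code.
int create_value(int** out) {
    *out = new int(42);
    return 0;  // 0 means success
}

int read_created_value() {
    std::unique_ptr<int> owner;
#if defined(__cpp_lib_out_ptr)
    // The temporary adaptor converts to int** and transfers ownership
    // to 'owner' when it is destroyed at the end of the expression.
    create_value(std::out_ptr(owner));
#else
    // Equivalent manual form for libraries without std::out_ptr.
    int* raw = nullptr;
    create_value(&raw);
    owner.reset(raw);
#endif
    return *owner;
}
```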
-
Jonathan Wakely authored
This change moves the definitions of feature test macros (or strictly speaking, the requests for <bits/version.h> to define them) so that only standard headers define them. For example, <bits/shared_ptr.h> will no longer define macros related to std::shared_ptr, only <memory> and <version> will define them. This means that __cpp_lib_shared_ptr_arrays will not be defined by <future> or by other headers that include <bits/shared_ptr.h>. It will only be defined when <memory> has been included. This will discourage users from relying on transitive includes. As a result, internal headers that need to query the macros should use the internal macros like __glibcxx_shared_ptr_arrays instead of __cpp_lib_shared_ptr_arrays, as those internal macros are defined by the internal headers after including <bits/version.h>. There are some exceptions to this rule, because __cpp_lib_is_constant_evaluated is defined by bits/c++config.h and so is available everywhere, and __cpp_lib_three_way_comparison is defined by <compare> which several headers are explicitly specified to include, so its macro is guaranteed to be usable too. N.B. not many internal headers actually need an explicit include of <bits/version.h>, because most of them include <type_traits> and so get all the __glibcxx_foo internal macros from there. libstdc++-v3/ChangeLog: * include/bits/algorithmfwd.h: Do not define standard feature test macro here. * include/bits/align.h: Likewise. Test internal macros instead of standard macros. * include/bits/alloc_traits.h: Likewise. * include/bits/allocator.h: Likewise. * include/bits/atomic_base.h: Likewise. * include/bits/atomic_timed_wait.h: Likewise. * include/bits/atomic_wait.h: Likewise. * include/bits/basic_string.h: Likewise. * include/bits/basic_string.tcc: Likewise. * include/bits/char_traits.h: Likewise. * include/bits/chrono.h: Likewise. * include/bits/cow_string.h: Likewise. * include/bits/forward_list.h: Likewise. * include/bits/hashtable.h: Likewise. 
* include/bits/ios_base.h: Likewise. * include/bits/memory_resource.h: Likewise. * include/bits/move.h: Likewise. * include/bits/move_only_function.h: Likewise. * include/bits/node_handle.h: Likewise. * include/bits/ptr_traits.h: Likewise. * include/bits/range_access.h: Likewise. * include/bits/ranges_algo.h: Likewise. * include/bits/ranges_cmp.h: Likewise. * include/bits/ranges_util.h: Likewise. * include/bits/semaphore_base.h: Likewise. * include/bits/shared_ptr.h: Likewise. * include/bits/shared_ptr_atomic.h: Likewise. * include/bits/shared_ptr_base.h: Likewise. * include/bits/stl_algo.h: Likewise. * include/bits/stl_algobase.h: Likewise. * include/bits/stl_function.h: Likewise. * include/bits/stl_iterator.h: Likewise. * include/bits/stl_list.h: Likewise. * include/bits/stl_map.h: Likewise. * include/bits/stl_pair.h: Likewise. * include/bits/stl_queue.h: Likewise. * include/bits/stl_stack.h: Likewise. * include/bits/stl_tree.h: Likewise. * include/bits/stl_uninitialized.h: Likewise. * include/bits/stl_vector.h: Likewise. * include/bits/unique_ptr.h: Likewise. * include/bits/unordered_map.h: Likewise. * include/bits/uses_allocator_args.h: Likewise. * include/bits/utility.h: Likewise. * include/bits/erase_if.h: Add comment. * include/std/algorithm: Define standard feature test macros here. * include/std/atomic: Likewise. * include/std/array: Likewise. * include/std/chrono: Likewise. * include/std/condition_variable: Likewise. * include/std/deque: Likewise. * include/std/format: Likewise. * include/std/functional: Likewise. * include/std/forward_list: Likewise. * include/std/ios: Likewise. * include/std/iterator: Likewise. * include/std/list: Likewise. * include/std/map: Likewise. * include/std/memory: Likewise. * include/std/numeric: Likewise. * include/std/queue: Likewise. * include/std/ranges: Likewise. * include/std/regex: Likewise. * include/std/set: Likewise. * include/std/stack: Likewise. * include/std/stop_token: Likewise. * include/std/string: Likewise. 
* include/std/string_view: * include/std/tuple: Likewise. * include/std/unordered_map: * include/std/unordered_set: * include/std/utility: Likewise. * include/std/vector: Likewise. * include/std/scoped_allocator: Query internal macros instead of standard macros.
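From the user's side, the practical consequence is to include the standard header that owns a feature before testing its macro, rather than relying on another header pulling it in. A minimal sketch under that reading; the function name `sum_of_squares` is hypothetical.

```cpp
#include <cassert>
#include <memory>  // the header documented to define __cpp_lib_shared_ptr_arrays

// Uses shared_ptr<int[]> when the library advertises support for it,
// and falls back to unique_ptr<int[]> otherwise.
int sum_of_squares(int n) {
#if defined(__cpp_lib_shared_ptr_arrays)
    std::shared_ptr<int[]> buf(new int[n]);
#else
    std::unique_ptr<int[]> buf(new int[n]);
#endif
    for (int i = 0; i < n; ++i)
        buf[i] = i * i;
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += buf[i];
    return total;
}
```

Had the test been written after including only `<future>`, the macro check would, after this change, take the fallback branch even though the feature is available.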
-
Jonathan Wakely authored
Tests which check for feature test macros should use the no_pch option, so that we're really testing for the definition being in the intended header, and not just testing that it's present in <bits/stdc++.h> (which includes all the standard headers and so defines all the macros). libstdc++-v3/ChangeLog: * testsuite/18_support/byte/requirements.cc: Disable PCH. * testsuite/18_support/destroying_delete.cc: Likewise. * testsuite/18_support/source_location/1.cc: Likewise. * testsuite/18_support/source_location/version.cc: Likewise. * testsuite/18_support/type_info/constexpr.cc: Likewise. * testsuite/18_support/uncaught_exceptions/uncaught_exceptions.cc: Likewise. * testsuite/19_diagnostics/stacktrace/output.cc: Likewise. * testsuite/19_diagnostics/stacktrace/synopsis.cc: Likewise. * testsuite/19_diagnostics/stacktrace/version.cc: Likewise. * testsuite/20_util/addressof/requirements/constexpr.cc: Likewise. * testsuite/20_util/allocator_traits/header-2.cc: Likewise. * testsuite/20_util/allocator_traits/header.cc: Likewise. * testsuite/20_util/as_const/1.cc: Likewise. Likewise. * testsuite/20_util/bitset/cons/constexpr_c++23.cc: Likewise. * testsuite/20_util/bitset/version.cc: Likewise. * testsuite/20_util/duration/arithmetic/constexpr_c++17.cc: Likewise. * testsuite/20_util/duration_cast/rounding.cc: Likewise. * testsuite/20_util/enable_shared_from_this/members/weak_from_this.cc: Likewise. * testsuite/20_util/exchange/constexpr.cc: Likewise. * testsuite/20_util/expected/synopsis.cc: Likewise. * testsuite/20_util/expected/version.cc: Likewise. * testsuite/20_util/function_objects/bind_front/1.cc: Likewise. * testsuite/20_util/function_objects/bind_front/2.cc: Likewise. * testsuite/20_util/function_objects/invoke/3.cc: Likewise. * testsuite/20_util/function_objects/invoke/4.cc: Likewise. * testsuite/20_util/function_objects/invoke/constexpr.cc: Likewise. * testsuite/20_util/function_objects/invoke/version.cc: Likewise. 
* testsuite/20_util/function_objects/searchers.cc: Likewise. * testsuite/20_util/integer_comparisons/1.cc: Likewise. * testsuite/20_util/integer_comparisons/2.cc: Likewise. * testsuite/20_util/is_bounded_array/value.cc: Likewise. * testsuite/20_util/is_layout_compatible/value.cc: Likewise. * testsuite/20_util/is_layout_compatible/version.cc: Likewise. * testsuite/20_util/is_nothrow_swappable/requirements/explicit_instantiation.cc: Likewise. * testsuite/20_util/is_nothrow_swappable/requirements/typedefs.cc: Likewise. * testsuite/20_util/is_nothrow_swappable/value.cc: Likewise. * testsuite/20_util/is_nothrow_swappable/value.h: Likewise. * testsuite/20_util/is_nothrow_swappable_with/requirements/explicit_instantiation.cc: Remove redundant checks already tested elsewhere. * testsuite/20_util/is_nothrow_swappable_with/requirements/typedefs.cc: Likewise. * testsuite/20_util/is_nothrow_swappable_with/value.cc: Disable PCH. * testsuite/20_util/is_pointer_interconvertible/value.cc: Likewise. * testsuite/20_util/is_pointer_interconvertible/version.cc: Likewise. * testsuite/20_util/is_scoped_enum/value.cc: Likewise. * testsuite/20_util/is_scoped_enum/version.cc: Likewise. * testsuite/20_util/is_swappable/requirements/explicit_instantiation.cc: Remove redundant checks already tested elsewhere. * testsuite/20_util/is_swappable/requirements/typedefs.cc: Remove redundant checks already tested elsewhere. * testsuite/20_util/is_swappable/value.cc: Disable PCH. * testsuite/20_util/is_swappable/value.h: Reorder headers. * testsuite/20_util/is_swappable_with/requirements/explicit_instantiation.cc: Remove redundant checks already tested elsewhere. * testsuite/20_util/is_swappable_with/requirements/typedefs.cc: Remove redundant checks already tested elsewhere. * testsuite/20_util/is_swappable_with/value.cc: Disable PCH. * testsuite/20_util/is_unbounded_array/value.cc: Likewise. * testsuite/20_util/move_only_function/cons.cc: Likewise. 
* testsuite/20_util/move_only_function/version.cc: Likewise. * testsuite/20_util/optional/monadic/and_then.cc: Likewise. * testsuite/20_util/optional/requirements.cc: Likewise. * testsuite/20_util/optional/version.cc: Likewise. * testsuite/20_util/owner_less/void.cc: Likewise. * testsuite/20_util/reference_from_temporary/value.cc: Likewise. * testsuite/20_util/reference_from_temporary/version.cc: Likewise. * testsuite/20_util/shared_ptr/atomic/atomic_shared_ptr.cc: Likewise. * testsuite/20_util/shared_ptr/creation/array.cc: Likewise. * testsuite/20_util/shared_ptr/creation/overwrite.cc: Likewise. * testsuite/20_util/shared_ptr/creation/version.cc: Likewise. * testsuite/20_util/time_point_cast/rounding.cc: Likewise. * testsuite/20_util/to_chars/constexpr.cc: Likewise. * testsuite/20_util/to_chars/result.cc: Likewise. * testsuite/20_util/to_chars/version.cc: Likewise. * testsuite/20_util/to_underlying/1.cc: Likewise. * testsuite/20_util/to_underlying/version.cc: Likewise. * testsuite/20_util/tuple/apply/1.cc: Likewise. * testsuite/20_util/tuple/cons/constexpr_allocator_arg_t.cc: Likewise. * testsuite/20_util/tuple/make_from_tuple/1.cc: Likewise. * testsuite/20_util/tuple/p2321r2.cc: Likewise. * testsuite/20_util/tuple/tuple_element_t.cc: Likewise. * testsuite/20_util/unique_ptr/cons/constexpr_c++20.cc: Likewise. * testsuite/20_util/unique_ptr/creation/for_overwrite.cc: Likewise. * testsuite/20_util/unreachable/1.cc: Likewise. * testsuite/20_util/unreachable/version.cc: Likewise. * testsuite/20_util/unwrap_reference/1.cc: Likewise. * testsuite/20_util/unwrap_reference/3.cc: Likewise. * testsuite/20_util/variant/constexpr.cc: Likewise. * testsuite/20_util/variant/version.cc: Likewise. * testsuite/20_util/variant/visit_inherited.cc: Likewise. * testsuite/20_util/void_t/1.cc: Likewise. * testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite.cc: Likewise. * testsuite/21_strings/basic_string/cons/char/constexpr.cc: Likewise. 
* testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc: Likewise. * testsuite/21_strings/basic_string/erasure.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/char/to_string_float.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/version.cc: Likewise. * testsuite/21_strings/basic_string/version.cc: Likewise. * testsuite/21_strings/basic_string_view/operations/contains/char.cc: Likewise. * testsuite/21_strings/basic_string_view/operations/contains/char/2.cc: Likewise. * testsuite/21_strings/basic_string_view/operations/copy/char/constexpr.cc: Likewise. * testsuite/21_strings/char_traits/requirements/constexpr_functions_c++17.cc: Likewise. * testsuite/21_strings/char_traits/requirements/constexpr_functions_c++20.cc: Likewise. * testsuite/21_strings/char_traits/requirements/version.cc: Likewise. * testsuite/23_containers/array/comparison_operators/constexpr.cc: Likewise. * testsuite/23_containers/array/creation/1.cc: Likewise. * testsuite/23_containers/array/creation/2.cc: Likewise. * testsuite/23_containers/array/element_access/constexpr_c++17.cc: Likewise. * testsuite/23_containers/array/requirements/constexpr_fill.cc: Likewise. * testsuite/23_containers/array/requirements/constexpr_iter.cc: Likewise. * testsuite/23_containers/deque/erasure.cc: Likewise. * testsuite/23_containers/forward_list/erasure.cc: Likewise. * testsuite/23_containers/list/erasure.cc: Likewise. * testsuite/23_containers/map/erasure.cc: Likewise. * testsuite/23_containers/queue/cons_from_iters.cc: Likewise. * testsuite/23_containers/set/erasure.cc: Likewise. * testsuite/23_containers/span/1.cc: Likewise. * testsuite/23_containers/span/2.cc: Likewise. * testsuite/23_containers/stack/cons_from_iters.cc: Likewise. * testsuite/23_containers/unordered_map/erasure.cc: Likewise. * testsuite/23_containers/unordered_map/operations/1.cc: Likewise. * testsuite/23_containers/unordered_set/erasure.cc: Likewise. 
* testsuite/23_containers/unordered_set/operations/1.cc: Likewise. * testsuite/23_containers/vector/cons/constexpr.cc: Likewise. * testsuite/23_containers/vector/erasure.cc: Likewise. * testsuite/23_containers/vector/requirements/version.cc: Likewise. * testsuite/24_iterators/insert_iterator/constexpr.cc: Likewise. * testsuite/25_algorithms/clamp/constexpr.cc: Likewise. * testsuite/25_algorithms/clamp/requirements/explicit_instantiation/1.cc: Remove redundant checks already tested elsewhere. * testsuite/25_algorithms/constexpr_macro.cc: Likewise. * testsuite/25_algorithms/cpp_lib_constexpr.cc: Likewise. * testsuite/25_algorithms/fold_left/1.cc: Likewise. * testsuite/25_algorithms/pstl/feature_test-2.cc: Likewise. * testsuite/25_algorithms/pstl/feature_test-3.cc: Likewise. * testsuite/25_algorithms/pstl/feature_test-4.cc: Likewise. * testsuite/25_algorithms/pstl/feature_test-5.cc: Likewise. * testsuite/25_algorithms/pstl/feature_test.cc: Likewise. * testsuite/26_numerics/bit/bit.byteswap/byteswap.cc: Likewise. * testsuite/26_numerics/bit/bit.byteswap/version.cc: Likewise. * testsuite/26_numerics/bit/bit.cast/bit_cast.cc: Likewise. * testsuite/26_numerics/bit/bit.cast/version.cc: Likewise. * testsuite/26_numerics/bit/header-2.cc: Likewise. * testsuite/26_numerics/bit/header.cc: Likewise. * testsuite/26_numerics/complex/1.cc: Likewise. * testsuite/26_numerics/complex/2.cc: Likewise. * testsuite/26_numerics/endian/2.cc: Likewise. * testsuite/26_numerics/endian/3.cc: Likewise. * testsuite/26_numerics/gcd/1.cc: Likewise. * testsuite/26_numerics/lcm/1.cc: Likewise. * testsuite/26_numerics/lerp/1.cc: Likewise. * testsuite/26_numerics/lerp/version.cc: Likewise. * testsuite/26_numerics/midpoint/integral.cc: Likewise. * testsuite/26_numerics/midpoint/version.cc: Likewise. * testsuite/26_numerics/numbers/1.cc: Likewise. * testsuite/26_numerics/numbers/2.cc: Likewise. * testsuite/27_io/basic_filebuf/native_handle/char/1.cc: Likewise. 
* testsuite/27_io/basic_filebuf/native_handle/version.cc: Likewise. * testsuite/27_io/basic_ofstream/open/char/noreplace.cc: Likewise. * testsuite/27_io/basic_ofstream/open/wchar_t/noreplace.cc: Likewise. * testsuite/27_io/basic_syncbuf/1.cc: Likewise. * testsuite/27_io/basic_syncbuf/2.cc: Likewise. * testsuite/27_io/basic_syncstream/1.cc: Likewise. * testsuite/27_io/basic_syncstream/2.cc: Likewise. * testsuite/27_io/spanstream/1.cc: Likewise. * testsuite/27_io/spanstream/version.cc: Likewise. * testsuite/29_atomics/atomic/cons/value_init.cc: Likewise. * testsuite/29_atomics/atomic/lock_free_aliases.cc: Likewise. * testsuite/29_atomics/atomic/wait_notify/1.cc: Likewise. * testsuite/29_atomics/atomic/wait_notify/2.cc: Likewise. * testsuite/29_atomics/headers/stdatomic.h/c_compat.cc: Likewise. * testsuite/29_atomics/headers/stdatomic.h/version.cc: Likewise. * testsuite/30_threads/barrier/1.cc: Likewise. * testsuite/30_threads/barrier/2.cc: Likewise. * testsuite/30_threads/condition_variable_any/stop_token/1.cc: Likewise. * testsuite/30_threads/condition_variable_any/stop_token/2.cc: Likewise. * testsuite/30_threads/jthread/1.cc: Likewise. * testsuite/30_threads/jthread/version.cc: Likewise. * testsuite/30_threads/latch/1.cc: Likewise. * testsuite/30_threads/latch/2.cc: Likewise. * testsuite/30_threads/scoped_lock/requirements/typedefs.cc: Likewise. * testsuite/30_threads/semaphore/1.cc: Likewise. * testsuite/30_threads/semaphore/2.cc: Likewise. * testsuite/30_threads/stop_token/1.cc: Likewise. * testsuite/30_threads/stop_token/2.cc: Likewise. * testsuite/experimental/feat-char8_t.cc: Likewise. * testsuite/experimental/iterator/ostream_joiner.cc: Likewise. * testsuite/experimental/numeric/gcd.cc: Likewise. * testsuite/experimental/scopeguard/uniqueres.cc: Likewise. * testsuite/std/concepts/1.cc: Likewise. * testsuite/std/concepts/2.cc: Likewise. * testsuite/std/ranges/adaptors/as_const/1.cc: Likewise. * testsuite/std/ranges/adaptors/as_rvalue/1.cc: Likewise. 
* testsuite/std/ranges/adaptors/chunk/1.cc: Likewise. * testsuite/std/ranges/adaptors/chunk_by/1.cc: Likewise. * testsuite/std/ranges/adaptors/enumerate/1.cc: Likewise. * testsuite/std/ranges/adaptors/join_with/1.cc: Likewise. * testsuite/std/ranges/adaptors/slide/1.cc: Likewise. * testsuite/std/ranges/adaptors/stride/1.cc: Likewise. * testsuite/std/ranges/cartesian_product/1.cc: Likewise. * testsuite/std/ranges/headers/ranges/synopsis.cc: Likewise. * testsuite/std/ranges/repeat/1.cc: Likewise. * testsuite/std/ranges/version_c++23.cc: Likewise. * testsuite/std/ranges/zip/1.cc: Likewise. * testsuite/std/time/syn_c++20.cc: Likewise. * testsuite/experimental/feat-cxx14.cc: Likewise. Include <algorithm> and <iterator>. * testsuite/23_containers/array/tuple_interface/get_neg.cc: Adjust dg-error line numbers.
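As a rough illustration of the kind of test this commit describes, the sketch below includes a single header with PCH disabled and checks that the header alone defines the macro. It is not copied from any of the files listed above. The choice of `<cstddef>` and `__cpp_lib_byte` is an assumption made for the example, and `byte_macro_value` is a hypothetical helper added only so the result can be inspected at run time.

```cpp
// { dg-do compile { target c++17 } }
// { dg-add-options no_pch }

// Include only the header that is supposed to define the macro.
// With PCH disabled, <bits/stdc++.h> is not pulled in implicitly,
// so the check below cannot pass by accident.
#include <cstddef>

#if __cplusplus >= 201703L
# ifndef __cpp_lib_byte
#  error "Feature-test macro for std::byte is missing from <cstddef>"
# elif __cpp_lib_byte < 201603L
#  error "Feature-test macro for std::byte has the wrong value in <cstddef>"
# endif
#endif

// Hypothetical helper (not part of the testsuite): exposes the macro
// value, or 0 when the macro is not defined for this language mode.
constexpr long byte_macro_value()
{
#ifdef __cpp_lib_byte
  return __cpp_lib_byte;
#else
  return 0L;
#endif
}
```

Without the `no_pch` option the same preprocessor checks could succeed even if `<cstddef>` failed to define the macro, which is the situation the commit message warns about.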
-
Jonathan Wakely authored
I noticed that our C++23 features were not being defined when using Clang 16 with -std=c++2b, because it only defines __cplusplus=202101L but <bits/version.h> uses 202302L since my r14-3252-g0c316669b092fb change. This changes <bits/version.h> to use 202100 instead of the final 202302 value so that we support Clang 16's -std=c++2b mode. libstdc++-v3/ChangeLog: * include/bits/version.def (stds): Use >= 202100 for C++23 condition. * include/bits/version.h: Regenerate. * include/std/thread: Use > C++20 instead of >= C++23 for __cplusplus condition.
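The effect of relaxing the threshold can be sketched with two small predicates. Both function names are hypothetical and exist only for this illustration; the real conditions are generated from `version.def` and are not written this way.

```cpp
// Hypothetical stand-ins for the preprocessor conditions discussed above.
// The parameter plays the role of the __cplusplus value a compiler reports.

// Condition before the change: only the final C++23 value qualifies.
constexpr bool cxx23_old_condition(long cplusplus)
{ return cplusplus >= 202302L; }

// Condition after the change: any value from 202100 upwards qualifies.
constexpr bool cxx23_new_condition(long cplusplus)
{ return cplusplus >= 202100L; }

// Clang 16 with -std=c++2b reports 202101L.
static_assert(!cxx23_old_condition(202101L), "old check rejects Clang 16's value");
static_assert(cxx23_new_condition(202101L), "new check accepts Clang 16's value");

// The final C++23 value is still accepted, and C++20 is still excluded.
static_assert(cxx23_new_condition(202302L), "final C++23 value accepted");
static_assert(!cxx23_new_condition(202002L), "C++20 value rejected");
```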
-
Jonathan Wakely authored
We don't need any library concepts to define the constraints for rvalue stream overloads, only compiler support. So change the test from using __cpp_lib_concepts to __cpp_concepts >= 201907L. libstdc++-v3/ChangeLog: * include/std/istream (__rvalue_stream_extraction_t): Test __cpp_concepts instead of __cpp_lib_concepts. * include/std/ostream (__derived_from_ios_base): Likewise. (__rvalue_stream_insertion_t): Likewise.
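A minimal sketch of the idea, assuming a constraint that needs only a `requires`-expression and type traits: the names `derived_from_ios_base_sketch` and `is_ios_derived` are invented for this example and are not the libstdc++ internals named in the ChangeLog. The point is that nothing from `<concepts>` is used, so gating on the compiler macro `__cpp_concepts` is sufficient.

```cpp
#include <ios>
#include <sstream>
#include <type_traits>

#if __cpp_concepts >= 201907L
// Compiler-only path: uses the 'concept' and 'requires' keywords plus
// type traits, but no library concept from <concepts>.
template<typename T>
  concept derived_from_ios_base_sketch
    = std::is_class_v<T> && !std::is_same_v<T, std::ios_base>
      && requires (T* t, std::ios_base* b) { b = t; };

template<typename T>
  inline constexpr bool is_ios_derived = derived_from_ios_base_sketch<T>;
#else
// Fallback for compilers without concepts support: same check via traits.
template<typename T>
  inline constexpr bool is_ios_derived
    = std::is_class_v<T> && !std::is_same_v<T, std::ios_base>
      && std::is_convertible_v<T*, std::ios_base*>;
#endif

// Hypothetical type used only to show a negative case.
struct unrelated_type { };
```

Both branches answer the same question, so code using `is_ios_derived` does not need to know which one was selected.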
-
Jakub Jelinek authored
The following testcase is miscompiled on x86_64 since the r14-4968 commit for PR110551. That commit added 2 peephole2s, one for mov imm,%rXX; mov %rYY,%rax; mulq %rXX -> mov imm,%rax; mulq %rYY which I believe is ok, and another one for mov imm,%rXX; mov %rYY,%rdx; mulx %rXX, %rZZ, %rWW -> mov imm,%rdx; mulx %rYY, %rZZ, %rWW which is wrong. Both peephole2s verify that %rXX above is dead at the end of the pattern, by checking if %rXX is either one of the registers overwritten in the multiplication (%rdx:%rax in the first case, the 2 destination registers of mulx in the latter case), because we no longer set %rXX to that immediate (we set %rax resp. %rdx to it instead) when the peephole2 replaces it. But we also need to ensure that the other register, previously set to the value of %rYY and newly to imm, isn't used after the multiplication, and neither of the peephole2s does that. Now, for the first one (at least assuming in the % pattern the matching operand (i.e. hardcoded %rax resp. %rdx) after RA will always go first) I think it is always the case, because operands[2], if it must be the %rax register, will be overwritten by mulq writing to %rdx:%rax. But in the second case, there is no reason why %rdx couldn't be used after the pattern, and if it is (like in the testcase), we can't make those changes. So, the patch checks similarly to operands[0] that operands[2] (which ought to be %rdx if RA puts the % match_dup operand first and nothing swaps it afterwards) is either the same register as one of the destination registers of mulx or dies at the end of the multiplication. 2023-11-16 Jakub Jelinek <jakub@redhat.com> PR target/112526 * config/i386/i386.md (mov imm,%rax; mov %rdi,%rdx; mulx %rax -> mov imm,%rdx; mulx %rdi): Verify in define_peephole2 that operands[2] dies or is overwritten at the end of multiplication. * gcc.target/i386/bmi2-pr112526.c: New test.
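The reasoning about %rdx can be followed with a small register model. This is only a sketch: the `Regs` struct, the helper names, and the choice to track just five registers are assumptions made for the illustration, and it is not the testcase added by the commit. It models `mulx` as multiplying %rdx by the source operand and writing the two destination registers, leaving %rdx itself unchanged.

```cpp
#include <cstdint>

// Hypothetical register file holding only the registers named in the pattern.
struct Regs { std::uint64_t rdx, rxx, ryy, rzz, rww; };

// High 64 bits of a 64x64 multiplication, computed from 32-bit halves.
std::uint64_t mul_hi(std::uint64_t a, std::uint64_t b)
{
  std::uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
  std::uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;
  std::uint64_t p0 = a_lo * b_lo, p1 = a_lo * b_hi;
  std::uint64_t p2 = a_hi * b_lo, p3 = a_hi * b_hi;
  std::uint64_t carry
    = ((p0 >> 32) + (p1 & 0xffffffffu) + (p2 & 0xffffffffu)) >> 32;
  return p3 + (p1 >> 32) + (p2 >> 32) + carry;
}

// mulx src, %rZZ, %rWW: %rWW:%rZZ = %rdx * src; %rdx is read, not written.
void mulx(Regs& r, std::uint64_t src)
{
  r.rzz = r.rdx * src;          // low half
  r.rww = mul_hi(r.rdx, src);   // high half
}

// mov imm,%rXX; mov %rYY,%rdx; mulx %rXX, %rZZ, %rWW
Regs run_original(std::uint64_t imm, std::uint64_t y)
{
  Regs r{};
  r.ryy = y;
  r.rxx = imm;
  r.rdx = r.ryy;
  mulx(r, r.rxx);
  return r;
}

// mov imm,%rdx; mulx %rYY, %rZZ, %rWW
Regs run_peepholed(std::uint64_t imm, std::uint64_t y)
{
  Regs r{};
  r.ryy = y;
  r.rdx = imm;
  mulx(r, r.ryy);
  return r;
}
```

In this model both sequences leave the same product in the destination registers, but %rdx holds the old value of %rYY after the original sequence and the immediate after the rewritten one. That difference is harmless only when %rdx is dead or is itself one of the mulx destinations, which matches the condition the patch adds.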
-