- Jul 28, 2023
-
-
Andrew MacLeod authored
* tree-ssa-propagate.cc (substitute_and_fold_engine::value_on_edge): Move from value-query.cc. (substitute_and_fold_engine::value_of_stmt): Ditto. (substitute_and_fold_engine::range_of_expr): New. * tree-ssa-propagate.h (substitute_and_fold_engine): Inherit from range_query. New prototypes. * value-query.cc (value_query::value_on_edge): Relocate. (value_query::value_of_stmt): Ditto. * value-query.h (class value_query): Remove. (class range_query): Remove base class. Adjust prototypes.
-
Andrew MacLeod authored
PR tree-optimization/110205 * gimple-range-cache.h (ranger_cache::m_estimate): Delete. * range-op-mixed.h (operator_bitwise_xor::op1_op2_relation_effect): Add final override. * range-op.cc (operator_lshift): Add missing final overrides. (operator_rshift): Ditto.
-
Joseph Myers authored
* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po, ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update.
-
Jose E. Marchesi authored
clang disables tail call optimizations in BPF targets. Do the same in GCC. gcc/ChangeLog: * config/bpf/bpf.cc (bpf_option_override): Disable tail-call optimizations in BPF target.
-
Harald Anlauf authored
gcc/fortran/ChangeLog: PR fortran/110825 * gfortran.texi: Clarify argument passing convention. * trans-expr.cc (gfc_conv_procedure_call): Do not pass the character length as hidden argument when the declared dummy argument is assumed-type. gcc/testsuite/ChangeLog: PR fortran/110825 * gfortran.dg/assumed_type_18.f90: New test.
-
Honza authored
I have noticed that for all these three cases I need the same update of the loop exit probability. While my earlier patch unified it for the unrollers, this patch makes it more general and also simplifies tree-ssa-loop-split.cc. I also refactored the code, since with all the special cases for corrupted profiles it gets relatively long. I now also handle multiple loop exits in the RTL unroller. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * cfgloopmanip.cc (loop_count_in): Break out from ... (loop_exit_for_scaling): Break out from ... (update_loop_exit_probability_scale_dom_bbs): Break out from ...; add more sanity checks and debug info. (scale_loop_profile): ... here. (create_empty_loop_on_edge): Fix whitespace. * cfgloopmanip.h (update_loop_exit_probability_scale_dom_bbs): Declare. * loop-unroll.cc (unroll_loop_constant_iterations): Use update_loop_exit_probability_scale_dom_bbs. * tree-ssa-loop-manip.cc (update_exit_probability_after_unrolling): Remove. (tree_transform_and_unroll_loop): Use update_loop_exit_probability_scale_dom_bbs. * tree-ssa-loop-split.cc (split_loop): Use update_loop_exit_probability_scale_dom_bbs.
-
Patrick O'Neill authored
On rv32 targets, this patch fixes: FAIL: gcc.target/riscv/rvv/autovec/madd-split2-1.c -O3 -ftree-vectorize (test for excess errors) cc1: error: ABI requires '-march=rv32' gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/madd-split2-1.c: Add -mabi=lp64d to dg-options. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
-
Ng YongXiang authored
PR c++/110057 PR ipa/83054 gcc/cp/ChangeLog: * init.cc (build_vec_delete_1): Devirtualize array destruction. gcc/testsuite/ChangeLog: * g++.dg/warn/pr83054.C: Remove devirtualization warning. * g++.dg/lto/pr89335_0.C: Likewise. * g++.dg/tree-ssa/devirt-array-destructor-1.C: New test. * g++.dg/tree-ssa/devirt-array-destructor-2.C: New test. * g++.dg/warn/pr83054-2.C: New test. Signed-off-by: Ng Yong Xiang <yongxiangng@gmail.com>
-
Jan Hubicka authored
Extend tree-ssa-loop-split to understand tests of the form if (i==0) and if (i!=0), which trigger only during the first iteration. Naturally we should also be able to handle the last iteration, or split into 3 cases if the test can indeed fire in the middle of the loop. The last iteration is a bit trickier to pattern match, so I want to do it incrementally, but I implemented the easy case using value ranges, which handles loops with constant iteration counts. The testcase gets a misupdated profile; I will also fix that incrementally. gcc/ChangeLog: PR middle-end/77689 * tree-ssa-loop-split.cc: Include value-query.h. (split_at_bb_p): Analyze cases where EQ/NE can be turned into LT/LE/GT/GE; return updated guard code. (split_loop): Use guard code. gcc/testsuite/ChangeLog: PR middle-end/77689 * g++.dg/tree-ssa/loop-split-1.C: New test.
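To make the first-iteration case concrete, here is a minimal hand-written sketch in C of a loop guarded by `i == 0` and an equivalent split form. The function names and the loop body are made up for illustration; this shows the source-level idea only, not what the pass emits in GIMPLE.

```c
#include <stddef.h>

/* Original shape: the i == 0 test is evaluated on every iteration,
   although it can only be true on the first one.  */
long sum_with_first_test(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == 0)
            sum += 100;      /* work done only on the first iteration */
        else
            sum += a[i];
    }
    return sum;
}

/* Hand-split equivalent: i starts at 0 and only increases, so
   i == 0 is the same as i < 1 and the iteration space splits
   into [0, 1) and [1, n).  */
long sum_split(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;
    for (; i < n && i < 1; i++)
        sum += 100;
    for (; i < n; i++)
        sum += a[i];
    return sum;
}
```

Rewriting the equality as an ordering test is what lets the existing splitting machinery, which reasons about LT/LE/GT/GE guards, apply here.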
-
Roger Sayle authored
This patch is one of a series of fixes for PR rtl-optimization/110587, a compile-time regression with -O0, that attempts to address the underlying cause. As noted previously, the pathological test case pr28071.c contains a large number of useless register-to-register moves that can produce quadratic behaviour (in LRA). These moves are generated during RTL expansion in emit_group_load_1, where the middle-end attempts to simplify the source before calling extract_bit_field. This is reasonable if the source is a complex expression (from before the tree-ssa optimizers), or a SUBREG, or a hard register, but it's not particularly useful to copy a pseudo register into a new pseudo register. This patch eliminates that redundancy. The -fdump-tree-expand for pr28071.c compiled with -O0 currently contains 777K lines, with this patch it contains 717K lines, i.e. saving about 60K lines (admittedly of debugging text output, but it makes the point). 2023-07-28 Roger Sayle <roger@nextmovesoftware.com> Richard Biener <rguenther@suse.de> gcc/ChangeLog PR middle-end/28071 PR rtl-optimization/110587 * expr.cc (emit_group_load_1): Simplify logic for calling force_reg on ORIG_SRC, to avoid making a copy if the source is already in a pseudo register.
-
Jan Hubicka authored
This patch fixes the profile update in the first case of loop splitting. The pass still gives up on very basic testcases:

  __attribute__ ((noinline,noipa))
  void
  test1 (int n)
  {
    if (n <= 0 || n > 100000)
      return;
    for (int i = 0; i <= n; i++)
      {
        if (i < n)
          do_something ();
        if (a[i])
          do_something2 ();
      }
  }

Here I needed to add the conditional that enforces a sane value range of n. The reason is that the pass gives up on:

  !number_of_iterations_exit (loop1, exit1, &niter, false, true)

and without the conditional we get the assumption that n>=0 and not INT_MAX. I think from overflow we should derive that the INT_MAX test is not needed, and since the loop does nothing for n<0 it is also just paranoia. I am not sure how to fix this though :(. In general the pass does not really need to compute the iteration count. It only needs to know which direction the IVs go so it can detect tests that fire in the first part of the iteration space. Rich, any idea what the correct test should be?

In the testcase:

  for (int i = 0; i < 200; i++)
    if (i < 150)
      do_something ();
    else
      do_something2 ();

the old code did a wrong update of the exit condition probabilities. We know that the first loop iterates 150 times and the second loop 50 times, and we get that by simply scaling the loop body by the probability of the inner test.
With the patch we now get:

  <bb 2> [count: 1000]:

  <bb 3> [count: 150000]:    <- loop 1 correctly iterates 149 times
  # i_10 = PHI <i_7(8), 0(2)>
  do_something ();
  i_7 = i_10 + 1;
  if (i_7 <= 149)
    goto <bb 8>; [99.33%]
  else
    goto <bb 17>; [0.67%]

  <bb 8> [count: 149000]:
  goto <bb 3>; [100.00%]

  <bb 16> [count: 1000]:
  # i_15 = PHI <i_18(17)>

  <bb 9> [count: 49975]:    <- loop 2 should iterate 50 times but we are slightly wrong
  # i_3 = PHI <i_15(16), i_14(13)>
  do_something2 ();
  i_14 = i_3 + 1;
  if (i_14 != 200)
    goto <bb 13>; [98.00%]
  else
    goto <bb 7>; [2.00%]

  <bb 13> [count: 48975]:
  goto <bb 9>; [100.00%]

  <bb 17> [count: 1000]:    <- this test is always true because it is reached from bb 3
  # i_18 = PHI <i_7(3)>
  if (i_18 != 200)
    goto <bb 16>; [99.95%]
  else
    goto <bb 7>; [0.05%]

  <bb 7> [count: 1000]:
  return;

The reason why we are slightly wrong is the condition in bb 17 that is always true, but the pass does not know it. Rich, any idea how to do that? I think connect_loops should work out the case where the loop exit condition is never satisfied at the time the split condition fails for the first time.
Before the patch we get a lot of mismatches on hmmer. The profile report claims:

  dump id |static mismat|dynamic mismatch                                     |
          |in count     |in count                  |time                      |
  lsplit  |     5     +5|   8151850567  +8151850567| 531506481006       +57.9%|
  ldist   |     9     +4|  15345493501  +7193642934| 606848841056       +14.2%|
  ifcvt   |    10     +1|  15487514871   +142021370| 689469797790       +13.6%|
  vect    |    35    +25|  17558425961  +2070911090| 517375405715       -25.0%|
  cunroll |    42     +7|  16898736178   -659689783| 452445796198        -4.9%|
  loopdone|    33     -9|   2678017188 -14220718990| 330969127663             |
  tracer  |    34     +1|   2678018710        +1522| 330613415364        +0.0%|
  fre     |    33     -1|   2676980249     -1038461| 330465677073        -0.0%|
  expand  |    28     -5|   2497468467   -179511782|--------------------------|

With the patch:

  lsplit  |     0       |            0             | 328723360744        -2.3%|
  ldist   |     0       |            0             | 396193562452       +20.6%|
  ifcvt   |     1     +1|     71010686    +71010686| 478743508522       +20.8%|
  vect    |    14    +13|    697518955   +626508269| 299398068323       -37.5%|
  cunroll |    13     -1|    489349408   -208169547| 257777839725       -10.5%|
  loopdone|    11     -2|    402558559    -86790849| 201010712702             |
  tracer  |    13     +2|    402977200      +418641| 200651036623        +0.0%|
  fre     |    13       |    402622146      -355054| 200344398654        -0.2%|
  expand  |    11     -2|    333608636    -69013510|--------------------------|

So there are no mismatches for lsplit and ldist, and lsplit now thinks it improves speed by 2.3% rather than regressing it by 57%. The update is still not perfect since we do not work out that the second loop never iterates. Ifcvt wrecks the profile by design, since it inserts conditionals with both arms at 100% that will be eliminated later, after vect. It is not clear to me what happens in vect though. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: PR middle-end/106923 * tree-ssa-loop-split.cc (connect_loops): Change probability of the test preconditioning second loop to very_likely. (fix_loop_bb_probability): Handle correctly case where one of the arms of the conditional is empty.
(split_loop): Fold the test guarding first condition to see if it is constant true; Set correct entry block probabilities of the split loops; determine correct loop exit probabilities. gcc/testsuite/ChangeLog: PR middle-end/106293 * gcc.dg/tree-prof/loop-split-1.c: New test. * gcc.dg/tree-prof/loop-split-2.c: New test. * gcc.dg/tree-prof/loop-split-3.c: New test.
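The exit probabilities in the dump for the 200-iteration testcase follow from simple arithmetic: a loop that runs N times per entry leaves through its exit edge once in N executions of the exit test. The sketch below is a toy model of that arithmetic only. The struct and function names are invented for illustration, and none of this is GCC's profile code.

```c
/* Toy model, not GCC code: expected iteration counts and per-iteration
   exit probabilities of the two loops produced by splitting a loop of
   total_iters iterations at split_point.
   Assumes 0 < split_point < total_iters.  */
struct split_profile {
    unsigned first_iters;   /* iterations of the first split loop  */
    unsigned second_iters;  /* iterations of the second split loop */
    double first_exit;      /* exit probability per iteration, loop 1 */
    double second_exit;     /* exit probability per iteration, loop 2 */
};

struct split_profile split_profile_for(unsigned total_iters,
                                       unsigned split_point)
{
    struct split_profile p;
    p.first_iters = split_point;
    p.second_iters = total_iters - split_point;
    /* One exit per first_iters (resp. second_iters) evaluations
       of the exit test.  */
    p.first_exit = 1.0 / p.first_iters;
    p.second_exit = 1.0 / p.second_iters;
    return p;
}
```

For `split_profile_for(200, 150)` this gives 150 and 50 iterations with exit probabilities of 1/150 (about 0.67%) and 1/50 (2%), which match the `[0.67%]` and `[2.00%]` annotations on the exit edges in the dump.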
-
Eric Botcazou authored
gcc/ada/ * gcc-interface/trans.cc (gnat_to_gnu): Restrict previous change to the case where the simple return statement has got no storage pool.
-
Clément Chigot authored
All functions but Interrupt_Wait in s-inmaop__posix are checking the result of their syscalls with an assert. However, any return code of sigwait different than 0 means that something went wrong for it. From sigwait man: > RETURN VALUE > On success, sigwait() returns 0. On error, it returns a > positive error number (listed in ERRORS). gcc/ada/ * libgnarl/s-inmaop__posix.adb: Add assert after sigwait in Interrupt_Wait
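The same check can be sketched in C against the POSIX interface quoted above. This is an illustration of the return-value convention only, not a rendering of the Ada runtime code; the function name is made up, and the example assumes a single-threaded process so that `sigprocmask` and `raise` act on the calling thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Block SIGUSR1, make it pending, then collect it with sigwait().
   Returns the signal number reported by sigwait().  */
int wait_for_pending_sigusr1(void)
{
    sigset_t set;
    int sig = 0;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* The signal has to be blocked before it is waited for.  */
    rc = sigprocmask(SIG_BLOCK, &set, NULL);
    assert(rc == 0);

    /* Blocked, so this only marks the signal as pending.  */
    raise(SIGUSR1);

    /* sigwait() returns 0 on success and a positive error number
       otherwise, so any nonzero result means something went wrong.  */
    rc = sigwait(&set, &sig);
    assert(rc == 0);

    return sig;
}
```

Note that `sigwait` reports failure through its return value rather than through `errno`, which is why comparing the result against 0 is sufficient.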
-
Javier Miranda authored
Add dummy build-in-place parameters when a BIP function does not require the BIP parameters but it is a dispatching operation that inherited them. gcc/ada/ * einfo-utils.adb (Underlying_Type): Protect recursion call against non-available attribute Etype. * einfo.ads (Protected_Subprogram): Fix typo in documentation. * exp_ch3.adb (BIP_Function_Call_Id): New subprogram. (Expand_N_Object_Declaration): Improve code that evaluates if the object is initialized with a BIP function call. * exp_ch6.adb (Is_True_Build_In_Place_Function_Call): New subprogram. (Add_Task_Actuals_To_Build_In_Place_Call): Add dummy actuals if the function does not require the BIP task actuals but it is a dispatching operation that inherited them. (Build_In_Place_Formal): Improve code to avoid never-ending loop if the BIP formal is not found. (Add_Dummy_Build_In_Place_Actuals): New subprogram. (Expand_Call_Helper): Add calls to Add_Dummy_Build_In_Place_Actuals. (Expand_N_Extended_Return_Statement): Adjust assertion. (Expand_Simple_Function_Return): Adjust assertion. (Make_Build_In_Place_Call_In_Allocator): No action needed if the called function inherited the BIP extra formals but it is not a true BIP function. (Make_Build_In_Place_Call_In_Assignment): Ditto. * exp_intr.adb (Expand_Dispatching_Constructor_Call): Remove code reporting unsupported case (since this patch adds support for it). * sem_ch6.adb (Analyze_Subprogram_Body_Helper): Adding assertion to ensure matching of BIP formals when setting the Protected_Formal field of a protected subprogram to reference the corresponding extra formal of the subprogram that implements it. (Might_Need_BIP_Task_Actuals): New subprogram. (Create_Extra_Formals): Improve code adding inherited extra formals.
-
Pascal Obry authored
gcc/ada/ * s-oscons-tmplt.c: Add support for SO_BINDTODEVICE constant. * libgnat/g-socket.ads (Set_Socket_Option): Handle SO_BINDTODEVICE option. (Get_Socket_Option): Handle SO_BINDTODEVICE option. * libgnat/g-socket.adb: Likewise. (Get_Socket_Option): Handle the case where IF_NAMESIZE is not defined and so equal to -1.
-
Léo Creuse authored
This change corrects the Has_Decision predicate in par_sco.adb to properly consider predicates of quantified expressions as decisions. gcc/ada/ * par_sco.adb (Has_Decision): Consider that quantified expressions contain decisions.
-
Ronan Desplanques authored
This patch only affects the single-entry implementation of protected objects. Before this patch, there was a race condition where a task that called an entry could put itself to sleep right after another task had executed the entry as a proxy and signalled the not-yet-waiting first task, which caused the first task to enter a deadlock. Note that this race condition has been identified and fixed before for the implementations of the run-time that live under hie/. This patch reworks the locking sequence so that it is closer to the one that's used in the multiple-entry implementation of protected objects. The code for the multiple-entry implementation is spread across multiple subprograms. To draw a parallel with the section this patch modifies, one can read the following subprograms: - System.Tasking.Protected_Objects.Operations.Protected_Entry_Call - System.Tasking.Entry_Calls.Wait_For_Completion - System.Tasking.Entry_Calls.Check_Pending_Actions_For_Entry_Call This patch also adds a comment that explicitly states the locking constraint that must hold in the affected section. gcc/ada/ * libgnarl/s-tposen.adb: Fix race condition. Add comment to justify the locking timing.
-
Viljar Indus authored
gcc/ada/ * exp_util.adb (Find_Optional_Prim_Op): use "No" instead of "= Empty"
-
Piotr Trojanek authored
When skipping check on subprograms built for class-wide preconditions we must deal with the current scope not being a subprogram, e.g. it could be a declare-block. gcc/ada/ * sem_res.adb (Resolve_Actuals): Add guard for the call to Class_Preconditions_Subprogram.
-
Eric Botcazou authored
It occurs at compile time on an aggregate of a 2-dimensional packed array type whose component type is itself a packed array, because the compiler is trying to pack the intermediate aggregate and ends up rewriting a bunch of subcomponents. This optimization was originally devised for the case of a scalar component type so the change adds this restriction. gcc/ada/ * exp_aggr.adb (Is_Two_Dim_Packed_Array): Return true only if the component type of the array is scalar.
-
Piotr Trojanek authored
GNAT has a heuristic to warn about missing return statements in functions. This warning was escalated to errors when operating in GNATprove mode and SPARK_Mode was On. However, this heuristic was imprecise and caused spurious errors. Also, it was applied after the Push_Scope/End_Scope, so for functions acting as compilation units it was using the wrong SPARK_Mode. It is better to simply leave this detection to GNATprove. gcc/ada/ * sem_ch6.adb (Check_Statement_Sequence): Only warn about missing return statements and let GNATprove emit a check when needed.
-
Tom Tromey authored
This patch changes xsnamest and gen_il-gen to emit various constants as enums rather than a sequence of preprocessor defines. This enables better debugging and somewhat better type safety. gcc/ada/ * fe.h (Convention): Now inline function. * gen_il-gen.adb (Put_C_Type_And_Subtypes.Put_Enum_Lit) (Put_C_Type_And_Subtypes.Put_Kind_Subtype, Put_C_Getter): Emit enum. * snames.h-tmpl (Name_Id, Name_, Attribute_Id, Attribute_) (Convention_Id, Convention_, Pragma_Id, Pragma_): Now enum. (Get_Attribute_Id, Get_Pragma_Id): Now inline functions. * types.h (Node_Kind, Entity_Kind, Convention_Id, Name_Id): Now enum. * xsnamest.adb (Output_Header_Line, Make_Value): Emit enum.
-
Piotr Trojanek authored
Minor typo in comment. gcc/ada/ * libgnat/a-except.ads (Save_Occurrence): Fix typo.
-
Piotr Trojanek authored
It is much simpler and safer for the routine Number_Formals to accept subprogram entities that have no formals. gcc/ada/ * einfo-utils.adb (Number_Formals): Change types in body. * einfo-utils.ads (Number_Formals): Change type in spec. * einfo.ads (Number_Formals): Change type in comment. * sem_ch13.adb (Is_Property_Function): Fix style in a caller of Number_Formals that was likely to crash because of missing guards.
-
Piotr Trojanek authored
Fix crash occurring when attribute System'To_Address is used without a WITH clause for package System. gcc/ada/ * sem_warn.adb (Check_Infinite_Loop_Warning): Don't look at the type of actual parameter when it has no type at all, e.g. because the entire subprogram call is illegal.
-
xuli authored
Computation of `vsadd`, `vsaddu`, `vssub`, and `vssubu` do not need the rounding mode, therefore the intrinsics of these instructions do not have the parameter for rounding mode control. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: remove rounding mode of vsadd[u] and vssub[u]. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/bug-12.C: Adapt testcase. * g++.target/riscv/rvv/base/bug-14.C: Ditto. * g++.target/riscv/rvv/base/bug-18.C: Ditto. * g++.target/riscv/rvv/base/bug-19.C: Ditto. * g++.target/riscv/rvv/base/bug-20.C: Ditto. * g++.target/riscv/rvv/base/bug-21.C: Ditto. * g++.target/riscv/rvv/base/bug-22.C: Ditto. * g++.target/riscv/rvv/base/bug-23.C: Ditto. * g++.target/riscv/rvv/base/bug-3.C: Ditto. * g++.target/riscv/rvv/base/bug-8.C: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-100.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-101.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-102.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-103.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-104.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-105.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-106.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-107.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-108.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-109.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-110.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-111.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-112.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-113.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-114.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-115.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-116.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-117.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-118.c: Ditto. 
* gcc.target/riscv/rvv/base/binop_vx_constraint-119.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-97.c: Ditto. * gcc.target/riscv/rvv/base/binop_vx_constraint-98.c: Ditto. * gcc.target/riscv/rvv/base/merge_constraint-1.c: Ditto. * gcc.target/riscv/rvv/base/fixed-point-vxrm-error.c: New test. * gcc.target/riscv/rvv/base/fixed-point-vxrm.c: New test.
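A scalar model may help show why saturating add needs no rounding mode: the result is either the exact sum or a clamped bound, and no low-order bits are ever discarded. The sketch below is plain C with made-up function names; it models one 8-bit element and is not the RVV intrinsic API.

```c
#include <stdint.h>

/* Signed saturating add on one 8-bit element: exact sum, or clamped
   to the representable range.  Nothing is rounded.  */
int8_t sat_add_i8(int8_t a, int8_t b)
{
    int sum = a + b;               /* computed exactly in int */
    if (sum > INT8_MAX)
        return INT8_MAX;
    if (sum < INT8_MIN)
        return INT8_MIN;
    return (int8_t)sum;
}

/* Unsigned saturating add on one 8-bit element.  */
uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}
```

Saturating subtraction follows the same pattern with the clamp applied to the exact difference.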
-
Jan Hubicka authored
While looking at a profile misupdate on hmmer I noticed that the loop splitting pass is not able to handle the very loop it gives as an example of what it should apply to:

  One transformation of loops like:

    for (i = 0; i < 100; i++)
      {
        if (i < 50)
          A;
        else
          B;
      }

  into:

    for (i = 0; i < 50; i++)
      {
        A;
      }
    for (; i < 100; i++)
      {
        B;
      }

The problem is that ivcanon turns the test into i != 100 and the pass explicitly gives up on any loop ending with a != test. It needs to know the direction of the induction variable in order to derive the right conditions, but that can also be done from the step. It turns out that there are no testcases for basic loop splitting. I will add some with the profile update fix. gcc/ChangeLog: * tree-ssa-loop-split.cc (split_loop): Also support NE driven loops when IV test is not overflowing. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ifc-12.c: Disable loop splitting. * gcc.target/i386/avx2-gather-6.c: Likewise. * gcc.target/i386/avx2-vect-aggressive.c: Likewise.
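As a small runnable check of the transformation described above, the following sketch writes the loop with the `i != 100` exit test and compares it with a hand-split version. The counters stand in for the statements A and B, and all names are invented for illustration.

```c
/* How many iterations executed each arm of the inner test.  */
struct arm_counts {
    int a;
    int b;
};

/* Unsplit loop, with the exit test in the != form.  */
struct arm_counts run_unsplit(void)
{
    struct arm_counts c = {0, 0};
    for (int i = 0; i != 100; i++) {
        if (i < 50)
            c.a++;
        else
            c.b++;
    }
    return c;
}

/* Hand-split form.  The step is +1, so i only increases and the
   inner test i < 50 is true exactly for the first 50 iterations.  */
struct arm_counts run_split(void)
{
    struct arm_counts c = {0, 0};
    int i = 0;
    for (; i < 50; i++)
        c.a++;
    for (; i != 100; i++)
        c.b++;
    return c;
}
```

The split is only valid because the direction of `i` is known from its step; with a `!=` exit test alone the ordering of the iteration space would not be evident.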
-
liuhongt authored
Prevent rtl optimization of vec_duplicate + zero_extend to vpbroadcastm since there could be an extra kmov after RA. gcc/ChangeLog: PR target/110788 * config/i386/sse.md (avx512cd_maskb_vec_dup<mode>): Add UNSPEC_MASKOP. (avx512cd_maskw_vec_dup<mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110788.c: New test.
-
GCC Administrator authored
-
- Jul 27, 2023
-
-
David Faust authored
BPF ISA V4 introduces sign-extending move and load operations. This patch makes the BPF backend generate those instructions, when enabled and useful. A new option, -m[no-]smov gates generation of these instructions, and is enabled by default for -mcpu=v4 and above. Tests for the new instructions and documentation for the new options are included. PR target/110782 PR target/110784 gcc/ * config/bpf/bpf.opt (msmov): New option. * config/bpf/bpf.cc (bpf_option_override): Handle it here. * config/bpf/bpf.md (*extendsidi2): New. (extendhidi2): New. (extendqidi2): New. (extendsisi2): New. (extendhisi2): New. (extendqisi2): New. * doc/invoke.texi (Option Summary): Add -msmov eBPF option. (eBPF Options): Add -m[no-]smov. Document that -mcpu=v4 also enables -msmov. gcc/testsuite/ * gcc.target/bpf/sload-1.c: New test. * gcc.target/bpf/sload-pseudoc-1.c: New test. * gcc.target/bpf/smov-1.c: New test. * gcc.target/bpf/smov-pseudoc-1.c: New test.
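For readers unfamiliar with the operation, the sketch below models in portable C what a sign-extending move or load computes: the low 8, 16 or 32 bits of a value are widened to 64 bits with the sign bit replicated. The helper name is made up, and the code describes the arithmetic effect only, not the BPF instruction encoding or the backend patterns.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value` to 64 bits.
   Valid for 1 <= bits <= 63; 8, 16 and 32 are the widths of interest.  */
int64_t sign_extend(uint64_t value, unsigned bits)
{
    uint64_t sign = (uint64_t)1 << (bits - 1);  /* sign bit of the narrow type */
    uint64_t low = value & ((sign << 1) - 1);   /* keep only the low bits */

    /* Flipping the sign bit and subtracting it back maps
       [0, 2^bits) onto [-2^(bits-1), 2^(bits-1)).  */
    return (int64_t)(low ^ sign) - (int64_t)sign;
}
```

For example, `sign_extend(0xff, 8)` yields -1 while `sign_extend(0x7f, 8)` yields 127.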
-
David Faust authored
This patch makes some minor cleanups to eBPF options documented in invoke.texi: - Delete some vestigial docs for removed -mkernel option - Add -mbswap and -msdiv to the option summary - Note the negative versions of several options - Note that -mcpu=v4 also enables -msdiv. gcc/ * doc/invoke.texi (Option Summary): Remove -mkernel eBPF option. Add -mbswap and -msdiv eBPF options. (eBPF Options): Remove -mkernel. Add -mno-{jmpext, jmp32, alu32, v3-atomics, bswap, sdiv}. Document that -mcpu=v4 also enables -msdiv.
-
David Faust authored
The pseudo-C output templates for these instructions were incorrectly using operand 1 rather than operand 2 on the RHS, which led to some very incorrect assembly generation with -masm=pseudoc. gcc/ * config/bpf/bpf.md (add<AM:mode>3): Use %w2 instead of %w1 in pseudo-C dialect output template. (sub<AM:mode>3): Likewise. gcc/testsuite/ * gcc.target/bpf/alu-2.c: New test. * gcc.target/bpf/alu-pseudoc-2.c: Likewise.
-
Jan Hubicka authored
gcc/ChangeLog: * tree-vect-loop.cc (optimize_mask_stores): Make store likely.
-
Jan Hubicka authored
This patch fixes the profile update after RTL unrolling, which is now done the same way as in the tree unroller. We still produce a (slightly) corrupted profile for multiple-exit loops, which I can try to fix incrementally. I also updated testcases to look for profile mismatches so they do not creep back in again. gcc/ChangeLog: * cfgloop.h (single_dom_exit): Declare. * cfgloopmanip.h (update_exit_probability_after_unrolling): Declare. * cfgrtl.cc (struct cfg_hooks): Fix comment. * loop-unroll.cc (unroll_loop_constant_iterations): Update exit edge. * tree-ssa-loop-ivopts.h (single_dom_exit): Do not declare it here. * tree-ssa-loop-manip.cc (update_exit_probability_after_unrolling): Break out from ... (tree_transform_and_unroll_loop): ... here; gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/peel-1.c: Test for profile mismatches. * gcc.dg/tree-prof/unroll-1.c: Test for profile mismatches. * gcc.dg/tree-ssa/peel1.c: Test for profile mismatches. * gcc.dg/unroll-1.c: Test for profile mismatches. * gcc.dg/unroll-3.c: Test for profile mismatches. * gcc.dg/unroll-4.c: Test for profile mismatches. * gcc.dg/unroll-5.c: Test for profile mismatches. * gcc.dg/unroll-6.c: Test for profile mismatches.
-
Tobias Burnus authored
The previous version failed to diagnose when the 'teams' was nested more deeply inside the target region, e.g. inside a DO or some block or structured block. PR fortran/110725 PR middle-end/71065 gcc/fortran/ChangeLog: * openmp.cc (resolve_omp_target): Minor cleanup. * parse.cc (decode_omp_directive): Find TARGET statement also higher in the stack. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/teams-6.f90: Extend.
-
Jonathan Wakely authored
A decimal point was being added to the end of the string for {:#.0} because the __expc character was not being set for the _Pres_none presentation type, so __s.find(__expc) didn't find the 'e' in "1e+01" and so we created "1e+01." by appending the radix char to the end. This can be fixed by ensuring that __expc='e' is set for the _Pres_none case. I realized we can also set __expc='P' and __expc='E' when needed, to save a call to std::toupper later. For the {:#.0g} format, __expc='e' was being set and so the 'e' was found in "1e+10" but then __z = __prec - __sigfigs would wrap around to SIZE_MAX. That meant we would decide not to add a radix char because the number of extra characters to insert would be 1+SIZE_MAX i.e. zero. This can be fixed by using __z == 0 when __prec == 0. libstdc++-v3/ChangeLog: PR libstdc++/108046 * include/std/format (__formatter_fp::format): Ensure __expc is always set for all presentation types. Set __z correctly for zero precision. * testsuite/std/format/functions/format.cc: Check problem cases.
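The wraparound described for the {:#.0g} case is ordinary unsigned arithmetic and can be reproduced in a few lines of C. The helper names below are invented, and the guarded variant models only the zero-precision case mentioned above; it is not the libstdc++ code.

```c
#include <stddef.h>
#include <stdint.h>

/* Unguarded: with prec == 0 and sigfigs == 1 the subtraction wraps
   around to SIZE_MAX because size_t is unsigned.  */
size_t zeros_unguarded(size_t prec, size_t sigfigs)
{
    return prec - sigfigs;
}

/* Guarded for the zero-precision case only: no zeros are needed
   when prec == 0.  */
size_t zeros_guarded(size_t prec, size_t sigfigs)
{
    return prec == 0 ? 0 : prec - sigfigs;
}

/* Characters to insert: one radix character plus the zeros.  */
size_t extra_chars(size_t zeros)
{
    return 1 + zeros;
}
```

With `zeros_unguarded(0, 1)` the result is `SIZE_MAX`, so `extra_chars` wraps to 0 and nothing would be inserted; with `zeros_guarded(0, 1)` the result is 0 and `extra_chars` is 1, leaving room for the radix character.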
-
Jan Hubicka authored
Fix the profile update in tree_transform_and_unroll_loop, which is used by predictive commoning. I started by attempting to fix gcc.dg/tree-ssa/update-unroll-1.c, which I xfailed last week, but it turned out to be a harder job. Unrolling was never fixed for changes in duplicate_loop_body_to_header_edge, which is now smarter about getting the profile right when some exits are eliminated. A lot of the manual profile updating can thus now be done using existing infrastructure. I also noticed that scale_dominated_blocks_in_loop does a job identical to the loop I wrote in scale_loop_profile, and thus I commonized the implementation and removed recursion. I also extended duplicate_loop_body_to_header_edge to handle flat profiles the same way as we do in the vectorizer. Without it we end up with an iteration count less than 0 in gcc.dg/tree-ssa/update-unroll-1.c (it is unrolled 32 times but predicted to iterate fewer times). I also added missing code to update loop_info. gcc/ChangeLog: * cfgloopmanip.cc (scale_dominated_blocks_in_loop): Move here from tree-ssa-loop-manip.cc and avoid recursion. (scale_loop_profile): Use scale_dominated_blocks_in_loop. (duplicate_loop_body_to_header_edge): Add DLTHE_FLAG_FLAT_PROFILE flag. * cfgloopmanip.h (DLTHE_FLAG_FLAT_PROFILE): Define. (scale_dominated_blocks_in_loop): Declare. * predict.cc (dump_prediction): Do not ICE on uninitialized probability. (change_edge_frequency): Remove. * predict.h (change_edge_frequency): Remove. * tree-ssa-loop-manip.cc (scale_dominated_blocks_in_loop): Move to cfgloopmanip.cc. (niter_for_unrolled_loop): Remove. (tree_transform_and_unroll_loop): Fix profile update. gcc/testsuite/ChangeLog: * gcc.dg/pr102385.c: Check for no profile mismatches. * gcc.dg/pr96931.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-1.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-2.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-3.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-4.c: Check for no profile mismatches. 
* gcc.dg/tree-ssa/predcom-5.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-7.c: Check for one profile mismatch. * gcc.dg/tree-ssa/predcom-8.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-1.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-10.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-11.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-12.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-2.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-3.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-4.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-5.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-6.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-7.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-8.c: Check for no profile mismatches. * gcc.dg/tree-ssa/predcom-dse-9.c: Check for no profile mismatches. * gcc.dg/tree-ssa/update-unroll-1.c: Unxfail.
-
Jan Hubicka authored
This fixes two bugs in tree-ssa-loop-im.cc. First is that cap probability is not reliable, but it is constructed with adjusted quality. Second is that sometimes the conditional has wrong joiner BB count. This is visible on testsuite/gcc.dg/pr102385.c however the testcase triggers another profile update bug in pcom, so I will update it in followup patch. gcc/ChangeLog: * tree-ssa-loop-im.cc (execute_sm_if_changed): Turn cap probability to guessed; fix count of new_bb.
-
Jan Hubicka authored
profile_count::apply_probability misses a check for uninitialized probability, which leads to completely random results when applying an uninitialized probability to an initialized scale. This can make a difference when e.g. inlining a -fno-guess-branch-probability function into a -fguess-branch-probability one. gcc/ChangeLog: * profile-count.h (profile_count::apply_probability): Fix handling of uninitialized probabilities, optimize scaling by probability 1.
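The shape of the fix can be illustrated with a toy model in C. Everything below is hypothetical: the struct layouts, the `TOY_PROB_BASE` scale and the function name are invented for the example and do not reflect the actual representation in profile-count.h. The sketch only shows the two behaviours named in the ChangeLog entry: an uninitialized operand yields an uninitialized result, and probability 1 returns the count unchanged.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PROB_BASE 10000u   /* hypothetical fixed-point scale */

struct toy_prob {
    bool initialized;
    uint32_t num;              /* probability is num / TOY_PROB_BASE */
};

struct toy_count {
    bool initialized;
    uint64_t value;
};

struct toy_count toy_apply_probability(struct toy_count c, struct toy_prob p)
{
    struct toy_count uninit = { false, 0 };

    /* An uninitialized operand must not produce a made-up number.  */
    if (!c.initialized || !p.initialized)
        return uninit;

    /* Scaling by probability 1 is the identity; skip the arithmetic.  */
    if (p.num == TOY_PROB_BASE)
        return c;

    /* Scale with rounding to nearest.  */
    struct toy_count r = {
        true, (c.value * p.num + TOY_PROB_BASE / 2) / TOY_PROB_BASE
    };
    return r;
}
```

Propagating the uninitialized state, rather than computing with whatever bits happen to be stored, is what keeps the result deterministic.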
-
Richard Biener authored
The following fixes the lack of simplification of a vector shift by an out-of-bounds shift value. For scalars this is done both by CCP and VRP but vectors are not handled there. This results in PR91838 differences in outcome dependent on whether a vector shift ISA is available and thus vector lowering does or does not expose scalar shifts here. The following adds a match.pd pattern to catch uniform out-of-bound shifts, simplifying them to zero when not sanitizing shift amounts. PR tree-optimization/91838 * gimple-match-head.cc: Include attribs.h and asan.h. * generic-match-head.cc: Likewise. * match.pd (([rl]shift @0 out-of-bounds) -> zero): New pattern.
-