  1. Jan 03, 2018
    • poly_int: get_mask_mode · 87133c45
      Richard Sandiford authored
      
      This patch makes TARGET_GET_MASK_MODE take polynomial nunits and
      vector_size arguments.  The gcc_assert in default_get_mask_mode
      is now handled by the exact_div call in vector_element_size.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* target.def (get_mask_mode): Take the number of units and length
      	as poly_uint64s rather than unsigned ints.
      	* targhooks.h (default_get_mask_mode): Update accordingly.
      	* targhooks.c (default_get_mask_mode): Likewise.
      	* config/i386/i386.c (ix86_get_mask_mode): Likewise.
      	* doc/tm.texi: Regenerate.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256130
    • poly_int: omp_max_vf · 9d2f08ab
      Richard Sandiford authored
      
      This patch makes omp_max_vf return a polynomial vectorization factor.
      We then need to be able to stash a polynomial value in
      OMP_CLAUSE_SAFELEN_EXPR too:
      
         /* If max_vf is non-zero, then we can use only a vectorization factor
            up to the max_vf we chose.  So stick it into the safelen clause.  */
      
      For now the cfgloop safelen is still constant though.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* omp-general.h (omp_max_vf): Return a poly_uint64 instead of an int.
      	* omp-general.c (omp_max_vf): Likewise.
      	* omp-expand.c (omp_adjust_chunk_size): Update call to omp_max_vf.
      	(expand_omp_simd): Handle polynomial safelen.
      	* omp-low.c (omplow_simd_context): Add a default constructor.
      	(omplow_simd_context::max_vf): Change from int to poly_uint64.
      	(lower_rec_simd_input_clauses): Update accordingly.
      	(lower_rec_input_clauses): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256129
    • poly_int: vect_nunits_for_cost · c5126ce8
      Richard Sandiford authored
      
      This patch adds a function for getting the number of elements in
      a vector for cost purposes, which is always constant.  It makes
      it possible for a later patch to change GET_MODE_NUNITS and
      TYPE_VECTOR_SUBPARTS to a poly_int.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_nunits_for_cost): New function.
      	* tree-vect-loop.c (vect_model_reduction_cost): Use it.
      	* tree-vect-slp.c (vect_analyze_slp_cost_1): Likewise.
      	(vect_analyze_slp_cost): Likewise.
      	* tree-vect-stmts.c (vect_model_store_cost): Likewise.
      	(vect_model_load_cost): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256128
    • poly_int: SLP max_units · 4b6068ea
      Richard Sandiford authored
      
      This patch makes tree-vect-slp.c track the maximum number of vector
      units as a poly_uint64 rather than an unsigned int.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-slp.c (vect_record_max_nunits, vect_build_slp_tree_1)
      	(vect_build_slp_tree_2, vect_build_slp_tree): Change max_nunits
      	from an unsigned int * to a poly_uint64_pod *.
      	(calculate_unrolling_factor): New function.
      	(vect_analyze_slp_instance): Use it.  Track polynomial max_nunits.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256127
    • poly_int: vectoriser vf and uf · d9f21f6a
      Richard Sandiford authored
      
      This patch changes the type of the vectorisation factor and SLP
      unrolling factor to poly_uint64.  This in turn required some knock-on
      changes in signedness elsewhere.
      
      Cost decisions are generally based on estimated_poly_value,
      which for VF is wrapped up as vect_vf_for_cost.
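The difference between an exact polynomial value and the plain number used for cost decisions can be sketched roughly as follows. This is a toy model, not GCC's poly_int implementation: `toy_poly`, `value_at`, `estimated_value` and `known_le` are hypothetical names, and the runtime indeterminate x is assumed to be a non-negative integer supplied by the caller.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient polynomial integer: the value is
// c0 + c1 * x, where x >= 0 is only known at run time.
// Hypothetical type for illustration; not GCC's poly_uint64.
struct toy_poly {
  uint64_t c0;
  uint64_t c1;
};

// Evaluate the polynomial for one concrete runtime x.
inline uint64_t value_at(toy_poly p, uint64_t x) { return p.c0 + p.c1 * x; }

// Cost heuristics need a single number, so plug in a guess for x.
// Assumption: the guess is passed in by the caller; how GCC picks its
// estimate is not described in the text above.
inline uint64_t estimated_value(toy_poly p, uint64_t likely_x) {
  return value_at(p, likely_x);
}

// a <= b holds for every x >= 0 exactly when it holds coefficient-wise
// (ignoring overflow in this sketch).
inline bool known_le(toy_poly a, toy_poly b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}
```

With a VF modelled as `toy_poly{4, 4}`, correctness checks would have to hold for every x, while a cost decision could simply use `estimated_value` with a likely x.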
      
      The patch doesn't on its own enable variable-length vectorisation.
      It just makes the minimum changes necessary for the code to build
      with the new VF and UF types.  Later patches also make the
      vectoriser cope with variable TYPE_VECTOR_SUBPARTS and variable
      GET_MODE_NUNITS, at which point the code really does handle
      variable-length vectors.
      
      The patch also changes MAX_VECTORIZATION_FACTOR to INT_MAX,
      to avoid hard-coding a particular architectural limit.
      
      The patch includes a new test because a development version of the patch
      accidentally used file print routines instead of dump_*, which would
      fail with -fopt-info.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vectorizer.h (_slp_instance::unrolling_factor): Change
      	from an unsigned int to a poly_uint64.
      	(_loop_vec_info::slp_unrolling_factor): Likewise.
      	(_loop_vec_info::vectorization_factor): Change from an int
      	to a poly_uint64.
      	(MAX_VECTORIZATION_FACTOR): Bump from 64 to INT_MAX.
      	(vect_get_num_vectors): New function.
      	(vect_update_max_nunits, vect_vf_for_cost): Likewise.
      	(vect_get_num_copies): Use vect_get_num_vectors.
      	(vect_analyze_data_ref_dependences): Change max_vf from an int *
      	to an unsigned int *.
      	(vect_analyze_data_refs): Change min_vf from an int * to a
      	poly_uint64 *.
      	(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
      	than an unsigned HOST_WIDE_INT.
      	* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
      	(vect_analyze_data_ref_dependence): Change max_vf from an int *
      	to an unsigned int *.
      	(vect_analyze_data_ref_dependences): Likewise.
      	(vect_compute_data_ref_alignment): Handle polynomial vf.
      	(vect_enhance_data_refs_alignment): Likewise.
      	(vect_prune_runtime_alias_test_list): Likewise.
      	(vect_shift_permute_load_chain): Likewise.
      	(vect_supportable_dr_alignment): Likewise.
      	(dependence_distance_ge_vf): Take the vectorization factor as a
      	poly_uint64 rather than an unsigned HOST_WIDE_INT.
      	(vect_analyze_data_refs): Change min_vf from an int * to a
      	poly_uint64 *.
      	* tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Take
      	vfm1 as a poly_uint64 rather than an int.  Make the same change
      	for the returned bound_scalar.
      	(vect_gen_vector_loop_niters): Handle polynomial vf.
      	(vect_do_peeling): Likewise.  Update call to
      	vect_gen_scalar_loop_niters and handle polynomial bound_scalars.
      	(vect_gen_vector_loop_niters_mult_vf): Assert that the vf must
      	be constant.
      	* tree-vect-loop.c (vect_determine_vectorization_factor)
      	(vect_update_vf_for_slp, vect_analyze_loop_2): Handle polynomial vf.
      	(vect_get_known_peeling_cost): Likewise.
      	(vect_estimate_min_profitable_iters, vectorizable_reduction): Likewise.
      	(vect_worthwhile_without_simd_p, vectorizable_induction): Likewise.
      	(vect_transform_loop): Likewise.  Use the lowest possible VF when
      	updating the upper bounds of the loop.
      	(vect_min_worthwhile_factor): Make static.  Return an unsigned int
      	rather than an int.
      	* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Cope with
      	polynomial unroll factors.
      	(vect_analyze_slp_cost_1, vect_analyze_slp_instance): Likewise.
      	(vect_make_slp_decision): Likewise.
      	(vect_supported_load_permutation_p): Likewise, and polynomial
      	vf too.
      	(vect_analyze_slp_cost): Handle polynomial vf.
      	(vect_slp_analyze_node_operations): Likewise.
      	(vect_slp_analyze_bb_1): Likewise.
      	(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
      	than an unsigned HOST_WIDE_INT.
      	* tree-vect-stmts.c (vectorizable_simd_clone_call, vectorizable_store)
      	(vectorizable_load): Handle polynomial vf.
      	* tree-vectorizer.c (simduid_to_vf::vf): Change from an int to
      	a poly_uint64.
      	(adjust_simduid_builtins, shrink_simd_arrays): Update accordingly.
      
      gcc/testsuite/
      	* gcc.dg/vect-opt-info-1.c: New test.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256126
    • match.pd handling of three-constant bitops · fba05d9e
      Richard Sandiford authored
      
      match.pd tries to reassociate two bit operations if both of them have
      constant operands.  However, with the polynomial integers added later,
      there's no guarantee that a bit operation on two integers can be folded
      at compile time.  This means that the pattern can trigger for operations
      on three constants, and as things stood could endlessly oscillate
      between the two associations.
      
      This patch keeps the existing pattern for the normal case of a
      non-constant first operand.  When all three operands are constant it
      tries to find a pair of constants that do fold.  If none do, it keeps
      the original expression as-was.
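The "find a pair that folds, otherwise leave the expression alone" strategy can be sketched in plain C++ as below. This is only an illustration of the idea for a single operation (bitwise AND), not the match.pd pattern itself: `toy_const`, `try_fold_and` and `simplify_and3` are hypothetical names, and a constant is assumed to fold only when it has no runtime-dependent part.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Toy constant: c0 + c1 * x, with x unknown at compile time.
// Hypothetical model of a "constant" that may not fold.
struct toy_const {
  uint64_t c0;
  uint64_t c1;
};

// In this sketch, AND folds only when both operands are plain integers.
inline std::optional<toy_const> try_fold_and(toy_const a, toy_const b) {
  if (a.c1 != 0 || b.c1 != 0) return std::nullopt;
  return toy_const{a.c0 & b.c0, 0};
}

// Simplify (a & b) & c where all three operands are constants and the
// inner a & b did not fold.  Returns the two operands of the rewritten
// AND, or nullopt to keep the original expression unchanged, so the
// rewrite can never bounce between two associations.
inline std::optional<std::pair<toy_const, toy_const>>
simplify_and3(toy_const a, toy_const b, toy_const c) {
  if (auto bc = try_fold_and(b, c)) return std::make_pair(a, *bc);  // a & (b & c)
  if (auto ac = try_fold_and(a, c)) return std::make_pair(*ac, b);  // (a & c) & b
  return std::nullopt;  // no pair folds: leave as-is
}
```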
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* match.pd: Handle bit operations involving three constants
      	and try to fold one pair.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256125
    • Add an alternative vector loop iv mechanism · 0f26839a
      Richard Sandiford authored
      Normally we adjust the vector loop so that it iterates:
      
         (original number of scalar iterations - number of peels) / VF
      
      times, enforcing this using an IV that starts at zero and increments
      by one each iteration.  However, dividing by VF would be expensive
      for variable VF, so this patch adds an alternative in which the IV
      increments by VF each iteration instead.  We then need to take care
      to handle possible overflow in the IV.
      
      The new mechanism isn't used yet; a later patch replaces the
      "if (1)" with a check for variable VF.
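The two loop-control schemes described above can be compared with a small scalar model. This is a sketch under stated assumptions, not the code the patch generates: `iterations_step_one` and `iterations_step_vf` are hypothetical names, the IV is modelled as a 32-bit unsigned counter, and the overflow handling shown (comparing the remaining distance) is just one possible approach.

```cpp
#include <cassert>
#include <cstdint>

// Scheme 1: divide up front, then count 0, 1, 2, ... up to niters / vf.
inline uint64_t iterations_step_one(uint32_t niters, uint32_t vf) {
  assert(vf != 0);
  uint32_t limit = niters / vf;  // the division that is costly for variable VF
  uint64_t count = 0;
  for (uint32_t iv = 0; iv < limit; ++iv) ++count;
  return count;
}

// Scheme 2: no division; the IV advances by vf each iteration.
// Testing "iv + vf <= niters" could wrap around, so this sketch compares
// the remaining distance instead, which cannot overflow because iv never
// exceeds niters.
inline uint64_t iterations_step_vf(uint32_t niters, uint32_t vf) {
  assert(vf != 0);
  uint64_t count = 0;
  uint32_t iv = 0;
  while (niters - iv >= vf) {
    iv += vf;  // safe: iv + vf <= niters was just checked
    ++count;
  }
  return count;
}
```

Both functions return the same iteration count for any non-zero `vf`; the second avoids the division at the price of a more careful exit test.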
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* tree-vect-loop-manip.c: Include gimple-fold.h.
      	(slpeel_make_loop_iterate_ntimes): Add step, final_iv and
      	niters_maybe_zero parameters.  Handle other cases besides a step of 1.
      	(vect_gen_vector_loop_niters): Add a step_vector_ptr parameter.
      	Add a path that uses a step of VF instead of 1, but disable it
      	for now.
      	(vect_do_peeling): Add step_vector, niters_vector_mult_vf_var
      	and niters_no_overflow parameters.  Update calls to
      	slpeel_make_loop_iterate_ntimes and vect_gen_vector_loop_niters.
      	Create a new SSA name if the latter chooses to use a step other
      	than zero, and return it via niters_vector_mult_vf_var.
      	* tree-vect-loop.c (vect_transform_loop): Update calls to
      	vect_do_peeling, vect_gen_vector_loop_niters and
      	slpeel_make_loop_iterate_ntimes.
      	* tree-vectorizer.h (slpeel_make_loop_iterate_ntimes, vect_do_peeling)
      	(vect_gen_vector_loop_niters): Update declarations after above changes.
      
      From-SVN: r256124
    • Ben Elliston's avatar
      e50ffab3
    • config.guess: Import latest version. · ef7d7cf5
      Ben Elliston authored
      	* config.guess: Import latest version.
      	* config.sub: Likewise.
      
      From-SVN: r256122
    • rs6000.md (floor<mode>2): Add support for IEEE 128-bit round to integer instructions. · 2d71e7b8
      Michael Meissner authored
      [gcc]
      2018-01-02  Michael Meissner  <meissner@linux.vnet.ibm.com>
      
      	* config/rs6000/rs6000.md (floor<mode>2): Add support for IEEE
      	128-bit round to integer instructions.
      	(ceil<mode>2): Likewise.
      	(btrunc<mode>2): Likewise.
      	(round<mode>2): Likewise.
      
      [gcc/testsuite]
      2018-01-02  Michael Meissner  <meissner@linux.vnet.ibm.com>
      
      	* gcc.target/powerpc/float128-hw2.c: Add tests for ceilf128,
      	floorf128, truncf128, and roundf128.
      	* gcc.target/powerpc/float128-hw5.c: New tests for _Float128
      	optimizations added in match.pd.
      	* gcc.target/powerpc/float128-hw6.c: Likewise.
      	* gcc.target/powerpc/float128-hw7.c: Likewise.
      	* gcc.target/powerpc/float128-hw8.c: Likewise.
      	* gcc.target/powerpc/float128-hw9.c: Likewise.
      	* gcc.target/powerpc/float128-hw10.c: Likewise.
      	* gcc.target/powerpc/float128-hw11.c: Likewise.
      
      From-SVN: r256118
    • Daily bump. · 50d75500
      GCC Administrator authored
      From-SVN: r256116
  2. Jan 02, 2018
    • rs6000-string.c (expand_block_move): Allow the use of unaligned VSX load/store on P8/P9. · 3b0cb1a5
      Aaron Sawdey authored
      2018-01-02  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>
      
      	* config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
      	unaligned VSX load/store on P8/P9.
      	(expand_block_clear): Allow the use of unaligned VSX
      	load/store on P8/P9.
      
      From-SVN: r256112
    • rs6000-p8swap.c (swap_feeds_both_load_and_store): New function. · 6012c652
      Bill Schmidt authored
      2018-01-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
      
      	* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
      	New function.
      	(rs6000_analyze_swaps): Mark a web unoptimizable if it contains a
      	swap associated with both a load and a store.
      
      From-SVN: r256111
    • RISC-V: Fix for icache flush issue on multicore processors. · f1bdc63a
      Andrew Waterman authored
      	gcc/
      	* config/riscv/linux.h (ICACHE_FLUSH_FUNC): New.
      	* config/riscv/riscv.md (clear_cache): Use it.
      
      From-SVN: r256109
    • * web.c: Remove out-of-date comment. · a7e92aff
      Artyom Skrobov authored
      From-SVN: r256106
    • Fix REG_ARGS_SIZE handling when pushing TLS addresses · 2bc6986d
      Richard Sandiford authored
      The new assert in add_args_size_note triggered for gcc.dg/tls/opt-3.c
      and others on m68k.  This looks like a pre-existing bug: if we pushed
      a value that needs a call to something like __tls_get_addr, we ended
      up with two different REG_ARGS_SIZE notes on the same instruction.
      
      It seems to be OK for emit_single_push_insn to push something that
      needs a call to __tls_get_addr:
      
            /* We have to allow non-call_pop patterns for the case
      	 of emit_single_push_insn of a TLS address.  */
            if (GET_CODE (pat) != PARALLEL)
      	return 0;
      
      so I think the bug is in the way this is handled rather than the fact
      that it occurs at all.
      
      If we're pushing a value X that needs a call C to calculate, we'll
      add REG_ARGS_SIZE notes to the pushes and pops for C as part of the
      call sequence.  Then emit_single_push_insn calls fixup_args_size_notes
      on the whole push sequence (the calculation of X, including C,
      and the push of X itself).  This is where the double notes came from.
      But emit_single_push_insn_1 adjusted stack_pointer_delta *before* the
      push, so the notes added for C were relative to the situation after
      the future push of X rather than before it.
      
      Presumably this didn't matter in practice because the note added
      second tended to trump the note added first.  But code is allowed to
      walk REG_NOTES without having to disregard secondary notes.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* expr.c (fixup_args_size_notes): Check that any existing
      	REG_ARGS_SIZE notes are correct, and don't try to re-add them.
      	(emit_single_push_insn_1): Move stack_pointer_delta adjustment to...
      	(emit_single_push_insn): ...here.
      
      From-SVN: r256105
    • Make CONST_VECTOR_ELT handle implicitly-encoded elements · cd5ff7bc
      Richard Sandiford authored
      This patch makes CONST_VECTOR_ELT handle implicitly-encoded elements,
      in a similar way to VECTOR_CST_ELT.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* rtl.h (CONST_VECTOR_ELT): Redefine to const_vector_elt.
      	(const_vector_encoded_nelts): New function.
      	(CONST_VECTOR_NUNITS): Redefine to use GET_MODE_NUNITS.
      	(const_vector_int_elt, const_vector_elt): Declare.
      	* emit-rtl.c (const_vector_int_elt_1): New function.
      	(const_vector_elt): Likewise.
      	* simplify-rtx.c (simplify_immed_subreg): Avoid taking the address
      	of CONST_VECTOR_ELT.
      
      From-SVN: r256104
    • Make more use of rtx_vector_builder · 3d8ca53d
      Richard Sandiford authored
      This patch makes various bits of CONST_VECTOR-building code use
      rtx_vector_builder, operating directly on a specific encoding.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* expr.c: Include rtx-vector-builder.h.
      	(const_vector_mask_from_tree): Use rtx_vector_builder and operate
      	directly on the tree encoding.
      	(const_vector_from_tree): Likewise.
      	* optabs.c: Include rtx-vector-builder.h.
      	(expand_vec_perm_var): Use rtx_vector_builder and create a repeating
      	sequence of "u" values.
      	* vec-perm-indices.c: Include rtx-vector-builder.h.
      	(vec_perm_indices_to_rtx): Use rtx_vector_builder and operate
      	directly on the vec_perm_indices encoding.
      
      From-SVN: r256103
    • New CONST_VECTOR layout · 3877c560
      Richard Sandiford authored
      This patch makes CONST_VECTOR use the same encoding as VECTOR_CST.
      
      One problem that occurs in RTL but not at the tree level is that a fair
      amount of code uses XVEC and XVECEXP directly on CONST_VECTORs (which is
      valid, just with looser checking).  This is complicated by the fact that
      vectors are also represented as PARALLELs in some target interfaces,
      so using XVECEXP is a good polymorphic way of handling both forms.
      
      Rather than try to untangle all that, the best approach seemed to be to
      continue to encode every element in a fixed-length vector.  That way only
      target-independent and AArch64 code need to be precise about using
      CONST_VECTOR_ELT over XVECEXP.
      
      After this change it is no longer valid to modify CONST_VECTORs in-place.
      This needed some fix-up in the powerpc backends.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* doc/rtl.texi (const_vector): Describe new encoding scheme.
      	* Makefile.in (OBJS): Add rtx-vector-builder.o.
      	* rtx-vector-builder.h: New file.
      	* rtx-vector-builder.c: Likewise.
      	* rtl.h (rtx_def::u2): Add a const_vector field.
      	(CONST_VECTOR_NPATTERNS): New macro.
      	(CONST_VECTOR_NELTS_PER_PATTERN): Likewise.
      	(CONST_VECTOR_DUPLICATE_P): Likewise.
      	(CONST_VECTOR_STEPPED_P): Likewise.
      	(CONST_VECTOR_ENCODED_ELT): Likewise.
      	(const_vec_duplicate_p): Check for a duplicated vector encoding.
      	(unwrap_const_vec_duplicate): Likewise.
      	(const_vec_series_p): Check for a non-duplicated vector encoding.
      	Say that the function only returns true for integer vectors.
      	* emit-rtl.c: Include rtx-vector-builder.h.
      	(gen_const_vec_duplicate_1): Delete.
      	(gen_const_vector): Call gen_const_vec_duplicate instead of
      	gen_const_vec_duplicate_1.
      	(const_vec_series_p_1): Operate directly on the CONST_VECTOR encoding.
      	(gen_const_vec_duplicate): Use rtx_vector_builder.
      	(gen_const_vec_series): Likewise.
      	(gen_rtx_CONST_VECTOR): Likewise.
      	* config/powerpcspe/powerpcspe.c: Include rtx-vector-builder.h.
      	(swap_const_vector_halves): Take an rtx pointer rather than rtx.
      	Build a new vector rather than modifying a CONST_VECTOR in-place.
      	(handle_special_swappables): Update call accordingly.
      	* config/rs6000/rs6000-p8swap.c: Include rtx-vector-builder.h.
      	(swap_const_vector_halves): Take an rtx pointer rather than rtx.
      	Build a new vector rather than modifying a CONST_VECTOR in-place.
      	(handle_special_swappables): Update call accordingly.
      
      From-SVN: r256102
    • Use CONST_VECTOR_ELT instead of XVECEXP · 8eff75e0
      Richard Sandiford authored
      This patch replaces target-independent uses of XVECEXP with uses
      of CONST_VECTOR_ELT.  This kind of replacement isn't necessary
      for code specific to targets other than AArch64.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* simplify-rtx.c (simplify_const_binary_operation): Use
      	CONST_VECTOR_ELT instead of XVECEXP.
      
      From-SVN: r256101
    • Use ssizetype selectors for autovectorised VEC_PERM_EXPRs · b00cb3bf
      Richard Sandiford authored
      The previous patches mean that there's no reason that constant
      VEC_PERM_EXPRs need to have the same shape as the data inputs.
      This patch makes the autovectoriser use ssizetype elements instead,
      so that indices don't get truncated for large or variable-length
      vectors.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* tree-cfg.c (verify_gimple_assign_ternary): Allow the size of
      	the selector elements to be different from the data elements
      	if the selector is a VECTOR_CST.
      	* tree-vect-stmts.c (vect_gen_perm_mask_any): Use a vector of
      	ssizetype for the selector.
      
      From-SVN: r256100
    • Use vec_perm_builder::series_p in shift_amt_for_vec_perm_mask · d3867483
      Richard Sandiford authored
      This patch makes shift_amt_for_vec_perm_mask use series_p to check
      for the simple case of a natural linear series before falling back
      to testing each element individually.  The series_p test works with
      variable-length vectors but testing every individual element doesn't.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* optabs.c (shift_amt_for_vec_perm_mask): Try using series_p
      	before testing each element individually.
      	* tree-vect-generic.c (lower_vec_perm): Likewise.
      
      From-SVN: r256099
    • Rework VEC_PERM_EXPR folding · 1a1c441d
      Richard Sandiford authored
      This patch reworks the VEC_PERM_EXPR folding so that more of it
      works for variable-length vectors.  E.g. it means that we can
      now recognise variable-length permutes that reduce to a single
      vector, or cases in which a variable-length permute only needs
      one input.  There should be no functional change for fixed-length
      vectors.
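The two simplifications mentioned above (a permute that reduces to a single vector, and a permute that only reads one of its inputs) can be illustrated on a fully expanded selector. This is a simplified sketch: `all_from_input` and `is_identity` are hypothetical helpers working on a fixed-length `std::vector`, whereas the patch is described as working for variable-length vectors as well.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// True if every selector index refers to an element of input number
// `input`, where input 0 owns indices [0, nelts) and input 1 owns
// [nelts, 2 * nelts).  In that case the other input is not needed.
inline bool all_from_input(const std::vector<uint64_t>& sel,
                           uint64_t nelts_per_input, unsigned input) {
  assert(nelts_per_input > 0);
  const uint64_t lo = input * nelts_per_input;
  const uint64_t hi = lo + nelts_per_input;
  for (uint64_t idx : sel)
    if (idx < lo || idx >= hi) return false;
  return true;
}

// True if index i selects element i, i.e. the permute simply yields
// its first input unchanged.
inline bool is_identity(const std::vector<uint64_t>& sel) {
  for (uint64_t i = 0; i < sel.size(); ++i)
    if (sel[i] != i) return false;
  return true;
}
```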
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* selftest.h (selftest::vec_perm_indices_c_tests): Declare.
      	* selftest-run-tests.c (selftest::run_tests): Call it.
      	* vector-builder.h (vector_builder::operator ==): New function.
      	(vector_builder::operator !=): Likewise.
      	* vec-perm-indices.h (vec_perm_indices::series_p): Declare.
      	(vec_perm_indices::all_from_input_p): New function.
      	* vec-perm-indices.c (vec_perm_indices::series_p): Likewise.
      	(test_vec_perm_12, selftest::vec_perm_indices_c_tests): Likewise.
      	* fold-const.c (fold_ternary_loc): Use tree_to_vec_perm_builder
      	instead of reading the VECTOR_CST directly.  Detect whether both
      	vector inputs are the same before constructing the vec_perm_indices,
      	and update the number of inputs argument accordingly.  Use the
      	utility functions added above.  Only construct sel2 if we need to.
      
      From-SVN: r256098
    • Use explicit encodings for simple permutes · d980067b
      Richard Sandiford authored
      This patch makes users of vec_perm_builders use the compressed encoding
      where possible.  This means that they work with variable-length vectors.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* optabs.c (expand_vec_perm_var): Use an explicit encoding for
      	the broadcast of the low byte.
      	(expand_mult_highpart): Use an explicit encoding for the permutes.
      	* optabs-query.c (can_mult_highpart_p): Likewise.
      	* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Likewise.
      	* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
      	(vectorizable_bswap): Likewise.
      	* tree-vect-data-refs.c (vect_grouped_store_supported): Use an
      	explicit encoding for the power-of-2 permutes.
      	(vect_permute_store_chain): Likewise.
      	(vect_grouped_load_supported): Likewise.
      	(vect_permute_load_chain): Likewise.
      
      From-SVN: r256097
    • Add a vec_perm_indices_to_tree helper function · 736d0f28
      Richard Sandiford authored
      This patch adds a function for creating a VECTOR_CST from a
      vec_perm_indices, operating directly on the encoding.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* vec-perm-indices.h (vec_perm_indices_to_tree): Declare.
      	* vec-perm-indices.c (vec_perm_indices_to_tree): New function.
      	* tree-ssa-forwprop.c (simplify_vector_constructor): Use it.
      	* tree-vect-slp.c (vect_transform_slp_perm_load): Likewise.
      	* tree-vect-stmts.c (vectorizable_bswap): Likewise.
      	(vect_gen_perm_mask_any): Likewise.
      
      From-SVN: r256096
    • Make vec_perm_indices use new vector encoding · e3342de4
      Richard Sandiford authored
      This patch changes vec_perm_indices from a plain vec<> to a class
      that stores a canonicalized permutation, using the same encoding
      as for VECTOR_CSTs.  This means that vec_perm_indices now carries
      information about the number of vectors being permuted (currently
      always 1 or 2) and the number of elements in each input vector.
      
      A new vec_perm_builder class is used to actually build up the vector,
      like tree_vector_builder does for trees.  vec_perm_indices is the
      completed representation, a bit like VECTOR_CST is for trees.
      
      The patch just does a mechanical conversion of the code to
      vec_perm_builder: a later patch uses explicit encodings where possible.
      
      The point of all this is that it makes the representation suitable
      for variable-length vectors.  It's no longer necessary for the
      underlying vec<>s to store every element explicitly.
      
      In int-vector-builder.h, "using the same encoding as tree and rtx constants"
      describes the endpoint -- adding the rtx encoding comes later.
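To show why an encoded vector need not store every element, here is a minimal decoder for a pattern-based encoding. The commit message does not spell the scheme out, so the layout below is an assumption: the vector is treated as `npatterns` interleaved patterns, each described by 1, 2 or 3 leading elements (a duplicate, a first element followed by a repeated value, or a linear series). `encoded_vector` and `element` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed layout: encoded holds npatterns * nelts_per_pattern values,
// interleaved in the same order as the full vector's leading elements.
struct encoded_vector {
  unsigned npatterns;
  unsigned nelts_per_pattern;  // 1, 2 or 3
  std::vector<int64_t> encoded;
};

// Compute element i of the full vector without materialising it.
inline int64_t element(const encoded_vector& v, uint64_t i) {
  assert(v.npatterns > 0);
  assert(v.nelts_per_pattern >= 1 && v.nelts_per_pattern <= 3);
  assert(v.encoded.size() ==
         static_cast<uint64_t>(v.npatterns) * v.nelts_per_pattern);
  if (i < v.encoded.size()) return v.encoded[i];  // explicitly stored

  const uint64_t pattern = i % v.npatterns;  // which interleaved pattern
  const uint64_t count = i / v.npatterns;    // position within that pattern
  const uint64_t final_idx =
      static_cast<uint64_t>(v.nelts_per_pattern - 1) * v.npatterns + pattern;
  const int64_t last = v.encoded[final_idx];
  if (v.nelts_per_pattern <= 2) return last;  // last value repeats forever

  // Three elements per pattern: continue the series with a constant step.
  const int64_t prev = v.encoded[final_idx - v.npatterns];
  return last + static_cast<int64_t>(count - 2) * (last - prev);
}
```

Under this layout, `{1, 3, {0, 1, 2}}` stands for the series 0, 1, 2, 3, ... of any length, and `{2, 3, {0, 8, 1, 9, 2, 10}}` stands for the interleave 0, 8, 1, 9, 2, 10, 3, 11, ..., which is why the same description can serve a vector whose length is not known at compile time.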
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* int-vector-builder.h: New file.
      	* vec-perm-indices.h: Include int-vector-builder.h.
      	(vec_perm_indices): Redefine as an int_vector_builder.
      	(auto_vec_perm_indices): Delete.
      	(vec_perm_builder): Redefine as a stand-alone class.
      	(vec_perm_indices::vec_perm_indices): New function.
      	(vec_perm_indices::clamp): Likewise.
      	* vec-perm-indices.c: Include fold-const.h and tree-vector-builder.h.
      	(vec_perm_indices::new_vector): New function.
      	(vec_perm_indices::new_expanded_vector): Update for new
      	vec_perm_indices class.
      	(vec_perm_indices::rotate_inputs): New function.
      	(vec_perm_indices::all_in_range_p): Operate directly on the
      	encoded form, without computing elided elements.
      	(tree_to_vec_perm_builder): Operate directly on the VECTOR_CST
      	encoding.  Update for new vec_perm_indices class.
      	* optabs.c (expand_vec_perm_const): Create a vec_perm_indices for
      	the given vec_perm_builder.
      	(expand_vec_perm_var): Update vec_perm_builder constructor.
      	(expand_mult_highpart): Use vec_perm_builder instead of
      	auto_vec_perm_indices.
      	* optabs-query.c (can_mult_highpart_p): Use vec_perm_builder and
      	vec_perm_indices instead of auto_vec_perm_indices.  Use a single
      	or double series encoding as appropriate.
      	* fold-const.c (fold_ternary_loc): Use vec_perm_builder and
      	vec_perm_indices instead of auto_vec_perm_indices.
      	* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
      	* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
      	(vect_permute_store_chain): Likewise.
      	(vect_grouped_load_supported): Likewise.
      	(vect_permute_load_chain): Likewise.
      	(vect_shift_permute_load_chain): Likewise.
      	* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
      	(vect_transform_slp_perm_load): Likewise.
      	(vect_schedule_slp_instance): Likewise.
      	* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
      	(vectorizable_mask_load_store): Likewise.
      	(vectorizable_bswap): Likewise.
      	(vectorizable_store): Likewise.
      	(vectorizable_load): Likewise.
      	* tree-vect-generic.c (lower_vec_perm): Use vec_perm_builder and
      	vec_perm_indices instead of auto_vec_perm_indices.  Use
      	tree_to_vec_perm_builder to read the vector from a tree.
      	* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Take a
      	vec_perm_builder instead of a vec_perm_indices.
      	(have_whole_vector_shift): Use vec_perm_builder and
      	vec_perm_indices instead of auto_vec_perm_indices.  Leave the
      	truncation to calc_vec_perm_mask_for_shift.
      	(vect_create_epilog_for_reduction): Likewise.
      	* config/aarch64/aarch64.c (expand_vec_perm_d::perm): Change
      	from auto_vec_perm_indices to vec_perm_indices.
      	(aarch64_expand_vec_perm_const_1): Use rotate_inputs on d.perm
      	instead of changing individual elements.
      	(aarch64_vectorize_vec_perm_const): Use new_vector to install
      	the vector in d.perm.
      	* config/arm/arm.c (expand_vec_perm_d::perm): Change
      	from auto_vec_perm_indices to vec_perm_indices.
      	(arm_expand_vec_perm_const_1): Use rotate_inputs on d.perm
      	instead of changing individual elements.
      	(arm_vectorize_vec_perm_const): Use new_vector to install
      	the vector in d.perm.
      	* config/powerpcspe/powerpcspe.c (rs6000_expand_extract_even):
      	Update vec_perm_builder constructor.
      	(rs6000_expand_interleave): Likewise.
      	* config/rs6000/rs6000.c (rs6000_expand_extract_even): Likewise.
      	(rs6000_expand_interleave): Likewise.
      
      From-SVN: r256095
      e3342de4
    • Richard Sandiford's avatar
      Check whether a vector of QIs can store all indices · 6da64f1b
      Richard Sandiford authored
      The patch to remove the vec_perm_const optab checked whether replacing
      a constant permute with a variable permute is safe, or whether it might
      truncate the indices.  This patch adds a corresponding check for whether
      variable permutes can be lowered to QImode-based permutes.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* optabs-query.c (can_vec_perm_var_p): Check whether lowering
      	to qimode could truncate the indices.
      	* optabs.c (expand_vec_perm_var): Likewise.
      
      From-SVN: r256094
      6da64f1b
    • Richard Sandiford's avatar
      Remove vec_perm_const optab · f151c9e1
      Richard Sandiford authored
      One of the changes needed for variable-length VEC_PERM_EXPRs -- and for
      long fixed-length VEC_PERM_EXPRs -- is the ability to use constant
      selectors that wouldn't fit in the vectors being permuted.  E.g. a
      permute on two V256QIs can't be done using a V256QI selector.
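
      To make the V256QI example concrete: a two-input permute on
      256-element vectors needs indices 0 to 511, and those do not fit
      in an 8-bit selector element.  The following is a minimal sketch
      of that arithmetic only.  indices_fit_in_element_p is a
      hypothetical helper written for this illustration; it is not
      GCC's selector_fits_mode_p and makes no claim about how that
      function is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only (not GCC's selector_fits_mode_p).
// Returns true if every index of a two-input permute on vectors of NELTS
// elements can be stored in an unsigned selector element of ELT_BITS bits.
bool indices_fit_in_element_p (uint64_t nelts, unsigned elt_bits)
{
  assert (nelts > 0 && elt_bits > 0);
  if (elt_bits >= 64)
    return true;
  // A two-input permute selects from 2 * NELTS elements,
  // so the largest index is 2 * NELTS - 1.
  uint64_t max_index = 2 * nelts - 1;
  return max_index < (uint64_t (1) << elt_bits);
}
```

      Under these assumptions, 256 elements with 8-bit selector
      elements fail the check, while 16-bit selector elements pass.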
      
      At the moment constant permutes use two interfaces:
      targetm.vectorizer.vec_perm_const_ok for testing whether a permute is
      valid and the vec_perm_const optab for actually emitting the permute.
      The former gets passed a vec<> selector and the latter an rtx selector.
      Most ports share a lot of code between the hook and the optab, with a
      wrapper function for each interface.
      
      We could try to keep that interface and require ports to define wider
      vector modes that could be attached to the CONST_VECTOR (e.g. V256HI or
      V256SI in the example above).  But building a CONST_VECTOR rtx seems a bit
      pointless here, since the expand code only creates the CONST_VECTOR in
      order to call the optab, and the first thing the target does is take
      the CONST_VECTOR apart again.
      
      The easiest approach therefore seemed to be to remove the optab and
      reuse the target hook to emit the code.  One potential drawback is that
      it's no longer possible to use match_operand predicates to force
      operands into the required form, but in practice all targets want
      register operands anyway.
      
      The patch also changes vec_perm_indices into a class that provides
      some simple routines for handling permutations.  A later patch will
      flesh this out and get rid of auto_vec_perm_indices, but I didn't
      want to do all that in this patch and make it more complicated than
      it already is.
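
      As a rough idea of what "simple routines for handling
      permutations" could look like, here is a small sketch of a
      selector class for two-input permutes.  Everything in it
      (the class name perm_selector_sketch, its members and its
      wrapping behavior) is an assumption made for illustration and
      is not taken from vec-perm-indices.h.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch of a permutation-selector class; the names and
// behavior here are assumptions, not the contents of vec-perm-indices.h.
class perm_selector_sketch
{
public:
  perm_selector_sketch (std::vector<uint64_t> indices, uint64_t nelts_per_input)
    : m_indices (std::move (indices)), m_nelts_per_input (nelts_per_input)
  {
    assert (nelts_per_input > 0);
    // Wrap every index into the range [0, 2 * nelts_per_input).
    for (uint64_t &i : m_indices)
      i %= 2 * m_nelts_per_input;
  }

  // True if every selected element comes from input INPUT (0 or 1).
  bool all_from_input_p (unsigned input) const
  {
    for (uint64_t i : m_indices)
      if (i / m_nelts_per_input != input)
        return false;
    return true;
  }

  // Swap the roles of the two inputs: an index that referred to the first
  // input now refers to the same element of the second, and vice versa.
  void swap_inputs ()
  {
    for (uint64_t &i : m_indices)
      i = (i + m_nelts_per_input) % (2 * m_nelts_per_input);
  }

  const std::vector<uint64_t> &indices () const { return m_indices; }

private:
  std::vector<uint64_t> m_indices;
  uint64_t m_nelts_per_input;
};
```

      Keeping queries such as all_from_input_p next to the data means
      callers do not have to re-derive them from a raw index array.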
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* Makefile.in (OBJS): Add vec-perm-indices.o.
      	* vec-perm-indices.h: New file.
      	* vec-perm-indices.c: Likewise.
      	* target.h (vec_perm_indices): Replace with a forward class
      	declaration.
      	(auto_vec_perm_indices): Move to vec-perm-indices.h.
      	* optabs.h: Include vec-perm-indices.h.
      	(expand_vec_perm): Delete.
      	(selector_fits_mode_p, expand_vec_perm_var): Declare.
      	(expand_vec_perm_const): Declare.
      	* target.def (vec_perm_const_ok): Replace with...
      	(vec_perm_const): ...this new hook.
      	* doc/tm.texi.in (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Replace with...
      	(TARGET_VECTORIZE_VEC_PERM_CONST): ...this new hook.
      	* doc/tm.texi: Regenerate.
      	* optabs.def (vec_perm_const): Delete.
      	* doc/md.texi (vec_perm_const): Likewise.
      	(vec_perm): Refer to TARGET_VECTORIZE_VEC_PERM_CONST.
      	* expr.c (expand_expr_real_2): Use expand_vec_perm_const rather than
      	expand_vec_perm for constant permutation vectors.  Assert that
      	the mode of variable permutation vectors is the integer equivalent
      	of the mode that is being permuted.
      	* optabs-query.h (selector_fits_mode_p): Declare.
      	* optabs-query.c: Include vec-perm-indices.h.
      	(selector_fits_mode_p): New function.
      	(can_vec_perm_const_p): Check whether targetm.vectorize.vec_perm_const
      	is defined, instead of checking whether the vec_perm_const_optab
      	exists.  Use targetm.vectorize.vec_perm_const instead of
      	targetm.vectorize.vec_perm_const_ok.  Check whether the indices
      	fit in the vector mode before using a variable permute.
      	* optabs.c (shift_amt_for_vec_perm_mask): Take a mode and a
      	vec_perm_indices instead of an rtx.
      	(expand_vec_perm): Replace with...
      	(expand_vec_perm_const): ...this new function.  Take the selector
      	as a vec_perm_indices rather than an rtx.  Also take the mode of
      	the selector.  Update call to shift_amt_for_vec_perm_mask.
      	Use targetm.vectorize.vec_perm_const instead of vec_perm_const_optab.
      	Use vec_perm_indices::new_expanded_vector to expand the original
      	selector into bytes.  Check whether the indices fit in the vector
      	mode before using a variable permute.
      	(expand_vec_perm_var): Make global.
      	(expand_mult_highpart): Use expand_vec_perm_const.
      	* fold-const.c: Include vec-perm-indices.h.
      	* tree-ssa-forwprop.c: Likewise.
      	* tree-vect-data-refs.c: Likewise.
      	* tree-vect-generic.c: Likewise.
      	* tree-vect-loop.c: Likewise.
      	* tree-vect-slp.c: Likewise.
      	* tree-vect-stmts.c: Likewise.
      	* config/aarch64/aarch64-protos.h (aarch64_expand_vec_perm_const):
      	Delete.
      	* config/aarch64/aarch64-simd.md (vec_perm_const<mode>): Delete.
      	* config/aarch64/aarch64.c (aarch64_expand_vec_perm_const)
      	(aarch64_vectorize_vec_perm_const_ok): Fuse into...
      	(aarch64_vectorize_vec_perm_const): ...this new function.
      	(TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
      	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
      	* config/arm/arm-protos.h (arm_expand_vec_perm_const): Delete.
      	* config/arm/vec-common.md (vec_perm_const<mode>): Delete.
      	* config/arm/arm.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
      	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
      	(arm_expand_vec_perm_const, arm_vectorize_vec_perm_const_ok): Merge
      	into...
      	(arm_vectorize_vec_perm_const): ...this new function.  Explicitly
      	check for NEON modes.
      	* config/i386/i386-protos.h (ix86_expand_vec_perm_const): Delete.
      	* config/i386/sse.md (VEC_PERM_CONST, vec_perm_const<mode>): Delete.
      	* config/i386/i386.c (ix86_expand_vec_perm_const_1): Update comment.
      	(ix86_expand_vec_perm_const, ix86_vectorize_vec_perm_const_ok): Merge
      	into...
      	(ix86_vectorize_vec_perm_const): ...this new function.  Incorporate
      	the old VEC_PERM_CONST conditions.
      	* config/ia64/ia64-protos.h (ia64_expand_vec_perm_const): Delete.
      	* config/ia64/vect.md (vec_perm_const<mode>): Delete.
      	* config/ia64/ia64.c (ia64_expand_vec_perm_const)
      	(ia64_vectorize_vec_perm_const_ok): Merge into...
      	(ia64_vectorize_vec_perm_const): ...this new function.
      	* config/mips/loongson.md (vec_perm_const<mode>): Delete.
      	* config/mips/mips-msa.md (vec_perm_const<mode>): Delete.
      	* config/mips/mips-ps-3d.md (vec_perm_constv2sf): Delete.
      	* config/mips/mips-protos.h (mips_expand_vec_perm_const): Delete.
      	* config/mips/mips.c (mips_expand_vec_perm_const)
      	(mips_vectorize_vec_perm_const_ok): Merge into...
      	(mips_vectorize_vec_perm_const): ...this new function.
      	* config/powerpcspe/altivec.md (vec_perm_constv16qi): Delete.
      	* config/powerpcspe/paired.md (vec_perm_constv2sf): Delete.
      	* config/powerpcspe/spe.md (vec_perm_constv2si): Delete.
      	* config/powerpcspe/vsx.md (vec_perm_const<mode>): Delete.
      	* config/powerpcspe/powerpcspe-protos.h (altivec_expand_vec_perm_const)
      	(rs6000_expand_vec_perm_const): Delete.
      	* config/powerpcspe/powerpcspe.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK):
      	Delete.
      	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
      	(altivec_expand_vec_perm_const_le): Take each operand individually.
      	Operate on constant selectors rather than rtxes.
      	(altivec_expand_vec_perm_const): Likewise.  Update call to
      	altivec_expand_vec_perm_const_le.
      	(rs6000_expand_vec_perm_const): Delete.
      	(rs6000_vectorize_vec_perm_const_ok): Delete.
      	(rs6000_vectorize_vec_perm_const): New function.
      	(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
      	an element count and rtx array.
      	(rs6000_expand_extract_even): Update call accordingly.
      	(rs6000_expand_interleave): Likewise.
      	* config/rs6000/altivec.md (vec_perm_constv16qi): Delete.
      	* config/rs6000/paired.md (vec_perm_constv2sf): Delete.
      	* config/rs6000/vsx.md (vec_perm_const<mode>): Delete.
      	* config/rs6000/rs6000-protos.h (altivec_expand_vec_perm_const)
      	(rs6000_expand_vec_perm_const): Delete.
      	* config/rs6000/rs6000.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
      	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
      	(altivec_expand_vec_perm_const_le): Take each operand individually.
      	Operate on constant selectors rather than rtxes.
      	(altivec_expand_vec_perm_const): Likewise.  Update call to
      	altivec_expand_vec_perm_const_le.
      	(rs6000_expand_vec_perm_const): Delete.
      	(rs6000_vectorize_vec_perm_const_ok): Delete.
      	(rs6000_vectorize_vec_perm_const): New function.  Remove stray
      	reference to the SPE evmerge instructions.
      	(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
      	an element count and rtx array.
      	(rs6000_expand_extract_even): Update call accordingly.
      	(rs6000_expand_interleave): Likewise.
      	* config/sparc/sparc.md (vec_perm_constv8qi): Delete in favor of...
      	* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): ...this
      	new function.
      	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
      
      From-SVN: r256093
      f151c9e1
    • Richard Sandiford's avatar
      Refactor expand_vec_perm · 279b8057
      Richard Sandiford authored
      This patch splits the variable handling out of expand_vec_perm into
      a subroutine, so that the next patch can use a different interface
      for expanding constant permutes.  expand_vec_perm now does all the
      CONST_VECTOR handling directly and defers to expand_vec_perm_var
      for other rtx codes.  Handling CONST_VECTORs includes handling the
      fallback to variable permutes.
      
      The patch also adds an assert for valid optab modes to expand_vec_perm_1,
      so that we get it when using optabs for CONST_VECTORs.  The MODE_VECTOR_INT
      part was previously in expand_vec_perm and the mode_for_int_vector part
      is new.
      
      Most of the patch is just reindentation.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* optabs.c (expand_vec_perm_1): Assert that SEL has an integer
      	vector mode and that that mode matches the mode of the data
      	being permuted.
      	(expand_vec_perm): Split handling of non-CONST_VECTOR selectors
      	out into expand_vec_perm_var.  Do all CONST_VECTOR handling here,
      	directly using expand_vec_perm_1 when forcing selectors into
      	registers.
      	(expand_vec_perm_var): New function, split out from expand_vec_perm.
      
      From-SVN: r256092
      279b8057
    • Richard Sandiford's avatar
      Split can_vec_perm_p into can_vec_perm_{var,const}_p · 7ac7e286
      Richard Sandiford authored
      This patch splits can_vec_perm_p into two functions: can_vec_perm_var_p
      for testing permute operations with variable selection vectors, and
      can_vec_perm_const_p for testing permute operations with specific
      constant selection vectors.  This means that we can pass the constant
      selection vector by reference.
      
      Constant permutes can still use a variable permute as a fallback.
      A later patch adds a check to make sure that we don't truncate the
      vector indices when doing this.
      
      However, have_whole_vector_shift checked:
      
        if (direct_optab_handler (vec_perm_const_optab, mode) == CODE_FOR_nothing)
          return false;
      
      which had the effect of disallowing the fallback to variable permutes.
      I'm not sure whether that was the intention or whether it was just
      supposed to short-cut the loop on targets that don't support permutes.
      (But then why bother?  The first check in the loop would fail and
      we'd bail out straightaway.)
      
      The patch adds a parameter for disallowing the fallback.  I think it
      makes sense to do this for the following code in the VEC_PERM_EXPR
      folder:
      
      	  /* Some targets are deficient and fail to expand a single
      	     argument permutation while still allowing an equivalent
      	     2-argument version.  */
      	  if (need_mask_canon && arg2 == op2
      	      && !can_vec_perm_p (TYPE_MODE (type), false, &sel)
      	      && can_vec_perm_p (TYPE_MODE (type), false, &sel2))
      
      since it's really testing whether the expand_vec_perm_const code expects
      a particular form.
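
      The split interface and the fallback parameter described above
      can be sketched as follows.  This is only a model of the call
      shape: target_sketch, can_perm_var_sketch and
      can_perm_const_sketch are hypothetical names, and the real
      functions take a machine mode and query the target rather than
      a struct like this one.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using selector = std::vector<unsigned>;

// Hypothetical description of a target, for illustration only.
struct target_sketch
{
  // Whether the target can permute with a selector held in a register.
  bool has_variable_permute;
  // Whether the target can directly expand a given constant selector.
  std::function<bool (const selector &)> has_constant_permute;
};

// Model of the "variable selector" query: no selector is needed.
bool can_perm_var_sketch (const target_sketch &target)
{
  return target.has_variable_permute;
}

// Model of the "constant selector" query.  The selector is passed by
// reference, and ALLOW_VARIABLE_P controls whether a variable permute
// may be used as a fallback when the constant form is rejected.
bool can_perm_const_sketch (const target_sketch &target, const selector &sel,
                            bool allow_variable_p = true)
{
  if (target.has_constant_permute && target.has_constant_permute (sel))
    return true;
  return allow_variable_p && can_perm_var_sketch (target);
}
```

      With a target that has a variable permute but rejects a given
      constant selector, the default call succeeds through the
      fallback, while passing false as the third argument reports
      only what the constant expansion itself accepts.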
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* optabs-query.h (can_vec_perm_p): Delete.
      	(can_vec_perm_var_p, can_vec_perm_const_p): Declare.
      	* optabs-query.c (can_vec_perm_p): Split into...
      	(can_vec_perm_var_p, can_vec_perm_const_p): ...these two functions.
      	(can_mult_highpart_p): Use can_vec_perm_const_p to test whether a
      	particular selector is valid.
      	* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
      	* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
      	(vect_grouped_load_supported): Likewise.
      	(vect_shift_permute_load_chain): Likewise.
      	* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
      	(vect_transform_slp_perm_load): Likewise.
      	* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
      	(vectorizable_bswap): Likewise.
      	(vect_gen_perm_mask_checked): Likewise.
      	* fold-const.c (fold_ternary_loc): Likewise.  Don't take
      	implementations of variable permutation vectors into account
      	when deciding which selector to use.
      	* tree-vect-loop.c (have_whole_vector_shift): Don't check whether
      	vec_perm_const_optab is supported; instead use can_vec_perm_const_p
      	with a false third argument.
      	* tree-vect-generic.c (lower_vec_perm): Use can_vec_perm_const_p
      	to test whether the constant selector is valid and can_vec_perm_var_p
      	to test whether a variable selector is valid.
      
      From-SVN: r256091
      7ac7e286
    • Richard Sandiford's avatar
      Pass vec_perm_indices by reference · 4aae3cb3
      Richard Sandiford authored
      This patch makes functions take vec_perm_indices by reference rather
      than value, since a later patch will turn vec_perm_indices into a class
      that would be more expensive to copy.
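
      The cost difference can be seen with a small stand-in type that
      counts its own copies.  counted_selector and the two accessor
      functions are hypothetical and exist only for this
      illustration; they are not part of GCC.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a selector class that is costly to copy.
struct counted_selector
{
  std::vector<unsigned> indices;
  static int copies;

  explicit counted_selector (std::vector<unsigned> v)
    : indices (std::move (v)) {}

  // Count every copy so that the two calling conventions can be compared.
  counted_selector (const counted_selector &other)
    : indices (other.indices) { ++copies; }
};
int counted_selector::copies = 0;

// Taking the selector by value copies the whole index vector on each call.
unsigned first_by_value (counted_selector sel)
{
  assert (!sel.indices.empty ());
  return sel.indices[0];
}

// Taking it by const reference reads the caller's object without copying.
unsigned first_by_reference (const counted_selector &sel)
{
  assert (!sel.indices.empty ());
  return sel.indices[0];
}
```

      Calling first_by_reference on an existing object leaves the
      copy counter unchanged, whereas first_by_value increments it
      once per call.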
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* optabs-query.h (can_vec_perm_p): Take a const vec_perm_indices *.
      	* optabs-query.c (can_vec_perm_p): Likewise.
      	* fold-const.c (fold_vec_perm): Take a const vec_perm_indices &
      	instead of vec_perm_indices.
      	* tree-vectorizer.h (vect_gen_perm_mask_any): Likewise.
      	(vect_gen_perm_mask_checked): Likewise.
      	* tree-vect-stmts.c (vect_gen_perm_mask_any): Likewise.
      	(vect_gen_perm_mask_checked): Likewise.
      
      From-SVN: r256090
      4aae3cb3
    • Richard Sandiford's avatar
      The vec_perm code falls back to doing byte-level permutes if element-level... · 3ea109a3
      Richard Sandiford authored
      qimode_for_vec_perm
      
      The vec_perm code falls back to doing byte-level permutes if
      element-level permutes aren't supported.  There were two copies
      of the code to calculate the mode, and later patches add another,
      so this patch splits it out into a helper function.
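
      To illustrate what a byte-level permute involves, the sketch
      below rewrites an element-level selector as a selector over
      bytes and checks whether the result still fits in 8-bit
      indices.  It ignores endianness, and both helper names
      (expand_selector_to_bytes and byte_selector_fits_qi_p) are
      hypothetical; this is not the code in optabs.c or
      optabs-query.c.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: expand an element-level selector into a byte-level
// one.  Element index I with ELT_BYTES bytes per element becomes the byte
// indices I * ELT_BYTES, I * ELT_BYTES + 1, ..., I * ELT_BYTES + ELT_BYTES - 1.
std::vector<uint64_t>
expand_selector_to_bytes (const std::vector<uint64_t> &sel, unsigned elt_bytes)
{
  assert (elt_bytes > 0);
  std::vector<uint64_t> bytes;
  bytes.reserve (sel.size () * elt_bytes);
  for (uint64_t index : sel)
    for (unsigned j = 0; j < elt_bytes; ++j)
      bytes.push_back (index * elt_bytes + j);
  return bytes;
}

// True if every byte index fits in an unsigned 8-bit selector element.
bool byte_selector_fits_qi_p (const std::vector<uint64_t> &bytes)
{
  for (uint64_t b : bytes)
    if (b > 0xff)
      return false;
  return true;
}
```

      For example, the two-element selector { 1, 0 } on 4-byte
      elements becomes the byte selector { 4, 5, 6, 7, 0, 1, 2, 3 }.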
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* optabs-query.h (qimode_for_vec_perm): Declare.
      	* optabs-query.c (can_vec_perm_p): Split out qimode search to...
      	(qimode_for_vec_perm): ...this new function.
      	* optabs.c (expand_vec_perm): Use qimode_for_vec_perm.
      
      From-SVN: r256089
      3ea109a3
    • Thomas Koenig's avatar
      re PR fortran/45689 ([F03] Missing transformational intrinsic in the trans_func_f2003 list) · a1d6c052
      Thomas Koenig authored
      2018-01-02  Thomas Koenig  <tkoenig@gcc.gnu.org>
      
      	PR fortran/45689
      	* intrinsic.c (add_function): Add gfc_simplify_maxloc and
      	gfc_simplify_minloc to maxloc and minloc, respectively.
      	* intrinsic.h: Add prototypes for gfc_simplify_minloc
      	and gfc_simplify_maxloc.
      	* simplify.c (min_max_chose): Adjust prototype.  Modify function
      	to have a return value which indicates if the extremum was found.
      	(is_constant_array_expr): Fix typo in comment.
      	(simplify_minmaxloc_to_scalar): New function.
      	(simplify_minmaxloc_nodim): New function.
      	(new_array): New function.
      	(simplify_minmaxloc_to_array): New function.
      	(gfc_simplify_minmaxloc): New function.
      	(simplify_minloc): New function.
      	(simplify_maxloc): New function.
      
      2018-01-02  Thomas Koenig  <tkoenig@gcc.gnu.org>
      
      	PR fortran/45689
      	* gfortran.dg/minloc_4.f90: New test case.
      	* gfortran.dg/maxloc_4.f90: New test case.
      
      From-SVN: r256088
      a1d6c052
    • Jakub Jelinek's avatar
      re PR c++/83556 (ICE in gimplify_expr, at gimplify.c:12004) · 0a552ae2
      Jakub Jelinek authored
      	PR c++/83556
      	* tree.c (replace_placeholders_r): Pass NULL as last argument to
      	cp_walk_tree instead of d->pset.  If non-TREE_CONSTANT and
      	non-PLACEHOLDER_EXPR tree has been seen already, set *walk_subtrees
      	to false and return.
      	(replace_placeholders): Pass NULL instead of &pset as last argument
      	to cp_walk_tree.
      
      	* g++.dg/cpp0x/pr83556.C: New test.
      
      From-SVN: r256086
      0a552ae2
    • Thomas Koenig's avatar
      re PR fortran/45689 ([F03] Missing transformational intrinsic in the trans_func_f2003 list) · a9ec0cfc
      Thomas Koenig authored
      2018-01-02  Thomas Koenig  <tkoenig@gcc.gnu.org>
      
      	PR fortran/45689
      	PR fortran/83650
      	* simplify.c (gfc_simplify_cshift): Re-implement to allow full
      	range of arguments.
      
      2018-01-02  Thomas Koenig  <tkoenig@gcc.gnu.org>
      
      	PR fortran/45689
      	PR fortran/83650
      	* gfortran.dg/simplify_cshift_1.f90: Correct erroneous case.
      	* gfortran.dg/simplify_cshift_4.f90: New test.
      
      From-SVN: r256084
      a9ec0cfc
    • Aaron Sawdey's avatar
      Add missing changelog entry: · 7616c40b
      Aaron Sawdey authored
      2017-12-12  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>
      
              PR target/82190
              * config/rs6000/rs6000-string.c (expand_block_compare,
              expand_strn_compare): Fix set_mem_size() calls.
      
      From-SVN: r256083
      7616c40b
    • Marek Polacek's avatar
      re PR c++/83644 (ICE using type alias from recursive decltype in noexcept or return type) · dd2ce397
      Marek Polacek authored
      	PR c++/83644
      	* g++.dg/cpp1z/pr83644.C: New test.
      
      From-SVN: r256082
      dd2ce397
    • Aaron Sawdey's avatar
      rtlanal.c (canonicalize_condition): Return 0 if final rtx does not have a conditional at the top. · e698996f
      Aaron Sawdey authored
      2018-01-02  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>
      
              * rtlanal.c (canonicalize_condition): Return 0 if final rtx
              does not have a conditional at the top.
      
      Forgot this changelog entry.
      
      From-SVN: r256081
      e698996f
    • Aaron Sawdey's avatar
      rtlanal.c (canonicalize_condition): Return 0 if final rtx does not have a conditional at the top. · 6aff9af1
      Aaron Sawdey authored
              * rtlanal.c (canonicalize_condition): Return 0 if final rtx
              does not have a conditional at the top.
      
      From-SVN: r256079
      6aff9af1
    • Marek Polacek's avatar
      re PR c++/81860 (Call to undefined inline function involving inheriting constructors) · 6ff9491a
      Marek Polacek authored
      	PR c++/81860
      	* g++.dg/cpp0x/inh-ctor30.C: New test.
      
      From-SVN: r256076
      6ff9491a