Commit ea66f57c authored by Richard Sandiford

aarch64: Add mf8 data movement intrinsics


This patch adds mf8 variants of what I'll loosely call the existing
"data movement" intrinsics, including the recent FEAT_LUT ones.
I think this completes the FP8 intrinsic definitions.

The new intrinsics are defined entirely in the compiler.  This should
make it easy to move the existing non-mf8 variants into the compiler
as well, but that's too invasive for stage 3 and so is left to GCC 16.

I wondered about trying to reduce the cut-&-paste in the .def file,
but in the end decided against it.  I have a plan for specifying this
information in a different format, but again that would need to wait
until GCC 16.

The patch includes some support for gimple folding.  I initially
tested the patch without it, so that all the rtl expansion code
was exercised.

vlut.c fails for all types with big-endian ILP32, but that's
for a later patch.

gcc/
	* config/aarch64/aarch64.md (UNSPEC_BSL, UNSPEC_COMBINE, UNSPEC_DUP)
	(UNSPEC_DUP_LANE, UNSPEC_GET_LANE, UNSPEC_LD1_DUP, UNSPEC_LD1x2)
	(UNSPEC_LD1x3, UNSPEC_LD1x4, UNSPEC_SET_LANE, UNSPEC_ST1_LANE)
	(UNSPEC_ST1x2, UNSPEC_ST1x3, UNSPEC_ST1x4, UNSPEC_VCREATE)
	(UNSPEC_VEC_COPY): New unspecs.
	* config/aarch64/iterators.md (UNSPEC_TBL): Likewise.
	* config/aarch64/aarch64-simd-pragma-builtins.def: Add definitions
	of the mf8 data movement intrinsics.
	* config/aarch64/aarch64-protos.h
	(aarch64_advsimd_vector_array_mode): Declare.
	* config/aarch64/aarch64.cc
	(aarch64_advsimd_vector_array_mode): Make public.
	* config/aarch64/aarch64-builtins.h (qualifier_const_pointer): New
	aarch64_type_qualifiers member.
	* config/aarch64/aarch64-builtins.cc (AARCH64_SIMD_VGET_LOW_BUILTINS)
	(AARCH64_SIMD_VGET_HIGH_BUILTINS): Add mf8 variants.
	(aarch64_int_or_fp_type): Handle qualifier_modal_float.
	(aarch64_num_lanes): New function.
	(binary_two_lanes, load, load_lane, store, store_lane): New signatures.
	(unary_lane): Likewise.
	(simd_type::nunits): New member function.
	(simd_types): Add pointer types.
	(aarch64_fntype): Handle the new signatures.
	(require_immediate_lane_index): Use aarch64_num_lanes.
	(aarch64_pragma_builtins_checker::check): Handle the new intrinsics.
	(aarch64_convert_address, aarch64_dereference_pointer)
	(aarch64_canonicalize_lane, aarch64_convert_to_lane_mask)
	(aarch64_pack_into_v128s, aarch64_expand_permute_pair)
	(aarch64_expand_tbl_tbx): New functions.
	(aarch64_expand_pragma_builtin): Handle the new intrinsics.
	(aarch64_force_gimple_val, aarch64_copy_vops, aarch64_fold_to_val)
	(aarch64_dereference, aarch64_get_lane_bit_index, aarch64_get_lane)
	(aarch64_set_lane, aarch64_fold_combine, aarch64_fold_load)
	(aarch64_fold_store, aarch64_ext_index, aarch64_rev_index)
	(aarch64_trn_index, aarch64_uzp_index, aarch64_zip_index)
	(aarch64_fold_permute): New functions, some split out from
	aarch64_general_gimple_fold_builtin.
	(aarch64_gimple_fold_pragma_builtin): New function.
	(aarch64_general_gimple_fold_builtin): Use the new functions above.
	* config/aarch64/aarch64-simd.md (aarch64_dup_lane<mode>)
	(aarch64_dup_lane_<vswap_width_name><mode>): Add "@" to name.
	(aarch64_simd_vec_set<mode>): Likewise.
	(*aarch64_simd_vec_copy_lane_<vswap_width_name><mode>): Likewise.
	(aarch64_simd_bsl<mode>): Likewise.
	(aarch64_combine<mode>): Likewise.
	(aarch64_cm<optab><mode><vczle><vczbe>): Likewise.
	(aarch64_simd_ld2r<vstruct_elt>): Likewise.
	(aarch64_vec_load_lanes<mode>_lane<vstruct_elt>): Likewise.
	(aarch64_simd_ld3r<vstruct_elt>): Likewise.
	(aarch64_simd_ld4r<vstruct_elt>): Likewise.
	(aarch64_ld1x3<vstruct_elt>): Likewise.
	(aarch64_ld1x4<vstruct_elt>): Likewise.
	(aarch64_st1x2<vstruct_elt>): Likewise.
	(aarch64_st1x3<vstruct_elt>): Likewise.
	(aarch64_st1x4<vstruct_elt>): Likewise.
	(aarch64_ld<nregs><vstruct_elt>): Likewise.
	(aarch64_ld1<VALL_F16:mode>): Likewise.
	(aarch64_ld1x2<vstruct_elt>): Likewise.
	(aarch64_ld<nregs>_lane<vstruct_elt>): Likewise.
	(aarch64_<PERMUTE:perm_insn><mode><vczle><vczbe>): Likewise.
	(aarch64_ext<mode>): Likewise.
	(aarch64_rev<REVERSE:rev_op><mode><vczle><vczbe>): Likewise.
	(aarch64_st<nregs><vstruct_elt>): Likewise.
	(aarch64_st<nregs>_lane<vstruct_elt>): Likewise.
	(aarch64_st1<VALL_F16:mode>): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h: Add mfloat8
	support.
	* gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vbsl.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcombine.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vcreate.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vdup-vmov.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vdup_lane.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vext.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vget_high.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vld1.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vld1_lane.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vldX.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vldX_dup.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vldX_lane.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vrev.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vset_lane.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vshuffle.inc: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vstX_lane.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vtbX.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vtrn.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vuzp.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vzip.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: Likewise.
	* gcc.target/aarch64/simd/lut.c: Likewise.
	* gcc.target/aarch64/vdup_lane_1.c: Likewise.
	* gcc.target/aarch64/vdup_lane_2.c: Likewise.
	* gcc.target/aarch64/vdup_n_1.c: Likewise.
	* gcc.target/aarch64/vect_copy_lane_1.c: Likewise.
	* gcc.target/aarch64/simd/mf8_data_1.c: New test.
	* gcc.target/aarch64/simd/mf8_data_2.c: Likewise.

Co-authored-by: Saurabh Jha <saurabh.jha@arm.com>
parent 5f40ff8e
Showing 1345 additions and 126 deletions