-
- Downloads
VECT: Add vec_mask_len_{load_lanes,store_lanes} patterns
This patch is add vec_mask_len_{load_lanes,store_stores} autovectorization patterns. Here we want to support this following autovectorization: void foo (int8_t *__restrict a, int8_t *__restrict b, int8_t *__restrict cond, int n) { for (intptr_t i = 0; i < n; ++i) { if (cond[i]) a[i] = b[i * 2] + b[i * 2 + 1]; } } ARM SVE IR: https://godbolt.org/z/cro1Eqc6a # loop_mask_60 = PHI <next_mask_82(4), max_mask_81(3)> ... mask__39.12_63 = vect__3.11_61 != { 0, ... }; vec_mask_and_66 = loop_mask_60 & mask__39.12_63; ... vect_array.15 = .MASK_LOAD_LANES (_57, 8B, vec_mask_and_66); ... For RVV, we would like to see IR: loop_len = SELECT_VL; ... mask__39.12_63 = vect__3.11_61 != { 0, ... }; ... vect_array.15 = .MASK_LEN_LOAD_LANES (_57, 8B, mask__39.12_63, loop_len, bias); ... Bootstrap and Regression on X86 passed. Ok for trunk ? gcc/ChangeLog: * doc/md.texi: Add vec_mask_len_{load_lanes,store_lanes} patterns. * internal-fn.cc (expand_partial_load_optab_fn): Ditto. (expand_partial_store_optab_fn): Ditto. * internal-fn.def (MASK_LEN_LOAD_LANES): Ditto. (MASK_LEN_STORE_LANES): Ditto. * optabs.def (OPTAB_CD): Ditto.
Loading
Please register or sign in to comment