RISC-V: Implement vector SAT_TRUNC for signed integer
This patch implements the sstrunc pattern for vector signed integers.
Form 1:
#define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline)) \
vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
{ \
  unsigned i; \
  for (i = 0; i < limit; i++) \
    { \
      WT x = in[i]; \
      NT trunc = (NT)x; \
      out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \
        ? trunc \
        : x < 0 ? NT_MIN : NT_MAX; \
    } \
}
DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)
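For reference, the per-element behaviour of the instantiation above can be sketched in plain C. This is only an illustrative scalar model, not code from the patch: the names `sat_s_trunc_i64_to_i32` and `sat_s_trunc_model` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): scalar model of one
   element of Form 1 with WT = int64_t and NT = int32_t.  */
static int32_t
sat_s_trunc_i64_to_i32 (int64_t x)
{
  if (x < (int64_t) INT32_MIN)
    return INT32_MIN;   /* Too small: saturate to the narrow minimum.  */
  if (x > (int64_t) INT32_MAX)
    return INT32_MAX;   /* Too large: saturate to the narrow maximum.  */
  return (int32_t) x;   /* In range: plain truncation is exact.  */
}

/* Hypothetical loop wrapper with the same shape as the macro body.  */
static void
sat_s_trunc_model (int32_t *out, const int64_t *in, unsigned limit)
{
  unsigned i;

  for (i = 0; i < limit; i++)
    out[i] = sat_s_trunc_i64_to_i32 (in[i]);
}

/* Hypothetical self-check covering the boundary cases.  */
static void
sat_s_trunc_self_check (void)
{
  const int64_t in[4] = { -1, 2147483647LL, 2147483648LL, -2147483649LL };
  int32_t out[4];

  sat_s_trunc_model (out, in, 4);
  assert (out[0] == -1);
  assert (out[1] == INT32_MAX);
  assert (out[2] == INT32_MAX);
  assert (out[3] == INT32_MIN);
}
```

Values inside the int32_t range pass through unchanged; anything outside is clamped to the nearest bound.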
Before this patch:
vsetvli a5,a2,e64,m1,ta,ma
vle64.v v1,0(a1)
slli a3,a5,3
slli a4,a5,2
sub a2,a2,a5
add a1,a1,a3
vadd.vv v0,v1,v5
vsetvli zero,zero,e32,mf2,ta,ma
vnsrl.wx v2,v1,a6
vncvt.x.x.w v1,v1
vsetvli zero,zero,e64,m1,ta,ma
vmsgtu.vv v0,v0,v4
vsetvli zero,zero,e32,mf2,ta,mu
vneg.v v2,v2
vxor.vv v1,v2,v3,v0.t
vse32.v v1,0(a0)
add a0,a0,a4
bne a2,zero,.L3
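The sequence above appears to be a branchless range check followed by a masked fix-up. The listing does not show how v3, v4, v5 and a6 are initialised, so the constants in the sketch below are assumptions (bias 0x80000000, limit 0xFFFFFFFF, shift 63, INT32_MAX), and the function name is hypothetical. Under those assumptions, one element could be modelled as:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed loop-invariant values; not shown in the listing.  */
#define ASSUMED_V5_BIAS  0x80000000ULL          /* v5 */
#define ASSUMED_V4_LIMIT 0xFFFFFFFFULL          /* v4 */
#define ASSUMED_A6_SHIFT 63                     /* a6 */
#define ASSUMED_V3_MAX   ((uint32_t) INT32_MAX) /* v3 */

/* Hypothetical scalar model of one element of the pre-patch loop.  */
static int32_t
pre_patch_element_model (int64_t x)
{
  uint64_t ux = (uint64_t) x;
  uint64_t biased = ux + ASSUMED_V5_BIAS;            /* vadd.vv */
  int out_of_range = biased > ASSUMED_V4_LIMIT;      /* vmsgtu.vv */
  uint32_t sign = (uint32_t) (ux >> ASSUMED_A6_SHIFT); /* vnsrl.wx: 0 or 1 */
  uint32_t trunc = (uint32_t) ux;                    /* vncvt.x.x.w */
  uint32_t all_ones_if_neg = 0u - sign;              /* vneg.v */
  uint32_t sat = all_ones_if_neg ^ ASSUMED_V3_MAX;   /* vxor.vv */
  /* The mask (v0.t with the mu policy) keeps the truncated value for
     in-range elements and overwrites only the out-of-range ones.  */
  uint32_t bits = out_of_range ? sat : trunc;
  int32_t result;

  memcpy (&result, &bits, sizeof result); /* Reinterpret the bits.  */
  return result;
}
```

The bias maps the valid int32_t range onto [0, 0xFFFFFFFF], so a single unsigned compare detects overflow in either direction; the XOR then yields INT32_MAX for positive inputs and INT32_MIN for negative ones.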
After this patch:
vsetvli a5,a2,e32,mf2,ta,ma
vle64.v v1,0(a1)
slli a3,a5,3
slli a4,a5,2
sub a2,a2,a5
add a1,a1,a3
vnclip.wi v1,v1,0
vse32.v v1,0(a0)
add a0,a0,a4
bne a2,zero,.L3
The following test suite passed for this patch:
* The rv64gcv full regression test.
gcc/ChangeLog:
* config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add
new pattern sstrunc for double trunc.
(sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc.
(sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc.
* config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add
new func decl to expand double trunc.
(expand_vec_quad_sstrunc): Ditto but for quad trunc.
(expand_vec_oct_sstrunc): Ditto but for oct trunc.
* config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new
func to expand double trunc.
(expand_vec_quad_sstrunc): Ditto but for quad trunc.
(expand_vec_oct_sstrunc): Ditto but for oct trunc.
Signed-off-by: Pan Li <pan2.li@intel.com>