Skip to content
Snippets Groups Projects
Commit b5a05815 authored by Pan Li's avatar Pan Li
Browse files

RISC-V: Implement vector SAT_TRUNC for signed integer


This patch would like to implement the sstrunc for vector signed integer.

Form 1:
  #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX)             \
  void __attribute__((noinline))                                        \
  vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        NT trunc = (NT)x;                                               \
        out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX                     \
	  ? trunc                                                       \
	  : x < 0 ? NT_MIN : NT_MAX;                                    \
      }                                                                 \
  }

DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)

Before this patch:
  27   │     vsetvli a5,a2,e64,m1,ta,ma
  28   │     vle64.v v1,0(a1)
  29   │     slli    a3,a5,3
  30   │     slli    a4,a5,2
  31   │     sub a2,a2,a5
  32   │     add a1,a1,a3
  33   │     vadd.vv v0,v1,v5
  34   │     vsetvli zero,zero,e32,mf2,ta,ma
  35   │     vnsrl.wx    v2,v1,a6
  36   │     vncvt.x.x.w v1,v1
  37   │     vsetvli zero,zero,e64,m1,ta,ma
  38   │     vmsgtu.vv   v0,v0,v4
  39   │     vsetvli zero,zero,e32,mf2,ta,mu
  40   │     vneg.v  v2,v2
  41   │     vxor.vv v1,v2,v3,v0.t
  42   │     vse32.v v1,0(a0)
  43   │     add a0,a0,a4
  44   │     bne a2,zero,.L3

After this patch:
  16   │     vsetvli a5,a2,e32,mf2,ta,ma
  17   │     vle64.v v1,0(a1)
  18   │     slli    a3,a5,3
  19   │     slli    a4,a5,2
  20   │     sub a2,a2,a5
  21   │     add a1,a1,a3
  22   │     vnclip.wi   v1,v1,0
  23   │     vse32.v v1,0(a0)
  24   │     add a0,a0,a4
  25   │     bne a2,zero,.L3

The below test suites are passed for this patch.
* The rv64gcv fully regression test.

gcc/ChangeLog:

	* config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add
	new pattern sstrunc for double trunc.
	(sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc.
	(sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc.
	* config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add
	new func decl to expand double trunc.
	(expand_vec_quad_sstrunc): Ditto but for quad trunc.
	(expand_vec_oct_sstrunc): Ditto but for oct trunc.
	* config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new
	func to expand double trunc.
	(expand_vec_quad_sstrunc): Ditto but for quad trunc.
	(expand_vec_oct_sstrunc): Ditto but for oct trunc.

Signed-off-by: default avatarPan Li <pan2.li@intel.com>
parent 2987ca61
No related branches found
No related tags found
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment